HELP COMPILE                                Robert Duncan, November 1989

The PML compiler.


         CONTENTS - (Use <ENTER> g to access required sections)

 -- Compiling Files
 -- File Names
 -- Searching for Files
 -- Autoloading
 -- Compiler Modes
 -- Compiler Output
 -- Compiler Warnings
 -- The Compile Module
 -- Summary of Compiler Functions and Variables


-- Compiling Files ----------------------------------------------------

The standard method of compiling a file in PML is to use the command
load which invokes the compiler on the named file (the syntax and use of
commands is described in HELP * COMMANDS). The contents of the file are
read into the PML system just as if they had been typed at top-level; in
particular, the file itself may contain more load commands to compile
other files.

Because LOAD is a command rather than a function, the filename argument
needs no special quoting. Thus

    load lib.ml

compiles the file ``lib.ml'' from the current working directory. The
argument may also be omitted entirely, in which case the compiler
recompiles the last file loaded; useful when a file contains errors of
any kind.

If the named file can't be found or can't be read because of access
permissions, the load fails with an error something like:

    Error: cannot open file (no such file or directory) lib.ml

(but see below, in the section on 'Searching for Files').

An alternative method of calling the compiler is to apply the function
-use- defined in structure -Compile-. In this case, the argument must be
a string:

    Compile.use "lib.ml";

Although this is slightly more verbose than using LOAD, it does have
potential advantages. In the first place, if the named file can't be
read for any reason, -use- raises the exception -Compile.Use- which
allows the calling program to retain control. It's also possible to use
the function in an expression, perhaps to compile a whole list of files
as in the next example:

    List.app Compile.use [
        "file-1.ml",
        ....
"file-n.ml" ]; However, the precise semantics of the -use- function require some explanation: it does not compile the file immediately at the point at which it is applied but rather just checks that the file exists and is readable, and adds its name to a queue of files awaiting compilation. The files in the queue are not actually compiled until the current declaration has completed, and then only if no error occurs. The effect of this is that it's impossible to compile a file while any evaluation is in progress. In the previous example, if any one of the files "file-1" to "file-n" did not exist (so causing an exception), then none of them would be compiled: the exception would abort the evaluation and so prevent the compiler from processing the queue. You can also compile files from inside VED: the commands load (* compiles a named file *) l1 (* compiles the current buffer *) lmr (* compiles the marked range *) are described in HELP * MLVED. -- File Names --------------------------------------------------------- File names, whether quoted or not, may be any legal operating system pathname. Abbreviations such as logical names or environment variables (and, on UNIX, C-shell style "~" abbreviations) are translated appropriately. Example: load $poplib/pml/functions.ml (* UNIX *) load poplib:[pml]functions.ml (* VMS *) As a general rule, ML program files should end in one or other of the extensions "sig" or "ml", "sig" for files defining signatures and "ml" for others: having the two distinct allows a signature file to have the same basename as its structure or functor counterpart. Sticking to this rule means that it's often possible to omit the extension from a file name given to the compiler, as in: load lib When interpreting a filename, the rule used by the compiler is: first, look for a file with the name as given. If not found, and the name doesn't already have an extension, try adding the extension "ml"; if that doesn't exist either, try "sig". 

The "ml" and "sig" extensions are conventions used by the PML system.
Different sites may have their own rules: "sml" is a typical
alternative. You can extend the convention by assigning to the variables
-Compile.filetype- and -Compile.sigfiletype-: the compiler will try
these extensions before the standard ones. The default values for these
variables are ".ml" and ".sig" respectively. It's important to include
the dot ("."), as the extensions are simply appended to the basename.
This makes it possible to add arbitrary text to the end of a name: for
example, doing

    Compile.sigfiletype := "_sig.ml";

allows for the convention

    S.ml        (* for structure S *)
    S_sig.ml    (* for signature S *)


-- Searching for Files ------------------------------------------------

If a filename given to the compiler is not absolute, it need not
necessarily be located in the current directory. When called at
top-level, the compiler will always look first for files in the current
directory, but if not found there, it will then search each of the
directories listed in the variable -Compile.searchpath-. This is
initially set up to contain the names of system library directories, so
that a system library file can be loaded simply by giving its basename,
as in

    load Option

which compiles the library *OPTION. The searchpath is user-assignable,
so you can add your own library directories to it: this is typically
done in the "init.ml" initialisation file.

If a file itself contains more calls to the compiler, those subsidiary
files will be sought first relative to the directory of the parent file
and only then in the current directory and the -searchpath- list. This
can simplify the organisation of applications built from several files,
since only the location of the primary file needs to be known exactly:
the locations of any other files loaded from it can be expressed
relative to that.

Example: a directory called "example" contains some source files,
"file-1.ml" to "file-n.ml", and a link file, "link.ml", containing the
lines:

    load file-1
    load file-2
    ....
    load file-n

The command

    load example/link

is sufficient to load the whole application. This works whether
"example" is a subdirectory of the current working directory, or of some
library directory in the -searchpath- list.


-- Autoloading --------------------------------------------------------

Certain files can be compiled automatically. Whenever an undefined
structure name is encountered in a program, the compiler will construct
a file name by appending ".ml" (or the value of -Compile.filetype- if
different) to the structure name and try compiling the resulting file
using the standard searching mechanism described above. This process is
called AUTOLOADING, and the idea is that the file (if found) should
provide a definition for the undefined structure, allowing compilation
to continue. Only if the file doesn't exist or doesn't define the right
structure is an ``Unbound identifier'' error given.

Autoloading makes it possible to use library structures without having
to explicitly compile them: the declaration

    open Option;

for example will automatically compile the *OPTION library, assuming it
hasn't already been done. It works for functors and signatures too,
although with signatures the signature file extension (-sigfiletype-) is
used instead to construct the file name, making it possible to autoload
both a structure (or functor) and its signature.

Autoloading blurs the distinction between builtin and library modules.
This is generally convenient, but may not always be so: a good example
is copying a program to a different system, when it's important to know
exactly which libraries are being used. In such a case you can stop
autoloading by assigning -false- to the variable -Compile.autoload-.
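
As an illustration of working with autoloading switched off, the
following top-level input is a minimal sketch. It assumes that the
*OPTION library can be found on the -searchpath- list under its
basename, as in the earlier example, and that the LOAD command is typed
on a line of its own.

```sml
(* Disable autoloading: an undefined module name is now an error *)
Compile.autoload := false;

(* The library must therefore be compiled explicitly ... *)
load Option

(* ... before its structure can be opened *)
open Option;
```

Every library a program depends on then appears as an explicit load,
which is the information needed when moving the program to another
system.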


-- Compiler Modes -----------------------------------------------------

The compiler has two modes of working: normal mode and debug mode. The
current mode is determined by the flag -Compile.debug-: set this -false-
for normal mode (the default) or -true- for debug mode. The flag does
not apply to autoloaded files which are always compiled in normal mode:
if you want to compile a file in debug mode, you must load it explicitly
while the -debug- flag is set.

In debug mode, the compiler adds extra code to every function to provide
information and hooks for use by dynamic debugging tools. This obviously
increases code size and may reduce execution speed, but code compiled in
normal mode does not support any kind of dynamic debugging.

The only debugging aid currently provided by PML is a function call
tracer described in HELP * TRACE. Other tools may be added at a later
date.


-- Compiler Output ----------------------------------------------------

The compiler will normally produce a report on every top-level
declaration processed, consisting of a list of all the identifiers bound
by the declaration, together with their types (or signatures) and where
appropriate their values. Such information is important: values
obviously, but types equally so, since the true type of an identifier
may not always be that envisaged by the programmer (due to a
misconception of some kind) leading almost certainly to type errors
later on.

Nevertheless, the amount of output can become excessive, and so can be
controlled by two flags. Setting -Compile.quiet- to -true- suppresses
all (non-error) output, even from expressions evaluated interactively.
Less drastically, setting -Compile.quiet_load- suppresses output only
from compilations invoked via the LOAD command or -use- function, output
from the original interactive session being left untouched. This flag is
set automatically when files are autoloaded, making autoloading
invisible.
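
A short sketch of the less drastic of the two flags in use. The
identifier -answer- is made up for the illustration, and no output is
shown since its exact form depends on the top-level printer.

```sml
(* Suppress reports from files compiled via LOAD or -use- *)
Compile.quiet_load := true;

(* A declaration typed interactively is still reported as usual *)
val answer = 6 * 7;
```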

Finally, if the values of variables are complex, leading to unwanted
amounts of output, this can be curtailed by changing the -depth-
indicator of the top-level printer described in HELP * PRINTER.

The compiler will produce extra output summarising the time taken for
each top-level declaration or compilation when the flag -Compile.timer-
is set (this is independent of the flags described above). For each
declaration typed interactively, the compiler will add two lines to the
end of the normal report looking something like:

    ;;; Compilation: CPU time = 0.01 secs, GC time = 0.0 secs
    ;;; Execution: CPU time = 0.04 secs, GC time = 0.0 secs

This is fairly self-explanatory. ``Compilation'' covers the time taken
to parse and typecheck the declaration and to generate code;
``Execution'' is the time taken executing any code planted. The CPU time
figure is the total CPU time; the GC figure is the amount of that taken
up in garbage collection.

When compiling a file, times are collected for the whole file and
summarised at the end, prefixed with the file name:

    ;;; File: lib.ml
    ;;; Compilation: CPU time = 0.56 secs, GC time = 0.0 secs
    ;;; Execution: CPU time = 0.02 secs, GC time = 0.0 secs


-- Compiler Warnings --------------------------------------------------

The flag -Compile.warnings- controls the production of warning messages
by the compiler. Currently the only warnings produced are those required
by the Standard ML definition concerning the exhaustiveness and
irredundancy of matches, but it is envisaged that a more sophisticated
series of warnings will be enabled at a later date. For now, warnings
are either on or off depending on the setting of the -warnings- flag.
Disabling warnings turns off the exhaustiveness checking as well as the
messages, so can improve the running speed of the compiler, but this is
recommended only for programs which are already known to produce no
warnings.
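
For illustration, here is the kind of declaration that the
exhaustiveness check is there to catch. The function name -first- is
made up for the example, and the wording of the warning is not shown
because it is specific to the compiler.

```sml
(* Warnings are on by default; set here only to make that explicit *)
Compile.warnings := true;

(* Non-exhaustive match: there is no clause for the empty list *)
fun first (x :: _) = x;
```

The declaration is still accepted, but applying -first- to an empty
list fails at run time, which is why compiling such code with warnings
disabled is risky.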


-- The Compile Module -------------------------------------------------

signature Compile
structure Compile : Compile
    The variables and functions discussed above are defined in the
    builtin structure -Compile-, described by the following signature:

    signature Compile = sig

        (* Compiling Files *)

        exception Use of string

        val use             : string -> unit

        val filetype        : string ref
        val sigfiletype     : string ref
        val searchpath      : string list ref

        (* Compiler Flags *)

        val autoload        : bool ref
        val closure_rules   : bool ref
        val debug           : bool ref
        val quiet           : bool ref
        val quiet_load      : bool ref
        val timer           : bool ref
        val traceback       : bool ref
        val warnings        : bool ref

        (* Localising Compiler Flags *)

        val localise        : 'a ref -> 'a ref

    end


-- Summary of Compiler Functions and Variables ------------------------

exception Use of string
val use (filename : string) : unit
    Adds the given -filename- to a queue of files waiting to be
    compiled: the files are compiled in order at the end of the current
    declaration. The existence of the file and its access permissions
    are checked before the name is added to the queue; the exception

        Use(filename)

    is raised if the file cannot be read for any reason.


val filetype : string ref
    The default extension for ML program files. The compiler will append
    this extension automatically to filenames where required. VED also
    recognises files with this extension as being ML files.
    Default value: ".ml"


val sigfiletype : string ref
    The default extension for ML signature files. The compiler will
    append this extension automatically to filenames where required. VED
    also recognises files with this extension as being ML files.
    Default value: ".sig"


val searchpath : string list ref
    A list of directories for the compiler to search when looking for a
    file with a relative pathname.
    Default value:
        ["$poplocal/local/pml/lib/", "$usepop/pop/pml/lib/"]  (* UNIX *)
        ["poplocal:[local.pml.lib]", "usepop:[pop.pml.lib]"]  (* VMS *)


val autoload : bool ref
    When -true-, the compiler will try automatically compiling library
    files in order to resolve undefined module names (structures,
    signatures and functors). When -false-, undefined module names cause
    an immediate error.
    Default value: true


val debug : bool ref
    Determines the current compilation mode. Set this -false- for normal
    mode or -true- for debug mode. The different modes are described in
    the section ``Compiler Modes'' above.
    Default value: false


val closure_rules : bool ref
    When -true-, the compiler imposes ``closure rules'' on modules. The
    exact rules are particular to PML: no module (structure, signature
    or functor) may refer to any non-module name (variable, constructor,
    exception or type constructor) external to the module which has not
    been declared "pervasive" (see HELP * PERVASIVE). The initial basis
    is automatically pervasive.

    When -false-, no closure rules are applied: all names have the same
    scope inside modules as outside. This is in line with the Standard
    ML definition.
    Default value: false


val quiet : bool ref
    Suppresses all non-error output from the compiler. In the normal
    case, the compiler reports on every top-level declaration processed
    (but see -quiet_load- below). Setting -quiet- stops all output.
    Default value: false


val quiet_load : bool ref
    Suppresses output from recursive calls to the compiler, i.e. not
    from the very top-level session, but from calls invoked via the LOAD
    command or -use- function. Setting this -true- means that
    expressions or simple declarations evaluated interactively or from
    inside VED will still print results, but more extensive output
    caused by compiling files will be suppressed. Note that -quiet-
    takes precedence over -quiet_load-: setting -quiet- suppresses all
    output regardless of the setting of -quiet_load-.
    Default value: false


val timer : bool ref
    Enables the printing of timing statistics by the compiler.
    Default value: false


val traceback : bool ref
    Enables exception traceback. When this flag is -true-, each
    exception raised displays a traceback on the standard error stream.
    Default value: false.

    NB: this feature is experimental, and is not guaranteed to be
    provided in exactly this form in future releases of PML.

    The general form of an exception traceback is as follows:

        E
        X
        ....
        X
        H

    The first line (prefixed "E") displays the exception packet raised.

    Subsequent lines (prefixed "X") display the names of functions
    aborted by the exception, where the first such line names the
    function which raised the exception. Anonymous functions print as
    "". Consecutive calls to the same function are abbreviated to a
    single line, with the number of calls given in parentheses after the
    function name. So the line

        X loop (*28)

    indicates that the function -loop- had been called 28 times.

    The last line of the traceback (prefixed "H") gives the name of the
    function which handled the exception. This line may not appear if
    there was no handler installed; in this case, the traceback should
    be followed immediately by the top-level exception message.


val warnings : bool ref
    Controls any warnings produced by the compiler: if set -false-, no
    warning messages are produced. This allows the compiler to omit a
    certain amount of checking, so speeding up compilation; however, it
    is inadvisable to routinely disregard all warnings.
    Default value: true


val localise (r : 'a ref) : 'a ref
    Localises the value of the reference -r- for the duration of the
    current compilation. The value of the reference is saved away, to be
    restored whenever the compilation finishes, whether this happens
    normally or because of some error. The function is meant for use in
    localising the effects of changes to compiler flags.
    For example, placing the line

        Compile.localise Compile.warnings := false;

    at the start of a program file will ensure that warning messages are
    suppressed throughout the rest of the file, but the previous value
    of the -warnings- flag will be restored as soon as compilation of
    the file has been completed. Note that changes to localised flags do
    affect nested compilations.


--- C.all/pml/help/compile
--- Copyright University of Sussex 1994. All rights reserved. ----------