The Caml Light system
release 0.71

Documentation and user's manual

Xavier Leroy

March 11, 1996

Copyright © 1996 Institut National de Recherche en Informatique et en Automatique

Contents

I Getting started
  1 Installation instructions
    1.1 The Unix version
    1.2 The Macintosh version
    1.3 The PC version

II The Caml Light language reference manual
  2 The core Caml Light language
    2.1 Lexical conventions
    2.2 Global names
    2.3 Values
    2.4 Type expressions
    2.5 Constants
    2.6 Patterns
    2.7 Expressions
    2.8 Global definitions
    2.9 Directives
    2.10 Module implementations
    2.11 Module interfaces
  3 Language extensions
    3.1 Streams, parsers, and printers
    3.2 Guards
    3.3 Range patterns
    3.4 Recursive definitions of values
    3.5 Local definitions using where
    3.6 Mutable variant types
    3.7 String access
    3.8 Alternate syntax
    3.9 Infix symbols
    3.10 Directives

III The Caml Light commands
  4 Batch compilation (camlc)
    4.1 Overview of the compiler
    4.2 Options
    4.3 Modules and the file system
    4.4 Common errors
  5 The toplevel system (camllight)
    5.1 Options
    5.2 Toplevel control functions
    5.3 The toplevel and the module system
    5.4 Common errors
    5.5 Building custom toplevel systems: camlmktop
    5.6 Options
  6 The runtime system (camlrun)
    6.1 Overview
    6.2 Options
    6.3 Common errors
  7 The librarian (camllibr)
    7.1 Overview
    7.2 Options
    7.3 Turning code into a library
  8 Lexer and parser generators (camllex, camlyacc)
    8.1 Overview of camllex
    8.2 Syntax of lexer definitions
    8.3 Overview of camlyacc
    8.4 Syntax of grammar definitions
    8.5 Options
    8.6 A complete example
  9 The debugger (camldebug)
    9.1 Compiling for debugging
    9.2 Invocation
    9.3 Commands
    9.4 Executing a program
    9.5 Breakpoints
    9.6 The call stack
    9.7 Examining variable values
    9.8 Controlling the debugger
    9.9 Miscellaneous commands
  10 Profiling (camlpro)
    10.1 Compiling for profiling
    10.2 Profiling an execution
    10.3 Printing profiling information
    10.4 Known bugs
  11 Using Caml Light under Emacs
    11.1 Updating your .emacs
    11.2 The caml editing mode
    11.3 Running the toplevel as an inferior process
    11.4 Running the debugger as an inferior process
  12 Interfacing C with Caml Light
    12.1 Overview and compilation information
    12.2 The value type
    12.3 Representation of Caml Light data types
    12.4 Operations on values
    12.5 Living in harmony with the garbage collector
    12.6 A complete example

IV The Caml Light library
  13 The core library
    13.1 bool: boolean operations
    13.2 builtin: base types and constructors
    13.3 char: character operations
    13.4 eq: generic comparisons
    13.5 exc: exceptions
    13.6 fchar: character operations, without sanity checks
    13.7 float: operations on floating-point numbers
    13.8 fstring: string operations, without sanity checks
    13.9 fvect: operations on vectors, without sanity checks
    13.10 int: operations on integers
    13.11 io: buffered input and output
    13.12 list: operations on lists
    13.13 pair: operations on pairs
    13.14 ref: operations on references
    13.15 stream: operations on streams
    13.16 string: string operations
    13.17 vect: operations on vectors
  14 The standard library
    14.1 arg: parsing of command line arguments
    14.2 baltree: basic balanced binary trees
    14.3 filename: operations on file names
    14.4 format: pretty printing
    14.5 gc: memory management control and statistics
    14.6 genlex: a generic lexical analyzer
    14.7 hashtbl: hash tables and hash functions
    14.8 lexing: the run-time library for lexers generated by camllex
    14.9 map: association tables over ordered types
    14.10 parsing: the run-time library for parsers generated by camlyacc
    14.11 printexc: a catch-all exception handler
    14.12 printf: formatting printing functions
    14.13 queue: queues
    14.14 random: pseudo-random number generator
    14.15 set: sets over ordered types
    14.16 sort: sorting and merging lists
    14.17 stack: stacks
    14.18 sys: system interface
  15 The graphics library
    15.1 graphics: machine-independent graphics primitives
  16 The unix library: Unix system calls
    16.1 unix: interface to the Unix system
  17 The num library: arbitrary-precision rational arithmetic
    17.1 num: operations on numbers
    17.2 arith_status: flags that control rational arithmetic
  18 The str library: regular expressions and string processing
    18.1 str: regular expressions and high-level string processing

V Appendix
  19 Further reading
    19.1 Programming in ML
    19.2 Descriptions of ML dialects
    19.3 Implementing functional programming languages
    19.4 Applications of ML
  Index to the library
  Index of keywords

Foreword

This manual documents release 0.71 of the Caml Light system. It is organized as follows.

- Part I, "Getting started", explains how to install Caml Light on your machine.
- Part II, "The Caml Light language reference manual", is the reference description of the Caml Light language.
- Part III, "The Caml Light commands", documents the Caml Light compiler, toplevel system, and programming utilities.
- Part IV, "The Caml Light library", describes the modules provided in the standard library.
- Part V, "Appendix", contains a short bibliography, an index of all identifiers defined in the standard library, and an index of Caml Light keywords.

Conventions

The Caml Light system comes in several versions: for Unix machines, for Macintoshes, and for PCs. The parts of this manual that are specific to one version are presented as shown below:

Unix: This is material specific to the Unix version.
Mac:  This is material specific to the Macintosh version.
PC:   This is material specific to the PC version.

License

The Caml Light system is copyright © 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996 Institut National de Recherche en Informatique et en Automatique (INRIA). INRIA holds all ownership rights to the Caml Light system. See the file COPYRIGHT in the distribution for the copyright notice.

The Caml Light system can be freely copied, but not sold. More precisely, INRIA grants any user of the Caml Light system the right to reproduce it, provided that the copies are distributed free of charge and under the conditions given in the COPYRIGHT file. The present documentation is distributed under the same conditions.

Availability by FTP

The complete Caml Light distribution resides on the machine ftp.inria.fr.
The distribution files can be transferred by anonymous FTP:

    Host:        ftp.inria.fr (Internet address 192.93.2.54)
    Login name:  anonymous
    Password:    your e-mail address
    Directory:   lang/caml-light
    Files:       see the index in file README

Part I
Getting started

Chapter 1
Installation instructions

This chapter explains how to install Caml Light on your machine.

1.1 The Unix version

Requirements. Any machine that runs under one of the various flavors of the Unix operating system, and that has a flat, non-segmented, 32-bit or 64-bit address space. 4M of RAM, 2M of free disk space. The graphics library requires X11 release 4 or later.

Installation. The Unix version is distributed in source format, as a compressed tar file named cl7unix.tar.Z. To extract, move to the directory where you want the source files to reside, transfer cl7unix.tar.Z to that directory, and execute

    zcat cl7unix.tar.Z | tar xBf -

This extracts the source files into the current directory. The file INSTALL contains complete instructions on how to configure, compile and install Caml Light. Read it and follow the instructions.

Troubleshooting. See the file INSTALL.

1.2 The Macintosh version

Requirements. Any Macintosh with at least 1M of RAM (2M is recommended), running System 6 or 7. About 850K of free space on the disk. The parts of the Caml Light system that support batch compilation currently require the Macintosh Programmer's Workshop (MPW) version 3.2. MPW is Apple's development environment, and it is distributed by APDA, Apple's Programmers and Developers Association. See the file READ ME in the distribution for APDA's address.

Installation. Create the folder where the Caml Light files will reside. Double-click on the file cl7macbin.sea from the distribution. This displays a file dialog box.
Open the folder where the Caml Light files will reside, and click on the Extract button. This will re-create all files from the distribution in the Caml Light folder.

To test the installation, double-click on the application Caml Light. The "Caml Light output" window should display something like

    >       Caml Light version 0.7

    #

In the "Caml Light input" window, enter 1+2;; and press the Return key. The "Caml Light output" window should display:

    >       Caml Light version 0.7

    #1+2;;
    - : int = 3
    #

Select "Quit" from the "File" menu to return to the Finder.

If you have MPW, you can install the batch compilation tools as follows. The tools and scripts from the tools folder must reside in a place where MPW will find them as commands. There are two ways to achieve this result: either copy the files in the tools folder to the Tools or the Scripts folder in your MPW folder; or keep the files in the tools folder and add the following line to your UserStartup file (assuming Caml Light resides in folder Caml Light on the disk named My HD):

    Set Commands "{Commands},My HD:Caml Light:tools:"

In either case, you now have to edit the camlc script, and replace the string

    Macintosh HD:Caml Light:lib:

(in the first line) with the actual pathname of the lib folder. For example, if you put Caml Light in folder Caml Light on the disk named My HD, the first line of camlc should read:

    Set stdlib "My HD:Caml Light:lib:"

Troubleshooting. Here is one commonly encountered problem.

Cannot find file stream.zi
    (Displayed in the "Caml Light output" window, with an alert box telling you that Caml Light has terminated abnormally.) This is an installation error. The folder named lib in the distribution must always be in the same folder as the Caml Light application. It's OK to move the application to another folder; but remember to move the lib directory to the same folder.
    (To return to the Finder, first select "Quit" from the "File" menu.)

1.3 The PC version

Requirements. A PC equipped with an 80386, 80486 or Pentium processor, running Windows 3.x, Windows 95 or Windows NT. About 3M of free space on the disk. At least 8M of RAM is recommended.

Installation. Windows 3.x users must first install the Win32s compatibility system. Win32s is distributed along with Caml Light and contains detailed installation instructions.

In the following, we assume that the distribution files reside in drive A:, and that the hard disk on which you are installing Caml Light is drive C:. If this is not the case, replace A: and C: by the appropriate drives.

Change to a directory on the hard disk where the Caml Light distribution will reside. The installation will create a subdirectory named CAML in that directory, and put all the Caml Light files in CAML. In the following, we assume that you will be installing from C:\, thus putting all Caml Light files in C:\CAML. Execute the following commands:

    C:
    cd \
    A:pkunzip -d A:cl71win

(Be careful not to omit the -d option to pkunzip.) The remainder of the installation procedure is described in the CAML\INSTALL.TXT file contained in the distribution.

Part II
The Caml Light language reference manual

Chapter 2
The core Caml Light language

Foreword

This document is intended as a reference manual for the Caml Light language. It lists all language constructs, and gives their precise syntax and informal semantics. It is by no means a tutorial introduction to the language: there is not a single example. A good working knowledge of the language, as provided by the companion tutorial Functional programming using Caml Light, is assumed.
No attempt has been made at mathematical rigor: words are employed with their intuitive meaning, without further definition. As a consequence, the typing rules have been left out, for lack of the mathematical framework required to express them, although they are definitely part of a full formal definition of the language. The reader interested in truly formal descriptions of languages from the ML family is referred to The definition of Standard ML and Commentary on Standard ML, by Milner, Tofte and Harper, MIT Press.

Warning

Several implementations of the Caml Light language are available, and they evolve at each release. Consequently, this document carefully distinguishes the language and its implementations. Implementations can provide extra language constructs; moreover, all points left unspecified in this reference manual can be interpreted differently by the implementations. The purpose of this reference manual is to specify those features that all implementations must provide.

Notations

The syntax of the language is given in BNF-like notation. Terminal symbols are set in typewriter font (like this). Non-terminal symbols are set in italic font (like that). Square brackets [...] denote optional components. Curly brackets {...} denote zero, one or several repetitions of the enclosed components. Curly brackets with a trailing plus sign {...}+ denote one or several repetitions of the enclosed components. Parentheses (...) denote grouping.

2.1 Lexical conventions

Blanks

The following characters are considered as blanks: space, newline, horizontal tabulation, carriage return, line feed and form feed. Blanks are ignored, but they separate adjacent identifiers, literals and keywords that would otherwise be confused as one single identifier, literal or keyword.
Comments

Comments are introduced by the two characters (*, with no intervening blanks, and terminated by the characters *), with no intervening blanks. Comments are treated as blank characters. Comments do not occur inside string or character literals. Nested comments are correctly handled.

Identifiers

    ident  ::= letter {letter | 0...9 | _}
    letter ::= A...Z | a...z

Identifiers are sequences of letters, digits and _ (the underscore character), starting with a letter. Letters include at least the 52 lowercase and uppercase letters from the ASCII set. Implementations can recognize as letters other characters from the extended ASCII set. Identifiers cannot contain two adjacent underscore characters (__). Implementations may limit the number of characters of an identifier, but this limit must be above 256 characters. All characters in an identifier are meaningful.

Integer literals

    integer-literal ::= [-] {0...9}+
                      | [-] (0x | 0X) {0...9 | A...F | a...f}+
                      | [-] (0o | 0O) {0...7}+
                      | [-] (0b | 0B) {0...1}+

An integer literal is a sequence of one or more digits, optionally preceded by a minus sign. By default, integer literals are in decimal (radix 10). The following prefixes select a different radix:

    --------------------------------
    |Prefix|Radix                  |
    --------------------------------
    |0x, 0X|hexadecimal (radix 16) |
    |0o, 0O|octal (radix 8)        |
    |0b, 0B|binary (radix 2)       |
    --------------------------------

(The initial 0 is the digit zero; the O for octal is the letter O.)

Floating-point literals

    float-literal ::= [-] {0...9}+ [. {0...9}] [(e | E) [+ | -] {0...9}+]

Floating-point literals consist of an integer part, a decimal part and an exponent part. The integer part is a sequence of one or more digits, optionally preceded by a minus sign. The decimal part is a decimal point followed by zero, one or more digits.
The exponent part is the character e or E followed by an optional + or - sign, followed by one or more digits. The decimal part or the exponent part can be omitted, but not both, to avoid ambiguity with integer literals.

Character literals

    char-literal ::= ` regular-char `
                   | ` \ (\ | ` | n | t | b | r) `
                   | ` \ (0...9) (0...9) (0...9) `

Character literals are delimited by ` (backquote) characters. The two backquotes enclose either one character different from ` and \, or one of the escape sequences below:

    --------------------------------------------------------
    |Sequence|Character denoted                            |
    --------------------------------------------------------
    |\\      |backslash (\)                                |
    |\`      |backquote (`)                                |
    |\n      |newline (LF)                                 |
    |\r      |return (CR)                                  |
    |\t      |horizontal tabulation (TAB)                  |
    |\b      |backspace (BS)                               |
    |\ddd    |the character with ASCII code ddd in decimal |
    --------------------------------------------------------

String literals

    string-literal   ::= " {string-character} "
    string-character ::= regular-char
                       | \ (\ | " | n | t | b | r)
                       | \ (0...9) (0...9) (0...9)

String literals are delimited by " (double quote) characters. The two double quotes enclose a sequence of either characters different from " and \, or escape sequences from the table below:

    --------------------------------------------------------
    |Sequence|Character denoted                            |
    --------------------------------------------------------
    |\\      |backslash (\)                                |
    |\"      |double quote (")                             |
    |\n      |newline (LF)                                 |
    |\r      |return (CR)                                  |
    |\t      |horizontal tabulation (TAB)                  |
    |\b      |backspace (BS)                               |
    |\ddd    |the character with ASCII code ddd in decimal |
    --------------------------------------------------------

Implementations must support string literals up to 2^16 - 1 characters in length (65535 characters).
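As an illustration of the literal syntax described above, the following sketch writes a few literals of each kind. The variable names are arbitrary choices made for this example and are not part of the language definition.

```caml
(* Integer literals: the value 255 written in four radixes. *)
let dec = 255;;
let hex = 0xFF;;
let oct = 0o377;;
let bin = 0b11111111;;

(* Floating-point literals: decimal part, exponent part, or both. *)
let f1 = 1.;;        (* decimal part with zero digits *)
let f2 = 1e3;;       (* exponent part only *)
let f3 = 2.5E-2;;    (* decimal part and exponent part *)

(* Character literals are delimited by backquotes. *)
let letter = `a`;;
let newline = `\n`;;
let backquote = `\``;;
let capital_a = `\065`;;   (* ASCII code 65 in decimal: the letter A *)

(* String literals are delimited by double quotes. *)
let s = "tab:\t quote:\" backslash:\\ letter:\065\n";;
```

The four integer definitions denote the same value. A literal such as 1 with neither a decimal part nor an exponent part is an integer literal, which is why at least one of the two parts must be present in a floating-point literal.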
Keywords

The identifiers below are reserved as keywords, and cannot be employed otherwise:

    and       as        begin     do        done      downto
    else      end       exception for       fun       function
    if        in        let       match     mutable   not
    of        or        prefix    rec       then      to
    try       type      value     where     while     with

The following character sequences are also keywords:

    #     !     !=    &     (     )     *     *.    +     +.
    ,     -     -.    ->    .     .(    /     /.    :     ::
    :=    ;     ;;    <     <.    <-    <=    <=.   <>    <>.
    =     =.    ==    >     >.    >=    >=.   @     [     [|
    ]     ^     _     __    {     |     |]    }     '

Ambiguities

Lexical ambiguities are resolved according to the "longest match" rule: when a character sequence can be decomposed into two tokens in several different ways, the decomposition retained is the one with the longest first token.

2.2 Global names

Global names are used to denote value variables, value constructors (constant or non-constant), type constructors, and record labels. Internally, a global name consists of two parts: the name of the defining module (the module name), and the name of the global inside that module (the local name). The two parts of the name must be valid identifiers. Externally, global names have the following syntax:

    global-name ::= ident
                  | ident __ ident

The form ident __ ident is called a qualified name. The first identifier is the module name, the second identifier is the local name. The form ident is called an unqualified name. The identifier is the local name; the module name is omitted. The compiler infers this module name following the completion rules given below, therefore transforming the unqualified name into a full global name.

To complete an unqualified identifier, the compiler checks a list of modules, the opened modules, to see if they define a global with the same local name as the unqualified identifier. When one is found, the identifier is completed into the full name of that global.
That is, the compiler takes as module name the name of an opened module that defines a global with the same local name as the unqualified identifier. If several modules satisfy this condition, the one that comes first in the list of opened modules is selected.

The list of opened modules always includes the module currently being compiled (checked first). (In the case of a toplevel-based implementation, this is the module where all toplevel definitions are entered.) It also includes a number of standard library modules that provide the initial environment (checked last). In addition, the #open and #close directives can be used to add or remove modules from that list. The modules added with #open are checked after the module currently being compiled, but before the initial standard library modules.

    variable      ::= global-name
                    | prefix operator-name
    operator-name ::= + | - | * | / | mod | +. | -. | *. | /.
                    | @ | ^ | ! | := | = | <> | == | !=
                    | < | <= | > | >= | <. | <=. | >. | >=.
    cconstr       ::= global-name
                    | []
                    | ()
    ncconstr      ::= global-name
                    | prefix ::
    typeconstr    ::= global-name
    label         ::= global-name

Depending on the context, global names can stand for global variables (variable), constant value constructors (cconstr), non-constant value constructors (ncconstr), type constructors (typeconstr), or record labels (label). For variables and value constructors, special names built with prefix and an operator name are recognized. The tokens [] and () are also recognized as built-in constant constructors (the empty list and the unit value).

The syntax of the language restricts labels and type constructors to appear in certain positions, where no other kind of global name is accepted. Hence labels and type constructors have their own name spaces.
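The completion rules and the prefix form can be sketched as follows. This is a minimal illustration, not a definitive program: it assumes that the core library module list defines map and is among the modules opened in the initial environment, and that the standard library module sort defines a function sort taking an ordering predicate and a list. The names doubled, tripled, sorted, add and three are hypothetical.

```caml
(* Qualified name: module name, two underscores, local name. *)
let doubled = list__map (fun x -> 2 * x) [1; 2; 3];;

(* Unqualified name: assuming list is an opened module, map is
   completed into the full global name list__map. *)
let tripled = map (fun x -> 3 * x) [1; 2; 3];;

(* A module that is not opened can still be reached through a
   qualified name (assumption: sort defines sort as described above). *)
let sorted = sort__sort (prefix <=) [3; 1; 2];;

(* prefix followed by an operator name is an ordinary variable. *)
let add = prefix +;;
let three = add 1 2;;
```

Writing the qualified name is always possible; the unqualified form only works while the defining module is in the list of opened modules, for instance after an #open directive.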
Value constructors and value variables live in the same name space: a global name in value position is interpreted as a value constructor if it appears in the scope of a type declaration defining that constructor; otherwise, the global name is taken to be a value variable. For value constructors, the type declaration determines whether a constructor is constant or not.

2.3 Values

This section describes the kinds of values that are manipulated by Caml Light programs.

2.3.1 Base values

Integer numbers

Integer values are integer numbers from -2^30 to 2^30 - 1, that is -1073741824 to 1073741823. Implementations may support a wider range of integer values.

Floating-point numbers

Floating-point values are numbers in floating-point representation. Everything about floating-point values is implementation-dependent, including the range of representable numbers, the number of significant digits, and the way floating-point results are rounded.

Characters

Character values are represented as 8-bit integers between 0 and 255. Character codes between 0 and 127 are interpreted following the ASCII standard. The interpretation of character codes between 128 and 255 is implementation-dependent.

Character strings

String values are finite sequences of characters. Implementations must support strings up to 2^16 - 1 characters in length (65535 characters). Implementations may support longer strings.

2.3.2 Tuples

Tuples of values are written (v1,...,vn), standing for the n-tuple of values v1 to vn. Tuples of up to 2^14 - 1 elements (16383 elements) must be supported, though implementations may support tuples with more elements.

2.3.3 Records

Record values are labeled tuples of values. The record value written {label1=v1;...;labeln=vn} associates the value vi to the record label labeli, for i=1...n.
Records with up to 2^14 - 1 fields (16383 fields) must be supported, though implementations may support records with more fields.

2.3.4 Arrays

Arrays are finite, variable-sized sequences of values of the same type. Arrays of length up to 2^14 - 1 (16383 elements) must be supported, though implementations may support larger arrays.

2.3.5 Variant values

Variant values are either a constant constructor, or a pair of a non-constant constructor and a value. The former case is written cconstr; the latter case is written ncconstr(v), where v is said to be the argument of the non-constant constructor ncconstr.

The following constants are treated like built-in constant constructors:

    ------------------------------
    Constant  Constructor
    ------------------------------
    false     the boolean false
    true      the boolean true
    ()        the "unit" value
    []        the empty list
    ------------------------------

2.3.6 Functions

Functional values are mappings from values to values.

2.4 Type expressions

    typexpr ::= ' ident
              | ( typexpr )
              | typexpr -> typexpr
              | typexpr {* typexpr}+
              | typeconstr
              | typexpr typeconstr
              | ( typexpr {, typexpr} ) typeconstr

The table below shows the relative precedences and associativity of operators and non-closed type constructions. The constructions with higher precedences come first.

    ---------------------------------------------
    |Operator                    |Associativity |
    ---------------------------------------------
    |Type constructor application|--            |
    |*                           |--            |
    |->                          |right         |
    ---------------------------------------------

Type expressions denote types in definitions of data types as well as in type constraints over patterns and expressions.

Type variables

The type expression ' ident stands for the type variable named ident.
In data type definitions, type variables are names for the data type parameters. In type constraints, they represent unspecified types that can be instantiated by any type to satisfy the type constraint.

Parenthesized types

The type expression ( typexpr ) denotes the same type as typexpr.

Function types

The type expression typexpr1 -> typexpr2 denotes the type of functions mapping arguments of type typexpr1 to results of type typexpr2.

Tuple types

The type expression typexpr1 *...* typexprn denotes the type of tuples whose elements belong to types typexpr1,...,typexprn respectively.

Constructed types

Type constructors with no parameter, as in typeconstr, are type expressions.

The type expression typexpr typeconstr, where typeconstr is a type constructor with one parameter, denotes the application of the unary type constructor typeconstr to the type typexpr.

The type expression (typexpr1,...,typexprn) typeconstr, where typeconstr is a type constructor with n parameters, denotes the application of the n-ary type constructor typeconstr to the types typexpr1 through typexprn.

2.5 Constants

    constant ::= integer-literal
               | float-literal
               | char-literal
               | string-literal
               | cconstr

The syntactic class of constants comprises literals from the four base types (integers, floating-point numbers, characters, character strings), and constant constructors.
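The type expressions described in section 2.4 can be sketched with a few type definitions and type constraints. The type names tree and assoc, their constructors, and the variable names are hypothetical and chosen only for this illustration.

```caml
(* A data type with one parameter: 'a names the parameter. *)
type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree;;

(* A data type with two parameters. *)
type ('a, 'b) assoc = Empty | Binding of 'a * 'b * ('a, 'b) assoc;;

(* Constructed types: a unary and a binary type constructor applied. *)
let t = (Node (Leaf, 1, Leaf) : int tree);;
let table = (Binding ("one", 1, Empty) : (string, int) assoc);;

(* Function type: the arrow associates to the right, so this
   constraint reads int -> (int -> int). *)
let add = ((fun x y -> x + y) : int -> int -> int);;

(* Tuple types bind tighter than the arrow. *)
let swap = ((fun (x, y) -> (y, x)) : int * string -> string * int);;

(* Type constructor application binds tighter than tuple types:
   the constraint reads (int list) * char. *)
let pair = (([1; 2], `a`) : int list * char);;
```

Each constraint has the form ( expr : typexpr ); the constraints here only restate types that the definitions already have, to show how the precedence table is applied when reading a type expression.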
2.6 Patterns

    pattern ::= ident
              | _
              | pattern as ident
              | ( pattern )
              | ( pattern : typexpr )
              | pattern | pattern
              | constant
              | ncconstr pattern
              | pattern , pattern {, pattern}
              | { label = pattern {; label = pattern} }
              | [ ]
              | [ pattern {; pattern} ]
              | pattern :: pattern

The table below shows the relative precedences and associativity of operators and non-closed pattern constructions. The constructions with higher precedences come first.

    ----------------------------------------
    |Operator               |Associativity |
    ----------------------------------------
    |Constructor application|--            |
    |::                     |right         |
    |,                      |--            |
    ||                      |left          |
    |as                     |--            |
    ----------------------------------------

Patterns are templates that allow selecting data structures of a given shape, and binding identifiers to components of the data structure. This selection operation is called pattern matching; its outcome is either ``this value does not match this pattern'', or ``this value matches this pattern, resulting in the following bindings of identifiers to values''.

Variable patterns

A pattern that consists of an identifier matches any value, binding the identifier to the value. The pattern _ also matches any value, but does not bind any identifier.

Alias patterns

The pattern pattern_1 as ident matches the same values as pattern_1. If the matching against pattern_1 is successful, the identifier ident is bound to the matched value, in addition to the bindings performed by the matching against pattern_1.

Parenthesized patterns

The pattern ( pattern_1 ) matches the same values as pattern_1. A type constraint can appear in a parenthesized pattern, as in ( pattern_1 : typexpr ).
This constraint forces the type of pattern_1 to be compatible with typexpr.

``Or'' patterns

The pattern pattern_1 | pattern_2 represents the logical ``or'' of the two patterns pattern_1 and pattern_2. A value matches pattern_1 | pattern_2 either if it matches pattern_1 or if it matches pattern_2. The two sub-patterns pattern_1 and pattern_2 must contain no identifiers. Hence no bindings are returned by matching against an ``or'' pattern.

Constant patterns

A pattern consisting of a constant matches the values that are equal to this constant.

Variant patterns

The pattern ncconstr pattern_1 matches all variants whose constructor is equal to ncconstr, and whose argument matches pattern_1.

The pattern pattern_1 :: pattern_2 matches non-empty lists whose heads match pattern_1, and whose tails match pattern_2. This pattern behaves like prefix :: ( pattern_1 , pattern_2 ).

The pattern [ pattern_1 ; ... ; pattern_n ] matches lists of length n whose elements match pattern_1 ... pattern_n, respectively. This pattern behaves like pattern_1 :: ... :: pattern_n :: [].

Tuple patterns

The pattern pattern_1 , ... , pattern_n matches n-tuples whose components match the patterns pattern_1 through pattern_n. That is, the pattern matches the tuple values (v_1, ..., v_n) such that pattern_i matches v_i for i = 1, ..., n.

Record patterns

The pattern { label_1 = pattern_1 ; ... ; label_n = pattern_n } matches records that define at least the labels label_1 through label_n, and such that the value associated to label_i matches the pattern pattern_i, for i = 1, ..., n.
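The pattern forms described in this section can be sketched as follows. This is a minimal illustration, not part of the language definition; the type point and the functions first_two, is_small, on_axis and coord_sum are hypothetical names.

```caml
type point = { px : int; py : int; tag : string };;

(* List, wildcard and alias patterns: the alias l names the whole list. *)
let first_two = function
    (a :: b :: _) as l -> (a, b, l)
  | l -> (0, 0, l);;

(* An ``or'' pattern built from constants; it binds no identifier. *)
let is_small = function
    0 | 1 | 2 -> true
  | _ -> false;;

(* Tuple patterns with constant components. *)
let on_axis = function
    (0, _) -> true
  | (_, 0) -> true
  | _ -> false;;

(* Record pattern: only px and py are inspected, tag is not mentioned. *)
let coord_sum = function
    { px = x; py = y } -> x + y;;
```

In first_two the second row is reached only for lists with fewer than two elements, since the first row that matches is the one selected.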
The record value can define more labels than label_1 ... label_n; the values associated to these extra labels are not taken into account for matching.

2.7 Expressions

    expr ::= ident
           | variable
           | constant
           | ( expr )
           | begin expr end
           | ( expr : typexpr )
           | expr , expr {, expr}
           | ncconstr expr
           | expr :: expr
           | [ expr {; expr} ]
           | [| expr {; expr} |]
           | { label = expr {; label = expr} }
           | expr expr
           | prefix-op expr
           | expr infix-op expr
           | expr . label
           | expr . label <- expr
           | expr .( expr )
           | expr .( expr ) <- expr
           | expr & expr
           | expr or expr
           | if expr then expr [else expr]
           | while expr do expr done
           | for ident = expr (to | downto) expr do expr done
           | expr ; expr
           | match expr with simple-matching
           | fun multiple-matching
           | function simple-matching
           | try expr with simple-matching
           | let [rec] let-binding {and let-binding} in expr

    simple-matching ::= pattern -> expr {| pattern -> expr}

    multiple-matching ::= pattern-list -> expr {| pattern-list -> expr}

    pattern-list ::= pattern {pattern}

    let-binding ::= pattern = expr
                  | variable pattern-list = expr

    prefix-op ::= - | -. | !

    infix-op ::= + | - | * | / | mod | +. | -. | *. | /. | @ | ^ | ! | :=
               | = | <> | == | != | < | <= | > | >= | <. | <=. | >. | >=.

The table below shows the relative precedences and associativity of operators and non-closed constructions. The constructions with higher precedence come first.
    ---------------------------------------------
    |Construction or operator    |Associativity |
    ---------------------------------------------
    |!                           |--            |
    |. .(                        |--            |
    |function application        |right         |
    |constructor application     |--            |
    |- -. (prefix)               |--            |
    |mod                         |left          |
    |* *. / /.                   |left          |
    |+ +. - -.                   |left          |
    |::                          |right         |
    |@ ^                         |right         |
    |comparisons (= == < etc.)   |left          |
    |not                         |--            |
    |&                           |left          |
    |or                          |left          |
    |,                           |--            |
    |<- :=                       |right         |
    |if                          |--            |
    |;                           |right         |
    |let match fun function try  |--            |
    ---------------------------------------------

2.7.1 Simple expressions

Constants

Expressions consisting of a constant evaluate to this constant.

Variables

Expressions consisting of a variable evaluate to the value bound to this variable in the current evaluation environment. The variable can be either a qualified identifier or a simple identifier. Qualified identifiers always denote global variables. Simple identifiers denote either a local variable, if the identifier is locally bound, or a global variable, whose full name is obtained by qualifying the simple identifier, as described in section 2.2.

Parenthesized expressions

The expressions ( expr ) and begin expr end have the same value as expr. Both constructs are semantically equivalent, but it is good style to use begin...end inside control structures:

    if ... then begin ... ; ... end else begin ... ; ... end

and (...) for the other grouping situations.

Parenthesized expressions can contain a type constraint, as in ( expr : typexpr ). This constraint forces the type of expr to be compatible with typexpr.

Function abstraction

The most general form of function abstraction is:

    fun pattern_1^1 ... pattern_1^m -> expr_1
      | ...
      | pattern_n^1 ... pattern_n^m -> expr_n

This expression evaluates to a functional value with m curried arguments. When this function is applied to m values v_1 ...
v_m, the values are matched against each pattern row pattern_i^1 ... pattern_i^m for i from 1 to n. If one of these matchings succeeds, that is if the value v_j matches the pattern pattern_i^j for all j = 1, ..., m, then the expression expr_i associated to the selected pattern row is evaluated, and its value becomes the value of the function application. The evaluation of expr_i takes place in an environment enriched by the bindings performed during the matching. If several pattern rows match the arguments, the one that occurs first in the function definition is selected. If none of the pattern rows matches the arguments, the exception Match_failure is raised.

If the function above is applied to fewer than m arguments, a functional value is returned, which represents the partial application of the function to the arguments provided. This partial application is a function that, when applied to the remaining arguments, matches all arguments against the pattern rows as described above. Matching does not start until all m arguments have been provided to the function; hence, partial applications of the function to fewer than m arguments never raise Match_failure.

All pattern rows in the function body must contain the same number of patterns. A variable must not be bound more than once in one pattern row.

Functions with only one argument can be defined with the function keyword instead of fun:

    function pattern_1 -> expr_1
           | ...
           | pattern_n -> expr_n

The function thus defined behaves exactly as described above. The only difference between the two forms of function definition is how a parsing ambiguity is resolved. The two forms cconstr pattern (two patterns in a row) and ncconstr pattern (one pattern) cannot be distinguished syntactically.
Function definitions introduced by fun resolve the ambiguity to the former form; function definitions introduced by function resolve it to the latter form (the former form makes no sense in this case).

Function application

Function application is denoted by juxtaposition of expressions. The expression expr_1 expr_2 ... expr_n evaluates the expressions expr_1 to expr_n. The expression expr_1 must evaluate to a functional value, which is then applied to the values of expr_2, ..., expr_n. The order in which the expressions expr_1, ..., expr_n are evaluated is not specified.

Local definitions

The let and let rec constructs bind variables locally. The construct

    let pattern_1 = expr_1 and ... and pattern_n = expr_n in expr

evaluates expr_1 ... expr_n in some unspecified order, then matches their values against the patterns pattern_1 ... pattern_n. If the matchings succeed, expr is evaluated in the environment enriched by the bindings performed during matching, and the value of expr is returned as the value of the whole let expression. If one of the matchings fails, the exception Match_failure is raised.

An alternate syntax is provided to bind variables to functional values: instead of writing

    ident = fun pattern_1 ... pattern_m -> expr

in a let expression, one may instead write

    ident pattern_1 ... pattern_m = expr

Both forms bind ident to the curried function with m arguments and only one case, pattern_1 ... pattern_m -> expr.
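A minimal sketch of function abstraction, partial application and local definitions as described above. The names add, add_ten, is_zero, scale and total are hypothetical and serve only as an illustration.

```caml
(* A curried function with two arguments and a single pattern row. *)
let add = fun x y -> x + y;;

(* Partial application: add_ten is itself a function awaiting y. *)
let add_ten = add 10;;

(* A one-argument function defined by cases; the first matching row wins. *)
let is_zero = function
    0 -> true
  | _ -> false;;

(* Alternate syntax, equivalent to  let scale = fun k x -> k * x. *)
let scale k x = k * x;;

(* let ... and ... in: a and b are visible only in the body a + b. *)
let total =
  let a = add_ten 5
  and b = scale 2 3 in
  a + b;;
```

Here total is bound to 21, the sum of add_ten 5 and scale 2 3. Because the order of evaluation of the two local definitions is unspecified, the sketch deliberately uses expressions without side effects.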
Recursive definitions of variables are introduced by let rec:

    let rec pattern_1 = expr_1 and ... and pattern_n = expr_n in expr

The only difference from the let construct described above is that the bindings of variables to values performed by the pattern-matching are considered already performed when the expressions expr_1 to expr_n are evaluated. That is, the expressions expr_1 to expr_n can reference identifiers that are bound by one of the patterns pattern_1, ..., pattern_n, and expect them to have the same value as in expr, the body of the let rec construct.

The recursive definition is guaranteed to behave as described above if the expressions expr_1 to expr_n are function definitions (fun ... or function ...), and the patterns pattern_1 ... pattern_n each consist of a single variable, as in:

    let rec ident_1 = fun ... and ... and ident_n = fun ... in expr

This defines ident_1 ... ident_n as mutually recursive functions local to expr. The behavior of other forms of let rec definitions is implementation-dependent.

2.7.2 Control constructs

Sequence

The expression expr_1 ; expr_2 evaluates expr_1 first, then expr_2, and returns the value of expr_2.

Conditional

The expression if expr_1 then expr_2 else expr_3 evaluates to the value of expr_2 if expr_1 evaluates to the boolean true, and to the value of expr_3 if expr_1 evaluates to the boolean false. The else expr_3 part can be omitted, in which case it defaults to else ().

Case expression

The expression

    match expr
    with pattern_1 -> expr_1
       | ...
       | pattern_n -> expr_n

matches the value of expr against the patterns pattern_1 to pattern_n. If the matching against pattern_i succeeds, the associated expression expr_i is evaluated, and its value becomes the value of the whole match expression.
The evaluation of expr_i takes place in an environment enriched by the bindings performed during matching. If several patterns match the value of expr, the one that occurs first in the match expression is selected. If none of the patterns matches the value of expr, the exception Match_failure is raised.

Boolean operators

The expression expr_1 & expr_2 evaluates to true if both expr_1 and expr_2 evaluate to true; otherwise, it evaluates to false. The first component, expr_1, is evaluated first. The second component, expr_2, is not evaluated if the first component evaluates to false. Hence, the expression expr_1 & expr_2 behaves exactly as

    if expr_1 then expr_2 else false.

The expression expr_1 or expr_2 evaluates to true if one of expr_1 and expr_2 evaluates to true; otherwise, it evaluates to false. The first component, expr_1, is evaluated first. The second component, expr_2, is not evaluated if the first component evaluates to true. Hence, the expression expr_1 or expr_2 behaves exactly as

    if expr_1 then true else expr_2.

Loops

The expression while expr_1 do expr_2 done repeatedly evaluates expr_2 while expr_1 evaluates to true. The loop condition expr_1 is evaluated and tested at the beginning of each iteration. The whole while...done expression evaluates to the unit value ().

The expression for ident = expr_1 to expr_2 do expr_3 done first evaluates the expressions expr_1 and expr_2 (the boundaries) into integer values n and p. Then, the loop body expr_3 is repeatedly evaluated in an environment where the local variable named ident is successively bound to the values n, n+1, ..., p-1, p. The loop body is never evaluated if n > p.

The expression for ident = expr_1 downto expr_2 do expr_3 done first evaluates the expressions expr_1 and expr_2 (the boundaries) into integer values n and p.
Then, the loop body expr_3 is repeatedly evaluated in an environment where the local variable named ident is successively bound to the values n, n-1, ..., p+1, p. The loop body is never evaluated if n < p.

Exception handling

The expression

    try expr
    with pattern_1 -> expr_1
       | ...
       | pattern_n -> expr_n

evaluates the expression expr and returns its value if the evaluation of expr does not raise any exception. If the evaluation of expr raises an exception, the exception value is matched against the patterns pattern_1 to pattern_n. If the matching against pattern_i succeeds, the associated expression expr_i is evaluated, and its value becomes the value of the whole try expression. The evaluation of expr_i takes place in an environment enriched by the bindings performed during matching. If several patterns match the exception value, the one that occurs first in the try expression is selected. If none of the patterns matches the exception value, the exception value is raised again, thereby transparently ``passing through'' the try construct.

2.7.3 Operations on data structures

Products

The expression expr_1 , ... , expr_n evaluates to the n-tuple of the values of expressions expr_1 to expr_n. The evaluation order for the subexpressions is not specified.

Variants

The expression ncconstr expr evaluates to the variant value whose constructor is ncconstr, and whose argument is the value of expr.

For lists, some syntactic sugar is provided. The expression expr_1 :: expr_2 stands for the constructor prefix :: applied to the argument ( expr_1 , expr_2 ), and therefore evaluates to the list whose head is the value of expr_1 and whose tail is the value of expr_2. The expression [ expr_1 ; ... ; expr_n ] is equivalent to expr_1 :: ... :: expr_n :: [], and therefore evaluates to the list whose elements are the values of expr_1 to expr_n.
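The exception handling construct and the list notation can be illustrated by the following sketch. The exception Empty_list and the names head, head_or, same_list, summary and passes_through are hypothetical, introduced only for the example; Failure is assumed to be the predefined exception of the initial environment.

```caml
exception Empty_list;;

(* Raises Empty_list on the empty list, returns the head otherwise. *)
let head = function
    [] -> raise Empty_list
  | x :: _ -> x;;

(* try ... with returns the value of its body when no exception is
   raised, and the value of the matching handler otherwise. *)
let head_or default l =
  try head l with Empty_list -> default;;

(* [1; 2; 3] is sugar for 1 :: 2 :: 3 :: []. *)
let same_list = ([1; 2; 3] = 1 :: 2 :: 3 :: []);;

(* A product: a pair built from two applications of head_or. *)
let summary = (head_or 0 [7; 8], head_or 0 []);;

(* The inner handler does not match Empty_list, so the exception
   passes through it and is caught by the outer handler. *)
let passes_through =
  try (try head [] with Failure _ -> 1) with Empty_list -> 2;;
```

With these definitions summary is the pair (7, 0) and passes_through is 2.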
Records

The expression { label_1 = expr_1 ; ... ; label_n = expr_n } evaluates to the record value { label_1 = v_1 ; ... ; label_n = v_n }, where v_i is the value of expr_i for i = 1, ..., n. The labels label_1 to label_n must all belong to the same record type; all labels belonging to this record type must appear exactly once in the record expression, though they can appear in any order. The order in which expr_1 to expr_n are evaluated is not specified.

The expression expr_1 . label evaluates expr_1 to a record value, and returns the value associated to label in this record value.

The expression expr_1 . label <- expr_2 evaluates expr_1 to a record value, which is then modified in-place by replacing the value associated to label in this record by the value of expr_2. This operation is permitted only if label has been declared mutable in the definition of the record type. The whole expression expr_1 . label <- expr_2 evaluates to the unit value ().

Arrays

The expression [| expr_1 ; ... ; expr_n |] evaluates to an n-element array, whose elements are initialized with the values of expr_1 to expr_n respectively. The order in which these expressions are evaluated is unspecified.

The expression expr_1 .( expr_2 ) is equivalent to the application vect_item expr_1 expr_2. In the initial environment, the identifier vect_item resolves to a built-in function that returns the value of element number expr_2 in the array denoted by expr_1. The first element has number 0; the last element has number n-1, where n is the size of the array. The exception Invalid_argument is raised if the access is out of bounds.

The expression expr_1 .( expr_2 ) <- expr_3 is equivalent to vect_assign expr_1 expr_2 expr_3. In the initial environment, the identifier vect_assign resolves to a built-in function that modifies in-place the array
denoted by expr_1, replacing element number expr_2 by the value of expr_3. The exception Invalid_argument is raised if the access is out of bounds. The built-in function returns (). Hence, the whole expression expr_1 .( expr_2 ) <- expr_3 evaluates to the unit value ().

This behavior of the two constructs expr_1 .( expr_2 ) and expr_1 .( expr_2 ) <- expr_3 may change if the meaning of the identifiers vect_item and vect_assign is changed, either by redefinition or by modification of the list of opened modules. See the discussion below on operators.

2.7.4 Operators

The operators written infix-op in the grammar above can appear in infix position (between two expressions). The operators written prefix-op in the grammar above can appear in prefix position (in front of an expression).

The expression prefix-op expr is interpreted as the application ident expr, where ident is the identifier associated to the operator prefix-op in the table below. Similarly, the expression expr_1 infix-op expr_2 is interpreted as the application ident expr_1 expr_2, where ident is the identifier associated to the operator infix-op in the table below. The identifiers written ident above are then evaluated following the rules in section 2.7.1. In the initial environment, they evaluate to built-in functions whose behavior is described in the table. The behavior of the constructions prefix-op expr and expr_1 infix-op expr_2 may change if the meaning of the identifiers associated to prefix-op or infix-op is changed, either by redefinition of the identifiers, or by modification of the list of opened modules, through the #open and #close directives.
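The record and array update forms of section 2.7.3, together with a few operators, can be sketched as follows. The sketch assumes the initial environment, that is, no redefinition of the operators or of the array access functions. The type account and all value names are hypothetical.

```caml
type account = { owner : string; mutable balance : int };;

let acct = { owner = "ada"; balance = 10 };;

(* Field update is allowed because balance was declared mutable;
   the update expression itself evaluates to (). *)
acct.balance <- acct.balance + 5;;

let cells = [| 1; 2; 3 |];;

(* Elements are numbered from 0, so this overwrites the first element. *)
cells.(0) <- cells.(1) * cells.(2);;

(* Index 3 is out of bounds for a 3-element array. *)
let out_of_bounds =
  try cells.(3) with Invalid_argument _ -> 0 - 1;;

(* Structural equality compares contents; physical equality compares
   the objects themselves. *)
let r1 = ref 0;;
let r2 = ref 0;;
let same_contents = (r1 = r2);;
let same_object = (r1 == r2);;

(* := updates a reference, ! reads it back. *)
r1 := !r1 + 1;;

let greeting = "Caml" ^ " " ^ "Light";;
let joined = [1; 2] @ [3];;
```

After these phrases acct.balance is 15, cells.(0) is 6, same_contents is true and same_object is false, since r1 and r2 are two distinct references that happened to hold equal contents when compared.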
    ----------------------------------------------------------------------------
    |Operator    |Associated  |Behavior in the default environment             |
    |            |identifier  |                                                |
    ----------------------------------------------------------------------------
    |+           |prefix +    |Integer addition.                               |
    |- (infix)   |prefix -    |Integer subtraction.                            |
    |- (prefix)  |minus       |Integer negation.                               |
    |*           |prefix *    |Integer multiplication.                         |
    |/           |prefix /    |Integer division. Raises Division_by_zero if    |
    |            |            |second argument is zero. The result is          |
    |            |            |unspecified if either argument is negative.     |
    |mod         |prefix mod  |Integer modulus. Raises Division_by_zero if     |
    |            |            |second argument is zero. The result is          |
    |            |            |unspecified if either argument is negative.     |
    |+.          |prefix +.   |Floating-point addition.                        |
    |-. (infix)  |prefix -.   |Floating-point subtraction.                     |
    |-. (prefix) |minus_float |Floating-point negation.                        |
    |*.          |prefix *.   |Floating-point multiplication.                  |
    |/.          |prefix /.   |Floating-point division. Raises                 |
    |            |            |Division_by_zero if second argument is zero.    |
    |@           |prefix @    |List concatenation.                             |
    |^           |prefix ^    |String concatenation.                           |
    |!           |prefix !    |Dereferencing (return the current contents of   |
    |            |            |a reference).                                   |
    |:=          |prefix :=   |Reference assignment (update the reference      |
    |            |            |given as first argument with the value of the   |
    |            |            |second argument).                               |
    |=           |prefix =    |Structural equality test.                       |
    |<>          |prefix <>   |Structural inequality test.                     |
    |==          |prefix ==   |Physical equality test.                         |
    |!=          |prefix !=   |Physical inequality test.                       |
    |<           |prefix <    |Test ``less than'' on integers.                 |
    |<=          |prefix <=   |Test ``less than or equal'' on integers.        |
    |>           |prefix >    |Test ``greater than'' on integers.              |
    |>=          |prefix >=   |Test ``greater than or equal'' on integers.     |
    |<.          |prefix <.   |Test ``less than'' on floating-point numbers.   |
    |<=.         |prefix <=.  |Test ``less than or equal'' on floating-point   |
    |            |            |numbers.                                        |
    |>.          |prefix >.   |Test ``greater than'' on floating-point numbers.|
    |>=.         |prefix >=.  |Test ``greater than or equal'' on               |
    |            |            |floating-point numbers.                         |
    ----------------------------------------------------------------------------

The behavior of the +, -, *, /, mod, +., -., *. or /. operators is unspecified if the result falls outside of the range of representable integers or floating-point numbers, respectively. See chapter 13 for a more precise description of the behavior of the operators above.

2.8 Global definitions

This section describes the constructs that bind global identifiers (value variables, value constructors, type constructors, record labels).

2.8.1 Type definitions

    type-definition ::= type typedef {and typedef}

    typedef ::= type-params ident = constr-decl {| constr-decl}
              | type-params ident = { label-decl {; label-decl} }
              | type-params ident == typexpr
              | type-params ident

    type-params ::= nothing
                  | ' ident
                  | ( ' ident {, ' ident} )

    constr-decl ::= ident
                  | ident of typexpr

    label-decl ::= ident : typexpr
                 | mutable ident : typexpr

Type definitions bind type constructors to data types: either variant types, record types, type abbreviations, or abstract data types.

Type definitions are introduced by the type keyword, and consist of one or several simple definitions, possibly mutually recursive, separated by the and keyword. Each simple definition defines one type constructor.

A simple definition consists of an identifier, possibly preceded by one or several type parameters, and followed by a data type description. The identifier is the local name of the type constructor being defined. (The module name for this type constructor is the name of the module being compiled.)
The optional type parameters are either one type variable ' ident, for type constructors with one parameter, or a list of type variables (' ident_1, ..., ' ident_n), for type constructors with several parameters. These type parameters can appear in the type expressions of the right-hand side of the definition.

Variant types

The type definition type-params ident = constr-decl_1 | ... | constr-decl_n defines a variant type. The constructor declarations constr-decl_1, ..., constr-decl_n describe the constructors associated to this variant type. The constructor declaration ident of typexpr declares the local name ident (in the module being compiled) as a non-constant constructor, whose argument has type typexpr. The constructor declaration ident declares the local name ident (in the module being compiled) as a constant constructor.

Record types

The type definition type-params ident = { label-decl_1 ; ... ; label-decl_n } defines a record type. The label declarations label-decl_1, ..., label-decl_n describe the labels associated to this record type. The label declaration ident : typexpr declares the local name ident in the module being compiled as a label, whose argument has type typexpr. The label declaration mutable ident : typexpr behaves similarly; in addition, it allows physical modification over the argument to this label.

Type abbreviations

The type definition type-params ident == typexpr defines the type constructor ident as an abbreviation for the type expression typexpr.

Abstract types

The type definition type-params ident defines ident as an abstract type.
When appearing in a module interface, this definition allows exporting a type constructor while hiding how it is represented in the module implementation.

2.8.2 Exception definitions

    exception-definition ::= exception constr-decl {and constr-decl}

Exception definitions add new constructors to the built-in variant type exn of exception values. The constructors are declared as for a definition of a variant type.

2.9 Directives

    directive ::= # open string
                | # close string
                | # ident string

Directives control the behavior of the compiler. They apply to the remainder of the current compilation unit.

The two directives #open and #close modify the list of opened modules, which the compiler uses to complete unqualified identifiers, as described in section 2.2. The directive #open string adds the module whose name is given by the string constant string to the list of opened modules, in first position. The directive #close string removes the first occurrence of the module whose name is given by the string constant string from the list of opened modules.

Implementations can provide other directives, provided they follow the syntax # ident string, where ident is the name of the directive, and the string constant string is the argument to the directive. The behavior of these additional directives is implementation-dependent.

2.10 Module implementations

    implementation ::= {impl-phrase ;;}

    impl-phrase ::= expr
                  | value-definition
                  | type-definition
                  | exception-definition
                  | directive

    value-definition ::= let [rec] let-binding {and let-binding}

A module implementation consists of a sequence of implementation phrases, terminated by double semicolons.
An implementation phrase is either an expression, a value definition, a type or exception definition, or a directive. At run-time, implementation phrases are evaluated sequentially, in the order in which they appear in the module implementation.

Implementation phrases consisting of an expression are evaluated for their side-effects.

Value definitions bind global value variables in the same way as a let...in... expression binds local variables. The expressions are evaluated, and their values are matched against the left-hand sides of the = signs, as explained in section 2.7.1. If the matching succeeds, the bindings of identifiers to values performed during matching are interpreted as bindings to the global value variables whose local name is the identifier, and whose module name is the name of the module. If the matching fails, the exception Match_failure is raised. The scope of these bindings is the phrases that follow the value definition in the module implementation.

Type and exception definitions introduce type constructors, variant constructors and record labels as described in sections 2.8.1 and 2.8.2. The scope of these definitions is the phrases that follow the definition in the module implementation. The evaluation of an implementation phrase consisting of a type or exception definition produces no effect at run-time.

Directives modify the behavior of the compiler on the subsequent phrases of the module implementation, as described in section 2.9. The evaluation of an implementation phrase consisting of a directive produces no effect at run-time. Directives apply only to the module currently being compiled; in particular, they have no effect on other modules that refer to globals exported by the module being compiled.
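As an illustration, a small module implementation might read as follows: two type definitions, an exception definition, three value definitions and a final expression phrase, each terminated by a double semicolon. This is only a sketch; the types shape and stats, the exception Bad_shape and the values counter, area and checked_area are hypothetical.

```caml
(* Type definitions: a variant type and a record type. *)
type shape =
    Square of int
  | Rect of int * int;;

type stats = { mutable calls : int };;

(* Exception definition: adds a constructor to the type exn. *)
exception Bad_shape of string;;

(* Value definitions: their scope is the phrases that follow. *)
let counter = { calls = 0 };;

let area = function
    Square side -> side * side
  | Rect (w, h) -> w * h;;

let checked_area s =
  counter.calls <- counter.calls + 1;
  let a = area s in
  if a < 0 then raise (Bad_shape "negative area") else a;;

(* Expression phrase: evaluated for its side effect only. *)
print_int (checked_area (Rect (3, 4)));;
```

The phrases are evaluated in order, so the last one can use every name defined before it; here it prints 12 and leaves counter.calls equal to 1.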
2.11 Module interfaces

    interface ::= {intf-phrase ;;}

    intf-phrase ::= value-declaration
                  | type-definition
                  | exception-definition
                  | directive

    value-declaration ::= value ident : typexpr {and ident : typexpr}

Module interfaces declare the global objects (value variables, type constructors, variant constructors, record labels) that a module exports, that is, makes available to other modules. Other modules can refer to these globals using qualified identifiers or the #open directive, as explained in section 2.2.

A module interface consists of a sequence of interface phrases, terminated by double semicolons. An interface phrase is either a value declaration, a type definition, an exception definition, or a directive.

Value declarations declare global value variables that are exported by the module implementation, and the types with which they are exported. The module implementation must define these variables, with types at least as general as the types declared in the interface. The scope of the bindings for these global variables extends from the module implementation itself to all modules that refer to those variables.

Type or exception definitions introduce type constructors, variant constructors and record labels as described in sections 2.8.1 and 2.8.2. Exception definitions and type definitions that are not abstract type declarations also take effect in the module implementation; that is, the type constructors, variant constructors and record labels they define are considered bound on entrance to the module implementation, and can be referred to by the implementation phrases. Type definitions that are not abstract type declarations must not be redefined in the module implementation.
In contrast, the type constructors that are declared abstract in a module interface must be defined in the module implementation, with the same names.

Directives modify the behavior of the compiler on the subsequent phrases of the module interface, as described in section 2.9. Directives apply only to the interface currently being compiled; in particular, they have no effect on other modules that refer to globals exported by the interface being compiled.

Chapter 3 Language extensions

This chapter describes the language features that are implemented in Caml Light, but not described in the Caml Light reference manual. In contrast with the fairly stable kernel language that is described in the reference manual, the extensions presented here are still experimental, and may be removed or changed in the future.

3.1 Streams, parsers, and printers

Caml Light includes a built-in type for streams (possibly infinite sequences of elements, that are evaluated on demand), and associated stream expressions, to build streams, and stream patterns, to destructure streams. Streams and stream patterns provide a natural approach to the writing of recursive-descent parsers.

Streams are presented by the following extensions to the syntactic classes of expressions:

expr ::= ...
       | [< >]
       | [< stream-component {; stream-component} >]
       | function stream-matching
       | match expr with stream-matching
stream-component ::= ' expr
                   | expr
stream-matching ::= stream-pattern -> expr {| stream-pattern -> expr}
stream-pattern ::= [< >]
                 | [< stream-comp-pat {; stream-comp-pat} >]
stream-comp-pat ::= ' pattern
                  | expr pattern
                  | ident

Stream expressions are bracketed by [< and >].
They represent the concatenation of their components. The component ' expr represents the one-element stream whose element is the value of expr. The component expr represents a sub-stream. For instance, if both s and t are streams of integers, then [<'1; s; t; '2>] is a stream of integers containing the element 1, then the elements of s, then those of t, and finally 2. The empty stream is denoted by [< >].

Unlike any other kind of expression in the language, stream expressions are subject to lazy evaluation: the components are not evaluated when the stream is built, but only when they are accessed during stream matching. The components are evaluated once, the first time they are accessed; the following accesses reuse the value computed the first time.

Stream patterns, also bracketed by [< and >], describe initial segments of streams. In particular, the stream pattern [< >] matches all streams. Stream pattern components are matched against the corresponding elements of a stream. The component ' pattern matches the corresponding stream element against the pattern. The component expr pattern applies the function denoted by expr to the current stream, then matches the result of the function against pattern. Finally, the component ident simply binds the identifier to the stream being matched. (The current implementation limits ident to appear last in a stream pattern.)

Stream matching proceeds destructively: once a component has been matched, it is discarded from the stream (by in-place modification).

Stream matching proceeds in two steps: first, a pattern is selected by matching the stream against the first components of the stream patterns; then, the following components of the selected pattern are checked against the stream. If the following components do not match, the exception Parse_error is raised.
There is no backtracking here: stream matching commits to the pattern selected according to the first element. If none of the first components of the stream patterns match, the exception Parse_failure is raised. The Parse_failure exception causes the next alternative to be tried, if it occurs during the matching of the first element of a stream, before matching has committed to one pattern.

See Functional programming using Caml Light for a gentler introduction to streams, and for some examples of their use in writing parsers. A more formal presentation of streams, and a discussion of alternate semantics, can be found in Parsers in ML by Michel Mauny and Daniel de Rauglaudre, in the proceedings of the 1992 ACM conference on Lisp and Functional Programming.

3.2 Guards

Cases of a pattern matching can include guard expressions, which are arbitrary boolean expressions that must evaluate to true for the match case to be selected. Guards occur just before the -> token and are introduced by the when keyword:

    match expr with
        pattern_1 [when cond_1] -> expr_1
      | ...
      | pattern_n [when cond_n] -> expr_n

(Same syntax for the fun, function, and try ...with constructs.)

During matching, if the value of expr matches some pattern pattern_i which has a guard cond_i, then the expression cond_i is evaluated (in an environment enriched by the bindings performed during matching). If cond_i evaluates to true, then expr_i is evaluated and its value returned as the result of the matching, as usual. But if cond_i evaluates to false, the matching is resumed against the patterns following pattern_i.

3.3 Range patterns

In patterns, Caml Light recognizes the form `c` .. `d` (two character constants separated by ..) as a shorthand for the pattern

    `c` | `c1` | `c2` | ... | `cn` | `d`

where c1, c2, ..., cn are the characters that occur between c and d in the ASCII character set. For instance, the pattern `0`..`9` matches all characters that are digits.

3.4 Recursive definitions of values

Besides let rec definitions of functional values, as described in the reference manual, Caml Light supports a certain class of recursive definitions of non-functional values. For instance, the following definition is accepted:

    let rec x = 1 :: y and y = 2 :: x;;

and correctly binds x to the cyclic list 1::2::1::2::..., and y to the cyclic list 2::1::2::1::... Informally, the class of accepted definitions consists of those definitions where the defined variables occur only inside function bodies or as a field of a data structure. Moreover, the patterns in the left-hand sides must be identifiers, nothing more complex.

3.5 Local definitions using where

A postfix syntax for local definitions is provided:

expr ::= ...
       | expr where [rec] let-binding

The expression expr where let-binding behaves exactly as let let-binding in expr, and similarly for where rec and let rec.

3.6 Mutable variant types

The argument of a value constructor can be declared ``mutable'' when the variant type is defined:

    type foo = A of mutable int
             | B of mutable int * int
             | ...

This allows in-place modification of the argument part of a constructed value. Modification is performed by a new kind of expression, written ident <- expr, where ident is an identifier bound by pattern-matching to the argument of a mutable constructor, and expr denotes the value that must be stored in place of that argument. Continuing the example above:

    let x = A 1 in
      begin match x with A y -> y <- 2 | _ -> () end;
      x

returns the value A 2.
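
As a further illustration of mutable constructor arguments, here is a minimal sketch of a counter updated in place. The type counter and the functions bump and value_of are hypothetical names chosen for this example; they are not part of the Caml Light library.

```caml
(* A hypothetical counter type with a mutable constructor argument. *)
type counter = Count of mutable int;;

(* Increment the argument in place; the assignment has type unit. *)
let bump c =
  match c with Count n -> n <- n + 1;;

(* Read the current value of the counter. *)
let value_of c =
  match c with Count n -> n;;

let c = Count 0;;
bump c; bump c;;
value_of c;;   (* evaluates to 2 *)
```

Because the modification is done in place, every holder of the value c observes the new argument, much as with a mutable record field.
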

The notation ident <- expr also works if ident is an identifier bound by pattern-matching to the value of a mutable field in a record. For instance,

    type bar = {mutable lbl : int};;
    let x = {lbl = 1} in
      begin match x with {lbl = y} -> y <- 2 end;
      x

returns the value {lbl = 2}.

3.7 String access

Extra syntactic constructs are provided to access and modify characters in strings:

expr ::= ...
       | expr .[ expr ]
       | expr .[ expr ] <- expr

The expression expr1 .[ expr2 ] is equivalent to the application nth_char expr1 expr2. In the initial environment, the identifier nth_char resolves to a built-in function that returns the character number expr2 in the string denoted by expr1. The first element has number 0; the last element has number n-1, where n is the length of the string. The exception Invalid_argument is raised if the access is out of bounds.

The expression expr1 .[ expr2 ] <- expr3 is equivalent to set_nth_char expr1 expr2 expr3. In the initial environment, the identifier set_nth_char resolves to a built-in function that modifies in-place the string denoted by expr1, replacing character number expr2 by the value of expr3. The exception Invalid_argument is raised if the access is out of bounds. The built-in function returns ().

3.8 Alternate syntax

The syntax of some constructs has been slightly relaxed:

- An optional ; may terminate a sequence, list expression, or record expression. For instance, begin e1 ; e2 ; end is syntactically correct and synonymous with begin e1 ; e2 end.

- Similarly, an optional | may begin a pattern-matching expression. For instance, function | pat1 -> expr1 |... is syntactically correct and synonymous with function pat1 -> expr1 |....

- The tokens && and || are recognized as synonymous for & (sequential ``and'') and or (sequential ``or''), respectively.
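
The string access constructs of section 3.7 and the relaxed syntax listed above can be sketched together as follows. The names s, c, safe_get, is_vowel and flags are hypothetical; the bounds check shown assumes the default cautious module set, since the fast set omits such checks.

```caml
(* Build a three-character string and modify it in place. *)
let s = make_string 3 `a`;;
s.[1] <- `b`;;                 (* same as set_nth_char s 1 `b` *)
let c = s.[1];;                (* same as nth_char s 1; c is `b` *)

(* An out-of-bounds access raises Invalid_argument. *)
let safe_get str i =
  try str.[i] with Invalid_argument _ -> `?`;;

(* Relaxed syntax: a leading | in a matching, a trailing ; in a list,
   and && / || as synonyms for & / or. *)
let is_vowel = function
  | `a` | `e` | `i` | `o` | `u` -> true
  | _ -> false;;

let flags = [is_vowel c || c = `b`; is_vowel `a` && is_vowel `z`;];;
```

Here safe_get s 10 returns the placeholder character instead of letting the exception escape, and flags is the list [true; false].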

3.9 Infix symbols

Sequences of ``operator characters'', such as <=> or !!, are read as a single token from the infix-symbol or prefix-symbol class:

infix-symbol ::= (= | < | > | @ | ^ | | | & | ~ | + | - | * | / | $ | %) {operator-char}
prefix-symbol ::= (! | ?) {operator-char}
operator-char ::= ! | $ | % | & | * | + | - | . | / | : | ; | < | = | > | ? | @ | ^ | | | ~

Tokens from these two classes generalize the built-in infix and prefix operators described in chapter 2:

expr ::= ...
       | prefix-symbol expr
       | expr infix-symbol expr
variable ::= ...
           | prefix prefix-symbol
           | prefix infix-symbol

No #infix directive (section 3.10) is needed to give infix symbols their infix status. The precedences and associativities of infix symbols in expressions are determined by their first character(s): symbols beginning with ** have highest precedence (exponentiation), followed by symbols beginning with *, / or % (multiplication), then + and - (addition), then @ and ^ (concatenation), then all other symbols (comparisons). The updated precedence table for expressions is shown below. We write ``*...'' to mean ``any infix symbol starting with *''.

Construction or operator                              Associativity
-------------------------------------------------------------------
!... ?...                                             --
. .( .[                                               --
function application                                  right
constructor application                               --
- -. (prefix)                                         --
**...                                                 right
*... /... %... mod                                    left
+... -...                                             left
::                                                    right
@... ^...                                             right
comparisons (= == < etc.), all other infix symbols    left
not                                                   --
& &&                                                  left
or ||                                                 left
,                                                     --
<- :=                                                 right
if                                                    --
;                                                     right
let match fun function try                            --

Some infix and prefix symbols are predefined in the default environment (see chapters 2 and 13 for a description of their behavior). The others are initially unbound and must be bound before use, with a let prefix infix-symbol = expr or let prefix prefix-symbol = expr binding.

3.10 Directives

In addition to the standard #open and #close directives, Caml Light provides three additional directives.

#infix "id"
    Change the lexical status of the identifier id: in the remainder of the compilation unit, id is recognized as an infix operator, just like +. The notation prefix id can be used to refer to the identifier id itself. Expressions of the form expr1 id expr2 are parsed as the application prefix id expr1 expr2. The argument to the #infix directive must be an identifier, that is, a sequence of letters, digits and underscores starting with a letter; otherwise, the #infix declaration has no effect. Example:

        #infix "union";;
        let prefix union = fun x y -> ... ;;
        [1,2] union [3,4];;

#uninfix "id"
    Remove the infix status attached to the identifier id by a previous #infix "id" directive.

#directory "dir-name"
    Add the named directory to the path of directories searched for compiled module interface files. This is equivalent to the -I command-line option to the batch compiler and the toplevel system.

Part III The Caml Light commands

Chapter 4 Batch compilation (camlc)

This chapter describes how Caml Light programs can be compiled non-interactively, and turned into standalone executable files.
This is achieved by the command camlc, which compiles and links Caml Light source files.

Mac: This command is not a standalone Macintosh application. To run camlc, you need the Macintosh Programmer's Workshop (MPW) programming environment. The programs generated by camlc are also MPW tools, not standalone Macintosh applications.

4.1 Overview of the compiler

The camlc command has a command-line interface similar to that of most C compilers. It accepts several types of arguments: source files for module implementations; source files for module interfaces; and compiled module implementations.

- Arguments ending in .mli are taken to be source files for module interfaces. Module interfaces declare exported global identifiers, define public data types, and so on. From the file x.mli, the camlc compiler produces a compiled interface in the file x.zi.

- Arguments ending in .ml are taken to be source files for module implementations. Module implementations bind global identifiers to values, define private data types, and contain expressions to be evaluated for their side-effects. From the file x.ml, the camlc compiler produces compiled object code in the file x.zo. If the interface file x.mli exists, the module implementation x.ml is checked against the corresponding compiled interface x.zi, which is assumed to exist. If no interface x.mli is provided, the compilation of x.ml produces a compiled interface file x.zi in addition to the compiled object code file x.zo. The file x.zi produced corresponds to an interface that exports everything that is defined in the implementation x.ml.

- Arguments ending in .zo are taken to be compiled object code. These files are linked together, along with the object code files obtained by compiling .ml arguments (if any), and the Caml Light standard library, to produce a standalone executable program.
The order in which .zo and .ml arguments are presented on the command line is relevant: global identifiers are initialized in that order at run-time, and it is a link-time error to use a global identifier before having initialized it. Hence, a given x.zo file must come before all .zo files that refer to identifiers defined in the file x.zo.

The output of the linking phase is a file containing compiled code that can be executed by the Caml Light runtime system: the command named camlrun. If caml.out is the name of the file produced by the linking phase, the command

    camlrun caml.out arg1 arg2 ... argn

executes the compiled code contained in caml.out, passing it as arguments the character strings arg1 to argn. (See chapter 6 for more details.)

Unix: On most Unix systems, the file produced by the linking phase can be run directly, as in:

    ./caml.out arg1 arg2 ... argn

The produced file has the executable bit set, and it manages to launch the bytecode interpreter by itself.

PC: The output file produced by the linking phase is directly executable, provided it is given extension .EXE. Hence, if the output file is named caml_out.exe, you can execute it with the command

    caml_out arg1 arg2 ... argn

Actually, the produced file caml_out.exe is a tiny executable file prepended to the bytecode file. The executable simply runs the camlrun runtime system on the remainder of the file. (As a consequence, this is not a standalone executable: it still requires camlrun.exe to reside in one of the directories in the path.)

4.2 Options

The following command-line options are recognized by camlc.

-c
    Compile only. Suppress the linking phase of the compilation. Source code files are turned into compiled files, but no executable file is produced. This option is useful to compile modules separately.

-ccopt option
    Pass the given option to the C compiler and linker, when linking in ``custom runtime'' mode (see the -custom option). For instance, -ccopt -Ldir causes the C linker to search for C libraries in directory dir.

-custom
    Link in ``custom runtime'' mode. In the default linking mode, the linker produces bytecode that is intended to be executed with the shared runtime system, camlrun. In the custom runtime mode, the linker produces an output file that contains both the runtime system and the bytecode for the program. The resulting file is considerably larger, but it can be executed directly, even if the camlrun command is not installed. Moreover, the ``custom runtime'' mode enables linking Caml Light code with user-defined C functions, as described in chapter 12.

    Unix: Never strip an executable produced with the -custom option.

    PC: This option requires the DJGPP port of the GNU C compiler to be installed.

-files response-file
    Process the files whose names are listed in file response-file, just as if these names appeared on the command line. File names in response-file are separated by blanks (spaces, tabs, newlines). This option makes it possible to overcome silly limitations on the length of the command line.

-g
    Cause the compiler to produce additional debugging information. During the linking phase, this option adds information at the end of the executable bytecode file produced. This information is required by the debugger camldebug and also by the catch-all exception handler from the standard library module printexc.

    During the compilation of an implementation file (.ml file), when the -g option is set, the compiler adds debugging information to the .zo file. It also writes a .zix file that describes the full interface of the .ml file, that is, all types and values defined in the .ml file, including those that are local to the .ml file (i.e. not declared in the .mli interface file). Used in conjunction with the -g option to the toplevel system (chapter 5), the .zix file gives access to the local values of the module, making it possible to print or ``trace'' them. The .zix file is not produced if the implementation file has no explicit interface, since, in this case, the module has no local values.

-i
    Cause the compiler to print the declared types, exceptions, and global variables (with their inferred types) when compiling an implementation (.ml file). This can be useful to check the types inferred by the compiler. Also, since the output follows the syntax of module interfaces, it can help in writing an explicit interface (.mli file) for a file: just redirect the standard output of the compiler to a .mli file, and edit that file to remove all declarations of unexported globals.

-I directory
    Add the given directory to the list of directories searched for compiled interface files (.zi) and compiled object code files (.zo). By default, the current directory is searched first, then the standard library directory. Directories added with -I are searched after the current directory, but before the standard library directory. When several directories are added with several -I options on the command line, these directories are searched from right to left (the rightmost directory is searched first, the leftmost is searched last). (Directories can also be added to the search path from inside the programs with the #directory directive; see chapter 3.)

-lang language-code
    Translate the compiler messages to the specified language. The language-code is fr for French, es for Spanish, de for German, ... (See the file camlmsgs.txt in the Caml Light standard library directory for a list of available languages.) When an unknown language is specified, or no translation is available for a message, American English is used by default.

-o exec-file
    Specify the name of the output file produced by the linker.

    Unix: The default output name is a.out, in keeping with the tradition.

    PC: The default output name is caml_out.exe.

    Mac: The default output name is Caml.Out.

-O module-set
    Specify which set of standard modules is to be implicitly ``opened'' at the beginning of a compilation. There are three module sets currently available:

    cautious
        provides the standard operations on integers, floating-point numbers, characters, strings, arrays, ..., as well as exception handling, basic input/output, etc. Operations from the cautious set perform range and bound checking on string and array operations, as well as various sanity checks on their arguments.

    fast
        provides the same operations as the cautious set, but without sanity checks on their arguments. Programs compiled with -O fast are therefore slightly faster, but unsafe.

    none
        suppresses all automatic opening of modules. Compilation starts in an almost empty environment. This option is not of general use, except to compile the standard library itself.

    The default compilation mode is -O cautious. See chapter 13 for a complete listing of the modules in the cautious and fast sets.

-p
    Compile and link in profiling mode. See the description of the profiler camlpro in chapter 10.

-v
    Print the version number of the compiler.

-W
    Print extra warning messages for the following events:

    - A #open directive is useless (no identifier in the opened module is ever referenced).

    - A variable name in a pattern matching is capitalized (often corresponds to a misspelled constant constructor).

Unix: The following environment variable is also consulted:

    LANG
        When set, controls which language is used to print the compiler messages (see the -lang command-line option).

PC: The following environment variables are also consulted:

    CAMLLIB
        Contains the path to the standard library directory.

    LANG
        When set, controls which language is used to print the compiler messages (see the -lang command-line option).

4.3 Modules and the file system

This short section is intended to clarify the relationship between the names of the modules and the names of the files that contain their compiled interface and compiled implementation.

The compiler always derives the name of the compiled module by taking the base name of the source file (.ml or .mli file). That is, it strips the leading directory name, if any, as well as the .ml or .mli suffix. The produced .zi and .zo files have the same base name as the source file; hence, the compiled files produced by the compiler always have their base name equal to the name of the module they describe (for .zi files) or implement (for .zo files).

For compiled interface files (.zi files), this invariant must be preserved at all times, since the compiler relies on it to load the compiled interface file for the modules that are used from the module being compiled. Hence, it is risky and generally incorrect to rename .zi files. It is admissible to move them to another directory, if their base name is preserved, and the correct -I options are given to the compiler.

Compiled bytecode files (.zo files), on the other hand, can be freely renamed once created. That is because (1) .zo files contain the true name of the module they define, so there is no need to derive that name from the file name; and (2) the linker never attempts to find by itself the .zo file that implements a module of a given name: it relies on the user providing the list of .zo files by hand.

4.4 Common errors

This section describes and explains the most frequently encountered error messages.

Cannot find file filename

    The named file could not be found in the current directory, nor in the directories of the search path. The filename is either a compiled interface file (.zi file), or a compiled bytecode file (.zo file).

    If filename has the format mod.zi, this means you are trying to compile a file that references identifiers from module mod, but you have not yet compiled an interface for module mod. Fix: compile mod.mli or mod.ml first, to create the compiled interface mod.zi.

    If filename has the format mod.zo, this means you are trying to link a bytecode object file that does not exist yet. Fix: compile mod.ml first.

    If your program spans several directories, this error can also appear because you haven't specified the directories to look into. Fix: add the correct -I options to the command line.

Corrupted compiled interface file filename

    The compiler produces this error when it tries to read a compiled interface file (.zi file) that has the wrong structure. This means something went wrong when this .zi file was written: the disk was full, the compiler was interrupted in the middle of the file creation, and so on. This error can also appear if a .zi file is modified after its creation by the compiler. Fix: remove the corrupted .zi file, and rebuild it.

This expression has type t1, but is used with type t2

    This is by far the most common type error in programs. Type t1 is the type inferred for the expression (the part of the program that is displayed in the error message), by looking at the expression itself. Type t2 is the type expected by the context of the expression; it is deduced by looking at how the value of this expression is used in the rest of the program. If the two types t1 and t2 are not compatible, then the error above is produced.

    In some cases, it is hard to understand why the two types t1 and t2 are incompatible. For instance, the compiler can report that ``expression of type foo cannot be used with type foo'', and it really seems that the two types foo are compatible. This is not always true. Two type constructors can have the same name, but actually represent different types. This can happen if a type constructor is redefined. Example:

        type foo = A | B;;
        let f = function A -> 0 | B -> 1;;
        type foo = C | D;;
        f C;;

    This results in the error message ``expression C of type foo cannot be used with type foo''.

    Incompatible types with the same names can also appear when a module is changed and recompiled, but some of its clients are not recompiled. That is because type constructors in .zi files are not represented by their name (that would not suffice to identify them, because of type redefinitions), but by unique stamps that are assigned when the type declaration is compiled. Consider the three modules:

        mod1.ml:    type t = A | B;;
                    let f = function A -> 0 | B -> 1;;
        mod2.ml:    let g x = 1 + mod1__f(x);;
        mod3.ml:    mod2__g mod1__A;;

    Now, assume mod1.ml is changed and recompiled, but mod2.ml is not recompiled. The recompilation of mod1.ml can change the stamp assigned to type t. But the interface mod2.zi will still use the old stamp for mod1__t in the type of mod2__g. Hence, when compiling mod3.ml, the system complains that the argument type of mod2__g (that is, mod1__t with the old stamp) is not compatible with the type of mod1__A (that is, mod1__t with the new stamp).

    Fix: use make or a similar tool to ensure that all clients of a module mod are recompiled when the interface mod.zi changes. To check that the Makefile contains the right dependencies, remove all .zi files and rebuild the whole program; if no ``Cannot find file'' error appears, you're all set.

The type inferred for name, that is, t, contains non-generalizable type variables

    Type variables ('a, 'b, ...) in a type t can be in either of two states: generalized (which means that the type t is valid for all possible instantiations of the variables) and not generalized (which means that the type t is valid only for one instantiation of the variables). In a let binding let name = expr, the type-checker normally generalizes as many type variables as possible in the type of expr. However, this leads to unsoundness (a well-typed program can crash) in conjunction with polymorphic mutable data structures. To avoid this, generalization is performed at let bindings only if the bound expression expr belongs to the class of ``syntactic values'', which includes constants, identifiers, functions, tuples of syntactic values, etc. In all other cases (for instance, expr is a function application), a polymorphic mutable could have been created and generalization is therefore turned off.

    Non-generalized type variables in a type cause no difficulties inside a given compilation unit (the contents of a .ml file, or an interactive session), but they cannot be allowed in types written in a .zi compiled interface file, because they could be used inconsistently in other compilation units. Therefore, the compiler flags an error when a .ml implementation without a .mli interface defines a global variable name whose type contains non-generalized type variables. There are two solutions to this problem:

    - Add a type constraint or a .mli interface to give a monomorphic type (without type variables) to name.
      For instance, instead of writing

          let sort_int_list = sort (prefix <);;
          (* inferred type 'a list -> 'a list, with 'a not generalized *)

      write

          let sort_int_list = (sort (prefix <) : int list -> int list);;

    - If you really need name to have a polymorphic type, turn its defining expression into a function by adding an extra parameter. For instance, instead of writing

          let map_length = map vect_length;;
          (* inferred type 'a vect list -> int list, with 'a not generalized *)

      write

          let map_length lv = map vect_length lv;;

mod__name is referenced before being defined

    This error appears when trying to link an incomplete or incorrectly ordered set of files. Either you have forgotten to provide an implementation for the module named mod on the command line (typically, the file named mod.zo, or a library containing that file). Fix: add the missing .ml or .zo file to the command line. Or, you have provided an implementation for the module named mod, but it comes too late on the command line: the implementation of mod must come before all bytecode object files that reference one of the global variables defined in module mod. Fix: change the order of .ml and .zo files on the command line.

    Of course, you will always encounter this error if you have mutually recursive functions across modules. That is, function mod1__f calls function mod2__g, and function mod2__g calls function mod1__f. In this case, no matter what permutations you perform on the command line, the program will be rejected at link-time. Fixes:

    - Put f and g in the same module.

    - Parameterize one function by the other. That is, instead of having

          mod1.ml:    let f x = ... mod2__g ... ;;
          mod2.ml:    let g y = ... mod1__f ... ;;

      define

          mod1.ml:    let f g x = ... g ... ;;
          mod2.ml:    let rec g y = ... mod1__f g ... ;;

      and link mod1 before mod2.
- Use a reference to hold one of the two functions, as in:

        mod1.ml:    let forward_g =
                        ref((fun x -> failwith "forward_g") : <type>);;
                    let f x = ... !forward_g ... ;;
        mod2.ml:    let g y = ... mod1__f ... ;;
                    mod1__forward_g := g;;

where <type> stands for the type of g.

Unavailable C primitive f

This error appears when trying to link code that calls external functions written in C in ``default runtime'' mode. As explained in chapter 12, such code must be linked in ``custom runtime'' mode. Fix: add the -custom option, as well as the (native code) libraries and (native code) object files that implement the required external functions.

Chapter 5 The toplevel system (camllight)

This chapter describes the toplevel system for Caml Light, which permits interactive use of the Caml Light system through a read-eval-print loop. In this mode, the system repeatedly reads Caml Light phrases from the input, then typechecks, compiles and evaluates them, then prints the inferred type and result value, if any. The system prints a # (sharp) prompt before reading each phrase. A phrase can span several lines. Phrases are delimited by ;; (the final double-semicolon).

From the standpoint of the module system, all phrases entered at toplevel are treated as the implementation of a module named top. Hence, all toplevel definitions are entered in the module top.

Unix: The toplevel system is started by the command camllight. Phrases are read on standard input, results are printed on standard output, errors on standard error. End-of-file on standard input terminates camllight (see also the quit system function below).

The toplevel system does not perform line editing, but it can easily be used in conjunction with an external line editor such as fep; just run fep -emacs camllight or fep -vi camllight. Another option is to use camllight under Gnu Emacs, which gives the full editing power of Emacs (see the directory contrib/camlmode in the distribution).
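As an illustration of the read-eval-print loop described in this chapter, a short Unix session might look as follows. This is only a sketch: the function double is a hypothetical example, not part of the system, and the startup banner printed by camllight is omitted. Lines starting with # are phrases typed by the user after the prompt; the other lines are the toplevel's responses, giving the inferred type and the value.

```
#let double x = x * 2;;
double : int -> int = <fun>
#double 21;;
- : int = 42
#quit();;
```

The last phrase calls the quit function mentioned above and terminates camllight.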
At any point, the parsing, compilation or evaluation of the current phrase can be interrupted by pressing ctrl-C (or, more precisely, by sending the intr signal to the camllight process). This returns to the # prompt.

Mac: The toplevel system is presented as the standalone Macintosh application Caml Light. This application does not require the Macintosh Programmer's Workshop to run.

Once launched from the Finder, the application opens two windows, ``Caml Light Input'' and ``Caml Light Output''. Phrases are entered in the ``Caml Light Input'' window. The ``Caml Light Output'' window displays a copy of the input phrases as they are processed by the Caml Light toplevel, interspersed with the toplevel responses. The ``Return'' key sends the contents of the Input window to the Caml Light toplevel. The ``Enter'' key inserts a newline without sending the contents of the Input window. (This can be configured with the ``Preferences'' menu item.)

The contents of the input window can be edited at all times, with the standard Macintosh interface. A history of previously entered phrases is maintained, and can be accessed with the ``Previous entry'' (command-P) and ``Next entry'' (command-N) menu items.

To quit the Caml Light application, either select ``Quit'' from the ``Files'' menu, or use the quit function described below.

At any point, the parsing, compilation or evaluation of the current phrase can be interrupted by pressing ``command-period'', or by selecting the item ``Interrupt Caml Light'' in the ``Caml Light'' menu. This returns to the # prompt.

PC: The toplevel system is presented as a Windows application named Camlwin.exe. It should be launched from the Windows file manager or program manager.

The ``Terminal'' window is split in two panes. Phrases are entered and edited in the bottom pane.
The top pane displays a copy of the input phrases as they are processed by the Caml Light toplevel, interspersed with the toplevel responses. The ``Return'' key sends the contents of the bottom pane to the Caml Light toplevel. The ``Enter'' key inserts a newline without sending the contents of the bottom pane. (This can be configured with the ``Preferences'' menu item.)

The contents of the bottom pane can be edited at all times, with the standard Windows interface. A history of previously entered phrases is maintained and displayed in a separate window.

To quit the Camlwin application, either select ``Quit'' from the ``File'' menu, or use the quit function described below.

At any point, the parsing, compilation or evaluation of the current phrase can be interrupted by selecting the ``Interrupt Caml Light'' menu item. This returns to the # prompt.

A text-only version of the toplevel system is available under the name caml.exe. It runs under MSDOS as well as under Windows in a DOS window. No editing facilities are provided.

5.1 Options

The following command-line options are recognized by the caml or camllight commands.

-g
Start the toplevel system in debugging mode. This mode gives access to values and types that are local to a module, that is, not exported by the interface of the module. When debugging mode is off, these local objects are not accessible (attempts to access them produce an ``Unbound identifier'' error). When debugging mode is on, these objects become visible, just like the objects that are exported in the module interface. In particular, values of abstract types are printed using their concrete representations, and the functions local to a module can be ``traced'' (see the trace function in section 5.2). This applies only to the modules that have been compiled in debugging mode (either by the batch compiler with the -g option, or by the toplevel system in debugging mode), that is, those modules that have an associated .zix file.

-I directory
Add the given directory to the list of directories searched for compiled interface files (.zi) and compiled object code files (.zo). By default, the current directory is searched first, then the standard library directory. Directories added with -I are searched after the current directory, but before the standard library directory. When several directories are added with several -I options on the command line, these directories are searched from right to left (the rightmost directory is searched first, the leftmost is searched last). Directories can also be added to the search path once the toplevel is running with the #directory directive; see chapter 3.

-lang language-code
Translate the toplevel messages to the specified language. The language-code is fr for French, es for Spanish, de for German, ... (See the file camlmsgs.txt in the Caml Light standard library directory for a list of available languages.) When an unknown language is specified, or no translation is available for a message, American English is used by default.

-O module-set
Specify which set of standard modules is to be implicitly ``opened'' when the toplevel starts. There are three module sets currently available:

cautious provides the standard operations on integers, floating-point numbers, characters, strings, arrays, ..., as well as exception handling, basic input/output, ... Operations from the cautious set perform range and bound checking on string and vector operations, as well as various sanity checks on their arguments.

fast p