The Caml Light system
release 0.71
Documentation and user's manual
Xavier Leroy
March 11, 1996
Copyright © 1996 Institut National de Recherche en Informatique et
Automatique
Contents
I Getting started
  1 Installation instructions
    1.1 The Unix version
    1.2 The Macintosh version
    1.3 The PC version
II The Caml Light language reference manual
  2 The core Caml Light language
    2.1 Lexical conventions
    2.2 Global names
    2.3 Values
    2.4 Type expressions
    2.5 Constants
    2.6 Patterns
    2.7 Expressions
    2.8 Global definitions
    2.9 Directives
    2.10 Module implementations
    2.11 Module interfaces
  3 Language extensions
    3.1 Streams, parsers, and printers
    3.2 Guards
    3.3 Range patterns
    3.4 Recursive definitions of values
    3.5 Local definitions using where
    3.6 Mutable variant types
    3.7 String access
    3.8 Alternate syntax
    3.9 Infix symbols
    3.10 Directives
III The Caml Light commands
  4 Batch compilation (camlc)
    4.1 Overview of the compiler
    4.2 Options
    4.3 Modules and the file system
    4.4 Common errors
  5 The toplevel system (camllight)
    5.1 Options
    5.2 Toplevel control functions
    5.3 The toplevel and the module system
    5.4 Common errors
    5.5 Building custom toplevel systems: camlmktop
    5.6 Options
  6 The runtime system (camlrun)
    6.1 Overview
    6.2 Options
    6.3 Common errors
  7 The librarian (camllibr)
    7.1 Overview
    7.2 Options
    7.3 Turning code into a library
  8 Lexer and parser generators (camllex, camlyacc)
    8.1 Overview of camllex
    8.2 Syntax of lexer definitions
    8.3 Overview of camlyacc
    8.4 Syntax of grammar definitions
    8.5 Options
    8.6 A complete example
  9 The debugger (camldebug)
    9.1 Compiling for debugging
    9.2 Invocation
    9.3 Commands
    9.4 Executing a program
    9.5 Breakpoints
    9.6 The call stack
    9.7 Examining variable values
    9.8 Controlling the debugger
    9.9 Miscellaneous commands
  10 Profiling (camlpro)
    10.1 Compiling for profiling
    10.2 Profiling an execution
    10.3 Printing profiling information
    10.4 Known bugs
  11 Using Caml Light under Emacs
    11.1 Updating your .emacs
    11.2 The caml editing mode
    11.3 Running the toplevel as an inferior process
    11.4 Running the debugger as an inferior process
  12 Interfacing C with Caml Light
    12.1 Overview and compilation information
    12.2 The value type
    12.3 Representation of Caml Light data types
    12.4 Operations on values
    12.5 Living in harmony with the garbage collector
    12.6 A complete example
IV The Caml Light library
  13 The core library
    13.1 bool: boolean operations
    13.2 builtin: base types and constructors
    13.3 char: character operations
    13.4 eq: generic comparisons
    13.5 exc: exceptions
    13.6 fchar: character operations, without sanity checks
    13.7 float: operations on floating-point numbers
    13.8 fstring: string operations, without sanity checks
    13.9 fvect: operations on vectors, without sanity checks
    13.10 int: operations on integers
    13.11 io: buffered input and output
    13.12 list: operations on lists
    13.13 pair: operations on pairs
    13.14 ref: operations on references
    13.15 stream: operations on streams
    13.16 string: string operations
    13.17 vect: operations on vectors
  14 The standard library
    14.1 arg: parsing of command line arguments
    14.2 baltree: basic balanced binary trees
    14.3 filename: operations on file names
    14.4 format: pretty printing
    14.5 gc: memory management control and statistics
    14.6 genlex: a generic lexical analyzer
    14.7 hashtbl: hash tables and hash functions
    14.8 lexing: the run-time library for lexers generated by camllex
    14.9 map: association tables over ordered types
    14.10 parsing: the run-time library for parsers generated by camlyacc
    14.11 printexc: a catch-all exception handler
    14.12 printf: formatting printing functions
    14.13 queue: queues
    14.14 random: pseudo-random number generator
    14.15 set: sets over ordered types
    14.16 sort: sorting and merging lists
    14.17 stack: stacks
    14.18 sys: system interface
  15 The graphics library
    15.1 graphics: machine-independent graphics primitives
  16 The unix library: Unix system calls
    16.1 unix: interface to the Unix system
  17 The num library: arbitrary-precision rational arithmetic
    17.1 num: operations on numbers
    17.2 arith_status: flags that control rational arithmetic
  18 The str library: regular expressions and string processing
    18.1 str: regular expressions and high-level string processing
V Appendix
  19 Further reading
    19.1 Programming in ML
    19.2 Descriptions of ML dialects
    19.3 Implementing functional programming languages
    19.4 Applications of ML
Index to the library
Index of keywords
Foreword
This manual documents release 0.71 of the Caml Light system. It is
organized as follows.
- Part I, ``Getting started'', explains how to install Caml Light on your
machine.
- Part II, ``The Caml Light language reference manual'', is the reference
description of the Caml Light language.
- Part III, ``The Caml Light commands'', documents the Caml Light compiler,
toplevel system, and programming utilities.
- Part IV, ``The Caml Light library'', describes the modules provided in
the standard library.
- Part V, ``Appendix'', contains a short bibliography, an index of all
identifiers defined in the standard library, and an index of Caml Light
keywords.
Conventions
The Caml Light system comes in several versions: for Unix machines, for
Macintoshes, and for PCs. The parts of this manual that are specific to one
version are presented as shown below:
Unix: This is material specific to the Unix version.
Mac: This is material specific to the Macintosh version.
PC: This is material specific to the PC version.
License
The Caml Light system is copyright © 1989, 1990, 1991, 1992, 1993, 1994,
1995, 1996 Institut National de Recherche en Informatique et en Automatique
(INRIA). INRIA holds all ownership rights to the Caml Light system. See the
file COPYRIGHT in the distribution for the copyright notice.
The Caml Light system can be freely copied, but not sold. More precisely,
INRIA grants any user of the Caml Light system the right to reproduce it,
provided that the copies are distributed free of charge and under the
conditions given in the COPYRIGHT file. The present documentation is
distributed under the same conditions.
Availability by FTP
The complete Caml Light distribution resides on the machine ftp.inria.fr. The
distribution files can be transferred by anonymous FTP:
        Host:        ftp.inria.fr (Internet address 192.93.2.54)
        Login name:  anonymous
        Password:    your e-mail address
        Directory:   lang/caml-light
        Files:       see the index in file README
Part I
Getting started
Chapter 1
Installation instructions
This chapter explains how to install Caml Light on your machine.
1.1 The Unix version
Requirements. Any machine that runs under one of the various flavors of the
Unix operating system, and that has a flat, non-segmented, 32-bit or 64-bit
address space. 4M of RAM, 2M of free disk space. The graphics library
requires X11 release 4 or later.
Installation. The Unix version is distributed in source format, as a
compressed tar file named cl7unix.tar.Z. To extract, move to the directory
where you want the source files to reside, transfer cl7unix.tar.Z to that
directory, and execute
        zcat cl7unix.tar.Z | tar xBf -
This extracts the source files in the current directory. The file INSTALL
contains complete instructions on how to configure, compile and install Caml
Light. Read it and follow the instructions.
Troubleshooting. See the file INSTALL.
1.2 The Macintosh version
Requirements. Any Macintosh with at least 1M of RAM (2M is recommended),
running System 6 or 7. About 850K of free space on the disk. The parts of
the Caml Light system that support batch compilation currently require the
Macintosh Programmer's Workshop (MPW) version 3.2. MPW is Apple's development
environment, and it is distributed by APDA, Apple's Programmers and Developers
Association. See the file READ ME in the distribution for APDA's address.
Installation. Create the folder where the Caml Light files will reside.
Double-click on the file cl7macbin.sea from the distribution. This displays a
file dialog box. Open the folder where the Caml Light files will reside, and
click on the Extract button. This will re-create all files from the
distribution in the Caml Light folder.
To test the installation, double-click on the application Caml Light. The
``Caml Light output'' window should display something like
        > Caml Light version 0.7
        #
In the ``Caml Light input'' window, enter 1+2;; and press the Return key. The
``Caml Light output'' window should display:
        > Caml Light version 0.7
        #1+2;;
        - : int = 3
        #
Select ``Quit'' from the ``File'' menu to return to the Finder.
If you have MPW, you can install the batch compilation tools as follows.
The tools and scripts from the tools folder must reside in a place where MPW
will find them as commands. There are two ways to achieve this result:
either copy the files in the tools folder to the Tools or the Scripts folder
in your MPW folder; or keep the files in the tools folder and add the
following line to your UserStartup file (assuming Caml Light resides in folder
Caml Light on the disk named My HD):
        Set Commands "{Commands},My HD:Caml Light:tools:"
In either case, you now have to edit the camlc script, and replace the string
        Macintosh HD:Caml Light:lib:
(in the first line) with the actual pathname of the lib folder. For example,
if you put Caml Light in folder Caml Light on the disk named My HD, the first
line of camlc should read:
        Set stdlib "My HD:Caml Light:lib:"
Troubleshooting. Here is one commonly encountered problem.
Cannot find file stream.zi
    (Displayed in the ``Caml Light output'' window, with an alert box telling
    you that Caml Light has terminated abnormally.) This is an installation
    error. The folder named lib in the distribution must always be in the
    same folder as the Caml Light application. It's OK to move the
    application to another folder; but remember to move the lib directory to
    the same folder. (To return to the Finder, first select ``Quit'' from
    the ``File'' menu.)
1.3 The PC version
Requirements. A PC equipped with an 80386, 80486 or Pentium processor, running
Windows 3.x, Windows 95 or Windows NT. About 3M of free space on the disk. At
least 8M of RAM is recommended.
Installation. Windows 3.x users must first install the Win32s compatibility
system. Win32s is distributed along with Caml Light and contains detailed
installation instructions.
In the following, we assume that the distribution files reside in drive A:,
and that the hard disk on which you are installing Caml Light is drive C:. If
this is not the case, replace A: and C: by the appropriate drives.
Change to a directory on the hard disk where the Caml Light distribution
will reside. The installation will create a subdirectory named CAML in that
directory, and put all the Caml Light files in CAML. In the following, we
assume that you will be installing from C:\, thus putting all Caml Light files
in C:\CAML. Execute the following commands:
        C:
        cd \
        A:pkunzip -d A:cl71win
(Be careful not to omit the -d option to pkunzip.)
The remainder of the installation procedure is described in the
CAML\INSTALL.TXT file contained in the distribution.
Part II
The Caml Light language reference manual
Chapter 2
The core Caml Light language
Foreword
This document is intended as a reference manual for the Caml Light language.
It lists all language constructs, and gives their precise syntax and informal
semantics. It is by no means a tutorial introduction to the language: there
is not a single example. A good working knowledge of the language, as
provided by the companion tutorial Functional programming using Caml Light, is
assumed.
No attempt has been made at mathematical rigor: words are employed with
their intuitive meaning, without further definition. As a consequence, the
typing rules have been left out, for lack of the mathematical framework
required to express them, although they are definitely part of a full formal
definition of the language. The reader interested in truly formal
descriptions of languages from the ML family is referred to The definition of
Standard ML and Commentary on Standard ML, by Milner, Tofte and Harper, MIT
Press.
Warning
Several implementations of the Caml Light language are available, and they
evolve at each release. Consequently, this document carefully distinguishes
the language and its implementations. Implementations can provide extra
language constructs; moreover, all points left unspecified in this reference
manual can be interpreted differently by the implementations. The purpose of
this reference manual is to specify those features that all implementations
must provide.
Notations
The syntax of the language is given in BNF-like notation. Terminal symbols
are set in typewriter font (like this). Non-terminal symbols are set in
italic font (like that). Square brackets [...] denote optional components.
Curly brackets {...} denote zero, one or several repetitions of the enclosed
components. Curly brackets with a trailing plus sign {...}+ denote one or
several repetitions of the enclosed components. Parentheses (...) denote
grouping.
2.1 Lexical conventions
Blanks
The following characters are considered as blanks: space, newline, horizontal
tabulation, carriage return, line feed and form feed. Blanks are ignored, but
they separate adjacent identifiers, literals and keywords that would otherwise
be confused as one single identifier, literal or keyword.
Comments
Comments are introduced by the two characters (*, with no intervening blanks,
and terminated by the characters *), with no intervening blanks. Comments are
treated as blank characters. Comments do not occur inside string or character
literals. Nested comments are correctly handled.
Identifiers
        ident ::= letter {letter | 0...9 | _}
        letter ::= A...Z | a...z
Identifiers are sequences of letters, digits and _ (the underscore
character), starting with a letter. Letters contain at least the 52 lowercase
and uppercase letters from the ASCII set. Implementations can recognize as
letters other characters from the extended ASCII set. Identifiers cannot
contain two adjacent underscore characters (__). Implementations may limit the
number of characters of an identifier, but this limit must be above 256
characters. All characters in an identifier are meaningful.
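The identifier rules above can be sketched as a small checker. This is only an illustration written in Python, not part of any Caml Light implementation: the name is_ident is hypothetical, only the 52 ASCII letters are accepted, and the keyword and length restrictions are deliberately left out.

```python
import re

# ident ::= letter {letter | 0...9 | _}, restricted to the ASCII letters
# that every implementation must accept.
IDENT_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_ident(text):
    """Return True if text is an identifier under the rules above.

    Reserved keywords and implementation length limits are not checked.
    """
    # Two adjacent underscores are excluded from identifiers.
    return IDENT_RE.fullmatch(text) is not None and "__" not in text
```

For instance, is_ident("fold_left") holds, while "_x" (no leading letter) and "a__b" (adjacent underscores) are rejected.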
Integer literals
        integer-literal ::= [-] {0...9}+
                          | [-] (0x | 0X) {0...9 | A...F | a...f}+
                          | [-] (0o | 0O) {0...7}+
                          | [-] (0b | 0B) {0...1}+
An integer literal is a sequence of one or more digits, optionally preceded
by a minus sign. By default, integer literals are in decimal (radix 10). The
following prefixes select a different radix:
--------------------------------
|Prefix|Radix                  |
--------------------------------
|0x, 0X|hexadecimal (radix 16) |
|0o, 0O|octal (radix 8)        |
|0b, 0B|binary (radix 2)       |
--------------------------------
(The initial 0 is the digit zero; the O for octal is the letter O.)
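As an illustration of the radix prefixes, the following Python sketch converts the text of an integer literal to its value. The function name parse_integer_literal is hypothetical, and the range limits of section 2.3.1 are not enforced here.

```python
import re

# One alternative per line of the integer-literal grammar.
INT_RE = re.compile(
    r"(-?)(?:0[xX]([0-9A-Fa-f]+)|0[oO]([0-7]+)|0[bB]([01]+)|([0-9]+))"
)


def parse_integer_literal(text):
    """Return the value denoted by an integer literal, or raise ValueError."""
    match = INT_RE.fullmatch(text)
    if match is None:
        raise ValueError("not an integer literal: %r" % text)
    sign, hex_digits, oct_digits, bin_digits, dec_digits = match.groups()
    if hex_digits is not None:
        value = int(hex_digits, 16)
    elif oct_digits is not None:
        value = int(oct_digits, 8)
    elif bin_digits is not None:
        value = int(bin_digits, 2)
    else:
        value = int(dec_digits, 10)
    # The optional minus sign applies to the whole literal.
    return -value if sign else value
```

Under this sketch, 0x1F denotes 31, -0o17 denotes -15 and 0B101 denotes 5. A prefix with no digits after it, such as 0x, is rejected.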
Floating-point literals
        float-literal ::= [-] {0...9}+ [. {0...9}] [(e | E) [+ | -] {0...9}+]
Floating-point literals consist of an integer part, a decimal part and an
exponent part. The integer part is a sequence of one or more digits,
optionally preceded by a minus sign. The decimal part is a decimal point
followed by zero, one or more digits. The exponent part is the character e or
E followed by an optional + or - sign, followed by one or more digits. The
decimal part or the exponent part can be omitted, but not both, to avoid
ambiguity with integer literals.
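The requirement that the decimal part and the exponent part cannot both be omitted can be made explicit with a regular expression. The Python sketch below is illustrative only; the name is_float_literal is hypothetical.

```python
import re

# Integer part, then either a decimal part (with an optional exponent part)
# or an exponent part alone. A bare integer part matches neither branch.
FLOAT_RE = re.compile(
    r"-?[0-9]+(?:\.[0-9]*(?:[eE][+-]?[0-9]+)?|[eE][+-]?[0-9]+)"
)


def is_float_literal(text):
    """Return True if text is a floating-point literal as described above."""
    return FLOAT_RE.fullmatch(text) is not None
```

With this pattern, 1. and 1.5e-3 and 2E10 are accepted, while 42 is left to the integer-literal rule and .5 is rejected because the integer part is mandatory.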
Character literals
        char-literal ::= ` regular-char `
                       | ` \ (\ | ` | n | t | b | r) `
                       | ` \ (0...9) (0...9) (0...9) `
Character literals are delimited by ` (backquote) characters. The two
backquotes enclose either one character different from ` and \, or one of the
escape sequences below:
--------------------------------------------------------
|Sequence|Character denoted                            |
--------------------------------------------------------
|\\      |backslash (\)                                |
|\`      |backquote (`)                                |
|\n      |newline (LF)                                 |
|\r      |return (CR)                                  |
|\t      |horizontal tabulation (TAB)                  |
|\b      |backspace (BS)                               |
|\ddd    |the character with ASCII code ddd in decimal |
--------------------------------------------------------
String literals
        string-literal ::= " {string-character} "
        string-character ::= regular-char
                           | \ (\ | " | n | t | b | r)
                           | \ (0...9) (0...9) (0...9)
String literals are delimited by " (double quote) characters. The two
double quotes enclose a sequence of either characters different from " and \,
or escape sequences from the table below:
--------------------------------------------------------
|Sequence|Character denoted                            |
--------------------------------------------------------
|\\      |backslash (\)                                |
|\"      |double quote (")                             |
|\n      |newline (LF)                                 |
|\r      |return (CR)                                  |
|\t      |horizontal tabulation (TAB)                  |
|\b      |backspace (BS)                               |
|\ddd    |the character with ASCII code ddd in decimal |
--------------------------------------------------------
Implementations must support string literals up to 2^16 - 1 characters in
length (65535 characters).
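The escape sequences of the table can be decoded mechanically. The Python sketch below works on the text between the two double quotes; the name decode_string_body is hypothetical, and rejecting codes above 255 is an assumption based on characters being 8-bit values.

```python
# Single-character escapes from the table above.
SIMPLE_ESCAPES = {
    "\\": "\\", '"': '"', "n": "\n", "r": "\r", "t": "\t", "b": "\b",
}


def decode_string_body(body):
    """Decode the characters between the quotes of a string literal."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == '"':
            raise ValueError("unescaped double quote inside the literal")
        if ch != "\\":
            out.append(ch)          # regular character, copied as is
            i += 1
            continue
        nxt = body[i + 1:i + 2]
        digits = body[i + 1:i + 4]
        if nxt in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[nxt])
            i += 2
        elif len(digits) == 3 and all(d in "0123456789" for d in digits):
            code = int(digits)      # \ddd: exactly three decimal digits
            if code > 255:
                raise ValueError("character code out of range: " + digits)
            out.append(chr(code))
            i += 4
        else:
            raise ValueError("invalid escape sequence at position %d" % i)
    return "".join(out)
```

For example, the literal body \065\066 decodes to the two characters AB, and a\tb decodes to a, a tabulation, and b.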
Keywords
The identifiers below are reserved as keywords, and cannot be employed
otherwise:
        and       as        begin     do        done      downto
        else      end       exception for       fun       function
        if        in        let       match     mutable   not
        of        or        prefix    rec       then      to
        try       type      value     where     while     with
The following character sequences are also keywords:
        #    !    !=   &    (    )    *    *.   +    +.
        ,    -    -.   ->   .    .(   /    /.   :    ::
        :=   ;    ;;   <    <.   <-   <=   <=.  <>   <>.
        =    =.   ==   >    >.   >=   >=.  @    [    [|
        ]    ^    _    __   {    |    |]   }    '
Ambiguities
Lexical ambiguities are resolved according to the ``longest match'' rule:
when a character sequence can be decomposed into two tokens in several
different ways, the decomposition retained is the one with the longest first
token.
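The ``longest match'' rule can be illustrated on the symbolic keywords listed earlier. The Python sketch below is a simplification: the name split_symbols is hypothetical, and the input is assumed to contain only symbolic keywords and blanks, whereas a real lexer also recognizes identifiers, literals and comments.

```python
# The symbolic keywords of the language, as listed under ``Keywords''.
SYMBOLS = [
    "#", "!", "!=", "&", "(", ")", "*", "*.", "+", "+.",
    ",", "-", "-.", "->", ".", ".(", "/", "/.", ":", "::",
    ":=", ";", ";;", "<", "<.", "<-", "<=", "<=.", "<>", "<>.",
    "=", "=.", "==", ">", ">.", ">=", ">=.", "@", "[", "[|",
    "]", "^", "_", "__", "{", "|", "|]", "}", "'",
]


def split_symbols(text):
    """Split text into symbolic keywords, taking the longest token first."""
    tokens = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1                  # blanks only separate tokens
            continue
        candidates = [s for s in SYMBOLS if text.startswith(s, i)]
        if not candidates:
            raise ValueError("no symbol starts at position %d" % i)
        longest = max(candidates, key=len)
        tokens.append(longest)
        i += len(longest)
    return tokens
```

Under this sketch, the sequence [|] is split into [| followed by ], not into [ followed by |], because the first token is chosen as long as possible. Likewise <- is a single token, while < - (with a blank) is two.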
2.2 Global names
Global names are used to denote value variables, value constructors (constant
or non-constant), type constructors, and record labels. Internally, a global
name consists of two parts: the name of the defining module (the module
name), and the name of the global inside that module (the local name). The
two parts of the name must be valid identifiers. Externally, global names
have the following syntax:
        global-name ::= ident
                      | ident __ ident
The form ident __ ident is called a qualified name. The first identifier is
the module name, the second identifier is the local name. The form ident is
called an unqualified name. The identifier is the local name; the module name
is omitted. The compiler infers this module name following the completion
rules given below, therefore transforming the unqualified name into a full
global name.
To complete an unqualified identifier, the compiler checks a list of
modules, the opened modules, to see if they define a global with the same
local name as the unqualified identifier. When one is found, the identifier
is completed into the full name of that global. That is, the compiler takes
as module name the name of an opened module that defines a global with the
same local name as the unqualified identifier. If several modules satisfy
this condition, the one that comes first in the list of opened modules is
selected.
The list of opened modules always includes the module currently being
compiled (checked first). (In the case of a toplevel-based implementation,
this is the module where all toplevel definitions are entered.) It also
includes a number of standard library modules that provide the initial
environment (checked last). In addition, the #open and #close directives can
be used to add or remove modules from that list. The modules added with #open
are checked after the module currently being compiled, but before the initial
standard library modules.
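The completion rule amounts to a first-match search through an ordered list of modules. The Python sketch below models it under stated assumptions: the function complete_name, the module names main and mylib, and the contents of the definitions table are all made up for the illustration, and the list of opened modules is assumed to be already ordered as described above (current module first, modules added with #open next, initial library modules last).

```python
def complete_name(name, opened_modules, definitions):
    """Turn an unqualified name into a qualified one.

    opened_modules: module names, in the order in which they are checked.
    definitions: maps each module name to the set of local names it defines.
    """
    if "__" in name:
        return name                     # already a qualified name
    for module in opened_modules:
        if name in definitions.get(module, ()):
            return module + "__" + name  # first module that defines it wins
    raise LookupError("unbound name: " + name)


# Hypothetical environment: "main" is the module being compiled, "mylib" was
# added with #open, "list" stands for an initial library module.
OPENED = ["main", "mylib", "list"]
DEFINITIONS = {
    "main": {"helper"},
    "mylib": {"helper", "map"},
    "list": {"map", "length"},
}
```

With this environment, helper completes to main__helper and map completes to mylib__map, since mylib is checked before list.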
        variable ::= global-name
                   | prefix operator-name
        operator-name ::= + | - | * | / | mod | +. | -. | *. | /.
                        | @ | ^ | ! | := | = | <> | == | != | !
                        | < | <= | > | >= | <. | <=. | >. | >=.
        cconstr ::= global-name
                  | []
                  | ()
        ncconstr ::= global-name
                   | prefix ::
        typeconstr ::= global-name
        label ::= global-name
Depending on the context, global names can stand for global variables
(variable), constant value constructors (cconstr), non-constant value
constructors (ncconstr), type constructors (typeconstr), or record labels
(label). For variables and value constructors, special names built with
prefix and an operator name are recognized. The tokens [] and () are also
recognized as built-in constant constructors (the empty list and the unit
value).
The syntax of the language restricts labels and type constructors to appear
in certain positions, where no other kind of global name is accepted. Hence
labels and type constructors have their own name spaces. Value constructors
and value variables live in the same name space: a global name in value
position is interpreted as a value constructor if it appears in the scope of a
type declaration defining that constructor; otherwise, the global name is
taken to be a value variable. For value constructors, the type declaration
determines whether a constructor is constant or not.
2.3 Values
This section describes the kinds of values that are manipulated by Caml Light
programs.
2.3.1 Base values
Integer numbers
Integer values are integer numbers from -2^30 to 2^30 - 1, that is -1073741824
to 1073741823. Implementations may support a wider range of integer values.
Floating-point numbers
Floating-point values are numbers in floating-point representation.
Everything about floating-point values is implementation-dependent, including
the range of representable numbers, the number of significant digits, and the
way floating-point results are rounded.
Characters
Character values are represented as 8-bit integers between 0 and 255.
Character codes between 0 and 127 are interpreted following the ASCII
standard. The interpretation of character codes between 128 and 255 is
implementation-dependent.
Character strings
String values are finite sequences of characters. Implementations must
support strings up to 2^16 - 1 characters in length (65535 characters).
Implementations may support longer strings.
2.3.2 Tuples
Tuples of values are written (v1,...,vn), standing for the n-tuple of values
v1 to vn. Tuples of up to 2^14 - 1 elements (16383 elements) must be
supported, though implementations may support tuples with more elements.
2.3.3 Records
Record values are labeled tuples of values. The record value written
{label1=v1;...;labeln=vn} associates the value vi to the record label
labeli, for i=1...n. Records with up to 2^14 - 1 fields (16383 fields) must be
supported, though implementations may support records with more fields.
2.3.4 Arrays
Arrays are finite, variable-sized sequences of values of the same type.
Arrays of length up to 2^14 - 1 (16383 elements) must be supported, though
implementations may support larger arrays.
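The minimum limits quoted in this section are all powers of two minus one, and the decimal figures can be checked directly. The Python sketch below only restates that arithmetic; the constant and function names are hypothetical.

```python
# Minimum limits from section 2.3, written as powers of two.
MIN_INT = -(2 ** 30)
MAX_INT = 2 ** 30 - 1
MAX_STRING_LENGTH = 2 ** 16 - 1     # strings and string literals
MAX_AGGREGATE_SIZE = 2 ** 14 - 1    # tuples, records and arrays


def fits_minimum_int_range(n):
    """Return True if n lies in the range every implementation must support."""
    return MIN_INT <= n <= MAX_INT
```

An implementation is free to accept more than these figures; the sketch only describes what a portable program may rely on.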
2.3.5 Variant values
Variant values are either a constant constructor, or a pair of a non-constant
constructor and a value. The former case is written cconstr; the latter case
is written ncconstr(v), where v is said to be the argument of the non-constant
constructor ncconstr.
The following constants are treated like built-in constant constructors:
------------------------------
Constant Constructor
------------------------------
false the boolean false
true the boolean true
() the ``unit'' value
[] the empty list
------------------------------
2.3.6 Functions
Functional values are mappings from values to values.
2.4 Type expressions
        typexpr ::= ' ident
                  | ( typexpr )
                  | typexpr -> typexpr
                  | typexpr {* typexpr}+
                  | typeconstr
                  | typexpr typeconstr
                  | ( typexpr {, typexpr} ) typeconstr
The table below shows the relative precedences and associativity of
operators and non-closed type constructions. The constructions with higher
precedences come first.
---------------------------------------------
|Operator                    |Associativity |
---------------------------------------------
|Type constructor application|--            |
|*                           |--            |
|->                          |right         |
---------------------------------------------
Type expressions denote types in definitions of data types as well as in
type constraints over patterns and expressions.
Type variables
The type expression ' ident stands for the type variable named ident. In data
type definitions, type variables are names for the data type parameters. In
type constraints, they represent unspecified types that can be instantiated by
any type to satisfy the type constraint.
Parenthesized types
The type expression ( typexpr ) denotes the same type as typexpr.
Function types
The type expression typexpr1 -> typexpr2 denotes the type of functions
mapping arguments of type typexpr1 to results of type typexpr2.
Tuple types
The type expression typexpr1 *...* typexprn denotes the type of tuples whose
elements belong to types typexpr1,...,typexprn respectively.
Constructed types
Type constructors with no parameter, as in typeconstr, are type expressions.
The type expression typexpr typeconstr, where typeconstr is a type
constructor with one parameter, denotes the application of the unary type
constructor typeconstr to the type typexpr.
The type expression (typexpr1,...,typexprn) typeconstr, where typeconstr is
a type constructor with n parameters, denotes the application of the n-ary
type constructor typeconstr to the types typexpr1 through typexprn.
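As an illustration of the forms above, here is a minimal sketch that uses type expressions in type constraints. The names add, numbered and first are hypothetical and chosen only for this example.

```caml
(* Function types associate to the right: int -> (int -> int) *)
let add = (fun x y -> x + y : int -> int -> int);;

(* A tuple type used as the argument of the unary constructor list;
   * binds tighter than type constructor application does not, hence
   the parentheses around int * string *)
let numbered = ([(1, "one"); (2, "two")] : (int * string) list);;

(* Type variables in a constraint may be instantiated by any type *)
let first = (fun (x, y) -> x : 'a * 'b -> 'a);;
```

Since * has higher precedence than ->, the last constraint reads as ('a * 'b) -> 'a.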
2.5 Constants
constant ::= integer-literal
           | float-literal
           | char-literal
           | string-literal
           | cconstr
The syntactic class of constants comprises literals from the four base types
(integers, floating-point numbers, characters, character strings), and
constant constructors.
2.6 Patterns
pattern ::= ident
          | _
          | pattern as ident
          | ( pattern )
          | ( pattern : typexpr )
          | pattern | pattern
          | constant
          | ncconstr pattern
          | pattern , pattern {, pattern}
          | { label = pattern {; label = pattern} }
          | [ ]
          | [ pattern {; pattern} ]
          | pattern :: pattern
The table below shows the relative precedences and associativity of
operators and non-closed pattern constructions. The constructions with higher
precedences come first.
----------------------------------------
|Operator               |Associativity |
----------------------------------------
|Constructor application|--            |
|::                     |right         |
|,                      |--            |
||                      |left          |
|as                     |--            |
----------------------------------------
Patterns are templates that allow selecting data structures of a given
shape, and binding identifiers to components of the data structure. This
selection operation is called pattern matching; its outcome is either ``this
value does not match this pattern'', or ``this value matches this pattern,
resulting in the following bindings of identifiers to values''.
Variable patterns
A pattern that consists of an identifier matches any value, binding the
identifier to the value. The pattern _ also matches any value, but does not
bind any identifier.
Alias patterns
The pattern pattern1 as ident matches the same values as pattern1. If the
matching against pattern1 is successful, the identifier ident is bound to the
matched value, in addition to the bindings performed by the matching against
pattern1.
Parenthesized patterns
The pattern ( pattern1 ) matches the same values as pattern1. A type
constraint can appear in a parenthesized pattern, as in
( pattern1 : typexpr ). This constraint forces the type of pattern1 to be
compatible with typexpr.
``Or'' patterns
The pattern pattern1 | pattern2 represents the logical ``or'' of the two
patterns pattern1 and pattern2. A value matches pattern1 | pattern2 either
if it matches pattern1 or if it matches pattern2. The two sub-patterns
pattern1 and pattern2 must contain no identifiers. Hence no bindings are
returned by matching against an ``or'' pattern.
Constant patterns
A pattern consisting of a constant matches the values that are equal to this
constant.
Variant patterns
The pattern ncconstr pattern1 matches all variants whose constructor is equal
to ncconstr, and whose argument matches pattern1.
The pattern pattern1 :: pattern2 matches non-empty lists whose heads match
pattern1, and whose tails match pattern2. This pattern behaves like
prefix :: ( pattern1 , pattern2 ).
The pattern [ pattern1 ;...; patternn ] matches lists of length n whose
elements match pattern1 ...patternn, respectively. This pattern behaves like
pattern1 ::...:: patternn :: [].
Tuple patterns
The pattern pattern1 ,..., patternn matches n-tuples whose components match
the patterns pattern1 through patternn. That is, the pattern matches the
tuple values (v1,...,vn) such that patterni matches vi for i = 1,...,n.
Record patterns
The pattern { label1 = pattern1 ;...; labeln = patternn } matches records
that define at least the labels label1 through labeln, and such that the
value associated with labeli matches the pattern patterni, for i = 1,...,n.
The record value can define more labels than label1 ...labeln; the values
associated with these extra labels are not taken into account for matching.
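The following sketch exercises most of the pattern forms described in this section. The type point and all function names are hypothetical, introduced only for illustration.

```caml
type point = { px : int; py : int };;

(* Constant patterns combined with an ``or'' pattern, then a wildcard *)
let describe = function
    0 | 1 -> "small"
  | _ -> "other";;

(* List patterns: [], a one-element list, and head :: tail *)
let rec last_elt = function
    [] -> failwith "last_elt"
  | [x] -> x
  | _ :: rest -> last_elt rest;;

(* Alias pattern: l names the whole list, x its head *)
let dup_head = function
    (x :: _) as l -> x :: l
  | [] -> [];;

(* Tuple pattern *)
let swap (a, b) = (b, a);;

(* Record pattern mentioning only one of the two labels *)
let on_x_axis = function
    { py = 0 } -> true
  | _ -> false;;
```

Note that the two sides of the ``or'' pattern in describe are constants, in line with the rule that sub-patterns of an ``or'' pattern bind no identifiers.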
2.7 Expressions
expr ::= ident
       | variable
       | constant
       | ( expr )
       | begin expr end
       | ( expr : typexpr )
       | expr , expr {, expr}
       | ncconstr expr
       | expr :: expr
       | [ expr {; expr} ]
       | [| expr {; expr} |]
       | { label = expr {; label = expr} }
       | expr expr
       | prefix-op expr
       | expr infix-op expr
       | expr . label
       | expr . label <- expr
       | expr .( expr )
       | expr .( expr ) <- expr
       | expr & expr
       | expr or expr
       | if expr then expr [else expr]
       | while expr do expr done
       | for ident = expr (to | downto) expr do expr done
       | expr ; expr
       | match expr with simple-matching
       | fun multiple-matching
       | function simple-matching
       | try expr with simple-matching
       | let [rec] let-binding {and let-binding} in expr
simple-matching ::= pattern -> expr {| pattern -> expr}
multiple-matching ::= pattern-list -> expr {| pattern-list -> expr}
pattern-list ::= pattern {pattern}
let-binding ::= pattern = expr
              | variable pattern-list = expr
prefix-op ::= - | -. | !
infix-op ::= + | - | * | / | mod | +. | -. | *. | /. | @ | ^ | ! | :=
           | = | <> | == | != | < | <= | > | >= | <. | <=. | >. | >=.
The table below shows the relative precedences and associativity of
operators and non-closed constructions. The constructions with higher
precedence come first.
---------------------------------------------
|Construction or operator    |Associativity |
---------------------------------------------
|!                           |--            |
|. .(                        |--            |
|function application        |right         |
|constructor application     |--            |
|- -. (prefix)               |--            |
|mod                         |left          |
|* *. / /.                   |left          |
|+ +. - -.                   |left          |
|::                          |right         |
|@ ^                         |right         |
|comparisons (= == < etc.)   |left          |
|not                         |--            |
|&                           |left          |
|or                          |left          |
|,                           |--            |
|<- :=                       |right         |
|if                          |--            |
|;                           |right         |
|let match fun function try  |--            |
---------------------------------------------
2.7.1 Simple expressions
Constants
Expressions consisting of a constant evaluate to this constant.
Variables
Expressions consisting of a variable evaluate to the value bound to this
variable in the current evaluation environment. The variable can be either a
qualified identifier or a simple identifier. Qualified identifiers always
denote global variables. Simple identifiers denote either a local variable,
if the identifier is locally bound, or a global variable, whose full name is
obtained by qualifying the simple identifier, as described in section 2.2.
Parenthesized expressions
The expressions ( expr ) and begin expr end have the same value as expr. Both
constructs are semantically equivalent, but it is good style to use
begin...end inside control structures:
        if ... then begin ... ; ... end else begin ... ; ... end
and (...) for the other grouping situations.
Parenthesized expressions can contain a type constraint, as in
( expr : typexpr ). This constraint forces the type of expr to be compatible
with typexpr.
Function abstraction
The most general form of function abstraction is:
        fun pattern1^1 ... pattern1^m -> expr1
          | ...
          | patternn^1 ... patternn^m -> exprn
where patterni^j denotes the j-th pattern of the i-th row.
This expression evaluates to a functional value with m curried arguments.
When this function is applied to m values v1 ... vm, the values are matched
against each pattern row patterni^1...patterni^m for i from 1 to n. If one of
these matchings succeeds, that is, if the value vj matches the pattern
patterni^j for all j = 1,...,m, then the expression expri associated with the
selected pattern row is evaluated, and its value becomes the value of the
function application. The evaluation of expri takes place in an environment
enriched by the bindings performed during the matching.
If several pattern rows match the arguments, the one that occurs first in
the function definition is selected. If none of the pattern rows matches the
arguments, the exception Match_failure is raised.
If the function above is applied to fewer than m arguments, a functional
value is returned, that represents the partial application of the function to
the arguments provided. This partial application is a function that, when
applied to the remaining arguments, matches all arguments against the pattern
rows as described above. Matching does not start until all m arguments have
been provided to the function; hence, partial applications of the function to
fewer than m arguments never raise Match_failure.
All pattern rows in the function body must contain the same number of
patterns. A variable must not be bound more than once in one pattern row.
Functions with only one argument can be defined with the function keyword
instead of fun:
        function pattern1 -> expr1
               | ...
               | patternn -> exprn
The function thus defined behaves exactly as described above. The only
difference between the two forms of function definition is how a parsing
ambiguity is resolved. The two forms cconstr pattern (two patterns in a row)
and ncconstr pattern (one pattern) cannot be distinguished syntactically.
Function definitions introduced by fun resolve the ambiguity to the former
form; function definitions introduced by function resolve it to the latter
form (the former form makes no sense in this case).
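The rules above can be sketched as follows. The type shape and the function names are hypothetical; only_zeros is left non-exhaustive on purpose (the compiler may warn about it) to show when Match_failure is raised.

```caml
type shape = Dot | Circle of int;;

(* Two curried arguments, two pattern rows; the first matching row wins *)
let both_zero = fun
    0 0 -> true
  | _ _ -> false;;

(* A single row that does not cover all arguments *)
let only_zeros = fun 0 0 -> "zeros";;

(* Partial application: matching has not started, so nothing is raised *)
let partial = only_zeros 1;;

(* Supplying the last argument triggers matching of 1 0 against the row *)
let raised =
  try let _ = partial 0 in false
  with Match_failure _ -> true;;

(* With function, Circle r is read as one pattern (ncconstr pattern) *)
let radius = function
    Circle r -> r
  | Dot -> 0;;

(* With fun, the same pattern must be parenthesized to count as one *)
let radius_bis = fun (Circle r) -> r | Dot -> 0;;
```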
Function application
Function application is denoted by juxtaposition of expressions. The
expression expr1 expr2...exprn evaluates the expressions expr1 to exprn. The
expression expr1 must evaluate to a functional value, which is then applied
to the values of expr2,...,exprn. The order in which the expressions
expr1,...,exprn are evaluated is not specified.
Local definitions
The let and let rec constructs bind variables locally. The construct
        let pattern1 = expr1 and...and patternn = exprn in expr
evaluates expr1...exprn in some unspecified order, then matches their values
against the patterns pattern1...patternn. If the matchings succeed, expr is
evaluated in the environment enriched by the bindings performed during
matching, and the value of expr is returned as the value of the whole let
expression. If one of the matchings fails, the exception Match_failure is
raised.
An alternate syntax is provided to bind variables to functional values:
instead of writing
        ident = fun pattern1...patternm -> expr
in a let expression, one may instead write
        ident pattern1 ...patternm = expr
Both forms bind ident to the curried function with m arguments and only one
case,
        pattern1 ...patternm -> expr.
Recursive definitions of variables are introduced by let rec:
        let rec pattern1 = expr1 and...and patternn = exprn in expr
The only difference with the let construct described above is that the
bindings of variables to values performed by the pattern-matching are
considered already performed when the expressions expr1 to exprn are
evaluated. That is, the expressions expr1 to exprn can reference identifiers
that are bound by one of the patterns pattern1,...,patternn, and expect them
to have the same value as in expr, the body of the let rec construct.
The recursive definition is guaranteed to behave as described above if the
expressions expr1 to exprn are function definitions (fun... or function...),
and the patterns pattern1...patternn consist of a single variable, as in:
        let rec ident1 = fun...and...and identn = fun...in expr
This defines ident1...identn as mutually recursive functions local to expr.
The behavior of other forms of let rec definitions is
implementation-dependent.
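A short sketch of both constructs follows; the function names are hypothetical, and parity assumes a non-negative argument.

```caml
(* Simultaneous bindings; the first one uses a tuple pattern *)
let sum_and_diff a b =
  let (s, d) = (a + b, a - b) and twice = 2 * a in
  (s, d, twice);;

(* Mutually recursive local functions, written with the alternate
   syntax: this is the guaranteed form of let rec *)
let parity n =
  let rec even k = if k = 0 then true else odd (k - 1)
  and odd k = if k = 0 then false else even (k - 1) in
  if even n then "even" else "odd";;
```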
2.7.2 Control constructs
Sequence
The expression expr1 ; expr2 evaluates expr1 first, then expr2, and returns
the value of expr2.
Conditional
The expression if expr1 then expr2 else expr3 evaluates to the value of expr2
if expr1 evaluates to the boolean true, and to the value of expr3 if expr1
evaluates to the boolean false.
The else expr3 part can be omitted, in which case it defaults to else ().
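The sketch below combines a sequence with a conditional that has no else part. The names trace and note are hypothetical.

```caml
let trace = ref ([] : string list);;

let note n =
  (* no else part: it defaults to else (), so the then branch is unit *)
  if n < 0 then trace := "negative" :: !trace;
  (* the sequence returns the value of its last expression *)
  trace := "seen" :: !trace;
  list_length !trace;;
```

Because ; has lower precedence than if, the conditional ends at the first semicolon; the two remaining expressions are evaluated whatever the value of n.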
Case expression
The expression
        match expr
        with pattern1 -> expr1
           | ...
           | patternn -> exprn
matches the value of expr against the patterns pattern1 to patternn. If the
matching against patterni succeeds, the associated expression expri is
evaluated, and its value becomes the value of the whole match expression. The
evaluation of expri takes place in an environment enriched by the bindings
performed during matching. If several patterns match the value of expr, the
one that occurs first in the match expression is selected. If none of the
patterns matches the value of expr, the exception Match_failure is raised.
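For instance (a minimal sketch with hypothetical function names):

```caml
(* The bindings x and rest are visible in the selected right-hand side *)
let rec sum l =
  match l with
    [] -> 0
  | x :: rest -> x + sum rest;;

(* 0 matches both patterns; the first one is selected *)
let classify n =
  match n with
    0 -> "zero"
  | _ -> if n > 0 then "positive" else "negative";;
```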
Boolean operators
The expression expr1 & expr2 evaluates to true if both expr1 and expr2
evaluate to true; otherwise, it evaluates to false. The first component,
expr1, is evaluated first. The second component, expr2, is not evaluated if
the first component evaluates to false. Hence, the expression expr1 & expr2
behaves exactly as
        if expr1 then expr2 else false.
The expression expr1 or expr2 evaluates to true if one of expr1 and expr2
evaluates to true; otherwise, it evaluates to false. The first component,
expr1, is evaluated first. The second component, expr2, is not evaluated if
the first component evaluates to true. Hence, the expression expr1 or expr2
behaves exactly as
        if expr1 then true else expr2.
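The left-to-right, short-circuit evaluation can be observed as in this sketch (all names are hypothetical):

```caml
(* The division is evaluated only when d <> 0 holds *)
let quotient_exceeds d n limit = d <> 0 & n / d > limit;;

(* A function with a visible side effect *)
let calls = ref 0;;
let noisy_true () = begin incr calls; true end;;

(* The left operand is true, so noisy_true is never called *)
let r = true or noisy_true ();;
```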
Loops
The expression while expr1 do expr2 done repeatedly evaluates expr2 while
expr1 evaluates to true. The loop condition expr1 is evaluated and tested at
the beginning of each iteration. The whole while...done expression evaluates
to the unit value ().
The expression for ident = expr1 to expr2 do expr3 done first evaluates the
expressions expr1 and expr2 (the boundaries) into integer values n and p.
Then, the loop body expr3 is repeatedly evaluated in an environment where the
local variable named ident is successively bound to the values n, n+1, ...,
p-1, p. The loop body is never evaluated if n > p.
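A minimal sketch of both loop forms, using references to accumulate a result since loops themselves evaluate to (). The function names are hypothetical.

```caml
(* Sum of 1..n; the body is never evaluated when n < 1 *)
let sum_to n =
  let total = ref 0 in
  for i = 1 to n do total := !total + i done;
  !total;;

(* Number of decimal digits of a positive integer *)
let digits n =
  let count = ref 0 and rest = ref n in
  while !rest > 0 do
    count := !count + 1;
    rest := !rest / 10
  done;
  !count;;
```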
The expression for ident = expr1 downto expr2 do expr3 done first evaluates
the expressions expr1 and expr2 (the boundaries) into integer values n and p.
Then, the loop body expr3 is repeatedly evaluated in an environment where the
local variable named ident is successively bound to the values n, n-1, ...,
p+1, p. The loop body is never evaluated if n < p.
Exception handling
The expression
        try expr
        with pattern1 -> expr1
           | ...
           | patternn -> exprn
evaluates the expression expr and returns its value if the evaluation of expr
does not raise any exception. If the evaluation of expr raises an exception,
the exception value is matched against the patterns pattern1 to patternn. If
the matching against patterni succeeds, the associated expression expri is
evaluated, and its value becomes the value of the whole try expression. The
evaluation of expri takes place in an environment enriched by the bindings
performed during matching. If several patterns match the exception value, the
one that occurs first in the try expression is selected. If none of the
patterns matches the exception value, the exception value is raised again,
thereby transparently ``passing through'' the try construct.
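The following sketch shows an exception being handled and another one passing through. Empty_list and the function names are hypothetical; Failure is the exception raised by the library function failwith.

```caml
exception Empty_list;;

let head = function
    [] -> raise Empty_list
  | x :: _ -> x;;

(* Returns the head, or default when Empty_list is raised *)
let head_or default l =
  try head l with Empty_list -> default;;

(* Failure matches no pattern here, so it is raised again *)
let passes_through () =
  try failwith "boom" with Empty_list -> "handled";;
```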
2.7.3 Operations on data structures
Products
The expression expr1 ,..., exprn evaluates to the n-tuple of the values of
expressions expr1 to exprn. The evaluation order for the subexpressions is
not specified.
Variants
The expression ncconstr expr evaluates to the variant value whose constructor
is ncconstr, and whose argument is the value of expr.
For lists, some syntactic sugar is provided. The expression expr1 :: expr2
stands for the constructor prefix :: applied to the argument
( expr1 , expr2 ), and therefore evaluates to the list whose head is the
value of expr1 and whose tail is the value of expr2. The expression
[ expr1 ;...; exprn ] is equivalent to expr1 ::...:: exprn :: [], and
therefore evaluates to the list whose elements are the values of expr1 to
exprn.
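As a small illustration (the type arith and the other names are hypothetical):

```caml
type arith = Num of int | Sum of arith * arith;;

(* A non-constant constructor applied to a tuple argument *)
let three = Sum (Num 1, Num 2);;

let rec eval = function
    Num n -> n
  | Sum (a, b) -> eval a + eval b;;

(* The bracket notation is sugar for nested applications of :: *)
let a = 1 :: 2 :: 3 :: [];;
let b = [1; 2; 3];;
```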
Records
The expression { label1 = expr1 ;...; labeln = exprn } evaluates to the
record value { label1 = v1 ;...; labeln = vn }, where vi is the value of
expri for i = 1,...,n. The labels label1 to labeln must all belong to the
same record type; all labels belonging to this record type must appear
exactly once in the record expression, though they can appear in any order.
The order in which expr1 to exprn are evaluated is not specified.
The expression expr1 . label evaluates expr1 to a record value, and returns
the value associated with label in this record value.
The expression expr1 . label <- expr2 evaluates expr1 to a record value,
which is then modified in-place by replacing the value associated with label
in this record by the value of expr2. This operation is permitted only if
label has been declared mutable in the definition of the record type. The
whole expression expr1 . label <- expr2 evaluates to the unit value ().
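A minimal sketch, with a hypothetical record type counter:

```caml
type counter = { name : string; mutable count : int };;

(* All labels appear exactly once, in any order *)
let c = { count = 0; name = "hits" };;

(* In-place update, allowed because count is declared mutable *)
c.count <- c.count + 1;;
```

An assignment to c.name would be rejected at compile time, since name is not declared mutable.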
Arrays
The expression [| expr1 ;...; exprn |] evaluates to an n-element array, whose
elements are initialized with the values of expr1 to exprn respectively. The
order in which these expressions are evaluated is unspecified.
The expression expr1 .( expr2 ) is equivalent to the application
vect_item expr1 expr2. In the initial environment, the identifier vect_item
resolves to a built-in function that returns the value of element number
expr2 in the array denoted by expr1. The first element has number 0; the
last element has number n-1, where n is the size of the array. The exception
Invalid_argument is raised if the access is out of bounds.
The expression expr1 .( expr2 ) <- expr3 is equivalent to
vect_assign expr1 expr2 expr3. In the initial environment, the identifier
vect_assign resolves to a built-in function that modifies in-place the array
denoted by expr1, replacing element number expr2 by the value of expr3. The
exception Invalid_argument is raised if the access is out of bounds. The
built-in function returns (). Hence, the whole expression
expr1 .( expr2 ) <- expr3 evaluates to the unit value ().
This behavior of the two constructs expr1 .( expr2 ) and
expr1 .( expr2 ) <- expr3 may change if the meaning of the identifiers
vect_item and vect_assign is changed, either by redefinition or by
modification of the list of opened modules. See the discussion below on
operators.
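The equivalences above can be sketched as follows, assuming the initial environment (vect_item and vect_assign not redefined, bounds checking enabled). The names v, second and out_of_bounds are hypothetical.

```caml
let v = [| 10; 20; 30 |];;

(* Same as vect_assign v 1 (vect_item v 0 + vect_item v 2) *)
v.(1) <- v.(0) + v.(2);;

(* Same as v.(1) *)
let second = vect_item v 1;;

(* Valid indices are 0 to 2; index 3 is out of bounds *)
let out_of_bounds =
  try let _ = v.(3) in false
  with Invalid_argument _ -> true;;
```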
2.7.4 Operators
The operators written infix-op in the grammar above can appear in infix
position (between two expressions). The operators written prefix-op in the
grammar above can appear in prefix position (in front of an expression).
The expression prefix-op expr is interpreted as the application ident expr,
where ident is the identifier associated with the operator prefix-op in the
table below. Similarly, the expression expr1 infix-op expr2 is interpreted
as the application ident expr1 expr2, where ident is the identifier
associated with the operator infix-op in the table below. The identifiers
written ident above are then evaluated following the rules in section 2.7.1.
In the initial environment, they evaluate to built-in functions whose behavior
is described in the table. The behavior of the constructions prefix-op expr
and expr1 infix-op expr2 may change if the meaning of the identifiers
associated with prefix-op or infix-op is changed, either by redefinition of
the identifiers, or by modification of the list of opened modules, through the
#open and #close directives.
---------------------------------------------------------------------------
|Operator    |Associated  |Behavior in the default environment            |
|            |identifier  |                                               |
---------------------------------------------------------------------------
|+           |prefix +    |Integer addition.                              |
|- (infix)   |prefix -    |Integer subtraction.                           |
|- (prefix)  |minus       |Integer negation.                              |
|*           |prefix *    |Integer multiplication.                        |
|/           |prefix /    |Integer division. Raise Division_by_zero if    |
|            |            |second argument is zero. The result is         |
|            |            |unspecified if either argument is negative.    |
|mod         |prefix mod  |Integer modulus. Raise Division_by_zero if     |
|            |            |second argument is zero. The result is         |
|            |            |unspecified if either argument is negative.    |
|+.          |prefix +.   |Floating-point addition.                       |
|-. (infix)  |prefix -.   |Floating-point subtraction.                    |
|-. (prefix) |minus_float |Floating-point negation.                       |
|*.          |prefix *.   |Floating-point multiplication.                 |
|/.          |prefix /.   |Floating-point division. Raise                 |
|            |            |Division_by_zero if second argument is zero.   |
|@           |prefix @    |List concatenation.                            |
|^           |prefix ^    |String concatenation.                          |
|!           |prefix !    |Dereferencing (return the current contents of  |
|            |            |a reference).                                  |
|:=          |prefix :=   |Reference assignment (update the reference     |
|            |            |given as first argument with the value of the  |
|            |            |second argument).                              |
|=           |prefix =    |Structural equality test.                      |
|<>          |prefix <>   |Structural inequality test.                    |
|==          |prefix ==   |Physical equality test.                        |
|!=          |prefix !=   |Physical inequality test.                      |
|<           |prefix <    |Test ``less than'' on integers.                |
|<=          |prefix <=   |Test ``less than or equal'' on integers.       |
|>           |prefix >    |Test ``greater than'' on integers.             |
|>=          |prefix >=   |Test ``greater than or equal'' on integers.    |
|<.          |prefix <.   |Test ``less than'' on floating-point numbers.  |
|<=.         |prefix <=.  |Test ``less than or equal'' on floating-point  |
|            |            |numbers.                                       |
|>.          |prefix >.   |Test ``greater than'' on floating-point        |
|            |            |numbers.                                       |
|>=.         |prefix >=.  |Test ``greater than or equal'' on              |
|            |            |floating-point numbers.                        |
---------------------------------------------------------------------------
The behavior of the +, -, *, /, mod, +., -., *. or /. operators is
unspecified if the result falls outside of the range of representable integers
or floating-point numbers, respectively. See chapter 13 for a more precise
description of the behavior of the operators above.
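The correspondence between operators and identifiers can be sketched as follows, assuming the initial environment. The names add, five, total, s1 and s2 are hypothetical; it_list is the left-to-right list iterator of the standard library.

```caml
(* prefix + is the identifier behind the infix operator + *)
let add = (prefix +);;
let five = add 2 3;;            (* same as 2 + 3 *)

(* Operators can be passed to other functions like any identifier *)
let total = it_list (prefix +) 0 [1; 2; 3];;

(* Structural versus physical equality on two distinct strings
   with the same contents *)
let s1 = "abc" and s2 = "ab" ^ "c";;
let same_contents = (s1 = s2);;
let same_object = (s1 == s2);;
```

Here same_contents is true while same_object is false, because ^ builds a fresh string.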
2.8 Global definitions
This section describes the constructs that bind global identifiers (value
variables, value constructors, type constructors, record labels).
2.8.1 Type definitions
type-definition ::= type typedef {and typedef}
typedef ::= type-params ident = constr-decl {| constr-decl}
          | type-params ident = { label-decl {; label-decl} }
          | type-params ident == typexpr
          | type-params ident
type-params ::= nothing
              | ' ident
              | ( ' ident {, ' ident} )
constr-decl ::= ident
              | ident of typexpr
label-decl ::= ident : typexpr
             | mutable ident : typexpr
Type definitions bind type constructors to data types: either variant
types, record types, type abbreviations, or abstract data types.
Type definitions are introduced by the type keyword, and consist of one or
several simple definitions, possibly mutually recursive, separated by the and
keyword. Each simple definition defines one type constructor.
A simple definition consists of an identifier, possibly preceded by one or
several type parameters, and followed by a data type description. The
identifier is the local name of the type constructor being defined. (The
module name for this type constructor is the name of the module being
compiled.) The optional type parameters are either one type variable ' ident,
for type constructors with one parameter, or a list of type variables
(' ident1,...,' identn), for type constructors with several parameters.
These type parameters can appear in the type expressions of the right-hand
side of the definition.
Variant types
The type definition type-params ident = constr-decl1 |...| constr-decln
defines a variant type. The constructor declarations
constr-decl1,...,constr-decln describe the constructors associated with this
variant type. The constructor declaration ident of typexpr declares the local
name ident (in the module being compiled) as a non-constant constructor, whose
argument has type typexpr. The constructor declaration ident declares the
local name ident (in the module being compiled) as a constant constructor.
Record types
The type definition type-params ident = { label-decl1 ;...; label-decln }
defines a record type. The label declarations label-decl1,...,label-decln
describe the labels associated with this record type. The label declaration
ident : typexpr declares the local name ident in the module being compiled as
a label, whose argument has type typexpr. The label declaration
mutable ident : typexpr behaves similarly; in addition, it allows physical
modification over the argument to this label.
Type abbreviations
The type definition type-params ident == typexpr defines the type constructor
ident as an abbreviation for the type expression typexpr.
Abstract types
The type definition type-params ident defines ident as an abstract type. When
appearing in a module interface, this definition allows exporting a type
constructor while hiding how it is represented in the module implementation.
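The first three kinds of type definitions can be sketched as follows; every type, constructor and label name here is hypothetical.

```caml
(* Variant type with one parameter: Leaf is a constant constructor,
   Node a non-constant one *)
type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree;;

(* Record type; only balance can be modified in place *)
type account = { owner : string; mutable balance : int };;

(* Abbreviations are introduced by == rather than = *)
type int_pair == int * int;;
type ('k, 'v) table == ('k * 'v) list;;

(* Mutually recursive definitions are joined by and *)
type forest = No_trees | Trees of rose * forest
and rose = Rose of int * forest;;

let rec size = function
    Leaf -> 0
  | Node (l, _, r) -> size l + 1 + size r;;

let origin = ((0, 0) : int_pair);;
```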
2.8.2 Exception definitions
exception-definition ::= exception constr-decl {and constr-decl}
Exception definitions add new constructors to the built-in variant type exn
of exception values. The constructors are declared as for a definition of a
variant type.
2.9 Directives
directive ::= # open string
            | # close string
            | # ident string
Directives control the behavior of the compiler. They apply to the
remainder of the current compilation unit.
The two directives #open and #close modify the list of opened modules, that
the compiler uses to complete unqualified identifiers, as described in
section 2.2. The directive #open string adds the module whose name is given
by the string constant string to the list of opened modules, in first
position. The directive #close string removes the first occurrence of the
module whose name is given by the string constant string from the list of
opened modules.
Implementations can provide other directives, provided they follow the
syntax # ident string, where ident is the name of the directive, and the
string constant string is the argument to the directive. The behavior of
these additional directives is implementation-dependent.
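The effect of #open and #close on unqualified identifiers can be sketched as below. The example assumes that the standard library provides a module named sort exporting a function sort that takes an ordering predicate and a list; this is an assumption to check against the library documentation. The names a and b are hypothetical.

```caml
(* Qualified identifier: works whether or not the module is opened *)
let a = sort__sort (prefix <=) [3; 1; 2];;

(* After #open, the unqualified name sort is completed to sort__sort *)
#open "sort";;
let b = sort (prefix <=) [3; 1; 2];;

(* After #close, only the qualified form remains available *)
#close "sort";;
```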
2.10 Module implementations
implementation ::= {impl-phrase ;;}
impl-phrase ::= expr
              | value-definition
              | type-definition
              | exception-definition
              | directive
value-definition ::= let [rec] let-binding {and let-binding}
A module implementation consists of a sequence of implementation phrases,
terminated by double semicolons. An implementation phrase is either an
expression, a value definition, a type or exception definition, or a
directive. At run-time, implementation phrases are evaluated sequentially, in
the order in which they appear in the module implementation.
Implementation phrases consisting of an expression are evaluated for their
side-effects.
Value definitions bind global value variables in the same way as a
let...in... expression binds local variables. The expressions are evaluated,
and their values are matched against the left-hand sides of the = signs, as
explained in section 2.7.1. If the matching succeeds, the bindings of
identifiers to values performed during matching are interpreted as bindings to
the global value variables whose local name is the identifier, and whose
module name is the name of the module. If the matching fails, the exception
Match_failure is raised. The scope of these bindings is the phrases that
follow the value definition in the module implementation.
Type and exception definitions introduce type constructors, variant
constructors and record labels as described in sections 2.8.1 and 2.8.2. The
scope of these definitions is the phrases that follow the definition in
the module implementation. The evaluation of an implementation phrase
consisting of a type or exception definition produces no effect at run-time.
Directives modify the behavior of the compiler on the subsequent phrases of
the module implementation, as described in section 2.9. The evaluation of an
implementation phrase consisting of a directive produces no effect at
run-time. Directives apply only to the module currently being compiled; in
particular, they have no effect on other modules that refer to globals
exported by the module being compiled.
2.11 Module interfaces
interface ::= {intf-phrase ;;}
intf-phrase ::= value-declaration
              | type-definition
              | exception-definition
              | directive
value-declaration ::= value ident : typexpr {and ident : typexpr}
Module interfaces declare the global objects (value variables, type
constructors, variant constructors, record labels) that a module exports, that
is, makes available to other modules. Other modules can refer to these
globals using qualified identifiers or the #open directive, as explained in
section 2.2.
A module interface consists of a sequence of interface phrases, terminated
by double semicolons. An interface phrase is either a value declaration, a
type definition, an exception definition, or a directive.
Value declarations declare global value variables that are exported by the
module implementation, and the types with which they are exported. The module
implementation must define these variables, with types at least as general as
the types declared in the interface. The scope of the bindings for these
global variables extends from the module implementation itself to all modules
that refer to those variables.
Type or exception definitions introduce type constructors, variant
constructors and record labels as described in sections 2.8.1 and 2.8.2.
Exception definitions and type definitions that are not abstract type
declarations also take effect in the module implementation; that is, the type
constructors, variant constructors and record labels they define are
considered bound on entrance to the module implementation, and can be referred
to by the implementation phrases. Type definitions that are not abstract type
declarations must not be redefined in the module implementation. In contrast,
the type constructors that are declared abstract in a module interface must be
defined in the module implementation, with the same names.
Directives modify the behavior of the compiler on the subsequent phrases of
the module interface, as described in section 2.9. Directives apply only to
the interface currently being compiled; in particular, they have no effect on
other modules that refer to globals exported by the interface being compiled.
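As an illustration, here is a minimal sketch of an interface and a matching
implementation. The module name counter and everything it contains are
assumptions made up for this example; they are not part of the Caml Light
library. The interface declares t as an abstract type and exports two values;
the implementation must then define t under the same name and define both
values with compatible types.

```caml
(* counter.mli: hypothetical interface *)
type t;;                              (* abstract: representation is hidden *)
value create : unit -> t
  and next : t -> int;;

(* counter.ml: hypothetical implementation *)
type t = { mutable count : int };;    (* defines the abstract type t *)
let create () = { count = 0 };;
let next c = begin c.count <- c.count + 1; c.count end;;
```

Another module could then refer to these globals with qualified identifiers,
as in counter__next (counter__create ()), but it could not access the count
label, since the interface does not export it.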
Chapter 3
Language extensions
This chapter describes the language features that are implemented in Caml
Light, but not described in the Caml Light reference manual. In contrast with
the fairly stable kernel language that is described in the reference manual,
the extensions presented here are still experimental, and may be removed or
changed in the future.
3.1 Streams, parsers, and printers
Caml Light provides a built-in type for streams (possibly infinite sequences
of elements that are evaluated on demand), along with stream expressions,
to build streams, and stream patterns, to destructure streams. Streams and
stream patterns provide a natural approach to writing recursive-descent
parsers.
Streams are supported by the following extensions to the syntactic class
of expressions:
expr ::= ...
       | [< >]
       | [< stream-component {; stream-component} >]
       | function stream-matching
       | match expr with stream-matching
stream-component ::= ' expr
                   | expr
stream-matching ::= stream-pattern -> expr {| stream-pattern -> expr}
stream-pattern ::= [< >]
                 | [< stream-comp-pat {; stream-comp-pat} >]
stream-comp-pat ::= ' pattern
                  | expr pattern
                  | ident
Stream expressions are bracketed by [< and >]. They represent the
concatenation of their components. The component ' expr represents the
one-element stream whose element is the value of expr. The component expr
represents a sub-stream. For instance, if both s and t are streams of
integers, then [<'1; s; t; '2>] is a stream of integers containing the element
1, then the elements of s, then those of t, and finally 2. The empty stream
is denoted by [< >].
Unlike every other kind of expression in the language, stream expressions are
subject to lazy evaluation: the components are not evaluated when the
stream is built, but only when they are accessed during stream matching. The
components are evaluated once, the first time they are accessed; the following
accesses reuse the value computed the first time.
Stream patterns, also bracketed by [< and >], describe initial segments of
streams. In particular, the stream pattern [< >] matches all streams. Stream
pattern components are matched against the corresponding elements of a stream.
The component ' pattern matches the corresponding stream element against the
pattern. The component expr pattern applies the function denoted by expr to
the current stream, then matches the result of the function against pattern.
Finally, the component ident simply binds the identifier to the stream being
matched. (The current implementation requires ident to appear last in a stream
pattern.)
Stream matching proceeds destructively: once a component has been matched,
it is discarded from the stream (by in-place modification).
Stream matching proceeds in two steps: first, a pattern is selected by
matching the stream against the first components of the stream patterns; then,
the following components of the selected pattern are checked against the
stream. If the following components do not match, the exception Parse_error
is raised. There is no backtracking here: stream matching commits to the
pattern selected according to the first element. If none of the first
components of the stream patterns match, the exception Parse_failure is
raised. The Parse_failure exception causes the next alternative to be tried,
if it occurs during the matching of the first element of a stream, before
matching has committed to one pattern.
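The following is a minimal sketch of stream matching; the function name sum
is chosen for illustration only. The first case matches one element with a
' pattern component, then applies sum itself to the remainder of the stream
and binds the result to rest. The second case, [< >], is tried when the first
component of the first case does not match, which here happens only when the
stream is exhausted.

```caml
let rec sum = function
    [< 'n; sum rest >] -> n + rest   (* one element, then the sum of the rest *)
  | [< >] -> 0;;                     (* matches all streams; reached at the end *)

sum [< '1; '2; '3 >];;               (* evaluates to 6 *)
```

Since matching is destructive, the stream passed to sum is empty once the
call returns.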
See Functional programming using Caml Light for a gentler introduction
to streams, and for some examples of their use in writing parsers. A more
formal presentation of streams, and a discussion of alternate semantics, can
be found in Parsers in ML by Michel Mauny and Daniel de Rauglaudre, in the
proceedings of the 1992 ACM conference on Lisp and Functional Programming.
3.2 Guards
Cases of a pattern matching can include guard expressions, which are arbitrary
boolean expressions that must evaluate to true for the match case to be
selected. Guards occur just before the -> token and are introduced by the
when keyword:
match expr
with pattern_1 [when cond_1] -> expr_1
   | ...
   | pattern_n [when cond_n] -> expr_n
(Same syntax for the fun, function, and try ... with constructs.) During
matching, if the value of expr matches some pattern pattern_i which has a
guard cond_i, then the expression cond_i is evaluated (in an environment
enriched by the bindings performed during matching). If cond_i evaluates to
true, then expr_i is evaluated and its value returned as the result of the
matching, as usual. But if cond_i evaluates to false, the matching is resumed
against the patterns following pattern_i.
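A short sketch of a guarded matching (the function name sign is chosen for
illustration). The second case is selected only if its guard evaluates to
true; for a negative argument the guard is false, so matching resumes with
the last case.

```caml
let sign = function
    0 -> 0
  | n when n > 0 -> 1     (* n is bound before the guard is evaluated *)
  | _ -> -1;;             (* reached when the guard n > 0 is false *)
```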
3.3 Range patterns
In patterns, Caml Light recognizes the form `c` .. `d` (two character
constants separated by ..) as a shorthand for the pattern
`c` | `c1` | `c2` | ... | `cn` | `d`
where c1, c2, ..., cn are the characters that occur between c and d in the
ASCII character set. For instance, the pattern `0`..`9` matches all
characters that are digits.
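For example, range patterns keep character classification functions short.
The function names below are assumptions made for this sketch; the second
function also assumes that a range can be combined with other patterns using
|, like any other pattern.

```caml
let is_digit = function
    `0`..`9` -> true
  | _ -> false;;

let is_letter = function
    `a`..`z` | `A`..`Z` -> true
  | _ -> false;;
```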
3.4 Recursive definitions of values
Besides let rec definitions of functional values, as described in the
reference manual, Caml Light supports a certain class of recursive definitions
of non-functional values. For instance, the following definition is accepted:
let rec x = 1 :: y and y = 2 :: x;;
and correctly binds x to the cyclic list 1::2::1::2::..., and y to the cyclic
list 2::1::2::1::... Informally, the class of accepted definitions consists of
those definitions where the defined variables occur only inside function
bodies or as a field of a data structure. Moreover, the patterns in the
left-hand sides must be identifiers, nothing more complex.
3.5 Local definitions using where
A postfix syntax for local definitions is provided:
expr ::= ...
       | expr where [rec] let-binding
The expression expr where let-binding behaves exactly as
let let-binding in expr, and similarly for where rec and let rec.
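As a small sketch (function names chosen for illustration), the two
definitions below should behave identically. The parentheses around the body
are not required by the grammar above; they are added only to make the extent
of the where construct explicit.

```caml
(* Postfix form: the local definition follows the expression that uses it. *)
let circle_area r = (pi *. r *. r) where pi = 3.14159;;

(* Equivalent prefix form using let ... in. *)
let circle_area2 r = let pi = 3.14159 in pi *. r *. r;;
```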
3.6 Mutable variant types
The argument of a value constructor can be declared ``mutable'' when the
variant type is defined:
type foo = A of mutable int
| B of mutable int * int
| ...
This allows in-place modification of the argument part of a constructed value.
Modification is performed by a new kind of expression, written ident <- expr,
where ident is an identifier bound by pattern-matching to the argument of a
mutable constructor, and expr denotes the value that must be stored in place
of that argument. Continuing the example above:
let x = A 1 in
begin match x with A y -> y <- 2 | _ -> () end;
x
returns the value A 2. The notation ident <- expr also works if ident is an
identifier bound by pattern-matching to the value of a mutable field in a
record. For instance,
type bar = {mutable lbl : int};;
let x = {lbl = 1} in
begin match x with {lbl = y} -> y <- 2 end;
x
returns the value {lbl = 2}.
3.7 String access
Extra syntactic constructs are provided to access and modify characters in
strings:
expr ::= ...
       | expr .[ expr ]
       | expr .[ expr ] <- expr
The expression expr1 .[ expr2 ] is equivalent to the application
nth_char expr1 expr2. In the initial environment, the identifier nth_char
resolves to a built-in function that returns the character number expr2 in
the string denoted by expr1. The first element has number 0; the last
element has number n-1, where n is the length of the string. The exception
Invalid_argument is raised if the access is out of bounds.
The expression expr1 .[ expr2 ] <- expr3 is equivalent to
set_nth_char expr1 expr2 expr3. In the initial environment, the identifier
set_nth_char resolves to a built-in function that modifies in-place the string
denoted by expr1, replacing character number expr2 by the value of expr3.
The exception Invalid_argument is raised if the access is out of bounds. The
built-in function returns ().
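A brief sketch of both constructs, assuming the default environment in which
nth_char and set_nth_char have their built-in meaning. The string is created
with the library function make_string so that no string constant is modified.

```caml
let s = make_string 3 `a`;;   (* a fresh string "aaa" *)
s.[1] <- `b`;;                (* modifies s in place: s is now "aba" *)
s.[1];;                       (* returns the character `b` *)
nth_char s 1;;                (* the same access, written as an application *)
```

With these definitions, an access such as s.[3] is out of bounds, since valid
character numbers for s run from 0 to 2.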
3.8 Alternate syntax
The syntax of some constructs has been slightly relaxed:
- An optional ; may terminate a sequence, list expression, or record
  expression. For instance, begin e1 ; e2 ; end is syntactically correct
  and synonymous with begin e1 ; e2 end.
- Similarly, an optional | may begin a pattern-matching expression. For
  instance, function | pat1 -> expr1 |... is syntactically correct and
  synonymous with function pat1 -> expr1 |....
- The tokens && and || are recognized as synonyms for & (sequential
  ``and'') and or (sequential ``or''), respectively.
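The relaxed forms listed above can be sketched as follows; the function names
are chosen for illustration only.

```caml
(* Optional ; terminating a sequence and a list expression. *)
let greet () = begin print_string "hello"; print_newline (); end;;
let l = [1; 2; 3;];;

(* Optional | before the first case of a pattern matching. *)
let describe = function
  | 0 -> "zero"
  | _ -> "nonzero";;

(* && and || used in place of & and or. *)
let in_range n = n >= 0 && n < 10;;
let outside n = n < 0 || n >= 10;;
```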
3.9 Infix symbols
Sequences of ``operator characters'', such as <=> or !!, are read as a single
token from the infix-symbol or prefix-symbol class:
infix-symbol ::= (= | < | > | @ | ^ | | | & | ~ | + | - | * | / | $ | %) {operator-char}
prefix-symbol ::= (! | ?) {operator-char}
operator-char ::= ! | $ | % | & | * | + | - | . | / | : | ; | < | = | > | ? | @ | ^ | | | ~
Tokens from these two classes generalize the built-in infix and prefix
operators described in chapter 2:
expr ::= ...
       | prefix-symbol expr
       | expr infix-symbol expr
variable ::= ...
           | prefix prefix-symbol
           | prefix infix-symbol
No #infix directive (section 3.10) is needed to give infix symbols their infix
status. The precedences and associativities of infix symbols in expressions
are determined by their first character(s): symbols beginning with ** have
highest precedence (exponentiation), followed by symbols beginning with *, /
or % (multiplication), then + and - (addition), then @ and ^ (concatenation),
then all other symbols (comparisons). The updated precedence table for
expressions is shown below. We write ``*...'' to mean ``any infix symbol
starting with *''.
----------------------------------------------------------------------
|Construction or operator                          |Associativity    |
----------------------------------------------------------------------
|!... ?...                                         |--               |
|. .( .[                                           |--               |
|function application                              |right            |
|constructor application                           |--               |
|- -. (prefix)                                     |--               |
|**...                                             |right            |
|*... /... %... mod                                |left             |
|+... -...                                         |left             |
|::                                                |right            |
|@... ^...                                         |right            |
|comparisons (= == < etc.), all other infix symbols|left             |
|not                                               |--               |
|& &&                                              |left             |
|or ||                                             |left             |
|,                                                 |--               |
|<- :=                                             |right            |
|if                                                |--               |
|;                                                 |right            |
|let match fun function try                        |--               |
----------------------------------------------------------------------
Some infix and prefix symbols are predefined in the default environment (see
chapters 2 and 13 for a description of their behavior). The others are
initially unbound and must be bound before use, with a
let prefix infix-symbol = expr or let prefix prefix-symbol = expr binding.
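As a sketch, the following binds the initially unbound infix symbol +++ (a
symbol picked for this example) to pairwise addition of integer pairs. Since
+++ begins with +, it takes the precedence and associativity of addition.

```caml
(* Bind the infix symbol +++ before using it. *)
let prefix +++ = fun (a, b) (c, d) -> (a + c, b + d);;

(1, 2) +++ (3, 4);;            (* evaluates to (4, 6) *)
prefix +++ (1, 2) (3, 4);;     (* the same application in prefix notation *)
```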
3.10 Directives
In addition to the standard #open and #close directives, Caml Light provides
three additional directives.
#infix "id"
Change the lexical status of the identifier id: in the remainder of the
compilation unit, id is recognized as an infix operator, just like +.
The notation prefix id can be used to refer to the identifier id itself.
Expressions of the form expr1 id expr2 are parsed as the application
prefix id expr1 expr2. The argument to the #infix directive must be an
identifier, that is, a sequence of letters, digits and underscores
starting with a letter; otherwise, the #infix declaration has no effect.
Example:
#infix "union";;
let prefix union = fun x y -> ... ;;
[1,2] union [3,4];;
#uninfix "id"
Remove the infix status attached to the identifier id by a previous
#infix "id" directive.
#directory "dir-name"
Add the named directory to the path of directories searched for compiled
module interface files. This is equivalent to the -I command-line option
to the batch compiler and the toplevel system.
Part III
The Caml Light commands
Chapter 4
Batch compilation (camlc)
This chapter describes how Caml Light programs can be compiled
non-interactively, and turned into standalone executable files. This is
achieved by the command camlc, which compiles and links Caml Light source
files.
Mac: This command is not a standalone Macintosh application. To run camlc,
you need the Macintosh Programmer's Workshop (MPW) programming
environment. The programs generated by camlc are also MPW tools, not
standalone Macintosh applications.
4.1 Overview of the compiler
The camlc command has a command-line interface similar to that of most C
compilers. It accepts several types of arguments: source files for module
implementations; source files for module interfaces; and compiled module
implementations.
- Arguments ending in .mli are taken to be source files for module
interfaces. Module interfaces declare exported global identifiers,
define public data types, and so on. From the file x.mli, the camlc
compiler produces a compiled interface in the file x.zi.
- Arguments ending in .ml are taken to be source files for module
implementations. Module implementations bind global identifiers to
values, define private data types, and contain expressions to be
evaluated for their side-effects. From the file x.ml, the camlc compiler
produces compiled object code in the file x.zo. If the interface file
x.mli exists, the module implementation x.ml is checked against the
corresponding compiled interface x.zi, which is assumed to exist. If no
interface x.mli is provided, the compilation of x.ml produces a compiled
interface file x.zi in addition to the compiled object code file x.zo.
The file x.zi produced corresponds to an interface that exports
everything that is defined in the implementation x.ml.
- Arguments ending in .zo are taken to be compiled object code. These
files are linked together, along with the object code files obtained by
compiling .ml arguments (if any), and the Caml Light standard library, to
produce a standalone executable program. The order in which .zo and .ml
arguments are presented on the command line is relevant: global
identifiers are initialized in that order at run-time, and it is a
link-time error to use a global identifier before having initialized it.
Hence, a given x.zo file must come before all .zo files that refer to
identifiers defined in the file x.zo.
The output of the linking phase is a file containing compiled code that can
be executed by the Caml Light runtime system: the command named camlrun. If
caml.out is the name of the file produced by the linking phase, the command
camlrun caml.out arg1 arg2 ... argn
executes the compiled code contained in caml.out, passing it as arguments the
character strings arg1 to argn. (See chapter 6 for more details.)
Unix: On most Unix systems, the file produced by the linking phase can be run
directly, as in:
./caml.out arg1 arg2 ... argn
The produced file has the executable bit set, and it manages to launch
the bytecode interpreter by itself.
PC: The output file produced by the linking phase is directly executable,
provided it is given extension .EXE. Hence, if the output file is named
caml_out.exe, you can execute it with the command
caml_out arg1 arg2 ... argn
Actually, the produced file caml_out.exe is a tiny executable file
prepended to the bytecode file. The executable simply runs the camlrun
runtime system on the remainder of the file. (As a consequence, this
is not a standalone executable: it still requires camlrun.exe to
reside in one of the directories in the path.)
4.2 Options
The following command-line options are recognized by camlc.
-c Compile only. Suppress the linking phase of the compilation. Source
code files are turned into compiled files, but no executable file is
produced. This option is useful to compile modules separately.
-ccopt option
Pass the given option to the C compiler and linker, when linking in
``custom runtime'' mode (see the -custom option). For instance, -ccopt
-Ldir causes the C linker to search for C libraries in directory dir.
-custom
Link in ``custom runtime'' mode. In the default linking mode, the linker
produces bytecode that is intended to be executed with the shared runtime
system, camlrun. In the custom runtime mode, the linker produces an
output file that contains both the runtime system and the bytecode for
the program. The resulting file is considerably larger, but it can be
executed directly, even if the camlrun command is not installed.
Moreover, the ``custom runtime'' mode enables linking Caml Light code
with user-defined C functions, as described in chapter 12.
Unix: Never strip an executable produced with the -custom option.
PC: This option requires the DJGPP port of the GNU C compiler to be
installed.
-files response-file
Process the files whose names are listed in file response-file, just as
if these names appeared on the command line. File names in response-file
are separated by blanks (spaces, tabs, newlines). This option makes it
possible to overcome silly limitations on the length of the command line.
-g Cause the compiler to produce additional debugging information. During
the linking phase, this option adds information at the end of the
executable bytecode file produced. This information is required by the
debugger camldebug and also by the catch-all exception handler from the
standard library module printexc.
During the compilation of an implementation file (.ml file), when the -g
option is set, the compiler adds debugging information to the .zo file.
It also writes a .zix file that describes the full interface of the .ml
file, that is, all types and values defined in the .ml file, including
those that are local to the .ml file (i.e. not declared in the .mli
interface file). Used in conjunction with the -g option to the toplevel
system (chapter 5), the .zix file gives access to the local values of the
module, making it possible to print or ``trace'' them. The .zix file is
not produced if the implementation file has no explicit interface, since,
in this case, the module has no local values.
-i Cause the compiler to print the declared types, exceptions, and global
variables (with their inferred types) when compiling an implementation
(.ml file). This can be useful to check the types inferred by the
compiler. Also, since the output follows the syntax of module
interfaces, it can help in writing an explicit interface (.mli file) for
a file: just redirect the standard output of the compiler to a .mli
file, and edit that file to remove all declarations of unexported
globals.
-I directory
Add the given directory to the list of directories searched for compiled
interface files (.zi) and compiled object code files (.zo). By default,
the current directory is searched first, then the standard library
directory. Directories added with -I are searched after the current
directory, but before the standard library directory. When several
directories are added with several -I options on the command line, these
directories are searched from right to left (the rightmost directory is
searched first, the leftmost is searched last). (Directories can also be
added to the search path from inside the programs with the #directory
directive; see chapter 3.)
-lang language-code
Translate the compiler messages to the specified language. The
language-code is fr for French, es for Spanish, de for German, ... (See
the file camlmsgs.txt in the Caml Light standard library directory for a
list of available languages.) When an unknown language is specified, or
no translation is available for a message, American English is used by
default.
-o exec-file
Specify the name of the output file produced by the linker.
Unix: The default output name is a.out, in keeping with the tradition.
PC: The default output name is caml_out.exe.
Mac: The default output name is Caml.Out.
-O module-set
Specify which set of standard modules is to be implicitly ``opened'' at
the beginning of a compilation. There are three module sets currently
available:
cautious
provides the standard operations on integers, floating-point numbers,
characters, strings, arrays, ..., as well as exception handling,
basic input/output, etc. Operations from the cautious set perform
range and bound checking on string and array operations, as well as
various sanity checks on their arguments.
fast
provides the same operations as the cautious set, but without sanity
checks on their arguments. Programs compiled with -O fast are
therefore slightly faster, but unsafe.
none
suppresses all automatic opening of modules. Compilation starts in
an almost empty environment. This option is not of general use,
except to compile the standard library itself.
The default compilation mode is -O cautious. See chapter 13 for a
complete listing of the modules in the cautious and fast sets.
-p Compile and link in profiling mode. See the description of the profiler
camlpro in chapter 10.
-v Print the version number of the compiler.
-W Print extra warning messages for the following events:
- A #open directive is useless (no identifier in the opened module is
ever referenced).
- A variable name in a pattern matching is capitalized (often
corresponds to a misspelled constant constructor).
Unix: The following environment variable is also consulted:
LANG
When set, controls which language is used to print the compiler
messages (see the -lang command-line option).
PC: The following environment variables are also consulted:
CAMLLIB
Contains the path to the standard library directory.
LANG
When set, controls which language is used to print the compiler
messages (see the -lang command-line option).
4.3 Modules and the file system
This short section is intended to clarify the relationship between the names
of the modules and the names of the files that contain their compiled
interface and compiled implementation.
The compiler always derives the name of the compiled module by taking the
base name of the source file (.ml or .mli file). That is, it strips the
leading directory name, if any, as well as the .ml or .mli suffix. The
produced .zi and .zo files have the same base name as the source file; hence,
the compiled files produced by the compiler always have their base name equal
to the name of the module they describe (for .zi files) or implement (for .zo
files).
For compiled interface files (.zi files), this invariant must be preserved
at all times, since the compiler relies on it to load the compiled interface
file for the modules that are used from the module being compiled. Hence, it
is risky and generally incorrect to rename .zi files. It is admissible to
move them to another directory, if their base name is preserved, and the
correct -I options are given to the compiler.
Compiled bytecode files (.zo files), on the other hand, can be freely
renamed once created. That's because 1- .zo files contain the true name of
the module they define, so there is no need to derive that name from the file
name; 2- the linker never attempts to find by itself the .zo file that
implements a module of a given name: it relies on the user providing the list
of .zo files by hand.
4.4 Common errors
This section describes and explains the most frequently encountered error
messages.
Cannot find file filename
The named file could not be found in the current directory, nor in the
directories of the search path. The filename is either a compiled
interface file (.zi file), or a compiled bytecode file (.zo file). If
filename has the format mod.zi, this means you are trying to compile a
file that references identifiers from module mod, but you have not yet
compiled an interface for module mod. Fix: compile mod.mli or mod.ml
first, to create the compiled interface mod.zi.
If filename has the format mod.zo, this means you are trying to link a
bytecode object file that does not exist yet. Fix: compile mod.ml
first.
If your program spans several directories, this error can also appear
because you haven't specified the directories to look into. Fix: add
the correct -I options to the command line.
Corrupted compiled interface file filename
The compiler produces this error when it tries to read a compiled
interface file (.zi file) that has the wrong structure. This means
something went wrong when this .zi file was written: the disk was full,
the compiler was interrupted in the middle of the file creation, and so
on. This error can also appear if a .zi file is modified after its
creation by the compiler. Fix: remove the corrupted .zi file, and
rebuild it.
This expression has type t1, but is used with type t2
This is by far the most common type error in programs. Type t1 is the
type inferred for the expression (the part of the program that is
displayed in the error message), by looking at the expression itself.
Type t2 is the type expected by the context of the expression; it is
deduced by looking at how the value of this expression is used in the
rest of the program. If the two types t1 and t2 are not compatible, then
the error above is produced.
In some cases, it is hard to understand why the two types t1 and t2 are
incompatible. For instance, the compiler can report that ``expression of
type foo cannot be used with type foo'', and it really seems that the two
types foo are compatible. This is not always true. Two type
constructors can have the same name, but actually represent different
types. This can happen if a type constructor is redefined. Example:
type foo = A | B;;
let f = function A -> 0 | B -> 1;;
type foo = C | D;;
f C;;
This results in the error message ``expression C of type foo cannot be
used with type foo''.
Incompatible types with the same names can also appear when a module is
changed and recompiled, but some of its clients are not recompiled.
That's because type constructors in .zi files are not represented by
their name (that would not suffice to identify them, because of type
redefinitions), but by unique stamps that are assigned when the type
declaration is compiled. Consider the three modules:
mod1.ml: type t = A | B;;
let f = function A -> 0 | B -> 1;;
mod2.ml: let g x = 1 + mod1__f(x);;
mod3.ml: mod2__g mod1__A;;
Now, assume mod1.ml is changed and recompiled, but mod2.ml is not
recompiled. The recompilation of mod1.ml can change the stamp assigned
to type t. But the interface mod2.zi will still use the old stamp for
mod1__t in the type of mod2__g. Hence, when compiling mod3.ml, the
system complains that the argument type of mod2__g (that is, mod1__t with
the old stamp) is not compatible with the type of mod1__A (that is,
mod1__t with the new stamp). Fix: use make or a similar tool to ensure
that all clients of a module mod are recompiled when the interface mod.zi
changes. To check that the Makefile contains the right dependencies,
remove all .zi files and rebuild the whole program; if no ``Cannot find
file'' error appears, you're all set.
The type inferred for name, that is, t, contains non-generalizable type variables
Type variables ('a, 'b, ...) in a type t can be in either of two states:
generalized (which means that the type t is valid for all possible
instantiations of the variables) and not generalized (which means that
the type t is valid only for one instantiation of the variables). In a
let binding let name = expr, the type-checker normally generalizes as
many type variables as possible in the type of expr. However, this leads
to unsoundness (a well-typed program can crash) in conjunction with
polymorphic mutable data structures. To avoid this, generalization is
performed at let bindings only if the bound expression expr belongs to
the class of ``syntactic values'', which includes constants, identifiers,
functions, tuples of syntactic values, etc. In all other cases (for
instance, expr is a function application), a polymorphic mutable data structure could
have been created and generalization is therefore turned off.
Non-generalized type variables in a type cause no difficulties inside a
given compilation unit (the contents of a .ml file, or an interactive
session), but they cannot be allowed in types written in a .zi compiled
interface file, because they could be used inconsistently in other
compilation units. Therefore, the compiler flags an error when a .ml
implementation without a .mli interface defines a global variable name
whose type contains non-generalized type variables. There are two
solutions to this problem:
- Add a type constraint or a .mli interface to give a monomorphic type
(without type variables) to name. For instance, instead of writing
let sort_int_list = sort (prefix <);;
(* inferred type 'a list -> 'a list, with 'a not generalized *)
write
let sort_int_list = (sort (prefix <) : int list -> int list);;
- If you really need name to have a polymorphic type, turn its defining
expression into a function by adding an extra parameter. For
instance, instead of writing
let map_length = map vect_length;;
(* inferred type 'a vect list -> int list, with 'a not generalized *)
write
let map_length lv = map vect_length lv;;
mod__name is referenced before being defined
This error appears when trying to link an incomplete or incorrectly
ordered set of files. Either you have forgotten to provide an
implementation for the module named mod on the command line (typically,
the file named mod.zo, or a library containing that file). Fix: add the
missing .ml or .zo file to the command line. Or, you have provided an
implementation for the module named mod, but it comes too late on the
command line: the implementation of mod must come before all bytecode
object files that reference one of the global variables defined in module
mod. Fix: change the order of .ml and .zo files on the command line.
Of course, you will always encounter this error if you have mutually
recursive functions across modules. That is, function mod1__f calls
function mod2__g, and function mod2__g calls function mod1__f. In this
case, no matter what permutations you perform on the command line, the
program will be rejected at link-time. Fixes:
- Put f and g in the same module.
- Parameterize one function by the other. That is, instead of having
mod1.ml: let f x = ... mod2__g ... ;;
mod2.ml: let g y = ... mod1__f ... ;;
define
mod1.ml: let f g x = ... g ... ;;
mod2.ml: let rec g y = ... mod1__f g ... ;;
and link mod1 before mod2.
- Use a reference to hold one of the two functions, as in:
mod1.ml: let forward_g =
ref((fun x -> failwith "forward_g") : <type>);;
let f x = ... !forward_g ... ;;
mod2.ml: let g y = ... mod1__f ... ;;
mod1__forward_g := g;;
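As a concrete sketch of the reference technique, the two mutually recursive
functions below test the parity of a non-negative integer. For brevity, both
halves are shown as a single sequence of phrases; in a real two-module
program the second half would live in mod2.ml and would refer to mod1__f and
mod1__forward_g. The parity logic and the type int -> bool are illustrative
assumptions, not part of the example above.

```ocaml
(* Part that would live in mod1.ml. The placeholder function is only
   executed if f is called before forward_g has been assigned. *)
let forward_g =
  ref ((fun x -> failwith "forward_g") : int -> bool);;

(* f answers "is x even?" and calls g indirectly, through the reference. *)
let f x = if x = 0 then true else !forward_g (x - 1);;

(* Part that would live in mod2.ml. g answers "is y odd?" and calls f
   directly, since f is already defined at this point. *)
let g y = if y = 0 then false else f (y - 1);;

(* Tie the knot: from now on, f reaches g through forward_g. *)
forward_g := g;;
```

The assignment to forward_g must be executed before the first call to f;
otherwise the placeholder raises the exception Failure "forward_g". With this
layout, mod1 contains no reference to mod2, so linking mod1 before mod2
satisfies the ordering constraint.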
Unavailable C primitive f
This error appears when trying to link code that calls external functions
written in C in ``default runtime'' mode. As explained in chapter 12,
such code must be linked in ``custom runtime'' mode. Fix: add the
-custom option, as well as the (native code) libraries and (native code)
object files that implement the required external functions.
Chapter 5
The toplevel system (camllight)
This chapter describes the toplevel system for Caml Light, which permits
interactive use of the Caml Light system through a read-eval-print loop. In
this mode, the system repeatedly reads Caml Light phrases from the input,
typechecks, compiles and evaluates them, then prints the inferred type and
result value, if any. The system prints a # (sharp) prompt before reading
each phrase. A phrase can span several lines. Phrases are delimited by ;;
(the final double-semicolon).
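For illustration, here are two phrases as they might be typed at the #
prompt (the prompt itself is printed by the system and is not part of the
input). The function fact is a made-up example, not a library function.

```ocaml
(* A definition phrase spanning several lines; it is only processed
   once the terminating double semicolon has been read. *)
let rec fact n =
  if n <= 1 then 1
  else n * fact (n - 1);;

(* An expression phrase: it is typechecked, compiled and evaluated,
   and the toplevel then reports its type and value. *)
fact 5;;
```

For the first phrase the toplevel reports the inferred type of fact, which is
int -> int; for the second it reports the type int and the value 120.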
From the standpoint of the module system, all phrases entered at toplevel
are treated as the implementation of a module named top. Hence, all toplevel
definitions are entered in the module top.
Unix: The toplevel system is started by the command camllight. Phrases are
read on standard input, results are printed on standard output, errors
on standard error. End-of-file on standard input terminates camllight
(see also the quit system function below).
The toplevel system does not perform line editing, but it can easily be
used in conjunction with an external line editor such as fep; just run
fep -emacs camllight or fep -vi camllight. Another option is to use
camllight under Gnu Emacs, which gives the full editing power of Emacs
(see the directory contrib/camlmode in the distribution).
At any point, the parsing, compilation or evaluation of the current
phrase can be interrupted by pressing ctrl-C (or, more precisely, by
sending the intr signal to the camllight process). This goes back to
the # prompt.
Mac: The toplevel system is presented as the standalone Macintosh
application Caml Light. This application does not require the
Macintosh Programmer's Workshop to run.
Once launched from the Finder, the application opens two windows,
``Caml Light Input'' and ``Caml Light Output''. Phrases are entered in
the ``Caml Light Input'' window. The ``Caml Light Output'' window
displays a copy of the input phrases as they are processed by the Caml
Light toplevel, interspersed with the toplevel responses. The
``Return'' key sends the contents of the Input window to the Caml Light
toplevel. The ``Enter'' key inserts a newline without sending the
contents of the Input window. (This can be configured with the
``Preferences'' menu item.)
The contents of the input window can be edited at all times, with the
standard Macintosh interface. A history of previously entered phrases
is maintained, and can be accessed with the ``Previous entry''
(command-P) and ``Next entry'' (command-N) menu items.
To quit the Caml Light application, either select ``Quit'' from the
``Files'' menu, or use the quit function described below.
At any point, the parsing, compilation or evaluation of the current
phrase can be interrupted by pressing ``command-period'', or by
selecting the item ``Interrupt Caml Light'' in the ``Caml Light'' menu.
This goes back to the # prompt.
PC: The toplevel system is presented as a Windows application named
Camlwin.exe. It should be launched from the Windows file manager or
program manager.
The ``Terminal'' window is split into two panes. Phrases are entered
and edited in the bottom pane. The top pane displays a copy of the
input phrases as they are processed by the Caml Light toplevel,
interspersed with the toplevel responses. The ``Return'' key sends the
contents of the bottom pane to the Caml Light toplevel. The ``Enter''
key inserts a newline without sending the contents of the bottom pane.
(This can be configured with the ``Preferences'' menu item.)
The contents of the input window can be edited at all times, with the
standard Windows interface. A history of previously entered phrases
is maintained and displayed in a separate window.
To quit the Camlwin application, either select ``Quit'' from the
``File'' menu, or use the quit function described below.
At any point, the parsing, compilation or evaluation of the current
phrase can be interrupted by selecting the ``Interrupt Caml Light''
menu item. This goes back to the # prompt.
A text-only version of the toplevel system is available under the name
caml.exe. It runs under MSDOS as well as under Windows in a DOS
window. No editing facilities are provided.
5.1 Options
The following command-line options are recognized by the caml or camllight
commands.
-g Start the toplevel system in debugging mode. This mode gives access to
values and types that are local to a module, that is, not exported by the
interface of the module. When debugging mode is off, these local objects
are not accessible (attempts to access them produce an ``Unbound
identifier'' error). When debugging mode is on, these objects become
visible, just like the objects that are exported in the module interface.
In particular, values of abstract types are printed using their concrete
representations, and the functions local to a module can be ``traced''
(see the trace function in section 5.2). This applies only to the
modules that have been compiled in debugging mode (either by the batch
compiler with the -g option, or by the toplevel system in debugging
mode), that is, those modules that have an associated .zix file.
-I directory
Add the given directory to the list of directories searched for compiled
interface files (.zi) and compiled object code files (.zo). By default,
the current directory is searched first, then the standard library
directory. Directories added with -I are searched after the current
directory, but before the standard library directory. When several
directories are added with several -I options on the command line, these
directories are searched from right to left (the rightmost directory is
searched first, the leftmost is searched last). Directories can also be
added to the search path once the toplevel is running with the #directory
directive; see chapter 3.
-lang language-code
Translate the toplevel messages to the specified language. The
language-code is fr for French, es for Spanish, de for German, ... (See
the file camlmsgs.txt in the Caml Light standard library directory for a
list of available languages.) When an unknown language is specified, or
no translation is available for a message, American English is used by
default.
-O module-set
Specify which set of standard modules is to be implicitly ``opened'' when
the toplevel starts. There are three module sets currently available:
cautious
provides the standard operations on integers, floating-point numbers,
characters, strings, arrays, ..., as well as exception handling,
basic input/output, ... Operations from the cautious set perform range
and bound checking on string and vector operations, as well as
various sanity checks on their arguments.
fast
p