Welcome to Command Line Tools Help

In the background, OCIDE uses several command line tools known as the Orange Toolchain to perform various functions such as compiling and linking.

This document describes the Orange Toolchain in some detail.

This is a fairly classic toolchain, with one addition: a linker post-processing (or download) stage that turns the linker output file into an executable program. There are several linker post-processing stages to target different platforms, including one that generates various types of HEX file formats.

General tools compile or assemble code, and manage the resulting object files:

OCC is an optimizing C compiler which handles the various flavors of C from C89 through C11.
OAsm is an x86 assembler.  It uses a syntax that is very similar to the Netwide Assembler (NASM).
OLib is an object file librarian.
OLink is an object file linker.

Linker postprocessing tools take the linker output and make a device- or OS-specific binary image that serves as the final executable image:
DLHex is the utility to make hex and binary files, for ROM-based images.
DLMZ is the utility to make 16-bit MSDOS executables.  The compiler will not output 16-bit code, but this may be used with the assembler.
DLLE is the utility to make 32-bit MSDOS executables that aren't Windows compatible.
DLPE is the utility to make Windows 32-bit executables.

There are several external utilities that the IDE doesn't use directly, but that can be useful from the command line:
OCPP is a C and assembly language preprocessor.  It understands C89, C99, C11, and OAsm preprocessor directive syntaxes.
OGrep looks for regular expressions within source code files.
OMake is a make utility very similar to GNU make.


Some utilities are specific to developing WIN32 programs:
OImpLib is a WIN32 import librarian capable of managing imports from DLL and .DEF files.
ORC is a WIN32 resource compiler.

Here are the license terms



The BSD Licensing terms are as follows:

Software License Agreement (BSD License)
 
Copyright (c) 1997-2013, David Lindauer, (LADSoft).
All rights reserved.
 
Redistribution and use of this software in source and binary forms, 
    with or without modification, are permitted provided that the following 
    conditions are met:
 
* Redistributions of source code must retain the above
   copyright notice, this list of conditions and the
   following disclaimer.
 
* Redistributions in binary form must reproduce the above
   copyright notice, this list of conditions and the
   following disclaimer in the documentation and/or other
   materials provided with the distribution.
 
* Neither the name of LADSoft nor the names of its
   contributors may be used to endorse or promote products
   derived from this software without specific prior
   written permission of LADSoft.
 
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 
    AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, 
    THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR 
    PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER 
    OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, 
    EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, 
    PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; 
    OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, 
    WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR 
    OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


OCC

OCC is an optimizing compiler capable of compiling C language files written to the C99 and C11 standards.  However, in its default mode it compiles to the older pre-C99 standard, for which most legacy programs are written.

OCC currently only generates code for the x86 series processor. Together with the rest of the toolchain and supplied libraries, it can be used to create WIN32 program files. This toolchain also includes extenders necessary for running WIN32 applications on MSDOS, so OCC may be run on MSDOS and used to generate MSDOS programs as well.

By default OCC will spawn the necessary subprograms to generate a completed executable from a source file.

A companion program, OCL, may be used to generate MSDOS executables which depend on one of a variety of MSDOS extenders.

The general form of an OCC Command Line is:

OCC [options] filename-list

Where filename-list gives a list of files to compile. 

In addition to support for the C99 standard, OCC supports a variety of the usual compiler extensions  found in MSDOS and WIN32 compilers.

OCC also supports a range of #pragma preprocessor directives to allow some level of control over the generated code. Such directives include support for structure alignment, having the CRTL run routines as part of its normal startup and shutdown process, and so forth.


Command Line

OCC has a variety of command line parameters, most of which aren't needed to just compile a file. It will also allow you to specify multiple input files. The current default for OCC is to generate executable files. While processing the command line, OCC may encounter a command to process command line arguments from a file.

The general format of the command line is as follows:

OCC [parameters] list of files

The list of files can be a list of one or more C language files. C++ language files are partially supported, but the C++ support is minimal at this time. If you don't specify an extension on the command line it will default to .C; it will detect a .CPP extension and activate C++ mode as required.

OCC will accept response files with a list of command line options. To use a response file, prefix its name with '@':

OCC [parameters] @resp.cc

There are a variety of parameters that can be set. Help is available for the following:

Output Control
Error Control
List File Control
Preprocessor File Control
Compilation Modes
Defining Macros
Specifying Include Paths
Translating Trigraphs
Code Generation Parameters
Optimizer Parameters


Output Control

The following switches are available for output control:

/c generate object file only

Causes OCC to not generate an EXE file automatically. Useful when compiling many files prior to a later link stage.

OCC /c hello.c

generates a file hello.o instead of generating hello.exe

/o  set output file name

Causes OCC to rename the output file. If generating an EXE output, OCC will rename the exe file. If generating an object (OBJ) file, OCC will rename the obj file. Note that you cannot set the output file name for a group of files unless you are generating an EXE file.

OCC /ohi hello.c

generates an EXE file called HI.EXE.

whereas 

OCC /c /obeep hello.c

generates an object file called BEEP.O.

/S generate assembly language file only

Causes OCC to generate an assembly language file in NASM format, but no object or EXE files.

OCC /S hello.c

generates a file HELLO.ASM.

/s generate intermediate assembly language file

OCC will generate an executable file by compiling via assembly. The intermediate assembly language file will remain after compilation.

OCC /s hello.c
 
generates the files HELLO.O, HELLO.ASM, and HELLO.EXE

/v generate debug information

OCC will generate debug information for use by the IDE.

/W  set exe file type

When OCC is generating an EXE file, several formats are possible. These are as follows:

/Wd - generate a WIN32 DLL program 
/Wdc - generate a WIN32 DLL program, use CRTDLL run time library
/Wdl - generate a WIN32 DLL program, use LSCRTL run time library
/Wdm - generate a WIN32 DLL program, use MSVCRT run time library
/Wc - generate a WIN32 console program
/Wcc - generate a WIN32 console program, use CRTDLL run time library
/Wcl - generate a WIN32 console program, use LSCRTL run time library
/Wcm - generate a WIN32 console program, use MSVCRT run time library
/Wg - generate a WIN32 gui program
/Wgc - generate a WIN32 gui program, use CRTDLL run time library
/Wgl - generate a WIN32 gui program, use LSCRTL run time library
/Wgm - generate a WIN32 gui program, use MSVCRT run time library
/We - generate an MSDOS program (using Tran's PMODE)
/Wa - generate an MSDOS program (using DOS32A)
/Wh - generate an MSDOS program (using HXDOS/WIN32 runtime)
/Wr - generate a RAW program
/Wx - generate an MSDOS program (using HXDOS/DOS runtime)

  OCC /Wcl hello.c

  generates a win32 console program hello.exe, which will use the LSCRTL.DLL run time library.

Note: when compiling files for use with the LSCRTL.DLL, one of the switches /Wdl /Wcl or /Wgl must be present to indicate to the compiler that linkage will be against that library. Failing to use one of these switches can result in errant run-time behavior.


Error Control

The following switches are available for error messages control:

+e  put the compiler errors in a file. 

For each file processed, OCC will create a file with the same name as the original source with the extension '.err'. The contents of this file will be a listing of any errors or warnings which occurred during the compile. For example:

OCC +e myfile.c

results in myfile.err

+Q Quiet mode

Don't display errors or warnings on the console. Generally this is used in conjunction with the +e switch. For example:

OCC +e +Q myfile.c

puts the errors in a file, without displaying them on the console.


/E[+-]nn error control

nn is the maximum number of errors before the compile fails. If + is specified, extended warnings that are normally disabled will be shown. If - is specified, warnings will be turned off. For example:

OCC /E+44 myfile.c

enables extended warnings and limits the number of errors to 44. By default, only 25 errors are shown before the compiler aborts. Similarly,

OCC /E- myfile.c

compiles myfile.c without displaying any warnings.
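
The /E syntax above can be pictured with a small parser. This is only a sketch of the rules as stated in this section, not OCC's own code: the structure error_options and the function parse_error_switch are hypothetical names, and the function is assumed to receive the text that follows /E.

```c
#include <stdlib.h>

/* Hypothetical container for the settings controlled by /E. */
struct error_options {
    int max_errors;         /* compile stops after this many errors */
    int extended_warnings;  /* nonzero if '+' was given */
    int warnings_enabled;   /* zero if '-' was given */
};

/* Parse the text following /E, e.g. "+44", "-" or "10".
   Assumption: the default limit of 25 is taken from the text above. */
struct error_options parse_error_switch(const char *arg)
{
    struct error_options opts;
    opts.max_errors = 25;
    opts.extended_warnings = 0;
    opts.warnings_enabled = 1;

    if (*arg == '+') {
        opts.extended_warnings = 1;   /* show extended warnings */
        arg++;
    } else if (*arg == '-') {
        opts.warnings_enabled = 0;    /* turn warnings off */
        arg++;
    }
    if (*arg >= '0' && *arg <= '9')
        opts.max_errors = atoi(arg);  /* optional error limit nn */
    return opts;
}
```

With this sketch, "+44" yields a limit of 44 with extended warnings on, while "-" leaves the limit at its default and turns warnings off.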


List File Control

The following switches are available for list file control:

+l create a listing file 

For each file processed, OCC will create a file with the same name as the original source with the extension '.lst'. The contents of this file will be various information gathered about the program which was processed. For example:

OCC +l myfile.c

results in myfile.lst


Preprocessor File Control

The following switches are available for preprocessor control:

+i create a file with preprocessed text. 

For each file processed, OCC will create a file with the same name as the original source with the extension '.i'. The contents of this file will be the source code, with each identifier that corresponds to a macro expanded to its full value. For example:

OCC +i myfile.c

results in myfile.i, where each #defined identifier is replaced with its value.




Compilation Modes

The following switches are available to set the language compatibility mode:

+A disable non-ansi extensions

By default the compiler allows several extensions to ANSI C, to make coding easier. If you want strict adherence to ANSI, use this switch. For example:

OCC +A myfunc.c

will enable ANSI mode

+9 C99 Compatibility

By default the compiler compiles for the pre-C99 standard. If you want the extended features available in the later C99 standard, use this switch. For example:

OCC /9 myfunc.c

Will enable C99 mode.

+1 C11 Compatibility

Enables compatibility with the C11 standard.  For example:

OCC /1 myfunc.c

Will enable C11 mode.


Defining Macros

The following switches are useful for defining macros:

/Dxxx define a macro

This switch defines a macro as if a #define statement had been used somewhere in the source. It is useful for building different versions of a program without modifying the source files between compiles. Note that you may not give macros defined this way a value. For example:

OCC /DORANGE myfunc.c

  is equivalent to placing the following statement at the beginning of the source file and compiling it.

#define ORANGE 
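
As a sketch of how such a macro might be used, the following source selects one of two code paths depending on whether ORANGE is defined. The function name uses_orange is hypothetical and exists only for illustration.

```c
/* uses_orange is a hypothetical function used only for illustration.
   Compiling with   OCC /DORANGE myfunc.c   selects the first branch;
   compiling without /DORANGE selects the second. */
int uses_orange(void)
{
#ifdef ORANGE
    return 1;   /* built with /DORANGE */
#else
    return 0;   /* built without it */
#endif
}
```

Because the macro carries no value, it is tested with #ifdef rather than compared against a number.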



Specifying Include Paths

The following switches are useful for specifying where to find #included files.

/Ipath specify include path. 

If your file uses headers that aren't in the directory you are running Orange C from, you will have to tell it what directory to look in. You can have multiple search directories by using a semicolon between the directory names. If there are multiple files with the same name in different directories being searched, OCC will use the first instance of the file it finds by searching the path in order. For example:

OCC /I..\include;..\source;c:\libraries\include myfile.c

Will search the paths ..\include, ..\source, and c:\libraries\include in that order. Note that you generally don't have to specify a path to the OCC compiler header files such as stdio.h, as they will be added to the list of paths automatically.
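
The first-match-wins search order can be sketched as follows. This illustrates the rule described above and is not OCC's implementation. The names exists_fn, first_match_index and demo_exists are hypothetical, and demo_exists stands in for a real file-system check so that the example is self-contained.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical predicate type: returns nonzero if 'path' names an existing file. */
typedef int (*exists_fn)(const char *path);

/* Return the zero-based index of the first directory in the
   semicolon-separated list 'dirs' that contains 'name', or -1 if none does. */
int first_match_index(const char *dirs, const char *name, exists_fn exists)
{
    char path[260];
    const char *p = dirs;
    int index = 0;

    while (*p != '\0') {
        size_t len = strcspn(p, ";");     /* length of this directory entry */
        if (len > 0) {
            if (len + 1 + strlen(name) + 1 <= sizeof path) {
                /* Build "dir\name" for the current entry. */
                sprintf(path, "%.*s\\%s", (int)len, p, name);
                if (exists(path))
                    return index;         /* first match wins */
            }
            index++;
        }
        p += len;
        if (*p == ';')
            p++;                          /* skip the separator */
    }
    return -1;
}

/* Demo predicate: pretends that myhdr.h exists in ..\source and in
   c:\libraries\include, but not in ..\include. */
int demo_exists(const char *path)
{
    return strcmp(path, "..\\source\\myhdr.h") == 0 ||
           strcmp(path, "c:\\libraries\\include\\myhdr.h") == 0;
}
```

With the path list from the example above, the demo predicate makes the search stop at ..\source (index 1), even though c:\libraries\include also holds a file of the same name.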


Translating Trigraphs

The following switches are available to aid in translating trigraphs.

/T Translate trigraphs

Use this switch to have OCC translate trigraphs. By default OCC will not translate trigraphs, both for compatibility and to compile slightly faster. For example:

OCC /T myfile.c

Translates any trigraphs in the text.
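
For reference, the nine standard C trigraphs map ??= to #, ??( to [, ??/ to \, ??) to ], ??' to ^, ??< to {, ??! to |, ??> to } and ??- to ~. The following sketch shows what such a translation pass amounts to. It is a stand-alone illustration with hypothetical function names, not the code OCC uses.

```c
#include <stddef.h>

/* Map the third character of a trigraph to its replacement, or return 0
   if the character does not complete a trigraph. */
char trigraph_replacement(char c)
{
    switch (c) {
    case '=':  return '#';
    case '(':  return '[';
    case '/':  return '\\';
    case ')':  return ']';
    case '\'': return '^';
    case '<':  return '{';
    case '!':  return '|';
    case '>':  return '}';
    case '-':  return '~';
    default:   return 0;
    }
}

/* Translate trigraphs in 'text' in place and return the new length. */
size_t translate_trigraphs(char *text)
{
    size_t in = 0, out = 0;

    while (text[in] != '\0') {
        char r = 0;
        /* Two question marks followed by a known third character. */
        if (text[in] == '?' && text[in + 1] == '?')
            r = trigraph_replacement(text[in + 2]);
        if (r != 0) {
            text[out++] = r;      /* three input characters become one */
            in += 3;
        } else {
            text[out++] = text[in++];
        }
    }
    text[out] = '\0';
    return out;
}
```

The translation only ever shortens the text, which is why it can be done in place.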


Code Generation Parameters

The following switches guide the code generation process

/Cparams specifies code generation parameters

Params is a list of parameters, separated by + and - symbols. Use the + symbol to indicate a parameter is to be turned on, and the - symbol to indicate that it is to be turned off. The default states of the various parameters are the opposite of what is shown here; i.e. this lists how to change each parameter from its default state.

Where params is one or more of:

+d display diagnostics

  displays memory diagnostics, and some indication of what internal errors have occurred

-b merge BSS with initialized data

Normally there are two segments used, one for initialized data and one for uninitialized data.  This prevents the OS from having to load uninitialized data from a file; instead the program just zeroes it during startup. This switch merges the two sections into one initialized data section.

-l don't put C source in the ASM file

When an ASM file output option is specified, this will create an ASM file without cross-referencing the assembly code to the lines of the source file.

-m don't use leading underscores

Normal C language procedure is to prepend each identifier with an underscore. If you want to use the compiler to generate function and variable names without an underscore use this switch. However, doing so will create an incompatibility with the run time libraries and you won't be able to link.

+r reverse order of bit operations

Normally bit fields are allocated from least significant bit to most significant bit of whatever size variable is being used. Use this switch to reverse the order. However, this may create incompatibilities with the libraries which result in code bugs.

+s align stack

This switch causes OCC to emit code to align the stack to 16-byte boundaries. This is useful to speed up operations that involve loading and storing double-precision floating point values to auto variables. By default, the run-time libraries cause main() or WinMain() to execute with an aligned stack.

+F use FLAT model in ASM file

When using ASM file, select FLAT model.

+I use Microsoft-style imports

Normally the linker creates a thunk table with jump addresses that jump indirectly through the import table. This allows basic C code to compile and link.  However some linkers do not support this and instead need the compiler to call indirectly through the import table rather than to a thunk table. Use this switch to generate code compatible with these linkers.

+M generate MASM assembler file

+N generate NASM assembler file 

+NX generate generic NASM assembler file

+R use the far keyword to create far pointers or far procedure frames

+S add stack checking code
  
This switch adds calls to the run-time library's stack checking code. If the stack overruns, an error is generated.

+T generate TASM assembler file

+U do not assume DS == SS

+Z add profiler calls

This switch adds calls to a profiler module at the beginning and ending of each compiled function. This is for DOS compatibility; the WIN32 profiler module does not exist at present.  For example:

OCC /C+NX+Z myfile.c

generates a generic NASM assembler module, with profiler calls inserted.
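
The layout of a /C parameter list, with each name preceded by + or -, can be sketched with a small lookup routine. The function param_state is a hypothetical helper, not part of OCC. It is assumed to receive the text following /C, and the rule that a later mention overrides an earlier one is an assumption made for this sketch.

```c
#include <string.h>

/* Look up 'name' in a /C parameter string such as "+NX+Z" or "+d-b".
   Returns 1 if it is switched on (+), 0 if switched off (-), and -1 if
   it is not mentioned at all. */
int param_state(const char *params, const char *name)
{
    size_t want = strlen(name);
    const char *p = params;
    int state = -1;

    while (*p == '+' || *p == '-') {
        int on = (*p == '+');
        size_t len;
        p++;
        len = strcspn(p, "+-");        /* the name runs up to the next sign */
        if (len == want && strncmp(p, name, len) == 0)
            state = on;                /* assumption: the last mention wins */
        p += len;
    }
    return state;
}
```

For the example above, the list "+NX+Z" reports both NX and Z as switched on, while N on its own is not considered present.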


Optimizer Parameters

The following switches deal with optimization.

/O- turn off optimizer

This switch disables the OCC optimizer. For example:

OCC /O- myfile.c

compiles a program without optimization.

Note that specifying the /v switch will also turn off optimization


#Pragma statements

#pragma preprocessor directives control the interpretation of source code, or extend the functionality of the compiler in some way.


#pragma error

#pragma error <text> allows conditional generation of errors. For example:

#ifndef WIN32
#pragma error Not a win32 program
#endif

generates a compile time error if the WIN32 macro is not defined.


#pragma warning

#pragma warning <text> allows conditional generation of warnings. For example:

#ifndef LONG
#pragma warning long type not defined
#endif

generates a compile time warning if the LONG macro is not defined.


#pragma aux

#pragma aux <funcname> = <alias>  Creates an alias for a function. The alias name is substituted for the function name in the OBJ and ASM output files. For example:

#pragma aux "myfunc"="mynewname"

causes the linker to see the function 'myfunc' as being called 'mynewname'.  In the source code you would still write 'myfunc' however.


#pragma pack

#pragma pack(n) Sets the alignment for structure members and global variables. The default alignment is 1. Changing the alignment can increase performance by aligning variables and structure members to optimal sizes, at the expense of using extra memory. However, altered alignment can sometimes cause problems, for example when a structure is used directly in a network packet or as the contents of a file.

The actual alignment of any given variable depends both on the value of 'n' and on the size of the variable. OCC will pick the minimum of the two values for the alignment of any given variable. For example, if n is 2, characters will be aligned on byte boundaries and everything else will be aligned on two-byte boundaries. If n is 4, characters will be on byte boundaries, words (short quantities) on two-byte boundaries, and dwords (ints) on four-byte boundaries.

#pragma pack() Resets the alignment to the last selection, or to the default.
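
The minimum rule for #pragma pack(n) can be written out as a short calculation. This is a sketch of the rule as described above, using hypothetical helper names and assuming simple scalar members. It does not inspect what the compiler actually generates.

```c
#include <stddef.h>

/* Alignment chosen for an object of 'size' bytes under #pragma pack(n):
   the smaller of the two values. */
size_t effective_alignment(size_t n, size_t size)
{
    return (size < n) ? size : n;
}

/* Round 'offset' up to the next multiple of 'align' (align must be nonzero). */
size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) / align * align;
}

/* Offset at which a member of 'size' bytes would be placed when the
   preceding members end at 'offset'. */
size_t member_offset(size_t offset, size_t n, size_t size)
{
    return align_up(offset, effective_alignment(n, size));
}
```

For a structure holding a char followed by a four-byte int, this places the int at offset 1 for n = 1, offset 2 for n = 2 and offset 4 for n = 4, which is where the extra memory mentioned above comes from.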


#pragma startup <function> <priority>
#pragma rundown <function> <priority>

These two directives allow you to specify functions that are automatically executed by the run time library before and after the main program is executed. The priority scheme allows you to order functions in a priority order. When the RTL is executing startup or rundown functions it executes all functions at priority 1, then all functions at priority 2, then all functions at priority 3, and so forth. To have a function executed before your program runs, use #pragma startup, or to have it execute after the program exits, use #pragma rundown. You should use priorities in the range 50-200, as priorities outside that range are used by run time library functions and their execution (or lack thereof) may prevent some functions in the RTL from  working properly. For example:

#pragma startup myfunc 100

runs the function 'myfunc' after the RTL functions have initialized. Myfunc would be defined as follows:

void myfunc(void) ;

Note that #pragma rundown is equivalent to atexit. 
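
The priority scheme can be simulated in portable C. The sketch below only models the ordering described above and does not show how the run time library stores or runs its tables. All names are hypothetical, and the priorities 10 and 250 merely stand in for library functions outside the 50-200 range.

```c
#include <stdlib.h>

typedef void (*init_fn)(void);

/* One registered startup or rundown entry. */
struct init_entry {
    init_fn fn;
    int priority;
};

/* qsort comparator: lower priority numbers run first. */
static int by_priority(const void *a, const void *b)
{
    const struct init_entry *ea = a;
    const struct init_entry *eb = b;
    return (ea->priority > eb->priority) - (ea->priority < eb->priority);
}

/* Run every entry in ascending priority order. */
void run_in_priority_order(struct init_entry *table, size_t count)
{
    size_t i;
    qsort(table, count, sizeof table[0], by_priority);
    for (i = 0; i < count; i++)
        table[i].fn();
}

/* Demo: three functions that record the order in which they ran. */
static char trace[8];
static size_t trace_len;
static void rtl_early(void) { trace[trace_len++] = 'e'; }
static void my_func(void)   { trace[trace_len++] = 'm'; }
static void rtl_late(void)  { trace[trace_len++] = 'l'; }

/* Registers the demo functions out of order and returns the run order. */
const char *demo_startup_order(void)
{
    struct init_entry table[] = {
        { my_func, 100 }, { rtl_late, 250 }, { rtl_early, 10 }
    };
    trace_len = 0;
    run_in_priority_order(table, sizeof table / sizeof table[0]);
    trace[trace_len] = '\0';
    return trace;
}
```

Here the user function registered at priority 100 runs after the entry at priority 10 and before the one at priority 250, regardless of the order in which the entries were listed.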








Compiler Extensions

Extended keywords extend ANSI C in a variety of ways that are sometimes useful, for example to add new functionality (such as alloca or typeof) or to ease integration with operating systems and programming languages (for example __stdcall or __pascal).

_absolute

Create a variable at an absolute address. For example:

_absolute(0x4c21) int a ;

places the variable 'a' at address 0x4c21. No storage is generated for such variables and no relocation is done.


alloca

Allocate memory from automatic storage (the processor stack). The primary motivation for using this function is that it is much faster than the malloc() function and the allocation gets freed automatically at the end of the function.

alloca is implicitly defined by the compiler as follows:

void *alloca(size_t size);

For example:

int size = 24;
int *p = alloca(size * sizeof(int));

will allocate enough space to store an array of 24 integers.

alloca allocates space without checking there is enough. If the space used by calls to this pseudo-function plus the space used by lower level functions and their data exceeds the stack size, the program will probably crash. 

Memory allocated by alloca is normally freed at the end of the function it appears in, which makes it possible to allocate a lot of data in a loop. However, if a block both calls alloca and uses variable length arrays, the variable length arrays will be freed at the end of the block, which will also result in freeing the memory allocated by alloca.


_cdecl

Use the standard C language linking mechanism (here for compatibility with other compilers). In this linking mechanism, a leading underscore is prepended to the name, which is case sensitive. The caller is responsible for cleaning up the stack. For example:

void _cdecl myfunc() ;

creates a function myfunc with standard linkage.


_export

Make an export record for the linker to process. The current record becomes an entry in the EXE file's export table. For example:

void _export myfunc() ;
 
will cause myfunc to be an exported function.


_genbyte

Generate a byte of data into the code segment associated with the current function. For example:

void myfunc()
{
.
.
.
_genbyte(0x90) ;
.
.
.
}

puts a NOP into the code stream.


_import

This can be used for one of two purposes.  First, it can make an import record for the linker to process, which will result in the appropriate DLL being loaded at run-time. Second, it can be used to declare a variable from a DLL so that the compiled code can access it. For example:

int _import myvariable ;

declares myvariable as being imported. While

int _import("mylib.dll") myvariable ;

declares myvariable as being imported from mylib.dll.


_interrupt

Create an interrupt function. Pushes registers and uses an IRET to return from the function. For example:

_interrupt void myfunc() 
{
}

Creates a function myfunc which can be used as an interrupt handler.


_fault is similar to _interrupt, but also pops the exception word from the stack. It is used when returning from certain processor fault vectors.


_loadds

For an Interrupt function, force DS to be loaded at the beginning of the interrupt. This will be done by adding 0x10 to the CS selector to make a new DS selector. For example:

_loadds _interrupt void myfunc() 
{

}

will create an interrupt function that loads DS


_pascal
  
Use the PASCAL linking mechanism. This linking mechanism converts the function name to upper case, removes the leading underscore, pushes arguments in reverse order from standard functions, and uses callee stack cleanup. For example:

void _pascal myfunc() ;

Creates a function myfunc with pascal linkage.


_stdcall

Use STDCALL linking mechanism. This linking mechanism removes the leading underscore and uses callee stack cleanup. For example:

void _stdcall myfunc() ;

Creates a function myfunc with stdcall linkage.  This is the linkage method most Windows functions use.
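
The naming rules given for _cdecl, _pascal and _stdcall can be summarized in code. The sketch below models only the rules stated in these sections (leading underscore and case). It is not the compiler's name-generation code, and the enumeration and function names are hypothetical.

```c
#include <ctype.h>
#include <string.h>

enum linkage { LINK_CDECL, LINK_PASCAL, LINK_STDCALL };

/* Write the linker-visible form of 'name' into 'out', which must hold at
   least strlen(name) + 2 bytes, and return 'out'. */
char *decorate_name(enum linkage kind, const char *name, char *out)
{
    size_t i;

    switch (kind) {
    case LINK_CDECL:                 /* leading underscore, case preserved */
        out[0] = '_';
        strcpy(out + 1, name);
        break;
    case LINK_PASCAL:                /* no underscore, converted to upper case */
        for (i = 0; name[i] != '\0'; i++)
            out[i] = (char)toupper((unsigned char)name[i]);
        out[i] = '\0';
        break;
    case LINK_STDCALL:               /* no underscore, case preserved */
        strcpy(out, name);
        break;
    }
    return out;
}
```

Under these rules myfunc becomes _myfunc with _cdecl, MYFUNC with _pascal and myfunc with _stdcall. Stack cleanup and argument order are not modeled here.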


typeof

The typeof operator may be used anywhere a type declaration is used, e.g. as the base type for a variable or in a cast expression. It allows you to access a variable's type without explicitly knowing what that type is. For example:

long double aa ;
typeof(aa) bb;

declares bb as long double type.


OAsm

OAsm is an Intel x86 assembler in the spirit of the Netwide Assembler (NASM). While it shares many features with NASM, programs written for NASM are not 100% compatible with it. 

OAsm converts a textual form of processor-specific instructions into numbers the computer will understand. However, discussion of these processor-specific instructions is beyond the scope of this documentation; instead the documentation will focus on various other features OAsm presents to help easily create code and data.

OAsm supports both 16 and 32-bit code.

OAsm shares some features with standard Intel assemblers. But unlike most other Intel assemblers, which keep track of the type of variables, OAsm is typeless, which means that any time a variable is used in an ambiguous context, type information has to be provided. Another major difference is the interpretation of brackets. In an Intel assembler, brackets around simple variable names are often optional, with the assembler interpreting:

mov ax,[myvar]

and

mov ax,myvar

in the same way. In these assemblers, an additional keyword offset is used to refer to a variable's address:

mov ax,offset myvar

However in OAsm, the offset keyword is done away with. Instead, the brackets are given a more concrete meaning. When they exist, they indicate the value of a variable; absence of the brackets denotes the address of the variable.

OAsm understands the usual comment syntax starting with a ';' and extending to the end of the line. For example in:

sub eax,edx ; normalize

everything after the semicolon is a comment and is ignored.

Additionally, OAsm understands C and C++ style comments, since it uses an extended C language preprocessor. For example, to write a block of comments use /* and */. Everything between the /* and the */ is a comment, and multiple lines may be typed between these sequences.

The final commenting style is the C++ double slash style, which is similar to ';' except that it uses a '//' sequence to delimit the beginning of the comment:

sub eax,edx // normalize

The general form of an OAsm Command Line is:

OAsm [options] filename-list

Where filename-list gives a list of files to assemble. 


OAsm understands several integer number formats, as well as floating point. It is also capable of evaluating complex expressions involving numbers, labels, and special symbols for things like the current program counter. Some of these expressions can be evaluated immediately; others are pushed off to the linker.

There are three kinds of labels. Each Standard label may be defined at most once in a source file. Such labels have a global context and are accessible from anywhere within the current source modules, and sometimes from other modules as well. 

On the other hand, Local labels inherit a context from standard labels, and may be defined multiple times provided they are defined at most once between the occurrence of any two standard labels. 

Non-local labels are sometimes useful - they share the idea of having a global context with standard labels, but don't start a new context for local labels.

Directives are pseudo-instructions to the assembler which guide how the assembly is done. In the most rudimentary form, they can provide a mechanism for defining numeric or textual values, or reserving space. Other directives allow breaking the code and data up into distinct sections such as CODE and DATA that can be linked against other files.

Natively, directives are always enclosed in brackets, for example to define a byte of data:

myvar [db 44]

However, the directives are redefined with default macros, so that the brackets are not necessary when typing code:

myvar db 44

Macros are described further in the section on the preprocessor.

Some of the macro redefinitions of the directives are simply a mapping from the non-bracket version to the bracketized version. Other macros are more complex, adding behavior to the assembler. For example, the macros for pseudo-structures also define the structure size, and keep track of the current section and switch back to it when done.

OAsm uses a C99-style preprocessor, which has been enhanced in various ways. One difference is that instead of using '#' to start a preprocessor statement, OAsm uses '%'. This does not apply to the stringizing and tokenizing sequences however; those still use '#'.

Basic extensions to the preprocessor include %assign, which is similar to %define except the preprocessor will evaluate an arithmetic expression as part of the assignment. Other extensions include repeat blocks and multiline macros.

The basic conditional statements allow for a variety of ways to conditionally compile code.  There are also various types of conditional statements which have been added to do textual comparisons and find out a token type. A context mechanism is also available, which is useful for keeping track of assembler states. For example, the context mechanism might be used with multiline macros to create a set of high level constructs such as 'if-else-endif', 'do-while' and so forth.


Command Line

The general format of an OAsm command line is:

OAsm [options] file-list

where the file-list is an arbitrary list of input files.

For example:

OAsm one.asm two.asm three.asm

assembles several files, and makes output files called ONE.O, TWO.O and THREE.O.

The file list can have wildcards:

OAsm *.asm

assembles all the .asm files in the current directory.


Response files can be used as an alternative to specifying input on the command line. For example:

OAsm @myresp.lst

will take command line options from myresp.lst.  Response files aren't particularly useful with the assembler, but they are supported.

Following is a list of the command line switches OAsm supports.

-i case insensitivity

If this switch is specified on the command line, labels will be treated as case insensitive. This can allow easier linkage between modules which use different case. Note that case insensitivity only extends to labels; preprocessor symbols are always case-sensitive in OAsm.

-e preprocess only

If this switch is specified on the assembler command line, OAsm will act as a preprocessor and not perform the actual assembly phase.  In preprocessor mode, the full functionality of the preprocessor is available as long as it does not rely on information that would only be available while assembling the file. Specifically, expressions that refer to labels or program counter locations will result in an error when used, for example, with the %assign preprocessor statement.

-oname specifying output file name

By default, OAsm will take the input file name, and replace the extension with the extension .o.  However in some cases it is useful to be able to specify the output file name with this switch. The specified name can have its extension specified, or it can be typed without an extension to allow OAsm to add the .o extension.  OAsm is only capable of producing object files.  For example

OAsm /omyfile test.asm

makes an object file called MYFILE.O
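
The two naming rules (replace the extension by default, add .o to an explicit name only when it has none) can be sketched as follows. The helper names are hypothetical and the code ignores case conversion. It illustrates the rules above rather than OAsm's implementation.

```c
#include <string.h>

/* Return a pointer to the final '.' of the file-name part of 'path',
   or NULL if the file name has no extension. */
static const char *find_extension(const char *path)
{
    const char *dot = strrchr(path, '.');
    if (dot == NULL || strchr(dot, '\\') != NULL || strchr(dot, '/') != NULL)
        return NULL;        /* no dot, or the dot belongs to a directory name */
    return dot;
}

/* Default naming: replace the input file's extension with ".o".
   'out' must have room for strlen(input) + 3 bytes. */
char *default_object_name(const char *input, char *out)
{
    const char *ext = find_extension(input);
    size_t stem = ext ? (size_t)(ext - input) : strlen(input);
    memcpy(out, input, stem);
    strcpy(out + stem, ".o");
    return out;
}

/* Explicit output name: keep a given extension, otherwise add ".o".
   'out' must have room for strlen(name) + 3 bytes. */
char *explicit_object_name(const char *name, char *out)
{
    strcpy(out, name);
    if (find_extension(name) == NULL)
        strcat(out, ".o");
    return out;
}
```

Applied to the examples in this section, test.asm maps to test.o by default, and the explicit name myfile maps to myfile.o.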

-Ipath

By default, OAsm will search for included files in the current directory. If there are other paths OAsm should search for included files, this switch can be specified to have it search them.

-Ddefine = xxx

If it is necessary to define a preprocessor definition on the command line, use this switch.  For example:

OAsm /DMYINT=4 test.asm

defines the variable MYINT and gives it a value of 4. 

A preprocessor definition doesn't have to be defined with a value:

OAsm /DMSDOS test.asm

might be used to specify preprocessing based on the program looking for the word MSDOS in %ifdef statements.

-lfile listing file

When this switch is specified, OAsm will produce a listing file correlating the output byte stream with input lines of the file. Note the listing file is not available in preprocessor mode as no assembly is done. 

By default the listing file will not show macro expansions. To get a listing file where macros are shown in expanded form, add -ml. This will also expand preprocessor repeat statements.  Note that there is a special qualifier used in a macro definition, .nolist, which can be used on a macro-by-macro basis. When used, it prevents macros from being expanded in the listing file even when -ml is used. This is useful, for example, to prevent cluttering the listing file with expansions of often-used macros or macros that are composed largely of preprocessor statements. In fact, the builtin macros that map preprocessor directives to native form all use this qualifier.


Directives

Directives are used to indicate that the assembler should interpret the statement as something other than an instruction to the processor. For example they can be used to define data, or to create separate sections for grouping code and/or data.

Natively, directives are always enclosed in brackets, for example to define a byte of data:

myvar [db 44]

However, the directives are redefined with default multiline macros, so that the brackets are not necessary when typing code:

myvar db 44

This documentation describes the macro version of the directives.

Some of the macro redefinitions of the directives are simply a mapping from the non-bracket version to the bracketized version. Other macros are more complex, adding behavior. For example the macros for pseudo-structures define the structure size, and keep track of the current section and switch back to it when the end of structure macro is encountered.

There are several types of directives. 
  • Data Definition Directives define the value of a variable, either in terms of a numeric value or as a label (address) defined elsewhere in the assembler. 
  • Data Reservation Directives reserve space for data, and optionally initialize that space with default values. 
  • Label Directives give some attribute to a label, such as defining how it will be viewed from outside the module. 
  • Section Directives group related code or data in such a way that it can be combined with other modules. 
  • The EQU Directive allows definition of a label in terms of a constant value. 
  • The INCBIN Directive allows the possibility of directly importing binary data into the current section. 
  • The TIMES Directive allows a very simple form of repetitive code generation. 
  • Pseudo-structure Directives allow you to define structured data in a very basic way.





    Data Definition Directives

    Data Definition directives define the value of a variable. There are several types of values that can be defined. 

    The first is a number.  The number can sometimes be an integer, other times floating point, depending on the directive. Another type is a character or string.  Another type of value is use of a label, or the difference between labels. Another type sometimes useful in real mode programming is a segment, which is very loosely a reference to a section. The reference can be made either by saying the section name, or using the SEG operator on a label to extract its section.

    There are five Data Definition Directives:
  • db - define byte
  • dw - define word
  • dd - define dword
  • dq - define quadword
  • dt - define tbyte


    Table 1 cross-references the directives with the data types that can be defined with each.







    directive  integer  floating point  character/string  label  segment
    db         yes      no              yes               no     no
    dw         yes      no              yes               yes    yes
    dd         yes      yes             yes               yes    no
    dq         no       yes             no                no     no
    dt         no       yes             no                no     no


    Table 1 - directives and data types



    In 16-bit mode, it makes sense to use dw with labels, whereas in 32-bit mode, it makes sense to use dd with labels.

    Some examples follow:

    mylab:
    db 44
    dw 0234h
    dd 9999
    dd 43.72
    dd mylab
    dq 19.21e17
    dt 0.001

    Multiple values may be specified per line for these directives; for example:

    mylab:
      db "hello world",13, 10, 0
      dq 44.7,33.2,19.8


    Data Reservation Directives



    Data Reservation directives reserve space for data, and optionally give an initial value to load the space with. For example:

    resb 64

    reserves space for 64 bytes of data. This data will default to zeros in the section it is defined in. However, an alternative form of the directive specifies an initial value. Thus:

    resb 64,'a'

    fills the space with lower case 'a' values.

    There are five data reservation directives:
  • resb - reserve bytes  
  • resw - reserve words
  • resd - reserve dwords
  • resq - reserve quadwords
  • rest - reserve tbytes

    As an example:

    mylab db "hello world"
      resb mylab + 80 - $, '.'

    defines the string "hello world", then adds enough dots on the end to make up an 80-character buffer.
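
    The count expression in this example is ordinary arithmetic on addresses. The following Python fragment is only a rough model of that arithmetic, not OAsm code; the address 0x100 chosen for mylab is an arbitrary assumption, and any other address gives the same count.

```python
# Model of:  mylab db "hello world"
#            resb mylab + 80 - $, '.'
mylab = 0x100                   # hypothetical address assigned to mylab
text = b"hello world"
pc = mylab + len(text)          # value of '$' just after the db directive
fill_count = mylab + 80 - pc    # operand given to resb
buffer = text + b"." * fill_count
print(fill_count, len(buffer))
```

    The string occupies 11 bytes, so 69 filler bytes bring the buffer to 80 bytes.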

    Generally, the type of data that can be defined in the optional argument to one of the Data Reservation directives is the same as for the corresponding Data Definition directive.


    Label Directives


    Label Directives bestow some type of attribute to a label. Generally these attributes center around visibility of the label - is it visible to some entity outside the current OAsm assembly session or not?  

    For example the 'global' and 'extern' directives bestow attributes that allow the linker to resolve references to a label when the references are made in one source code file but the label is defined in a different source code file. Some of these directives require the label to be defined in the same file as the directive occurs; and others require the label to be defined elsewhere.

    The Label Directives are as follows:
  • global - the label is defined in this file, but the linker may use it to resolve references in other files
  • extern - the label is not defined in this file, the linker should find it elsewhere
  • export - the label is defined in this file, and will become a DLL or EXE export that Windows can use to resolve against other executables
  • import - the label is not defined in this file, it will be imported from some other DLL or EXE file by Windows
      

    Each use of the global directive can assign the 'global' attribute to one or more labels. When a label is global, it has a presence that extends beyond the current file, and the linker may use it to resolve references to that label made in other files. Those references would typically have been made by use of the 'extern' directive. For example:


    global puterror
    puterror:
      mov eax,errmsg
      call strput
      ret
    errmsg:
      db "this is an error",0
    strput:
       ....
       ret

    creates a function 'puterror' which is visible to other files during linkage. Global may be used with multiple labels:


    global mylab, strput



    Each use of the extern directive can assign the 'external' attribute to one or more labels. When a label is external, the definition is not found in the file currently being assembled. The purpose of the directive is to give the assembler and linker a hint that it should search elsewhere for the definition.

    In the above example, if 'strput' was defined in a different file from the definition of puterror you might write the following:

    global puterror
    extern strput
    puterror:
      mov eax,errmsg
      call strput
      ret
    errmsg:
      db "this is an error",0

    As with the global directive, extern can be used with multiple symbols:

    global puterror
    extern strput, numberput
    puterror:
      push eax
      mov eax,errmsg
      call strput
      pop eax
      call numberput
      ret
     
    errmsg db "this is error number: ",0


    The export directive defines a symbol that can be used by the Windows operating system during program load. For example a DLL might use it to declare a function that is to be available to other executable files.  Unlike the global and extern directives, 'export' can only be used to change the attributes of one label at a time. For example, in the above examples adding the line:

    export puterror

    would create an export symbol named 'puterror' which Windows could then resolve to an import reference in another executable file at load time. Another form of the export directive can be used to change the visible name of the exported function:

    export puterror Error

    would export the function puterror, but other executables would see it as being named Error.



    The import directive is used to signify that the label is exported from some other executable or DLL file, and that Windows should load that executable or DLL so that the label can be resolved. As with export there are two versions of the directive:


    import ExitProcess Kernel32.dll
    ...
    push 0
    call ExitProcess
    ...

    indicates that the DLL kernel32.dll should be loaded so that a reference to the ExitProcess API call can be resolved. 

    It might be useful to rename ExitProcess to Quit if that is easier to remember and type:

    import ExitProcess Kernel32.dll Quit
    ...
    push 0
    call Quit 
    ...



    Section Directives

    Section directives help arrange data and code. In 16-bit code they are often essential, as a section in OAsm maps more or less directly to a segment in the real-mode architecture (depending on the exact definitions given in the linker specification file); and large programs simply won't fit within a single section. The section directive would then be used to partition the various code and data into segments in some way that makes sense based on the application.

    While sections aren't quite as necessary in 32-bit code, it is still customary to partition code into various sections to make it more manageable. In general, the linker will combine the code and data that is generated from assembling multiple source files so that similar types of code or data will appear close together in the output file. This is usually done based on section NAME; for example, a section named CODE in one object file will be combined with sections named CODE in other object files, and similarly sections named DATA will be combined together, before the CODE and DATA sections are themselves combined together to make a part of the executable.  Section names are generally arbitrary; however, there are some conventions made specifically in the linker specifications for Windows programs.

    Another use for section directives is to define absolute addresses. This is similar to using EQU to give a value to a label, but is more generic and allows use of Data Reservation directives so the assembler can calculate offsets based on the amount of data reserved. This use of section directives helps define the pseudo-struct mechanism.

    The three section directives are:

  • section - define a named section
  • absolute - start an absolute section
  • align - align code or data to a specific boundary

    The Section directive switches to a new section. If this is the first time the section is used, the section directive may also specify attributes for the section. For example:

    section code

    switches to a section named code. Various attributes may be specified, such as section alignment and word size of section.  Some other attributes are parsed for compatibility with x86 assemblers, but are not currently used. The attributes are:
  • ALIGN=xxx  - set alignment of section within EXE file
  • CLASS=xxx - set class name. This attribute is ignored by this assembler
  • STACK=xxx - this section is a stack section. This attribute is ignored by this assembler
  • USE16 - this section uses 16-bit addressing modes and data
  • USE32 - this section uses 32-bit addressing modes and data

    As an example of a simple 32-bit program:


    section code ALIGN=2 USE32

    extern Print
    ..start:
      mov eax,helloWorld
      call Print
      ret
    section data ALIGN=4 USE32
    helloWorld db "Hello World",0

    Note that for convenience, 'segment' is defined as an alias for 'section'. So you could write:

    segment code USE32

    to start a section named code, if you prefer.

    The absolute directive is used to switch out of sections where data can be emitted, into an absolute section with the specified origin. It is called absolute because labels defined in it are never relocated by the linker; they are essentially constants. Each label defined in an absolute section gets the value of the program counter within that section at the point where the label is defined. For example:

    absolute 0
    lbl1:
      resb 4
    lbl2:
      resw 3
    lbl3:

    creates an absolute section based at absolute zero, and defines three labels. These labels will have the following values based on the space that has been reserved:

    lbl1: 0
    lbl2: 4 * 1 = 4
    lbl3: 4 + 3 * 2 = 10

    Note that in the definition of this section, we did not attempt to create data or code, we only reserved space. In general attempting to generate code or data in an absolute section will cause an error.

    The Align directive aligns data to a specific boundary within a section. For example:

    section data
    db 4
    align 4

    inserts enough zeroed bytes of data to align the current section to the beginning of a four-byte boundary. Note that the section attributes still need to be set to have the same alignment or better, either in the section directive or in the linker specification file, so that the linker will honor the alignment when relocating the section.
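
    The number of filler bytes follows from the current offset and the requested boundary. The Python function below is a sketch of the usual arithmetic, assuming offsets are counted from the start of the section; the name align_padding is made up for this illustration and is not part of OAsm.

```python
def align_padding(offset, boundary):
    """Number of filler bytes needed to advance offset to a multiple of boundary."""
    return (-offset) % boundary

# After 'db 4' the offset into the section is 1, so 'align 4' needs 3 bytes.
for offset in (0, 1, 2, 3, 4, 5):
    print(offset, align_padding(offset, 4))
```

    An offset that already sits on the boundary needs no padding at all.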


    EQU Directive

    The EQU directive allows a label to be given some value. This value must be calculable at the time the EQU directive is encountered and must resolve to a constant. However, the value itself can involve the differences between relative constructs, e.g. the difference between the current program counter and a label defined earlier in the section is a valid expression for EQU.

    When the value is a difference between relative constructs, care must be taken that the branch optimization does not change the value after it has been calculated. For more information, see the section on expressions.

    For example:

    four  EQU 4   ; value of the label 'four' is 4
    mylab resb 64
    size EQU $-mylab ; program counter - mylab = 64



    INCBIN Directive

    The Incbin directive allows the import of a binary file into the current section. For example it could be used to copy a graphics resource such as a bitmap or font verbatim into the current section. Other uses include things like importing help text from a text file, or importing a table such as a CRC table that has been pre-generated by some other program.

    The basic form of incbin is:

    incbin "filename"

    where filename is the name of the file to import. In this form, all data from the beginning to the end of the file will be imported. Another form starts importing at a specific offset and goes to the end:

    incbin "Filename", 100

    starts importing at offset 100 in the file. Still another form lets you specify the length of data to import:

    incbin "Filename", 96, 16

    imports 16 bytes, starting at offset 96 within the file.
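
    The three forms select different byte ranges of the file. As a rough model in Python, assuming offsets are zero-based (the function name incbin and the 200-byte stand-in for the file contents are illustrative only):

```python
def incbin(data, offset=0, length=None):
    """Return the bytes that would be imported, assuming zero-based offsets."""
    if length is None:
        return data[offset:]                # from offset to the end of the file
    return data[offset:offset + length]     # exactly 'length' bytes from offset

contents = bytes(range(200))        # stand-in for the contents of a 200-byte file
whole = incbin(contents)            # incbin "filename"
tail = incbin(contents, 100)        # incbin "Filename", 100
window = incbin(contents, 96, 16)   # incbin "Filename", 96, 16
print(len(whole), len(tail), len(window))
```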



    TIMES Directive

    The Times directive is a primitive form of repetitive programming. It takes as operands an instruction or directive, and a count of the number of times to repeat the instruction or directive. As such its functionality can often be performed more efficiently with a Data Reservation directive. It is also much more limited than the %rep group of preprocessor directives. Times is available primarily for NASM compatibility.

    For example the earlier example from the Data Reservation section could be alternatively written:


    mylab db "hello world"
    times mylab + 80 - $ [db '.']

    Here the native form of the db directive is used, since macro substitution is not available in this context. Times could also be used for timing:

    times 4 NOP

    Another use for times sometimes found in NASM programs is to align data:

    times ($$-$)%4 [db 0]

    but in this assembler it is better to use the align directive.


    Pseudo-structure Directives


    Structures aren't really a construct supported by the assembler, however, clever macro definitions allow definition of structure-like entities. For example, consider the following:

    struc astruc
    s1 resb 4
    s2 resw 3
    s3 resd 5
    s4 resb 1
    endstruc

    This defines the following label values similar to how they were defined in an absolute section:

    s1: 0
    s2: 4 * 1 = 4
    s3: 4 + 3 * 2 = 10
    s4: 10 + 5 * 4 = 30

    The structure mechanism also defines astruc as 0, and it defines the size of the struct 'astruc_size' as 30 + 1 = 31.
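
    The offsets and the size are running totals of the reserved space. The Python sketch below reproduces that bookkeeping; it is a model, not the built-in macros themselves, and the names SIZES and layout are invented for the illustration. The element sizes are the usual x86 ones (a tbyte is 10 bytes).

```python
SIZES = {"resb": 1, "resw": 2, "resd": 4, "resq": 8, "rest": 10}

def layout(members):
    """Return ({name: offset}, total_size) for (name, directive, count) tuples."""
    offsets, pc = {}, 0
    for name, directive, count in members:
        offsets[name] = pc              # label takes the current offset
        pc += SIZES[directive] * count  # then the reserved space is skipped
    return offsets, pc

offsets, astruc_size = layout([
    ("s1", "resb", 4),
    ("s2", "resw", 3),
    ("s3", "resd", 5),
    ("s4", "resb", 1),
])
print(offsets, astruc_size)
```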

    To access the member of a structure one might use something like:

    mov eax,[ebx + s3]

    The structure mechanism could just as well be done with absolute sections or EQU statements (except that a side effect of the structure mechanism is to define a size). But a little more interestingly, if you introduce local labels and remember that a local label can be accessed from anywhere if its fully qualified name is specified, you might write:

    struc astruc
    .s1 resb 4
    .s2 resw 3
    .s3 resd 5
    .s4 resb 1
    endstruc

    This lets you qualify the name and use:

    mov eax,[ebx + astruc.s3]

    However you need to be careful with this. Structures aren't really part of the assembler, but are instead an extension provided by built-in macros. So you can't make an instance of a structure and then use a period to qualify the instance name with a structure member. For example:

    mystruc resb astruc_size,0
    mov  eax,[mystruc.s3]

    is not valid, because the label 'mystruc.s3' does not exist. The move would have to be changed to something like:

    mov eax,[mystruc + astruc.s3]


    Labels

    Labels may begin with any alphabetic character, or with any of the characters '_', '?', or '.'. Within a label, alphabetic characters, digits, or any of the characters '_', '$', '#', '@', '~', '?', '.' may occur.

    Labels may be followed by a ':' character, but this is optional.

    There are various types of labels. A standard label begins with an alphabetic character, or '_', '?'. 

    Additionally 'local' labels may be defined. Local labels always start with a '.'. Local labels may be reused, provided there is a standard label between uses. This allows you to use meaningless names like '.1' '.2' and so forth within a local context, instead of having to come up with unique labels every time a label is required.

    For example in the fragment

    routine1:
      test ax,1
      jnz .exit
      ; do something
    .exit:
      ret

    routine2:
      cmp bx,ax
      jc .exit
      ; do something
    .exit:
      clc
      ret

    .exit is defined twice, however, each definition follows a different standard label so the two definitions are actually different labels.

    Internally, each use of a local label does have a unique name, made up by concatenating the most recent standard label name with the local label name. In the above example the internal names of the two labels are thus routine1.exit and routine2.exit.  It is possible to branch to the fully qualified name from within another context.

    The context for local labels changes each time a new standard label is defined. It is sometimes desirable to define a kind of label which is neither a standard label, which would change the local label context, nor a local label, which is useful only within that context. This is done by prefixing the label name with '..@'. For example, in the following:


    routine3:
      test ax,1
      jnz .exit
    ..@go3:
      ; do something
    .exit:
      ret

    main:
      call ..@go3
      ret

    the label ..@go3 is accessible from anywhere, but it is not qualified by the local label context of routine3, nor does it start a new local label context, so .exit is still a local label within the context of routine3.
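
    The naming rules for the three kinds of label can be summarised as a small model. The Python function below is a sketch of the rules as described here, not OAsm's implementation; the name qualify is invented, and the special label '..start' is deliberately not modelled.

```python
def qualify(labels):
    """Map each label, in definition order, to its internal name."""
    context = ""                            # most recent standard label
    names = []
    for label in labels:
        if label.startswith("..@"):
            names.append(label)             # nonlocal: kept as is, context untouched
        elif label.startswith("."):
            names.append(context + label)   # local: prefixed with the context
        else:
            context = label                 # standard label starts a new context
            names.append(label)
    return names

print(qualify(["routine1", ".exit", "routine2", ".exit"]))
print(qualify(["routine3", "..@go3", ".exit", "main"]))
```

    In the second call, .exit still resolves to routine3.exit because ..@go3 did not start a new context.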

    OAsm generates two forms of such labels: one within macro invocations, and one within 'contexts' as shown in other sections. In these cases the label starts with '..@', has a sequence of digits, then has a '.' or '@' character followed by user-specified text. When using the nonlocal label format, avoid these forms so that your labels do not clash with assembler-generated labels.


    OAsm defines one special label, '..start'. This label, if used, indicates that this particular position in the code space is the first code that should be executed when the program runs.  Note that if two modules both declare this label and are linked together, the linker will throw an error.


    Numbers and Expressions


    Integers may be specified in base 8, base 10, or base 16. The rules for specifying integers follow the C language standard; e.g. if a number starts with 0 it is assumed to be base 8; if it starts with 0x  it is assumed to be base 16; otherwise it is base 10. For compatibility with other assemblers OAsm also supports the trailing 'h' to indicate hexadecimal values (base 16) but such numbers must start with a digit to prevent them from being interpreted as labels.

    For example:

    012 ; octal for the decimal value 10
    12 ; the decimal value 12
    0x12 ; hexadecimal for the decimal value 18
    012h ; hexadecimal for the decimal value 18
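
    The base-selection rules can be written down as a short decision procedure. The Python function below is a sketch of those rules only; parse_int is a made-up name, and treating the prefix and suffix case-insensitively is an assumption not stated in this document.

```python
def parse_int(text):
    """Parse an integer literal using the rules described above."""
    lowered = text.lower()                  # assumption: 0X and H are also accepted
    if lowered.startswith("0x"):
        return int(lowered[2:], 16)         # C-style hexadecimal
    if lowered.endswith("h"):
        return int(lowered[:-1], 16)        # trailing 'h' hexadecimal
    if len(lowered) > 1 and lowered.startswith("0"):
        return int(lowered, 8)              # leading zero means octal
    return int(lowered, 10)

for literal in ("012", "12", "0x12", "012h"):
    print(literal, parse_int(literal))
```

    The trailing 'h' is tested before the leading zero, which is why 012h is read as hexadecimal rather than octal.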

    Floating point values are specified much as they are in C.

    For example:

    1.03
    17.93e27
    10000.4e-27

    Note that floating point values must start with a digit in OAsm. .003 is not a valid floating point value because character sequences starting with a dot are interpreted as local labels.  Use 0.003 instead.


    OAsm makes no real distinction between single characters and sequences of characters. Single quotes (') and double quotes (") may be used interchangeably. But the interpretation of characters and strings depends on context.

    When used in an instruction:

    mov ax,"TO"

    The operand will be constructed in such a way that storing it to memory will result in the characters being stored in the same order they were typed. In other words, the sequence:

    mov ax,"TO"
    mov [label],ax

    will result in the value at label being the same as if the assembler directive db were used to initialize the value:

    label db "TO"

    Characters at the end of a string that cannot be encoded in the instruction will be lost, thus:

    mov ax,"hi roger"

    is the same as:

    mov ax,"hi"

    because the register ax only holds the equivalent of two characters.
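
    Because the x86 stores the low byte of a value first, keeping the characters in typed order means the first character ends up in the low byte of the operand. The Python sketch below models that packing and the truncation; string_immediate is an illustrative name, not an OAsm facility.

```python
def string_immediate(text, size):
    """Value of a string operand for a register that is 'size' bytes wide."""
    data = text.encode("ascii")[:size]      # characters that do not fit are dropped
    return int.from_bytes(data, "little")   # first character lands in the low byte

value = string_immediate("TO", 2)
print(hex(value))                           # 'T' is 0x54, 'O' is 0x4f
print(value.to_bytes(2, "little"))          # bytes as they would appear in memory
```

    Under this model "hi roger" and "hi" yield the same 16-bit value, as the text describes.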

    On the other hand, data areas may be initialized with strings with various directives. There are three types of values that can be initialized this way: bytes (1 byte), words (2 bytes), and double-words (4 bytes). For ASCII characters, the encoding is just the character, with enough leading zeros to pad to the appropriate size.


    The symbol '$', by itself, means the current program counter. This is an absolute program counter, and if passed through to the linker will result in an absolute offset being compiled into the program. But sometimes it doesn't need to be used as an absolute value, for example it can be used to find the length of a section of data:

    mylabel db "hello world",10,13
    hellosize EQU $-mylabel

    where the EQU statement assigns the value of the expression '$-mylabel' to the label hellosize.  

    The symbol '$$' means the beginning of the current section. For example the expression $-$$ gives the offset into the current section.

    A more complex expression may usually be used wherever a number may be specified, consisting perhaps of numbers, labels, special symbols, and operators. OAsm uses operators similar to the ones found in a C compiler, with precedence similar to the C compiler, and introduces some new operators as well. See table 1 for a listing of operators, and table 2 for a listing of operator precedences.




























    ( )    specify evaluation order of sub-expressions
    SEG    refers to segment of a variable (16-bit only)
    -      unary minus
    +      unary plus
    ~      bitwise complement
    !      logical not
    *      multiply
    /      divide, unsigned
    /-     divide, signed
    %      modulus, unsigned
    %-     modulus, signed
    +      addition
    -      subtraction
    WRT    offset of a variable, from a specific segment
    >>     unsigned shift right
    <<     unsigned shift left
    >      greater than
    >=     greater than or equal
    <      less than
    <=     less than or equal
    ==     equals
    !=     not equal to
    &      bitwise and
    ^      bitwise exclusive or
    |      bitwise or
    &&     logical and
    ||     logical or


    Table 1, Operator meanings













    ( )                parentheses
    SEG, -, +, ~, !    unary operators
    *, /, /-, %, %-    multiplicative operators
    +, -, WRT          additive operators
    <<, >>             shift operators
    >, >=, <, <=       inequality operators
    ==, !=             equality operators
    &                  bitwise and
    ^                  bitwise exclusive or
    |                  bitwise or
    &&                 logical and
    ||                 logical or


    Table 2, Operator precedence from highest to lowest




    Expressions involving labels or segments will often be pushed off to the linker for evaluation; however, the linker only knows simple math such as +-*/ and SEG. Sometimes, however, an expression such as '$-mylab' can be directly evaluated by the assembler, provided mylab is defined earlier in the current segment. Such evaluations would result in a constant being passed to the linker.

    Note that OAsm mimics a multipass assembler, and will attempt to optimize branches to the smallest available form. This is normally not a problem as after each optimization pass OAsm will reevaluate expressions found in the code or data. However, some assembler directives such as EQU and TIMES evaluate their operands immediately, when the instruction is encountered. And all branches start out at the largest possible size. That means that a sequence like:

    section code USE32
    label:
      cmp eax,1
      jz forward
      inc eax
    forward:
    size EQU forward - label

    will result in 'size' being evaluated with the jz being a 6-byte instruction, but the final code will have the jz being a two-byte instruction. This disparity between the calculated value and the actual value can introduce subtle bugs in a program. To get around this, explicitly mark any jumps in a region that is going to be sized as 'short' or 'near':

    label:
      cmp eax,1
      jz short forward
      inc eax
    forward:

    Data directives aren't subject to size optimizations, so in the usual case of taking the size of a data region this isn't an issue.
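
    To put numbers on the disparity, the Python fragment below adds up instruction lengths for the sequence between 'label' and 'forward'. The 6-byte and 2-byte jz sizes come from the text above; the lengths used for cmp and inc are assumptions based on the usual 32-bit encodings (83 F8 01 and 40) and are not taken from OAsm output.

```python
CMP_EAX_IMM8 = 3    # cmp eax,1  (assumed encoding: 83 F8 01)
INC_EAX = 1         # inc eax    (assumed encoding: 40)
JZ_NEAR = 6         # jz with a 32-bit displacement, the initial size
JZ_SHORT = 2        # jz with an 8-bit displacement, the optimized size

size_at_equ = CMP_EAX_IMM8 + JZ_NEAR + INC_EAX      # what EQU records
size_in_output = CMP_EAX_IMM8 + JZ_SHORT + INC_EAX  # what the final code occupies
print(size_at_equ, size_in_output)
```

    Under these assumptions 'size' would be recorded as 10 while the region actually occupies 6 bytes.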


    Preprocessor Directives

    The OAsm preprocessor is a C preprocessor, with extensions. The preprocessor can be considered to be a set of routines which go through the code before it is assembled, making certain types of textual transformations to the source code. The transformations range from simple substitution of text or lines of text when a keyword is encountered, to inclusion of text from another file, to selectively ignoring some lines based on compile-time settings. The main difference from a C preprocessor, other than the extensions, is that instead of starting a preprocessor directive with a hash ('#') preprocessor directives are started with a percent ('%').

    Many of these directives involve Conditional Processing, which is a way to selectively choose what lines of code to assemble and what lines to ignore based on previous declarations.

    Table 1 shows the C-style preprocessor directives. These directives are compatible with similar directives in a C Compiler's preprocessor.












    %define     define a constant or function-style macro
    %undef      undefine a macro
    %error      display an error
    %line       set the file and line displayed in error messages
    %include    include another file
    %if         conditional which tests for nonzero expression
    %elif       else-style conditional which tests for nonzero expression
    %ifdef      conditional which tests to see if a macro has been %defined
    %ifndef     conditional which tests to see if a macro has not been %defined
    %else       else clause for conditionals
    %endif      end of conditional


    Table 1 - C language style preprocessor directives



    Table 2 shows basic extensions to the C preprocessor that are similar to directives already found in the preprocessor. This includes a new directive, %assign, which is like %define except it evaluates the macro value based on the assumption it is a numeric expression. It also includes case insensitive macro definition directives, and the beginning of an extensive set of extensions to conditionals that are similar to the %elif mechanism.






    %assign      like %define, but evaluates an expression and sets the value to the result
    %iassign     %assign with a case-insensitive name
    %idefine     %define with a case-insensitive name
    %elifdef     else-style conditional which tests to see if a macro has been %defined
    %elifndef    else-style conditional which tests to see if a macro has not been %defined


    Table 2 - Basic extensions to C style preprocessor



    Table 3 shows extensions to the conditional mechanism that allow text comparisons. There are both case sensitive and case insensitive forms of these directives.










    %ifidn        case sensitive test for string matching
    %ifnidn       case sensitive test for string not matching
    %elifidn      else-style case sensitive test for string matching
    %elifnidn     else-style case sensitive test for string not matching
    %ifidni       case insensitive test for string matching
    %ifnidni      case insensitive test for string not matching
    %elifidni     else-style case insensitive test for string matching
    %elifnidni    else-style case insensitive test for string not matching


    Table 3 - Text Comparison Conditionals



    Table 4 shows various extensions to the conditional mechanism that allow classification of a token's type. They can be used for example in multiline macros, to allow a single macro to have different behaviors based on the type of a macro argument.















    %ifid        test to see if argument is an identifier
    %ifnid       test to see if argument is not an identifier
    %elifid      else-style test to see if argument is an identifier
    %elifnid     else-style test to see if argument is not an identifier
    %ifnum       test to see if argument is a number
    %ifnnum      test to see if argument is not a number
    %elifnum     else-style test to see if argument is a number
    %elifnnum    else-style test to see if argument is not a number
    %ifstr       test to see if argument is a string
    %ifnstr      test to see if argument is not a string
    %elifstr     else-style test to see if argument is a string
    %elifnstr    else-style test to see if argument is not a string


    Table 4 - Token Type Classification Conditionals



    Table 5 shows the Multiline Macro Extensions and the Repeat Block Extensions. These extensions include multiline macros, as well as a powerful facility for using the preprocessor to repeat sections of code or data.









    %macro       start a multiline macro
    %imacro      start a multiline macro, case-insensitive name
    %endmacro    end a multiline macro
    %rotate      rotate arguments in a multiline macro
    %rep         start a repeat block
    %endrep      end a repeat block
    %exitrep     exit a repeat block prematurely


    Table 5 - Multiline Macro and Repeat Block Extensions



    Table 6 shows the Context-Related Extensions.  Preprocessor contexts are a powerful mechanism that can be used to 'remember' data between successive macro invocations, and for example could be used to construct a high-level representation of control constructs in the assembler.








    %push        start a new context
    %pop         end the current context
    %repl        rename the context at the top of the context stack
    %ifctx       test to see if a context is in effect
    %ifnctx      test to see if a context is not in effect
    %elifctx     else-style test to see if a context is in effect
    %elifnctx    else-style test to see if a context is not in effect


    Table 6 - Context - Related Extensions




    C-Style Preprocessor directives

    The C-Style preprocessor directives are compatible with similar directives that exist in preprocessors for C compilers.  OAsm does not change the behavior of these directives other than to change the initial character which introduces the directive from '#' to '%'.

    %define
    %define introduces a method to perform textual substitutions. In its simplest form it will just replace an identifier with some text:

    %define HELLO_WORLD "Hello World"

    replaces all instances of the identifier HELLO_WORLD with the indicated string. For example after this definition the following statement:

    db HELLO_WORLD

    will result in the string "Hello World" being compiled into the program.

    %define is not limited to giving names to strings; it will perform any sort of text substitution you want. That could include defining names for constants, or giving a name to an often-used instruction, for example.

    In the below:

    %define ARRAY_MAX 4
    mov eax,ARRAY_MAX

    the text "4" gets substituted for ARRAY_MAX prior to assembling the mov instruction, so what the assembler sees is:

    mov eax,4

    Note that the replacement text of a definition is itself processed for substitution. For example:

    %define ONE 1
    %define TWO (ONE + 1)
    %define THREE (TWO + 1)
    mov eax,THREE
     
    is substituted multiple times during processing, with the final result being:

    mov eax,((1 + 1) + 1)

    OAsm will detect recursive substitutions and halt the substitution process, so things like:

    %define ONE TWO
    %define TWO ONE

    will halt after detecting the recursion.
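
    The rescanning and recursion handling described above can be modeled in a few lines of Python. This is only a sketch: the function name expand, the dictionaries, and the rule used to stop recursion (an identifier that is already being expanded is left untouched, as in the C preprocessor) are assumptions made for illustration, not a description of OAsm's internals.

```python
import re

# An identifier that does not start in the middle of another token.
IDENTIFIER = re.compile(r"(?<![A-Za-z0-9_])[A-Za-z_][A-Za-z0-9_]*")

def expand(text, defs, active=frozenset()):
    """Substitute defined identifiers in text, rescanning each replacement."""
    def substitute(match):
        name = match.group(0)
        if name not in defs or name in active:
            # Not defined, or already being expanded: stop here (assumed rule).
            return name
        return expand(defs[name], defs, active | {name})
    return IDENTIFIER.sub(substitute, text)

# Hypothetical tables mirroring the %define lines shown above.
nested = {"ONE": "1", "TWO": "(ONE + 1)", "THREE": "(TWO + 1)"}
recursive = {"ONE": "TWO", "TWO": "ONE"}

print(expand("mov eax,THREE", nested))
print(expand("mov eax,ONE", recursive))
```

    With the nested table the model prints mov eax,((1 + 1) + 1). With the recursive table it terminates instead of looping forever; the exact text OAsm leaves behind in that case may differ from what this model produces.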

    Also, the substitution text can be empty:

    %define EMPTY
    mov eax, EMPTY

    results in the translated text:

    mov eax,

    which cannot be assembled and will result in a syntax error during assembly.


    %define can also be used in its functional form for more advanced text replacement activities. In this form, the identifier is parameterized. During substitutions, arguments are also specified; and the original parameters are replaced with the arguments while substitution is occurring. For example:

    %define mul(a,b) a * b
    mov eax,mul(4,7)

    is translated to:

    mov eax,4 * 7

    prior to assembly. It is usually not a good idea to write the definition quite this way, however. The user may put any text they want in the invocation, so one thing that can happen is that they write:

    mov eax, mul(4+3, 7+2)

    This gets translated to:

    mov eax, 4 + 3 * 7 + 2

    which was probably not the intent. Below is what was probably desired:

    mov eax, (4+3) * (7+2)

    For this reason it is a good idea to fully parenthesize the parameters used in the original definition:

    %define mul(a,b) ((a) * (b))
    mov eax, mul(4+3, 7+2)

    becomes

    mov eax, ((4 + 3) * (7 + 2))

    The extra set of parentheses is used to prevent similar situations from happening when 'mul' is used as a subexpression of another expression.
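
    The effect of the missing parentheses can be checked numerically. The Python sketch below imitates the purely textual parameter replacement and then evaluates both results; expand_call is a hypothetical helper written for this illustration, and Python's own expression evaluator stands in for the assembler's.

```python
import re

def expand_call(template, params, args):
    """Replace each parameter name in template with its argument text."""
    mapping = dict(zip(params, args))
    return re.sub(r"[A-Za-z_]\w*",
                  lambda m: mapping.get(m.group(0), m.group(0)),
                  template)

# %define mul(a,b) a * b          invoked as mul(4+3, 7+2)
unsafe = expand_call("a * b", ["a", "b"], ["4+3", "7+2"])
# %define mul(a,b) ((a) * (b))    invoked as mul(4+3, 7+2)
safe = expand_call("((a) * (b))", ["a", "b"], ["4+3", "7+2"])

unsafe_value = eval(unsafe)   # multiplication binds tighter than addition
safe_value = eval(safe)

print(unsafe, "=", unsafe_value)
print(safe, "=", safe_value)
```

    The unparenthesized form evaluates to 27, because only 3 * 7 is multiplied, while the fully parenthesized form gives the intended 63.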


    Note that when using %define, substituted text is not evaluated in any way, other than to process substitutions on the identifier and any specified parameters. So the move statement in the last example would be visible to the assembler exactly as the substitutions dictate, and the assembler has to do further evaluation of the expression if it wants a constant value.

    Within a definition, there are a couple of special-case substitutions that can occur with function-style definitions. In Stringizing, a parameter can be turned into a string. For example if you write:

    %define STRINGIZE(str) #str
    db STRINGIZE(Hello World)

    quotes will be placed around the substituted parameter. So this translates to:

    db "Hello World"

    prior to assembly.

    In Tokenizing, new identifiers may be produced. For example:

    %define Tokenizing(prefix, postfix) (prefix ## postfix + 4)
    mov eax,Tokenizing(Hello,World)

    would be translated to:

    mov eax,(HelloWorld + 4)
     
    prior to assembly.

    Note that even though the hash character used to start a preprocessor statement has been changed to '%', hash is still used in stringizing and tokenizing.

    Finally, OAsm supports the C99 extension to function-style definitions, which allows variable-length argument lists. For example:

    %define mylist(first, ...) dw first, __VA_ARGS__
     
    where __VA_ARGS__ means append all remaining arguments that are specified, could be used like this:

    mylist(1)
    mylist(1,2)
    mylist(1,2,3,4,5)

    and so on. These would expand to:

    dw 1
    dw 1,2
    dw 1,2,3,4,5

    Note that the name of the identifier that is replaced is case-sensitive with %define; for example, HELLO_WORLD is not the same as Hello_World. There is a case-insensitive form of this directive, %idefine, which can be used to make these and other related identifiers the same.

    Note: OAsm does not support overloading function-style macros. 

    For convenience OAsm allows %define definitions on the command line, which are useful for tailoring build behavior either directly or through the conditional processing facility.

    %undef

    %undef undoes a previous definition, so that it will not be considered for further substitutions (unless defined again). For example:

    %define REG_EBX 3
    %undef REG_EBX
    mov eax, REG_EBX

    results in no substitution occurring for the use of REG_EBX.  In this case this will result in an error.  


    %error
    %error displays an error, causing the assembler to not generate code. For example:

    %error my new error

    might display something like:

    Error errdemo.asm(1): Error Directive: my new error

    when the file is assembled.

    %line
    %line is used to change the file and line number listed in the error reporting. By default the error reporting indicates the file and line an error occurs on.  Sometimes in generated source code files, it is useful to refer to the line number in the original source code rather than in the file that is currently being assembled. %line accomplishes this by updating internal tables to indicate to the preprocessor that it should use alternate information when reporting an error. For example inserting the following at line 443 of test.asm:

    mov eax,^4

    produces a syntax error when the code is assembled:

    Error test.asm(443): Syntax Error
     
    If an additional %line directive is inserted:
     
    %line 10, "demo.c"
    mov eax,^4

    the error changes to:

    Error demo.c(10): Syntax Error

    Note that once %line is used to change the line number and file name, OAsm remembers the new information and continues to increment the new line number each time it processes a line of source code.
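
    The bookkeeping described above can be pictured with a small Python model. The function reported_positions is hypothetical and only mirrors the behavior stated in this section: the line following a %line directive is reported with the given number and file name, and the number keeps incrementing from there.

```python
def reported_positions(lines, filename):
    """Return the (file, line) pair a diagnostic would show for each line.

    Entries for %line directives themselves are None.
    """
    current_file, current_line = filename, 1
    result = []
    for text in lines:
        stripped = text.strip()
        if stripped.startswith("%line"):
            # '%line 10, "demo.c"' -> the next line is reported as demo.c(10)
            number, _, name = stripped[len("%line"):].partition(",")
            current_line = int(number)
            current_file = name.strip().strip('"') or current_file
            result.append(None)
            continue
        result.append((current_file, current_line))
        current_line += 1
    return result

source = ["nop", '%line 10, "demo.c"', "mov eax,^4", "nop"]
for position in reported_positions(source, "test.asm"):
    print(position)
```

    In this model the faulty mov is attributed to demo.c line 10 and the nop after it to demo.c line 11.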


    %include

    %include is used to start the interpretation of another source file. The current source file is suspended, and the new source file is loaded and assembled. Once the assembler is done with the new source file (and anything it also %includes)  assembly resumes beginning where it left off in the current source file.

    This facility is useful for example to hold preprocessor constants and structures that are shared between multiple source files. But the included file can include any valid assembler statement, including GLOBAL and EXTERN definitions. 

    For example if test.asm is being assembled and the statement:

    %include "test1.asm"

    is encountered, the assembly of test.asm will temporarily be suspended while OAsm goes off to assemble test1.asm.

    After it is done with test1.asm,  OAsm remembers that it was previously assembling test.asm and picks up in that file where it left off (e.g. at the line after the %include statement).

    This is not quite the same as specifying both test.asm and test1.asm on the command line.  In the current example there is only one output file which contains the contents of both test.asm and test1.asm, whereas if both were specified on the command line they would result in separate output files.  Additionally, doing it this way can allow the two files to depend on one another in a way that couldn't happen if they were compiled separately.

    For convenience, an include path may be specified on the command line, and OAsm will search for included files both in the current directory, and in directories specified on that path.

    %if

    %if is a %if-style conditional that takes a numeric expression as an argument. If the numeric expression evaluates to a non-zero value, the result of the evaluation will be true, otherwise it will be false.  Note that for purposes of this conditional, expressions are always evaluated; if an undefined identifier is used in a %if expression it is replaced with '0' and evaluation continues.  Subsequent blocks of code will either be assembled if the result of the evaluation is non-zero, or ignored if the result of the evaluation is zero.

    For example:

    %define COLOR 3
    %if COLOR == 3

    mov eax,4
    %endif

    will result in the mov statement being assembled because the result of the argument evaluation is a nonzero value.

    On the other hand:

    %define ZERO 0
    %if ZERO
    mov eax,4
    %endif

    results in nothing being assembled because the value of 'ZERO' is zero.

    See the section on Conditional Processing for more on %if-style conditionals.


    %elif
    %elif is a %elif-style conditional that takes a numeric expression as an argument. If the numeric expression evaluates to a non-zero value, the next block will be assembled, otherwise it will be ignored. As with %if, undefined symbols will be replaced with '0' for purposes of the evaluation.

    For example:

    %define COLOR 3
    %if COLOR == 2

    mov eax,4
    %elif COLOR == 3
    inc eax
    %endif

    will result in the mov statement being ignored and the inc statement being assembled because the result of the %if argument evaluation is zero, and the result of the %elif argument evaluation is nonzero.

    See the section on Conditional Processing for more on %elif-style conditionals.



    %ifdef
    %ifdef is a %if-style conditional that takes an identifier as an argument. If the identifier has been defined with a previous %define or %assign statement, the next block will be assembled, otherwise it will be ignored.

    For example:

    %define COLOR 3
    %ifdef COLOR
    mov eax,4
    %endif

    will result in the mov statement being assembled because COLOR has been defined.

    Note that a definition declared with %define or %assign must match the argument exactly, whereas a definition declared with %idefine or %iassign can differ in case and still match. %ifdef will not match identifiers declared with %macro or %imacro.

    See the section on Conditional Processing for more on %if-style conditionals.



    %ifndef
    %ifndef is a %if-style conditional that takes an identifier as an argument. If the identifier has not been defined with a previous %define or %assign statement, the next block will be assembled, otherwise it will be ignored.

    For example:

    %define COLOR 3
    %ifndef COLOR

    mov eax,4
    %endif

    will result in the mov statement being ignored because COLOR has been defined. Alternatively:

    %undef COLOR
    %ifndef COLOR
    mov eax,4
    %endif

    will result in the mov statement being assembled because COLOR is not currently defined.

    Note that a definition declared with %define or %assign can have any difference from the argument and trigger assembly of the block, whereas a definition declared with %idefine or %iassign must differ in some way other than in case. %ifndef will assemble the following block for identifiers declared with %macro or %imacro.

    See the section on Conditional Processing for more on %if-style conditionals.


    %else
    %else is used to select a block for assembly when all previous %if-style conditionals and %elif-style conditionals in the same sequence have had their arguments evaluate to false. For example:

    %define COLOR 3
    %if COLOR == 4
    mov eax,3
    %else
    inc eax
    %endif

    will result in the mov being ignored, but the inc being assembled, because the evaluation of the %if argument is false.

    See the section on Conditional Processing for more on %else.


    %endif


    %endif is used to end a conditional sequence. Once a conditional sequence is ended, code assembly proceeds as normal, unless the conditional sequence was itself nested within a block of a higher-level conditional sequence that is not being assembled.

    See the section on Conditional Processing for more on %endif.



    Basic Extensions to C Preprocessor

    The basic extensions to the C Preprocessor generally add functionality that is very similar to the functionality already found in the C Preprocessor, but extends it in some way.

    %assign
    %assign is similar to the non-functional form of %define, in that it defines text to be substituted for an identifier. Where %assign differs is that the text to be substituted is evaluated as if it were an expression at the time the %assign is encountered. This is useful for example in %rep loops. For example the following code makes a data structure that consists of the integers from 1 to 100, in ascending order:

    %assign WORKING 1
    %rep 100
       db WORKING
    %assign WORKING WORKING + 1
    %endrep

    But there is another difference with %assign. It is the only preprocessor directive that interacts with data structures in the assembler; so for example the %assign expression can contain expressions involving labels and the program counter. Thus:

    helloWorld db "hello world"
    %assign COUNTER 64 - ($-helloWorld)
    %assign PADDING ($-helloWorld)
    %rep COUNTER
       db PADDING
    %assign PADDING PADDING + 1
    %endrep

    puts the string "hello world" in the text, followed by the byte for 11, the byte for 12, etc... up to the byte for 63.
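
    The arithmetic behind this example can be verified with a short Python model. It is a sketch only: the comments map each statement to the directive it imitates, and len(hello) stands in for the value of ($-helloWorld) at the point where the %assign directives are evaluated.

```python
hello = b"hello world"           # helloWorld db "hello world"
counter = 64 - len(hello)        # %assign COUNTER 64 - ($-helloWorld)
padding = len(hello)             # %assign PADDING ($-helloWorld)

image = bytearray(hello)
for _ in range(counter):         # %rep COUNTER
    image.append(padding)        #    db PADDING
    padding += 1                 # %assign PADDING PADDING + 1
                                 # %endrep

print(counter, len(image), image[len(hello)], image[-1])
```

    The string occupies 11 bytes, so the loop runs 53 times and the padding bytes run from 11 to 63, giving 64 bytes in total.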

    Note that this latter behavior of interacting with the assembler is only valid if code is being assembled. If the preprocess-only switch is specified on the command line, assembler symbols will not be available, and the latter example will result in errors.



    %iassign
    %iassign is a form of %assign where the identifier is considered to be case insensitive. So for example:

    %iassign COUNTER 63
    %rep Counter
    ...
    %endrep

    and similar case variants on the word counter would still result in a loop that executes 63 times.


    %idefine
    %idefine is a form of %define where the identifier is assumed to be case insensitive. So for example: 

    %idefine COUNTER 4
    %idefine counter 4
    %idefine Counter 4

    are equivalent statements, and any case variant of the word COUNTER will match for substitution. Note that this case insensitivity only extends to the identifier; any parameters specified in a function-style %idefine are still case-sensitive for purposes of substitution.

    %elifdef

    %elifdef is a %elif-style conditional that takes an identifier as an argument. If the identifier has been defined with a previous %define or %assign statement, the next block will be assembled, otherwise the next block will be ignored.  For example:

    %define COLOR 3
    %if COLOR == 2
    mov eax,4
    %elifdef COLOR
      inc eax
    %endif

    will result in the mov statement being ignored and the inc statement being assembled because COLOR has been defined but is not 2.

    Note that a definition declared with %define or %assign must match the argument exactly, whereas a definition declared with %idefine or %iassign can differ in case and still match.   %elifdef will not match identifiers declared with %macro or %imacro.

    See the section on Conditional Processing for more on %elif-style conditionals.


    %elifndef

    %elifndef is a %elif-style conditional that takes an identifier as an argument. If the identifier has not been defined with a previous %define or %assign statement, the next block will be assembled, otherwise the next block will be ignored.

    For example:

    %define COLOR 3
    %if COLOR == 2
    mov eax,4
    %elifndef COLOR
      inc eax
    %endif

    will result in nothing being assembled because COLOR is defined but not equal to 2.

    Note that a definition declared with %define or %assign can have any difference from the argument and trigger assembly of the block, whereas a definition declared with %idefine or %iassign must differ in some way other than in case.  %elifndef will assemble the following block for identifiers declared with %macro or %imacro.

    See the section on Conditional Processing for more on %elif-style conditionals.


    Conditional Processing

    Conditional processing is a way to tell the assembler that some lines of code should be assembled into the program, and other lines may be ignored. There are a variety of conditional processing directives, which use conditions ranging from evaluation of an expression, to string comparison, to type or state of a previous symbol definition. It is particularly useful in configuration management, to allow different configurations of the program to be built, for example by changing the command line. It is also useful in conjunction with multiline macros, where it can be used to evaluate some characteristic of an argument to a macro.

    The conditional processing statements can be broken into four basic types:
  • %if-style conditional
  • %elif-style conditional
  • %else
  • %endif

    Conditional processing always starts with a %if-style conditional, and ends with a %endif conditional. Between them, there can be any number of %elif-style conditionals (including none at all), followed by zero or one %else conditional. A sequence of these conditionals breaks the enclosed code up into multiple blocks.

    Conditionals may be nested; in other words each block can be further broken up into smaller blocks with more conditionals that are placed inside the initial conditional. Two conditional statements are considered to be at the same level if every set of conditionals nested within the blocks between them is complete, that is, has both a %if-style conditional and a %endif conditional.

    Processing starts with the %if-style conditional. Its arguments are evaluated according to the rules for that conditional. If the evaluation returns a value of true, the following block of code is assembled, and any other blocks up to the %endif conditional which matches this %if-style conditional are ignored.

    If however the evaluation of the %if-style conditional returns false, the following block of code is ignored. Then processing begins with any %elif-style conditionals at the same level, in the order they appear in the code. Each %elif-style conditional is successively evaluated, until the evaluation of one of the arguments returns 'true'. For each %elif-style conditional that evaluates to false, the corresponding block is skipped; if one returns true its code block is assembled and all remaining blocks of code up to the %endif conditional are ignored.

    If the evaluation of the %if-style conditional returns false, and a corresponding %elif-style conditional that returns true cannot be found at the same level (e.g. they all return false or there aren't any), the %else conditional block is assembled if it exists.

    If all the evaluations return false, and there is no %else conditional, none of the blocks are assembled.
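
    The selection rules above amount to "the first condition that is true wins". A minimal Python sketch of that rule follows; select_block and its return convention are hypothetical and exist only to illustrate the order of evaluation within one conditional sequence at a single nesting level.

```python
def select_block(if_result, elif_results=(), has_else=False):
    """Return which block of one conditional sequence is assembled.

    0 is the %if block, 1..n are the %elif blocks in source order,
    "else" is the %else block, and None means nothing is assembled.
    """
    for index, result in enumerate([if_result, *elif_results]):
        if result:
            return index  # first true condition wins; the rest are skipped
    return "else" if has_else else None

COLOR = 3
# %if COLOR == 4 / %elif COLOR == 3 / %elif COLOR == 2 / %else / %endif
print(select_block(COLOR == 4, [COLOR == 3, COLOR == 2], has_else=True))
```

    With COLOR set to 3 the model selects block 1, the first %elif block. If COLOR were 7 it would select the %else block, or nothing at all if no %else were present.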

    Various examples follow, for the %if-style conditional that evaluates expressions:

    %if COLOR == 4
      mov eax,4
    %endif

    %if COLOR == 4
      mov eax,4
    %else
      mov eax,1
    %endif

    %if COLOR == 4
      mov eax,4
    %elif COLOR == 3
      mov eax,3
    %elif COLOR == 2
      mov eax,2
    %endif

    %if COLOR == 4
      mov eax,4
    %elif COLOR == 3
      mov eax,3
    %elif COLOR == 2
      mov eax,2
    %else
      mov eax,1
    %endif

    Note that when a conditional block is not being assembled, no preprocessor directives within that block will be evaluated either (other than to allow OAsm's preprocessor to keep track of things like nested conditionals).


    Text Comparison Conditionals

    The Text Comparison Conditionals are used in Conditional Processing, to compare two pieces of textual data. Each takes as an argument a list of two character sequences, separated by a comma. Each sequence is stripped of leading and trailing spaces, and then the sequences are compared. It does not matter if the sequences are enclosed in quotes.

    Depending on the result of the comparison, successive code will be assembled or ignored based on the specific directive specified.

    Text Comparison Conditionals are useful for example in conjunction with %macro and %imacro, to evaluate macro arguments.
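
    The comparison at the heart of these conditionals can be sketched in Python. The helper idn is hypothetical: it models only the stripping of leading and trailing spaces and the optional case folding described in this section, operates on the text after any %define substitution has taken place, and leaves the handling of quotes out because the text above does not spell it out.

```python
def idn(left, right, ignore_case=False):
    """Compare two character sequences the way the %ifidn family is described."""
    left, right = left.strip(), right.strip()
    if ignore_case:
        left, right = left.lower(), right.lower()
    return left == right

hello = "goodbye"                                 # %define HELLO goodbye
print(idn(hello, " goodbye "))                    # %ifidn  HELLO , goodbye
print(idn(hello, "GOODBYE"))                      # %ifidn  HELLO, GOODBYE
print(idn(hello, "GOODBYE", ignore_case=True))    # %ifidni HELLO, GOODBYE
```

    The negated forms (%ifnidn, %ifnidni and their %elif counterparts) correspond to the logical inverse of the same comparison.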


    %ifidn
    %ifidn is a %if-style conditional that compares the two pieces of textual data in a case-sensitive manner, and the accompanying code block is assembled if the two pieces of data match. For example:

    %define HELLO goodbye
    %ifidn HELLO, goodbye

    mov eax,3
    %endif

    would result in the mov statement being assembled because the substitution for HELLO matches the text goodbye.

    See the section on Conditional Processing for more on %if-style conditionals.

    %ifnidn

    %ifnidn is a %if-style conditional that compares the two pieces of textual data in a case-sensitive manner, and the accompanying code block is assembled if the two pieces of data do not match. For example:

    %define HELLO goodbye
    %ifnidn HELLO, goodbye
    mov eax,3
    %endif

    would result in nothing being assembled because the substitution for HELLO matches the text goodbye. Alternatively:

    %define HELLO goodbye
    %ifnidn HELLO, hello

    mov eax,3
    %endif

    would result in the mov instruction being assembled.

    See the section on Conditional Processing for more on %if-style conditionals.

    %elifidn
    %elifidn is a %elif-style conditional that compares the two pieces of textual data in a case-sensitive manner, and the accompanying code block is assembled if the two pieces of data match. For example:

    %define HELLO goodbye
    %if HELLO == 1
    %elifidn HELLO , goodbye
    mov eax,3
    %endif

    would result in the mov statement being assembled because the substitution for HELLO matches the text goodbye.  ('goodbye' is not defined so when it is evaluated as a number the result is zero).

    See the section on Conditional Processing for more on %elif-style conditionals.

    %elifnidn
    %elifnidn is a %elif-style conditional that compares the two pieces of textual data in a case-sensitive manner, and the accompanying code block is assembled if the two pieces of data do not match. For example:

    %define HELLO goodbye
    %if HELLO == 1

    %elifnidn goodbye , HELLO
    mov eax,3
    %endif

    would result in nothing being assembled because the substitution for HELLO matches the text goodbye.

    See the section on Conditional Processing for more on %elif-style conditionals.




    %ifidni
    %ifidni is a %if-style conditional that compares the two pieces of textual data in a case-insensitive manner, and the accompanying code block is assembled if the two pieces of data match. For example:

    %define HELLO goodbye
    %ifidni HELLO, GOODBYE

    mov eax,3
    %endif

    would result in the mov statement being assembled because the substitution for HELLO matches the text GOODBYE when case is ignored.

    See the section on Conditional Processing for more on %if-style conditionals.

    %ifnidni

    %ifnidni is a %if-style conditional that compares the two pieces of textual data in a case-insensitive manner, and the accompanying code block is assembled if the two pieces of data do not match. For example:

    %define HELLO goodbye
    %ifnidni HELLO , GOODBYE

    mov eax,3
    %endif

    would result in nothing being assembled because the substitution for HELLO matches the text GOODBYE when case is ignored. Alternatively:

    %define HELLO goodbye
    %ifnidni HELLO, hello

    mov eax,3
    %endif

    would result in the mov instruction being assembled.

    See the section on Conditional Processing for more on %if-style conditionals.

    %elifidni
    %elifidni is a %elif-style conditional that compares the two pieces of textual data in a case-insensitive manner, and the accompanying code block is assembled if the two pieces of data match. For example:

    %define HELLO goodbye
    %if HELLO == 1
    %elifidni HELLO , GOODBYE
    mov eax,3
    %endif

    would result in the mov statement being assembled because the substitution for HELLO matches the text GOODBYE when case is ignored.  ('goodbye' is not defined so when it is evaluated as a number the result is zero).

    See the section on Conditional Processing for more on %elif-style conditionals.

    %elifnidni
    %elifnidni is a %elif-style conditional that compares the two pieces of textual data in a case-insensitive manner, and the accompanying code block is assembled if the two pieces of data do not match. For example:

    %define HELLO goodbye
    %if HELLO == 1

    %elifnidni GOODBYE , HELLO
    mov eax,3
    %endif

    would result in nothing being assembled because the substitution for HELLO matches the text GOODBYE when case is ignored.

    See the section on Conditional Processing for more on %elif-style conditionals.







    Token Classification Conditionals

    The Token Type Classification conditionals are used in Conditional Processing.  They take a character stream, and determine what kind of token the assembler will think it would be if it were to assemble the stream as part of processing an instruction or directive.  This is useful for example within multiline macro definitions, to change the behavior of a macro based on the type of data in one or more of the arguments. Token Type classification directives can detect one of three types of token: labels, numbers, and strings.
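
    The three-way classification can be pictured with the Python sketch below. The token shapes used here are deliberately simplified assumptions (plain identifiers, decimal digits, and quoted text); OAsm's real rules for labels and numeric literals are richer, so this shows only the idea of classifying by form rather than by whether a symbol is defined.

```python
import re

# Simplified token shapes: assumptions for illustration, not OAsm's lexer.
LABEL = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
NUMBER = re.compile(r"[0-9]+")
STRING = re.compile(r"\"[^\"]*\"|'[^']*'")

def classify(token):
    """Return 'id', 'num', 'str', or None when no shape matches."""
    token = token.strip()
    for kind, pattern in (("id", LABEL), ("num", NUMBER), ("str", STRING)):
        if pattern.fullmatch(token):
            return kind
    return None

for sample in ("myLabel", "5", '"Hello World"'):
    print(sample, "->", classify(sample))
```

    In terms of this model, %ifid corresponds to classify(token) == "id" and %ifnid to its inverse, and likewise for the num and str forms.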

    %ifid
    %ifid is a %if-style conditional that detects if the token could be a label, and processes the following block if it is.

    %ifid myLabel
    mov eax,3
    %endif

    would result in the mov statement being assembled because myLabel matches a character sequence that could be used in a label. It does not matter if myLabel has actually been defined; the fact that it could be assembled as a label is all that is needed.

    See the section on Conditional Processing for more on %if-style conditionals.



    %ifnid
    %ifnid is a %if-style conditional that detects if the token could be a label, and processes the following block if it is not.

    %ifnid 5
    mov eax,3
    %endif

    would result in the mov statement being assembled because 5 is a number, and does not match the character sequence required for a label.

    See the section on Conditional Processing for more on %if-style conditionals.


    %elifid
    %elifid is a %elif-style conditional that detects if the token could be a label, and processes the following block if it is.

    %if 1 == 2
    %elifid myLabel

    mov eax,3
    %endif

    would result in the mov statement being assembled because myLabel matches a character sequence that could be used in a label. It does not matter if myLabel has actually been defined; the fact that it could be assembled as a label is all that is needed.

    See the section on Conditional Processing for more on %elif-style conditionals.


    %elifnid
    %elifnid is a %elif-style conditional that detects if the token could be a label, and processes the following block if it is not.

    %if 1 == 2
    %elifnid 5
    mov eax,3
    %endif

    would result in the mov statement being assembled because 5 is a number, and does not match the character sequence required for a label. 

    See the section on Conditional Processing for more on %elif-style conditionals.

    %ifnum
    %ifnum is a %if-style conditional that detects if the token could be a number, and processes the following block if it is.

    %ifnum 5
    mov eax,3
    %endif

    would result in the mov statement being assembled because 5 matches a character sequence that could be used as a number.

    See the section on Conditional Processing for more on %if-style conditionals.


    %ifnnum
    %ifnnum is a %if-style conditional that detects if the token could be a number, and processes the following block if it is not.

    %ifnnum mylabel
    mov eax,3
    %endif

    would result in the mov statement being assembled because mylabel does not match a character sequence that could be a number.

    See the section on Conditional Processing for more on %if-style conditionals.


    %elifnum
    %elifnum is a %elif-style conditional that detects if the token could be a number, and processes the following block if it is.

    %if 1 == 2
    %elifnum 5

    mov eax,3
    %endif

    would result in the mov statement being assembled because 5 matches a character sequence that could be used as a number.

    See the section on Conditional Processing for more on %elif-style conditionals.


    %elifnnum
    %elifnnum is a %elif-style conditional that detects if the token could be a number, and processes the following block if it is not.

    %if 1 == 2
    %elifnnum myLabel
    mov eax,3
    %endif

    would result in the mov statement being assembled because myLabel does not match a character sequence that could be a number.

    See the section on Conditional Processing for more on %elif-style conditionals.


    %ifstr
    %ifstr is a %if-style conditional that detects if the token could be a string, and processes the following block if it is.

    %ifstr "Hello World"
    mov eax,3
    %endif

    would result in the mov statement being assembled because "Hello World" is a character sequence that could be used as a string.

    See the section on Conditional Processing for more on %if-style conditionals.

    %ifnstr
    %ifnstr is a %if-style conditional that detects if the token could be a string, and processes the following block if it is not.

    %ifnstr 5
    mov eax,3
    %endif

    would result in the mov statement being assembled because 5 is a number, and does not match the character sequence required for a string.

    See the section on Conditional Processing for more on %if-style conditionals.


    %elifstr
    %elifstr is a %elif-style conditional that detects if the token could be a string, and processes the following block if it is.

    %if 1 == 2
    %elifstr "hello world"
    mov eax,3
    %endif

    would result in the mov statement being assembled because "hello world" matches a character sequence that could be used as a string.

    See the section on Conditional Processing for more on %elif-style conditionals.

    %elifnstr
    %elifnstr is a %elif-style conditional that detects if the token could be a string, and processes the following block if it is not.

    %if 1 == 2
    %elifnstr 5
    mov eax,3
    %endif

    would result in the mov statement being assembled because 5 is a number, and does not match the character sequence required for a string. 

    See the section on Conditional Processing for more on %elif-style conditionals.









    Context-Related Extensions


    The Context Related extensions are used to define preprocessor contexts. A preprocessor context can be used to create a memory of state information, between different invocations of multiline macros. For example, a set of macros could be defined to mimic high-level control functions such as loops and if statements.

    Many combinations of contexts can exist simultaneously.  OAsm maintains a stack of all open contexts, pushing new contexts on the top of the stack and removing old contexts from the top of the stack.  Each context has a name, and a state which can include context-specific variables and definitions. The name of the context on top of the stack can be examined to determine what the current context is, or changed. It might be useful to change it for example if two macros maintain a context, but a third macro might change the context based on its arguments, e.g. to allow processing by a fourth macro which would not succeed if the name on top of the context stack wasn't correct.

    Within a context, context-specific definitions and labels may be defined for reference from other macros. Each context-specific definition is in scope while that context is in scope, e.g. while the context is on top of the context stack. If a context is open multiple times simultaneously, each instance of the open context is unique, even though the textual representation of the labels in the source code for that context may be the same.

    Context-specific definitions and labels start with the sequence '%$' and then contain a label start character and other label characters, just like other identifiers. For example:

    %$HelloWorld

    could be used in a context-specific definition or label, and would signify that the label goes with the current context. For example, the two macros:

    %macro BEGIN 0
    %push MyBegin
    %$HelloWorld:
    %endmacro

    %macro FOREVER 0
    %ifctx MyBegin
    jmp %$HelloWorld
    %pop
    %else
    %error FOREVER loop without matching BEGIN
    %endif
    %endmacro

    could be used to implement an infinite loop as follows, if the macros are used in a pair. 

    BEGIN
    inc EAX
    FOREVER

    Contexts can also have 'local' macro definitions:

    %push MY_CONTEXT
    %define %$four 4

    results in a definition that will only be valid while this instance of MY_CONTEXT is on top of the context stack.

    When contexts are used, they don't have to appear within a multiline macro definition, but it is often useful to use them this way.

    Note: OAsm does not separate context-specific label names into different namespaces. Instead, a prefix is inserted before the symbol's name and the symbol is entered in the global symbol table. The prefix takes the form of a non-local label, with context-instance identifying information. This identifying information is simply an integer followed by the character '$'. For example if the context instance number is 54, the label %$Hello would be translated to:

    ..@54$Hello

    by the preprocessor. Non-local labels of this general form should be avoided in source code, as they may collide with labels defined locally within a context. This also applies to locally defined %define statements.

    %push
    %push creates a new context and pushes it on the top of the context stack:

    %push CONTEXT_NAME

    'local' definitions can be made within this context as indicated in the introduction.

    If %push is used multiple times with the same context name, each context is unique even though the names are the same. So for example:

    %push MY_CONTEXT
    %$contextLabel:
    %push MY_CONTEXT
    %$contextLabel:

    is valid, because the two labels are named locally to the context and are in different context instances. When the label is used, it will be matched to the context currently on top of the context stack.

    %pop
    %pop removes the context at the top of the context stack:

    %push MY_CONTEXT
    %pop 

    results in MY_CONTEXT no longer being active, and the context that was below it on the context-stack becomes active. Note, you should use any labels or definitions that are specific to a context before it is popped. Once a context is popped off the stack, its state is never recoverable. 

    %repl
    %repl changes the name of the context at the top of the context-stack. For example:

    %push MY_CONTEXT

    creates a context called MY_CONTEXT. If that is followed by:

    %repl NEW_NAME

    the context will now be called NEW_NAME. When a context is renamed this way, all previous local definitions and labels are still accessible while that context is on top of the context stack. The only effect of renaming the context is that conditionals which act on context names will be matched against the new name instead of the old one.
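
    The stack behaviour and the label translation described in this section can be modelled in a few lines of Python. This is only an illustrative sketch, not OAsm's implementation: the class name ContextStack, its method names, and the starting instance number 54 are assumptions made for the example.

```python
import itertools


class ContextStack:
    """Toy model of the preprocessor context stack (not OAsm's actual code)."""

    def __init__(self, first_id=54):
        # Instance numbers are illustrative; 54 mirrors the example in the text.
        self._ids = itertools.count(first_id)
        self._stack = []  # each entry is [name, instance_id]

    def push(self, name):
        # %push: every push creates a new instance, even for a repeated name
        self._stack.append([name, next(self._ids)])

    def pop(self):
        # %pop: the state of the popped instance is discarded
        self._stack.pop()

    def repl(self, new_name):
        # %repl: only the name changes; the instance (and its labels) is kept
        self._stack[-1][0] = new_name

    def ifctx(self, name):
        # %ifctx: compare against the name on top of the stack
        return bool(self._stack) and self._stack[-1][0] == name

    def mangle(self, label):
        # Translate '%$Hello' into the '..@<id>$Hello' form
        assert label.startswith("%$")
        return f"..@{self._stack[-1][1]}${label[2:]}"


stack = ContextStack()
stack.push("MY_CONTEXT")
outer = stack.mangle("%$contextLabel")
stack.push("MY_CONTEXT")
inner = stack.mangle("%$contextLabel")
print(outer, inner)  # ..@54$contextLabel ..@55$contextLabel
```

    The two labels differ because each push creates a new instance, while renaming the top context leaves its translated labels unchanged.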


    %ifctx
    %ifctx is a %if-style conditional that  takes a context name as an argument. If the context name matches the name of the context on top of the context stack, the next block is assembled, otherwise it is not.  For example:

    %push MY_NAME
    %ifctx MY_NAME
    mov eax,4
    %endif

    will result in the mov statement being assembled because the top of the context stack is named MY_NAME, whereas:

    %push MY_NAME
    %ifctx ANOTHER_NAME
    mov eax,4
    %endif

    will result in nothing being assembled because the name of the top of the context stack does not match the argument to %ifctx.

    See the section on Conditional Processing for more on %if-style conditionals.


    %ifnctx
    %ifnctx is a %if-style conditional that takes a context name as an argument. If the context name does not match the name of the context on top of the context stack, the next block is assembled, otherwise it is not.

    For example:

    %push MY_NAME
    %ifnctx MY_NAME
    mov eax,4
    %endif

    will result in nothing being assembled because the name of the context on top of the stack matches the argument.

    %push MY_NAME
    %ifnctx ANOTHER_NAME
    mov eax,4
    %endif

    will result in the mov being assembled because the names do not match.

    See the section on Conditional Processing for more on %if-style conditionals.



    %elifctx
    %elifctx is a %elif-style conditional that takes a context name as an argument.  If the context name matches the name of the context on top of the context stack, the next block is assembled, otherwise it is not.

    For example:

    %push MY_NAME
    %if 1 == 2
    %elifctx MY_NAME
    mov eax,4
    %endif

    will result in the mov statement being assembled because the top of the context stack is named MY_NAME.

    See the section on Conditional Processing for more on %elif-style conditionals.

    %elifnctx
    %elifnctx is a %elif-style conditional that takes a context name as an argument. If the context name does not match the name of the context on top of the context stack, the next block is assembled, otherwise it is not.

    For example:

    %push MY_NAME
    %if 1 == 2
    %elifnctx MY_NAME
    mov eax,4
    %endif

    will result in nothing being assembled because the names match.

    See the section on Conditional Processing for more on %elif-style conditionals.


    Multi-Line Macro Extensions

    Multiline macro extensions allow definition of types of macros that are more familiar to assembly language programmers. Such macros may contain an arbitrary number of assembly language statements and preprocessor directives. These macros have three parts: the macro header, the macro body, and the macro invocation.

    The macro header and macro body are used when defining the macro.  For example a simple multiline macro that gives a new name to NOP might look as follows:

    %macro MY_NOP 0
    nop
    %endmacro

    here MY_NOP is the name of the macro, which is case-sensitive since %macro was used rather than %imacro. The zero in the header following MY_NOP indicates this macro has no parameters. The body of the macro is the 'nop' statement, and the macro definition ends with %endmacro.

    After the macro is defined, it can be invoked as many times as necessary. For example:

    MY_NOP
    MY_NOP

    causes the assembler to assemble two 'nop' statements based on the above definition.

    A macro can have parameters:

    %macro MY_MOV 2
    mov %1, %2
    %endmacro

    In this macro, the 2 in the header signifies the macro takes exactly two arguments. Specifying more or fewer arguments in an invocation will result in an error. The %1 and %2 are substituted with the first and second arguments. For example:

    MY_MOV eax,ebx

    becomes:

    mov eax,ebx

    Macros can have a variable number of parameters. In:

    %macro MY_PUSH 1-4
    %rep %0
    push %1
    %rotate 1
    %endrep
    %endmacro

    the header specifies the macro can have between one and four arguments. In the body, the %0 gives the number of arguments that were actually specified. The %rotate rotates the arguments left by one, so that the contents of %2 moves to %1, the contents of %3  moves to %2, and so forth. The contents of %1 moves to the end of the list.

    Invoking this macro as follows:

    MY_PUSH eax,ebx,ecx

    results in the preprocessor generating the following instructions:

    push eax
    push ebx
    push ecx

    This can be made just a little bit better through the use of the infinite argument list specifier:

    %macro MY_PUSH 1-*
    %rep %0
    push %1
    %rotate 1
    %endrep
    %endmacro

    where the * in the header means there is no practical limit to the number of arguments (there is a limit, but it wouldn't be realistic to type that many arguments even with a code-generating program). Now the macro isn't limited to four push statements; it can push as many things as are listed in the macro invocation.

    A corresponding MY_POP can be created with minor changes:

    %macro MY_POP 1-*
    %rep %0
    %rotate -1
    pop %1
    %endrep
    %endmacro

    where the rotate statement now shifts right instead of left, with the rightmost argument appearing in %1.

    Occasionally it is useful to specify that you need some arguments, followed by the rest of the command line:

    %define STRINGIZE(s) #s
    %macro MY_MSG 1+ 
    db  %1,STRINGIZE(%2)
    %endmacro

    where the + symbol means anything beyond the first argument should be gathered together to make one more argument. This includes comma characters; after the comma that ends the first argument, commas are no longer treated as argument separators. As an example, the invocation:

    MY_MSG 44, hello there, world

    would translate to:

    db 44,"hello there, world"
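
    The way the + specifier stops splitting on commas can be sketched in Python. This is a rough model only: the function name split_greedy is hypothetical, and the sketch ignores brace grouping and any other quoting rules the real preprocessor applies.

```python
def split_greedy(arg_text, fixed_count):
    """Split off `fixed_count` comma-separated arguments, then keep the
    remainder of the line (commas included) as one final argument."""
    parts = arg_text.split(",", fixed_count)
    return [part.strip() for part in parts]


# 1+ : one ordinary argument, then the rest of the line
print(split_greedy("44, hello there, world", 1))
# 0+ : the whole argument list, unparsed
print(split_greedy("44, hello there, world", 0))
```

    The first call yields two arguments, '44' and 'hello there, world'; the second yields the whole line as a single argument.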

    Of course the + symbol may be combined with specifying variable length argument lists as shown in the following header:

    %macro do_something 1-4+

    Another use for the + symbol is to get the entire argument list of a macro invocation, unparsed, as shown in the following header:

    %macro do_something 0+

    Sometimes it is useful to include the comma character in an argument of a macro invocation. Enclosing the argument in braces does this:

    %macro define_numbers 3
    db %1,%2,%3
    %endmacro

    define_numbers {1,2},3,4

    to result in:

    db 1,2,3,4

    When variable length argument lists are used, everything starting with the first variable argument can have a default value. For example, the macro:

    %macro define_strings 1-3 "hello", "there"
    %rep %0
    db %1
    %rotate 1
    %endrep
    %endmacro

    defaults the second and third arguments to "hello" and "there" respectively, if they are not specified in the invocation. For example:

    define_strings "one"

    results in:

    db "one"
    db "hello"
    db "there"

    whereas:

    define_strings "one", "two"

    results in:

    db "one"
    db "two"
    db "there"

    Many of the above macros are unexciting, and perform functions that could be done other ways e.g. with %define. A more interesting example of a multiline macro is as follows:

    %macro power 2
    mov ecx,%1
    mov eax,1
    jecxz %noop
    mov eax,%2
    cmp ecx,1
    jz %noop
    dec ecx        ; eax already holds %2 once, so multiply %1 - 1 more times
    %local:
    imul eax,%2
    loop %local
    %noop:
    %endmacro

    which creates code to raise the second argument to the power of the first argument, and leaves the result in eax. An example invocation:

    power 2,3

    which generates code to return 3 squared in eax. Here we have introduced local labels in macros, which are similar in form to local labels in contexts, except that the macro version does not have a dollar sign. Such local labels are in scope for a single invocation of the macro; each time the macro is invoked the label will have a different context.

    As with context-specific labels, the assembler does not implement multiple symbol tables but instead uses a non-local label name. The non-local label name consists of ..@ followed by a context id followed by a period followed by the label name. For example, the labels in the above example would be translated to:

    ..@54.local
    ..@54.noop

    if the context identifier for the current macro invocation is 54. Non-local labels fitting this general format should not appear in the source code, as there is a chance they will conflict with label names chosen by the preprocessor.


    %macro
    %macro starts a macro definition. The name of the macro is case-sensitive.

    %imacro
    %imacro starts a macro definition. The name of the macro is not case-sensitive.


    %endmacro
    %endmacro ends a macro definition.


    %rotate
    %rotate rotates the macro argument list for the current invocation a number of times specified in the argument. If the number of times is positive, the arguments are rotated left, with the leftmost arguments going to the end of the list. If the number of times is negative, the arguments are rotated right, with the rightmost arguments going to the beginning of the list.
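
    The effect of %rotate on the argument list can be modelled with Python's collections.deque. This is a sketch of the behaviour described above, not of OAsm's internals; the function names expand_my_push and expand_my_pop are made up for the example and mirror the MY_PUSH and MY_POP macros shown earlier.

```python
from collections import deque


def expand_my_push(*args):
    """Mimic MY_PUSH: emit 'push %1', then rotate the arguments left by one."""
    params = deque(args)
    lines = []
    for _ in range(len(params)):           # %rep %0
        lines.append(f"push {params[0]}")  # %1 is the first element
        params.rotate(-1)                  # %rotate 1 (deque.rotate(-1) shifts left)
    return lines


def expand_my_pop(*args):
    """Mimic MY_POP: rotate right by one first, then emit 'pop %1'."""
    params = deque(args)
    lines = []
    for _ in range(len(params)):           # %rep %0
        params.rotate(1)                   # %rotate -1 (deque.rotate(1) shifts right)
        lines.append(f"pop {params[0]}")
    return lines


print(expand_my_push("eax", "ebx", "ecx"))  # ['push eax', 'push ebx', 'push ecx']
print(expand_my_pop("eax", "ebx", "ecx"))   # ['pop ecx', 'pop ebx', 'pop eax']
```

    Note that deque.rotate uses the opposite sign convention from %rotate: a positive count shifts right in Python but left in the preprocessor.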



    Repeat Block Extensions

    The Repeat Block Extensions allow a method for replicating lines of code. In the simplest case, a sequence of instructions or data can be literally repeated a fixed number of times:

    %rep 10
    nop
    %endrep

    causes the preprocessor to present 10 nop instructions to the assembler. In a more complex case, %assign can be used to define a function that varies with each loop iteration, allowing easy development of lookup tables:

    %assign i 20
    %rep 10
    db i
    %assign i i - 1
    %endrep

    puts the numbers from 20 to 11 in a table, in decreasing order.

    This type of functionality could be used with more complex functions, for example %rep would be one way a CRC lookup table could be developed with OAsm.

    In another case the loop count could be made to vary based on previous declarations:

    hello db "Hello World"
    %assign count 64 - ($-hello)
    %rep count
    db 0
    %endrep

    While the latter example is not too exciting and could be done other ways, e.g. with the resb or times directives, more complex functions could be integrated into this type of loop to generate different kinds of data.

    Repeat blocks may be nested. For example:

    %assign i 10
    %rep 3
    %rep 3
    db i
    %assign i i + 1
    %endrep
    %assign i i - 6
    %endrep

    generates enough db statements to define the following sequence:

    10, 11, 12, 7, 8, 9, 4, 5, 6
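
    The sequence can be checked by tracing the nested repeat blocks by hand. The following Python sketch does the same bookkeeping; the function name nested_rep is only illustrative.

```python
def nested_rep():
    """Trace the nested %rep example and collect the value of each 'db i'."""
    values = []
    i = 10                    # %assign i 10
    for _ in range(3):        # outer %rep 3
        for _ in range(3):    # inner %rep 3
            values.append(i)  # db i
            i += 1            # %assign i i + 1
        i -= 6                # %assign i i - 6
    return values


print(nested_rep())  # [10, 11, 12, 7, 8, 9, 4, 5, 6]
```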

    Repeat blocks can be exited prematurely. If a %exitrep directive is parsed while a repeat block is being processed, the innermost repeat block exits immediately. Generally, one would use preprocessor conditionals to prevent the %exitrep directive from being processed, until some condition occurs. For example to pop all contexts named "MY_CONTEXT" from the top of the context stack:

    ; 1000 is an arbitrary upper bound
    %rep 1000
    %ifnctx MY_CONTEXT
    %exitrep
    %endif
    %pop
    %endrep
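
    The control flow of this loop can be sketched in Python, modelling the context stack as a plain list of names. The function name pop_matching is hypothetical, and the empty-stack check is an addition for the sketch; how OAsm treats %ifnctx on an empty stack is not stated here.

```python
def pop_matching(stack, name, limit=1000):
    """Pop contexts called `name` from the top of `stack`, stopping at the
    first context with a different name (or after `limit` iterations)."""
    for _ in range(limit):                  # %rep 1000
        if not stack or stack[-1] != name:  # %ifnctx MY_CONTEXT
            break                           # %exitrep
        stack.pop()                         # %pop
    return stack


print(pop_matching(["OTHER", "MY_CONTEXT", "MY_CONTEXT"], "MY_CONTEXT"))  # ['OTHER']
```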

    %rep
    %rep is used to start a repeat block. It takes one argument: the number of repetitions to go through.

    %endrep
    %endrep is used to end a repeat block. It takes no arguments.

    %exitrep
    %exitrep is used to exit a repeat block prematurely.


    OGrep

    OGrep is a utility that searches for text within files. It is capable of matching text with a direct string comparison, with or without case sensitivity. It also is capable of matching text within a file against regular expressions. Regular expressions allow a mechanism for specifying ways to match text, while specifying parts of the text which can vary and still match. For example the '.' character matches any other character, so a regular expression such as 'a.c' would match any three-character string starting with 'a' and ending with 'c'. More powerful matching is also possible, such as matching sequences of the same character, matching against any character in a specified set, and so forth.

    The general format of an OGrep command line is as follows:

    OGrep [options] match-string list-of-files

    OGrep will search in the list-of-files for text that matches the match-string, and list file and optionally line number information for each match found. In simple cases the match-string does not need to be surrounded by quotes, but in more complex cases involving spacing characters and special symbols it may be necessary to quote the match-string.

    OGrep has a powerful regular expression matcher, which is turned on by default. However, there is a command line option to disable it. When it is turned on, some characters will not be matched directly against the text, but will be interpreted in a way that allows the program to perform abstract types of matches.

    There are several types of matching groups:
  • Match a character or sequence of the same character
  • Match the start or end of a line
  • Match the start, end, or middle of a word
  • Match one character out of a set of characters
  • A match can be specified to be repeated at another point within the sequence

    Some of these matching algorithms can be combined, for example matching one character out of a set of characters can be combined with matching a sequence of characters to find words consisting of characters in a subset of the letters and numbers.


    Command Line

    The general format of an OGrep command line is as follows:

    OGrep [options] match-string list-of-files

    Where [options] are command line options, and the match-string is searched for within the list-of-files.  Matches are listed with file and optionally line number. The files in the list-of-files may contain wildcards, for example:

    OGrep "while" *.c

    Looks through all C language source files in the current directory for the word while.

    By default the match-string is assumed to hold a regular expression.

    Following is a list of the command line switches that OGrep supports.

    -? Getting help

    OGrep usually has rather abbreviated usage text.  To get more detail, use this switch.

    -i case sensitivity

    By default OGrep performs case-sensitive matching.  This switch will change it to use case-insensitive matching.

    -r disable regular expressions

    By default OGrep uses regular expression pattern matching.  To use exact matches instead, specify this option.

    -d recurse through subdirectories

    In some instances it is necessary to have OGrep parse all the files in an entire directory tree.  This switch is used to allow for that behavior.  For example:

    OGrep -d "while" *.c

    searches for the word while in all files ending in .c, in the current directory as well as all its subdirectories.

    -w match whole words

    By default OGrep will match text regardless of where it appears.  To make it only match entire words specify this switch.

    For example, by default the regular expression 'abc' would match within both 'abc' and 'xabcy'. There are regular expression modifiers that can be used to make it match only 'abc', since in the other case abc occurs within another word. With this switch, OGrep automatically takes the match string and makes it into this type of regular expression.


    By default OGrep will list each filename when it finds the first match within a file, then list each matching line underneath it. At the end it will show a count of the number of matches. But there are various command line options which can modify the format of the output.
  • -c list the file names matched, along with a count of matches 
  • -l list only the file names for files that have matches   
  • -n list the line number of matching lines next to the matching line
  • -o (Unix format) list the file name and line number to the left of each matching line, instead of showing the file names separately.
  • -v list nonmatching lines instead of matching lines
  • -z list the file names matched, line numbers, and matched lines
      


    Regular Expressions

    Regular expressions can be used as expressions within match-strings. They allow a more general mechanism for doing pattern-matches than having to specify each character specifically. For example, the '.' matches any character, so using the sequence 'a.c' would match any three character sequence starting with 'a' and ending with 'c'. This page discusses the various pattern matching mechanisms available when regular expression matching is enabled.

    Because the pattern matching function of regular expressions uses characters to specify patterns, those characters can not be matched directly.

    For example '.' matches any character, but there may be instances when you want it to match only a period. To resolve this, the pattern matching symbols may be quoted by preceding them with the '\' character. This means that the pattern 'a\.b' matches only the sequence a.b.

    Since the quote character generally means quote the next character, trying to match a quote character means the quote character itself has to be quoted. For example, to match the text 'a\b' one would have to write the pattern 'a\\b'.

    The quote character is sometimes used to extend the working of the pattern matcher, for example the sequence \b does not mean a 'b' character is being quoted, it means match the beginning or ending of a word.

    The following symbols match various types of patterns:

  • '.' match any character
  • '*' match zero or more occurrences of the preceding character
  • '+' match one or more occurrences of the preceding character
  • '?' match zero or one occurrence of the preceding character
  • '^' match the start of a line
  • '$' match the end of a line
  • '\b' match the beginning or ending of a word
  • '\B' match within a word
  • '\w' match the beginning of a word
  • '\W' match the ending of a word
  • '\<' match the beginning, ending, or inside a word
  • '\>' match anything other than the beginning, ending, or inside a word

    Some of the basic pattern matching symbols such as '+' can match multiple occurrences of a character. Range matching is a more general statement of the number of occurrences of a character. For example, you can match a bounded range, say from two to four 'a' characters, by doing the following:

    OGrep "a\{2,4\}" *.c 

    Brackets [] can be used to delimit a set of characters. Then the bracketed sequence will match any character in the set. For example

    OGrep "[abc]" *.c

    matches any of the characters a,b,c. A range of characters can be specified:

    OGrep "[a-m]" *.c

    matches any characters in the range a-m.

    Set negation is possible:

    OGrep "[^a-z]" *.c

    matches anything but a lowercase letter.

    Sets can be more complex:

    OGrep "[A-Za-z0-9]" *.c

    matches any alphanumeric value.

    Sets can be combined with more basic pattern matching:

    OGrep "[A-Z]?" *.c

    matches zero or one upper case characters.

    OGrep "[0-9]\{2,4\}" *.c

    matches from two to four digits.
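
    For readers who want to experiment with sets and bounded ranges outside OGrep, roughly the same patterns can be tried with Python's re module. Note the spelling difference, which is a property of Python rather than of OGrep: Python writes the range as {2,4} without the backslashes. The sample text is made up for the illustration.

```python
import re

sample = "a1 b22 c333 d55555"

# OGrep "[0-9]\{2,4\}" corresponds to [0-9]{2,4} in Python's syntax
print(re.findall(r"[0-9]{2,4}", sample))  # ['22', '333', '5555']

# OGrep "[^a-z]" : anything but a lowercase letter
print(re.findall(r"[^a-z]", "ab1 c"))     # ['1', ' ']
```

    The run of five digits matches only its first four characters, because the range is bounded at four; the remaining single digit is too short to match.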

    Sometimes, it is desirable to match the same sub-pattern multiple times before having grep declare the pattern as a match for the text.  In a simple case:

    OGrep "\(myword\)|\0" *.c 

    matches the string:

    myword|myword

    In this case the quoted parentheses surround the region to match, and the \0 says match that region again. This is not a very interesting case, but when combined with other pattern matching it becomes more powerful, because the \0 doesn't reapply the pattern but instead matches exactly the same text that the region matched before. For example, to combine it with a set:

    OGrep "\([a-z]+\)||\0" *.c 

    matches any lower-case word twice, as long as it is separated from itself by two | characters. This pattern would match 'ab||ab' but not 'ab||xy'.

    Up to ten such regions may be specified in the pattern; to access them use \0 \1 \2 ... \9 where the digit gives the order the region is encountered within the pattern.
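
    The same back-reference behaviour can be explored with Python's re module, as a rough analogue only. Python's syntax differs from OGrep's in three ways that matter here: groups use unquoted parentheses, a literal | must be quoted, and the first region is \1 rather than \0. The helper name repeats_word is hypothetical.

```python
import re

# OGrep pattern:   \([a-z]+\)||\0
# Python spelling: ([a-z]+)\|\|\1
pattern = re.compile(r"([a-z]+)\|\|\1")


def repeats_word(text):
    """Return True if a lowercase word is followed by '||' and the same word."""
    return pattern.search(text) is not None


print(repeats_word("ab||ab"))  # True
print(repeats_word("ab||xy"))  # False
```

    The second call fails because the back-reference must match the exact text captured by the region, not merely any text that fits the set.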


    OLink

    OLink takes the output files from compilers and assemblers, and merges them together. This merging is necessary because often a single source file will either declare 'global' code or data that is accessible from other source files, or reference such global data from another file. Each output file from a compiler or assembler will have a list of these kinds of declarations, and the linker has two tasks: to combine the actual code and data from different files, and to resolve these global references between files.

    The end result of linking in most systems is an executable file. OLink, however, generates a fully linked object file (or sometimes a partially linked file), and generation of the executable is deferred to a link post-processor.  The link post-processor for Windows 32-bit programs is DLPE.EXE.

    When it comes to the actual data being combined, the compiler or assembler will organize the code and data into sections. Each section has a name, and this name is one of the criteria used in combining section data. For example a code section might contain the code fragments from the file, an initialized data section might hold initialized data, and an uninitialized data section might reserve space for uninitialized data. In compiler output, other sections might occur for example to hold constant data, string data, or control information. An assembly program can directly control the segmentation of code and data into sections, and any number of arbitrary sections may appear according to the needs of the program.

    It should be stressed that OLink does not generate actual ROM images or executable files; all it does is combine code and data, and resolve the global references. Another post-processing program such as DLHEX or DLPE is used to generate the ROM image or executable file, based on the output of OLink.

    The output of OLink can be guided through use of a specification file and through command-line defines. The specification file indicates the ordering and grouping of sections, and gives default addresses and other attributes to the groupings. Command-line defines can be used to tailor the specifics of how the specification file is used; in OLink the command-line defines generally act by giving an absolute address to a variable which has been referenced elsewhere.

    The general form of an OLink Command Line is:

    OLink [options] filename-list

    Where filename-list gives a list of files to merge together.  OLink natively understands files of the extension .o and .l, when these are generated by other tools in this package. Generally it will pass other files specified on the command line to a post-processing program for further analysis. For example .res (windows resource) files are passed on to DLPE to help it build the executable.

    OLink takes the files in filename-list, and produces an output file with a .rel extension.  The .rel files have the same format as the object files generated by the compiler or assembler, but are partially or fully linked.

    Specification files give a flexible method for specifying how to merge sections from the various input files. They can specify what code and data be combined together, in what order, and what the address of code and data should be. A specification file uses three basic constructs, and each construct can be further clarified with attributes. 

    At the top level there can be one or more Partitions.  Each partition may be relocated independently of other partitions.  

    A partition contains one or more Overlays.  The overlays are independent units of code or data, which are overlaid onto a common region of memory. The overlay mechanism can be used, for example, in systems that need to use bank-switching to extend the amount of memory available.

    An overlay contains Regions, which simply specify the names of sections that should be combined together.  The regions can be actual section names, or expressions containing section names. Normally a region would contain all sections matching the section name from all input files, but a Region can be further clarified with a list of files that should be considered for inclusion. In this way different files with the same section can be combined into different overlays.

    Target configurations specify the default mechanism for taking the linker output and creating a ROM image or executable file. Each target configuration specifies a linker specification file, a list of default definitions, and the name of a post-processing program such as DLHex to run to create the final output file. The specification files used with default target configurations are generic in nature, and make certain assumptions about the program; however, some of the identifiers in such specification files may refer to definitions made elsewhere. Those definitions are generally part of the target configuration, and may be modified through command-line options to make minor changes to the configuration.

    For example, in WIN32 DLLs, the target file specifies the base address of the DLL in terms of a linker define statement, which may be redefined via the command line /D define switch to not collide with other DLLs and thus improve load time.


    Command Line

    The general format of an OLink command line is:

    OLink [options] file-list

    where file-list is an arbitrary list of input files.

    For example:

    OLink one.o two.o three.o

    links several object files and makes an output file called one.rel.

    By default OLink takes the name of the first file on the command line, and replaces its extension with the extension .rel, to decide on an output file name.

    The file list can have wildcards:

    OLink /otest.rel *.o

    links all the .o files in the current directory, and the results are put in test.rel because the /o command line switch is specified.


    Response files can be used as an alternative to specifying input on the command line. For example:

    OLink @myresp.lst

    will take command line options from 'myresp.lst'.  This is useful for example when there are many .o or .l files being merged and it is not convenient to specify them on the command line.

    Following is a list of the command line switches OLink supports.

    -c case sensitivity

    By default OLink treats symbols as case-insensitive. If the -c+ switch is specified on the command line, labels will be treated as case sensitive. This supports certain language compilers that do allow the user to type the same word in different case, and have it mean different things.


    -opath specify the output file name

    By default, OLink will take the first input file name, and replace its extension with the extension .rel to create the name of the output file. However, in some cases it is useful to be able to specify the output file name. The specified name can include an extension, or it can be typed without an extension to allow OLink to add the .rel extension.

    OLink /r+ /omyfile test1.o test2.o test3.o cl.l

    makes an object file called myfile.rel
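
    The naming rule described above can be summarised in a short Python sketch. It models only the rule as stated in this section; the function name default_output_name and its parameters are hypothetical, and OLink's own handling of paths may differ in details.

```python
from pathlib import PureWindowsPath


def default_output_name(input_files, explicit=None):
    """Return the output file name OLink would be expected to use.

    input_files: input file names in command-line order
    explicit:    the name given with /o, or None if the switch is absent
    """
    if explicit is not None:
        path = PureWindowsPath(explicit)
        # A name typed without an extension gets .rel added
        return str(path) if path.suffix else str(path.with_suffix(".rel"))
    # Otherwise the first input file's extension is replaced with .rel
    return str(PureWindowsPath(input_files[0]).with_suffix(".rel"))


print(default_output_name(["one.o", "two.o", "three.o"]))       # one.rel
print(default_output_name(["test1.o", "test2.o"], "myfile"))    # myfile.rel
print(default_output_name(["test1.o", "test2.o"], "test.rel"))  # test.rel
```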

    -Lpath specify search path
    By default, OLink will search for object and library files in the C Compiler library file path, and in the current directory. This option may be used to specify additional directories to search for library files:

    OLink /L..\mylibs test.o floating.l 

    will find floating.l either in the C compiler library directory, the current directory, or the directory ..\mylibs.

    -Ddefine=value

    OLink allows definition of global variables on the command line, and gives them an absolute address. This facility can be used either to set the location of a function or data, or to provide a constant value to a specification file entry or even change a constant in user code.

    For example

    OLink /DRAM=0x80000000 myprog.o

    defines a global variable RAM which has the address 80000000 hex. This variable may be accessed externally from either the program code or the specification file. It might be used, for example, to relocate code or data, to specify the address of some hardware, to set a link-time constant into the program, and so forth.

    Decimal values for addresses may be provided by omitting the 0x prefix. 
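
    How a define of this form breaks down into a name and an address can be sketched in Python. The function name parse_define is hypothetical, and the sketch covers only the two number forms mentioned above (a 0x prefix for hex, no prefix for decimal); it takes the text that follows the /D switch.

```python
def parse_define(text):
    """Split 'NAME=value' into (name, address), honouring the 0x prefix."""
    name, _, value = text.partition("=")
    base = 16 if value.lower().startswith("0x") else 10
    return name, int(value, base)


print(parse_define("RAM=0x80000000"))  # ('RAM', 2147483648)
print(parse_define("SIZE=4096"))       # ('SIZE', 4096)
```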

    /r+ perform complete link

    By default, OLink performs a partial link. In a partial link, some global definitions may remain unresolved, and libraries are ignored. The output file may be further linked with more object files or with the result of other partial links. However, a partial link cannot be used to generate a rom image or executable file, because not all the information required for such binaries has been generated; specifically there may be some addresses that haven't been defined yet.

    Usually, a complete link happens automatically when the /T switch is used to specify a target configuration.  However in some cases it is desirable to indicate to the linker that a complete link is desired without specifying a target configuration. Use /r+ for this purpose.

    -m  generate map file

    The map file name will be the same as the name of the .rel file, with the extension replaced by .map. The standard map file summarizes the partitions the program is contained in, and then lists publics in both alphabetic and numeric order.

    A more detailed map file may be obtained by using /mx. This gives a list of partitions, overlays, regions, and files that went into making up the program, in the order they were included. It also includes details of the attributes used to place the sections.

    -s - use specification file

    The specification file gives the layout of the program as it will exist in memory or in an executable file. Without a specification file, the linker will just order sections in the order it finds them, starting at address zero. Sometimes a default specification file will be selected automatically when using the /T target configuration switch. This specification file may be overridden with the /s switch; if the /T switch is not specified, the /s switch may be used to select a specification file. For example:

    OLink /smyprog.spc myprog.o

    -T target configuration
    The target configuration switch specifies that a specific profile be used to build the target. The profile includes a specification file, default definitions, a post-processing program to run, and some other information. The link will be done with the given parameters, and then the post-processing program will be run to generate the final ROM image or executable file.

    For example:

    OLink /T:M3 myprog.o

    selects the specification profile that builds a Motorola Hex file with four-byte addresses.

    -l link only
    Sometimes it is convenient to use the /T switch but it isn't necessary to go on to call the post processing program to generate a ROM image or executable.  This switch stops the linker after the .rel file is generated.


    Specification Files

    A specification file indicates the combination and ordering of sections of code and data. The specification file consists of one or more  Partitions, which are independent units of code and/or data. 

    Each partition holds one or more Overlays. Overlays are units of code and/or data which may share the same location in memory; for example a processor with a small memory address can use bank switching to move different units of code in and out of the address range as necessary. These different units of code would however all share the same address, so the overlay mechanism gives a way of relocating multiple units of code to the same address.

    Each overlay holds one or more Regions; a region is what specifies which sections get combined. For example the region code takes all sections named code from the input file and concatenates them together. Multiple regions can be concatenated one after another within an overlay. To support the overlay mechanism, each region may further be qualified with file names, so that you can specify that sections named code from one group of files go in one region, and sections named code from another group of files go in another region. These regions could be placed in different overlays to help with things like the bank-switch mechanism indicated above.

    Partitions, overlays, and regions can be given attributes to specify things like an absolute address, a maximum size, alignment, and so forth.

    You can also define labels within partitions and overlays, similar to assignment statements in a high-level language. Each label has an associated expression, which is calculated and then used as a value for the label. These values become globals to the linker, and are treated the same as the address obtained when declaring a global variable in the assembler and compiler. Code in the object files can reference these labels as if they were externals.

    Further, expressions used in defining a label or attribute could use another label which is not defined in the specification file; this might be defined, for example, somewhere in the code, or in a command-line definition. It is especially useful to define such labels in a command-line definition, as a way to customize the specification file without rewriting it. For example, if two pieces of hardware share the same source code but are linked at different base addresses, one might write a single linker specification file, referencing the base address as a label. Then a linker command-line definition could be used to resolve the specific address the code is linked for.

    The following will be used as an example:

    partition
    {
    overlay {
      region {} data [align=4];
      region {} bss;
    } RAM [addr=0x0000, size=0x4000];
    } DATA;

    partition
    {
    overlay {
       region {} code;
       region {} const;
    } ROM;
    } CODE [addr=0xf000, size=0x1000];

    This defines two partitions; in this case one is for data and one is for code. The first partition is named DATA and consists of two groups of sections. First all sections named data are concatenated together, then all sections named bss follow after that. This partition is defined with attributes to start at address 0 and extend for 16K. If the actual size of the partition is greater than 16K, an error will be generated. In this case the overlay is named RAM; this overlay name is what is visible to ROM-generation tools such as DLHEX.

    The second partition is named CODE and also consists of two groups of sections; first all sections named code are concatenated together, followed by all sections named const. This partition starts at address 0xf000, and extends for 4K. In this case the overlay name visible to DLHEX or other executable-generation tools is ROM.

    In the first partition, an align=4 attribute is declared on the data region. This means that each data section put into the region will be aligned on a four-byte boundary when it is loaded from its corresponding file. (Note: if assembly language code specifies a more stringent bound such as align = 8, that will be used instead).
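
    The effect of align=4 can be pictured with a short Python sketch. It only illustrates the arithmetic of rounding each section start up to the alignment; it is not OLink's implementation, and the function names and section sizes are made up for the example.

```python
def align_up(address, alignment):
    """Round address up to the next multiple of alignment."""
    return (address + alignment - 1) // alignment * alignment


def place_sections(base, sizes, alignment):
    """Concatenate sections starting at base, aligning each one.

    Returns the list of start addresses and the end address of the region.
    """
    starts = []
    address = base
    for size in sizes:
        address = align_up(address, alignment)
        starts.append(address)
        address += size
    return starts, address


# Three hypothetical 'data' sections of 5, 2 and 8 bytes, align=4, base 0.
starts, end = place_sections(0x0000, [5, 2, 8], 4)
print([hex(s) for s in starts], hex(end))
```

    With these sizes the second and third sections are pushed forward so that each begins on a four-byte boundary.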

    In the following:

    partition
    {
    overlay {
       region { bank1a.o bank1b.o bank1c.o } code;
    } BANK1;

    overlay {
       region { bank2a.o bank2b.o bank2c.o } code;
    } BANK2;

    overlay {
       region { bank3a.o bank3b.o bank3c.o } code;
    } BANK3;
    } CODE [addr = 0xe000, size = 0x1000];

    Three banks of code have been defined, each of which starts at address 0xe000 and extends for 4K. Each region references sections named code; however, file names are explicitly specified for each region, so that only code sections from specific files will be included while processing the region. For example, in the overlay BANK1, only files bank1a.o, bank1b.o, and bank1c.o will contribute to the contents of the region.

    Wildcards may be used in the file specification, so that the above could be written:

    partition
    {
    overlay {

       region { bank1*.o } code;
    } BANK1;

    overlay {

       region { bank2*.o } code;
    } BANK2;
     
    overlay {

       region { bank3*.o } code;
    } BANK3;
    } CODE [addr= 0xe000, size = 0x1000];
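
    As a rough illustration of how the file wildcards divide object files between the overlays, the following Python sketch uses the standard fnmatch module as a stand-in for the linker's matching. The file list is hypothetical, and case-sensitive matching is an assumption; OLink's own rules may differ in detail.

```python
from fnmatch import fnmatchcase

# Overlay names and file patterns from the specification file above.
overlay_patterns = {
    "BANK1": "bank1*.o",
    "BANK2": "bank2*.o",
    "BANK3": "bank3*.o",
}

# Hypothetical object files given to the linker.
object_files = ["bank1a.o", "bank1b.o", "bank2a.o", "bank3c.o", "main.o"]


def files_in_overlay(overlay):
    """Return the object files whose names match the overlay's pattern."""
    pattern = overlay_patterns[overlay]
    return [name for name in object_files if fnmatchcase(name, pattern)]


for overlay in overlay_patterns:
    print(overlay, files_in_overlay(overlay))
```

    In this sketch main.o matches none of the patterns, so its code sections would not be placed in any of the three banks.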

    In the following:

    partition
    {
    overlay {

       RAMSTART=$
       region {} data [align=4];
       region {} bss;
        RAMEND=$
      } RAM [addr=0x0000, size=0x4000];
    } DATA;

    The labels RAMSTART and RAMEND have been defined. The '$' in the expression indicates to use the address at the location the label is specified, so these definitions effectively define labels at the beginning and ending of the overlay. As indicated before these define global variables, so an x86 assembler program such as the following could be used to set all data in these regions to zero:

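    extern RAMSTART, RAMEND
    mov edi, RAMSTART           ; start of the RAM overlay
    mov ecx, RAMEND-RAMSTART    ; number of bytes to clear
    mov al, 0                   ; fill value
    cld                         ; make stosb move forward
    rep stosb                   ; store AL at [EDI], ECX times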

    Expressions may be more complex, consisting of add, subtract, multiply, divide, and parentheses. As a simple example, the above example can be rewritten to define a size:

    partition
    {
    overlay {

       RAMSTART=$
       region {} data [align=4];
       region {} bss;
        RAMSIZE = $-RAMSTART
      } RAM [addr=0x0000, size=0x4000];
    } DATA;



    Labels or expressions may be used in attributes, for example:

    partition
    {
    overlay {

       RAMSTART=$
       region {} data [align=4];
       region {} bss;
        RAMSIZE = $-RAMSTART
      } RAM [addr=RAMBASE, size=0x4000];
    } DATA;

    Here the base address is defined in terms of a label RAMBASE. But RAMBASE is not defined anywhere in the specification file, so it has to be pulled from the linker's table of globals. In this case we might define it on the linker command line as follows:

    OLink /DRAMBASE=0x7000 /smyspec.spc ...

    Labels don't have to include '$' in the expression, although it is often useful. For example:

    MYLABEL=0x44000+2000 

    is valid.
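
    The way such expressions combine '$', other labels, and constants can be sketched in Python. This is an illustrative evaluator only, not the linker's parser: the function name evaluate is made up, division is assumed to be integer division, and only the operators named above are handled.

```python
import ast
import operator

_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.floordiv,  # assumption: integer division
}


def evaluate(expression, here, symbols):
    """Evaluate a label expression.

    '$' stands for the current address `here`; any other name is looked
    up in `symbols` (labels, or definitions from the command line).
    """
    tree = ast.parse(expression.replace("$", "__here__"), mode="eval")

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.Name):
            return here if node.id == "__here__" else symbols[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression: " + expression)

    return walk(tree)


# RAMSIZE = $-RAMSTART, evaluated when the current address is 0x120.
print(hex(evaluate("$-RAMSTART", 0x120, {"RAMSTART": 0x0000})))
# MYLABEL=0x44000+2000 needs neither '$' nor other labels.
print(evaluate("0x44000+2000", 0, {}))
```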

    Note that when using target configurations, the default specification files use these types of declarations, but the target configuration gives default values to use. For example, the default value for RAMBASE in a hex file is 0x10000 when the default linker specification file for binary and hex file output is used. Such values can be overridden on the command line; if it is desirable to use the default specification file but RAMBASE is 0x8000 for the specific hardware in question, one might use OLink as follows:

    OLink /T:M3 /DRAMBASE=0x8000 ...
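
    The precedence described here, where a command-line definition replaces the target configuration's default, can be modelled with a small Python sketch. The helper parse_define is hypothetical and only mirrors the rule that a 0x prefix means hexadecimal and no prefix means decimal; the default values are the ones this document gives for hex.spc.

```python
def parse_define(text):
    """Split 'NAME=value' into a name and an integer.

    A 0x prefix means hexadecimal; otherwise the value is read as decimal.
    """
    name, _, value = text.partition("=")
    base = 16 if value.lower().startswith("0x") else 10
    return name, int(value, base)


# Defaults documented for the hex.spc target configurations.
target_defaults = {
    "RESETBASE": 0x00000,
    "CODEBASE": 0x00008,
    "RAMBASE": 0x10000,
    "STACKBASE": 0x20000,
}

# Definitions as they might follow /D on the command line.
command_line = dict(parse_define(d) for d in ["RAMBASE=0x8000"])

# Command-line definitions take precedence over the target defaults.
resolved = {**target_defaults, **command_line}
print(hex(resolved["RAMBASE"]))
```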

    Partitions, overlays, and regions can be given one or more attributes. The attributes are comma delimited, and enclosed in square brackets. They occur after the name of the partition or overlay, or after the section specified by a region.

    The possible attributes are listed in Table 1.









    Attribute | Meaning | Default Value for Partitions | Default Value for Overlays | Default Value for Regions
    ADDR | Address | 0, or end of previous partition | partition address | overlay address or end of previous region
    SIZE | Absolute size | unassigned | partition size | unassigned
    MAXSIZE | absolute size may vary up to this amount | unassigned | partition maxsize | unassigned
    ROUNDSIZE | absolute size may vary, but will be rounded up to the next multiple of this | 1 | partition roundsize | unassigned
    FILL | fill value used when absolute size does not match size of data included in region | 0 | partition fill | overlay fill
    ALIGN | minimum alignment of region | 1 | partition alignment | overlay alignment
    VIRTUAL | base address for linking the region, when base address does not match the ADDR attribute | unassigned | partition virtual attribute | virtual address of overlay, or end of previous region


    Table 1.  List of attributes




    Usually the region statement is used to specify a specific section name such as code:

    region { } code;

    But sometimes it is useful to be able to combine multiple sections in a single region with the or operator

    region {} code | const;

    However, this is different from making two regions, one for code and one for const. The difference is that in this case code and const sections may be intermixed, whereas in the other case all the code sections would be combined together, separately from all the const sections.
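
    The difference in ordering can be shown with a small Python model. It assumes, purely for illustration, that sections are taken in the order they appear in the input; the file names are hypothetical and this is not a description of OLink's internals.

```python
# Hypothetical input: (file, section name) pairs in the order they are seen.
sections = [
    ("a.o", "code"),
    ("a.o", "const"),
    ("b.o", "code"),
    ("b.o", "const"),
]


def region(names):
    """Collect, in input order, every section whose name is in `names`."""
    return [s for s in sections if s[1] in names]


# One region: 'region {} code | const;'
combined = region({"code", "const"})

# Two regions: 'region {} code;' followed by 'region {} const;'
separate = region({"code"}) + region({"const"})

print(combined)
print(separate)
```

    In the combined case the const section of a.o sits between the two code sections; with two regions all code comes first.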

    Wildcards may be used in the region name:

    region {} code*

    matches the sections named code1, code2, code123, and so forth.

    And for example

    region {} *  

    matches ALL sections. There are two wildcard characters: * matches a sequence of characters, whereas ? matches a single character.

    Other times you want to do a catch all which gets all sections except for a select section or group of sections.

    region {} *& !(code*)

    This uses the and operator and the not operator to select all sections which do not start with the four letters 'code'.

    A region can be named with any potentially complex expression involving section names and these operators, to match various combinations of sections.
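
    The wildcard and operator behaviour described above can be approximated in Python with the standard fnmatch module, which also uses * for a sequence of characters and ? for a single character. This is only an analogy: fnmatch additionally gives special meaning to square brackets, and the section names below are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical section names found in the input files.
names = ["code1", "code2", "code123", "const", "data", "bss"]

# region {} code*
code_like = [n for n in names if fnmatchcase(n, "code*")]

# region {} code?   (exactly one character after 'code')
one_char = [n for n in names if fnmatchcase(n, "code?")]

# region {} *& !(code*)   (everything that does not start with 'code')
not_code = [
    n for n in names
    if fnmatchcase(n, "*") and not fnmatchcase(n, "code*")
]

print(code_like)
print(one_char)
print(not_code)
```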


    Target Configurations


    OLink has several default target configurations, each of which groups together the data needed to create a particular type of output. Each target configuration includes a linker specification file, default definitions for items used but not declared in the specification file, and a reference to a post-processing tool that will take an image linked against the specification file and generate some final binary image, such as a ROM image or an operating system executable.

    Each target configuration is accessible via the /T linker switch.

    For example:

    OLink /T:BIN test.o

    invokes the target configuration associated with the name BIN. In the case of BIN the file is linked into three partitions (code, data, and stack) using the specification file hex.spc, and the results are dumped to a binary file using DLHex.

    The remainder of this section will discuss the default target configurations rather than the mechanism for defining them.

    There are several output file formats for generating a ROM-based image. However, they all use a common specification file and post-processing tool. This section will briefly list the available output formats, then discuss the specification file in more detail.

    The available output formats in this mode are:
  • /T:M1 Motorola S-record file format, 2 byte addresses
  • /T:M2 Motorola S-record file format, 3 byte addresses
  • /T:M3 Motorola S-record file format, 4 byte addresses
  • /T:I1 Intel hex file format, 16 bits
  • /T:I2 Intel hex file format, segmented
  • /T:I4 Intel hex file format, 32 bits
  • /T:BIN Binary file format


    The default specification file for these output formats is hex.spc, and the default post-processing tool is dlhex.exe.

    Hex.spc has 4 independent overlays for code and data. Table 1 lists the overlays and the section names that are recognized in each overlay. It also lists an identifier that can be used with the /D command line switch to the linker to adjust base addresses, and the default base address for each overlay.






    Overlay | Sections | Base Address Identifier | Base Address
    RESET | reset | RESETBASE | 0x00000
    ROM | code, const, string | CODEBASE | 0x00008
    RAM | data, bss | RAMBASE | 0x10000
    STACK | stack | STACKBASE (size = STACKSIZE) | 0x20000 (0x400)



    Table 1 - Hex.spc details




    Several types of WIN32 images may be generated. These include:
  • /T:CON32 - console application
  • /T:GUI32 - windowing application
  • /T:DLL32 - dll application

    The default specification file for these output formats is pe.spc, and the default post-processing tool is dlpe.exe. PE.spc has two independent overlays for code and data. Table 2 lists the overlays and the section names that are recognized in each overlay. Table 3 lists the various values supported by pe.spc and dlpe.exe that may be adjusted on the linker command line.





    Overlay | Sections
    .text | code, const
    .data | data, string, bss


    Table 2 - PE.SPC overlays












    Definition | Meaning | Default
    FILEALIGN | Object alignment within an executable file | 0x200
    HEAPCOMMIT | Amount of local heap to commit at program start | 0
    HEAPSIZE | Size of local heap | 0x100000
    IMAGEBASE | Base address for the image (used to resolve DLL address collisions) | 0x400000
    OBJECTALIGN | Object alignment in memory | 0x1000
    STACKCOMMIT | Amount of stack to commit at program start | 0x2000
    STACKSIZE | Size of stack for default thread | 0x100000


    Table 3 - PE.SPC adjustable parameters












    OMake

    OMake currently has no documentation. It is, however, very similar to GNU make, and the GNU make documentation is a good start. It should be stressed that OMake is not GNU make, and there may be incompatibilities.

    One known incompatibility is that OMake is case sensitive, even though it is being hosted on MSDOS/Windows.  This is a problem when specifying file names if the file name is not spelled in exactly the same case the OS spells it.  

    This may be fixed in a future version.


    OCPP

    OCPP is an extended version of the traditional C language preprocessor. The extensions include support for C11 and C99. It is beyond the scope of this document to discuss the format of input files used with the preprocessor. See a discussion of the C language for further details of use with the C language.

    Note that OCPP is not quite the same as the preprocessor built into a C compiler. The C compiler is able to maintain a slightly more detailed context about the preprocessed text. In rare cases loss of this information will cause a file preprocessed with OCPP to not be compilable with any C compiler.

    The general form of an OCPP command line is:

    OCPP [options] file 

    Here the file is the file to preprocess. (Multiple files may be specified on the command line if you choose.)

    OCPP has no mechanism for specifying the output file name; instead it takes the
    input file, strips the extension, and writes a file with a '.i' extension to indicate preprocessor output.

    There are several command line options that control how the preprocessing is done. These include the ability to enable extensions, the ability to set a path for include files, and options to define and undefine preprocessor variables.

    Following is a list of the command line switches OCPP supports.

    -A strict mode

    By default, OCPP will perform as a C89 version preprocessor, which is slightly looser than the standard. It can be tightened to meet the
    standard with the /A switch.

    -1 C11 mode

    Puts OCPP into the C11 parsing mode.

    -9 C99 mode
    Puts OCPP into the C99 parsing mode.

    -I set include file path
    By default, OCPP will use the C language system include path to search for include files specified in the source file. If there are other include paths OCPP should search, this switch can be specified to have it search them. For example by default the statement:

    #include <windows.h>

    will search in the C language system include directory to find windows.h.
    Whereas:

    OCPP /I.\include test.c

    will create a file test.i, and will additionally search the path .\include for any include files specified in preprocessor directives.


    /E[+-]nn error control

    nn is the maximum number of errors before the compile fails. If + is specified, extended warnings that are normally disabled will be shown. If - is specified, warnings will be turned off. For example:

    OCC /E+44 myfile.c

    enables extended warnings and limits the number of errors to 44. By default only 25 errors will be shown before the compiler aborts. Similarly,

    OCC /E- myfile.c

    compiles myfile.c without displaying any warnings.



    OCPP has two switches useful for defining preprocessor macros. The first switch /D defines a macro. The second switch /U causes OCPP to never allow the specified variable to be defined.

    For example

    OCPP /DMYINT=4 test.c

    defines the variable MYINT and gives it a value of 4. Whereas

    OCPP /UMYINT test.c

    globally undefines MYINT in such a way that it cannot be defined while preprocessing the file.

    A macro doesn't have to be defined with a value:

    OCPP /DWIN32 test.c

    might be used to specify preprocessing based on the program looking for the word WIN32 in #ifdef statements.




    OImpLib

    OImpLib is a WIN32 import librarian, suitable for various operations regarding the import sections of DLLs. It can take input from one of several sources, and place output in one of several destinations. In its most basic format one could use it to take a .DEF file or .DLL file and construct an import library for use with the toolchain, but it can also be used to create a .DEF file or extract things from a library.

    The general format of an OImpLib command line is:

    OImpLib [options] dest source

    where source and dest specify files to use. By parsing the extensions of source and dest, OImpLib is able to act in one of several modes.

    Response files can be used as an alternative to specifying input on the command line. For example:

    OImpLib test.l @myresp.lst

    will take command line options from myresp.lst.  In general it isn't necessary to use response files with OImpLib as the amount of input required is minimal.

    Following is a list of command line switches OImpLib supports.

    -c-  case insensitive import library
    OImpLib will allow the creation of case-insensitive libraries with this switch. However, in general it isn't a good idea to make a case-insensitive import library, as WIN32 export records found in DLLs are case-sensitive.


    OImpLib will perform different operations depending on what the file extensions of the input files are. The output file is specified first, followed by one or more input files. The output file may be one of the following: 
  • a library file
  • an object file
  • a .DEF file

    When the output file is a library file, the input file can be a list of object files, .DEF files, and .DLL files. The object files will be placed in the library, whereas the export sections of .DEF and .DLL files will be converted to object files that hold import records, and then placed in the library.

    When the output file is an object file, a single input file can be either a .DEF or .DLL file. The exports from the input file will be placed in the output file.

    When the output file is a .DEF file, the input file can be either a .DLL file or an object file. The exports in the .DLL file will be written to the .DEF file, or the import records in the object file will be converted to export records and written to the .DEF file. For example:

    OImpLib test.l kernel32.dll

    will make an import library holding the export definitions from kernel32.dll.

    OImpLib test.l mydll.def

    will make an import library containing the export definitions from mydll.def.

    On the other hand:

    OImpLib user32.def user32.dll

    will create a definition file from the export records in user32.dll.
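
    The rules above amount to choosing a mode from the output file's extension and then checking the inputs against it. A minimal Python sketch of that decision follows; it assumes the .l and .o extensions used in this document's examples for libraries and object files, and the function and table names are invented for the illustration.

```python
import os

# Output extension -> kind of output (extensions assumed from the examples).
OUTPUT_KINDS = {".l": "library", ".o": "object", ".def": "def"}

# Input extensions each kind of output accepts, per the text above.
ALLOWED_INPUTS = {
    "library": {".o", ".def", ".dll"},
    "object": {".def", ".dll"},
    "def": {".dll", ".o"},
}


def extension(name):
    """Return the lower-cased extension of a file name, including the dot."""
    return os.path.splitext(name)[1].lower()


def check(output_file, input_files):
    """Return the output kind, or raise ValueError for an invalid mix."""
    kind = OUTPUT_KINDS.get(extension(output_file))
    if kind is None:
        raise ValueError("unknown output type: " + output_file)
    if kind == "object" and len(input_files) != 1:
        raise ValueError("an object file takes a single input file")
    for name in input_files:
        if extension(name) not in ALLOWED_INPUTS[kind]:
            raise ValueError(name + " cannot be used to build a " + kind)
    return kind


print(check("test.l", ["kernel32.dll"]))
print(check("user32.def", ["user32.dll"]))
```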


    OLib

    OLib is an object file librarian. It is used to combine a group of object files into a single file, to make it easier to operate on the group of files. It is capable of adding, removing, and extracting object files from a static library.

    The general format of an OLib command line is:

    OLib [options] library object-file-list

    where library specifies the library, and object-file-list specifies the list of files to operate on.

    For example:

    OLib test.l + obj1.o obj2.o obj3.o obj4.o obj5.o

    adds several object files to the library test.l.

    The object file list can have wildcards:

    OLib test.l + *.o

    adds all the object files from the current directory.


    Response files can be used as an alternative to specifying input on the command line. For example:

    OLib test.l @myresp.lst

    will take command line options from myresp.lst. Response files might be used, for example, if a library is to be made out of dozens of object files and they won't all fit on the OS command line.

    Following is a list of the command line switches OLib supports.

    /c- case insensitivity

    By default OLib makes case-sensitive libraries, but this switch will allow creation of a case-insensitive library.  In general you don't need to make a library case insensitive, as the linker will handle case insensitivity based on command line switches even if the library is case-sensitive.


    In the previous examples the '+' symbol was used to indicate that the following files should be added to the library. '+' doesn't have to be used in this case because the default is to add files to the library. But there are two other command line modifiers that can be used to extract files from the library and remove files from the library. These are '*' for extract and '-' for remove. Note that '-' is also used for command line switches; to prevent it from being ambiguous it must be present with spaces on either side when used. The '+', '-', and '*' modifiers can be mixed on the command line; files after one of these modifiers will be processed according to that modifier until another modifier is encountered. For example:

    OLib test.l * obj1.o

    extracts obj1.o from the library and places it in the current directory, and 

    OLib test.l - obj2.o

    removes obj2.o from the library and destroys it. As a more complex example:

    OLib test.l + add1.o add2.o - rem1.o rem2.o * ext1.o

    adds the files add1.o and add2.o, removes the files rem1.o and rem2.o, and extracts the file ext1.o. The order of the modifiers in this example is arbitrary, and modifiers can occur more than once on the command line or in the response file.
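
    The modifier rule, where each file is handled according to the most recent modifier and '+' is the default, can be sketched in Python as follows. This models only the grouping of file names; the function name is hypothetical and switches such as /c- are not handled.

```python
def parse_olib_files(tokens):
    """Group file names by the modifier in effect.

    '+' adds (the default), '-' removes, '*' extracts. A modifier is a
    token on its own, i.e. it has spaces on either side.
    """
    actions = {"+": [], "-": [], "*": []}
    mode = "+"
    for token in tokens:
        if token in actions:
            mode = token
        else:
            actions[mode].append(token)
    return actions


args = "+ add1.o add2.o - rem1.o rem2.o * ext1.o".split()
print(parse_olib_files(args))
```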


    ORC

    ORC is a Windows resource compiler. It handles compilation of a file containing standard Windows resources into a .RES file. The .RES file can be given to the linker or to DLPE for use in adding resources to a Windows executable file. The specification for the format of the input file is mostly beyond the scope of this document, other than to say that the ORC program has a C language preprocessor. This makes it possible to define symbolic constants.

    The general format of an ORC command line is:

    ORC [options] rcfile.rc

    where rcfile.rc is the resource file to compile.

    At present ORC will not directly modify an EXE file; its output must be passed to the linker or post-processing program.

    Response files can be used as an alternative to specifying input on the command line. For example:

    ORC @myresp.lst 

    will take command line options from myresp.lst. In general it isn't necessary to use response files with ORC as the amount of input required is minimal.

    Following is a list of command line switches ORC supports.

    /ipath set include file path

    By default, ORC will use the C language header include path as an include path to search for files specified in preprocessor INCLUDE directives. For example the statement:

    #include <windows.h>

    will result in windows.h being found in the compiler include directory, and included in the file. If there are other paths that should be searched, they may be specified on the command line with the /i switch. For example:

    ORC /i.\include test.rc 

    Searches in the directory .\include as well as in the C language include directory.

    The ORC preprocessor defines the preprocessor symbol RC_INVOKED to allow include files to specify sections that won't be evaluated by ORC. For example the windows headers use this to prevent RC compilers from trying to parse structure definitions written in the C language. This way instructions to the RC compiler can be intermixed with instructions to the C compiler without causing ORC to process things it isn't capable of processing.

    /Ddefine=value

    Defines preprocessor macros.  For example:

    ORC /DMYINT=4 test.rc

    defines the macro MYINT and gives it a value of 4.

    A macro doesn't have to be defined with a value:

    ORC /DWIN32 test.rc

    might be used to specify preprocessing based on the program looking for the word WIN32 in #ifdef statements.


    DLHex

    DLHex is a linker postprocessor for obtaining hex and binary files of the type used in generating ROM-based images for embedded systems. It can be used indirectly as part of the link process, if the default linker settings are sufficient. In many cases, though, configuring the output for an embedded system will require a customized linker specification file. The linker documentation discusses this in more detail. If a customized specification file is used, it may be necessary to call DLHex directly to obtain an output file.

    DLHex can produce output in one of several formats. These include 
  • Motorola S19 files
  • Intel Hex files
  • pure binary output format

    The S19 and Hex file formats are suitable for use with an EEPROM burner or other device that can accept them; the binary format is available to make postprocessing the output for other types of requirements easier. In rare cases the binary format may be used directly, e.g. if a device has a file system and a suitable loader is written to load it into memory.
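
    For readers unfamiliar with the Motorola format, the following Python sketch builds a single S-record data line by hand. It follows the general S-record layout (type, byte count, address, data, one's complement checksum) and is not taken from DLHex; the function name srecord is made up for the example.

```python
def srecord(record_type, address, data):
    """Build one Motorola S-record data line.

    record_type 1, 2 or 3 selects a 2-, 3- or 4-byte address field.
    """
    address_size = record_type + 1
    address_bytes = address.to_bytes(address_size, "big")
    # The count covers the address, the data and the checksum byte.
    count = address_size + len(data) + 1
    body = bytes([count]) + address_bytes + bytes(data)
    # Checksum: one's complement of the low byte of the sum of the body.
    checksum = ~sum(body) & 0xFF
    return "S%d%s%02X" % (record_type, body.hex().upper(), checksum)


# A two-byte-address record holding five data bytes at address 0.
print(srecord(1, 0x0000, b"Hello"))
# The same data with a four-byte address field.
print(srecord(3, 0x80000000, b"Hello"))
```

    A complete file also carries header and termination records, which this sketch leaves out.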


    The general form of a DLHex command line is:

    dlhex [options] relfile

    Here the rel file is the linker generated file that holds the completely linked code. There are several command line options that control the output. These include options for specifying what parts of the input file to process, how to format the output, and optionally a file name to use for the output file.

    The linker has a default linker specification file, which is used if the linker /T switch is used to run dlhex to create an output file. The script is as follows:

    partition {
      overlay {
       region {} reset [ size = RESETSIZE];
      } RESET;
    } pt0 [addr = RESETBASE];
    partition {
      overlay {
       region {} code [ align = 2];
       region {} const [ align = 4];
       region {} string [ align = 2];
      } ROM;
    } pt1 [addr=CODEBASE];

    partition {
      overlay {
       RAMDATA = $;
       region {} data [ align = 4];
       region {} bss [ align = 4];
      } RAM ;
    } pt2 [addr=RAMBASE];

    partition {
      overlay {
       region {} stack[size = $400];
       STACKPOINTER = $;
      } STACK;
    } pt3 [addr = STACKBASE];

    From the above you can see it has four overlays, which are named 'RESET', 'ROM', 'RAM' and 'STACK'. In this case you may not want to extract the stack into an output file, since an embedded system might initialize it in a loop. It may or may not be useful to extract the RAM overlay either, depending on whether the design of the embedded system specifies initialized data. Assuming it is useful, the /c switch can be used to extract the RESET, ROM and RAM overlays into a single output file like this:

    dlhex /cRESET,ROM,RAM /mM1 test.rel

    or it can be used to extract the ROM and RAM sections into two files by running it twice like this:

    dlhex /cROM /mM1 /orom.S19 test.rel
    dlhex /cRAM /mM1 /oram.S19 test.rel

    Note that the above discussion assumes use of the default linker files; it is acceptable to use a customized linker specification file and give the overlays any names desired.


    Following is a list of command line switches DLHex supports.

    /c - specify what overlays to use

    The input file is a linker generated .rel file. Encoded in the input file is the program text, separated into the overlays indicated in the linker specification script. This command line option is used to specify which of these overlays will be placed in the output file. It is followed by the overlay names, separated by commas. For example:

    dlhex /mM1 /cCODE,DATA test.rel

    pulls the two overlays CODE and DATA from the test.rel file, and places the contents in a Motorola S19 file. By default, if no /c switch is specified, all overlays will be pulled from the .rel file in the order specified.

    /oname - specify output file
    By default, DLHex will take the input file name, and replace the extension with an extension indicating what type of output is being used. These extensions are as follows:
  • BIN - a binary output file
  • S19 - a Motorola S19 file
  • HEX - an Intel Hex file
      
    However, in some cases it is useful to be able to specify the output file name with this switch. The specified name can include an extension, or it can be typed without an extension to allow DLHex to add one of the default extensions. For example:

    dlhex /mM1 /omyfile.dat test.rel

    creates a Motorola S19 file and stores it in myfile.dat.

    Whereas

    dlhex /mM1 /omyfile test.rel

    creates a Motorola S19 file and stores it in myfile.S19.
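
    The naming rules above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not DLHex's actual code: the function name output_name, the lookup on the first letter of the /m specifier, and the upper-case extensions are all assumptions made for the example.

```python
import os

# Default extensions from the list above. Keying the table on the first
# letter of the /m specifier (B, M, I) is an assumption for this sketch.
DEFAULT_EXT = {"B": ".BIN", "M": ".S19", "I": ".HEX"}

def output_name(input_name, mode="B", oname=None):
    """Guess the output file name for a given input, /m mode and /o name."""
    ext = DEFAULT_EXT[mode[0].upper()]
    if oname is None:
        # No /o switch: replace the input file's extension.
        return os.path.splitext(input_name)[0] + ext
    if os.path.splitext(oname)[1]:
        # The /o name already carries an extension: use it unchanged.
        return oname
    # The /o name has no extension: append the default one.
    return oname + ext

print(output_name("test.rel", "M1", "myfile.dat"))  # myfile.dat
print(output_name("test.rel", "M1", "myfile"))      # myfile.S19
```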

    /mxxx - specify output file format
    DLHex supports several types of Motorola S19 output files, several types of Intel Hex output files, and a binary output file format. This switch can be followed by one of the following specifiers:

  • M1 - Motorola S19 files with two-byte address fields
  • M2 - Motorola S19 files with three-byte address fields
  • M3 - Motorola S19 files with four-byte address fields
  • I1 - 16-bit Intel Hex file. Can be segmented or not depending on the input.
  • I2 - 16-bit Intel Hex file. Starts with a segmentation record.
  • I4 - 32-bit Intel Hex file.
  • B - Binary output format

    For practical purposes the I1 and I2 formats are the same, except the first record of an I2 file is guaranteed to be a segmentation record.
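
    To make the record layout concrete, the following Python sketch builds Intel Hex records by hand. It follows the standard Intel Hex format (length, 16-bit address, record type, data, two's-complement checksum). It is assumed here that the "segmentation record" mentioned above is the standard type 02 extended segment address record; the function names are hypothetical and this is not DLHex's own code.

```python
def ihex_record(rec_type, address, data=b""):
    """Build one Intel Hex record line."""
    body = bytes([len(data), (address >> 8) & 0xFF, address & 0xFF, rec_type]) + data
    checksum = (-sum(body)) & 0xFF  # two's complement of the byte sum
    return ":" + (body + bytes([checksum])).hex().upper()

def segment_record(segment):
    """Type 02 record: extended segment address (assumed segmentation record)."""
    return ihex_record(0x02, 0, segment.to_bytes(2, "big"))

print(segment_record(0x1000))                   # :020000021000EC
print(ihex_record(0x00, 0x0100, b"\x01\x02"))   # :020100000102FA
print(ihex_record(0x01, 0))                     # :00000001FF
```

    Under that assumption, an I2 file would always begin with a line like the first one printed, while an I1 file may or may not contain such a line.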

    The default output format if no /m switch is specified is the binary format.
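
    For the Motorola formats, the three specifiers differ only in the width of the address field. The Python sketch below builds a single data record using the standard S-record layout (type, byte count, address, data, one's-complement checksum). It assumes that M1, M2 and M3 correspond to the standard S1, S2 and S3 data records, which matches the address widths listed above but is not confirmed by this document; the function name srecord is hypothetical.

```python
def srecord(address, data, addr_bytes=2):
    """Build one Motorola S-record data line (S1, S2 or S3)."""
    rec_type = {2: "S1", 3: "S2", 4: "S3"}[addr_bytes]
    addr = address.to_bytes(addr_bytes, "big")
    count = len(addr) + len(data) + 1       # address + data + checksum bytes
    body = bytes([count]) + addr + data
    checksum = ~sum(body) & 0xFF            # one's complement of the byte sum
    return rec_type + (body + bytes([checksum])).hex().upper()

print(srecord(0x0100, b"\x01\x02", 2))  # S10501000102F6
print(srecord(0x0100, b"\x01\x02", 3))  # S2060001000102F5
print(srecord(0x0100, b"\x01\x02", 4))  # S307000001000102F4
```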

    /p:xx - pad
    By default DLHex does not pad output files. In the case of Motorola S19 and Intel Hex files there are address bytes in the output file, which means padding may be applied externally if necessary, e.g. by an EPROM programmer. In the case of the binary format, no address bytes exist, and without padding the contents of the input file are copied directly to the binary output file one section at a time, without regard for the fact that it may be useful to align subsequent sections at the appropriate place relative to the first section. In other words, the default for the binary format is to create a file that has sections offset from each other based on their actual size, rather than based on the addressing information the linker has provided.

    This switch may be used to specify that padding is required between sections. The 'xx' value is a hexadecimal value used as the padding byte.
    For example:

    dlhex /p:FF test.rel
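
    The difference between the padded and unpadded binary layouts can be illustrated with a short Python sketch. It is a model of the behavior described above, not DLHex's implementation; the function name build_binary and the example addresses are made up, and the sections are assumed to be listed in ascending address order without overlapping.

```python
def build_binary(sections, pad=None):
    """Lay out (address, data) sections into one binary image.

    pad=None models the default: sections are simply concatenated.
    pad=0xFF models /p:FF: gaps are filled with the pad byte so each
    section sits at its linker address relative to the first section.
    """
    image = bytearray()
    base = sections[0][0]
    for address, data in sections:
        if pad is not None:
            gap = (address - base) - len(image)
            image += bytes([pad]) * max(gap, 0)
        image += data
    return bytes(image)

# Hypothetical example: two 2-byte sections linked 8 bytes apart.
sections = [(0x0000, b"\x01\x02"), (0x0008, b"\xAA\xBB")]
print(build_binary(sections).hex())        # 0102aabb
print(build_binary(sections, 0xFF).hex())  # 0102ffffffffffffaabb
```

    Without padding the second section starts at offset 2 in the file; with padding it starts at offset 8, matching its address.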



    DLLE

    DLLE is the utility to make 32-bit MSDOS executables that aren't WIN32 compatible. Generally this means it creates LE or LX style executables, but there are other options as well.

    DLLE is currently not documented.


    DLMZ


    DLMZ is the postprocessor used to create 16-bit MSDOS executables. Quite a few linker defaults are built around it; normally it will be called by the linker in response to a linker /T switch that specifies an MSDOS output format. It would be rare for a user to need to call it directly.

    The general form of a DLMZ command line is:

    dlmz [options] relfile

    Here the relfile is the linker generated file that holds the completely linked code. For example:

    dlmz test.rel

    makes an MSDOS executable called test.exe from the linked code in test.rel.

    The following is a list of the command line switches that DLMZ supports.

    /oname - specify output name
    By default, DLMZ will take the input file name, and replace the extension with the extension .EXE.

    However, in some cases it is useful to be able to specify the output file name with this switch. The specified name can include an extension, or it can be typed without an extension to allow DLMZ to add the default extension. For example:

    dlmz /mREAL /omyfile test.rel

    makes a segmented executable called myfile.exe

    /mxxx - specify output type
    DLMZ supports two types of MSDOS executables with the /m switch, as follows:

  • TINY - a tiny-mode program  
  • REAL - a segmented program
      

    The default output format if no /m switch is specified is REAL.

    Note that this toolchain isn't compatible with the normal MSDOS build process. Normally, MSDOS programs would have sections that also included something called a class name; the linker would take both the class name and the section name into account when determining how to create an output file. Class names aren't supported by this toolchain; instead, a linker specification file is used to specify how the sections should be combined. A generic linker specification file exists for each of the supported modes. However, since sections can be named and placed arbitrarily, this specification file will not work for all programs. It may be necessary to supply an MSDOS program with its own specific linker specification file.

    In tiny mode, it is customary to instruct most MSDOS assemblers to set a code origin of 100h. However, the linker specification file for tiny mode automatically sets this origin. Therefore an origin directive does not have to be present in the source for this toolchain to generate tiny mode files. Such files still must start with a code sequence, however.


    DLPE

    DLPE is the postprocessor used to create WIN32 executables. Quite a few linker defaults are built around it; normally it will be called by the linker in response to a linker /T switch that specifies a WIN32 output format. It would be rare for a user to need to call it directly.

    The general form of a DLPE command line is:

    dlpe [options] relfile [resourcefile]

    Here the relfile is the linker generated file that holds the linked code, and the optional resourcefile can be used to add resources to an executable.

    When a resource file is specified on the command line, DLPE will build a resource section for the executable. For example:

    dlpe test.rel test.res

    makes a console application called test.exe from the linked code in test.rel, using resources as specified in test.res.

    Following is a list of command line switches DLPE supports.

    /oname - specify output name

    By default, DLPE will take the input file name and replace the extension with the extension .EXE to form an output file name. However, in some cases it is useful to be able to specify the output file name with this switch. The specified name can include an extension, or it can be typed without an extension to allow DLPE to add the default extension. For example:

    dlpe /mGUI /omyfile test.rel

    makes a windowing executable called myfile.exe.

    /mxxx - specify the output file format
    DLPE supports several types of WIN32 executables with the /m switch. The switch is followed by one of the following:

  • CON - a console application
  • GUI - a windowing application
  • DLL - a DLL
      

    The default output format if no /m switch is specified is the console format.

    /sname - set stub file

    By default DLPE will add an MSDOS stub file which simply says that the program requires WIN32, if someone happens to run it on MSDOS. However, with the /s switch a specific stub can be specified. This might be useful, for example, with certain MSDOS DPMI extenders that mimic the WIN32 API for console mode programs. For example:

    dlpe /smystub.exe test.rel

    creates an output file test.exe which has the 16-bit program mystub.exe as its MSDOS stub.
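
    One way to check which stub ended up in an executable is to look at its MZ header. The Python sketch below relies on the standard PE file layout, in which the 32-bit little-endian field at offset 3Ch of the MZ header gives the file offset of the PE signature; everything before that offset belongs to the MSDOS stub. The function name stub_info is hypothetical, and this is an inspection aid rather than anything DLPE itself provides.

```python
import struct

def stub_info(image):
    """Return (stub_size, has_pe_signature) for the bytes of an executable."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # The 32-bit field at offset 0x3C holds the file offset of the PE header.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    return pe_offset, image[pe_offset:pe_offset + 4] == b"PE\0\0"

# Usage, assuming test.exe was produced by the command above:
#     with open("test.exe", "rb") as f:
#         print(stub_info(f.read()))
```

    A larger stub size after linking with /s than with the default stub would suggest the custom stub was applied.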