Between the two incompatible binary formats, the static vs shared library distinction, and the overloading of the verb `link' to mean both `what happens after compilation' and `what happens when a compiled program is invoked' (and, actually, the overloading of the word `load' in a comparable but opposite sense), this section is complicated. Little of it is much more complicated than that sentence, though, so don't worry too much about it.
To alleviate the confusion somewhat, we refer to what happens at runtime as `dynamic loading' and cover it in the next section. You will also see it described as `dynamic linking', but not here. This section, then, is exclusively concerned with the kind of linking that happens at the end of a compilation.
The last stage of building a program is to `link' it: to join all
the pieces of it together and see what is missing. Obviously there
are some things that many programs will want to do (open files, for
example), and the pieces that do these things are provided for you in
the form of libraries. On the average Linux system these can be found
in /lib and /usr/lib/, among other places.
When using a static library, the linker finds the bits that the
program modules need, and physically copies them into the executable
output file that it generates. For shared libraries, it doesn't ---
instead it leaves a note in the output saying `when this program is
run, it will first have to load this library'. Obviously shared
libraries tend to make for smaller executables; they also use less
memory and mean that less disk space is used. The default behaviour
of Linux is to link shared if it can find the shared libraries, static
otherwise. If you're getting static binaries when you want shared,
check that the shared library files (*.sa for a.out, *.so for ELF)
are where they should be, and are readable.
On Linux, static libraries have names like libname.a, while
shared libraries are called libname.so.x.y.z where x.y.z is
some form of version number. Shared libraries often also have links
pointing to them, which are important, and (on a.out configurations)
associated .sa files. The standard libraries come in both shared
and static formats.
You can find out what shared libraries a program requires by using
ldd (List Dynamic Dependencies):
$ ldd /usr/bin/lynx
libncurses.so.1 => /usr/lib/libncurses.so.1.9.6
libc.so.5 => /lib/libc.so.5.2.18
This shows that on my system the WWW browser `lynx' depends on the
presence of libc.so.5 (the C library) and libncurses.so.1
(used for terminal control). If a program has no dependencies,
ldd will say `statically linked' or `statically linked (ELF)'.
`Which library is sin() in?'
nm libraryname should list all the symbols that
libraryname has references to. It works on both static and shared
libraries. Suppose that you want to know where tcgetattr() is
defined: you might do
$ nm libncurses.so.1 |grep tcget
U tcgetattr
The U stands for `undefined': it shows that the ncurses
library uses but does not define it. You could also do
$ nm libc.so.5 | grep tcget
00010fe8 T __tcgetattr
00010fe8 W tcgetattr
00068718 T tcgetpgrp
The `W' stands for `weak', which means that the symbol is
defined, but in such a way that it can be overridden by another
definition in a different library. A straightforward `normal'
definition (such as the one for tcgetpgrp) is marked by a `T'.
The short answer to the question in the title, by the way, is
libm.(so|a). All the functions defined in <math.h> are
kept in the maths library; thus you need to link with -lm when
using any of them.
ld: Output file requires shared library `libfoo.so.1`
The file search strategy of ld and friends varies according to
version, but the only default you can reasonably assume is
/usr/lib. If you want libraries elsewhere to be searched,
specify their directories with the -L option to gcc or ld.
If that doesn't help, check that you have the right file in that
place. For a.out, linking with -lfoo makes ld look for
libfoo.sa (shared stubs), and if unsuccessful then for
libfoo.a (static). For ELF, it looks for libfoo.so then
libfoo.a. libfoo.so is usually a symbolic link to libfoo.so.x.
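The ELF lookup order described above can be sketched in plain shell. This is only an illustration, not how ld is implemented: the function name find_lib and the scratch directories are made up for the example, and it assumes the per-directory behaviour of GNU ld, where each directory is checked for the shared object first and then for the archive before the next directory is tried.

```shell
# find_lib NAME DIR...: print the first ELF-style match for -lNAME.
# (find_lib is a hypothetical helper, not part of binutils.)
# Within each directory the shared object is tried before the archive.
find_lib() {
    name=$1
    shift
    for dir in "$@"; do
        for candidate in "$dir/lib$name.so" "$dir/lib$name.a"; do
            if [ -e "$candidate" ]; then
                printf '%s\n' "$candidate"
                return 0
            fi
        done
    done
    return 1
}

# Demonstration in a scratch directory with empty stand-in files.
scratch=$(mktemp -d)
mkdir "$scratch/first" "$scratch/second"
: > "$scratch/first/libbar.a"
: > "$scratch/second/libbar.so"
: > "$scratch/second/libfoo.a"
: > "$scratch/second/libfoo.so"

found_foo=$(find_lib foo "$scratch/first" "$scratch/second")
found_bar=$(find_lib bar "$scratch/first" "$scratch/second")
echo "$found_foo"   # the .so in second/ beats the .a beside it
echo "$found_bar"   # the .a in first/ wins: first/ is searched earlier

rm -rf "$scratch"
```

Note the second case: under this assumption, a static archive in an earlier -L directory hides a shared library in a later one, which is one way to end up with an unexpectedly static link.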
Like any other program, libraries tend to have bugs which get fixed over time. They may also introduce new features, change the effect of existing ones, or remove old ones. This could be a problem for programs using them; what if a program was depending on that old feature?
So, we introduce library versioning. We categorise the changes that
might be made to a library as `minor' or `major', and we rule that a
`minor' change is not allowed to break old programs that are using the
library. You can tell the version of a library by looking at its
filename (actually, this is, strictly speaking, a lie for
ELF; keep reading to find out why): libfoo.so.1.2 has
major version 1, minor version 2. The minor version number can be
more or less anything: libc puts a `patchlevel' in it, giving
library names like libc.so.5.2.18, and it's also reasonable to
put letters, underscores, or more or less any printable ASCII in it.
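As a small illustration of the naming scheme, the following sketch pulls the major and minor parts out of a library file name using only POSIX parameter expansion. The function name split_version is invented for this example, and it only reads the file name; as noted above, for ELF the file name is not strictly authoritative.

```shell
# split_version FILE: set $major and $minor from a name like libfoo.so.1.2
# (split_version is a hypothetical helper for illustration only.)
split_version() {
    file=$1
    version=${file#*.so.}        # strip everything up to and including ".so."
    major=${version%%.*}         # text before the first dot
    case $version in
        *.*) minor=${version#*.} ;;   # everything after the first dot
        *)   minor= ;;                # no minor part present
    esac
}

split_version libc.so.5.2.18
echo "major=$major minor=$minor"
```

Everything after the first dot is treated as the minor version, so a libc-style patchlevel stays attached to it.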
One of the major differences between ELF and a.out format is in building shared libraries. We look at ELF first, because it's simpler.
ELF (Executable and Linking Format) is a binary format originally developed by USL (UNIX System Laboratories) and currently used in Solaris and System V Release 4. Because of its increased flexibility over the older a.out format that Linux was using, the GCC and C library developers decided last year to move to using ELF as the Linux standard binary format also.
This section is from the document '/news-archives/comp.sys.sun.misc'.
ELF ("Executable and Linking Format") is the "new, improved" object file format introduced in SVR4. ELF is much more powerful than straight COFF, in that it *is* user-extensible. ELF views an object file as an arbitrarily long list of sections (rather than an array of fixed-size entities); these sections, unlike in COFF, do not HAVE to be in a certain place and do not HAVE to come in any specific order. Users can add new sections to object files if they wish to capture new data. ELF also has a far more powerful debugging format called DWARF (Debugging With Attribute Record Format), not currently fully supported on Linux (but work is underway). A linked list of DWARF DIEs (or Debugging Information Entries) forms the .debug section in ELF. Instead of being a collection of small, fixed-size information records, DWARF DIEs each contain an arbitrarily long list of complex attributes and are written out as a scope-based tree of program data. DIEs can capture a large amount of information that the COFF .debug section simply couldn't (like C++ inheritance graphs etc.).
ELF files are accessed via the SVR4 (Solaris 2.0 ?) ELF access library, which provides an easy and fast interface to the more gory parts of ELF. One of the major boons in using the ELF access library is that you will never need to look at an ELF file qua UNIX file: it is accessed as an Elf * after an elf_begin() call, and from then on you perform elf_foobar() calls on its components instead of messing about with its actual on-disk image (something many COFFers did with impunity).
The case for/against ELF, and the necessary contortions to upgrade an a.out system to support it, are covered in the ELF-HOWTO and I don't propose to cut/paste them here. The HOWTO should be available in the same place as you found this one.
To build libfoo.so as a shared library, the basic steps look
like this:
$ gcc -fPIC -c *.c
$ gcc -shared -Wl,-soname,libfoo.so.1 -o libfoo.so.1.0 *.o
$ ln -s libfoo.so.1.0 libfoo.so.1
$ ln -s libfoo.so.1 libfoo.so
$ LD_LIBRARY_PATH=`pwd`:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH
This will generate a shared library called libfoo.so.1.0, and
the appropriate links for ld (libfoo.so) and the dynamic
loader (libfoo.so.1) to find it. To test, we add the current
directory to LD_LIBRARY_PATH.
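One detail of the LD_LIBRARY_PATH step is worth a sketch. If the variable starts out unset, simply gluing `dir:' onto the front leaves a trailing colon, that is, an empty entry, which many dynamic loaders read as `the current directory'. That is harmless when the directory you are adding is the current one anyway, but it is easy to avoid. The helper name prepend_path and the example directories below are invented for illustration.

```shell
# prepend_path DIR CURRENT: print DIR placed in front of CURRENT,
# adding the colon only when CURRENT is non-empty.
# (prepend_path is a hypothetical helper, not a standard command.)
prepend_path() {
    printf '%s\n' "$1${2:+:$2}"
}

# Two cases: nothing set yet, and an existing search path.
demo_empty=$(prepend_path /opt/foo/lib "")
demo_set=$(prepend_path /opt/foo/lib /usr/local/lib)
echo "$demo_empty"
echo "$demo_set"

# Typical use while testing a freshly built library:
#   LD_LIBRARY_PATH=$(prepend_path "$PWD" "$LD_LIBRARY_PATH")
#   export LD_LIBRARY_PATH
```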
When you're happy that the library works, you'll have to move it to,
say, /usr/local/lib, and recreate the appropriate links. The
link from libfoo.so.1 to libfoo.so.1.0 is kept up to date by
ldconfig, which on most systems is run as part of the boot
process. The libfoo.so link must be updated manually. If you are
scrupulous about upgrading all the parts of a library (e.g. the header
files) at the same time, the simplest thing to do is make
libfoo.so -> libfoo.so.1, so that ldconfig will keep both
links current for you. If you aren't, you're setting yourself up
to have all kinds of weird things happen at a later date. Don't say
you weren't warned.
$ su
# cp libfoo.so.1.0 /usr/local/lib
# /sbin/ldconfig
# ( cd /usr/local/lib ; ln -s libfoo.so.1 libfoo.so )
Each library has a soname. When the linker finds one of
these in a library it is searching, it embeds the soname into the
binary instead of the actual filename it is looking at. At runtime,
the dynamic loader will then search for a file with the name of the
soname, not the library filename. Thus a library called
libfoo.so could have a soname libbar.so, and all programs
linked to it would look for libbar.so instead when they started.
This sounds like a pointless feature, but it is key to
understanding how multiple versions of the same library can coexist on
a system. The de facto naming standard for libraries in Linux is to
call the library, say, libfoo.so.1.2, and give it a soname of
libfoo.so.1. If it's added to a `standard' library directory
(e.g. /usr/lib), ldconfig will create a symlink
libfoo.so.1 -> libfoo.so.1.2 so that the appropriate image
is found at runtime. You also need a link libfoo.so ->
libfoo.so.1 so that ld will find the right soname to use at link
time.
So, when you fix bugs in the library, or add new functions (any
changes that won't adversely affect existing programs), you rebuild
it, keeping the soname as it was, and changing the filename. When you
make changes to the library that would break existing binaries, you
simply increment the number in the soname: in this case, call the
new version libfoo.so.2.0, and give it a soname of
libfoo.so.2. Now switch the libfoo.so link to point
to the new version and all's well with the world again.
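The resulting directory layout after a major version bump can be mocked up with empty files, which makes it easier to see which link serves which purpose. This is only a sketch: the files are placeholders rather than real libraries, the links that ldconfig would normally create are made by hand, and readlink is assumed to be available (it is on any reasonably current Linux system).

```shell
scratch=$(mktemp -d)

# Version 1.2 of a hypothetical libfoo, soname libfoo.so.1.
: > "$scratch/libfoo.so.1.2"
ln -s libfoo.so.1.2 "$scratch/libfoo.so.1"   # runtime link (by soname)
ln -s libfoo.so.1   "$scratch/libfoo.so"     # link-time link for ld

# An incompatible release arrives: new file, new soname, new runtime link.
: > "$scratch/libfoo.so.2.0"
ln -s libfoo.so.2.0 "$scratch/libfoo.so.2"

# Point the link-time name at the new major version.
rm -f "$scratch/libfoo.so"
ln -s libfoo.so.2 "$scratch/libfoo.so"

runtime_v1=$(readlink "$scratch/libfoo.so.1")   # old binaries still resolve
runtime_v2=$(readlink "$scratch/libfoo.so.2")   # new binaries resolve here
link_time=$(readlink "$scratch/libfoo.so")      # what new links will pick up
echo "libfoo.so.1 -> $runtime_v1"
echo "libfoo.so.2 -> $runtime_v2"
echo "libfoo.so   -> $link_time"

rm -rf "$scratch"
```

Both runtime links exist side by side, which is what lets programs built against either major version keep running.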
Note that you don't have to name libraries this way, but it's a good convention. ELF gives you the flexibility to name libraries in ways that will confuse the pants off people, but that doesn't mean you have to use it.
Executive summary: supposing that you observe the tradition that major upgrades may break compatibility, minor upgrades may not, then link with
gcc -shared -Wl,-soname,libfoo.so.major -o libfoo.so.major.minor
and everything will be all right.
The ease of building shared libraries is a major reason for upgrading to ELF. That said, it's still possible in a.out. Get ftp://tsx-11.mit.edu/pub/linux/packages/GCC/src/tools-2.17.tar.gz and read the 20 page document that you will find after unpacking it. I hate to be so transparently partisan, but it should be clear from context that I never bothered myself :-)
QMAGIC is an executable format just like the old a.out (also known as ZMAGIC) binaries, but which leaves the first page unmapped. This allows for easier NULL dereference trapping as no mapping exists in the range 0-4096. As a side effect your binaries are nominally smaller as well (by about 1K).
Obsolescent linkers support ZMAGIC only, semi-obsolescent support both formats, and current versions support QMAGIC only. This doesn't actually matter, though, as the kernel can still run both formats.
Your `file' command should be able to identify whether a program is QMAGIC.
An a.out (DLL) shared library consists of two real files and a
symlink. For the `foo' library used throughout this document as an
example, these files would be libfoo.sa and libfoo.so.1.2;
the symlink would be libfoo.so.1 and would point at the latter of
the files. What are these for?
At compile time, ld looks for libfoo.sa. This is the `stub'
file for the library, and contains all exported data and pointers to
the functions required for run time linking.
At run time, the dynamic loader looks for libfoo.so.1. This is a
symlink rather than a real file so that libraries can be updated with
newer, bugfixed versions without crashing any application that was
using the library at the time. After the new version (say,
libfoo.so.1.3) is completely there, running ldconfig will
switch the link to point to it in one atomic operation, leaving any
program which had the old version still perfectly happy.
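For the curious, here is one way such a switch can be done by hand without ever leaving a moment in which the link is missing. It is a sketch of the general technique (create the new link under a temporary name, then rename it over the old one), not a description of what ldconfig does internally. The files are empty placeholders, the temporary name libfoo.so.1.new is arbitrary, and readlink is assumed to be available.

```shell
scratch=$(mktemp -d)

# Existing state: libfoo.so.1 -> libfoo.so.1.2 (empty stand-in files).
: > "$scratch/libfoo.so.1.2"
ln -s libfoo.so.1.2 "$scratch/libfoo.so.1"

# The new, bugfixed image is copied into place first.
: > "$scratch/libfoo.so.1.3"

# Build the replacement link under a temporary name, then rename it over
# the old link. The rename replaces the old link in a single step, so the
# name libfoo.so.1 always resolves to one image or the other.
ln -s libfoo.so.1.3 "$scratch/libfoo.so.1.new"
mv -f "$scratch/libfoo.so.1.new" "$scratch/libfoo.so.1"

switched_target=$(readlink "$scratch/libfoo.so.1")
echo "libfoo.so.1 -> $switched_target"

rm -rf "$scratch"
```

Removing the old link and then creating the new one would also work most of the time, but a program started in between would fail to find its library.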
DLL libraries (I know that's a tautology; so sue me) often appear
bigger than their static counterparts. They reserve space for future
expansion in the form of `holes' which can be made to take no disk
space. A simple cp call or using the program makehole will
achieve this. You can also strip them after building, as the
addresses are in fixed locations. Do not attempt to strip ELF
libraries.
A libc-lite is a light-weight version of the libc library built
such that it will fit on a floppy and suffice for all of the most
menial of UNIX tasks. It does not include curses, dbm, termcap and
similar code. If your /lib/libc.so.4 is linked to a lite lib, you are
advised to replace it with a full version.
Send me your linking problems! I probably won't do anything about them, but I will write them up if I get enough ...
Check that you have the right links for ld to find each shared
library. For ELF this means a libfoo.so symlink to the image,
for a.out a libfoo.sa file. A lot of people had this problem
after moving from ELF binutils 2.5 to 2.6: the earlier version
searched more `intelligently' for shared libraries, so they hadn't
created all the links. The intelligent behaviour was removed for
compatibility with other architectures, and because quite often it got
its assumptions wrong and caused more trouble than it solved.
As of libc.so.4.5.x and above, libgcc is no longer shared. Hence
you must replace occurrences of `-lgcc' on the offending line with
`gcc -print-libgcc-file-name` (complete with the backquotes).
Also, delete all /usr/lib/libgcc* files. This is important.
`__NEEDS_SHRLIB_libc_4 multiply defined' messages are another
consequence of the same problem.
This cryptic message most probably means that one of your jump table
slots has overflowed because too little space has been reserved in the
original jump.vars file. You can locate the culprit(s) by
running the `getsize' command provided in the tools-2.17.tar.gz
package. Probably the only solution, though, is to bump the major
version number of the library, forcing it to be backward incompatible.
ld: output file needs shared library libc.so.4
This usually happens when you are linking with libraries other than
libc (e.g. X libraries), and use the -g switch on the link line
without also using -static.
The .sa stubs for the shared libraries usually have an undefined
symbol _NEEDS_SHRLIB_libc_4 which gets resolved from the
libc.sa stub. However with -g you end up linking with
libg.a or libc.a and thus this symbol never gets resolved,
leading to the above error message.
In conclusion, add -static when compiling with the -g flag,
or don't link with -g. Quite often you can get enough debugging
information by compiling the individual files with -g, and
linking without it.