T R O U B L E S H O O T I N G
If you run into any problem during installation or while using this
package, please first read the following text and all other relevant
documentation. In particular, consult your server's documentation
if you have problems setting up your server. Also refer to your
network card's user manual or to the documentation for the operating
systems of the diskless clients as appropriate. If you still can't
solve the problem on your own, you can send me an email at
gero@gkminix.han.de
If you speak German, you can write to me in German. Otherwise
please write in English. I have already received some emails in such
poor English that I wasn't even able to understand the problem, and
I can't help you in that case. Please also excuse that I can't answer
questions sent to me by standard mail or by telephone. I just don't
have the time to deal with that.
If you decide to send me an email, please describe your problem as
exactly as possible. It usually helps to send me the relevant portions
of your configuration files (I have to pay for my internet access
myself, so please keep quotations as short as possible). Especially
with bootrom problems it usually helps to write down the screen output
_exactly_, including but not limited to any error messages. Also state
as exactly as possible how you produced the problem so that I can try
to reproduce it on my own hardware.
Additionally, please note that I can't help you with every problem
with your server, as there are so many different systems on the
market. The same is true for problems with network cards. I just
don't have the financial means to buy every card on the market for
testing. Personally I'm using NE2000 and WD8013 cards, so I can
probably help you with those.
If you find a problem which looks like a bug in the code, I would
really appreciate a short note from you. And if you have a fix for
the bug, I would appreciate your message even more.
Besides contacting me directly, you can also subscribe to a mailing
list related to network booting. Send a mail with the message
'subscribe netboot' in its body to majordomo@baghira.han.de (the
subject of the mail doesn't matter). The readers of the mailing
list should also be able to help you with any problem you might have
while setting up a diskless client. I'm also going to announce any
new version of this netboot package on the mailing list.
Problem: My operating system OS/XY is not supported by netboot
I would gladly provide support for every operating system on the
market, but I don't have the resources for doing this. However,
if you want a particular operating system to be supported, you
should get in contact with me. In any case you will have to provide
me with a valid and licensed copy of that operating system. You are
also invited to write your own boot loader, and send it to me for
inclusion into netboot under the terms of the GNU GPL.
Problem: While trying to build a bootrom I get a compiler error
The installation scripts need to compile a couple of utility
programs which are only used while building the bootrom. They
should compile on any Unix-type system, so if you get an error
please report it to me, even if you are able to fix it yourself,
so that I can include a patch in future releases.
Problem: I get an error from make saying something like "missing delimiter"
Some of the Makefiles use ifdefs, which older make programs don't
understand. Even some more "modern" systems like SCO OpenServer 5
have this problem. In that case you will have to get and install GNU
make on your system (which is the better choice anyway).
Problem: The bootrom doesn't start up at all
Either you have a floppy in your diskette drive, or you have a hard
disk installed with a partition marked as active and the bootrom
has been built so that it lets the BIOS look for active partitions
first. Both conditions let the system boot from the bootable media
instead of using the bootrom. Just remove the floppy or use fdisk
to mark all partitions as unbootable (i.e. inactive). Alternatively
you can build the bootrom so that it does not allow the BIOS to
look for bootable partitions. The program which actually creates
the bootrom ('makerom', which gets called when you run 'make
bootrom') will ask you about this right after you select the
bootrom kernel image.
Problem: The bootrom behaves strangely during startup, and may even hang
the whole system
If you compiled the mknbi programs on a system with big endian
byte order (like Motorola or PPC systems), this might indicate
that the configuration program couldn't determine the correct byte
order. It might also be that there is a bug in the byte ordering
code. Some systems like SPARCs also do not allow data accesses at
misaligned addresses. 'configure' should usually find out about
these conditions. In any case, if 'configure' is not able to
properly detect what kind of system you are using, edit the file
config.h by hand and try again. Please report this condition, and
also note which system you used for installation.
Problem: The packet driver is not able to start properly
First check what error message the packet driver prints. Usually
this problem is the result of an incorrect setup of the network
card, so check that it uses an I/O address, interrupt line and DMA
channel (if applicable) of its own, and that the packet driver
uses the correct values. Another common problem with Ethernet
cards which use shared memory (like WD80?3 cards) is an overlap
of this shared memory with the ROM area used by the bootrom.
Select a different shared memory address in that case. If that's
ok, you should next check that you configured the packet driver
correctly with the bootrom configuration program. Usually the
packet driver prints out what it expects the hardware to look
like, so you can use this information to check your setup.
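The shared memory overlap described above comes down to a simple
comparison of two address ranges. The following Python lines are only
a sketch of that comparison. The addresses and sizes in it are
hypothetical example values, not taken from any real card or bootrom
configuration, so replace them with the values of your own setup.

```python
def windows_overlap(start_a, size_a, start_b, size_b):
    """Return True if two memory windows share at least one byte."""
    return start_a < start_b + size_b and start_b < start_a + size_a


# Hypothetical example values, not taken from any real configuration.
SHARED_MEM = (0xD0000, 16 * 1024)   # card's shared memory: start, size
BOOTROM = (0xD0000, 16 * 1024)      # ROM area of the bootrom: start, size

if __name__ == "__main__":
    if windows_overlap(*SHARED_MEM, *BOOTROM):
        print("shared memory overlaps the bootrom area, pick another address")
    else:
        print("no overlap")
```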
Problem: The bootrom tells me that there is not enough memory but I have
xx megabytes installed
This problem is a result of the fact that the BIOS starts the
bootrom in the processor's real mode. The bootrom is therefore
only able to access the lower 1 megabyte of memory, regardless
of how much you have installed. And 384kB of this is reserved for
ROMs and the video memory, so there is only 640kB left. Unfortunately
some systems even reserve memory from these lower 640kB for
internal BIOS data. This is called the extended BIOS data area,
and it is known to be used on most PS/2 systems. Some other BIOSes
also use such an extended BIOS data area, which is usually
selectable in the system's setup. Therefore you should try to
disable such a feature. If that's not possible you are out of
luck - sorry.
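The arithmetic behind the numbers above can be written down as a few
lines of Python. This is only an illustration of the calculation. The
size of the extended BIOS data area is an assumption here (it differs
between systems), so it is passed in as a parameter.

```python
REAL_MODE_LIMIT_KB = 1024   # real mode can address only the lowest 1 MB
RESERVED_UPPER_KB = 384     # ROMs and video memory


def usable_conventional_kb(ebda_kb=0):
    """Memory left below 640kB after an extended BIOS data area is taken."""
    return REAL_MODE_LIMIT_KB - RESERVED_UPPER_KB - ebda_kb


if __name__ == "__main__":
    print(usable_conventional_kb())    # no extended BIOS data area
    print(usable_conventional_kb(1))   # assumed example: 1kB data area
```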
Problem: The bootrom doesn't receive a bootp answer and just hangs printing
dots
First you should check whether bootpd is running on your server or
is started properly from inetd. Then check that the server's
/etc/bootptab is set up correctly. In particular the hardware
address and the client's IP address and name have to be correct.
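A quick plausibility check of a bootptab entry can be sketched in
Python as shown below. It assumes the common entry layout of a host
name followed by colon-separated tag=value fields, with 'ha' for the
hardware address and 'ip' for the client's IP address. The entry used
in the example is made up, and the functions are hypothetical helpers
which are not part of netboot or bootpd. The check only catches
malformed values. It can't tell you whether the address really belongs
to your client.

```python
import ipaddress


def parse_bootptab_entry(entry):
    """Split one bootptab entry into its host name and a tag dictionary."""
    fields = [f.strip() for f in entry.strip().split(":") if f.strip()]
    name, tags = fields[0], {}
    for field in fields[1:]:
        tag, _, value = field.partition("=")
        tags[tag] = value
    return name, tags


def check_entry(entry):
    """Return a list of problems found in the 'ha' and 'ip' tags."""
    _, tags = parse_bootptab_entry(entry)
    problems = []
    hw = tags.get("ha", "").replace(".", "").lower()
    if hw.startswith("0x"):
        hw = hw[2:]
    # An Ethernet hardware address consists of 6 bytes, i.e. 12 hex digits.
    if len(hw) != 12 or any(c not in "0123456789abcdef" for c in hw):
        problems.append("ha: expected 12 hexadecimal digits for Ethernet")
    try:
        ipaddress.ip_address(tags.get("ip", ""))
    except ValueError:
        problems.append("ip: not a valid IP address")
    return problems


if __name__ == "__main__":
    # Made-up example entry, for illustration only.
    example = "client1:ht=ethernet:ha=00004C123456:ip=192.168.1.10:bf=bootimage"
    print(check_entry(example) or "entry looks plausible")
```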
Most bootp servers have the ability to write debugging information
into a log file. Use that feature to verify that your server really
receives bootp requests from the client's bootrom and sends out a
valid answer. Also check for error messages in the log file. Even
if your bootpd doesn't write into a separate log file it might use
syslog on your system, so find the log file name in your syslogd
configuration file and check for errors.
If you are able to use a network tracing program like tcpdump, you
can check whether the bootrom sends out correct requests and whether
the server answers correctly. If both look fine, the problem is
more likely to be in the bootrom, so you should create a new bootrom
image with the packet driver debugging module included. You should
then see the bootrom's request packets going out, and the server's
answers coming in. If there are no packets coming in although you
verified that the server is sending out correct replies, there might
be a problem with your network card. Did you set it up correctly?
Is a cable connected (no kidding, those things really happen)?
If all else fails, try to boot the diskless client with the
intended operating system and try to access the network card
using that operating system's tools.
If the server is not sending out answer packets, but the bootpd
log files indicate correct answers, it might be a problem with
the ARP setup on your server. Normally ARP shouldn't be a concern
for you. However, some older versions of bootpd for Linux had
problems here, which could be solved by setting up the kernel ARP
table manually.
Problem: The bootrom did get a bootp answer but is not able to load the
bootimage file
This is likely to be a problem with the tftpd setup on the server.
Does tftpd run when you start up the bootrom code? If not, check
that inetd is configured correctly. There might also be a TCP/IP
wrapper running on your server which prohibits access to the
tftp service (which is known to be very insecure and is therefore
a candidate for getting started through an internet security
wrapper like tcpd). Check any access configuration files for tcpd.
Furthermore tftpd has to be able to access the bootimage file. It
usually runs as a user with very low privileges for security
reasons and might not be allowed to read the bootimage file, so
you should check the bootimage file's permissions and set them
correctly.
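Whether a low-privileged tftpd is able to read the bootimage file can
be checked roughly with the Python sketch below. It assumes that tftpd
runs as a user who is neither the owner of the file nor a member of
its group, so that only the permission bits for "others" count. The
function name is a hypothetical helper. Note that the directories
leading to the file must be searchable by that user as well, which
this sketch does not check.

```python
import os
import stat


def readable_by_others(path):
    """Return True if the file's mode grants read permission to others."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IROTH)


if __name__ == "__main__":
    import sys
    for name in sys.argv[1:]:
        state = "ok" if readable_by_others(name) else "NOT readable by others"
        print(name, state)
```

To use it, save the lines in a file and run it with the path of your
bootimage file as its argument.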
Problem: The boot image loader reports an error
Congratulations! You just discovered a bug in the boot loader.
Please report it to me.
Problem: When I'm using the bootrom menu to load a Unix system off the local
hard disk, it reports some weird error messages to me (especially,
SCO Unix says that it's not able to open boot device). However,
booting without the bootrom works without a problem.
Some operating systems, especially Unix-like systems, read the
partition table after booting and try to find their own boot
partition. When using the bootrom, it's not necessary to mark the
Unix partition as bootable, so the Unix startup loader fails if
it isn't marked. To solve this problem, mark the Unix partition
active with some fdisk program. To prevent it from booting instead
of the bootrom, create the bootrom so that it does not allow the
BIOS to search for boot partitions on the installed hard disks (the
'makerom' program, which gets run when you do a 'make bootrom',
will ask you about this right after selecting a kernel image).
Problem: I'm loading Linux onto my diskless client and the kernel tells
me to insert a root floppy and press enter
First you should check that you built your kernel correctly. It
should have support for the root filesystem built in. If you want
to use an NFS mounted directory as root, the kernel should have
TCP/IP support built in. It also has to have a driver for your
network card built in, and both NFS and NFSROOT have to be
selected. When using a ramdisk, its support has to be compiled in
as well as support for the filesystem with which you formatted
the ramdisk image. Please note that the loaded kernel is not
able to use modules at bootup time (only _after_ the root
filesystem has been mounted, but not before), so everything has to
be compiled in.
If the kernel is not able to mount its root via NFS, this might
have many different reasons. All addresses in the /etc/bootptab
file have to be correct, and the access rights on the server have
to be set correctly - not only in /etc/exports but also the
permissions of the directory to be mounted. If that's correct,
check that a portmapper is running on the server, and that it
has registered the mountd and nfsd services correctly. You can
usually do this by running the command
rpcinfo -p
Note that services are only listed here if their associated server
process is really running. The rpcinfo output should then look
something like this:
   program vers proto   port
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100003    2   udp   2049  nfs
    100003    2   tcp   2049  nfs
    100005    1   udp    663  mountd
    100005    1   tcp    665  mountd
However, the port numbers might be different.
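If you want to automate this check, the rpcinfo output can be scanned
for the three services with a few lines of Python. The sketch below
assumes the column layout shown above, with the service name in the
fifth column. The function name is a hypothetical helper and the
sample text simply repeats the listing from above.

```python
REQUIRED = ("portmapper", "nfs", "mountd")

SAMPLE = """\
   program vers proto   port
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100003    2   udp   2049  nfs
    100003    2   tcp   2049  nfs
    100005    1   udp    663  mountd
    100005    1   tcp    665  mountd
"""


def missing_rpc_services(rpcinfo_output, required=REQUIRED):
    """Return the required services which are not listed in the output."""
    registered = set()
    for line in rpcinfo_output.splitlines():
        fields = line.split()
        # Data lines start with the numeric program number; the header
        # line does not and is skipped.
        if len(fields) >= 5 and fields[0].isdigit():
            registered.add(fields[4])
    return [name for name in required if name not in registered]


if __name__ == "__main__":
    missing = missing_rpc_services(SAMPLE)
    print("missing services:", ", ".join(missing) if missing else "none")
```

On a real server you would feed the function with the output of
'rpcinfo -p' instead of the sample text.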
When the kernel starts mounting the NFS root directory it prints
out the name of that directory on the server. It should be the
same as the one configured in /etc/bootptab. Check that it's
correct. If not, you can try to use the -d option with mknbi-linux
to specify the name explicitly.
If the kernel gets an error from the server's nfsd, it prints
a number which is defined according to the NFS protocol. The
most commonly occurring numbers are:
1 - permission denied to access directory
2 - directory doesn't exist
5 - I/O error on server filesystem
13 - nfsd is unable to access directory
20 - path name is not a directory
63 - path name is too long
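For use in a script, the list above can be kept as a small lookup
table. The following Python lines are just a sketch of that. The
descriptions are copied from the list, and numbers which are not in
the list are reported as unknown instead of being guessed.

```python
NFS_ERRORS = {
    1: "permission denied to access directory",
    2: "directory doesn't exist",
    5: "I/O error on server filesystem",
    13: "nfsd is unable to access directory",
    20: "path name is not a directory",
    63: "path name is too long",
}


def describe_nfs_error(number):
    """Translate an error number printed by the kernel into plain text."""
    return NFS_ERRORS.get(number, "unknown error number %d" % number)


if __name__ == "__main__":
    print(describe_nfs_error(13))
```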
Note that some nfsd and mountd programs only read /etc/exports
on startup. If you changed this file afterwards, you will have
to restart both daemons. Additionally, with nfsd versions for
Linux earlier than 2.1 you will have problems with special files
like UNIX domain sockets or block/character special files on
your NFS partitions. You should therefore use the latest available
versions.
Problem: The Linux kernel mounts its root correctly but doesn't give me
a login prompt.
1.) This might be the result of an incorrect setup of the root
filesystem (see No. 2 below). However, it's also possible that your
server reported the wrong major/minor numbers for the console device
even though you specified them correctly in the NFS mounted root
directory. I know of this problem with AIX and HP-UX servers,
but there might be others as well which don't transfer special
devices via NFS the way Linux requires. One way to solve this
problem is to boot the diskless client with a ramdisk image as
its root, and then mount the should-be-root directory on the
server using NFS. Then you can create the special files in the
dev directory using Linux's mknod program, and use the NFS root
mounting bootimage again.
Another way is to try to find out how the server operating system
encodes major/minor numbers on its own filesystem. For example,
HP-UX uses a 32 bit device number, with the 8 highest bits being
the major number, and the lower 24 bits being the minor device
number:
major << 24 | minor ==> aaaaaaaabbbbbbbbbbbbbbbbbbbbbbbb
In this representation (a) means a bit of the major number, and
(b) means a bit of the minor number. Linux uses the following
scheme instead:
major << 8 | minor ==> 0000000000000000aaaaaaaabbbbbbbb
The NFS protocol transfers these 32 bits just as they are,
without any further interpretation regarding major/minor numbers.
That means that all relevant bits in the Linux representation
fit into the minor number on HP-UX. Therefore, if you create a
device on the HP-UX server, you always have to give it a major
number of zero and compute the minor number the way mentioned
above for Linux. For example, to let Linux see a device 5/2 in
its NFS-mounted /dev directory, you can compute the minor device
number on HP-UX as
5 << 8 | 2 ==> 1282
So the device to create on the HP-UX server is 0/1282. This will
let Linux see 5/2 after the filesystem is mounted with NFS.
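The computation described above can be sketched in Python as follows.
The function names are made up for this illustration. The sketch
assumes the two encodings exactly as described in the text, i.e. 8 bit
major and 24 bit minor numbers on HP-UX, and 8 bit major and 8 bit
minor numbers on Linux.

```python
def hpux_device_for_linux(major, minor):
    """Return the major/minor pair to create on the HP-UX server."""
    if not (0 <= major <= 0xFF and 0 <= minor <= 0xFF):
        raise ValueError("Linux major and minor must each fit into 8 bits")
    # All relevant Linux bits end up in the HP-UX minor number, so the
    # HP-UX major number is always zero.
    return 0, (major << 8) | minor


def linux_view(hpux_major, hpux_minor):
    """Return the major/minor pair Linux sees for an HP-UX device."""
    raw = (hpux_major << 24) | hpux_minor   # the 32 bits sent via NFS
    return (raw >> 8) & 0xFF, raw & 0xFF


if __name__ == "__main__":
    print(hpux_device_for_linux(5, 2))
    print(linux_view(0, 1282))
```

The second function does the reverse computation, so it can be used to
verify a device which has already been created on the server.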
2.) Another reason for this problem might be that the init process
doesn't get started at all. This can be a result of incorrect
shared libraries, which the client might see but without a proper
ld.so.cache file. Or the shared libraries are not reachable by
the client at all. Bruce Janson and Markus Gutschke collected a
good list of possibilities, which you should check out:
- you do not have a private copy of the /, /etc, /var, ...
directories
- your /dev directory is missing entries for /dev/zero and/or
/dev/null or is sharing device entries from a server that uses
different major and minor numbers (i.e. a server that is not
running Linux - see above).
- your /lib directory is missing libraries (most notably libc*
and/or libm*) or does not have the loader files ld*.so*
- you neglected to run ldconfig to update /etc/ld.so.cache
or you do not have a configuration file for ldconfig.
- your /etc/inittab and/or /etc/rc.d/* files have not been
customized for the clients.
- your kernel is missing some crucial compile-time feature
(such as NFS filesystem support, booting from the net, trans-
name (optional), ELF file support, networking support, driver
for your ethernet card).
- missing init executable (in one of the directories
known by the kernel: /etc, /sbin, ?)
- missing /etc/inittab
- missing /dev/tty?
- missing /bin/sh
- system programs that insist on creating/writing to files
outside of /var (mount and /etc/mtab* is the canonical
example)
Problem: Can't compile the bootrom
Please get in touch with me if you encounter any problems
while recompiling the bootrom.