Linux SMP HOWTO: Questions related to any architectures

2. Questions related to any architectures

2.1 Kernel Side

Does Linux support multi-threading? If I start two or more processes, will they be distributed among the available CPUs?
Yes. Processes and kernel-threads are distributed among processors. User-space threads are not.
What kind of architectures are supported in SMP?

From Alan Cox:
SMP is supported in 2.0 on the hypersparc (SS20, etc.) systems and Intel 486, Pentium or higher machines which are Intel MP1.1/1.4 compliant. Richard Jelinek adds: right now, systems have been tested up to 4 CPUs and the MP standard (and so Linux) theoretically allows up to 16 CPUs.
SMP support for UltraSparc, SparcServer, Alpha and PowerPC machines is available in 2.2.x.

From Ralf Bächle:
MIPS, m68k and ARM do not support SMP; the latter two probably never will.
That said, I'm going to hack on MIPS-SMP as soon as I get an SMP box ...
How do I make a Linux SMP kernel?
Most Linux distributions don't provide a ready-made SMP-aware kernel, which means that you'll have to make one yourself. If you haven't made your own kernel yet, this is a great reason to learn how. Explaining how to make a new kernel is beyond the scope of this document; refer to the Linux Kernel Howto for more information. (C. Polisher)
In kernel series 2.0 up to but not including 2.1.132, uncomment the SMP=1 line in the main Makefile (/usr/src/linux/Makefile).
In the 2.2 version, configure the kernel and answer "yes" to the question "Symmetric multi-processing support" (Michael Elizabeth Chastain).
AND
enable real time clock support by configuring the "RTC support" item (in the "Character Devices" menu) (from Robert G. Brown). Note that, as far as I know, enabling RTC support does not cure the known SMP clock drift problem, but it does prevent a lockup when the clock is read at boot time. Richard Jelinek also notes that activating the Enhanced RTC is necessary to get the second CPU working (identified) on some original Intel mainboards.
AND
(x86 kernel) do NOT enable APM (advanced power management)! APM and SMP are not compatible, and your system will almost certainly (or at least probably ;)) crash while booting if APM is enabled (Jakob Oestergaard). Alan Cox confirms this: 2.1.x turns APM off for SMP boxes. Basically, the behaviour of APM is undefined on SMP systems, and anything could occur.
AND
(x86 kernel) enable "MTRR (Memory Type Range Register) support". Some buggy BIOSes do not activate cache memory for the second processor. The MTRR support contains code that corrects this processor misconfiguration.

You must rebuild your entire kernel and all kernel modules when changing to or from SMP mode. Remember to make modules and make modules_install (from Alan Cox).

If you get module load errors, you probably did not rebuild and/or re-install your modules. Also with some 2.2.x kernels people have reported problems when changing the compile from SMP back to UP (uni-processor). To fix this, save your .config file, do make mrproper, restore your .config file, then remake your kernel (make dep, etc.) (Wade Hampton). Do not forget to run lilo after copying your new kernel.
Recap:
```
make config # or menuconfig or xconfig
make dep
make clean
make bzImage # or whatever you want
# copy the kernel image manually then RUN LILO 
# or make lilo
make modules
make modules_install
```
How do I make a Linux non-SMP kernel?
In the 2.0 series, comment out the SMP=1 line in the main Makefile (/usr/src/linux/Makefile).
In the 2.2 series, configure the kernel and answer "no" to the question "Symmetric multi-processing support" (Michael Elizabeth Chastain).

You must rebuild your entire kernel and all kernel modules when changing to or from SMP mode. Remember to make modules and make modules_install, and remember to run lilo. See the notes above about possible configuration problems.

How can I tell if it worked?

 cat /proc/cpuinfo

Typical output (dual PentiumII):

processor       : 0
cpu             : 686
model           : 3
vendor_id       : GenuineIntel
[...]
bogomips        : 267.06
 
processor       : 1
cpu             : 686
model           : 3
vendor_id       : GenuineIntel
[...]
bogomips        : 267.06

What is the status of converting the kernel toward finer grained locking and multithreading?
Linux kernel version 2.2 has fine-grained locking for signal handling, interrupts and some of the I/O code. The rest is gradually migrating. All the scheduling is SMP safe.

Kernel version 2.3 (the future 2.4) has really fine-grained locking. In the 2.3 kernels the usage of the big kernel lock has basically disappeared; all major Linux kernel subsystems are fully threaded: networking, VFS, VM, IO, block/page caches, scheduling, interrupts, signals, etc. (Ingo Molnar)
Does Linux SMP support processor affinity?
Standard kernel
No and yes. There is no way to force a process onto a specific CPU, but the Linux scheduler has a processor bias for each process, which tends to keep processes tied to a specific CPU.

Patch
Yes. Look at PSET - Processor Sets for the Linux kernel:
The goal of this project is to make a source compatible and functionally equivalent version of pset (as defined by SGI - partially removed from their IRIX 6.4 kernel) for Linux. This enables users to determine which processor or set of processors a process may run on. Possible uses include forcing threads to separate processors, timings, security (a `root' only CPU?) and probably more.

It is focused around the syscall sysmp(). This function takes a number of parameters that determine which function is requested. Functions include:
- binding a process/thread to a specific CPU
- restricting a CPU's ability to execute some processes
- restricting a CPU from running at all
- forcing a CPU to run _only_ one process (and its children)
- getting information about a CPU's state
- creating/destroying sets of processors, to which processes may be bound
Where should one report SMP bugs?
Please report bugs to linux-smp@vger.kernel.org.
What about SMP performance?
If you want to gauge the performance of your SMP system, you can run some tests made by Cameron MacKinnon and available at http://www.phy.duke.edu/brahma/benchmarks.smp.
Also have a look at this article by Bryant, Hartner, Qi and Venkitachalam that compares 2.2 and 2.3/2.4 UP and SMP kernels: SMP Scalability Comparisons of Linux Kernels 2.2.14 and 2.3.99 (Ray Bryant) (You'll also find a copy here)

2.2 User Side

Do I really need SMP?
If you have to ask, you probably don't. :) Generally, multi-processor systems can provide better performance than uni-processor systems, but to realize any gains you need to consider many other factors besides the number of CPUs. For instance, if the processor on a given system is idle much of the time because it is waiting for a slow disk drive, then this system is "input/output bound", and probably won't benefit from additional processing power. If, on the other hand, a system has many simultaneously executing processes, and CPU utilization is very high, then you are likely to realize increased system performance. SCSI disk drives can be very effective when used with multiple processors, due to the way they can process multiple commands without tying up the CPU. (C. Polisher)
Do I get the same performance from two 300 MHz processors as from one 600 MHz processor?
This depends on the application, but most likely not. SMP adds some overhead that a faster uniprocessor box would not incur (Wade Hampton). :)
How does one display multiple-CPU performance?
Thanks to Samuel S. Chessman, here are some useful utilities:

Character based:
http://www.cs.inf.ethz.ch/~rauch/procps.html
Basically, it's procps v1.12.2 (top, ps, et al.) and some patches to support SMP.
For 2.2.x, Gregory R. Warnes has made a patch available at http://queenbee.fhcrc.org/~warnes/procps

Graphic:
xosview-1.5.1 supports SMP. Kernels from 2.1.85 onwards provide the cpuX entries in the /proc/stat file.
The official homepage for xosview is: http://lore.ece.utexas.edu/~bgrayson/xosview.html
You'll find a version patched for 2.2.x kernels by Kumsup Lee: http://www.ima.umn.edu/~klee/linux/xosview-1.6.1-5a1.tgz

By the way, you can't monitor processor scheduling precisely with xosview, as xosview itself causes a scheduling perturbation. (H. Peter Anvin)
And Rik van Riel tells us why:
The answer is pretty simple. Basically there are 3 processes involved:
1. the cpu hog (low scheduling priority because it eats CPU)
2. xosview
3. X
The CPU hog is running on one CPU. Then xosview wakes up (on the other CPU) and starts sending commands to X, which wakes up as well.
Since both X and xosview have a much higher priority than the CPU hog, xosview will run on one CPU and X on the other.
Then xosview stops running and we have an idle CPU, so Linux moves the CPU hog over to the newly idle CPU (X is still running on the CPU our hog was running on just before).
How can I enable more than 1 process for my kernel compile?
Use:
```
# X is the maximum number of parallel processes.
# WARNING: this won't work for "make dep".
make [modules|zImage|bzImage] MAKE="make -jX"
```
With 2.2-series kernels, see also the file /usr/src/linux/Documentation/smp.txt for specific instructions.
BTW, since running multiple compilers lets a machine with sufficient memory use the CPU time that would otherwise be wasted during I/O-caused delays, make MAKE="make -j 2" -j 2 actually helps even on uniprocessor boxes (from Ralf Bächle).
Why is the time given by the time command inaccurate? (from Joel Marchand)
In the 2.0 series, the result given by the time command is wrong. The sum user+system is right *but* the split between user and system time is wrong.
More precisely: "The explanation is, that all time spent in processors other than the boot cpu is accounted as system time. If you time a program, add the user time and the system time, then your timing will be almost right, except for also including the system time that is correctly accounted for" (Jakob Østergaard).
This bug is corrected in 2.2 kernels.

2.3 SMP Programming

Section by Jakob Østergaard.

This section is intended to outline what works, and what doesn't when it comes to programming multi-threaded software for SMP Linux.

Parallelization methods

POSIX Threads
PVM / MPI Message Passing Libraries
fork() -- Multiple processes

Since both fork() and PVM/MPI processes usually do not share memory, but either communicate by means of IPC or a messaging API, they will not be described further in this section. They are not very specific to SMP, since they are used just as much - or more - on uniprocessor computers, and clusters thereof.

Only POSIX Threads provide us with multiple threads sharing resources like - especially - memory. This is the thing that makes an SMP machine special, allowing many processors to share their memory. To use both (or more ;) processors of an SMP, use a kernel-thread library. A good library is LinuxThreads, a pthread library made by Xavier Leroy which is now integrated with glibc2 (aka libc6). Newer Linux distributions include this library by default, hence you do not have to obtain a separate package to use kernel threads.

There are implementations of threads (and POSIX threads) that are application-level, and do not take advantage of the kernel-threading. These thread packages keep the threading in a single process, hence do not take advantage of SMP. However, they are good for many applications and tend to actually run faster than kernel-threads on single processor systems.

Multi-threading has never been really popular in the UN*X world though. For some reason, applications requiring multiple processes or threads have mostly been written using fork(). Therefore, when using the thread approach, one runs into problems of incompatible (not thread-ready) libraries, compilers, and debuggers. GNU/Linux is no exception to this. Hopefully the next few sections will shed a little light on what is currently possible, and what is not.

The C Library

Older C libraries are not thread-safe. It is very important that you use GNU LibC (glibc), also known as libc6. Earlier versions can, of course, be used, but doing so will probably cause you much more trouble than upgrading your system will :)

If you want to use GDB to debug your programs, see below.

Languages, Compilers and debuggers

There is a wealth of programming languages available for GNU/Linux, and many of them can be made to use threads one way or the other (some languages like Ada and Java even have threads as primitives in the language).

This section will, however, currently only describe C and C++. If you have experience in SMP Programming with other languages, please enlighten us.

GNU C and C++, as well as the EGCS C and C++ compilers work with the thread support from the standard C library (glibc). There are however a few issues:

When compiling C or C++, use the -D_REENTRANT define in the compiler command line. This is necessary to make certain error-handling facilities, such as the errno variable, work correctly with threads.
When using C++, if two threads throw exceptions concurrently, the program will segfault. The compiler does not generate thread-safe exception code. The workaround is to put a pthread_mutex_lock(&global_exception_lock) in the constructor(s) of every class you throw(), and to put the corresponding pthread_mutex_unlock(...) in the destructor. It's ugly, but it works. This solution was given by Markus Ferch.

The GNU Debugger (GDB), as of version 4.18, should handle threads correctly. Most Linux distributions offer a patched, thread-aware gdb.

It is not necessary to patch glibc in any way just to make it work with threads. If you do not need to debug the software (this could be true for all machines that are not development workstations), there is no need to patch glibc.

Note that core-dumps are of no use when using multiple threads. Somehow, the core dump is attached to one of the currently running threads, and not to the program as a whole. Therefore, whenever you are debugging anything, run it from the debugger.

Hint: If you have a thread running haywire, like eating 100% CPU time, and you cannot seem to figure out why, here is a nice way to find out what's going on: Run the program straight from the shell, no GDB. Make the thread go haywire. Use top to get the PID of the process. Run GDB like gdb program pid. This will make GDB attach itself to the process with the PID you specified, and stop the thread. Now you have a GDB session with the offending thread, and can use bt and the like to see what is happening.

Other libraries

ElectricFence: This library is not thread safe. It should be possible, however, to make it work in SMP environments by inserting mutex locks in the ElectricFence code.

Other points about SMP Programming

Where can I find more information about parallel programming?
Look at the Linux Parallel Processing HOWTO
Lots of useful information can be found at Parallel Processing using Linux
Look also at the Linux Threads FAQ
Are there any threaded programs or libraries?
Yes. For programs, you should look at: Multithreaded programs on linux (I love hyperlinks, did you know that? ;))
As far as libraries are concerned, there are:

OpenGL Mesa library
Thanks to David Buccarelli, Andreas Schiffler and Emil Briggs, it exists in a multithreaded version (right now [1998-05-11], there is a working version that provides speedups of 5-30% on some OpenGL benchmarks). The multithreaded stuff is now included in the regular Mesa distribution as an experimental option. For more information, look at the Mesa library

BLAS
Pentium Pro Optimized BLAS and FFTs for Intel Linux
Multithreaded BLAS routines are not available right now, but a dual proc library is planned for 1998-05-27, see Blas News for details.

The GIMP
Emil Briggs, the same guy who is involved in multithreaded Mesa, is also working on multithreaded plugins for The GIMP. Look at http://nemo.physics.ncsu.edu/~briggs/gimp/index.html for more info.
