Day 1: Tutorials: Threads under Linux

Liam Widdowson from HP gave an introduction to threads in general and to threaded programming on Linux. The first thread libraries available for Linux implemented user level threads. They run inside a single process with a user level scheduler, which is nice since you avoid kernel context switches, but performance is poor and, more specifically, a thread that blocks on I/O blocks the whole process.
The upside is that no kernel support is necessary and porting is easy (a few days for a new Unix platform).

Kernel space threads rely on kernel interaction. Now a thread can block without affecting the other threads, as they all go through the kernel scheduler. The downside, however, is that you don't want to create thousands and thousands of threads: on Linux each one carries almost the overhead of a process, and the scheduler would become overwhelmed.
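To make the "a thread can block without affecting other threads" point concrete, here is a minimal sketch using POSIX threads. It was not part of the talk, and the names run_blocking_demo, blocking_reader and busy_worker are made up for illustration. One thread blocks in read(2) on an empty pipe while a second thread runs to completion; only afterwards is the reader released.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

static int pipe_fds[2];

/* Blocks in read(2) until the main thread writes a byte to the pipe. */
static void *blocking_reader(void *arg) {
    char byte = 0;
    ssize_t n = read(pipe_fds[0], &byte, 1);
    *(int *)arg = (n == 1) ? byte : -1;
    return NULL;
}

/* Does some work that must not be held up by the blocked reader. */
static void *busy_worker(void *arg) {
    int *counter = arg;
    for (int i = 0; i < 1000; i++)
        (*counter)++;
    return NULL;
}

/* Returns the worker's counter (1000) on success, -1 on failure. */
int run_blocking_demo(void) {
    int received = 0, counter = 0;
    pthread_t reader, worker;

    if (pipe(pipe_fds) != 0)
        return -1;
    if (pthread_create(&reader, NULL, blocking_reader, &received) != 0)
        return -1;
    if (pthread_create(&worker, NULL, busy_worker, &counter) != 0)
        return -1;

    /* The worker finishes while the reader is still blocked in read(2). */
    pthread_join(worker, NULL);

    /* Only now release the reader. */
    if (write(pipe_fds[1], "x", 1) != 1)
        return -1;
    pthread_join(reader, NULL);

    close(pipe_fds[0]);
    close(pipe_fds[1]);
    return (received == 'x') ? counter : -1;
}
```

Call run_blocking_demo() from main and build with something like cc -pthread. With a pure user level thread library and a naive blocking read, the worker could not be scheduled until the read returned.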

The third approach is hybrid: you get both kernel and user space threads, with each set of user threads running on top of one kernel entity. This way you can still scale to multiple CPUs, and yet most context switches happen in user mode. If a thread blocks, only its set of threads blocks, and the other sets continue to work.

LinuxThreads implements the 1 to 1 kernel space threading model with the help of the clone(2) system call. Inside those threads, you cannot use SIGUSR1 and SIGUSR2, as they are used by the threading library. While the whole thing can be a bit kludgy, it removes the complexity associated with implementing another entity and scheduler within the kernel. So far, Linus has apparently been against the hybrid model due to the complexity it introduces.
The Linux approach isn't fully POSIX compliant though. For instance, each thread gets a different PID (which changes thread behavior with respect to signals), but it's close enough for threaded code to work mostly unmodified on Linux.
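As a rough illustration of a clone(2)-based thread, the sketch below creates a child task that shares the parent's address space yet has its own PID. This is not the LinuxThreads source: the flag set, the stack size and the names run_clone_demo and child_fn are assumptions chosen for the example. Current glibc threads additionally pass CLONE_THREAD, so on a modern system pthreads share one PID.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (64 * 1024)   /* arbitrary size for the example */

static volatile int shared_value = 0;

/* Runs in the cloned task; the store is visible to the parent (CLONE_VM). */
static int child_fn(void *arg) {
    (void)arg;
    shared_value = 42;
    return 0;
}

/* Returns the value written by the child (42), or -1 on failure.
 * The parent's PID and the child's PID are stored through the pointers. */
int run_clone_demo(pid_t *parent_pid, pid_t *child_pid) {
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* Assumed flag set: share memory, filesystem info, file descriptors
     * and signal handlers; SIGCHLD lets the parent waitpid() normally.
     * The stack grows downward on common architectures, so pass its top. */
    int flags = CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | SIGCHLD;
    pid_t child = clone(child_fn, stack + STACK_SIZE, flags, NULL);
    if (child == -1) {
        free(stack);
        return -1;
    }

    if (waitpid(child, NULL, 0) == -1) {
        free(stack);
        return -1;
    }
    free(stack);

    *parent_pid = getpid();
    *child_pid = child;   /* differs from the parent's PID */
    return shared_value;
}
```

Calling run_clone_demo() from main shows both halves of the model: the child's write to shared_value is seen by the parent, while the two tasks report different PIDs, which is exactly what makes signal delivery differ from POSIX.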

Some informal benchmarking shows that on Linux a fork only takes about 3 times longer than creating a thread. On Solaris the ratio is about 20, and Solaris takes another 5 times longer than Linux on top of that. That's why some platforms tend to push threaded programming more than others.

What gets complex with threads is that signals don't necessarily reach the thread that caused them. This is true for SIGCHLD, for instance.
Some other signals like SIGFPE, SIGSEGV, SIGPIPE and SIGTRAP are, however, delivered to the right thread.
Also, under Linux, since each thread gets its own PID, a signal sent to one PID will not be delivered to a different thread that is waiting for that signal with sigwait.
One portable way to handle this is to install a signal handler that forwards the signal through a semaphore to the right thread.

I didn't take pictures of his slides since they were more of a long tutorial that you're better off reading off-line, which is easy since he thoughtfully provided a PDF version of his talk.


2001/01/27 (18:00): Version 1.0
2001/02/02 (09:02): Version 1.1. Removed pictures at Liam's request