[LWN.net]



Kernel development


The current development kernel release is still 2.4.0-test7. Linus released the 2.4.0-test8-pre1 prepatch on August 29. It contains a first shot at a thread groups implementation (see below), and a number of fixes.

The current stable kernel release is still 2.2.16. The current 2.2.17 prepatch is 2.2.17pre20, which has gone to Linus and is awaiting his official blessing (otherwise known as "holy penguin pee") as the 2.2.17 release.

The Tux2 filesystem saw its first release this week. Tux2 resembles the journaling filesystem efforts in that it seeks to produce a crash-proof system. Instead of journaling, however, Tux2 uses a technique called a "phase tree." A phase tree filesystem structures everything as a tree - including all of the metadata. By observing strict requirements on the ordering of writes, updates can be made to the filesystem by working up the tree, then "committing" the entire set of changes by writing a new root block.
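
The commit-by-root-write idea can be illustrated with a small in-memory model. This is only a sketch of the general technique as described above, not Tux2's actual design or on-disk format; the names `Node`, `PhaseTreeFS`, `write`, `commit`, and `crash` are invented for the illustration, and the write-ordering rules a real disk needs are not modeled.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Node:
    """An immutable tree block: a leaf holding data, or an interior node."""
    children: Tuple[Tuple[str, "Node"], ...] = ()
    data: Optional[bytes] = None


def updated(node, path, data):
    """Return a NEW tree with `data` at `path`; the old tree is never modified."""
    if not path:
        return Node(data=data)
    name, rest = path[0], path[1:]
    kids = dict(node.children) if node is not None else {}
    # Only the blocks on the way up to the root are copied; siblings are shared.
    kids[name] = updated(kids.get(name), rest, data)
    return Node(children=tuple(sorted(kids.items(), key=lambda kv: kv[0])))


def lookup(node, path):
    """Walk down from `node` and return the data at `path`, or None."""
    for name in path:
        if node is None:
            return None
        node = dict(node.children).get(name)
    return node.data if node is not None else None


def split(path):
    return tuple(path.strip("/").split("/"))


class PhaseTreeFS:
    """Hypothetical model: `root` is the last committed tree, `working` the next."""

    def __init__(self):
        self.root = Node()
        self.working = self.root

    def write(self, path, data):
        # Builds a new tree beside the committed one; nothing is overwritten.
        self.working = updated(self.working, split(path), data)

    def commit(self):
        # The single "root block write" that makes the whole set of changes real.
        self.root = self.working

    def crash(self):
        # After a crash, only what the committed root can reach still exists.
        self.working = self.root

    def read(self, path):
        return lookup(self.working, split(path))
```

In this model a crash between commits simply discards the uncommitted tree; the committed tree was never touched, so there is nothing to replay or repair.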

Journaling filesystems can limit performance due to their need to write things twice - once to the journal, and once to the real location on the disk. By avoiding the duplicated write, the phase tree approach should yield better performance; initial tests would appear to confirm that this is the case.

Tux2 is the work of Daniel Phillips; it was sponsored by innominate AG. The current implementation works with the 2.2.13 kernel; a 2.4 version will be available this fall. More information may be found in the announcement.

The end of the big kernel lock? In the 2.0 kernel, the "big kernel lock" was used to keep more than one processor from executing kernel code at any given time. This lock facilitated the implementation of SMP support for Linux, but protecting the entire kernel with a single lock leads to suboptimal performance, even on systems with just two processors. Thus, much of the scalability work since 2.0 has introduced more fine-grained locking; a relatively small portion of the 2.4.0-test kernel is covered by the big kernel lock.
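
The difference between one big lock and fine-grained locking can be sketched in user space. The following is an analogy only, not kernel code: the `CoarseKernel` and `FineKernel` classes and the "fs" and "net" subsystems are made up for the example, and it shows the structure of the two approaches rather than measuring their performance.

```python
import threading


class CoarseKernel:
    """Every subsystem is protected by one shared lock (the 'big lock' style)."""

    def __init__(self):
        self.big_lock = threading.Lock()
        self.counters = {"fs": 0, "net": 0}

    def touch(self, subsystem):
        # Work in "fs" must wait for work in "net", although they share no data.
        with self.big_lock:
            self.counters[subsystem] += 1


class FineKernel:
    """Each subsystem has its own lock (the fine-grained style)."""

    def __init__(self):
        self.locks = {"fs": threading.Lock(), "net": threading.Lock()}
        self.counters = {"fs": 0, "net": 0}

    def touch(self, subsystem):
        # Only callers working on the same subsystem contend with each other.
        with self.locks[subsystem]:
            self.counters[subsystem] += 1


def hammer(kernel, rounds=1000, threads_per_subsystem=2):
    """Run several threads against each subsystem and return the final counts."""

    def worker(subsystem):
        for _ in range(rounds):
            kernel.touch(subsystem)

    workers = [
        threading.Thread(target=worker, args=(name,))
        for name in ("fs", "net")
        for _ in range(threads_per_subsystem)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return dict(kernel.counters)
```

Both versions produce correct counts; the difference is in who has to wait. With the single lock, every caller queues behind every other caller, which is the scalability problem described above.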

Now kernel hacker Ingo Molnar has posted a patch which eliminates this lock altogether. It is replaced with a semaphore, which is a more efficient way of performing mutual exclusion for the remaining bits of code that need the big kernel lock - as long as no interrupt handling code needs it. This change is a sign that the multithreading of the Linux kernel is just about complete.

It is also quite a deep change to apply to a kernel that is supposed to be in feature freeze, preparing for a major release. Thus, it is not clear that it will actually make it into 2.4.0; Linus has yet to express an opinion on it.

Troubles with threads. A long discussion about support for threads in Linux wandered into the area of POSIX threads. There is a bit of an "impedance mismatch" between Linux and POSIX threads which makes the latter hard to support completely. It is widely believed that Linux threads are better, but POSIX is what a lot of people use.

A lot of the trouble comes from the fact that POSIX threads were designed to be implementable entirely in user space. Linux, however, provides kernel threads which look an awful lot like processes; they provide a lot of features which POSIX did not anticipate. They also make some of the POSIX semantics hard to implement. Examples:

  • Signal semantics. POSIX expects all threads to be known by the same process ID; a signal sent to that ID gets delivered to exactly one thread as determined by per-thread signal masks and, possibly, a digital coin toss. Since Linux threads are processes, each has its own process ID and can be signalled independently.

  • Waiting for child processes. POSIX threads can expect to be able to wait for any child process to exit - even those created by other threads. When threads are separate processes, these semantics can be hard to implement.

  • Running another program with exec(). POSIX states that, when one thread calls exec(), all other threads are terminated before the new program is run. In a user-mode thread implementation, things can be done in no other way. Linux has no need to do that, however.

Much of this trouble can be solved through the creation of a new concept: the thread group. Thread groups behave much like process groups, in that they allow a set of threads to be treated as a single entity. With some simple thread group support and the creation of a "master thread," POSIX threads should be relatively easy to support in a much more complete way.
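
The semantics at issue can be made concrete with a toy model. This is a sketch of the behavior described in the list above, not the code in the test8-pre1 patch; `ModelThread`, `ThreadGroup`, and their methods are hypothetical names, and details such as pending signals are deliberately left out.

```python
import random
import signal


class ModelThread:
    """A thread with its own ID and its own set of blocked signals."""

    def __init__(self, tid, blocked=()):
        self.tid = tid
        self.blocked = set(blocked)
        self.received = []


class ThreadGroup:
    """A set of threads that can be addressed through one shared group ID."""

    def __init__(self, tgid, threads, seed=0):
        self.tgid = tgid
        self.threads = list(threads)
        self._rng = random.Random(seed)  # the "digital coin toss"

    def signal_group(self, signum):
        """POSIX style: deliver to exactly one thread that does not block signum."""
        eligible = [t for t in self.threads if signum not in t.blocked]
        if not eligible:
            return None  # a real system would leave the signal pending
        target = self._rng.choice(eligible)
        target.received.append(signum)
        return target.tid

    def signal_thread(self, tid, signum):
        """Linux style: each thread has its own ID and can be signalled directly."""
        for t in self.threads:
            if t.tid == tid and signum not in t.blocked:
                t.received.append(signum)
                return True
        return False

    def exec(self, caller_tid):
        """POSIX exec(): every thread except the caller is terminated."""
        self.threads = [t for t in self.threads if t.tid == caller_tid]
```

For example, with three threads of which two block SIGTERM, `signal_group(signal.SIGTERM)` always lands on the third, while a signal that nobody blocks goes to one thread chosen at random.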

The 2.4.0-test8-pre1 patch contains a first shot at a thread group implementation. Linus has laid down a challenge to the POSIX threads implementers to work with this new code and see if it makes things easier; otherwise he'll take it out.

Meanwhile there was some fun discussion of things that could be done with the native Linux thread model. These include an unshare() system call which a thread could use to detach itself - partially or completely - from its thread group. Linux could well be the platform that launches a more interesting approach to threads - but not until 2.5...

For Linus's (rather uncomplimentary) view of POSIX threads, have a look at this posting.

IBM announced the release of the Andrew Filesystem (AFS) this week; details may be found in the press release. Word of the AFS release has been around since LinuxWorld; IBM has just now gotten around to telling the world formally about it.

AFS, of course, is the large-scale distributed filesystem favored by a number of large companies and universities. It has a number of advantages over NFS, including better security, location transparency, and disconnected operation. AFS has been available for Linux for some time, but only as a commercial product implemented by binary kernel modules; it has thus been expensive and prone to break when the kernel is upgraded. The release of a free version will certainly be welcomed by many people.

IBM is taking a bit of an interesting approach with this release, however. The use of the IBM Public License has already been commented on in this week's front page; that will keep AFS out of the kernel proper. IBM is also forking AFS to make this release. It is not clear that all of the AFS code will be released, and IBM will continue to develop and support the commercial version. The FAQ talks about moving features from the open version to the proprietary one, but is mum about movement in the other direction.

Without more information on how much development effort IBM plans to put into the open AFS implementation, the cynical among us might well conclude that IBM is hoping to tap the free software community for help in improving its proprietary product. The truth there will come out eventually; until then, the donation of AFS is welcome. It gives the free software world something it didn't have before.

Other patches and updates released this week include:

  • H. J. Lu has released an RPM package with a modified 2.2.16 kernel containing such goodies as NFSv3, the ext3 filesystem, the ALSA sound drivers, and more.

  • Arnaldo Carvalho de Melo of Conectiva, who has been actively fixing glitches throughout the kernel code for some time, has posted his TODO list describing what he plans to fix next.

  • User-mode Linux 0.30-2.4.0-test7 has been announced by Jeff Dike.

  • Robert H. de Vries has updated his POSIX timers patch for 2.4.0-test7.

  • Andre Hedrick has posted a tease describing his tagged command queueing implementation for ATA disks. The work also includes features like acoustic management, support for very large disks, and even serial ATA. No patch was posted, so contacting Andre looks like the way to go to play with this code.

  • Vojtech Pavlik has posted version 2.1 of the VIA IDE driver.

  • The Embedded Debian Project has announced the first development release of CML2+OS - a version of Eric Raymond's CML2 kernel configuration system that has been extended to configure and generate an entire operating system - not just the kernel.

  • Sean Walbran has released version 0.20 of the linmodem mini-HOWTO.

  • Randy Dunlap has made available the slides from his presentation on USB at LinuxWorld.

Section Editor: Jonathan Corbet


August 31, 2000

Eklektix, Inc. Linux powered! Copyright © 2000 Eklektix, Inc., all rights reserved
Linux ® is a registered trademark of Linus Torvalds