[LWN.net]

See also: last week's Kernel page.

Kernel development


The current kernel release is still 2.4.4. Linus did release 2.4.5pre2 on May 15; it was his first kernel release in almost two weeks. It was followed one day later by 2.4.5pre3. There is little in the prepatches that is exciting, which is how it should be in a stable kernel series.

Alan Cox has released 2.4.4ac9, with a rather longer set of fixes. Included therein is a set of user-mode Linux patches, presumably a result of the wider exposure that UML is getting as part of the "ac" series.

Andrea Arcangeli has also gotten in on the act with 2.4.5pre2aa1, which has a number of performance and bugfix patches. Mr. Arcangeli is also working with the 2.2 series, and has released 2.2.20pre2aa1 with a number of additions to that kernel.

A moratorium on device number assignments. It all started with this note from the "Linux Assigned Names and Numbers Authority," otherwise known as H. Peter Anvin:

"Linus Torvalds has requested a moratorium on new device number assignments. His hope is that a new and better method for device space handling will emerge as a result."

Major numbers, of course, are part of the device files that Unix has implemented since the beginning. The major number encoded within any particular device file serves as an index into an array within the kernel; it is used to find the device driver which is responsible for managing that device.

These numbers have traditionally been assigned in a static manner. For example, block major number 3 (among others) belongs to IDE disks. Given the static assignment, distributors can set up their systems with a full set of /dev/hd* files, knowing that they will work with all systems. People (and, especially, vendors) who add new drivers to the kernel like to get static device numbers for the same reason - it is easier to make things work everywhere.
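The lookup and the static assignment described above can be sketched in a few lines. This is a toy user-space model, not kernel code: the table size assumes 8-bit major numbers as in 2.4-era kernels, and `block_drivers`, `register_driver`, `driver_for`, and the driver name "ide" are hypothetical names chosen for illustration. Only the assignment of block major 3 to IDE disks comes from the text.

```python
import os

MAX_MAJOR = 256  # assumption: 8-bit major numbers, as in 2.4-era kernels

# Toy driver table: slot N holds the driver registered for major number N.
block_drivers = [None] * MAX_MAJOR


def register_driver(major, name):
    """Claim a fixed slot in the table, as a static assignment does."""
    if block_drivers[major] is not None:
        raise ValueError(f"major {major} is already taken")
    block_drivers[major] = name


def driver_for(dev):
    """Find the driver for a device number, using the major as the index."""
    return block_drivers[os.major(dev)]


# Block major 3 belongs to IDE disks; the name "ide" is illustrative.
register_driver(3, "ide")

# A device file such as /dev/hda1 carries a (major, minor) pair.
hda1 = os.makedev(3, 1)
print(os.major(hda1), os.minor(hda1), driver_for(hda1))
```

Because slot 3 means the same thing on every system, a distributor can create the /dev/hd* files ahead of time; that is the convenience the static scheme buys, and the table size is the limit it runs into.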

There are some problems, however. The kernel is running out of available major numbers (see the March 29 LWN kernel page), and an expansion will be required. Management of the /dev directory is increasingly difficult; a quick check on your author's system shows over 6000 entries there. And devices are increasingly dynamic - many can be attached and removed while the system is running, making static naming difficult.

Linus has evidently decided that it is time to deal with the device numbering problems, and is trying to force the issue by making it hurt. There are two very different aspects of this development that are worth a look. The next item examines the effect of Linus' tactics on kernel development; then we'll take a more technical look at what shape a solution might have.

A fork of the kernel? Not everybody is pleased with the device number moratorium. Those who wish to support new devices under the 2.4 kernel will now have to manage without static numbers. Working with dynamic major numbers is not all that hard, but it does require some work and some boot-time support. Not everybody believes that the static numbering scheme is a problem, but even those who do see a problem would, in general, have preferred that Linus wait until 2.5 to impose his moratorium. Stopping number registration before the stable series is truly stable changes the rules at an inconvenient time, and seems rather heavy-handed.
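The boot-time support mentioned above usually amounts to asking the kernel which number a driver received and then creating the device file to match. The sketch below shows only the lookup step, under the assumption that the assignment is read from text in the format of /proc/devices. The sample text, the driver name "mydriver", the number 254, and the function `find_major` are all invented for illustration.

```python
# Sample text in the layout of /proc/devices; the entries are illustrative.
SAMPLE_PROC_DEVICES = """\
Character devices:
  1 mem
  4 tty
254 mydriver

Block devices:
  3 ide0
  8 sd
"""


def find_major(proc_devices_text, driver_name, kind="Character"):
    """Return the major number assigned to driver_name, or None if absent.

    kind selects the "Character" or "Block" section of the listing.
    """
    in_section = False
    for line in proc_devices_text.splitlines():
        stripped = line.strip()
        if stripped.endswith("devices:"):
            # Section header: track whether we are in the requested section.
            in_section = stripped.startswith(kind)
            continue
        if in_section and stripped:
            number, name = stripped.split(None, 1)
            if name == driver_name:
                return int(number)
    return None


major = find_major(SAMPLE_PROC_DEVICES, "mydriver")
print(major)
```

A real boot script would read /proc/devices itself and then create the node (for example with mknod) using the number it found; that step needs root privileges and is left out here.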

In response, Alan Cox has stated that he will still accept static device number registrations in his "ac" series of kernels:

"And on that issue I'm so convinced you are wrong I'm prepared to maintain sensible Unix device behaviour in the -ac pretty much indefinitely."

H. Peter Anvin will continue to maintain a device number registry for the "ac" kernels. Given Alan's position, it is almost certain that future kernels distributed by Red Hat will follow this behavior and honor any new device numbers. It is also quite likely that other distributors will take a similar approach.

In other words, Linus has made an unpopular decision and the kernel has been forked as a result. The behavior that most users will see in future 2.4 kernels from distributors will probably not be what Linus has decreed.

This is an interesting development, to say the least, but it is also not quite as big a deal as one might think, for a couple of reasons. The first is that Alan still does not plan to go his own way with his kernels:

"One thing I absolutely refuse to do is to let a disagreement over some specific device implementation turn into an excuse for a wider difference in the trees. So yes -ac might have static majors but the rest of it I intend to keep merging with Linus and tracking closely to his tree."

The other important reason has to do with how kernel development is done. The Linux kernel is often pointed out as being the unifying factor that keeps Linux systems roughly in sync. But the fact of the matter is that the kernel is probably the most heavily forked free software package in existence. Consider:

  • The "ac" series has always been a fork - it is more than just a staging area for patches on their way to the Linus kernel. Alan's approach is to get the fixes in more quickly, while simultaneously being more pragmatic about what users of the kernel really need.

  • No distributor ships a standard Linus kernel - all apply patches. For example, the 2.4.2 kernel shipped with Red Hat Linux 7.1 includes over 200 patches, including 2.4.2ac3, numerous performance and bugfix patches, zero-copy networking, TUX, and much more. See the 2.4 kernel spec file for the gory details. Not all distributors patch this heavily, but they all ship kernels which differ significantly from the Linus standard.

  • Every port to a different processor is a fork of the kernel which is only resynchronized occasionally. There is currently quite a bit of divergence between the port development trees and the official 2.4.4 kernel.

  • Projects like RTLinux, RTAI, etc. are also forks.

The thing that makes all this work is that all of these forks sync up with the official Linus kernel occasionally. Thus, while only a small percentage of Linux users are actually running a Linus kernel, that kernel serves as the Linux "standard" which charts the course for all the others. As long as the forked kernels follow Linus's flagship, the differences between them will remain relatively small.

So this particular disagreement is not all that significant in the long run, and this particular fork will probably go away in 2.5, when the device naming issue gets figured out. But it does indicate a possible series of events in the future. Linus will, one day, no longer be the benevolent dictator of the kernel. But his departure may not be via the feared "hit by a bus" scenario, or via a high-profile passing of the scepter to an anointed successor. Instead, users may wake up one morning and realize that they have been using somebody else's kernel for quite some time, since it better suits their needs. What that Linus guy is doing just won't seem so important anymore. That day won't be here anytime soon, but, in the distant future, it might just happen.

So...now what? Now that The Word has come down that static device numbering is going away, it's time to figure out what will replace it. There are no obvious, front-runner solutions waiting in the wings; instead, a fair amount of discussion will likely be required. Actually, a tiresome, sometimes acrimonious debate extending well into the 2.5 development series seems likely. It looks a lot like a repeat of the devfs wars.

The ultimate shape of the solution is far from clear at this point, but some themes are already apparent.

  • Things are going to change, and static major-number assignments are going to go away. Major numbers and device types will not be tied to each other anymore. Linus sees the current situation as an administrative nightmare, and is impatient with those who would defend it, even in the 2.4 "stable" series. Thus the intemperate, widely-quoted "No more SHIT!" posting.

  • A system's device configuration, as seen in user space, will become simpler. Installed disks, for example, will show up as /dev/disk1, /dev/disk2, and so on, regardless of where the drives are physically installed, and regardless of whether some are SCSI and others are IDE, or just about any other concern.

    Consider, for example, this Linus posting on the naming of network interfaces. They are simply named: eth0, eth1, and so on. It does not matter where they are installed in the system, whether they are 10M or 100M cards, etc. Linus believes all devices should be named this way.

  • Dynamic devices will predominate, to the point that even nailed-down devices will be treated as dynamic. Truly static devices are getting rarer, and Linus, at least, sees no point in maintaining "artificial" distinctions between static and dynamic devices.

  • Device naming will get more dynamic and kernel-driven, but there is great resistance to encoding device naming policy in the kernel. There is also the issue of access control - what permissions these dynamic device files should have. If this problem can be solved to everybody's satisfaction, the rest should seem relatively easy.

As an example of how interesting device naming could get, consider the issue of ioctl calls. Some applications now actually look at major numbers to decide which ioctl commands are safe to apply to a given device. If the device numbers become dynamic, this technique no longer works. A complicating factor is the fact that, despite some effort by the kernel developers, the numbers of the ioctl commands are not all distinct. So one device's "rewind" command could potentially be another's "halt and catch fire" operation. One clearly does not want to mix these things up.
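A toy model makes the hazard concrete. Everything in it is hypothetical: the command number 0x01, the major number 9 as used here, and the names `SAFE_BY_MAJOR` and `app_ioctl_meaning` are invented for illustration and do not describe any real driver's interface.

```python
# Two drivers that happen to reuse the same command number (invented values).
TAPE_IOCTLS = {0x01: "rewind"}
OTHER_IOCTLS = {0x01: "halt and catch fire"}

# The application's built-in belief: "major 9 is the tape drive".
# That belief is only safe while major numbers are statically assigned.
SAFE_BY_MAJOR = {9: TAPE_IOCTLS}


def app_ioctl_meaning(major, cmd, drivers_by_major):
    """Return (what the application expects, what the driver would do)."""
    expected = SAFE_BY_MAJOR.get(major, {}).get(cmd)
    actual = drivers_by_major[major].get(cmd)
    return expected, actual


# With static numbers, major 9 is always the tape driver.
static_boot = {9: TAPE_IOCTLS}
# With dynamic numbers, some other driver may receive major 9 on this boot.
dynamic_boot = {9: OTHER_IOCTLS}

print(app_ioctl_meaning(9, 0x01, static_boot))
print(app_ioctl_meaning(9, 0x01, dynamic_boot))
```

In the second call the application still believes it is asking for a rewind, but the driver behind the number interprets the same command quite differently.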

Various ideas have gone around on how to address this problem, including setting up a way to query devices to see which ioctl interface(s) they support. But Linus has proposed another idea: why not treat the device names as directories and export much of the ioctl functionality that way? Thus, /dev/fd0 might still be a diskette drive, but an access to /dev/fd0/eject would eject the disk. Many of the ioctl issues would be simplified, and it would also make it easier to do things in scripts.

And, of course, this approach would help to preserve backward compatibility by preserving the older interface for applications that have not been changed. To quote Linus one more time:

"It should be a case of 'Just plug in a new kernel, and suddenly your existing filesystem just allows you to do more! 20% more for the same price! AND we'll throw in this useful ginzu knife for just 4.95 for shipping and handling. Absolutely free!'"

As was pointed out, sometimes it appears that Linus has been in the U.S. for a little too long already...
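For the curious, the device-as-directory idea discussed above might be modeled roughly as follows. This is a user-space toy under loud assumptions, not a proposed kernel interface: `make_floppy`, `access`, and the returned strings are hypothetical, and only the /dev/fd0 and /dev/fd0/eject paths come from the discussion.

```python
def make_floppy():
    """Build a toy diskette drive with one named operation, "eject"."""
    state = {"media_present": True}

    def eject():
        state["media_present"] = False
        return "ejected"

    return {"state": state, "ops": {"eject": eject}}


def access(devices, path):
    """Resolve a path such as /dev/fd0 or /dev/fd0/eject against the toy devices."""
    parts = path.strip("/").split("/")
    if len(parts) < 2 or parts[0] != "dev" or parts[1] not in devices:
        raise FileNotFoundError(path)
    device = devices[parts[1]]
    if len(parts) == 2:
        # The old interface is preserved: the name still opens the device.
        return "opened as a plain device"
    if len(parts) == 3 and parts[2] in device["ops"]:
        # A sub-name stands in for what used to be an ioctl command.
        return device["ops"][parts[2]]()
    raise FileNotFoundError(path)


devices = {"fd0": make_floppy()}
print(access(devices, "/dev/fd0"))
print(access(devices, "/dev/fd0/eject"))
```

The operation is selected by name rather than by a command number, so there is nothing to collide with another driver's numbering, and a shell script could trigger it with an ordinary file access.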

Other patches and updates released this week include:

  • Jeff Garzik has a new Tulip Ethernet driver, and is looking for testers to find any remaining problems.

  • A new single-copy pipe patch was posted by Manfred Spraul.

  • James Bottomley has posted a new driver for the NCR Dual 700 SCSI card.

  • CML2 1.4.3 was released by Eric Raymond.

  • Heinz J. Mauelshagen has announced that write access to the CVS repository for the logical volume manager project is now enabled. This step is being taken as part of an effort to open up LVM development and to better integrate it with the rest of the kernel process.

  • A document describing the use of global spinlocks has been posted on the Linux Scalability Effort site. It tries to cover all of the global locks used within the kernel, and to document just what they protect.

Section Editor: Jonathan Corbet


May 17, 2001

Eklektix, Inc. Linux powered! Copyright © 2001 Eklektix, Inc., all rights reserved
Linux ® is a registered trademark of Linus Torvalds