[LWN.net]

See also: last week's Kernel page.

Kernel development


The current development kernel release is 2.5.2, which was released on January 14. The final version of the patch added relatively little to the prepatches: some more scheduling tweaks, a devfs update, and more block device work. It also includes a bug that prevents swap from working properly; people who really want to run 2.5.2 should probably apply this patch.

2.5.3-pre1 came out shortly thereafter. It includes the swap fix, more scheduler work, a parallel port update, and, perhaps most interestingly, the surprise appearance of Andre Hedrick's new ATA (IDE) driver code (see below).

Dave Jones's latest is 2.5.2-dj1. It fixes a number of compilation problems people have encountered in 2.5.3-pre1, adds a scheduler update, and throws in a few other fixes.

Update: it appears that there is a problem with the new ATA driver which can hang systems. Andre is recommending not using 2.5.3-pre1 until he can get a fix out.

The current stable kernel release is still 2.4.17. The 2.4.18 prepatch is up to 2.4.18-pre4; it is restricted to the sorts of fixes and updates one would expect to see in a stable series.

Those looking for a bit more adventure in a 2.4 prepatch may be interested in Alan Cox's return to the "ac" business: 2.4.18-pre3-ac2. This prepatch is more development oriented, with goodies like Rik van Riel's reverse mapping virtual memory, 32-bit UID quota support, and, yes, Andre Hedrick's ATA patches.

The Linux IDE/ATA subsystem. The current Linux ATA (IDE) subsystem is a crucial piece of code. After all, it is responsible for handling I/O to and from the disks that are used on the vast majority of Linux systems; there are good reasons for wanting it to work reliably. So it can be unsettling to hear that subsystem called an unmaintainable hack, complete with the occasional "kooky kludge," and liable to corrupt data. Especially when the person speaking this way is Andre Hedrick, the ATA subsystem's maintainer.

According to Andre, the ATA code's problem comes from its long history. The code has evolved slowly, with ever more complex patches being applied to make it work with new hardware. Any real attempt at design, says Andre, fell by the wayside in the 2.1 series (when the driver was made to support all architectures) and has been absent since. Rigorous testing and validation of the ATA drivers has not been done. There is, in fact, a known (rare) situation, involving the failure of a DMA transfer, that can corrupt data on the disk. Finally, the current driver does not support a fair amount of modern hardware and its new command modes.

What's needed, it is said, is a massively reworked ATA driver that has been redesigned from the beginning, has been verified to work in all situations, and supports current and future hardware. Andre, of course, has such a driver - and has for some time. This code boasts a fairly impressive set of features:

  • It supports the ATA Command Block (ACB) (also known as "taskfile") method of controlling drives. An ACB encapsulates an ATA operation in a way very similar to the analogous SCSI command blocks; it is a successor to the old "go poking a bunch of registers" method of controlling ATA devices. In the short term, ACBs are the key to controlled command sequencing and error handling; they are part of the solution for the occasional data corruption problems. The ACB mode is also required to access a number of newer drive features, and will be mandatory for future hardware (such as serial ATA).

  • A number of new features are already supported. These include 48-bit addressing (needed to make use of those nifty new 160GB drives), tagged command queueing, and expanded chipset support.

  • The drivers have been extensively tested with ATA protocol analyzers and other vendor-supplied test harnesses, and have been shown to work.
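
To make the ACB idea and the 48-bit limit concrete, here is a minimal Python sketch. It is not taken from Andre's patch: the `Taskfile` class and `build_read_dma()` are hypothetical names invented for illustration, and the register layout and the READ DMA (0xC8) and READ DMA EXT (0x25) opcodes follow the ATA command set as we understand it. With 512-byte sectors, 28-bit addressing tops out at 2^28 sectors, or roughly 137GB, which is why a 160GB drive needs the 48-bit form.

```python
from dataclasses import dataclass

SECTOR_BYTES = 512
LBA28_MAX_SECTORS = 1 << 28   # sectors addressable with 28-bit LBA
LBA48_MAX_SECTORS = 1 << 48   # sectors addressable with 48-bit LBA

# Opcodes from the ATA command set.
CMD_READ_DMA = 0xC8        # 28-bit addressing
CMD_READ_DMA_EXT = 0x25    # 48-bit addressing

DEVICE_LBA_MODE = 0x40     # bit 6 of the device register selects LBA mode


@dataclass
class Taskfile:
    """Hypothetical stand-in for an ATA command block (not the kernel's struct)."""
    command: int
    device: int
    count: int
    lba_low: int
    lba_mid: int
    lba_high: int
    # "Previous" register contents; only 48-bit commands use them.
    count_prev: int = 0
    lba_low_prev: int = 0
    lba_mid_prev: int = 0
    lba_high_prev: int = 0


def build_read_dma(lba: int, sectors: int) -> Taskfile:
    """Describe a DMA read as one command block, using 48-bit mode only if needed."""
    if lba < 0 or sectors < 1:
        raise ValueError("need a non-negative LBA and at least one sector")

    if lba + sectors <= LBA28_MAX_SECTORS and sectors <= 256:
        return Taskfile(
            command=CMD_READ_DMA,
            # LBA bits 24-27 travel in the low nibble of the device register.
            device=DEVICE_LBA_MODE | ((lba >> 24) & 0x0F),
            count=sectors & 0xFF,          # a count of 256 is encoded as 0
            lba_low=lba & 0xFF,
            lba_mid=(lba >> 8) & 0xFF,
            lba_high=(lba >> 16) & 0xFF,
        )

    if lba + sectors > LBA48_MAX_SECTORS or sectors > 65536:
        raise ValueError("request does not fit in 48-bit addressing")

    return Taskfile(
        command=CMD_READ_DMA_EXT,
        device=DEVICE_LBA_MODE,
        count=sectors & 0xFF,              # low byte; 65536 is encoded as 0
        lba_low=lba & 0xFF,
        lba_mid=(lba >> 8) & 0xFF,
        lba_high=(lba >> 16) & 0xFF,
        count_prev=(sectors >> 8) & 0xFF,  # high byte of the sector count
        lba_low_prev=(lba >> 24) & 0xFF,
        lba_mid_prev=(lba >> 32) & 0xFF,
        lba_high_prev=(lba >> 40) & 0xFF,
    )
```

A request past the 28-bit boundary, such as `build_read_dma(300_000_000, 8)`, comes back as a READ DMA EXT block with the upper address bytes in the "previous" fields. The point of the structure is that the whole operation is described in one place before any register is touched.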

The new ATA code has been around for a while. Some vendors (e.g. SuSE) have shipped it in their stock kernels. It is a part of Alan Cox's 2.4-ac patches. A number of users swear by it. But it did not make it into the 2.4 kernel, and it only got into the 2.5.3 prepatches just in time to force last-minute revisions to this article. Why has this patch remained on the outside for so long?

For 2.4, the main sticking point would appear to be the size and nature of the patch - 350KB for the 2.4.16 version. Since the patch completely reworks the internals of a vital kernel subsystem, people are understandably a little nervous about it. This large patch does not fit into the slow, evolutionary nature of much kernel development; it cannot be broken up into small, simple patchlets.

In recognition of the natural reluctance to include a patch of this nature, the patch is designed (1) to allow the use of the old code paths when so instructed, and (2) to be selectable as a separate configuration option. Even so, Linus never wanted to include it. Marcelo Tosatti, the current 2.4 maintainer, does intend to include the patch in the future, when it has seen some more testing.

On the 2.5 side, the block I/O work got in first. Andre suggests that it might have been better to merge in a proven and verifiable ATA layer before thrashing the upper block I/O layers, but that is not how it happened. Now that the block changes have stabilized (for now), the ATA patch has been slipped in. Barring unforeseen problems, it should be a part of the 2.5.3 release.

Part of the problem, though, has been with Andre's approach to communication with the rest of the kernel developers. He tends at times toward volume and defensiveness, and has managed to annoy a number of people. Linus essentially refused to deal with him for a while, telling him to work through Jens Axboe, the block layer maintainer, instead (though that situation has since improved). Difficult personalities are not hard to come by in free software development communities, but it remains true that it can be harder to get your code included if you are hard to work with.

In any case, the situation seems close to a resolution. The code will see wide testing in both the 2.4-ac and 2.5.x kernels, and it should eventually find its way into the 2.4 kernel as well. Now it must be time to get one of those 160GB disks...

Nailing down initramfs. Part of the 2.5 plan for some time has been the merging of Alexander Viro's initramfs patch. This patch was covered on this page last August; it creates an initial ramdisk containing user-space code which completes the boot process. The contents of this ramdisk are appended to the kernel image itself. The idea is to move boot-time code out of the kernel entirely and to allow greater control over the system initialization process.

One question that is being considered now is: what, exactly, will people want to put in the initramfs image? Greg Kroah-Hartman has been polling people on this question as a way of figuring out what sort of C library will be required. Some of the things that have come up include:

  • Versions of fsck for the popular filesystems. Putting the checker into the initramfs image would allow checking of the root filesystem before it is mounted, which would be a good thing.

  • Partition discovery code. The code that figures out how a particular disk drive is partitioned currently lives in the kernel, but it need not really be there.

  • The full hotplug support mechanism, since most or all devices will be treated as hotpluggable in the future (but we'll get to that in a moment).

  • Network discovery tools, such as the DHCP client.

  • The full busybox tool suite.

Adding in busybox would make a 2.5 kernel into a complete, standalone, runnable system - though the kernel image would start to get pretty large.
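
As an illustration of the partition discovery item in the list above, here is a minimal user-space sketch in Python that reads a classic PC (MBR) partition table from the first sector of a disk. It is not the kernel's code and not anything proposed for initramfs; `parse_mbr()` and the `Partition` tuple are hypothetical names, and only the primary table is handled (no extended partitions, no other partitioning schemes).

```python
import struct
from typing import List, NamedTuple

SECTOR_SIZE = 512
TABLE_OFFSET = 446            # the partition table starts here in sector 0
ENTRY_SIZE = 16               # four 16-byte entries follow
BOOT_SIGNATURE = b"\x55\xaa"  # last two bytes of a valid MBR sector


class Partition(NamedTuple):
    index: int        # 1-based slot in the primary table
    bootable: bool
    type_id: int      # e.g. 0x83 for a Linux filesystem
    start_lba: int    # first sector of the partition
    sectors: int      # length in sectors


def parse_mbr(sector: bytes) -> List[Partition]:
    """Return the non-empty primary partitions described by an MBR sector."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("need a full 512-byte sector")
    if sector[510:512] != BOOT_SIGNATURE:
        raise ValueError("missing 0x55AA boot signature")

    partitions = []
    for slot in range(4):
        offset = TABLE_OFFSET + slot * ENTRY_SIZE
        # Layout: status, CHS start (3 bytes), type, CHS end (3 bytes),
        # then little-endian 32-bit start LBA and sector count.
        status, _chs_start, type_id, _chs_end, start, count = struct.unpack_from(
            "<B3sB3sII", sector, offset
        )
        if type_id == 0:      # type 0 marks an unused slot
            continue
        partitions.append(Partition(slot + 1, status == 0x80, type_id, start, count))
    return partitions
```

Given the first 512 bytes of a disk, `parse_mbr()` returns one tuple per used slot. Nothing here needs to run in kernel mode, which is the argument for moving this sort of code out to user space.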

All this leads to the question of how the ramdisk image will be built, and where the code will live. Some of the code (such as that which finds and mounts the root filesystem) comes straight from the kernel, and seems to be tightly tied to it. Perhaps it should remain part of the kernel distribution. On the other hand, very few people think that busybox should be added to the kernel tree.

So the kernel build process is probably going to have to get a little more complicated. Some kernel initramfs code will have to be merged in with other utilities which are maintained externally, and the whole mess will become the bootable kernel image. This one may take a little while to straighten out.

Those who are curious about what the initramfs image will actually look like can go to the draft specification of the initramfs buffer format.
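
The draft itself is not reproduced here. For a rough feel of what such a buffer might contain, the following Python sketch builds a tiny archive in the cpio "newc" layout, which is the format later kernel documentation describes for initramfs; that the draft matches it in every detail is an assumption on our part. The helper names `newc_entry()` and `build_archive()` are hypothetical, and ownership, timestamps, and device numbers are simply zeroed.

```python
from typing import Dict


def _pad4(data: bytes) -> bytes:
    """Pad with NUL bytes up to the next multiple of four."""
    return data + b"\0" * (-len(data) % 4)


def newc_entry(name: str, data: bytes = b"", mode: int = 0o100644, ino: int = 0) -> bytes:
    """Encode one archive member: 110-byte ASCII header, name, then file data."""
    name_bytes = name.encode() + b"\0"       # the stored name is NUL-terminated
    fields = [
        ino,              # inode number
        mode,             # file type and permissions
        0, 0,             # uid, gid
        1,                # link count
        0,                # mtime
        len(data),        # file size
        0, 0,             # device major, minor
        0, 0,             # rdev major, minor
        len(name_bytes),  # name size, including the NUL
        0,                # checksum (unused in plain "newc")
    ]
    header = b"070701" + b"".join(b"%08X" % f for f in fields)
    # Header plus name is padded to four bytes, and so is the data.
    return _pad4(header + name_bytes) + _pad4(data)


def build_archive(files: Dict[str, bytes]) -> bytes:
    """Concatenate the members and finish with the conventional trailer entry."""
    body = b"".join(
        newc_entry(name, data, ino=i + 1) for i, (name, data) in enumerate(files.items())
    )
    return body + newc_entry("TRAILER!!!", mode=0)
```

Calling `build_archive({"init": b"#!/bin/sh\n"})` yields a byte string that could, under the assumptions above, be appended to a kernel image as the initial ramdisk contents. Real tools also record directories, device nodes, and ownership, which this sketch leaves out.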

Alan Cox also let slip another part of the plan for initramfs; this one is proving a little more controversial. It seems that kernel modules will go into the initramfs image as well. In fact, there will no longer be such a thing as a compiled-in driver; all kernels will have to load drivers (and other components) as modules from the initramfs.

Not everybody likes this idea. Many people build kernels with no loadable module support at all, and wish to continue doing so. Their reasons include:

  • Security. Some people feel safer if there is not an easy way to patch code into their running kernels. The fact of the matter, though, is that the Bad Guys figured out how to modify a running kernel some time ago, whether or not that kernel has loadable module support.

  • Performance. For a number of reasons, modular code runs a little more slowly, especially on some architectures. See the November 15, 2001 LWN Kernel Page for more information on why. Performance is a real issue, but it appears that it can be dealt with.

If one accepts that security is a non-issue and that the performance problems can be solved, and given that the plan is to treat even nailed-down hardware as if it were hotpluggable, this change seems fairly likely to happen. Expect the 2.5 kernel to look rather different from its predecessors.

Section Editor: Jonathan Corbet


January 17, 2002

Eklektix, Inc. Linux powered! Copyright © 2002 Eklektix, Inc., all rights reserved
Linux ® is a registered trademark of Linus Torvalds