Kernel development


The current kernel release is 2.4.9, which was released on August 16. Linus then jetted off to Finland, so no prepatches have been forthcoming. 2.4.9 has compilation errors for a subset of users (see below), but appears stable for most.

The EMU10K1 (SB Live) driver is stabilizing in 2.4.9. Note, however, that you'll need the user-space tools from Creative to make full use of the sound card with the new driver.

Alan Cox's latest is 2.4.8ac9. As can be seen by the name, Alan has not yet synched up to the 2.4.9 release (he "doesn't see the point" right now), but he has included the usual vast list of fixes and improvements.

Those interested in how the Linus and ac kernels relate may want to take a look at this explanation from Alan, posted on Slashdot.

With the -ac tree I try and do rapid rolling releases, sucking in new code to test it and also its interactions with other new code. By doing releases every few days I get a high number of people testing and reporting bugs before there are too many possible causes. This is how Linus trees used to work long ago, and I still think its the better technique.

The redefinition of min() and max(). C programmers since the very beginning of the language have been familiar with the min() and max() macros, which are usually taken directly out of the first edition of the K&R book as:

    #define max(A,B) ((A) > (B) ? (A) : (B))
    #define min(A,B) ((A) < (B) ? (A) : (B))

Certainly kernel programmers have been reading their K&R - a quick grep of the 2.4.8 source turns up more than 150 individual definitions of min(). Usually, when a body of code contains that many duplicated definitions, it's time to consider a cleanup.

So, perhaps, few people were surprised when 2.4.9 included a common definition (in linux/kernel.h) for these two little macros. When, however, various modules started turning up compilation errors, and it turned out that the new min() and max() have a different interface, people were surprised indeed, and not particularly pleased. Interface changes during this stable series have become almost commonplace, but few people expected to see a change to something so common and fundamental.

It seems that the new min() and max() take a third argument: the type of the data being compared. So, to get the minimum of two integers, one would code:

    minimum = min(int, a, b);

There are perfectly good reasons for doing things this way; when values of different types are being compared, the third argument determines the type that will be used to do the comparison. It makes the conversion explicit, and forces people to think about what they are doing.
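The effect of the explicit type is easiest to see when a signed value meets an unsigned one. What follows is only a sketch: typed_min() is a hypothetical, portable stand-in written for illustration (it is not the definition that went into linux/kernel.h), and classic_min() is the K&R form under a different name.

```c
/* Classic two-argument form, as found in K&R. */
#define classic_min(A, B) ((A) < (B) ? (A) : (B))

/*
 * Hypothetical three-argument form: both operands are converted to
 * `type` before they are compared. This portable version evaluates
 * its arguments more than once; it is an illustration only and not
 * the kernel's own macro.
 */
#define typed_min(type, a, b) \
    ((type)(a) < (type)(b) ? (type)(a) : (type)(b))

/*
 * Classic comparison of a signed length against an unsigned limit.
 * The usual arithmetic conversions silently turn len into an
 * unsigned int, so a negative len looks like a huge value.
 */
static long classic_result(int len, unsigned int limit)
{
    return (long)classic_min(len, limit);
}

/* The same comparison, with the caller stating that it is signed. */
static long typed_result_signed(int len, unsigned int limit)
{
    return (long)typed_min(int, len, limit);
}
```

With len = -1 and limit = 16, classic_result() yields 16 (the -1 was compared as a very large unsigned number), while typed_result_signed() yields -1. Neither answer is right for every caller; the point of the third argument is that the programmer has to pick one.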

Unfortunately, it also breaks quite a bit of existing code. Any code which defines its own version of these macros will end up with compilation errors. Even worse, for many, is the fact that there is no way to create backward compatibility macros to cover over the difference. A driver using these macros which compiles for 2.4.9 will not compile for earlier versions of the kernel. Linus has tended not to be sympathetic toward developers who are trying to maintain portability to older kernels, but there are quite a few of them trying to do so anyway.

The question that has come up, of course, is: if Linus wanted a type-aware variant of min() and max(), why didn't he create something with a new name (e.g. typed_min()) and leave the classic macros alone? The answer seems to be that Linus wants to eradicate the old, two-argument macros from the kernel altogether, and so (by way of David Miller) chose an approach that would break code that has not been fixed. Doing things this way can produce more maintainable code in the long term, at the cost of some real short-term pain.

But one would not normally make such a change in the middle of a stable kernel series, and certainly not to something as well understood as min() and max().

Linus, of course, put this change out in 2.4.9 final and immediately fled the country; the cynical among us might surmise that he knew there would be some discontent. And discontent there is; among other things, Alan Cox does not plan to merge this change into the "ac" series; he'll make a typed_min() and typed_max() instead. Linus does not often back down on such decisions, though; it will be interesting to see how this one resolves itself.

Feeding entropy from network devices. The Linux kernel provides two pseudo-devices which generate random numbers: /dev/random and /dev/urandom. They both provide (seemingly) random numbers to applications, but they differ in one regard: /dev/random works much harder to ensure that the returned numbers are truly random.

The random number generator works through the maintenance of an "entropy pool," a collection of random data which has been collected from outside sources. The most common source of entropy (randomness) in Linux systems is device interrupts; the time between keystrokes or disk interrupts is unpredictable enough to provide a degree of true randomness that cannot be had from a software-only random number generator. Each random event adds a certain amount of entropy to the pool. If an application reads random data from /dev/random, the kernel will make sure that there is sufficient entropy in the pool to return truly random numbers; if the entropy is inadequate, the read will block until sufficient entropy has been generated. /dev/urandom, instead, will generate numbers (using a secure hash algorithm) regardless of whether sufficient entropy exists; it never blocks waiting for entropy.
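From user space, both devices are read like ordinary files. The helper below is a minimal sketch of that; the name read_random_bytes() is made up for this example. Pointing it at /dev/random instead of /dev/urandom uses the same code path, except that the read may then block while the pool is short of entropy.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Fill buf with len bytes read from a random device such as
 * "/dev/urandom". Returns 0 on success and -1 on any error.
 * (Hypothetical helper, for illustration only.)
 */
static int read_random_bytes(const char *path, unsigned char *buf, size_t len)
{
    size_t got = 0;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    /* A single read() may return fewer bytes than requested. */
    while (got < len) {
        ssize_t n = read(fd, buf + got, len - got);

        if (n < 0 && errno == EINTR)
            continue;           /* interrupted by a signal: retry */
        if (n <= 0) {
            close(fd);
            return -1;          /* real error or unexpected EOF */
        }
        got += (size_t)n;
    }

    close(fd);
    return 0;
}
```

A caller wanting a 16-byte key would declare `unsigned char key[16];` and call `read_random_bytes("/dev/urandom", key, sizeof key)`.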

In theory, that difference means that a sufficiently clever attacker could, perhaps, predict the random numbers that will be generated by /dev/urandom. Using the predicted numbers, the attacker could proceed to make a mess of any cryptographic or security code using /dev/urandom. Such an attack remains entirely theoretical, however; it would be in no way easy, and nobody has ever demonstrated a way of successfully predicting Linux's random numbers.

Nonetheless, people worry, and many applications will only use random data from /dev/random. On some systems, this can lead to problems if the system is not generating enough entropy; suddenly ssh connections take a long time to start up, and things get unresponsive in general. Network firewalls, with no keyboard and little or no disk activity, can be especially susceptible to this problem.
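One way to check whether a slow system is starved for entropy is to look at the kernel's own estimate. The sketch below assumes that the estimate is exported, in bits, through /proc/sys/kernel/random/entropy_avail; that path is an assumption about the running kernel, and the helper name read_entropy_avail() is invented for this example.

```c
#include <stdio.h>

/*
 * Read a single decimal number (the kernel's entropy estimate, in
 * bits) from the given file. The usual path is assumed to be
 * /proc/sys/kernel/random/entropy_avail. Returns -1 if the file
 * cannot be opened or does not start with a number.
 */
static long read_entropy_avail(const char *path)
{
    long bits = -1;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &bits) != 1)
        bits = -1;
    fclose(f);
    return bits;
}
```

A value that sits near zero on an idle firewall would be consistent with the slow ssh startups described above.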

The answer, seemingly, would be to use the arrival of network packets as another source of entropy. Historically, this source of entropy has been avoided, since network traffic is susceptible to observation and manipulation by an attacker. In a highly paranoid world, one might worry about an attacker watching network traffic in an effort to predict the contents of the entropy pool on a target system; the attacker could also feed precisely-timed packets to the target in the hopes of influencing random number generation there. Once again, nobody has ever gotten close to demonstrating an attack of this nature, but if security people didn't worry they would have little to do.

Now, however, Robert Love has submitted a patch which allows the system to use entropy from network traffic, subject to a kernel configuration option. There is some real opposition to the patch; some people feel that network entropy should not be treated as entropy at all, and that applications should just be using /dev/urandom in these cases. The wider consensus, however, is that sometimes network entropy is the best you can get, and that it makes sense to give the user a choice of whether to use it. After all, when, ten years from now, some super cracker develops a network entropy exploit, you can always turn the feature off.
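The accounting at stake can be pictured with a toy model. Everything in the sketch below is hypothetical (the structure, the function names, and the credit_network flag standing in for the configuration option); it is not taken from Robert Love's patch or from the kernel's random driver, and it models only the bookkeeping, not the mixing of data into the pool.

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_source { TOY_KEYBOARD, TOY_DISK, TOY_NETWORK };

struct toy_pool {
    unsigned int entropy_bits;   /* current entropy estimate */
    unsigned int capacity_bits;  /* the estimate never exceeds this */
    bool credit_network;         /* stand-in for the config option */
};

/* Credit the pool for one event, unless its source is not trusted. */
static void toy_add_event(struct toy_pool *p, enum toy_source src,
                          unsigned int bits)
{
    unsigned int room = p->capacity_bits - p->entropy_bits;

    if (src == TOY_NETWORK && !p->credit_network)
        return;
    p->entropy_bits += (bits < room) ? bits : room;
}

/*
 * /dev/random-style read: hand out only as many bytes as the estimate
 * covers. A return of 0 stands for "the caller would block".
 */
static size_t toy_blocking_read(struct toy_pool *p, size_t want_bytes)
{
    size_t avail = p->entropy_bits / 8;
    size_t n = (want_bytes < avail) ? want_bytes : avail;

    p->entropy_bits -= (unsigned int)(n * 8);
    return n;
}
```

On a headless firewall with the flag off, network events never raise the estimate and toy_blocking_read() keeps returning 0; with the flag on, the same traffic keeps readers supplied, which is the trade-off the configuration option leaves to the administrator.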

New no-bounce high memory I/O patches have been posted by Jens Axboe (see also the quick update that came out shortly afterward). This patch is rapidly approaching a state of readiness, and, with luck, should find its way into a stable kernel sometime soon. It eliminates the need to use "bounce buffers" on systems with large (multiple GB) amounts of memory, even on systems where the kernel does not directly address high memory. One user has reported a 40% performance increase when running with the patch.

Incorporated into Jens' patch is the new 64-bit PCI DMA interface designed by David Miller. He has also posted the PCI64 patches separately for those who would like to take a look at them. With these patches, DMA I/O is possible on systems with very large amounts of memory (more than 4GB) if the hardware is up to the task. There is also a later revision of this patch available. Between these two efforts, the kernel's support for high-end systems will be much improved.
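Why the two patches fit together can be shown with a rough model of the decision a block layer has to make: a buffer needs a bounce copy when the device cannot address the memory it lives in. The names below (toy_needs_bounce() and the TOY_DMA_MASK constants) are invented for this sketch and are not the interfaces from either patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Highest physical address a device can reach, expressed as a mask. */
#define TOY_DMA_MASK_32 UINT64_C(0x00000000ffffffff)
#define TOY_DMA_MASK_64 UINT64_C(0xffffffffffffffff)

/*
 * Return true if a buffer of len bytes at physical address phys_addr
 * extends beyond what the device can address, in which case the data
 * would have to be copied through a low-memory bounce buffer.
 */
static bool toy_needs_bounce(uint64_t phys_addr, uint64_t len,
                             uint64_t dma_mask)
{
    uint64_t last;

    if (len == 0)
        return false;
    last = phys_addr + len - 1;
    if (last < phys_addr)       /* address arithmetic wrapped around */
        return true;
    return last > dma_mask;
}
```

In this model a page sitting at the 5GB mark must be bounced for a device limited to 32-bit addresses, but can be handed directly to hardware that accepts 64-bit addresses.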

Other patches and updates released this week include:

  • Ben LaHaise has a new patch which performs merging of virtual memory areas, addressing the problem discussed in the August 9 LWN Kernel Page. Ben claims a significant speedup for Mozilla when running with this patch.

  • A QNX4 filesystem implementation with write support has been made available by Sergey Tzukanov.

  • Keith Owens has posted modutils 2.4.7.

  • Ben Greear has released a patch which adds 802.1Q VLAN support to the kernel.

  • devfs 189 and devfsd v1.3.17 have been released by Richard Gooch.

  • Andrew Morton has released an ext3 filesystem patch for 2.4.9.

  • Rik van Riel has posted a patch allowing interested people to tune the virtual memory system on their machines. It's intended mostly for VM hackers and those running benchmarks.

  • Wichert Akkerman has released strace 4.4.

  • A version of User-mode Linux for 2.4.9 has been released by Jeff Dike.

  • Steve Best has announced release 1.0.3 of IBM's Journaling Filesystem.

  • Greg Kroah-Hartman has released a Compaq hotplug PCI driver for 2.4.8ac8.

  • Also from Greg is a security module release for the 2.4.9 kernel. The security module hackers are beginning to think about submitting the patch for inclusion into the kernel.

  • A new release of the IQ80310 board port has been announced.

Section Editor: Jonathan Corbet


August 23, 2001

Copyright © 2001 Eklektix, Inc., all rights reserved
Linux ® is a registered trademark of Linus Torvalds