[LWN.net]

Kernel development


The current kernel release is 2.4.10. The latest prepatch from Linus is 2.4.11-pre2, released on October 1. It includes a number of tweaks, among them a fix for the unpleasant oops that afflicted some -pre1 users, and a bunch of merging from the "ac" series. Among other things, the new license tags (see the September 6 kernel page) are going into the standard kernel.

The latest prepatch from Alan Cox is 2.4.10-ac4. It includes most of the 2.4.10 changes, but has explicitly left out the massive virtual memory changes and other "seriously unsafe stuff." As Alan says: "I actually use the trees I release and I want to keep my machines working."

For those who find current kernels to be a bit too much on the bleeding edge, Mikulas Patocka has released a patch to the 0.01 kernel fixing a bug in the disk request sorting algorithm. Linus responded by offering to make Mikulas the official maintainer of the 0.01 series. Time to plug that 386 back in and help out...

On the licensing of security modules. A compromise wording has been worked out for the Linux Security Module interface; the include file will now carry this statement:

This file is GPL. See the Linux Kernel's COPYING file for details. There is controversy over whether this permits you to write a module that #includes this file without placing your module under the GPL. Consult your lawyer for advice.

Meanwhile, it appears that the EXPORT_SYMBOL macro, which makes kernel functions and data structures available to modules, will be augmented with a new EXPORT_SYMBOL_GPL variant. The new tag, clearly, will make a symbol available only to GPL-licensed code; the new license tags should make it possible to enforce that restriction automatically. Once EXPORT_SYMBOL_GPL is in place, the security module code may switch over to using it. Maybe.

There is, however, no plan to switch any existing symbols over to a GPL-only mode. Alan says:

Linus has made it absolutely (as in he'll send out the killer penguin with chainsaw if need be) clear that existing symbols wont mysteriously turn GPL only.

So authors of existing, proprietary modules need not worry that they will lose access to kernel symbols in the future.
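To make the mechanism concrete, here is a toy Python model of license-gated symbol resolution. It is only a sketch of the idea described above, not kernel code: the SymbolTable class, its method names, the symbol names, and the set of license strings treated as GPL-compatible are all assumptions made for illustration.

```python
# Toy model of license-gated symbol export. Not kernel code: every name
# below is hypothetical and exists only to illustrate the idea.

# Illustrative assumption: which license strings count as GPL-compatible.
GPL_COMPATIBLE = {"GPL", "Dual BSD/GPL"}


class SymbolTable:
    def __init__(self):
        self._symbols = {}  # name -> (value, gpl_only)

    def export_symbol(self, name, value):
        """Model of EXPORT_SYMBOL: visible to any module."""
        self._symbols[name] = (value, False)

    def export_symbol_gpl(self, name, value):
        """Model of EXPORT_SYMBOL_GPL: visible to GPL-licensed modules only."""
        self._symbols[name] = (value, True)

    def resolve(self, name, module_license):
        """Look up a symbol on behalf of a module with the given license tag."""
        value, gpl_only = self._symbols[name]
        if gpl_only and module_license not in GPL_COMPATIBLE:
            raise PermissionError(f"{name} is exported to GPL modules only")
        return value


table = SymbolTable()
table.export_symbol("old_helper", "available to all")
table.export_symbol_gpl("security_hook", "GPL only")

print(table.resolve("old_helper", "Proprietary"))
print(table.resolve("security_hook", "GPL"))
```

In this model a module declaring a proprietary license can still resolve anything exported the ordinary way, while a request for security_hook from such a module raises an error at resolution time.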

Dealing with high network loads. The 2.4 networking stack works quite well, for the most part. It does have one issue, however: dealing with extremely high load. When very large numbers of packets are coming into a system, interrupt processing tends to push all other work aside. In the best case, no user-space work gets done. High loads can also bring a 2.4.10 system down entirely.

Ingo Molnar decided to address this problem, and some related ones having to do with the processing of software interrupts. His patch implements a technique called "soft mitigation." Essentially, if the hardware interrupt rate exceeds a given threshold, the kernel simply disables that interrupt for a timer tick interval (10ms on most systems). The system thus gets a break in which it can catch up.
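As a rough illustration of the idea, the following Python sketch models a single interrupt line that is masked once its per-tick count exceeds a threshold. This is not Ingo's patch: the MitigatedIrq class, the threshold value, and the simplification that the line stays masked until the next timer tick are all assumptions.

```python
# Toy model of threshold-based interrupt mitigation. The class name, the
# threshold, and the "masked until the next tick" rule are hypothetical.

class MitigatedIrq:
    def __init__(self, threshold):
        self.threshold = threshold  # interrupts tolerated per tick
        self.count = 0              # interrupts seen in the current tick
        self.disabled = False       # True while the line is masked
        self.handled = 0
        self.suppressed = 0

    def interrupt(self):
        """Hardware raises the interrupt; return True if it was serviced."""
        if self.disabled:
            self.suppressed += 1
            return False
        self.count += 1
        self.handled += 1
        if self.count > self.threshold:
            # Rate exceeded: mask the line until the next timer tick.
            self.disabled = True
        return True

    def timer_tick(self):
        """Timer tick: start a new interval and unmask the line."""
        self.count = 0
        self.disabled = False


irq = MitigatedIrq(threshold=3)
burst = [irq.interrupt() for _ in range(6)]
irq.timer_tick()
print(burst, irq.interrupt())
```

Note that nothing in the model knows which device raised the interrupt, which is why any other device sharing the line would be silenced along with the offender.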

There are, however, some problems with this approach. No single constant threshold works for all situations; the maximum tolerable interrupt rate depends on a great many things, including the CPU speed, the cost of servicing the interrupt, and what else is happening on the system. Disabling the offending interrupt is easy (no cooperation from the driver or hardware is required), but it is hard on the performance of any other device that may be sharing the same interrupt line. And shutting down interrupts on a network interface for 10ms can cause it to start dropping packets in a big way, creating serious network performance problems.

The biggest problem, however, may be that another solution exists and has been in testing for some time. The NAPI ("New API") code, developed by Jamal Hadi Salim, Robert Olsson, and Alexey Kuznetsov, deals with interrupt load problems and much more. The NAPI work is based on the techniques discussed at the Kernel Summit last March, but the work has progressed since then. It has not, perhaps, received the degree of attention that it should have, though this discussion has raised its profile somewhat. Now, if only the project had a proper web site, it might become truly widely known...

NAPI works with modern network adaptors which implement a "ring" of DMA buffers; each packet, as it is received, is placed into the next buffer in the ring. Normally, the processor is interrupted for each packet, and the system is expected to empty the packet from the ring. The NAPI patch responds to the first interrupt by telling the adaptor to stop interrupting; it will then check the ring occasionally as it processes packets and pull new ones without the need for further interrupts.

People who have been on the net for a long time might appreciate this analogy: back in the 1980s, many of us had our systems configured to beep (interrupt) at us every time an email message arrived. In 2001, beeping mail notifiers are far less common. There's almost always new mail, so there's no need for the system to be obnoxious about it. Similarly, on a loaded system, there will always be new packets to process, so there is no need for all those interrupts.

When the networking code checks an interface and finds that no more packets have arrived, interrupts are reenabled and polling stops.
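The interrupt-then-poll cycle described above can be sketched in a few lines of Python. This is a simplified model rather than the NAPI code itself: the Nic class, its receive() and poll() methods, and the budget parameter (a cap on packets pulled per poll) are hypothetical names introduced for illustration.

```python
# Toy model of interrupt-then-poll packet reception. All names and the
# "budget" parameter are assumptions; this is not the actual NAPI interface.
from collections import deque


class Nic:
    def __init__(self, ring_size):
        self.ring = deque()      # stands in for the DMA buffer ring
        self.ring_size = ring_size
        self.irq_enabled = True
        self.interrupts = 0      # interrupts actually delivered to the CPU

    def receive(self, packet):
        """Adaptor side: place a packet in the ring, interrupt if allowed."""
        if len(self.ring) >= self.ring_size:
            return               # ring full: the adaptor discards the packet
        self.ring.append(packet)
        if self.irq_enabled:
            self.interrupts += 1
            # The handler's first act is to tell the adaptor to stop
            # interrupting; polling takes over from here.
            self.irq_enabled = False

    def poll(self, budget):
        """Stack side: pull up to `budget` packets without any interrupts."""
        pulled = []
        while self.ring and len(pulled) < budget:
            pulled.append(self.ring.popleft())
        if not self.ring:
            # Nothing left to do: re-enable interrupts and stop polling.
            self.irq_enabled = True
        return pulled


nic = Nic(ring_size=8)
for n in range(5):
    nic.receive(n)
print(nic.interrupts, nic.poll(3), nic.poll(3), nic.irq_enabled)
```

Five packets arrive here at the cost of a single interrupt; the second poll empties the ring, so the next arrival will interrupt again.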

NAPI takes things a little further by eliminating the packet backlog queue currently maintained in the 2.4 network stack. Instead, the adaptor's DMA ring becomes that queue. In this way, system memory is conserved, packets are less likely to be reordered, and, if the load requires that packets be dropped, they will be disposed of before ever being copied into the kernel.
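The saving from dropping early can be seen in a small Python comparison. Both functions are deliberately crude sketches under stated assumptions: a burst arrives with no processing in between, the function names and queue limits are invented, and "copies" simply counts the packets the kernel had to take into its own memory.

```python
# Toy comparison of where overload drops happen. Function names and the
# burst scenario are hypothetical; neither function is real kernel logic.

def drop_after_copy(packets, backlog_limit):
    """Backlog-queue sketch: each packet is copied first, then queued or dropped."""
    copies = 0
    dropped = 0
    backlog = []
    for packet in packets:
        copies += 1                      # work is done before the queue check
        if len(backlog) < backlog_limit:
            backlog.append(packet)
        else:
            dropped += 1                 # the copy was wasted effort
    return copies, dropped


def drop_in_ring(packets, ring_size):
    """Ring-as-queue sketch: overflow is discarded in the adaptor's ring."""
    dropped = 0
    ring = []
    for packet in packets:
        if len(ring) < ring_size:
            ring.append(packet)
        else:
            dropped += 1                 # never reaches kernel memory
    copies = len(ring)                   # only surviving packets get pulled
    return copies, dropped


burst = list(range(10))
print(drop_after_copy(burst, 4), drop_in_ring(burst, 4))
```

Both models lose the same six packets from a burst of ten, but the first pays for ten copies while the second pays for four.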

NAPI requires some changes to the network driver interface, of course. The changes have been designed to be incremental, though. Drivers which have not been converted will continue to function as always (well, at least, as in 2.4.x), but the higher performance enabled by NAPI will require modifications.

Linus likes the NAPI approach, but has said nothing about when it might be merged. One would normally expect it to go into 2.5, with a possible backport to 2.4 later. In the modern world, though, one never knows... It is also possible that parts of Ingo's patch may end up being used as a last-resort, "save the system" response.

Those interested in NAPI can download the USENIX paper describing the techniques used. The actual code is available from Robert Olsson's FTP site.

Other patches and updates released this week include:

  • Jaroslav Kysela has released version 0.9.0beta8 of the ALSA sound driver system.

  • kdb v1.9 has been released by Keith Owens.

  • Also from Keith is modutils 2.4.10.

  • Robert Love has posted a new version of his patch which enables network devices to contribute to the random entropy pool.

  • Version 1.0.6 of IBM's Journaling Filesystem was announced by Steve Best.

  • Dave Jones has released Powertweak 0.99.4, a tuning and hardware configuration tool.

  • A new single-copy pipe implementation was posted by Manfred Spraul.

  • Jeremy Elson has announced the first public release of the "Framework for User-Space Devices," which allows a user-space daemon to handle operations to device files.

  • A preemptible kernel patch was released by Robert Love.

  • The third release candidate of LVM 1.0.1 was announced by Heinz Mauelshagen.

  • Loop-AES-v1.4e, a file and swap encryption module, was released by Jari Ruusu.

Section Editor: Jonathan Corbet


October 4, 2001

Copyright © 2001 Eklektix, Inc., all rights reserved
Linux ® is a registered trademark of Linus Torvalds