[LWN.net]


Kernel development


The current development kernel release is 2.3.50. This version includes a reworked Cirrus Logic ethernet driver, a new bttv driver, much SuperH processor work, an Appletalk reorganization, a parallel port driver update, a large sound driver update, and a Hercules frame buffer driver.

The current stable kernel release is 2.2.14. The 2.2.15 prepatch is up to 2.2.15pre13; there will probably be at least one more prepatch iteration before the real 2.2.15 release comes out.

Out-of-memory behavior is one of the remaining issues with 2.2.15. The Linux kernel will, in high-load (or low-memory) situations, promise more memory to processes than it can deliver. If too many processes try to cash in those promises, the system reaches a point where something has to give. The 2.2 series has been criticized for some time for its behavior in out-of-memory (OOM) situations; a couple of hackers are now working on making things better for 2.2.15.

Why does the kernel overcommit memory? Overcommitting is done because processes often ask for memory that they will never use. The most common case of this is the fork() system call. A process that forks could, in theory, require two copies of all its writable memory (read-only memory, such as program code, can be shared). But the kernel only makes those copies on a page-by-page basis, when a page is actually modified. And, in most cases, the process which forked quickly goes on to exec() a different program, and all that memory is discarded without ever having been touched.

Consider a typical large process - emacs, say. If an emacs user runs a "compile" command, emacs will fork, then run make to do the compile. All that emacs memory - which can be substantial - is never needed in the forked copy of the process. If Linux had to actually provide for all that memory, the capability of the system would be much reduced.
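The fork-then-exec pattern described above can be sketched in a few lines. This is only an illustration of the pattern, not of anything emacs actually does; the helper name run_via_fork_exec() is made up for this example. The child shares the parent's pages until exec() throws them away, so none of the parent's writable memory ever needs to be copied.

```python
import os
import sys


def run_via_fork_exec(argv):
    """Fork, exec argv in the child, and return the child's exit status.

    Hypothetical helper for illustration only.
    """
    pid = os.fork()
    if pid == 0:
        # Child: shares the parent's pages copy-on-write until exec()
        # replaces the whole address space with the new program.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; never fall back into parent code.
            os._exit(127)
    # Parent: wait for the child and decode its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    # Run a trivial child program, much as an editor would run "make".
    print(run_via_fork_exec([sys.executable, "-c", "pass"]))
```

However large the parent is, the child here touches almost none of the memory it was nominally promised by fork().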

Thus overcommitting is necessary, and works almost all of the time. Occasionally, however, Linux will come up short, and will not have memory to give a process that needs it. One obvious response is to simply put the process to sleep, and not wake it until memory becomes available again. Unfortunately, that road can lead very quickly to most of the interesting processes on the system being stuck in uninterruptible sleep. At that point, they cannot even be killed to recover their memory, and the system locks up.

An alternative, being pursued by Rik van Riel and Andrea Arcangeli, is to conclude that the system simply cannot carry the current load of processes, and kill one or more of them off. It is not an ideal situation, but hopefully it will leave the system in a running state with most of its processes intact.

But...which process do you kill? The answer turns out to be far from easy. The naive approach might be to kill the biggest process, with the idea that it's the one causing the problem. But the big processes tend to be things like the X window server, the aforementioned emacs editor, or some sort of specialized cranker that is the reason for the system's existence in the first place. Killing those processes can lead to lost work and highly irate users.

Killing processes at random also does not work. If init goes away, there will not be much of the system left to save. A process that is directly manipulating hardware (such as the X server again) may leave that hardware in an unusable state. And so on.

Current attempts at proper OOM behavior try to pick out processes which (1) have run for a relatively short time, (2) are not running as root, and (3) are not doing privileged I/O operations. An OOM killer using those guidelines went into 2.2.15pre12, but was removed from pre13 after some complaints. With luck the kernel hackers will be able to get something more robust together for pre14 which will survive wider testing and give Linux 2.2 decent OOM behavior.
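As a rough illustration of the three guidelines above, here is a minimal sketch of a victim-selection routine. It is not the code that went into 2.2.15pre12; the Proc record, its fields, and pick_victim() are all invented for this example, and the choice to simply take the shortest-running eligible process is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Proc:
    """Hypothetical process record; not a kernel structure."""
    pid: int
    name: str
    uid: int          # 0 means root
    runtime_s: float  # how long the process has been running
    raw_io: bool      # True if it does privileged (raw hardware) I/O


def pick_victim(procs: List[Proc]) -> Optional[Proc]:
    """Return the process to kill, or None if every process is protected."""
    eligible = [
        p for p in procs
        if p.pid != 1      # never kill init
        and p.uid != 0     # (2) skip processes running as root
        and not p.raw_io   # (3) skip processes doing privileged I/O
    ]
    # (1) among the rest, prefer the one that has run for the shortest time
    return min(eligible, key=lambda p: p.runtime_s, default=None)


if __name__ == "__main__":
    table = [
        Proc(1, "init", 0, 500000.0, False),
        Proc(310, "X", 0, 90000.0, True),
        Proc(412, "emacs", 1000, 86400.0, False),
        Proc(977, "runaway", 1000, 12.0, False),
    ]
    print(pick_victim(table).name)
```

With this table the long-running emacs session survives and the recently started process is chosen. A real implementation would presumably also weigh how much memory each process holds.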

Expect a new kernel configuration system in 2.5. As more options get added to the kernel, the process of configuring them gets more complicated. One of the unsung heroes of kernel development is Michael Elizabeth Chastain, who has maintained the configuration system for a long time. He has been struggling to keep up with all of the new features in the kernel, thus far with success. But the end is in sight.

Anybody who has ever configured a kernel build knows that there is a tremendous number of options to decide on. A quick look in Configure.help for 2.3.50 turns up 1226 options. Simply plowing through all of those can be a chore, but the real problem is with dependencies. Many options only make sense if other options have been selected. Some dependencies are relatively simple - you're only concerned with SCSI drivers if SCSI support has been compiled in. Enforcing such dependencies is not terribly hard.

But dependencies increasingly reach across different parts of the kernel. Enabling PCMCIA SCSI cards only makes sense if both PCMCIA and SCSI have been enabled. The current configuration system has a hard time dealing with dependencies - like the above - that do not follow a nice tree structure.
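To make the problem concrete, here is a minimal sketch of a dependency check in which an option may require several others from unrelated parts of the tree. The option names and the DEPENDS_ON table are hypothetical and do not reflect the kernel's actual configuration language.

```python
# Hypothetical option names, for illustration only.
DEPENDS_ON = {
    "SCSI_DRIVER": {"SCSI"},            # simple, tree-like dependency
    "PCMCIA_SCSI": {"PCMCIA", "SCSI"},  # reaches across two subsystems
}


def unmet_dependencies(enabled):
    """Map each enabled option to the required options that are missing."""
    enabled = set(enabled)
    problems = {}
    for option in sorted(enabled):
        missing = DEPENDS_ON.get(option, set()) - enabled
        if missing:
            problems[option] = sorted(missing)
    return problems


if __name__ == "__main__":
    # PCMCIA SCSI support selected with SCSI but without PCMCIA.
    print(unmet_dependencies({"SCSI", "PCMCIA_SCSI"}))
```

Checking a finished configuration this way is the easy part; the hard part is presenting the options so that the user never ends up in an inconsistent state in the first place.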

An additional problem is that with tools like menuconfig and xconfig, there is nothing requiring a kernel builder to pass through the options sequentially. On the other hand, very few people have the endurance to go through a full "make config" anymore. As a result, enforcing dependencies - especially in a way that makes sense from a human factors point of view - is even more difficult.

Configuration options need to be presented in a way that makes the dependencies clear. It's not really a kernel hacking problem - it's a user interface problem. It will be interesting to see what solution comes out.

An ARM Linux kernel developer is being sought. If you are interested in this sort of job, check out the announcement.

PerlOS - the horrible, horrible dream. Have a look at this posting about the upcoming Perl Linux kernel and shudder. "There were also the usual angry messages from new Perl users who had stumbled across the list and were demanding to know why upgrading from Perl 5.005 (the interpreter) to Perl 6.001 (the OS) had replaced Windows."

Other patches and updates released this week include:

  • Paul "Rusty" Russell has written the Linux Kernel Locking HOWTO, which should be required reading for anybody wanting to get into kernel hacking. Why did he write it? "...my pet hamster dressed up in a penguin suit, and appeared to me in a dream, telling me to write documentation for random stuff, and include lots of obscenities."

  • Fairsched 0.15, a hierarchical fair CPU scheduler, has been released by Borislav Deianov.

  • IBM has released version 0.0.2 of its journaling file system for Linux. It's still far from ready for prime time, but progress is being made.

  • A new beta version (0.02) of the InterMezzo high-availability, distributed filesystem has been announced by Peter Braam. It has a lot of cool features, inspired by the Coda filesystem, and is under active development.

  • Werner Almesberger has released the seventh version of his "bootimg" patch, which allows the booting of arbitrary kernel images (without the intervention of LILO or other such loaders). Those who want to play with this should note the warning in the README file: "This is experimental code, which may screw up your kernel such that the first thing it does is to corrupt your hard disk beyond repair. Exercise due caution."

  • Along these lines, a new version of LILO has been posted which is able to get past the 1024-cylinder boot limit that has plagued PC systems for years.

  • SUBTERFUGUE 0.1.1 is out. It is "a framework for observing and playing with the reality of software; it's a foundation for building tools to do tracing, sandboxing, and many other things. You could think of it as 'strace meets expect.'"

  • RTLinux 3.0 pre-release 4 is available.

  • The "comedi" suite of data acquisition drivers is now up to version 0.7.40.

  • Trond Myklebust has released version 0.20.0 of his NFSv3 client implementation.

  • Linmodem-0.2.5, the PCI modem driver, has been released. This version "is close to being usable as a real modem for low speeds (V21/V23), and the V34 code has greatly improved (although it is not usable yet)."

Section Editor: Jonathan Corbet


March 9, 2000

Eklektix, Inc. Linux powered! Copyright © 2000 Eklektix, Inc., all rights reserved
Linux ® is a registered trademark of Linus Torvalds