[LWN.net]

Kernel development


The current development kernel release is 2.4.0-test5. Linus actually sent out an announcement for this release, describing what's in it.

The first prepatch for the -test6 release is available. It consists mostly of small tweaks (many of which are spelling corrections), but also has some MIPS architecture fixes, an IBM MCA SCSI driver update, a big USB storage update, an ext2 filesystem update, and a reorganization of user process accounting.

Ted Ts'o, the new keeper of the 2.4 status list, has posted an updated summary of where the 2.4 release stands. The list remains long. Ted is also maintaining a web page on SourceForge with the current list.

The current stable kernel release is still 2.2.16. The 2.2.17 prepatch is up to 2.2.17pre14; probably at least one more iteration is forthcoming before the official 2.2.17 release.

Towards a new virtual memory system. Difficulties with Linux virtual memory have been popping up since early in the 2.2 stable series. While it works for most people, there are those who can easily get the system into a thrashing, useless state. Lots of work has been done trying to fix things up, with some success. Nonetheless, the current development kernels still can do unpleasant things with some loads.

It looks like 2.4.0 will go out with a less-than-optimal VM implementation. There is still room for tweaking, but Linus is not interested in major changes at this time. And he has a point; there comes a time when you have to draw the line and ship a kernel.

So now the developers are looking toward 2.5, when they'll be able to go in and make radical changes. To that end, Rik van Riel has posted a description of a new VM subsystem as he would like to implement it. It's based heavily on the FreeBSD scheme, which works quite well. But, of course, it will have some special Linux tweaks of its own. See Rik's posting for the details.

Changes to the mount system call, both large and small, are on the table. Starting with the smaller issue: the current development kernels handle mounts a little differently from previous kernels (and most Unix systems) in that mounts can stack. Should a system administrator type:

    # mount /dev/hda1 /mnt
    # mount /dev/hda2 /mnt

both mounts will succeed. Somebody looking in /mnt after both operations would see the filesystem that lives on /dev/hda2 - the last one mounted.

Unix systems over the years have not allowed this sort of operation - the second mount would fail with a "mount point busy" error. It seems there are quite a few people who depend on those semantics - a number have complained about the "overmount by default" behavior.

The end result looks to be a return to the old semantics - stacked mounts will not happen unless explicitly requested by the user. (Some might ask why stacked mounts are needed at all; among other things, the automounter can use them to provide for "direct" mount maps.)
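The difference between the two behaviors can be sketched with a toy model. This is not kernel code: the `MountTable` class and its `allow_stack` argument are made up for illustration, and the sketch only assumes that a stacked mount hides whatever was mounted there before.

```python
import errno


class MountTable:
    """Toy model of a mount table (hypothetical, for illustration only)."""

    def __init__(self):
        # mount point -> list of devices; the last entry is the visible one
        self._stacks = {}

    def mount(self, device, mountpoint, allow_stack=False):
        stack = self._stacks.setdefault(mountpoint, [])
        if stack and not allow_stack:
            # Traditional Unix semantics: the second mount is refused.
            raise OSError(errno.EBUSY, "mount point busy", mountpoint)
        stack.append(device)

    def umount(self, mountpoint):
        stack = self._stacks.get(mountpoint)
        if not stack:
            raise OSError(errno.EINVAL, "not mounted", mountpoint)
        # Removing the top mount uncovers whatever was underneath.
        return stack.pop()

    def visible(self, mountpoint):
        stack = self._stacks.get(mountpoint)
        return stack[-1] if stack else None
```

With `allow_stack` left at its default, a second mount on /mnt fails with EBUSY, as on older systems; with `allow_stack=True` it succeeds, and the first filesystem reappears once the second is unmounted.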

The person working with the mount semantics is the same guy who has been making changes all over the filesystem layer - Alexander Viro. He is also working on the addition of "union mounts", where several filesystems can be combined together into a larger, virtual filesystem containing all the files in each of the component parts. The semantics of union mounts still need some thought, however, and no work will be done on them until the 2.5 development series.
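Since the semantics are still undecided, any example can only be a guess. As one possible reading, the toy sketch below models each filesystem as a plain dictionary and assumes that, when a name exists in more than one component, the first one listed wins. The `UnionMount` class is hypothetical and says nothing about how the kernel will eventually do it.

```python
class UnionMount:
    """Toy union of several 'filesystems', each a dict of name -> contents."""

    def __init__(self, *layers):
        # Assumption: earlier layers take priority when names collide.
        self.layers = list(layers)

    def listdir(self):
        # The union directory shows every name found in any layer.
        names = set()
        for layer in self.layers:
            names.update(layer)
        return sorted(names)

    def read(self, name):
        # The first layer that has the name supplies the contents.
        for layer in self.layers:
            if name in layer:
                return layer[name]
        raise FileNotFoundError(name)
```

Questions such as where a newly created file should land, or what deleting a shadowed file means, are exactly the kind of detail this sketch leaves out.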

In the process of thinking about all this, Mr. Viro came to realize that the current mount interface shows, shall we say, some historical baggage. See this lengthy posting for the full scoop on the problem. Essentially, it comes down to two points: (1) the current mount system call interface is, um, inelegant, and (2) it is going to be very hard to add new features, such as union mounts, using the given interface.

So a brand new mount call ("mount6", perhaps) has been proposed, with an API like:

    int mount6 (action, mountpoint, type,
                flags, device, data);

What's new here is the "action" parameter, which can have values like "mount", "remount", and "bind". With the current interface, the "flags" argument is used, sometimes, to indicate that an action other than a straightforward mount is to occur. Separating the action out will make the interface a lot cleaner.

There seems to be little opposition to the new interface, so it will likely go in at some point. The old mount interface will be preserved (probably by libc), of course, but in this case the interface change will be relatively painless anyway. After all, not very many programs call mount.
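To see why a separate action argument is cleaner, consider the following rough sketch. Everything in it is an assumption made for illustration: the flag names and bit values, the `Action` enumeration, and a `mount6()` that merely describes the request instead of performing it.

```python
from enum import Enum


class Action(Enum):
    MOUNT = "mount"
    REMOUNT = "remount"
    BIND = "bind"


# Illustrative flag bits for an old-style call (values are assumptions).
FLAG_RDONLY = 0x01
FLAG_REMOUNT = 0x20
FLAG_BIND = 0x1000


def old_style_action(flags):
    """Old style: the action has to be dug out of the flags word."""
    if flags & FLAG_REMOUNT:
        return Action.REMOUNT
    if flags & FLAG_BIND:
        return Action.BIND
    return Action.MOUNT


def mount6(action, mountpoint, fstype, flags, device, data):
    """New style: the action is explicit, and flags carry only options.

    This stand-in only returns a description of what would be done.
    """
    if not isinstance(action, Action):
        raise TypeError("action must be an Action")
    return (f"{action.value} {device} on {mountpoint} "
            f"type {fstype} flags {flags:#x}")
```

In the old style, every new kind of operation needs another magic bit and another test in the decoding logic; in the new style, adding an operation means adding one more action value.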

How should user space get information about the kernel? It all started with a posting about a compile problem involving one of the kernel header files. It seems that, in some situations, some headers are still being included directly out of the kernel source into user programs. That was supposed to stop happening entirely with glibc 2, and for the most part it has. However, it is still tricky for glibc to get certain kinds of information about how the kernel is configured without going to the header files.

Ulrich Drepper, the maintainer of the Linux glibc port, is direct in his criticism of Linus for not providing a straightforward kernel interface - a sysconf() call - to obtain kernel parameters. Linus has been even more direct, to the point of messing up his soft-spoken image, in his criticism of how glibc does things. According to Linus, kernel support should not be needed to provide user space with various kernel parameters.

How, then, is a user program to obtain information like the maximum number of groups allowed, or the clock tick frequency? Well, according to Linus, the best way to get at constant system parameters is to store them in a file, such as /etc/sysconf. The library can just look in that file, which would be updated (at boot time, perhaps) by a special program that knows where to look. This registry-like file could also contain pure user space information, whatever might be useful in tracking the state of the system configuration.

Not everybody likes the idea; there are some obvious issues with keeping the file synchronized with reality. But Linus is quite clear on the point that no sort of sysconf system call will be added to the kernel.
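What might the library side of such a scheme look like? The sketch below is a guess: nobody in the discussion specified a file format, so the `KEY=VALUE` lines, the `#` comments, and key names like `NGROUPS_MAX` and `CLK_TCK` are all assumptions.

```python
def parse_sysconf(text):
    """Parse KEY=VALUE lines; '#' starts a comment (assumed format)."""
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        value = value.strip()
        # Plain decimal numbers become integers; anything else stays a string.
        params[key.strip()] = int(value) if value.isdigit() else value
    return params


def load_sysconf(path="/etc/sysconf"):
    """Read the parameter file, or return an empty dict if it is missing."""
    try:
        with open(path) as f:
            return parse_sysconf(f.read())
    except FileNotFoundError:
        return {}
```

A library could call `load_sysconf()` once and answer later queries from the resulting dictionary. The synchronization worry shows up here too: the answers are only as fresh as the last time the file was written.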

How can standalone kernel modules find include files? This question came up as a side branch of the sysconf discussion. When a kernel module is built separately from the kernel it will run under (i.e. if it's not part of the standard kernel source), the build process needs to be able to find the right header files. In general, that requires that the person building the module edit the makefile and set the kernel source path directly. That works, but lacks elegance and can be hard for people who are not normally accustomed to building kernels.

Installed kernel modules themselves live in a directory corresponding to the kernel version number under /lib/modules. Thus, modules for a 2.2.16 kernel are likely to be found in /lib/modules/2.2.16 (though many distributor-supplied kernels add on to the version number). So the question came up: when installing the modules, why not have a kbuild directory that has the source to the build kernel as well? Said directory would just be a link to the kernel source tree, of course. Consensus was achieved rather quickly on this idea; expect to see it implemented in future kernels. The change has also found its way into the 2.2.17 prepatch.
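A build script for an external module could then find the headers with something like the sketch below. The helper names are hypothetical, and the link name defaults to "kbuild" only because that is the name used above; it is a parameter since the name that actually ships may differ.

```python
import os


def kernel_source_dir(release=None, modules_root="/lib/modules",
                      link_name="kbuild"):
    """Return the path of the link to the kernel source for a release.

    With no release given, the running kernel's version string is used.
    """
    if release is None:
        release = os.uname().release
    return os.path.join(modules_root, release, link_name)


def include_dir(release=None, modules_root="/lib/modules",
                link_name="kbuild"):
    """Return the include directory a module build would pass to the compiler."""
    return os.path.join(
        kernel_source_dir(release, modules_root, link_name), "include")
```

For a 2.2.16 kernel, `include_dir("2.2.16")` yields /lib/modules/2.2.16/kbuild/include, with no makefile editing required.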

Other patches and updates released this week include:

  • Tigran Aivazian has made available his Mutex Comparison Toolkit, which can be used to determine the relative performance of the various kernel locking mechanisms in specific situations.

  • The latest version of Eric Raymond's new kernel configuration scheme is cml2-0.7.5.

  • The Timpanogas Group has released version 2.4.2 of its Netware filesystem implementation.

  • The IP Personality patch is interesting: it is a netfilter module which allows a Linux system to masquerade as something else, thus fooling the various OS fingerprinting tools which are out there.

  • Michael Elizabeth Chastain has written up a new document describing how the kernel makefiles work.

  • Matt Robinson, at TurboLinux these days, has released the 2.0 version of his Linux kernel crash dump analyzer.

Section Editor: Jonathan Corbet


August 3, 2000

Copyright © 2000 Eklektix, Inc., all rights reserved
Linux ® is a registered trademark of Linus Torvalds