[LWN Logo]
[LWN.net]

Sections:
 Main page
 Security
 Kernel
 Distributions
 Development
 Commerce
 Linux in the news
 Announcements
 Letters
All in one big page

See also: last week's Kernel page.

Kernel development


The current development kernel release is 2.5.3. The latest 2.5.4 prepatch is 2.5.4-pre1, released on February 5. This prepatch contains the usual sort of fixes and updates, but no major changes. The most interesting thing about this patch may be that Linus used BitKeeper to produce it (see below), and the changelog has, as a result, become far more verbose.

Dave Jones's current patch is 2.5.3-dj3. It includes the latest fixes from the 2.4.18 prepatches, and tosses in the radix tree page cache patch for good measure.

It is worth noting that there have been a few reports of filesystem corruption under 2.5.3. Nothing has really been nailed down, however.

The latest 2.5 status summary is available from Guillaume Boissiere (who also maintains a nice web version).

The current stable kernel release is 2.4.17. Marcelo released 2.4.18-pre8 on February 4; this prepatch adds another set of fixes but no new features. A note from Marcelo indicates that the first 2.4.18 release candidate will be coming out soon.

Alan Cox's latest is 2.4.18-pre7-ac3, which includes a different set of fixes and a few filesystem updates.

Linus tries out BitKeeper. LWN reported way back in 1998 that Larry McVoy's BitKeeper tool was being positioned to help out with kernel development. More than three years later, Linus is actually giving it a try. This experiment is, of course, a direct result of the "patch penguin" discussion (as covered last week). Once things have stabilized, the hope is that BitKeeper will make it easier for Linus to handle patches and keep the development process on track.

For the moment, the benefits of BitKeeper are yet to be seen - Linus has been spending his time getting up to speed and working the system into shape. One immediate plus, however, is that BitKeeper makes it easy to retain the comments that maintainers put on their patches for Linus; thus, the new, verbose changelogs.

Change information is also being made available on a BitKeeper site. More information - including the actual, individual patches - is available there. This information is likely to be incorporated into the kernel.org system as well.

The real potential of BitKeeper will be realized when a sufficient number of other kernel developers start using it. BitKeeper makes it easy to move specific patches between repositories, and provides some seriously slick tools for handling merges. With luck, a more productive kernel development team will emerge from the testing period. Meanwhile, however, as Larry points out:

On the other hand, he's only been using it for a week and he isn't saying it is the best thing since sliced bread. So it's a bit premature to predict whether he will be using it in a month or not. We hope so, and we'll keep working to make you happy with it, but Linus is a harsh judge - if BK doesn't help out, he'll kick it out the door.

Stay tuned.

The radix tree page cache patch by Momchil Velikov and Christoph Hellwig was merged into 2.5.3-dj3. It is expected to make an appearance in the 2.5.4 prepatches before too long, but it was not included in -pre1. Time for a quick look at what this patch does.

The Linux page cache is a collection of pages that belong (usually) to files; they are kept in main memory for performance reasons. The page cache can often take up the bulk of the memory in use at any given time. Whenever a process reads or writes a file, takes a page fault, or is swapped out, the page cache is involved.

The performance of the page cache thus has a big influence on the performance of the system as a whole. When a particular page is called for, the page cache must be able to find the page (or determine that the page is not in the cache) quickly, and, preferably, with minimal memory overhead. The Linux kernel has long maintained a global hash table for pages in the cache. This system works reasonably well much of the time, but it does have a couple of limitations:

  • A page's hash is not unique, so the system must be able to resolve collisions. This is handled by chaining pages in a linked list off the hash table entries. The list entries require eight bytes for every page of physical memory in the system - whether the page is in the cache or not. Traversing the hash chains also adds a small overhead, especially when the desired page is not present.

  • The page cache is controlled by a single, global lock, which presents scalability problems even with a small number of processors.

The radix tree patch addresses these problems by combining a better data structure with a fundamental observation about the nature of the page cache. The observation is this: the existing page cache is a large, global data structure, but the page caching problem is actually a set of smaller, individual tasks. With the patch, Linux actually has many little page caches, one for each open file in the system. The page cache which tracks pages from, say, /usr/bin/vi need not coexist with the cache for index.html in a web documents directory.

Separating page caches from each other has a couple of advantages. One is that each page cache can have its own lock, and the global page cache lock goes away. The other is that the searching problem becomes much smaller, since a smaller address space is involved.

When each file has its own page cache, the only index needed to look up a specific page is its offset within the file. A 32-bit offset will handle files up to 246 bytes (when 4K pages are used) - enough to keep even the streaming video people happy for now. A complete lookup table for 32-bit values would, at 8GB, be a little too large - it would tend to erode the memory savings gained by removing the hash table list entries from the page structure. So a radix tree is used instead. Essentially, the 32-bit offset is divided into sub-fields as follows:

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

The current patch uses 7-bit indices, as shown above; that number may eventually change based on benchmark results, but the algorithm remains the same.

That algorithm, essentially is this: the highest-order subfield is used to look in a 128-entry table in the root of the radix tree; the entry in that table will be a pointer to the next node in the tree. The next lower subfield from the index is used to index that node, yielding the third one. Eventually the algorithm finds its way to the bottom of the tree and obtains the page pointer - or finds an empty entry and knows that the page is not present. This data structure thus strongly resembles the system's page tables - which is not entirely surprising, given that it is solving a very similar problem.

An important optimization employed by the patch is to make the tree no deeper than necessary - for small files (up to 128 pages - most of them), the tree will have only one node. Only the least significant subfield of the offset (the red bits in the diagram above) will be used. Lookups should be almost instantaneous.

Users of the "dj" kernels can run the new code now; expect it to appear in a Linus kernel before too long. There is also a version of the patch available for the 2.4.18-pre7 kernel.

Other patches and updates released this week include:

Core kernel code:

Development tools:

  • Hubertus Franke has posted a patch which allows the gcov test coverage tool to be used with the kernel.

Device drivers

  • devfsd 1.3.23 was released by Richard Gooch.

  • Pavel Machek has posted a patch which adds driverfs support for motherboard devices.

Filesystems:

  • UVFS 0.3.1, a user-space filesystem kit, was released by Britt Park.

Kernel building:

  • Michael Elizabeth Chastain has resumed maintenance of the CML1 code.

Miscellaneous:

Networking:

  • Dmitry Kasatkin has announced the release of version 0.9-pre10 of the "affix" BlueTooth stack.

  • Jean Tourrilhes has announced a new IrDA patch and the latest version of the wireless extensions.

Section Editor: Jonathan Corbet


February 7, 2002

For other kernel news, see:

Other resources:

 

Next: Distributions

 
Eklektix, Inc. Linux powered! Copyright © 2002 Eklektix, Inc., all rights reserved
Linux ® is a registered trademark of Linus Torvalds