Date:	Sat, 30 Jan 1999 23:25:08 -0800
From:	"David S. Miller" <davem@dm.cobaltmicro.com>
To:	lm@bitmover.com
Subject: Re: Page coloring HOWTO [ans]


I'll just relay my experience when I played around with this, and the
distribution scheme I found worked best.

First a clarification:

   Page coloring, in the sense that we are talking about here,
   is 99% about physically indexed secondary/third-level etc.
   caches.  Virtually indexed secondary/third-level caches are
   dinosaurs and they'll die before anyone cares if we cater to
   them (the two most recent I know of were HyperSparc and
   apparently some HP cpus).  Next will come N-way set associative
   physically indexed secondary/third-level caches, and at that
   point page coloring in any form will become close to irrelevant.

A point in terminology/implementation:

   As far as page allocation is concerned, our granularity is
   PAGE_SIZE.  However, the caches we want to "color" also index
   with some lower order bits (that is, the cache line size is
   certainly smaller than PAGE_SIZE).  For the purposes of
   implementation, act as if the low order indexing bits did not
   exist.  In the end this means you don't need to know the cache
   line size to implement this; only the total cache size matters.

   Assume that each architecture has indicated the L2 cache size to
   us in asm/cache.h in the form of:

#define L2_CACHE_SIZE	(512 * 1024)

   for example.

   We end up using the following definition in our internal
   implementation to do our work:

#define PAGE_COLOR(X)	(((X) & (L2_CACHE_SIZE - 1)) >> PAGE_SHIFT)
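
To make the arithmetic concrete, here is a small userspace sketch of
the macro above.  PAGE_SHIFT = 12 (4K pages) is an assumption made
only for illustration, and NUM_COLORS and show_color() are
hypothetical names that do not appear in the original mail.  With a
512K cache and 4K pages there are 128 colors, and the low 12 bits of
the address never affect the result.

```c
#include <stdio.h>

/* Assumed values for illustration: 4K pages and a 512K L2 cache. */
#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define L2_CACHE_SIZE	(512 * 1024)

/* Hypothetical name: one color per page-sized slice of the cache. */
#define NUM_COLORS	(L2_CACHE_SIZE >> PAGE_SHIFT)

/* Same definition as in the text: drop the bits above the cache
 * size, then drop the in-page offset bits. */
#define PAGE_COLOR(X)	(((X) & (L2_CACHE_SIZE - 1)) >> PAGE_SHIFT)

/* Hypothetical helper: print the color of one physical address. */
void show_color(unsigned long paddr)
{
	printf("paddr 0x%08lx -> color %lu of %d\n",
	       paddr, PAGE_COLOR(paddr), NUM_COLORS);
}
```

Calling show_color(0x1000UL) and show_color(0x1fffUL) reports the
same color, since both addresses lie in the same page, and
show_color(0x80000UL) wraps back to color 0 because 0x80000 is
exactly one cache size.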

The following is a distribution scheme which I found to work extremely
well in practice and testing:

    Add to task_struct a member "int cur_color;"

    Add to inode a member "int cur_color;"

    When giving a new address space to a process (via exec() or some
    other means, but not during fork/clone for example) set
    tsk->cur_color to zero.

    When allocating a new inode structure in the vfs, set
    inode->cur_color to zero.

    Now track page cache, page table allocation, and anonymous page
    faulting in the following way:

       a) At each anonymous page write fault, allocate a free page
          with color current->cur_color, and then increment
          current->cur_color.

       b) At each page table page allocation, do the same as in (a).

       c) At each addition of a new page into the page cache, allocate
          this page using the vfs object's inode->cur_color, and then
          increment inode->cur_color.

(while considering the above scheme, consider the effects it has on
 mmap'd shared libraries etc.)
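
The bookkeeping in the scheme above can be sketched in a few lines of
userspace C.  This is only an illustration under stated assumptions:
the toy_* structures and functions are hypothetical stand-ins for
task_struct, inode and the real fault/allocation paths, NUM_COLORS =
128 assumes a 512K cache with 4K pages, and wrapping the counter
modulo NUM_COLORS is an assumption (the scheme itself only says
"increment").

```c
/* Assumed: 512K L2 cache / 4K pages = 128 colors. */
#define NUM_COLORS	128

/* Hypothetical stand-ins for task_struct and inode. */
struct toy_task  { int cur_color; };
struct toy_inode { int cur_color; };

/* Return the color to use now, then advance the counter.
 * Wrapping at NUM_COLORS is an assumption of this sketch. */
static int toy_next_color(int *color_ptr)
{
	int color = *color_ptr;

	*color_ptr = (color + 1) % NUM_COLORS;
	return color;
}

/* New address space (exec, not fork/clone): start at color zero. */
void toy_exec(struct toy_task *tsk)
{
	tsk->cur_color = 0;
}

/* New inode allocated in the vfs: start at color zero. */
void toy_inode_init(struct toy_inode *inode)
{
	inode->cur_color = 0;
}

/* Cases (a) and (b): anonymous write faults and page table pages
 * both draw from the task's counter. */
int toy_anon_or_pgtable_color(struct toy_task *tsk)
{
	return toy_next_color(&tsk->cur_color);
}

/* Case (c): page cache pages draw from the inode's counter. */
int toy_page_cache_color(struct toy_inode *inode)
{
	return toy_next_color(&inode->cur_color);
}
```

Because the counters are kept per task and per inode, consecutive
pages of one process, or of one file, are asked to take consecutive
colors, independent of what other tasks and files are doing.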

The only thing left is to implement:

	unsigned long get_colored_page(int gfp_flags, int *color_ptr)

Doing it efficiently and with minimal code changes in the current page
allocator is left, for now, as an exercise for the reader.  I have some
ideas, and after some experimentation I'll try to describe them here.

But right now I will say that four important issues here are:

1) It has to cost close to nothing.

2) It has to be "obviously correct".

3) It should not try too hard; this is a heuristic after all, and
   the first priority is to get some page to the caller quickly.

4) It should not contribute to memory fragmentation.

(note that satisfying #3 is probably the nicest way to satisfy #4)
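
As a toy illustration of issue #3 only, here is a userspace sketch of
get_colored_page() over a tiny fixed pool.  Everything in it is an
assumption made for the example, not a proposal for the real
allocator: the 8-page pool at TOY_BASE, the 16K "cache" giving 4
colors, the toy_* names, and the choice to advance *color_ptr even
when falling back.  The linear scan is there for readability and says
nothing about how to meet issue #1.

```c
#define PAGE_SHIFT	12
#define L2_CACHE_SIZE	(16 * 1024)	/* toy value: 4 colors */
#define NUM_COLORS	(L2_CACHE_SIZE >> PAGE_SHIFT)
#define PAGE_COLOR(X)	(((X) & (L2_CACHE_SIZE - 1)) >> PAGE_SHIFT)

#define TOY_BASE	0x100000UL	/* toy pool starts at 1MB */
#define TOY_NPAGES	8

static unsigned char toy_page_used[TOY_NPAGES];

/* Mark pool page idx as allocated and return its address. */
static unsigned long toy_take(int idx)
{
	toy_page_used[idx] = 1;
	return TOY_BASE + ((unsigned long)idx << PAGE_SHIFT);
}

/* Prefer a free page of color *color_ptr; if there is none, hand
 * back any free page instead of trying harder.  Returns 0 only when
 * the pool is empty.  gfp_flags is ignored in this toy. */
unsigned long get_colored_page(int gfp_flags, int *color_ptr)
{
	int want = (unsigned int)*color_ptr % NUM_COLORS;
	int i, fallback = -1;

	(void)gfp_flags;

	for (i = 0; i < TOY_NPAGES; i++) {
		unsigned long addr;

		if (toy_page_used[i])
			continue;
		addr = TOY_BASE + ((unsigned long)i << PAGE_SHIFT);
		if (PAGE_COLOR(addr) == (unsigned long)want) {
			*color_ptr = (want + 1) % NUM_COLORS;
			return toy_take(i);
		}
		if (fallback < 0)
			fallback = i;	/* remember first free page */
	}

	if (fallback < 0)
		return 0;		/* pool exhausted */

	/* No page of the wanted color: any page is better than none. */
	*color_ptr = (want + 1) % NUM_COLORS;
	return toy_take(fallback);
}
```

The point of the fallback path is that a miss on the preferred color
never turns into an allocation failure; the caller simply gets a page
of a different color.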

Later,
David S. Miller
davem@redhat.com
