Date: Fri, 28 Apr 2000 20:21:04 -0300 (BRST)
From: Rik van Riel <riel@conectiva.com.br>
To: Linus Torvalds <torvalds@transmeta.com>
Subject: [PATCH] 2.3.99-pre6 vm fix
Hi Linus,
here's a patch against 2.3.99-pre6 that fixes the stability problem
(apparently it was possible for a process to "slip through" the
tests in swap_out() and end up with a swap_cnt of 0, which meant
an infinite loop in the left-shifting loop).

It also fixes a correctness issue in kswapd. Kswapd would exit
after one call to do_try_to_free_pages(), even if there was still
a lot of work to do. Now kswapd will go again if there are still
a lot of pages to free.

The performance problem isn't 100% fixed yet, but the other two
fixes are important enough that I thought I'd send the patch
now rather than after the (extra long) weekend.
regards,
Rik
--
The Internet is not a network of computers. It is a network
of people. That is its real strength.
Wanna talk about the kernel? irc.openprojects.net / #kernelnewbies
http://www.conectiva.com/ http://www.surriel.com/
--- linux-2.3.99-pre6/mm/filemap.c.orig Thu Apr 27 12:49:05 2000
+++ linux-2.3.99-pre6/mm/filemap.c Fri Apr 28 19:49:01 2000
@@ -238,14 +238,13 @@
int shrink_mmap(int priority, int gfp_mask, zone_t *zone)
{
- int ret = 0, loop = 0, count;
+ int ret = 0, count;
LIST_HEAD(young);
LIST_HEAD(old);
LIST_HEAD(forget);
struct list_head * page_lru, * dispose;
struct page * page = NULL;
struct zone_struct * p_zone;
- int maxloop = 256 >> priority;
if (!zone)
BUG();
@@ -262,30 +261,26 @@
list_del(page_lru);
p_zone = page->zone;
- /*
- * These two tests are there to make sure we don't free too
- * many pages from the "wrong" zone. We free some anyway,
- * they are the least recently used pages in the system.
- * When we don't free them, leave them in &old.
- */
- dispose = &old;
- if (p_zone != zone && (loop > (maxloop / 4) ||
- p_zone->free_pages > p_zone->pages_high))
- goto dispose_continue;
+ /* This LRU list only contains a few pages from the system,
+ * so we must fail and let swap_out() refill the list if
+ * there aren't enough freeable pages on the list */
/* The page is in use, or was used very recently, put it in
* &young to make sure that we won't try to free it the next
* time */
dispose = &young;
-
if (test_and_clear_bit(PG_referenced, &page->flags))
goto dispose_continue;
- count--;
+ if (p_zone->free_pages > p_zone->pages_high)
+ goto dispose_continue;
+
if (!page->buffers && page_count(page) > 1)
goto dispose_continue;
- /* Page not used -> free it; if that fails -> &old */
+ count--;
+ /* Page not used -> free it or put it on the old list
+ * so it gets freed first the next time */
dispose = &old;
if (TryLockPage(page))
goto dispose_continue;
@@ -375,9 +370,8 @@
/* nr_lru_pages needs the spinlock */
nr_lru_pages--;
- loop++;
/* wrong zone? not looped too often? roll again... */
- if (page->zone != zone && loop < maxloop)
+ if (page->zone != zone && count)
goto again;
out:
--- linux-2.3.99-pre6/mm/page_alloc.c.orig Thu Apr 27 12:57:20 2000
+++ linux-2.3.99-pre6/mm/page_alloc.c Fri Apr 28 12:29:08 2000
@@ -285,9 +285,11 @@
goto allocate_ok;
/* If we're a memory hog, unmap some pages */
- if (current->hog && low_on_memory &&
- (gfp_mask & __GFP_WAIT))
- swap_out(4, gfp_mask);
+ if (current->hog && low_on_memory && (gfp_mask & __GFP_WAIT)) {
+ // swap_out(6, gfp_mask);
+ // shm_swap(6, gfp_mask, (zone_t *)(zone));
+ try_to_free_pages(gfp_mask, (zone_t *)(zone));
+ }
/*
* (If anyone calls gfp from interrupts nonatomically then it
--- linux-2.3.99-pre6/mm/vmscan.c.orig Thu Apr 27 12:57:58 2000
+++ linux-2.3.99-pre6/mm/vmscan.c Fri Apr 28 19:43:37 2000
@@ -387,8 +387,8 @@
if (!p->swappable || !mm || mm->rss <= 0)
continue;
/* small processes are swapped out less */
- while ((mm->swap_cnt << 2 * (i + 1) < max_cnt))
- i++;
+ while ((mm->swap_cnt << 2 * (i + 1) < max_cnt)
+ && i++ < 10)
mm->swap_cnt >>= i;
mm->swap_cnt += i; /* if swap_cnt reaches 0 */
/* we're big -> hog treatment */
@@ -437,14 +437,13 @@
{
int priority;
int count = SWAP_CLUSTER_MAX;
- int ret;
/* Always trim SLAB caches when memory gets low. */
kmem_cache_reap(gfp_mask);
priority = 6;
do {
- while ((ret = shrink_mmap(priority, gfp_mask, zone))) {
+ while (shrink_mmap(priority, gfp_mask, zone)) {
if (!--count)
goto done;
}
@@ -467,9 +466,7 @@
}
}
- /* Then, try to page stuff out..
- * We use swapcount here because this doesn't actually
- * free pages */
+ /* Then, try to page stuff out.. */
while (swap_out(priority, gfp_mask)) {
if (!--count)
goto done;
@@ -530,12 +527,16 @@
pgdat = pgdat_list;
while (pgdat) {
for (i = 0; i < MAX_NR_ZONES; i++) {
- zone = pgdat->node_zones + i;
+ int count = SWAP_CLUSTER_MAX;
+ zone = pgdat->node_zones + i;
+ do {
if (tsk->need_resched)
schedule();
if ((!zone->size) || (!zone->zone_wake_kswapd))
continue;
do_try_to_free_pages(GFP_KSWAPD, zone);
+ } while (zone->free_pages < zone->pages_low &&
+ --count);
}
pgdat = pgdat->node_next;
}
-