[LWN Logo]

To:	linux-kernel@vger.rutgers.edu
From:	torvalds@transmeta.com (Linus Torvalds)
Subject: Re: lots of 2.2.4 oopses
Date:	24 Mar 1999 19:39:51 GMT

In article <36F92EBB.B33E673E@pobox.com>,
Valient Gough  <vgough@pobox.com> wrote:
>
>I upgraded from 2.2.3 to 2.2.4 last night.  Since then I've had to
>lockups.  The second one managed to log "Unable to handle kernel paging
>request at virtual address 00de3f9c" before it locked up, but I didn't
>have the correct System.map file in place, so klogd didn't have symbols.
>
>But, this morning, I had 4 such oops in a row, with a correct
>System.map.

Ok, they are all basically the same oops, the file pointer list has
gotten corrupted somehow, and that corruption will result in an oops
whenever the kernel tries to touch one of the file pointers.

The oops happens in remove_filp() (which is an inline function,
explaining why the symbol table says it is in get_empty_filp()).

The code looks like

        if(file->f_next)
                file->f_next->f_pprev = file->f_pprev;
        *file->f_pprev = file->f_next;

and the thing oopses apparently because file->f_next is just crap (it
appears to have the value 0x2c302c35, to be exact - seems to be the
string "5,0," in fact).

I would _much_ prefer it if you could try to compile 2.2.4 with
gcc-2.7.2.  We now have two reports of an oops like this, and in both
cases egcs was used as the compiler.  And egcs is doing alias analysis
these days - and following the ANSI rules about aliases is something
that we haven't validated that the kernel is doing correctly (and that's
assuming that egcs is bug-free, which is obviously also a potential
issue). 

Basically, right now I don't know whether this is (a) a real kernel bug
(b) the kernel having some place it is not alias analysis aware and
doing something ugly that breaks with egcs or (c) an egcs bug.

Compiling with gcc-2.7.2 would at least resolve whether it is the
compiler that induces the effect..

		Linus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/