Date: Sat, 19 Sep 1998 11:28:06 -0700 (PDT) From: Linus Torvalds <torvalds@transmeta.com> To: Alan Cox <alan@lxorguk.ukuu.org.uk> Subject: Re: tulip driver in 2.1.11* - 2.1.21 is broken - new driver On Sat, 19 Sep 1998, Alan Cox wrote: > > I think both dev->interrupt and dev->tbusy should just go away. Both of > > them are completely broken, for several reasons: > > - they use up a long-word for _one_ bit of information. That's 63 wasted > > bits on some architectures. > > They are performance important. And afaik you btw made dev->interrupt a long > word to cover for the fact the atomic stuff doesnt work on char ;) Correct. But I did that without grepping for the use of it. Now that I have, it's very obvious that I shouldn't have made it a long, I should just have gotten rid of it completely. I don't find anything actually using it, except for a few drivers who caused the problem in the first place. As for tbusy, I know it's supposed to be used for flow control. Grepping for it in the driver directory showed very few cases of that, and almost every use seemed to be about using it for re-entrancy. More importantly, grepping for it shows that EVERY SINGLE USE OF IT IS WRONG! As such, it would be a lot better to just get rid of it completely, and instead introduce a "dev->flags" field, and make sure that we _only_ use that for what it's supposed to be used for. Tell me if I'm wrong on any of the following. I may be, but I can't find why: Chapter 1 "Why tbusy is wrong" by Linus Torvalds - grep for it 99% of all uses are non-atomic. That's obviously wrong in itself, as most of those are outside any global interrupt locks, and the flag is tested and set by interrupts, not by bottom half handlers. The ones that ARE atomic are wrong too, because those seem to all be used for re-entrancy testing rather than for flow control. Ergo, every single use of tbusy seems to be buggy. Better just get rid of it. - qdisc_wakeup() This is the main routine that is supposed to do the re-entrancy test. To some degree this is the "heart" of tbusy. And even this simple routine gets it utterly wrong. It's testing tbusy without setting it, and without having disabled the events that can change it. As such, it is neither UP nor SMP safe - it happens to work apparently because all drivers when they change their state of tbusy will also force a NET_BH to be run, even if they don't actually have any incoming packets. Or maybe it works due to the window being fairly small, and us being lucky - packet re-send just restarts the transmission even if it died for a while. Chapter 2 "How to fix tbusy without going crazy" - get rid of it There are no correct users of it right now anyway, so there are no instances of it being used that we want to save. Getting rid of it will at least allow the compiler to find all cases where somebody _tries_ to use it. - instantiate a new "dev->flags" Make qdisk_wakeup() do a if (!test_and_set_bit(DEV_TRANSMIT, &dev->flags)) { struct Qdisc *q = dev->qdisc; if (qdisc_restart(dev) && q->h.forw == NULL) { q->h.forw = qdisc_head.forw; qdisc_head.forw = &q->h; } } and then make the cases where the network drivers marked tbusy = 0 (the correct ones - the ones that tell the system that its write queues are empty - this is about 10% of the tbusy use as far as I can tell) do a if (!test_and_clear_bit(DEV_TRANSMIT, &dev->flags)) { panic("buggy driver, cleared DEV_TRANSMIT twice"); } when their transmit buffer empties. The above is what tbusy _should_ be doing already, but implemented right. Am I wrong? (Oh, "Chapter 3" is very short, and only has a heading and no actual text. It says "Remove dev->interrupt, because nobody is using it"). Linus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/