[LWN Logo]

Date:	Sat, 19 Sep 1998 11:28:06 -0700 (PDT)
From:	Linus Torvalds <torvalds@transmeta.com>
To:	Alan Cox <alan@lxorguk.ukuu.org.uk>
Subject: Re: tulip driver in 2.1.11* - 2.1.21 is broken - new driver



On Sat, 19 Sep 1998, Alan Cox wrote:

> > I think both dev->interrupt and dev->tbusy should just go away. Both of
> > them are completely broken, for several reasons:
> >  - they use up a long-word for _one_ bit of information. That's 63 wasted
> >    bits on some architectures.
> 
> They are performance important. And afaik you btw made dev->interrupt a long
> word to cover for the fact the atomic stuff doesnt work on char ;)

Correct. But I did that without grepping for the use of it. Now that I
have, it's very obvious that I shouldn't have made it a long, I should
just have gotten rid of it completely. I don't find anything actually
using it, except for a few drivers who caused the problem in the first
place.

As for tbusy, I know it's supposed to be used for flow control. Grepping
for it in the driver directory showed very few cases of that, and almost
every use seemed to be about using it for re-entrancy.

More importantly, grepping for it shows that EVERY SINGLE USE OF IT IS
WRONG!

As such, it would be a lot better to just get rid of it completely, and
instead introduce a "dev->flags" field, and make sure that we _only_ use
that for what it's supposed to be used for. 

Tell me if I'm wrong on any of the following. I may be, but I can't find
why:

				Chapter 1
			  "Why tbusy is wrong"
				   by
			     Linus Torvalds

  - grep for it
	99% of all uses are non-atomic. That's obviously wrong in itself,
	as most of those are outside any global interrupt locks, and the
	flag is tested and set by interrupts, not by bottom half handlers. 

	The ones that ARE atomic are wrong too, because those seem to all
	be used for re-entrancy testing rather than for flow control.

	Ergo, every single use of tbusy seems to be buggy. Better just get
	rid of it.

  - qdisc_wakeup()
	This is the main routine that is supposed to do the re-entrancy
	test. To some degree this is the "heart" of tbusy. And even this
	simple routine gets it utterly wrong.

	It's testing tbusy without setting it, and without having disabled
	the events that can change it. As such, it is neither UP nor SMP
	safe - it happens to work apparently because all drivers when they
	change their state of tbusy will also force a NET_BH to be run,
	even if they don't actually have any incoming packets.

	Or maybe it works due to the window being fairly small, and us
	being lucky - packet re-send just restarts the transmission even
	if it died for a while. 

				Chapter 2
	  	"How to fix tbusy without going crazy"

  - get rid of it
	There are no correct users of it right now anyway, so there are no
	instances of it being used that we want to save. Getting rid of it
	will at least allow the compiler to find all cases where somebody
	_tries_ to use it. 

  - instantiate a new "dev->flags"
	Make qdisk_wakeup() do a

		if (!test_and_set_bit(DEV_TRANSMIT, &dev->flags)) {
			struct Qdisc *q = dev->qdisc;
			if (qdisc_restart(dev) && q->h.forw == NULL) {
				q->h.forw = qdisc_head.forw;
				qdisc_head.forw = &q->h;
			}
		}

	and then make the cases where the network drivers marked tbusy = 0
	(the correct ones - the ones that tell the system that its write
	queues are empty - this is about 10% of the tbusy use as far as I 
	can tell) do a

		if (!test_and_clear_bit(DEV_TRANSMIT, &dev->flags)) {
			panic("buggy driver, cleared DEV_TRANSMIT twice");
		}

	when their transmit buffer empties.

The above is what tbusy _should_ be doing already, but implemented right.

Am I wrong?

(Oh, "Chapter 3" is very short, and only has a heading and no actual text. 
It says "Remove dev->interrupt, because nobody is using it"). 

		Linus


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/