Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 24 Dec 2006 12:56:39 +1100 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        Oleg Bulyzhin <oleg@freebsd.org>
Cc:        cvs-src@freebsd.org, Gleb Smirnoff <glebius@freebsd.org>, src-committers@freebsd.org, Bruce Evans <bde@freebsd.org>, cvs-all@freebsd.org
Subject:   Re: cvs commit: src/sys/dev/bge if_bge.c
Message-ID:  <20061224124016.F24444@delplex.bde.org>
In-Reply-To: <20061223215918.GA33627@lath.rinet.ru>
References:  <200612201203.kBKC3MhO053666@repoman.freebsd.org> <20061220132631.GH34400@FreeBSD.org> <20061222003115.R16146@delplex.bde.org> <20061223215918.GA33627@lath.rinet.ru>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, 24 Dec 2006, Oleg Bulyzhin wrote:

> On Fri, Dec 22, 2006 at 01:24:45AM +1100, Bruce Evans wrote:
>> On Wed, 20 Dec 2006, Gleb Smirnoff wrote:
>>>
>>> I have a suspicion that this may cause a problem under high load. Imagine
>>> that thread #1 is spinning in bge_start_locked() getting packets out
>>> of interface queue and putting them into TX ring. Some other threads are
>>> putting the packets into interface queue while its lock is temporarily
>>> relinguished be the thread #1. In the same time interrupts happen, some
>>> packets are sent, but the TX ring is never got empty.
>>>
>>> The above scenario will cause a fake watchdog event.
>>
>> bge_start_locked() starts with the bge (sc) lock held and never releases
>> it as far as I can see.  This this problem can't happen (the lock
>> prevents both txeof and the watchdog from being reached before start
>> resets the timeout to 5 seconds).

> it's quite unusal) and it is not lock related:
> 1) bge_start_locked() & bge_encap fills tx ring.
> 2) during next 5 seconds we do not have packets for transmit (i.e. no
>   bge_start_locked() calls --> no bge_timer refreshing)
> 3) for any reason (don't ask me how can this happen), chip was unable to
>   send whole tx ring (only part of it).
> 4) here we have false watchdog - chip is not wedged but bge_watchdog would
>   reset it.

Then it is a true watchdog IMO.  Something is very wrong if you can't send
512 packets in 5 seconds (or even 1 packet in 5/512 seconds).

> I'd say correct solution would be combination of yours and previous bge_txeof()
> behaviour:
> 1) We should cancel watchdog if tx ring is empty (as you did)
> 2) We should not clear bge_timer inside loop (like it was before), but set it
>   to 5. So watchdog is active until tx ring is empty and we will not get
>   watchdog event while tx ring consumer index is moving.

I prefer the overall timeout of 5 seconds (maybe even 1 second, but
the simple algorithm requires at least an average of 1.5 seconds).  I
don't want the watchdog to be delayed by 512*5 seconds in the unlikely
event that 1 packet trickles out every 5 seconds, especially since
implementing this takes more code (though the amount is tiny).  This
won't happen of course.  No interface-start calls in 5 seconds will
also be rare.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20061224124016.F24444>