Date:      Wed, 2 Aug 2006 20:01:16 +1000 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        John Polstra <jdp@polstra.com>
Cc:        arch@FreeBSD.org, Robert Watson <rwatson@FreeBSD.org>, net@FreeBSD.org
Subject:   RE: Changes in the network interface queueing handoff model
Message-ID:  <20060802184349.K90387@delplex.bde.org>
In-Reply-To: <XFMail.20060731100533.jdp@polstra.com>
References:  <XFMail.20060731100533.jdp@polstra.com>

On Mon, 31 Jul 2006, John Polstra wrote:

> I question whether you need a fallback software if_snd queue at all
> for modern devices such as the Intel and Broadcom gigabit chips.  The
> hardware transmit descriptor rings typically have sizes of the order
> of 256 descriptors.  I think if the ring fills up, you could simply
> drop the packet with ENOBUFS.  That's what happens if the if_snd queue
> fills up, and its maximum size is comparable to the sizes of modern
> descriptor rings.  It would simplify things quite a bit to eliminate
> the if_snd queue entirely for such devices.

I use an if_snd queue length of about 5000 in my version of the sk
driver to work around suckage in ENOBUFS handling.  The hardware (*)
tx ring size is 512, and tiny packets can be sent in 4 usec, so the
hardware queue provides only 2 msec worth of buffering.  select(2)
for output on sockets doesn't work right, so there is no good way (**)
for applications to proceed when a syscall returns ENOBUFS.  An extra
queue length of 5000 provides an extra 20 msec worth of buffering, which
is usually enough when HZ = 100.
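
The buffering arithmetic can be checked with a small sketch.  It uses
the figures quoted in this message (a 512-entry tx ring, an if_snd queue
of about 5000 entries, and roughly 4 usec per tiny packet); the function
names are made up for illustration and are not part of any driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Microseconds of buffering provided by a queue of `slots` packets
 * when each packet takes `usec_per_pkt` microseconds to transmit.
 */
uint64_t
queue_buffering_usec(uint32_t slots, uint32_t usec_per_pkt)
{
	return ((uint64_t)slots * usec_per_pkt);
}

/*
 * Smallest queue length that keeps the transmitter busy for one
 * clock tick at the given HZ (rounded up).
 */
uint32_t
slots_for_one_tick(uint32_t hz, uint32_t usec_per_pkt)
{
	assert(hz > 0 && usec_per_pkt > 0);
	uint64_t tick_usec = 1000000 / hz;
	return ((uint32_t)((tick_usec + usec_per_pkt - 1) / usec_per_pkt));
}
```

With these numbers, queue_buffering_usec(512, 4) is 2048 usec (about
2 msec) and queue_buffering_usec(5000, 4) is 20000 usec (20 msec), i.e.
two 10 msec ticks at HZ = 100.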

(*) I think the sk tx ring is not really in hardware, so it can be
much larger than 512, but a length of > 5000 for it seems excessive
and caused panics when I tried it.

(**) Various bad ways can be found in various versions of ttcp and
tools/netrate.  They involve either backing off by sleeping (which
doesn't keep the tx active unless the sleep granularity is small
(which only happens under FreeBSD if HZ is too large)), or by never
backing off (which gives busy-waiting).  Instead, select() on the
output socket should actually work -- it should succeed if the tx
queue length is below a low watermark.  Apparently, select() on
output sockets normally doesn't work, since no version of ttcp that
I've looked at (not many) even tries this.
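
For illustration only, this is roughly what an application could do if
select() for output behaved as described above.  The helper name, the
retry limit and the 10 msec timeout are assumptions of this sketch.  On
a system where select() reports the socket writable while the interface
queue is still full, the loop degrades to the busy-waiting case.

```c
#include <sys/types.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper: send one datagram and, on ENOBUFS, wait in
 * select() for the socket to become writable before retrying.
 * Returns the send() result, or -1 with errno set on failure
 * (errno is ENOBUFS if the retries were exhausted).
 */
ssize_t
send_waiting_on_enobufs(int fd, const void *buf, size_t len, int max_retries)
{
	for (int attempt = 0; ; attempt++) {
		ssize_t n = send(fd, buf, len, 0);
		if (n >= 0 || errno != ENOBUFS)
			return (n);		/* success or unrelated error */
		if (attempt >= max_retries)
			return (-1);		/* errno is still ENOBUFS */

		fd_set wfds;
		FD_ZERO(&wfds);
		FD_SET(fd, &wfds);
		/* Assumed cap on each wait: one tick at HZ = 100. */
		struct timeval tv = { .tv_sec = 0, .tv_usec = 10000 };
		if (select(fd + 1, NULL, &wfds, NULL, &tv) < 0 &&
		    errno != EINTR)
			return (-1);		/* select() itself failed */
	}
}
```

A ttcp-style sender would call this in place of a bare send() and treat
a final ENOBUFS as a dropped packet.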

Bruce


