Date:      Tue, 1 Aug 2006 09:47:53 -0400 (EDT)
From:      Andrew Gallatin <gallatin@cs.duke.edu>
To:        Robert Watson <rwatson@FreeBSD.org>
Cc:        arch@FreeBSD.org, net@FreeBSD.org
Subject:   Re: Changes in the network interface queueing handoff model
Message-ID:  <17615.23433.918293.466584@grasshopper.cs.duke.edu>
In-Reply-To: <20060801142558.M64452@fledge.watson.org>
References:  <20060730141642.D16341@fledge.watson.org> <17615.18793.700752.342809@grasshopper.cs.duke.edu> <20060801142558.M64452@fledge.watson.org>


Robert Watson writes:
 > 
 > On Tue, 1 Aug 2006, Andrew Gallatin wrote:
 > 
 > > > - The ifnet send queue is a separately locked object from the device driver,
 > > >    meaning that for a single enqueue/dequeue pair, we pay an extra four lock
 > > >    operations (two for insert, two for remove) per packet.
 > >
 > > Going forward, especially now that we support sun4v CoolThreads hardware, 
 > > we're going to want to rethink the "single lock" per transmit routine model 
 > > that most drivers have.  The most expensive operation in transmit routines 
 > > is bus_dmamap_load_mbuf_sg(), especially when there is an IOMMU involved 
 > > (like on CoolThreads machines) and there is no reason why this needs to be 
 > > called with a driver's transmit lock held.  I have hard data (from Solaris) 
 > > about how much fine grained locking in a 10GbE driver's transmit routine 
 > > helps.
 > 
 > Right now, with the exception of locking for the ifnet dispatch queue, I 
 > believe our ifnet API pretty much leaves decisions about the nature and 
 > granularity of synchronization to the device driver author.  The ifnet queue 
 > is high on my list to address (hence this thread) -- are there any other parts 
 > of our device driver framework that stand in the way from a device driver 
 > being modified to support greater parallelism in sending?

No, nothing that is directly related to Ethernet drivers.

However, busdma is a pain.  Specifically, I hate that
bus_dmamap_load_mbuf_sg() requires a bus_dmamap_t.  That means that
any fine-grained driver will need to "allocate" a bus_dmamap_t either
via bus_dmamap_create(), or by pulling a pre-allocated bus_dmamap_t
from a pre-allocated pool.  Either will require a lock.  Solaris has a
similar problem, and I use the pool approach in my Solaris driver.
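To make the pool approach concrete, here is a rough userspace sketch of
the idea, not code from any real driver.  Everything in it is
hypothetical: struct dma_map stands in for a bus_dmamap_t that would
have been created once at attach time, POOL_SIZE is an arbitrary
number, and a pthread mutex stands in for the kernel mutex.  The point
is only that the pool lock covers the push/pop and nothing else, so
the expensive load of the mbuf would happen after pool_get() returns,
outside any lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define POOL_SIZE 4	/* hypothetical: e.g. one map per transmit slot */

/*
 * Stand-in for bus_dmamap_t (assumption for illustration only); a real
 * driver would keep the handle from bus_dmamap_create() here.
 */
struct dma_map {
	int id;
};

struct map_pool {
	pthread_mutex_t lock;			/* protects free[] and nfree only */
	struct dma_map  maps[POOL_SIZE];	/* pre-allocated maps */
	struct dma_map *free[POOL_SIZE];	/* stack of unused maps */
	size_t          nfree;
};

/* Pre-allocate every map once and mark all of them free. */
static void
pool_init(struct map_pool *p)
{
	pthread_mutex_init(&p->lock, NULL);
	for (size_t i = 0; i < POOL_SIZE; i++) {
		p->maps[i].id = (int)i;
		p->free[i] = &p->maps[i];
	}
	p->nfree = POOL_SIZE;
}

/* Pop a free map; returns NULL when the pool is exhausted. */
static struct dma_map *
pool_get(struct map_pool *p)
{
	struct dma_map *m = NULL;

	pthread_mutex_lock(&p->lock);
	if (p->nfree > 0)
		m = p->free[--p->nfree];
	pthread_mutex_unlock(&p->lock);
	return (m);
}

/* Push a map back once the transmit has completed. */
static void
pool_put(struct map_pool *p, struct dma_map *m)
{
	pthread_mutex_lock(&p->lock);
	assert(p->nfree < POOL_SIZE);	/* catch a double free of a map */
	p->free[p->nfree++] = m;
	pthread_mutex_unlock(&p->lock);
}
```

A transmit path built this way would call pool_get(), do the DMA load
with no lock held, and call pool_put() from its completion handler.
The lock is still there, which is the complaint above, but it is held
for a few instructions rather than for the whole mapping operation.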

Linux's pci_map_single()/pci_unmap_addr_set()/pci_unmap_len_set()
is just so much nicer to use...

Drew


