From owner-freebsd-current@FreeBSD.ORG Wed Jan 11 16:12:45 2012
Date: Wed, 11 Jan 2012 17:29:44 +0100
From: Luigi Rizzo
To: John Baldwin
Message-ID: <20120111162944.GB2266@onelab2.iet.unipi.it>
References: <20120110213719.GA92799@onelab2.iet.unipi.it>
    <20120110224100.GB93082@onelab2.iet.unipi.it>
    <201201111005.28610.jhb@freebsd.org>
In-Reply-To: <201201111005.28610.jhb@freebsd.org>
Cc: Adrian Chadd, freebsd-current@freebsd.org
Subject: Re: memory barriers in bus_dmamap_sync() ?
List-Id: Discussions about the use of FreeBSD-current

On Wed, Jan 11, 2012 at 10:05:28AM -0500, John Baldwin wrote:
> On Tuesday, January 10, 2012 5:41:00 pm Luigi Rizzo wrote:
> > On Tue, Jan 10, 2012 at 01:52:49PM -0800, Adrian Chadd wrote:
> > > On 10 January 2012 13:37, Luigi Rizzo wrote:
> > > > I was glancing through manpages and implementations of bus_dma(9)
> > > > and i am a bit unclear on what this API (in particular, bus_dmamap_sync() )
> > > > does in terms of memory barriers.
> > > >
> > > > I see that the x86/amd64 and ia64 code only does the bounce buffers.
>
> That is because x86 in general does not need memory barriers. ...

maybe they are not called memory barriers but for instance
how do i make sure, even on the x86, that a write to
the NIC ring is properly flushed before the write to the 'start'
register occurs ?

Take for instance the following segment from

    head/sys/ixgbe/ixgbe.c::ixgbe_xmit() :

        txd->read.cmd_type_len |=
            htole32(IXGBE_TXD_CMD_EOP | IXGBE_TXD_CMD_RS);
        txr->tx_avail -= nsegs;
        txr->next_avail_desc = i;

        txbuf->m_head = m_head;
        /* Swap the dma map between the first and last descriptor */
        txr->tx_buffers[first].map = txbuf->map;
        txbuf->map = map;
        bus_dmamap_sync(txr->txtag, map, BUS_DMASYNC_PREWRITE);

        /* Set the index of the descriptor that will be marked done */
        txbuf = &txr->tx_buffers[first];
        txbuf->eop_index = last;

        bus_dmamap_sync(txr->txdma.dma_tag, txr->txdma.dma_map,
            BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
        /*
         * Advance the Transmit Descriptor Tail (Tdt), this tells the
         * hardware that this frame is available to transmit.
         */
        ++txr->total_packets;
        IXGBE_WRITE_REG(&adapter->hw, IXGBE_TDT(txr->me), i);

the descriptor is allocated without any caching constraint, the
bus_dmamap_sync() are effectively NOPs on i386 and amd64, and
IXGBE_WRITE_REG has no implicit guard.

> We could use lfence/sfence on amd64, but on i386 not all processors support

ok then we can make it machine-specific versions... this is
kernel code so we do have a list of supported CPUs.

> those. The broken drivers doing it by hand don't work on early i386 CPUs.
> Also, I personally don't like using membars like rmb() and wmb() by hand.
> If you are operating on normal memory I think atomic_load_acq() and
> atomic_store_rel() are better.

is it just a matter of names ?

My complaint was mostly on how many unused parameters you need to
pass to bus_space_barrier(). They make life hard for both the
programmer and the compiler, which might become unable to optimize
them out.
I understand that more parameters may help parallelism, but i wonder if
it is worth the effort.

cheers
luigi
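
The ordering concern raised in this message (the descriptor contents
must be visible before the tail register is written) can be modelled
in portable C11. What follows is only a userspace sketch under stated
assumptions, not driver code: struct tx_desc, struct tx_ring,
RING_SIZE and tx_ring_post() are hypothetical names, the "doorbell" is
an ordinary atomic variable standing in for the NIC tail register, and
the release store plays the role that atomic_store_rel() plays for
normal memory in the quoted text. For a real device register a bus
barrier would still be needed; a C11 release store is not claimed to
be a substitute for one.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u    /* hypothetical ring size, must be a power of two */

/* Hypothetical, heavily simplified transmit descriptor. */
struct tx_desc {
    uint64_t addr;
    uint32_t cmd_type_len;
};

struct tx_ring {
    struct tx_desc desc[RING_SIZE]; /* stands in for the DMA descriptor ring */
    _Atomic uint32_t tail_doorbell; /* stands in for the NIC tail register */
    uint32_t next_avail;            /* next free slot, owned by the producer */
};

/*
 * Fill one descriptor, then publish the new tail index.
 * Returns the tail value that was written to the doorbell.
 */
static uint32_t
tx_ring_post(struct tx_ring *r, uint64_t addr, uint32_t cmd_type_len)
{
    uint32_t i = r->next_avail;

    assert(i < RING_SIZE);

    /* Plain stores: on their own these carry no ordering guarantee. */
    r->desc[i].addr = addr;
    r->desc[i].cmd_type_len = cmd_type_len;

    i = (i + 1) & (RING_SIZE - 1);
    r->next_avail = i;

    /*
     * Release store: neither the compiler nor the CPU may move the
     * descriptor stores above past this store. A consumer that reads
     * the doorbell with acquire semantics sees a complete descriptor.
     */
    atomic_store_explicit(&r->tail_doorbell, i, memory_order_release);
    return i;
}
```

The point of the sketch is that the ordering is attached to the single
store that publishes the work, so the caller passes no extra
tag/handle/offset/length arguments just to express "writes before this
one must be done first".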