From owner-freebsd-current@FreeBSD.ORG Thu Jul 12 18:11:13 2012 Return-Path: Delivered-To: current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id CAC2D106564A for ; Thu, 12 Jul 2012 18:11:13 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id 73CD58FC0C for ; Thu, 12 Jul 2012 18:11:13 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id BE5A8B91A; Thu, 12 Jul 2012 14:11:12 -0400 (EDT) From: John Baldwin To: Scott Long Date: Thu, 12 Jul 2012 13:51:20 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p17; KDE/4.5.5; amd64; ; ) References: <201207121040.27116.jhb@freebsd.org> <1342111033.198.YahooMailNeo@web45716.mail.sp1.yahoo.com> In-Reply-To: <1342111033.198.YahooMailNeo@web45716.mail.sp1.yahoo.com> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201207121351.20951.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Thu, 12 Jul 2012 14:11:12 -0400 (EDT) Cc: Peter Jeremy , "current@freebsd.org" Subject: Re: Adding support for WC (write-combining) memory to bus_dma X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 12 Jul 2012 18:11:13 -0000 On Thursday, July 12, 2012 12:37:13 pm Scott Long wrote: > > ----- Original Message ----- > > From: John Baldwin > > To: current@freebsd.org > > Cc: scottl@freebsd.org; Peter Jeremy > > Sent: Thursday, July 12, 2012 7:40 AM > > Subject: Adding support for WC (write-combining) memory to bus_dma > > > > I have a need to allocate static DMA memory via bus_dmamem_alloc() that is > > also WC (for a PCI-e device so it can use "nosnoop" transactions). > > This is > > similar to what the nvidia driver needs, but in my case it is much cleaner to > > allocate the memory via bus dma since the existing code I am extending all > > uses busdma. > > > > I have a patch to implement this on 8.x for amd64 that I can port to HEAD if > > folks don't object. What I would really like to do is add a new paramter to > > > > bus_dmamem_alloc() to specify the memory attribute to use, but I am hesitant > > to break that API. Instead, I added a new flag similar to the existing > > BUS_DMA_NOCACHE used to allocate UC memory. > > > > Please don't add new parameters. Now that I'm carefully documenting the > evolution of the APIs, it's becoming glaringly apparent how sloppy we are > with API design and interface compatibility. I'm just as guilty of it as anyone, > but I'd really like to see less instances of call signature changes in existing > functions; they make driver maintenance tedious and are hard to effectively > document. Some options I can think of: > > 1. create bus_dmamem_alloc_attr(). I don't really like leafy API growth like > this either, but it's not a horrible solution. I would actually not oppose this either. In this particular case it is a bit useful as I think the user shouldn't have to explicitly state a default if they don't need a non-standard attribute. Side-band comment: if I were going to change the API of bus_dmamem_alloc(), my biggest request would be a bus_dmamem_alloc_size() that takes the size to allocate as a parameter rather than pulling the size out of the tag. I always think of a tag as describing a DMA engine's capabilities and restrictions, whereas the size of, say, a descriptor ring is often a software-configurable knob (e.g. hw.igb.nrxd) and thus the allocation size isn't really a property of the DMA engine. I also find this to be the least intuitive behavior in the bus DMA API. But that is an entirely different ball of wax. > 2. There are existing placeholder flags, BUS_DMA_BUS[1234] that could be > aliased and repurposed to hold 4 bits of attribute information for this function. > The 3 and 4 variants are already in use, but I haven't looked closely to see > their scope. Humm, I could do that. In practice WC is only really applicable on x86. Also, to be honest, I doubt anyone will ever use special attributes besides UC and WC for DMA on x86. That is the main reason why I just added a new flag and didn't try to add a generic scheme for specifying the memory attribute. I could easily just move BUS_DMA_WRITE_COMBINING into x86's and have it use one of the free placeholder flags. That is probably the simplest short term solution. > 3. Reserve the top 16 bits of the flags for attribute information. > 4. Move the attribute information into the tag and create new setter/getter > accessors for attribute information. This would probably be the cleanest, > though it breaks the existing sloppiness of allowing different pseudo-attributes > for different allocations under the same tag. I've wanted to break down the > existing bus_dma_tag_create() into finer-grained setter/getters for a while in > any case. > 5. Move the attribute information into the map and force everyone to start > creating maps for static memory allocations. This would actually add some > missing uniformity to the API and might actually be cleaner that option 4. > > > While doing this, I ran into an old bug, which is that if you were to call > > bus_dmamem_alloc() with BUS_DMA_NOCACHE but a tag that otherwise fell through > > to using malloc() instead of contigmalloc(), bus_dmamem_alloc() would actually > > change the state of the entire page. This seems wrong. Instead, I think that > > any request for a non-default memory attribute should always use > > contigmalloc(). In fact, even better is to call kmem_alloc_contig() directly > > rather than using contigmalloc(). However, if you change this, then > > bus_dmamem_free() won't always DTRT as it doesn't have enough state to > > know if > > a small allocation should be free'd via free() or contigfree() (the latter > > would be required if it used a non-default memory attribute). The fix I used > > for this was to create a new dummy dmamap that is returned by bus_dmamem_alloc > > if it uses contigmalloc(). bus_dmamem_free() then checks the passed in map > > pointer to decide which type of free to perform. Once this is fixed, the > > actual WC support is rather trivial as it merely consists of passing a > > different argument to kmem_alloc_contig(). > > Yup, this is a problem, and I like your fix; this kind of state is exactly what > belongs in the map. Why don't I break out the fix first as a separate patch. -- John Baldwin