Date:      Wed, 11 Jun 2008 11:26:32 -0700
From:      John-Mark Gurney <jmg@funkthat.com>
To:        Jason Harmening <jason.harmening@gmail.com>
Cc:        freebsd-hackers@freebsd.org, freebsd-drivers@freebsd.org
Subject:   Re: bus_dmamem_alloc
Message-ID:  <20080611182632.GT3767@funkthat.com>
In-Reply-To: <200806072341.17041.jason.harmening@gmail.com>
References:  <200806072341.17041.jason.harmening@gmail.com>

Jason Harmening wrote this message on Sat, Jun 07, 2008 at 23:41 -0500:
> I've written a FreeBSD driver for Conexant CX2388x-based PCI TV capture cards.  

I have a partial one in P4 that seems to handle data transfers fine, but
as ATI never gave me the docs for programming the ATSC demodulator, I
haven't worked on it in a long time.

> Of course the driver uses busdma to be as machine-independent as possible.  
> One problem I've encountered is that bus_dmamem_alloc is inadequate for my 
> needs.  The CX2388x only understands 32-bit physical addresses, and the driver 

This restriction is set up by the dma tag...
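
To illustrate how a tag expresses this restriction: in busdma, the tag's
lowaddr and highaddr arguments describe the window of addresses the device
cannot reach, namely everything greater than lowaddr and less than or equal
to highaddr. The following is only a simplified userspace model of that
check, not kernel code; the struct, macro, and function names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified userspace model of a DMA tag's address restriction.
 * All names here are hypothetical; this is not the kernel API.
 */
struct model_dma_tag {
	uint64_t lowaddr;	/* highest address the device can reach */
	uint64_t highaddr;	/* top of the window the device cannot reach */
};

#define	MODEL_MAXADDR_32BIT	0xFFFFFFFFULL
#define	MODEL_MAXADDR		UINT64_MAX

/*
 * Return true when paddr lies in the excluded window (lowaddr, highaddr],
 * meaning the device cannot address it directly and the data would have
 * to be bounced through memory the device can reach.
 */
static bool
model_needs_bounce(const struct model_dma_tag *tag, uint64_t paddr)
{
	assert(tag != NULL);
	return (paddr > tag->lowaddr && paddr <= tag->highaddr);
}
```

With lowaddr set to the 32-bit limit and highaddr set to the maximum
address, any page above 4 GB falls in the excluded window, which is the
situation that leads to bounce buffering on non-IOMMU machines.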

> has two separate use cases for busdma:
> 
> 1)  Data buffers: These buffers may be relatively large (a 640x480 RGB32 video 
> frame is ~1.2M), and therefore it is desirable that these buffers not be 
> physically contiguous.
> 
> 2)  DMA program buffers:  The DMA engine on the CX2388x is controlled by 
> special-purpose RISC instructions, usually stored in host memory, that 
> provide information on, among other things, the physical layout of the data 
> buffers, which enables handling of non-contiguous data buffers.  These 
> programs are rarely more than a few pages in size, so for the sake of 
> simplicity it is desirable that DMA program buffers be physically contiguous.

Why not use the SRAM for this?  That's what my driver does.  With 32k of
SRAM, there's more than enough room for most programs.
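
For a sense of scale on the data buffers in case 1), the arithmetic behind
the "~1.2M" figure can be sketched as below. The 4096-byte page size used
in the note that follows is an assumption (it varies by platform), and the
helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes in one uncompressed video frame. */
static size_t
frame_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
	return (width * height * bytes_per_pixel);
}

/*
 * Number of pages touched by a buffer of len bytes that begins at byte
 * `offset` within its first page.  Each page may be a separate physical
 * segment when the buffer is not physically contiguous.
 */
static size_t
pages_spanned(size_t offset, size_t len, size_t page_size)
{
	assert(page_size > 0 && offset < page_size);
	if (len == 0)
		return (0);
	return ((offset + len + page_size - 1) / page_size);
}
```

A 640x480 RGB32 frame (4 bytes per pixel) is 1,228,800 bytes. Assuming
4096-byte pages, that is 300 pages when the buffer is page-aligned and 301
when it is not, so a DMA program for a non-contiguous frame buffer may have
to describe that many physical segments.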

> For case 1), I malloc(9) the buffers and then feed them to busdma, since on 
> most machines bus_dmamem_alloc just calls contigmalloc.  Use of malloc(9) is 
> suboptimal as it may result in bounce buffering for non-IOMMU machines with 
> large amounts of RAM.

I prefer to do direct-to-user DMA, as it saves on allocating a buffer in
the kernel and then copying the data from that buffer.

> For case 2), I contigmalloc the DMA program buffers in the 32-bit physical 
> address range and then feed them to busdma.  I don't use bus_dmamem_alloc 
> here because it always allocates the maximum size specified in the 
> bus_dma_tag_t.  Since the driver supports dynamic data buffer allocation and 
> DMA program generation, DMA program sizes may vary significantly.  I 
> therefore just create the bus_dma_tag_t with the maximum possible size for a 
> DMA program buffer since I'd prefer not to have to re-create the tag every 
> time the DMA program size changes.  But always allocating the maximum buffer 
> size would be a huge waste of contiguous memory, so bus_dmamem_alloc is out 
> of the question here too.   At the same time, use of contigmalloc is 
> suboptimal as it may not be necessary to restrict the allocation to 32-bit 
> physical addresses on IOMMU-equipped machines.  This is something that 
> bus_dmamem_alloc could take care of, if only it supported a size parameter 
> (as I believe the NetBSD version does).
> 
> So I have 3 questions:
> 
> 1)  Would it be possible to provide a bus_dmamem_alloc overload that takes a 
> size parameter?   We could call it bus_dmamem_alloc_size and have 
> bus_dmamem_alloc just call bus_dmamem_alloc_size with dmat->maxsize to 
> preserve source-level compatibility with existing drivers.

It would be nice, but hasn't been something someone has gotten around to
implementing yet...
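
The shape of the wrapper proposed in question 1) could look roughly like
the sketch below. It is a userspace mock, not the busdma implementation:
the mock_ names and the struct are hypothetical, calloc() stands in for
whatever allocator the kernel would use, and the real functions take
further arguments (flags, a map pointer) that are omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a DMA tag; only maxsize matters here. */
struct mock_dma_tag {
	size_t maxsize;		/* largest allocation the tag permits */
};

/*
 * Sketch of the proposed size-taking variant.  Rejects requests that are
 * empty or exceed the tag's maxsize.  Returns 0 on success, -1 on failure.
 */
static int
mock_dmamem_alloc_size(const struct mock_dma_tag *dmat, void **vaddr,
    size_t size)
{
	assert(dmat != NULL && vaddr != NULL);
	if (size == 0 || size > dmat->maxsize)
		return (-1);
	*vaddr = calloc(1, size);	/* stand-in allocator */
	return (*vaddr != NULL ? 0 : -1);
}

/*
 * The existing entry point keeps its behavior by always requesting the
 * tag's maximum size, so current callers would not need to change.
 */
static int
mock_dmamem_alloc(const struct mock_dma_tag *dmat, void **vaddr)
{
	return (mock_dmamem_alloc_size(dmat, vaddr, dmat->maxsize));
}
```

A driver with variable-size DMA programs could then create one tag with
the largest program size it will ever need and request only the bytes
each program actually requires.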

> 2) Are there currently any serious plans to have bus_dmamem_alloc perform 
> multi-segment allocations on non-IOMMU machines?  It looks like NetBSD does 
> this by reserving the physical segments and then stitching them together into 
> a virtually contiguous range.  Is something like this feasible for FreeBSD?

This would be useful for large allocations, but for now our code works,
and most IO isn't that large, so it hasn't been a big issue.  It would be
nice though. :)

> 3) Are there currently any serious plans to support IOMMUs on anything besides 
> Sun machines?  The AMD AGP GART, PowerPC 970 DART, and Intel VT-d and AMD 
> IOMMU all come to mind.

I know that one person recently was working on Intel's VT IOMMU and I
thought it was close to being committed, but I haven't been following
the work...

> If any of these ideas sound feasible, I'd be more than willing to help 
> research/implement/test them.

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."


