From owner-freebsd-usb@FreeBSD.ORG Wed Jan 14 05:12:28 2015 Return-Path: Delivered-To: freebsd-usb@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 6F525C35; Wed, 14 Jan 2015 05:12:28 +0000 (UTC) Received: from mail.turbocat.net (heidi.turbocat.net [88.198.202.214]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 115CB63B; Wed, 14 Jan 2015 05:12:27 +0000 (UTC) Received: from laptop015.home.selasky.org (cm-176.74.213.204.customer.telag.net [176.74.213.204]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.turbocat.net (Postfix) with ESMTPSA id 52C701FE022; Wed, 14 Jan 2015 06:12:24 +0100 (CET) Message-ID: <54B5FAE8.7020705@selasky.org> Date: Wed, 14 Jan 2015 06:13:12 +0100 From: Hans Petter Selasky User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0 MIME-Version: 1.0 To: Ian Lepore Subject: Re: usb_pc_cpu_flush References: <1419359192795-5975583.post@n5.nabble.com> <5499E734.1070507@selasky.org> <1419392511197-5975691.post@n5.nabble.com> <549A811D.3060204@selasky.org> <1419416870924-5975752.post@n5.nabble.com> <1419423740820-5975763.post@n5.nabble.com> <549AB711.8070005@selasky.org> <1419431704871-5975773.post@n5.nabble.com> <549BF430.8000207@selasky.org> <1419877515606-5976832.post@n5.nabble.com> <1421133295061-5980199.post@n5.nabble.com> <1421160576.14601.175.camel@freebsd.org> <54B53956.4090708@selasky.org> <1421163656.14601.184.camel@freebsd.org> <54B54073.6000409@selasky.org> <1421166591.14601.195.camel@freebsd.org> <54B54D4D.3010805@selasky.org> <1421180700.14601.209.camel@freebsd.org> In-Reply-To: <1421180700.14601.209.camel@freebsd.org> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Cc: kott , freebsd-usb@freebsd.org X-BeenThere: freebsd-usb@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: FreeBSD support for USB List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Jan 2015 05:12:28 -0000 On 01/13/15 21:25, Ian Lepore wrote: > On Tue, 2015-01-13 at 17:52 +0100, Hans Petter Selasky wrote: >> On 01/13/15 17:29, Ian Lepore wrote: >>> On Tue, 2015-01-13 at 16:57 +0100, Hans Petter Selasky wrote: >>>> On 01/13/15 16:40, Ian Lepore wrote: >>>>> On Tue, 2015-01-13 at 16:27 +0100, Hans Petter Selasky wrote: >>>>>> On 01/13/15 15:49, Ian Lepore wrote: >>>>>>> On Tue, 2015-01-13 at 00:14 -0700, kott wrote: >>>>>>>> Yes with cache disabled, this problem is not seen. Seems to be with a issue >>>>>>>> with l2 cache. >>>>>>>> Thanks kott >>>>>>> >>>>>>> Except that there are no known problems with l2 cache on armv7 right >>>>>>> now. There are known problems with the USB driver using the busdma >>>>>>> routines incorrectly, which accidentally works okay on x86 platforms but >>>>>>> likely not so well on others. >>>>>>> >>>>>> >>>>>> Hi, >>>>>> >>>>>> If there is a problem it is in "usb_pc_cpu_flush()" or >>>>>> "usb_pc_cpu_invalidate()": >>>>>> >>>>>> void >>>>>> usb_pc_cpu_flush(struct usb_page_cache *pc) >>>>>> { >>>>>> if (pc->page_offset_end == pc->page_offset_buf) { >>>>>> /* nothing has been loaded into this page cache! */ >>>>>> return; >>>>>> } >>>>>> bus_dmamap_sync(pc->tag, pc->map, BUS_DMASYNC_PREWRITE); >>>>>> } >>>>>> >>>>>> USB has a very simple DMA sync language, either flush or invalidate. >>>>>> These are used correctly from what I can see with regard to the FreeBSD >>>>>> USB specification. >>>>>> (unless a simple restart somehow fixes the problem) >>>>>> If the "usb_pc_cpu_flush()" function does not cause the CPU cache to be >>>>>> written to RAM before the function returns, please let me know. >>>>>> >>>>>> --HPS >>>>> >>>>> You have an incomplete concept of how busdma sync operations work. It >>>>> isn't a simple "cpu cache written to ram" operation, there are bounce >>>>> buffers and other complexities involved that require that the sync >>>>> operations be done at the correct time in the correct order, and the >>>>> current usb driver doesn't do that. Instead it does things like >>>>> >>>>> bus_dmamap_sync(pc->tag, pc->map, BUS_DMASYNC_POSTREAD); >>>>> bus_dmamap_sync(pc->tag, pc->map, BUS_DMASYNC_PREREAD); >>>>> >>>>> And that's just nonsense that will lead to problems like delivering >>>>> random buffer garbage to/from a device. >>>>> >>>>> To the degree that USB works at all on non-x86 platforms it works by >>>>> accident. Usually. Except when it doesn't. >>>>> >>>>> -- Ian >>>>> >>>> >>>> Hi, >>>> >>>> Bounce buffers are perfectly fine with USB as long as the busdma does >>>> what it is told. If there is no easy way to do a simple "cache flush" or >>>> "cache invalide" or bounce buffer "flush" or bounce buffer "invalidate" >>>> multiple times in a row, then busdma cannot co-exist with USB. It is not >>>> because I'm stubborn, but because of the way USB DMA controllers are >>>> designed. >>>> >>>> With USB chipsets we sometimes need to read the RAM area for a single >>>> buffer multiple times to poll for updates. From what I've been told in >>>> the past BUSDMA does. >>>> >>>> --HPS >>>> >>>> --HPS >>>> >>>> --HPS >>>> >>> >>> And so we reach the same old impasse, where you declare that USB is >>> special and doesn't have to behave like other drivers, even though it is >>> in no way unique in terms of having things like concurrent shared access >>> to descriptor memory, something that virtually every modern NIC has. >>> >> >> Hi, >> >> Can you give an example of a NIC driver which you consider a good >> example which use DMA both for data (not only mbufs) and the control >> path and use busdma as you consider to be correct? >> >> --HPS > > dev/ffec/if_ffec.c. I'm not happy with the fact that it takes two calls > (a PRE and a POST) to accomplish a single action, but that's the right > way to do things in the current busdma world, PRE and POST operations > need to be paired. > > I think we need new busdma support for shared concurrent access > descriptors, because it's a type of dma that just isn't supported well > by the existing API. There should be a new flag for bus_dmamem_alloc() > that indicates the memory is going to be used for such shared access > (because some platforms may be able to provide memory that's mapped > optimally for such situations), and there should be new sync ops that > don't require a pair of calls to accomplish a single action. > > All of this is in the context of shared descriptor memory. Regular IO > buffers should just use the proper sequence of PRE and POST syncs (and > most especially should *never* do POSTREAD before PREREAD like the > current usb_pc_cpu_invalidate() does, because with bounce buffers that > will just not work right). > Hi, I understand and that can be done for IO-buffers like in the FFEC driver. For other buffers it is a bigger task, however: I see that some memory is allocated using "BUS_DMA_COHERENT" in if_ffec.c and in those cases no busdma sync operations are performed! Is that correct? For example "rxdesc_ring" is allocated coherently and no cache sync ops are done before and after access. The buffer that Mr. Kott hit a crash on should also be allocated coherently. Looking at the ARM busdma code in -current I see a possible bug: > int > _bus_dmamap_load_buffer(bus_dma_tag_t dmat, bus_dmamap_t map, void *buf, > bus_size_t buflen, struct pmap *pmap, int flags, bus_dma_segment_t *segs, > int *segp) > { > bus_size_t sgsize; > bus_addr_t curaddr; > struct sync_list *sl; > vm_offset_t vaddr = (vm_offset_t)buf; > int error = 0; > > if (segs == NULL) > segs = dmat->segments; > if ((flags & BUS_DMA_LOAD_MBUF) != 0) > map->flags |= DMAMAP_CACHE_ALIGNED; > > if ((dmat->flags & BUS_DMA_COULD_BOUNCE) != 0) { > _bus_dmamap_count_pages(dmat, map, pmap, buf, buflen, flags); > if (map->pagesneeded != 0) { > error = _bus_dmamap_reserve_pages(dmat, map, flags); > if (error) > return (error); > } > } > CTR3(KTR_BUSDMA, "lowaddr= %d boundary= %d, " > "alignment= %d", dmat->lowaddr, dmat->boundary, dmat->alignment); > > while (buflen > 0) { > /* > * Get the physical address for this segment. > */ > if (__predict_true(pmap == kernel_pmap)) { > curaddr = pmap_kextract(vaddr); > } else { > curaddr = pmap_extract(pmap, vaddr); > map->flags &= ~DMAMAP_COHERENT; > } If the memory was allocated coherently and we clear the DMAMAP_COHERENT flag when we are loading the memory, the memory will get freed to the wrong bucket when we later free it - right? > if (map->flags & DMAMAP_COHERENT) > ba = coherent_allocator; > else > ba = standard_allocator; > Kott: Can you try to comment out: map->flags &= ~DMAMAP_COHERENT; In "sys/arm/arm/busdma_machdep.c" and see if it makes any difference? Ian: Can you explain what the check "pmap == kernel_pmap" means when it is true? Does it mean that the memory is mapped into virtual kernel space? And if false? To optimise BUSDMA allocations the USB stack will allocate a PAGE_SIZE object and then load smaller parts of the PAGE_SIZE allocation using BUSDMA. Is this supported by BUSDMA? --HPS