Date:      Thu, 16 Apr 2009 16:08:03 -0600
From:      Scott Long <scottl@samsco.org>
To:        John Baldwin <jhb@freebsd.org>
Cc:        Damian Gerow <dgerow@afflictions.org>, freebsd-current@freebsd.org, Richard Todd <rmtodd@ichotolot.servalan.com>
Subject:   Re: ZFS checksum errors on umass(4) insertion
Message-ID:  <49E7AC43.8050901@samsco.org>
In-Reply-To: <200904161804.42399.jhb@freebsd.org>
References:  <49BD117B.2080706@163.com> <200904161748.08402.jhb@freebsd.org> <49E7A8DF.9080902@samsco.org> <200904161804.42399.jhb@freebsd.org>

John Baldwin wrote:
> On Thursday 16 April 2009 5:53:35 pm Scott Long wrote:
>> John Baldwin wrote:
>>> On Thursday 16 April 2009 5:32:52 pm Scott Long wrote:
>>>> John Baldwin wrote:
>>>>> Can you please try http://www.FreeBSD.org/~jhb/patches/dma_pg.patch?
>>>>> This lines up with your analysis in that it fixes a problem in the
>>>>> bounce buffer code that was introduced with the new USB stack (and
>>>>> only triggers when the USB code has to use a bounce buffer).
>>>>>
>>>> As a data point, most normal I/O is not going to trigger this bug, even
>>>> if it gets bounced.  I/O using O_DIRECT can, and GEOM discovery I/O can
>>>> as well.  Since memory is allocated from the top of the system, I think
>>>> that the damage gets done early during boot, and then propagates out
>>>> over time as the system becomes busier.
>>> Hmm, are you sure regular I/O won't trigger it as well?  All it takes is
>>> for any USB transfer that starts off within a page to get a page into a
>>> non-zero offset and later have a request >= PAGE_SIZE bounce.  Since the
>>> VM is going to always ask for I/O to pages (e.g. GET/PUTPAGES) normal
>>> disk I/O would break if it uses the bad bounce page I think.
>>>
>> Sorry, I knew what I meant but didn't say it that well.  Once it gets 
>> triggered, it poisons that bounce page from thereon out, and any I/O 
>> will be affected.  But the only I/O that will typically trigger it is 
>> GEOM scanning and O_DIRECT.
> 
> Ah, ok.  Actually, couldn't any USB request trigger it (so long as the source 
> buffer isn't page aligned?)  E.g. if you malloc()'d a 16-byte buffer and used 
> it to receive some descriptor for the keyboard driver and malloc() used a 
> backing page > 4GB, wouldn't that bounce and end up breaking a bounce page?
> 

It could, and if that's happening, then some proactive measures should
be taken by the USB code to generate buffers that won't bounce.  It
would suck to have to go through the bounce code for every single
keystroke or mouse movement.  I haven't looked at the new USB code,
though, and I thought that the old code did a contigmalloc of some
sort to avoid this.

Scott
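
For readers following the thread, the failure mode discussed above can be
sketched as a toy model.  This is not the FreeBSD busdma code and not the
contents of dma_pg.patch; every name here (toy_bounce_page, toy_assign,
toy_release, and so on) is hypothetical, and the 4096-byte page size and
the 4GB addressing limit used in the example are assumptions.  It only
illustrates how a bounce page that keeps a stale non-zero offset could
make a later PAGE_SIZE request run past the end of the page.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u                 /* assumed page size */
#define TOY_PAGE_MASK (TOY_PAGE_SIZE - 1u)

/* Hypothetical bounce page: 'base' is page aligned, 'cur' is the
 * address currently handed out for a transfer. */
struct toy_bounce_page {
    uint64_t base;
    uint64_t cur;
};

/* True if any byte of [paddr, paddr + len) lies above the highest
 * address the device can reach (lowaddr), so the buffer must bounce. */
static bool toy_needs_bounce(uint64_t paddr, size_t len, uint64_t lowaddr)
{
    return len > 0 && paddr + len - 1 > lowaddr;
}

/* Hand the bounce page to a transfer, preserving the source buffer's
 * offset within its page by OR-ing it into 'cur'. */
static void toy_assign(struct toy_bounce_page *bp, uint64_t src_paddr)
{
    bp->cur |= src_paddr & TOY_PAGE_MASK;
}

/* Return the bounce page.  With reset_offset == false the old offset
 * stays behind, which is the "poisoned" state in this model. */
static void toy_release(struct toy_bounce_page *bp, bool reset_offset)
{
    if (reset_offset)
        bp->cur = bp->base;
}

/* Number of bytes by which a transfer of 'len' bytes starting at
 * 'cur' runs past the end of the bounce page. */
static size_t toy_overrun(const struct toy_bounce_page *bp, size_t len)
{
    uint64_t end = bp->cur + len;
    uint64_t page_end = bp->base + TOY_PAGE_SIZE;
    return end > page_end ? (size_t)(end - page_end) : 0;
}
```

In this model, a 16-byte buffer whose physical address sits above 4GB at
page offset 0x10 bounces and leaves that offset in the page if it is not
reset on release.  A later page-aligned, PAGE_SIZE request reusing the
page would then overrun it by 16 bytes, while resetting the offset on
release avoids the overrun.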



