Date:      Tue, 12 Nov 2013 02:32:25 -0000
From:      "Steven Hartland" <smh@freebsd.org>
To:        <hartzell@alerce.com>, <freebsd-stable@FreeBSD.org>, "Andriy Gapon" <avg@FreeBSD.org>
Cc:        re@freebsd.org
Subject:   Re: Help with filing a [maybe] ZFS/mmap bug.
Message-ID:  <4EB902F80CE84DD2BF36C85EF4CE8EF8@multiplay.co.uk>
References:  <20967.760.95825.310085@gargle.gargle.HOWL> <51E80B30.1090004@FreeBSD.org> <20968.10645.880772.30501@gargle.gargle.HOWL> <520202E5.30300@FreeBSD.org> <20994.55913.93606.436124@gargle.gargle.HOWL> <FEE7BDCF7F494EE1BA0BE9424275AA91@multiplay.co.uk> <21111.12085.958991.356982@gargle.gargle.HOWL>


----- Original Message ----- 
From: "George Hartzell" <hartzell@alerce.com>
> > > Andriy Gapon writes:
> > > > on 18/07/2013 20:44 George Hartzell said the following:
> > > > > Andriy Gapon writes:
> > > > >  > on 17/07/2013 23:47 George Hartzell said the following:
> > > > >  > > How should I move forward with this?
> > > > >  > 
> > > > >  > Could you please try to reproduce this problem using a kernel built with
> > > > >  > INVARIANTS options?
> > > > > 
> > > > > I added INVARIANT_SUPPORT and INVARIANTS options to the GENERIC
> > > > > kernel, rebuilt it, installed it and running through my "test case"
> > > > > > generated a lot of invalid flac files.  I'm not sure what the options
> > > > > are/were supposed to do though, it looks like they generally lead to
> > > > > KASSERTS, which lead to abort()'s.  Nothing in /var/log/messages or on
> > > > > the console.
> > > > 
> > > > George,
> > > > 
> > > > do you have anything new on this issue?
> > > 
> > > Since the message that you quoted I narrowed down my "test case"
> > > somewhat but I have not yet produced a stand-alone tool that
> > > reproduces it (you still have to go through picard et al.).
> > > 
> > > > Could you please try the following patch?
> > > > http://people.freebsd.org/~avg/zfs-putpages.diff
> > > > 
> > > > I expect it to not really fix the issue, but it may help to narrow it down.
> > > > Please keep INVARIANTS.
> > > 
> > > Absolutely.  Probably not until the weekend, but I'll give it a go.
> > > 
> > > Thanks for following up.
> > 
> > Did you manage to make any progress with this?
> > 
> > We're seeing a problem where rrdcached corrupts rrd files and, remembering
> > this thread and knowing that it uses mmap and that we're on ZFS, I was
> > wondering if this may be the cause of this issue too.
> > 
> > I've just recompiled rrdtool without mmap support and am clearing down
> > all corrupted files but it would be good to know if any progress was
> > made on this?
> > 
> >     Regards
> >     Steve
> 
> I was able to recreate the problem on a 10-BETA-something-or-other
> recently (I'd only been using 9 up until then).  Andriy's patches
> didn't make a difference.  I haven't heard anything since reporting
> back to him.

I've pretty much confirmed that mmap support is causing the corruption
when running rrdcached: since rebuilding with mmap disabled I've had no
further corruption.
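
For anyone wanting to check their own setup, here is a minimal sketch of
an mmap write followed by a plain read() read-back comparison. It is in
Python rather than the C that rrdtool uses; the function name, file size
and fill pattern are made up for illustration, and a single sequential
pass like this is not known to reproduce the problem. It only shows the
shape of the check.

```python
import mmap
import os


def mmap_roundtrip_ok(path, size=1 << 20, pattern=b"\xa5"):
    """Fill a file through a shared mapping, then re-read it with read().

    Hypothetical helper: `size` and the single-byte `pattern` are
    arbitrary illustration values, not taken from rrdcached.
    Returns True when the data read back matches what was stored.
    """
    expected = pattern * size
    with open(path, "w+b") as f:
        f.truncate(size)  # the file must have its final size before mapping
        with mmap.mmap(f.fileno(), size, access=mmap.ACCESS_WRITE) as m:
            m[:] = expected  # store through the mapping, not write()
            m.flush()        # ask the OS to write the dirty pages back
        os.fsync(f.fileno())
    with open(path, "rb") as f:  # fresh descriptor, ordinary read() path
        return f.read() == expected
```

Running it in a loop against a file on the ZFS dataset in question, and
against one on UFS for comparison, would be the obvious way to use it.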

@George when you got corruption, what did the files look like? I ask
because here I see lots of zeros, as though the file size was correct
but the contents were pretty much blanked.
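
To compare notes on that symptom, a rough sketch of a check for files
that have the right size but are mostly zeros. The function names and
the 0.99 threshold are assumptions picked for illustration; a healthy
rrd file can legitimately contain runs of zeros, so treat a hit as a
hint rather than proof of corruption.

```python
import os


def zero_fraction(path, chunk_size=1 << 16):
    """Return the fraction of bytes in `path` that are 0x00 (0.0 if empty)."""
    total = 0
    zeros = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
            zeros += chunk.count(b"\x00")
    return zeros / total if total else 0.0


def looks_blanked(path, threshold=0.99):
    """Hypothetical heuristic: non-empty file that is almost entirely zeros."""
    return os.path.getsize(path) > 0 and zero_fraction(path) >= threshold
```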

@avg what was your thinking behind what may be the issue here?

If this is an mmap bug in ZFS it's a pretty serious one, given the amount
of silent corruption you can get.

@re Although reported incidents appear to be rare, because it's silent
data corruption users may be blissfully unaware it's happening. Given
that, my gut feeling is this is serious enough that we need to get
something in place before the 10 release, even if that is to make ZFS
return ENOTSUP for mmap calls. Would you agree?

    Regards
    Steve
