Date:      Wed, 6 Aug 2008 10:03:08 -0700
From:      Matt Simerson <matt@corp.spry.com>
To:        freebsd-fs@freebsd.org
Subject:   Re: ZFS hang issue and prefetch_disable - UPDATE
Message-ID:  <A0AAC88F-2C25-4AF9-BBAD-BFA568635140@corp.spry.com>
In-Reply-To: <20080806112944.6793fc11@twoflower.in.publishing.hu>
References:  <20253C48-38CB-4A77-9C59-B993E7E5D78A@corp.spry.com> <62D3072A-E41A-4CFC-971D-9924958F38C7@corp.spry.com> <20080806112944.6793fc11@twoflower.in.publishing.hu>

On Aug 6, 2008, at 2:29 AM, CZUCZY Gergely wrote:

> A few weeks ago, I was referring to exactly this. Somewhere around here:
> http://lists.freebsd.org/pipermail/freebsd-fs/2008-July/004796.html
>
> The thing that it works on pointyhat, and on kris@'s box, is just
> IWFM-level evidence, not proof of any stability or reliability.
>
> FreeBSD is a quite stable OS, and as far as I've seen, the code is of
> relatively good quality. For some reason the ZFS port seems to be an
> exception: it has not been merged properly, and its issues have not
> been solved.
>
> No matter how much someone tunes ZFS, no matter what you disable,
> there is no guarantee, not even the tiniest one, that it won't freeze
> your box, won't throw a panic, and will keep your data safe.

You want/expect guarantees of stability with experimental features? I
think someone needs their expectations calibrated.

> Many of us have reported this, but no one looked into it.

Just because you haven't seen proof that someone looked into it doesn't
mean nobody has. You are being neither fair nor respectful to the time
that others are investing in ZFS.

> use something else, but that's not the point. The point is, I don't
> see the meaning of a port of this quality. I know it's quite complex
> and whatnot, but at this level it cannot be run in a production
> environment. It's missing reliability.

If you don't see the value of ZFS, don't use it. I'm not complaining
because ZFS isn't stable. I'd like it to be, but the best way I can
help is to provide detailed information about my setup and the
conditions under which the feature has problems. By doing so, I'm
providing useful data. Denigrating the authors because ZFS doesn't meet
your expectations doesn't help anybody, so please don't do that.

Matt


> No matter how much you hack it, there's always a not-so-impossible
> chance that it will shoot you in the back when you're not watching.
>
> I hope the latest ZFS patches will solve a lot of the issues, and we
> won't see problems like this anymore.
>
> On Thu, 31 Jul 2008 13:58:26 -0700
> Matt Simerson <matt@corp.spry.com> wrote:
>
>>
>> My announcement that vfs.zfs.prefetch_disable=1 resulted in a stable
>> system was premature.
>>
>> One of my backup servers (see specs below) hung. When I got onto the
>> console via KVM, it looked normal with no errors but didn't respond
>> to Control-Alt-Delete. After a power cycle, zpool status showed 8
>> disks FAULTED and the action state was:
>> http://www.sun.com/msg/ZFS-8000-5E
>>
>> Basically, that meant my ZFS file system and 7.5TB of data were gone.
>> Ouch.
>>
>> I'm using a pair of ARECA 1231ML RAID controllers. Previously, I had
>> them configured as JBOD with raidz2. This time around, I configured
>> each controller with one 12-disk RAID 6 volume. Now FreeBSD just sees
>> two 10TB disks, which I stripe with ZFS:
>>
>>   zpool create back01 /dev/da0 /dev/da1
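>>
>> A quick sanity check on the resulting ~20TB stripe would look
>> something like this (output omitted):
>>
>>   zpool list back01
>>   zpool status back01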
>>
>> I also did a bit more fiddling with /boot/loader.conf. Previously I
>> had:
>>
>> vm.kmem_size="1536M"
>> vm.kmem_size_max="1536M"
>> vfs.zfs.prefetch_disable=1
>>
>> This resulted in ZFS using 1.1GB of RAM (as measured using the
>> technique described on the wiki) during normal use. The system in
>> question hung during the nightly processing (which backs up some other
>> systems via rsync), and my suspicion is that when I/O load picked up,
>> it exhausted the available kernel memory and hung the system. So now I
>> have these settings on one system:
>>
>> vm.kmem_size="1536M"
>> vm.kmem_size_max="1536M"
>> vfs.zfs.arc_min="16M"
>> vfs.zfs.arc_max="64M"
>> vfs.zfs.prefetch_disable=1
>>
>> and the same except vfs.zfs.arc_max="256M" on the other. The one with
>> 64M uses 256MB of RAM for ZFS, and the one set at 256M uses 600MB of
>> RAM. These figures were measured under heavy network and disk I/O load
>> generated by multiple rsync processes pulling backups from remote
>> nodes and storing them on ZFS. I am using ZFS compression.
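>>
>> For reference, the wiki's measurement technique boils down to watching
>> the kernel's "solaris" malloc type, which is what ZFS allocates from
>> on FreeBSD. A sketch, not necessarily the wiki's exact procedure:
>>
>>   vmstat -m | grep -i solaris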
>>
>> I get much better performance now with RAID 6 on the controller and
>> ZFS striping than using raidz2.
>>
>> Unless, of course, it was tuning the arc_ settings that made the
>> difference. Either way, the system I just rebuilt is now quite a bit
>> faster with RAID 6 than with JBOD + raidz2.
>>
>> Hopefully tuning vfs.zfs.arc_max will result in stability. If it
>> doesn't, my next choice is upgrading to -HEAD with the recent ZFS
>> patch or ditching ZFS entirely and using geom_stripe. I don't like
>> either option.
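>>
>> For completeness, a minimal geom_stripe sketch over the same two RAID
>> 6 volumes (the mount point is hypothetical):
>>
>>   kldload geom_stripe
>>   gstripe label -v st0 /dev/da0 /dev/da1
>>   newfs -U /dev/stripe/st0
>>   mount /dev/stripe/st0 /back01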
>>
>> Matt
>>
>>
>>> From: Matt Simerson <matt@corp.spry.com>
>>> Date: July 22, 2008 1:25:42 PM PDT
>>> To: freebsd-fs@freebsd.org
>>> Subject: ZFS hang issue and prefetch_disable
>>>
>>> Symptoms
>>>
>>> Deadlocks under heavy I/O load on the ZFS file system with
>>> prefetch_disable=0. Setting vfs.zfs.prefetch_disable=1 results in a
>>> stable system.
>>>
>>> Configuration
>>>
>>> Two machines. Identically built. Both exhibit identical behavior.
>>> 8 cores (2 x E5420) x 2.5GHz, 16 GB RAM, 24 x 1TB disks.
>>> FreeBSD 7.0 amd64
>>> dmesg: http://matt.simerson.net/computing/zfs/dmesg.txt
>>>
>>> Boot disk is a read-only 1GB compact flash
>>> # cat /etc/fstab
>>> /dev/ad0s1a  / ufs  ro,noatime  2 2
>>>
>>> # df -h /
>>> Filesystem  1K-blocks   Used  Avail Capacity  Mounted on
>>> /dev/ad0s1a    939M    555M    309M    64%    /
>>>
>>> Kernel memory limits have been boosted as suggested in the ZFS Tuning Guide:
>>> # cat /boot/loader.conf
>>> vm.kmem_size=1610612736
>>> vm.kmem_size_max=1610612736
>>> vfs.zfs.prefetch_disable=1
>>>
>>> I haven't mucked much with the other memory settings as I'm using
>>> amd64 and according to the FreeBSD ZFS wiki, that isn't necessary.
>>> I've tried higher settings for kmem but that resulted in a failed
>>> boot. I have ample RAM and would love to use as much as possible for
>>> network and disk I/O buffers as that's principally all this system
>>> does.
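>>>
>>> After a reboot, the values the kernel actually picked up can be
>>> confirmed with:
>>>
>>>   sysctl vm.kmem_size vm.kmem_size_max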
>>>
>>> Disks & ZFS options
>>>
>>> Sun's "Best Practices" suggests limiting the number of disks in a
>>> raidz pool to no more than 6-10, IIRC. ZFS is configured as shown:
>>> http://matt.simerson.net/computing/zfs/zpool.txt
>>>
>>> I'm using all of the ZFS default properties except: atime=off,
>>> compression=on.
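>>>
>>> Those two properties can be set along these lines (pool name as
>>> above):
>>>
>>>   zfs set atime=off back01
>>>   zfs set compression=on back01
>>>   zfs get atime,compression back01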
>>>
>>> Environment
>>>
>>> I'm using these machines as backup servers. I wrote an application
>>> that generates a list of the thousands of VPS accounts we host. For
>>> each host, it generates an rsnapshot configuration file and backs up
>>> their VPS to these systems via rsync. The application manages
>>> concurrency and will spawn additional rsync processes if system I/O
>>> load is below a defined threshold. Which is to say, I can crank the
>>> amount of disk I/O the system sees up or down.
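>>>
>>> For a sense of what the generated files look like, an illustrative
>>> rsnapshot.conf fragment (the host name and paths are hypothetical;
>>> rsnapshot requires tabs between fields):
>>>
>>>   snapshot_root   /back01/vps1234/
>>>   interval        daily   7
>>>   backup          root@vps1234.example.com:/      ./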
>>>
>>> With vfs.zfs.prefetch_disable=0, I can trigger a hang within a few
>>> hours (no more than a day). If I keep the I/O load (measured via
>>> iostat) down to a low level (< 200 IOPS) then I still get hangs, but
>>> less frequently (every 1-6 days). The only way I have found to
>>> prevent the hangs is by setting vfs.zfs.prefetch_disable=1.
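>>>
>>> The load numbers above come from watching extended device statistics,
>>> along the lines of:
>>>
>>>   iostat -x -w 5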
>>
>> _______________________________________________
>> freebsd-fs@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>
>
> --
> With regards,
>
> Czuczy Gergely
> Harmless Digital Bt
> mailto: gergely.czuczy@harmless.hu
> Tel: +36-30-9702963



