Date:      Wed, 16 Dec 2009 11:41:04 -0500
From:      Ben Kelly <ben@wanderview.com>
To:        Arnaud Houdelette <arnaud.houdelette@tzim.net>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: Possible ZFS livelock or SCHED_ULE bug ?
Message-ID:  <F864C01F-3E81-461C-9E90-964608F189BC@wanderview.com>
In-Reply-To: <4B290515.5080909@tzim.net>
References:  <4B290515.5080909@tzim.net>

On Dec 16, 2009, at 11:04 AM, Arnaud Houdelette wrote:

> Hi all!
> I have a uniprocessor AMD64 box with 512 MB of RAM and 2 ZFS pools,
> used as a home NAS.
>
> I have had I/O issues since I moved from 7.2 to 8.0.
> With a GENERIC kernel (or a stripped-down one), during high I/O
> activity (such as a make buildworld can cause), I encounter random
> hangs or deadlocks.
> top shows system CPU usage at 99%, the most CPU-hungry process being
> [zfskern] ({txg_thread_enter} if I switch to thread view).
> The box still responds to ping. Running processes keep running, but I
> can't start new ones.
> Sometimes I can return to normal by pressing Ctrl-C on the buildworld
> (or other operation); sometimes I can't, and I have to reboot the box.
>
> The issue seems to be less frequent with 8.0-STABLE than with
> 8.0-RELEASE, but it is still present (roughly a 75% chance of a hang
> with a buildworld).
> I get the hang with prefetch enabled or disabled. Same for the ZIL.
>
> I tried to enable kernel dumps, but the box hangs while saving the
> dump (root is on ZFS) or when starting kdbg on it.
> I recompiled the kernel with SCHED_4BSD, and it seems I can't
> reproduce the hang.
>
> What do you think?
> Did I misconfigure something?

This sounds similar to something I ran into on CURRENT last year:

  http://docs.freebsd.org/cgi/getmsg.cgi?fetch=832196+0+archive/2009/freebsd-current/20090322.freebsd-current

The immediate problem was a priority inversion between the
txg_thread_enter threads and the spa_zio threads.  This should be solved
(or at least mitigated) on 8.0 now that these threads have explicit
priorities set.  Can you check what priorities these threads have on
your machine?  They should be something like -8 for txg_thread_enter
and -16 for spa_zio.
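
As a rough illustration of why the relative priorities matter on a
uniprocessor box, here is a toy Python sketch of a strict-priority,
single-CPU scheduler.  It is not the ZFS or SCHED_ULE code.  The names
simulate, "waiter" and "worker" are hypothetical, and the sketch assumes
that a lower number means a more urgent priority and that the waiting
thread stays runnable (polling) instead of sleeping.  The waiter loosely
stands in for a txg_thread_enter-like thread and the worker for a
spa_zio-like thread.

```python
def simulate(priorities, work_units=3, max_ticks=50):
    """Toy strict-priority scheduler for one CPU (illustration only).

    priorities: dict with numeric priorities for "waiter" and "worker".
    Assumption: a lower value is more urgent.
    The waiter is always runnable because it keeps polling for the
    worker's result; the worker needs `work_units` ticks of CPU time.
    Returns the tick at which the worker finished, or None if the worker
    never got enough CPU within `max_ticks` (it starved).
    """
    remaining = work_units
    for tick in range(1, max_ticks + 1):
        # Both threads are runnable until the worker has finished.
        runnable = {"waiter": priorities["waiter"]}
        if remaining > 0:
            runnable["worker"] = priorities["worker"]
        # Strict priority: always run the most urgent runnable thread.
        current = min(runnable, key=runnable.get)
        if current == "worker":
            remaining -= 1
            if remaining == 0:
                return tick
        # Otherwise the waiter burned the tick polling for a result.
    return None


# Worker more urgent than waiter: the work completes.
print(simulate({"waiter": -8, "worker": -16}))   # prints 3
# Inverted: the waiter hogs the only CPU and the worker starves.
print(simulate({"waiter": -16, "worker": -8}))   # prints None
```

In the second call the thread that is waiting for the work is also the
one keeping the only CPU busy, so the work it depends on never runs.
That is the general shape of a priority inversion; the real interaction
between the ZFS threads may differ in detail.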

- Ben





