Message-ID: <4780FBE2.8040208@FreeBSD.org>
Date: Sun, 06 Jan 2008 17:03:46 +0100
From: Kris Kennaway
To: Henri Hennebert
Cc: freebsd-current@freebsd.org, Peter Schuller, Ivan Voras, Brooks Davis
Subject: Re: When will ZFS become stable?
In-Reply-To: <4780F839.5020200@restart.be>
References: <20080104163352.GA42835@lor.one-eyed-alien.net>
 <9bbcef730801040958t36e48c9fjd0fbfabd49b08b97@mail.gmail.com>
 <200801061051.26817.peter.schuller@infidyne.com>
 <9bbcef730801060458k4bc9f2d6uc3f097d70e087b68@mail.gmail.com>
 <4780D289.7020509@FreeBSD.org>
 <4780F839.5020200@restart.be>
List-Id: Discussions about the use of FreeBSD-current

Henri Hennebert wrote:
> Kris Kennaway wrote:
>> Ivan Voras wrote:
>>> On 06/01/2008, Peter Schuller wrote:
>>>>> This number is not so large. It seems to be easily crashed by rsync,
>>>>> for example (speaking from my own experience, and also some of my
>>>>> colleagues).
>>>> I can definitely say this is not *generally* true, as I do a lot of
>>>> rsyncing/rdiff-backup:ing and similar stuff (with many files / large
>>>> files) on ZFS without any stability issues. Problems for me have been
>>>> limited to 32bit and the memory exhaustion issue rather than "hard"
>>>> issues.
>>>
>>> It's not generally true since kmem problems with rsync are often hard
>>> to repeat - I have them on one machine, but not on another, similar
>>> machine. This nonrepeatability is also a part of the problem.
>>>
>>>> But perhaps that's all you are referring to.
>>>
>>> Mostly. I did have a ZFS crash with rsync that wasn't kmem related,
>>> but only once.
>>
>> kmem problems are just tuning. They are not indicative of stability
>> problems in ZFS. Please report any further non-kmem panics you
>> experience.
>
> I encounter 2 times a deadlock during high I/O activity (the last one
> during rsync + rm -r on a 5GB hierarchy (openoffice-2/work).
>
> I was running with this patch:
> http://people.freebsd.org/~pjd/patches/zgd_done.patch
> db> show allpcpu
> Current CPU: 1
>
> cpuid = 0
> curthread = 0xa5ebe440: pid 3422 "txg_thread_enter"
> curpcb = 0xeb175d90
> fpcurthread = none
> idlethread = 0xa5529aa0: pid 12 "idle: cpu0"
> APIC ID = 0
> currentldt = 0x50
>
> cpuid = 1
> curthread = 0xa56ab220: pid 47 "arc_reclaim_thread"
> curpcb = 0xe6837d90
> fpcurthread = none
> idlethread = 0xa5529880: pid 11 "idle: cpu1"
> APIC ID = 1
> currentldt = 0x50
>
> With the 2 times arc_reclaim_thread `running`

Backtraces of the affected processes (or just alltrace) are usually
required to proceed with debugging, and lock status is also often vital
(show alllocks, requires witness). Also, in the case when threads are
actually running (not deadlocked), then it is often useful to repeatedly
break/continue and sample many backtraces to try and determine where the
threads are looping.

Kris
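
P.S. A minimal sketch of the "sample many backtraces" idea, in case it
helps: once several traces have been captured from the console into a
text log, a short script can count which functions show up in most of
the samples, which hints at where a running thread is looping. The
details are assumptions, not something taken from this thread: the
frame format ("... at symbol+0xoffset"), the "db> trace" marker used to
separate samples, and the helper names split_samples, frames and tally
are all hypothetical and may need adjusting to your actual capture.

```python
import re
from collections import Counter

# Assumed frame format: "... at symbol+0xoffset" (adjust to your console log).
FRAME_RE = re.compile(r"\bat\s+([A-Za-z_][\w.]*)\+0x[0-9a-fA-F]+")


def split_samples(log, marker="db> trace"):
    """Split a captured console log into one chunk per backtrace sample.

    The marker is an assumption: the prompt line that started each trace.
    """
    chunks = log.split(marker)[1:]
    return [chunk for chunk in chunks if chunk.strip()]


def frames(sample):
    """Return the function names found in one sample, in log order."""
    return FRAME_RE.findall(sample)


def tally(samples):
    """Count in how many samples each function appears (once per sample)."""
    counts = Counter()
    for sample in samples:
        counts.update(set(frames(sample)))
    return counts
```

Usage would be something like tally(split_samples(open("console.log").read()))
followed by .most_common(10). Functions present in nearly every sample
are the ones worth reading first; functions that appear only once are
more likely incidental.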