From: Ben Kelly <ben@wanderview.com>
Date: Thu, 19 Mar 2009 20:19:46 -0400
To: Adam McDougall
Cc: attilio@freebsd.org, Jeff Roberson, Sam Leffler,
    freebsd-current@freebsd.org, Pawel Jakub Dawidek
Subject: Re: [patch] zfs livelock and thread priorities

On Mar 19, 2009, at 7:06 PM, Adam McDougall wrote:

> I was really impressed with your diagnosis but didn't try your patch
> until this afternoon.  I had not seen processes spin, but I had zfs
> get stuck roughly every 2 days on a somewhat busy ftp/rsync server
> until I turned the zil off again; after that it was up for over 13
> days, at which point I decided to try this patch.  This system boots
> from a ufs / and then turns around and tries to mount a zfs root
> over the top of it.  The first time it stalled for a few minutes at
> the root mount and "gave up" with a "spin lock held too long" panic;
> the second time the same thing happened, but I didn't wait long
> enough for the spinlock error.  Then I tried a power cycle just in
> case, and on the next two tries I got a page fault kernel panic.
> I'd give more details, but right now I'm trying to get the server
> back up with a live CD because I goofed and don't have an old kernel
> to fall back on.  Just wanted to let you know, and thanks for
> getting as far as you did!

Ouch!  Sorry you ran into that.  I haven't seen these problems, but I
keep my root partition on ufs and only use zfs for /usr, /var, etc.
Perhaps that explains the difference in behavior.

You could try changing the patch to use lower priorities.  To do
this, change compat/opensolaris/sys/proc.h so that it reads:

  #define minclsyspri PRI_MAX_REALTIME
  #define maxclsyspri (PRI_MAX_REALTIME - 4)

This compiles and runs on my machine.  The theory here is that other
kernel threads will be able to run as they used to, but the zfs
threads will still have fixed priorities relative to one another.
It's really just a stab in the dark, though.  I don't have any
experience with the "zfs mounted on top of ufs root" configuration.
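In case the reasoning behind those two values isn't obvious: FreeBSD
priorities run the opposite direction from Solaris priorities (a
numerically lower value is a *higher* priority on FreeBSD), so the
more important macro gets the smaller number.  Here is the same
fragment with that spelled out; the include line and comments are my
annotation, and only the two defines are the actual change:

  /*
   * Map the Solaris class priority macros onto FreeBSD's realtime
   * range.  On Solaris a higher number means a higher priority; on
   * FreeBSD it is the reverse, so maxclsyspri (the more important
   * of the two) gets the numerically smaller value.
   */
  #include <sys/priority.h>   /* PRI_MAX_REALTIME */

  #define minclsyspri PRI_MAX_REALTIME        /* least important end */
  #define maxclsyspri (PRI_MAX_REALTIME - 4)  /* 4 steps more important */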
If this works, we should see whether we can replace PRI_MAX_REALTIME
with PRI_MAX_KERN so that the zfs kernel threads run in the kernel
priority range instead of the realtime range.

If you could get a stack trace of the kernel panic, that would be
helpful.  Also, if you have console access, can you break into the
debugger during the boot spinlock hang and get a backtrace of the
blocked process?

If you want to compare other aspects of your environment to mine, I
uploaded a bunch of info here:

  http://www.wanderview.com/svn/public/misc/zfs_livelock

Finally, I'm CC'ing the list and some other people so they are aware
that the patch runs the risk of a panic.

I hope that helps.

- Ben
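P.S.  For completeness, the PRI_MAX_KERN variant mentioned above
would presumably be the same two defines with the range swapped.  I
haven't built or tested this, so treat it as a sketch:

  /* Untested sketch: the same mapping, but in the kernel priority
   * range (PRI_MAX_KERN from <sys/priority.h>) instead of the
   * realtime range. */
  #define minclsyspri PRI_MAX_KERN
  #define maxclsyspri (PRI_MAX_KERN - 4)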