From owner-freebsd-stable@FreeBSD.ORG Sat Jul 27 22:13:11 2013
From: Michael Tratz <michael@esosoft.com>
To: Konstantin Belousov
Cc: Steven Hartland, Rick Macklem, freebsd-stable@freebsd.org
Subject: Re: NFS deadlock on 9.2-Beta1
Date: Sat, 27 Jul 2013 15:13:05 -0700
Message-Id: <602747E8-0EBE-4BB1-8019-C02C25B75FA1@esosoft.com>
In-Reply-To: <20130727205815.GC4972@kib.kiev.ua>
References: <780BC2DB-3BBA-4396-852B-0EBDF30BF985@esosoft.com>
 <806421474.2797338.1374956449542.JavaMail.root@uoguelph.ca>
 <20130727205815.GC4972@kib.kiev.ua>

On Jul 27, 2013, at 1:58 PM, Konstantin Belousov wrote:

> On Sat, Jul 27, 2013 at 04:20:49PM -0400, Rick Macklem wrote:
>> Michael Tratz wrote:
>>>
>>> On Jul 24, 2013, at 5:25 PM, Rick Macklem wrote:
>>>
>>>> Michael Tratz wrote:
>>>>> Two machines (NFS Server: running ZFS / Client: disk-less), both
>>>>> are running FreeBSD r253506. The NFS client starts to deadlock
>>>>> processes within a few hours. It usually gets worse from there on.
>>>>> The processes stay in "D" state. I haven't been able to reproduce
>>>>> it when I want it to happen; I only have to wait a few hours until
>>>>> the deadlocks occur when traffic to the client machine starts to
>>>>> pick up. The only way to fix the deadlocks is to reboot the
>>>>> client. Even an ls on the path which is deadlocked will deadlock
>>>>> ls itself. It's totally random which part of the file system gets
>>>>> deadlocked. The NFS server itself has no problem at all accessing
>>>>> the files/path when something is deadlocked on the client.
>>>>>
>>>>> Last night I decided to put an older kernel on the system, r252025
>>>>> (June 20th). The NFS server stayed untouched. So far 0 deadlocks
>>>>> on the client machine (it should have deadlocked by now). FreeBSD
>>>>> is working hard like it always does. :-) There are a few changes
>>>>> to the NFS code between the revision which seems to work and
>>>>> Beta1. I haven't tried to narrow down whether one of those commits
>>>>> is causing the problem. Maybe someone has an idea what could be
>>>>> wrong and I can test a patch, or it's something else, because I'm
>>>>> not a kernel expert. :-)
>>>>>
>>>> Well, the only NFS client change committed between r252025 and
>>>> r253506 is r253124. It fixes a file corruption problem caused by a
>>>> previous commit that delayed the vnode_pager_setsize() call until
>>>> after the NFS node mutex was unlocked.
>>>>
>>>> If you can test with only r253124 reverted to see if that gets rid
>>>> of the hangs, it would be useful, although from the procstats I
>>>> doubt it.
>>>>
>>>>> I have run several procstat -kk on the processes, including the ls
>>>>> which deadlocked. You can see them here:
>>>>>
>>>>> http://pastebin.com/1RPnFT6r
>>>>
>>>> All the processes you show seem to be stuck waiting for a vnode
>>>> lock or in __umtx_op_wait. (I'm not sure what the latter means.)
>>>>
>>>> What is missing is which processes are holding the vnode locks and
>>>> what they are stuck on.
>>>>
>>>> A starting point might be "ps axhl", to see what all the threads
>>>> are doing (particularly the WCHAN for each of them). If you can
>>>> drop into the debugger when the NFS mounts are hung and do a
>>>> "show alllocks", that could help. See:
>>>> http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
>>>>
>>>> I'll admit I'd be surprised if r253124 caused this, but who knows.
>>>>
>>>> If there have been changes to your network device driver between
>>>> r252025 and r253506, I'd try reverting those. (If an RPC gets stuck
>>>> waiting for a reply while holding a vnode lock, that would do it.)
>>>>
>>>> Good luck with it, and maybe someone else can think of a commit
>>>> between r252025 and r253506 that could cause vnode locking or
>>>> network problems.
>>>>
>>>> rick
>>>>
>>>>> I have tried to mount the file system with and without nolockd.
>>>>> It didn't make a difference. Other than that it is mounted with:
>>>>>
>>>>> rw,nfsv3,tcp,noatime,rsize=32768,wsize=32768
>>>>>
>>>>> Let me know if you need me to do something else or if some other
>>>>> output is required. I would have to go back to the problem kernel
>>>>> and wait until the deadlock occurs to get that information.
>>>
>>> Thanks Rick and Steven for your quick replies.
>>>
>>> I spoke too soon regarding r252025 fixing the problem. The same
>>> issue started to show up after about 1 day and a few hours of
>>> uptime.
>>>
>>> "ps axhl" shows all those stuck processes in newnfs.
>>>
>>> I recompiled the GENERIC kernel for Beta1 with the debugging options
>>> from:
>>>
>>> http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
>>>
>>> ps and debugging output:
>>>
>>> http://pastebin.com/1v482Dfw
>>>
>>> (I only listed processes matching newnfs; if you need the whole
>>> list, please let me know.)
>>>
>> Is your "show alllocks" complete? If not, a complete list of locks
>> would definitely help. As for "ps axhl", a complete list of
>> processes/threads might be useful, but not as much, I think.
>>
>>> The first PID showing up with that problem is 14001. Certainly the
>>> "show alllocks" command shows interesting information for that PID.
>>> I looked through the commit history for the files mentioned in the
>>> output to see if there is something obvious to me, but I don't
>>> know. :-)
>>> I hope that information helps you dig deeper into what might be
>>> causing those deadlocks.
>>>
>> Well, pid 14001 is interesting in that it holds both the sleep lock
>> acquired by sblock() and an NFS vnode lock, but is waiting for
>> another NFS vnode lock, if I read the pastebin stuff correctly.
>>
>> I suspect that this process is somewhere in kern_sendfile(), since
>> that seems to be where sblock() gets called before vn_lock().
> 14001 is multithreaded; two of its threads own a shared vnode lock,
> and a third thread sleeps waiting for a lock.
>
>>
>> It's just a "shot in the dark", but if you can revert r250907 (dated
>> May 22), it might be worth a try. It adds a bunch of stuff to
>> kern_sendfile(), including vn_lock() calls, and the date sounds
>> about right.
> The output provided is not useful.
>
> I need the data listed in the referenced 'debugging deadlocks' page.

Now that I have a way of reproducing the problem pretty quickly, I can
get you the data. I have never debugged a kernel before, so it's new
territory for me. A first for everything, right? :-) If you can help me
with the commands, I can retrieve it.

I found this paper talking about "Live Debugging DDB":

http://people.freebsd.org/~jhb/papers/bsdcan/2008/article/node3.html

Let's assume the pid which started the deadlock is 14001 (it will be a
different pid when we get the results, because the machine has been
restarted).

I type:

show proc 14001

I get the thread numbers from that output and type:

show thread xxxxx

for each one. And a trace for each thread with the command?

tr xxxx

Anything else I should try to get or do? Or is that not the data you
are looking for at all? (A rough sketch of the whole session I have in
mind is at the end of this message.)

>
>>
>> rick
>> ps: I've added kib@ to the cc list, in case he can help with this.
>> He appears to have been involved in the commits MFC'd by r250907.
>>
>>> I did include the pciconf -lv, because you mentioned network device
>>> drivers. It's Intel igb. The same hardware is running a kernel from
>>> January 19th, 2013, also as an NFS client. That machine is rock
>>> solid. No problems at all.
>>>
>> Ok, so it sounds like we are looking for a post-January 19 commit.
>>
>>> I also went to r251611. That's before r251641 (the NFS FHA changes).
>> The FHA changes are server related, and my understanding is that the
>> problem is showing up in an NFS client.
>>
>>> Same problem. Here is another debugging output from that kernel:
>>>
>>> http://pastebin.com/ryv8BYc4
>>>
>>> If I should test something else or provide some other output, please
>>> let me know.
>>>
>>> Again, thank you!
>>>
>>> Michael
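
P.S. For reference, here is the rough session I have in mind, pieced
together from the handbook's deadlock-debugging page and jhb's DDB
article. I haven't actually run it yet, so treat it as a sketch rather
than a known-good recipe; 14001 and the thread id 100123 are only
placeholders for whatever shows up the next time it hangs.

  (break into DDB from the console, or: sysctl debug.kdb.enter=1)
  db> ps                    <- note the pids stuck in "D" / newnfs
  db> show alllocks         <- which threads hold which locks
  db> show lockedvnods      <- locked vnodes and their owners
  db> show proc 14001       <- list the threads of the hung pid
  db> show thread 100123    <- per-thread detail, tid from "show proc"
  db> tr 100123             <- stack trace for that thread
  db> alltrace              <- traces for every thread, if needed
  db> call doadump          <- write a crash dump for later analysis
  db> reset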