From: J David <jdavidlists@gmail.com>
To: freebsd-stable <freebsd-stable@freebsd.org>
Cc: Rick Macklem
Date: Thu, 22 Aug 2013 12:16:07 -0400
Subject: Re: NFS deadlock on 9.2-Beta1

One deadlocked process cropped up overnight, but I managed to panic the box before getting too much debugging info. :( The process was in state T instead of D, which I guess must be a side effect of some of the debugging code compiled in.
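As an aside, the state letter that ps/top show comes from the ki_stat field of kinfo_proc (SSTOP is what ps prints as "T"). Here is a rough sketch, not from my session and only for illustration, of reading that state and the wait message directly via the kern.proc.pid sysctl:

    /*
     * Rough illustration only: read the same state/wchan info that
     * ps(1) and top(1) display, via the kern.proc.pid sysctl.
     * ki_stat is the process state (SSTOP is the "T" ps prints);
     * ki_wmesg is the wait message such as "newnfs" or "pgrbwt".
     */
    #include <sys/param.h>
    #include <sys/sysctl.h>
    #include <sys/user.h>
    #include <stdio.h>
    #include <stdlib.h>

    int
    main(int argc, char **argv)
    {
            struct kinfo_proc kp;
            size_t len = sizeof(kp);
            int mib[4];

            if (argc < 2) {
                    fprintf(stderr, "usage: procstate pid\n");
                    return (1);
            }
            mib[0] = CTL_KERN;
            mib[1] = KERN_PROC;
            mib[2] = KERN_PROC_PID;
            mib[3] = atoi(argv[1]);
            if (sysctl(mib, 4, &kp, &len, NULL, 0) == -1) {
                    perror("sysctl");
                    return (1);
            }
            printf("pid %d  stat %d  wmesg %s\n",
                (int)kp.ki_pid, kp.ki_stat, kp.ki_wmesg);
            return (0);
    }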
Here are the details I was able to capture:

db> show proc 7692
Process 7692 (httpd) at 0xfffffe0158793000:
 state: NORMAL
 uid: 25000  gids: 25000
 parent: pid 1 at 0xfffffe00039c3950
 ABI: FreeBSD ELF64
 arguments: /nfsn/apps/tapache22/bin/httpd
 threads: 3
 100674                   D       newnfs  0xfffffe021cdd9848 httpd
 100597                   D       pgrbwt  0xfffffe02fda788b8 httpd
 100910                   s                                  httpd
db> show thread 100674
Thread 100674 at 0xfffffe0108c79480:
 proc (pid 7692): 0xfffffe0158793000
 name: httpd
 stack: 0xffffff834c80f000-0xffffff834c812fff
 flags: 0x2a804  pflags: 0
 state: INHIBITED: {SLEEPING}
 wmesg: newnfs  wchan: 0xfffffe021cdd9848
 priority: 96
 container lock: sleepq chain (0xffffffff813c03c8)
db> tr 100674
Tracing pid 7692 tid 100674 td 0xfffffe0108c79480
sched_switch() at sched_switch+0x234/frame 0xffffff834c812360
mi_switch() at mi_switch+0x15c/frame 0xffffff834c8123a0
sleepq_switch() at sleepq_switch+0x17d/frame 0xffffff834c8123e0
sleepq_wait() at sleepq_wait+0x43/frame 0xffffff834c812410
sleeplk() at sleeplk+0x11a/frame 0xffffff834c812460
__lockmgr_args() at __lockmgr_args+0x9a9/frame 0xffffff834c812580
nfs_lock1() at nfs_lock1+0x87/frame 0xffffff834c8125b0
VOP_LOCK1_APV() at VOP_LOCK1_APV+0xbe/frame 0xffffff834c8125e0
_vn_lock() at _vn_lock+0x63/frame 0xffffff834c812640
ncl_upgrade_vnlock() at ncl_upgrade_vnlock+0x5e/frame 0xffffff834c812670
ncl_bioread() at ncl_bioread+0x195/frame 0xffffff834c8127e0
VOP_READ_APV() at VOP_READ_APV+0xd1/frame 0xffffff834c812810
vn_rdwr() at vn_rdwr+0x2bc/frame 0xffffff834c8128d0
kern_sendfile() at kern_sendfile+0xa90/frame 0xffffff834c812ac0
do_sendfile() at do_sendfile+0x92/frame 0xffffff834c812b20
amd64_syscall() at amd64_syscall+0x259/frame 0xffffff834c812c30
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xffffff834c812c30
--- syscall (393, FreeBSD ELF64, sys_sendfile), rip = 0x801b26f4c, rsp = 0x7ffffe9f43c8, rbp = 0x7ffffe9f4700 ---
db> show lockchain 100674
thread 100674 (pid 7692, httpd) inhibited
db> show thread 100597
Thread 100597 at 0xfffffe021c976000:
 proc (pid 7692): 0xfffffe0158793000
 name: httpd
 stack: 0xffffff834c80a000-0xffffff834c80dfff
 flags: 0x28804  pflags: 0
 state: INHIBITED: {SLEEPING}
 wmesg: pgrbwt  wchan: 0xfffffe02fda788b8
 priority: 84
 container lock: sleepq chain (0xffffffff813c0148)
db> tr 100597
Tracing pid 7692 tid 100597 td 0xfffffe021c976000
sched_switch() at sched_switch+0x234/frame 0xffffff834c80d750
mi_switch() at mi_switch+0x15c/frame 0xffffff834c80d790
sleepq_switch() at sleepq_switch+0x17d/frame 0xffffff834c80d7d0
sleepq_wait() at sleepq_wait+0x43/frame 0xffffff834c80d800
_sleep() at _sleep+0x30f/frame 0xffffff834c80d890
vm_page_grab() at vm_page_grab+0x120/frame 0xffffff834c80d8d0
kern_sendfile() at kern_sendfile+0x992/frame 0xffffff834c80dac0
do_sendfile() at do_sendfile+0x92/frame 0xffffff834c80db20
amd64_syscall() at amd64_syscall+0x259/frame 0xffffff834c80dc30
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xffffff834c80dc30
--- syscall (393, FreeBSD ELF64, sys_sendfile), rip = 0x801b26f4c, rsp = 0x7ffffebf53c8, rbp = 0x7ffffebf5700 ---
db> show lockchain 100597
thread 100597 (pid 7692, httpd) inhibited

The "inhibited" is not something I'm familiar with and didn't match the example output; I thought that maybe the T state was overpowering the locks, and that maybe I should continue the system and then -CONT the process. However, a few seconds after I issued "c" at the DDB prompt, the system panicked in the console driver ("mtx_lock_spin: recursed on non-recursive mutex cnputs_mtx @ /usr/src/sys/kern/kern_cons.c:500"), so I guess that's not a thing to do. :(
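For reference, the userland side of both traces is just httpd calling sendfile(2) on a file that lives on the NFS mount (syscall 393, sys_sendfile, in both backtraces). A minimal sketch of that call follows; the path and the already-connected socket are hypothetical, not taken from the actual httpd code:

    /*
     * Minimal sketch (hypothetical file name, socket assumed already
     * connected) of the call both backtraces start from: sendfile(2),
     * which goes through kern_sendfile() and reads the file via the
     * NFS client (ncl_bioread).
     */
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/uio.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int
    serve_file(int sock)
    {
            off_t sbytes = 0;
            int fd = open("/nfsn/some/file.html", O_RDONLY); /* file on the NFS mount */

            if (fd == -1)
                    return (-1);
            /* offset 0, nbytes 0 means "send until EOF"; no headers or trailers */
            if (sendfile(fd, sock, 0, 0, NULL, &sbytes, 0) == -1)
                    perror("sendfile");
            close(fd);
            return (0);
    }

Two threads of the same httpd process doing this against the same file is what the two backtraces show: one sleeping on the NFS vnode lock ("newnfs", in ncl_upgrade_vnlock) and the other in vm_page_grab ("pgrbwt"), both inside kern_sendfile.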
Sorry my stupidity and ignorance are dragging this out. :( This is all well outside my comfort zone, but next time I'll get it for sure. Thanks!