From owner-freebsd-current@FreeBSD.ORG  Tue Jun 16 13:40:40 2009
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 59407106564A;
	Tue, 16 Jun 2009 13:40:40 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 190868FC08;
	Tue, 16 Jun 2009 13:40:40 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net
	[66.111.2.69])
	by cyrus.watson.org (Postfix) with ESMTPSA id A5C0446B8A;
	Tue, 16 Jun 2009 09:40:39 -0400 (EDT)
Received: from jhbbsd.hudson-trading.com (unknown [209.249.190.8])
	by bigwig.baldwin.cx (Postfix) with ESMTPA id 93B388A073;
	Tue, 16 Jun 2009 09:40:38 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: freebsd-current@freebsd.org
Date: Tue, 16 Jun 2009 08:12:48 -0400
User-Agent: KMail/1.9.7
References: <1242075474.72992.118.camel@hood.oook.cz>
	<4A36B6D8.8000701@FreeBSD.org> <20090616005810.GE1111@egr.msu.edu>
In-Reply-To: <20090616005810.GE1111@egr.msu.edu>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200906160812.49039.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1
	(bigwig.baldwin.cx); Tue, 16 Jun 2009 09:40:38 -0400 (EDT)
X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx
X-Virus-Status: Clean
X-Spam-Status: No, score=-2.5 required=4.2 tests=AWL,BAYES_00,RDNS_NONE
	autolearn=no version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx
Cc: Adam McDougall <mcdouga9@egr.msu.edu>
Subject: Re: pointyhat panic
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Jun 2009 13:40:40 -0000

On Monday 15 June 2009 8:58:11 pm Adam McDougall wrote:
> On Mon, Jun 15, 2009 at 10:02:16PM +0100, Kris Kennaway wrote:
> 
>   Pav Lucistnik wrote:
>   > panic: mtx_lock() of destroyed mutex @ /usr/src/sys/rpc/clnt_vc.c:953
>   > cpuid = 2
>   > KDB: enter: panic
>   > [thread pid 0 tid 100029 ]
>   > Stopped at      kdb_enter+0x3d: movq    $0,0x3f5fb8(%rip)
>   > db> bt
>   > Tracing pid 0 tid 100029 td 0xffffff00018e1000
>   > kdb_enter() at kdb_enter+0x3d
>   > panic() at panic+0x17b
>   > _mtx_lock_flags() at _mtx_lock_flags+0xc5
>   > clnt_vc_soupcall() at clnt_vc_soupcall+0x273
>   > sowakeup() at sowakeup+0xf8
>   > tcp_do_segment() at tcp_do_segment+0x23c9
>   > tcp_input() at tcp_input+0x9ec
>   > ip_input() at ip_input+0xbc
>   > ether_demux() at ether_demux+0x1ed
>   > ether_input() at ether_input+0x171
>   > em_rxeof() at em_rxeof+0x201
>   > em_handle_rxtx() at em_handle_rxtx+0x4b
>   > taskqueue_run() at taskqueue_run+0x96
>   > taskqueue_thread_loop() at taskqueue_thread_loop+0x3f
>   > fork_exit() at fork_exit+0x12a
>   > fork_trampoline() at fork_trampoline+0xe
>   > --- trap 0, rip = 0, rsp = 0xffffffff240a6d40, rbp = 0 ---
>   > 
>   > The box is in kdb on serial console for now. May 9 -CURRENT, I think.
>   > 
>   
>   This happened again.  The trigger was this (^C of a find on a busy 
>   netapp volume with a lot of other concurrent nfs traffic to the same 
>   mountpoint):
>   
>   pointyhat# find . -name \*.bz2 -mmin -10
>   ^Cnfs server dumpster:/vol/vol4/pointyhat: not responding
>   nfs server dumpster:/vol/vol4/pointyhat: not responding
>   nfs server dumpster:/vol/vol4/pointyhat: not responding
>   nfs server dumpster:/vol/vol4/pointyhat: not responding
>   nfs server dumpster:/vol/vol4/pointyhat: not responding
>   nfs server dumpster:/vol/vol4/pointyhat: not responding
>   nfs server dumpster:/vol/vol4/pointyhat: not responding
>   nfs server dumpster:/vol/vol4/pointyhat: not responding
>   nfs server dumpster:/vol/vol4/pointyhat: not responding
>   nfs server dumpster:/vol/vol4/pointyhat: not responding
>   nfs server dumpster:/vol/vol4/pointyhat: not responding
>   nfs server dumpster:/vol/vol4/pointyhat: not responding
>   load: 4.54  cmd: find 93357 [rpccon] 11.19u 111.62s 0% 4848k
>   
>   About 5-10 minutes later the machine panicked.  I'll try updating to a 
>   newer -CURRENT.
>   
>   Kris
> 
> This sounds like almost exactly the same symptoms I noticed on
> a -current machine a few months ago.  I was doing a du on an
> NFS mount and decided to ctrl-c it, got the "not responding"
> messages for a while, and a few minutes later the system
> panicked.  I hadn't
> had a chance to report it yet, but I did find a workaround:
> it is stable if I remove "intr" from the NFS mount options.
> Hope this helps a little.

These panics should be fixed in the latest HEAD.  It would be good to
re-enable "intr" and test it before 8.0 is released.

-- 
John Baldwin