Date: Thu, 23 Nov 2006 01:11:38 -0500
From: Kris Kennaway
To: Chris
Cc: FreeBSD Stable, Kris Kennaway
Subject: Re: sshfs/nfs cause server lockup
Message-ID: <20061123061137.GA49872@xor.obsecurity.org>
References: <3aaaa3a0611212149u21146180ra84503472a0336e3@mail.gmail.com> <20061122170353.GA38104@xor.obsecurity.org> <3aaaa3a0611222125v36344f17rbc59a60516836b44@mail.gmail.com>
In-Reply-To: <3aaaa3a0611222125v36344f17rbc59a60516836b44@mail.gmail.com>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="qDbXVdCdHGoSgWSk"
Content-Disposition: inline
User-Agent: Mutt/1.4.2.2i

--qDbXVdCdHGoSgWSk
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Thu, Nov 23, 2006 at 05:25:21AM +0000, Chris wrote:
> On 22/11/06, Kris Kennaway wrote:
> >On Wed, Nov 22, 2006 at 05:49:12AM
+0000, Chris wrote:
> >> On a few occasions, all on different remote servers, I have had nfs cause
> >> servers to stop responding, so I stopped using it. All the servers were
> >> either 6.0 release, 6.1 release or 6-stable.
> >>
> >> We recently discovered sshfs, which supports cross platform mounting.
> >> The server is linux and I mounted on a freebsd 6.1 release using the
> >> security branch, up to date.
> >>
> >> It was working fine for around 5 to 6 days, with some problems with
> >> sshfs not updating files that are updated, but it wasn't compromising the
> >> stability of the freebsd server; I just remounted to keep up to date.
> >> Then today the linux server had network problems so the sshfs timed
> >> out. There are 2 dirs I mount; the first mounted fine, a bit slow but
> >> connected, but when I ran the command to mount the 2nd dir the server
> >> stopped responding.
> >>
> >> My 2nd ssh terminal was alive. I tried to run top to see if sshfs was
> >> hanging or something, but when I hit enter top didn't run and the 2nd
> >> terminal was frozen. Note both terminals didn't time out, and an ircd
> >> running on the server also did not time out, but the box wasn't listening
> >> to any new requests. It was responding to pings fine.
> >>
> >> I have a remote reboot facility on the box but no local access and no
> >> kvm/serial console facility available; this is the case for all of my
> >> servers. I initially tried a soft reboot, which uses ctrl-alt-delete,
> >> but the pings kept replying so I could see the reboot wasn't initiated,
> >> indicating some kind of console lockup as well. I then did a hard
> >> reboot, which brought the server back.
> >>
> >> All logs stopped when the first lockup occurred, so no errors etc. were
> >> recorded; bear in mind I have no local access to this machine. It does
> >> appear that 6.x has some kind of serious remote mounting bug, because I
> >> never had these nfs problems in freebsd 5.x.
> >>
> >> I would be interested in any thoughts as to what could help me I have
> >> rebooted the server now with network mpsafe disabled to see if this
> >> will help it is using a generic kernel with the following changes.
> >
> >Sounds like your "sshfs" is causing the kernel to deadlock in that
> >error situation. You can confirm by enabling DEBUG_LOCKS and
> >DEBUG_VFS_LOCKS, then breaking to DDB and running 'show lockedvnods'
> >when the deadlock occurs.
> >
> >If you're still having problems with NFS on 6.2, we'd much rather you
> >reported those so that we can investigate and try to fix them.
> >
> >Kris
> >
> Ok thanks, I will make sure this box is updated to 6.2 when it hits
> release, if I enable the options in the kernel I will need local
> access to use ddb?

Yeah, you'll need a form of console access (local or serial). In
principle you could extract the information from a coredump
(i.e. trigger a coredump when the system deadlocks), but I don't think
there's a kgdb macro equivalent of 'show lockedvnods'.

Kris

--qDbXVdCdHGoSgWSk
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFFZTuZWry0BWjoQKURAhW8AKCOJm6EXFH8VbFtY90Jtiso1IYxvgCbB9x9
zdBQNj2Sk88tzuyGS148XzI=
=0ylO
-----END PGP SIGNATURE-----

--qDbXVdCdHGoSgWSk--
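
For anyone following the thread, the debugging setup described in the reply above might look roughly like the kernel config fragment below. This is only a sketch: DEBUG_LOCKS and DEBUG_VFS_LOCKS are the options named in the message, while KDB, DDB and BREAK_TO_DEBUGGER are assumptions added here so that there is a debugger to break into. It also assumes the lines are appended to a copy of GENERIC; check sys/conf/NOTES for your release before relying on any of them.

```
# Sketch only: lines appended to a copy of GENERIC.
# KDB, DDB and BREAK_TO_DEBUGGER are not mentioned in the thread; they
# are assumed here so that "breaking to DDB" is possible at all.
options 	KDB			# kernel debugger framework
options 	DDB			# interactive in-kernel debugger
options 	BREAK_TO_DEBUGGER	# a BREAK on a serial console enters the debugger

# The two options named in the reply.
options 	DEBUG_LOCKS		# extra lock debugging information
options 	DEBUG_VFS_LOCKS		# extra vnode lock debugging
```

With a kernel built this way, the idea would be to break into the debugger from the local or serial console once the machine stops responding, then type show lockedvnods at the db> prompt and note which vnodes are held. As the reply says, this needs console access, so it does not help on a box with only a remote reboot facility.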