Date:      Wed, 04 Apr 2007 07:39:39 -0700
From:      "Chris H." <chris#@1command.com>
To:        freebsd-stable@FreeBSD.ORG
Subject:   Re: NFS == lock && reboot
Message-ID:  <20070404073939.h9p3mgp2m88kswk8@webmail.1command.com>
In-Reply-To: <200704041427.l34ERGP3037877@lurza.secnetix.de>
References:  <200704041427.l34ERGP3037877@lurza.secnetix.de>

Quoting Oliver Fromme <olli@lurza.secnetix.de>:

> Chris H. <chris#@1command.com> wrote:
> > Thomas David Rivers wrote:
> > > I have found that if I kill rpc.lockd on the NFS server,
> > > most of the NFS issues I have (including a similar lock-up on
> > > 6.1-RELEASE) go away.
>
> FWIW, I also had problems with running rpc.lockd and
> rpc.statd (no panics, though).  If you don't need them
> (i.e. you don't need cross-machine locking), then don't
> use them.  Use the -L flag to mount_nfs so at least
> local locking works.
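
For anyone following along, a minimal sketch of what that could look like. It is a dry run that only prints the steps. The server name, export path and mount point are hypothetical placeholders, and the rc.conf lines are assumed to go on whichever machines currently start the two daemons.

```shell
#!/bin/sh
# Dry-run sketch: prints the steps instead of executing them.
# SERVER, EXPORT and MNT are hypothetical placeholders.
SERVER="${SERVER:-nfsserver}"
EXPORT="${EXPORT:-/export/data}"
MNT="${MNT:-/mnt/data}"

plan_local_locking() {
    # rc.conf settings: do not start the lock/status daemons at boot
    echo 'rpc_lockd_enable="NO"'
    echo 'rpc_statd_enable="NO"'
    # client side: -L keeps fcntl locks local instead of forwarding them
    echo "mount_nfs -L ${SERVER}:${EXPORT} ${MNT}"
}

plan_local_locking
```

With -L the locks are only visible on the mounting machine, so this is only suitable when no other client needs to see them.
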
>
> > You don't happen to have any experiences keeping rpc.statd
> > running?
>
> Basically, it doesn't make much sense to run one without
> the other.  If you disable rpc.lockd, you can also safely
> disable rpc.statd.
>
> However, I don't think that your actual problem (lock-up
> and panics) is related to rpc.lockd or rpc.statd.  It
> rather sounds like something else is wrong with your
> machine.  NFS works perfectly fine for me, including
> copying huge files.
>
> You wrote that you had a lot of crashes that accumulated
> many files in lost+found.  Well, maybe your filesystem
> was somehow damaged in the process.  It is possible to
> damage file systems in a way that can lead to panics, and
> it's not necessarily detected and repaired by fsck.

Indeed. I /too/ considered this. However, I largely dismissed it as
a possibility, since almost all of the files are zero-length. The
others are fragments of logs. I'm not /completely/ ruling it out, though.

>
> > > > # cp /path/to/approx/10Mb/file /host/path/to/dest/dir/
> > > >
> > > > Fatal double fault
> > > > eis 0x0blah
> > > > eiblah blah0x
> > > > panic double fault
> > > > no dump device defined
>
> You should try to set up a dump device, so you get a kernel
> crash dump next time.  The crash dump can be used to find
> out where the crash occurred -- and I bet it's not in the
> NFS code.
>
> See the Handbook for details on how to set up a dump device.
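
As a rough sketch of the dump device setup (a dry run that only prints the steps): the device name /dev/ad0s1b is a hypothetical swap partition, not taken from this machine, and the Handbook remains the authority on the details for a given release.

```shell
#!/bin/sh
# Dry-run sketch: prints the steps instead of executing them.
# DUMPDEV is a hypothetical swap partition; substitute your own.
DUMPDEV="${DUMPDEV:-/dev/ad0s1b}"
DUMPDIR="${DUMPDIR:-/var/crash}"

plan_dump_device() {
    # rc.conf settings so the dump device is configured at every boot
    echo "dumpdev=\"${DUMPDEV}\""
    echo "dumpdir=\"${DUMPDIR}\""
    # enable it right away, without waiting for a reboot
    echo "dumpon ${DUMPDEV}"
    # after the next panic and reboot, extract the dump by hand if needed
    echo "savecore ${DUMPDIR} ${DUMPDEV}"
}

plan_dump_device
```

The swap partition needs to be large enough to hold the dump, and the dump directory needs enough free space for savecore to write it out.
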
>
> By the way, does the problem also occur when copying the
> file to/from a memory disk, so no physical disk is involved?
> That way you would exclude the disk and the disk driver as
> potential causes.  Similarly, try a loopback NFS mount
> (i.e. mount from 127.0.0.1) in order to exclude the network
> interface driver as a potential cause.
>
> If the problem still exists when copying a 10 MB file from
> a memory disk to a memory disk (same or other) via a
> localhost mount on the same machine, then it looks like
> the NFS code might be at fault.
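
The memory disk plus loopback mount test described above might be sketched roughly as follows. This is a dry run that only prints the steps. The md unit number, size and mount points are hypothetical, and it assumes the NFS server daemons (rpcbind, mountd, nfsd) are already running on the machine and that both mount points exist.

```shell
#!/bin/sh
# Dry-run sketch: prints the steps instead of executing them.
# Unit number, size and mount points are hypothetical placeholders.
MD_UNIT="${MD_UNIT:-10}"
MD_MNT="${MD_MNT:-/mnt/md}"
NFS_MNT="${NFS_MNT:-/mnt/nfs}"

plan_loopback_test() {
    # swap-backed memory disk, so no physical disk driver is exercised
    # (at this size it would normally stay in RAM)
    echo "mdconfig -a -t swap -s 64m -u ${MD_UNIT}"
    echo "newfs /dev/md${MD_UNIT}"
    echo "mount /dev/md${MD_UNIT} ${MD_MNT}"
    # export it to localhost only, then make mountd reread /etc/exports
    echo "echo '${MD_MNT} -maproot=root 127.0.0.1' >> /etc/exports"
    echo 'kill -HUP $(cat /var/run/mountd.pid)'
    # loopback mount, so no network interface driver is involved
    echo "mount_nfs 127.0.0.1:${MD_MNT} ${NFS_MNT}"
    # create a 10 MB file on the memory disk and copy it back via NFS
    echo "dd if=/dev/zero of=${MD_MNT}/testfile bs=1m count=10"
    echo "cp ${MD_MNT}/testfile ${NFS_MNT}/testfile.copy"
}

plan_loopback_test
```

If the copy in the last step still triggers the double fault, neither the disk driver nor the network driver was in the path, which would point back at the NFS code.
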
>
> Best regards
>   Oliver

All good advice. I'm going to /initially/ take the easy way out
(remove lockd/statd from rc.conf) as a quick experiment.
Then I'll endeavour to investigate further using your suggestions.

Thank you very much for all your time and your thoughtful answer.

--Chris


>
> --
> Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
> Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
> secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht
> München, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart
>
> FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd
>
> "C++ is the only current language making COBOL look good."
>        -- Bertrand Meyer
>



-- 
panic: kernel trap (ignored)



-----------------------------------------------------------------
FreeBSD 5.4-RELEASE-p12 (SMP - 900x2) Tue Mar 7 19:37:23 PST 2006
/////////////////////////////////////////////////////////////////




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20070404073939.h9p3mgp2m88kswk8>