FreeBSD Mail Archives

Date:      Fri, 19 Oct 2007 01:13:16 -0700
From:      "Chris H." <chris#@1command.com>
To:        freebsd-stable@freebsd.org
Subject:   Re: Reproducable, possibly NFS related, fatal double fault in 6.2-R-p7
Message-ID:  <20071019011316.5ffmycud8g0oggsg@webmail.1command.com>
In-Reply-To: <47151FF7.2080501@FreeBSD.org>
References:  <20071004165755.GA1049@pp.htv.fi> <47120D83.1010703@FreeBSD.org> <20071015203202.GA17964@pp.htv.fi> <20071016004637.GA79351@cdnetworks.co.kr> <20071016185714.GB2186@pp.htv.fi> <20071016130146.pfyan4vs5cwgsoc0@webmail.1command.com> <20071016202251.GC4047@lava.net> <47151FF7.2080501@FreeBSD.org>

Quoting Kris Kennaway <kris@freebsd.org>:

> Clifton Royston wrote:
>> On Tue, Oct 16, 2007 at 01:01:46PM -0700, Chris H. wrote:
>>> excerpt from this list titled: NFS == lock && reboot, that I posted 
>>> follows:
>>>
>>> ------8<---SNIP---8<-----SNIP-----8<-------
>>> # uname -a
>>> FreeBSD host.domain.tld 6.2-RELEASE FreeBSD 6.2-RELEASE #0: Fri Jan 
>>> 26 16:27:14 PST 2007
>>>
>>> Greetings,
>>> Does anyone know when NFS and friends will be working again? I 
>>> haven't been able
>>> to /safely/ use it from 4.8 on. I remember some talk on the list 
>>> sometime ago and
>>> then it seemed to be resolved, as the discussion ended. So I thought it was
>>> fixed. Seems not. :(
>>>
>>> My scenario;
>>> mount host off root:
>>> mount script exec'd follows...
>>>
>>> #!/bin/sh -
>>> mount -t nfs host.domain.tld:/ /host
>>> mount -t nfs host.domain.tld:/var /host/var
>>>
>>> confirm mount...
>>>
>>> # ls /host
>>> .snap    COPYRIGHT    bin
>>> ...
>>> usr    var    tmp
>>>
>>> OK looks good...
>>>
>>> # cp /path/to/approx/10Mb/file /host/path/to/dest/dir/
>>>
>>> Fatal double fault
>>> eis 0x0blah
>>> eiblah blah0x
>>> panic double fault
>>> no dump device defined
>>> rebooting in 15sec...
>>>
>>> Hmmm... that's not good. :(
>>>
>>> ------8<---SNIP---8<-----SNIP-----8<-------
>>>
>>> My final solution was to change the lines in /etc/rc.conf
>>> from:
>>> nfs_client_enable="YES"
>>> nfs_reserved_port_only="YES"
>>> nfs_server_enable="YES"
>>> rpc_lockd_enable="YES"
>>> rpc_statd_enable="YES"
>>> rpcbind_enable="YES"
>>>
>>> to:
>>> nfs_client_enable="YES"
>>> nfs_reserved_port_only="YES"
>>> nfs_server_enable="YES"
>>> #rpc_lockd_enable="YES"
>>> #rpc_statd_enable="YES"
>>> rpcbind_enable="YES"
>>>
>>> Making those changes ended the "Fatal double fault && reboot in 15 
>>> seconds..."
>>
>>   Thanks for this very timely mention!  The cluster of servers I am
>> about to upgrade from 4.8 <embarrassed cough> to 6.2 relies heavily on
>> NFS to an old Netapp.  If I have got to disable rpc_lockd and
>> rpc_statd, it's good to know that now!
>>    Can I ask, can anybody confirm that they're running 6.2 on NFS
>> successfully *with* lockd and statd?
>
> Er, yes, of course it does.  The old message he is quoting is bogus 
> on its own,
While I'll grant you that I haven't *yet* found/taken the time to create a
dump device, re-enable rpc_lockd && rpc_statd, and cp a 10MB file to the
mount point to produce an *instantaneous* "Fatal double fault", I don't
think it's fair to label my original post entirely /bogus/ - especially in
light of the recent post I replied to, which seems to share some very
common ground.
I should probably mention that since my last posting (my original thread),
I have some 20+ RELENG_6_2 boxen that *do* have rpc_lockd + rpc_statd
enabled, yet none of them produce a "Fatal double fault". They are all
Tyan SMP boards with dual onboard fxp's - as opposed to the Nvidia UP
board, which has a single onboard nve. They are all inter-connected via NFS.
I have a 750GB drive hanging off the /problematic/ Nvidia board that I
had intended to use for NFS backups, but given the NFS issue I had with
it, that didn't seem to be the best solution. If anyone felt like throwing
me a "cheat sheet" for creating a dump device out of that drive and a
"quickie" for producing a backtrace, I'm sure I'd be better able to find
the time to produce the required information. I'm sorry. It's
just that I'm a hundred million miles away from that right now, as I've
been building several large web applications and their deadline is fast
approaching. FWIW I bounced all the servers today, and therefore have
recent /verbose/ dmesg's, should any of the information they provide be
of any help/use to anyone.
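
For reference, a rough sketch of the kind of "cheat sheet" asked for
above, assuming a stock 6.x install with a swap partition at least as
large as RAM. The device name /dev/ad4s1b and the kernel.debug path are
placeholders, not taken from the box described here:

```sh
# /etc/rc.conf fragment (sketch; adjust for the machine in question)
dumpdev="AUTO"          # assumption: picks a swap device from /etc/fstab
dumpdir="/var/crash"    # where savecore(8) writes vmcore.N on the next boot

# To enable dumps without a reboot, as root (device name is hypothetical):
#   dumpon /dev/ad4s1b
#
# After the panic and reboot, if the dump was not saved automatically:
#   savecore /var/crash /dev/ad4s1b
#
# Then, with a kernel built with debug symbols (path is a placeholder):
#   kgdb /path/to/kernel.debug /var/crash/vmcore.0
#   (kgdb) bt
```

The "no dump device defined" line in the panic output quoted earlier is
what the dumpdev setting is meant to address. A dump may still fail if
the fault leaves the kernel too damaged to write it.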

Take care. :)

--Chris

> I don't know if he ever was able to provide meaningful traces but it 
> may well be nve as in the upthread discussion.
>
> Kris
>
>
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>



-- 
panic: kernel trap (ignored)

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20071019011316.5ffmycud8g0oggsg>
