Date:      Thu, 28 Aug 2003 19:19:59 +0200
From:      Pawel Worach <pawel.worach@telia.com>
To:        Robert Watson <rwatson@freebsd.org>
Cc:        freebsd-current@freebsd.org
Subject:   Re: nfs tranfers hang in state getblck or nfsread
Message-ID:  <3F4E39BF.10001@telia.com>
In-Reply-To: <Pine.NEB.3.96L.1030828084515.34202C-100000@fledge.watson.org>
References:  <Pine.NEB.3.96L.1030828084515.34202C-100000@fledge.watson.org>

Robert Watson wrote:
> On Wed, 27 Aug 2003, Pawel Worach wrote:
> 
> Ok, so let me see if I have the sequence of events straight:
> 
> (1) Boot a 4.8-RELEASE/STABLE NFS server
> (2) Boot a 5.1-RELEASE/CURRENT NFS client
> (3) Mount a file system using TCP NFSv3
> (4) Reboot the client system, reboot, and remount
> (5) Thrash the file system a bit with large reads/writes, and it hangs

Not quite, more like this:
1) Boot the 5.1-CURRENT nfs server
2) Boot the 5.1-CURRENT diskless client (I'm using PXE/DHCP)
3) Log in and run find(1) for a while on every filesystem.
(e.g. find / ^C ; find /usr ^C ; find /export ^C and so on to
generate some getattr(), read() and c/o calls)
4) Shut down the client in a _non-clean_ way, pull the power
or enter DDB and 'reset'.
5) Boot the diskless client again.
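
Step 3 can be sketched as a small sh function. This is only a rough
sketch, not the exact commands that were run: thrash_fs and RUN_SECS are
hypothetical names, and it stops find(1) with the default SIGTERM instead
of a typed ^C, because a non-interactive sh starts background jobs with
SIGINT ignored.

```shell
#!/bin/sh
# Hypothetical helper: walk each file system with find(1) for a few
# seconds, then stop it, to generate NFS traffic before the unclean reset.
# RUN_SECS is an assumed knob, not something from the original report.
RUN_SECS=${RUN_SECS:-10}

thrash_fs() {
    for fs in "$@"; do
        # Run find in the background and discard its output.
        find "$fs" > /dev/null 2>&1 &
        pid=$!
        sleep "$RUN_SECS"
        # Default signal (SIGTERM); ignore the error if find already exited.
        kill "$pid" 2>/dev/null
        wait "$pid" 2>/dev/null
        echo "walked $fs"
    done
}
```

Calling it as "thrash_fs / /usr /export" on the client would roughly
correspond to the find runs listed in step 3.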

Here are the messages I get while booting the client (step 5).
(darkstar is the server, corona is the client. The message about mounttab
is present at every boot and is not related to this problem.)
Mounting root from nfs:
NFS ROOT: 192.168.1.11:/export/root
start_init: trying /sbin/init
Interface fxp0 IP-Address 192.168.1.20 Broadcast 192.168.1.255
Loading configuration files.
Entropy harversting: interrupts ethernet point_to_point
Starting file system checks:
nfs: can't update /var/db/mounttab for darkstar:/export/root
+++ mount_md of /var
nfs server darkstar:/usr: not responding
<insert about a 10 second delay here>
nfs server darkstar:/usr: is alive again
nfs server darkstar:/usr/home: not responding
<insert about a 20 second delay here>
nfs server darkstar:/usr/home: is alive again
<insert about a 20 second delay here>
[tcp] darkstar:/export: nfsd: RPCPROG_NFS: RPC: Remote system error - Operation timed out
<insert about a 80 second delay here>
nfs server darkstar:/export: not responding
<insert about a 40 second delay here>
nfs server darkstar:/export: is alive again

From here on the boot continues normally and the system works fine.

I'm going to set different mount options for every filesystem now
and do this again, so maybe I can nail down what is causing this.
The only filesystem that doesn't have problems is /, and that is
also the only one using UDP.

Hope this is not as confusing as my previous mail :)

And whoever commented about the "magic" stuff, that was a cut-and-paste from the
'dumpfs <fs> | grep UFS' command.

	- Pawel


