Date:      Mon, 25 Oct 2004 02:20:08 +0200
From:      Joan Picanyol <lists-freebsd-questions@biaix.org>
To:        freebsd-stable@freebsd.org
Cc:        freebsd-net@freebsd.org
Subject:   process stuck in nfsfsync state
Message-ID:  <20041025002008.GA36161@grummit.biaix.org>

[please honour Mail-Followup-To:, no need to keep the crosspost]

This is a repost of
http://docs.FreeBSD.org/cgi/mid.cgi?20041014110752.GA57541, with some
additional information. I've updated the client to RC1, and the problem
still persists. In short, a 5.3-RC1 client mounting /home off a 4.10-p3
server can't use the NFS fs anymore when trying to start GNOME, since gconfd
and gnome-session are in nfsfsync state. Any process accessing the fs
hangs, and the console gets full of
nfs server grummit:/fs/home/mount: not responding
messages, even though the client can still ping the server and other
mount points are still available.

AFAICT, nfsd and friends are running both on the client and the server,
and the client can use RPC properly (checked via rpcinfo).

Also, doing 'tcpdump -vv -s 192 port nfs' on the client and the server
seems to support the hypothesis of a locking issue, since I see a write
request for the same fh repeating over and over.
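
One way to check for this kind of repetition in saved tcpdump text output
is sketched below. It assumes requests are printed as "<op> fh <handle>",
as in the NFS examples in the tcpdump man page; the function names and the
threshold are hypothetical, chosen only for illustration.

```python
import re
from collections import Counter

# Assumption: tcpdump prints NFS requests as "<op> fh <handle> ...",
# e.g. "... 148 write fh 21,11/12.195 4096 bytes @ 0".
FH_RE = re.compile(r"\b(\w+) fh (\S+)")


def count_fh_requests(lines):
    """Count (operation, file handle) pairs found in tcpdump text lines."""
    counts = Counter()
    for line in lines:
        m = FH_RE.search(line)
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return counts


def repeated_writes(lines, threshold=3):
    """Return file handles with at least `threshold` write requests."""
    counts = count_fh_requests(lines)
    return sorted(
        fh for (op, fh), n in counts.items()
        if op == "write" and n >= threshold
    )
```

Feeding it the lines of a capture saved to a file would list the file
handles that keep being rewritten, which narrows down which file the
client is stuck flushing.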

The trace of gnome-session is as follows:

db> tr 610
sched_switch(c180b4b0,0,1,11d,27b8ea4) at sched_switch+0x190
mi_switch(1,0,c063d701,19d,2) at mi_switch+0x2ac
sleepq_switch(c216d23c,c0639f0f,18e,2,da518a5c) at sleepq_switch+0x134
sleepq_wait(c216d23c,0,c063b2f5,db,0) at sleepq_wait+0x41
msleep(c216d23c,c216d210,4d,c1906703,0) at msleep+0x3b5
nfs_flush(c216d210,c17fed00,1,c180b4b0,0) at nfs_flush+0x961
nfs_close(da518b8c,1,c0643a5e,140,c0681da0) at nfs_close+0x7e
vn_close(c216d210,2,c17fed00,c180b4b0,c0692c20) at vn_close+0x67
vn_closefile(c1c2b6e8,c180b4b0,c0637a98,829,c1c2b6e8) at vn_closefile+0xc4
fdrop_locked(c1c2b6e8,c180b4b0,c0637a98,768) at fdrop_locked+0xb4
fdrop(c1c2b6e8,c180b4b0,3,c180b4b0,da518c98) at fdrop+0x3c
closef(c1c2b6e8,c180b4b0,c0637a98,3e3,0) at closef+0x21c
close(c180b4b0,da518d14,4,431,1) at close+0x135
syscall(2f,2f,2f,0,28d38ec0) at syscall+0x272
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (6, FreeBSD ELF32, close), eip = 0x28ca1e6f, esp = 0xbfbfe52c, ebp = 0xbfbfe538 ---
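
To read the trace as a call path, a short sketch like the following can
pull the function names out of the ddb frames and print them from the
outermost caller inward. The regular expression and the helper name are
assumptions based only on the frame layout shown above.

```python
import re

# Matches ddb frames of the form shown above, e.g.
#   "nfs_flush(c216d210,c17fed00,1,c180b4b0,0) at nfs_flush+0x961"
FRAME_RE = re.compile(r"^(\w+)\(.*\) at (\w+)\+0x([0-9a-f]+)$")


def call_chain(trace_lines):
    """Return function names ordered from outermost caller to innermost."""
    names = []
    for line in trace_lines:
        m = FRAME_RE.match(line.strip())
        if m:
            names.append(m.group(2))
    # ddb lists the innermost frame first, so reverse for caller order.
    return list(reversed(names))
```

Applied to the trace above, the chain runs from close through vn_close
and nfs_close into nfs_flush, which is asleep in msleep.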

I have a debugging kernel and a console attached, so feel free to ask
for any other information of interest.

This is driving me nuts, and I'm surely not the only one using GNOME
over NFS. Is anyone else seeing this? What exactly is going on? How can
I fix it? The problem may have appeared going from BETA3 to BETA6, but
I've been unable to "downgrade" the workstation. Where can I get a copy
of BETA3 to test this?

tks
-- 
pica


