Date:      Fri, 28 Jan 2011 17:10:41 -0800
From:      John Hickey <jjh@deterlab.net>
To:        freebsd-stable@freebsd.org
Subject:   nfsd hung on ufs vnode lock
Message-ID:  <C36EE8BD-D906-4B46-AD96-3916FFBAD254@deterlab.net>

There was a previous thread about this, but it doesn't look like there
was any resolution:

http://lists.freebsd.org/pipermail/freebsd-stable/2010-May/056986.html

I run a fileserver for an Emulab (www.emulab.net) system.  As such, the
exports table is constantly modified as experiments are swapped in and
out.  We also get a lot of researchers using NFS for strange things.  In
this case, the exclusive lock was for a cache directory shared by about
36 machines running Ubuntu 8.04 and mounting with NFSv2.  Eventually,
all our nfsd processes get stuck since the exclusive lock for the
directory is never released.  I could use any and all pointers on
getting this fixed.

What I am running:

jjh@users: ~$ uname -a
FreeBSD users.isi.deterlab.net 7.3-RELEASE-p2 FreeBSD 7.3-RELEASE-p2 #9: Tue Sep 14 16:24:57 PDT 2010     root@users.isi.deterlab.net:/usr/obj/usr/src/sys/USERS7  i386

Here are the sleepchains for my system (note that 0xd1f72678 appears
twice):

0xce089cf0: tag syncer, type VNON
 usecount 1, writecount 0, refcount 2 mountedhere 0
 flags ()
  lock type syncer: EXCL (count 1) by thread 0xcdb4b000 (pid 46)

0xd1f72678: tag ufs, type VDIR
 usecount 2, writecount 0, refcount 67 mountedhere 0
 flags ()
 v_object 0xd1e90e80 ref 0 pages 1
  lock type ufs: EXCL (count 1) by thread 0xce1146c0 (pid 866) with 62 pending
     ino 143173560, on dev mfid0s1f

0xd1e6f228: tag ufs, type VDIR
 usecount 1, writecount 0, refcount 3 mountedhere 0
 flags ()
 v_object 0xd180f480 ref 0 pages 1
  lock type ufs: SHARED (count 1)
     ino 19268907, on dev mfid0s1f

0xd1a37564: tag ufs, type VNON
 usecount 1, writecount 0, refcount 1 mountedhere 0
 flags ()
  lock type ufs: EXCL (count 1) by thread 0xcdb4c240 (pid 871)
     ino 115689129, on dev mfid1s1d

0xce089cf0: tag syncer, type VNON
 usecount 1, writecount 0, refcount 2 mountedhere 0
 flags ()
  lock type syncer: EXCL (count 1) by thread 0xcdb4b000 (pid 46)

0xd1f72678: tag ufs, type VDIR
 usecount 2, writecount 0, refcount 67 mountedhere 0
 flags ()
 v_object 0xd1e90e80 ref 0 pages 1
  lock type ufs: EXCL (count 1) by thread 0xce1146c0 (pid 866) with 62 pending
     ino 143173560, on dev mfid0s1f

0xd1e6f228: tag ufs, type VDIR
 usecount 1, writecount 0, refcount 3 mountedhere 0
 flags ()
 v_object 0xd180f480 ref 0 pages 1
  lock type ufs: SHARED (count 1)
     ino 19268907, on dev mfid0s1f

0xd1a37564: tag ufs, type VNON
 usecount 1, writecount 0, refcount 1 mountedhere 0
 flags ()
  lock type ufs: EXCL (count 1) by thread 0xcdb4c240 (pid 871)
     ino 115689129, on dev mfid1s1d
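
To pick the contended lock out of a dump like this one, a minimal sketch
in Python may help.  It assumes the dump has been saved as plain text with
each "lock type" entry on a single line; the function names
(parse_locked_vnodes, contended) and the SAMPLE text are hypothetical and
only illustrate the idea, they are not part of any FreeBSD tool.

```python
import re

# Patterns for the two line shapes of interest in the dump format above.
VNODE_RE = re.compile(r"^(0x[0-9a-f]+): tag (\w+), type (\w+)")
LOCK_RE = re.compile(
    r"lock type \w+: (EXCL|SHARED) \(count \d+\)"
    r"(?: by thread (0x[0-9a-f]+) \(pid (\d+)\))?"
    r"(?: with (\d+) pending)?"
)


def parse_locked_vnodes(text):
    """Return {vnode_addr: info} for each vnode entry in the dump text."""
    vnodes = {}
    current = None
    for line in text.splitlines():
        m = VNODE_RE.match(line.strip())
        if m:
            current = m.group(1)
            # A vnode listed more than once is recorded only once.
            vnodes.setdefault(current, {"tag": m.group(2), "type": m.group(3)})
            continue
        m = LOCK_RE.search(line)
        if m and current:
            mode, thread, pid, pending = m.groups()
            vnodes[current].update(
                mode=mode,
                thread=thread,
                pid=int(pid) if pid else None,
                pending=int(pending) if pending else 0,
            )
    return vnodes


def contended(vnodes):
    """Exclusive locks that have waiters, most waiters first."""
    held = [
        (addr, info)
        for addr, info in vnodes.items()
        if info.get("mode") == "EXCL" and info.get("pending", 0) > 0
    ]
    return sorted(held, key=lambda item: item[1]["pending"], reverse=True)


# Abbreviated copy of three entries from the dump (assumed unwrapped).
SAMPLE = """\
0xce089cf0: tag syncer, type VNON
  lock type syncer: EXCL (count 1) by thread 0xcdb4b000 (pid 46)

0xd1f72678: tag ufs, type VDIR
  lock type ufs: EXCL (count 1) by thread 0xce1146c0 (pid 866) with 62 pending

0xd1e6f228: tag ufs, type VDIR
  lock type ufs: SHARED (count 1)
"""

for addr, info in contended(parse_locked_vnodes(SAMPLE)):
    print(addr, "held by pid", info["pid"], "thread", info["thread"],
          "waiters:", info["pending"])
```

On the entries above this singles out the one vnode that is held
exclusively with waiters, which is the thread worth a backtrace.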

Here is process 866:

(kgdb) proc 866
[Switching to thread 66 (Thread 100104)]#0  sched_switch (td=0xce1146c0, newtd=Variable "newtd" is not available.
) at /usr/src/sys/kern/sched_ule.c:1936
1936
(kgdb) bt
#0  sched_switch (td=0xce1146c0, newtd=Variable "newtd" is not available.
) at /usr/src/sys/kern/sched_ule.c:1936
#1  0xc080a4a6 in mi_switch (flags=Variable "flags" is not available.
) at /usr/src/sys/kern/kern_synch.c:444
#2  0xc0837aab in sleepq_switch (wchan=Variable "wchan" is not available.
) at /usr/src/sys/kern/subr_sleepqueue.c:497
#3  0xc08380f6 in sleepq_wait (wchan=0xd4176394) at /usr/src/sys/kern/subr_sleepqueue.c:580
#4  0xc080a92a in _sleep (ident=0xd4176394, lock=0xc0ceb498, priority=80, wmesg=0xc0bb656e "ufs", timo=0) at /usr/src/sys/kern/kern_synch.c:230
#5  0xc07ea9fa in acquire (lkpp=0xcd7375a0, extflags=Variable "extflags" is not available.
) at /usr/src/sys/kern/kern_lock.c:151
#6  0xc07eb2ec in _lockmgr (lkp=0xd4176394, flags=8194, interlkp=0xd41763c4, td=0xce1146c0, file=0xc0bc20c8 "/usr/src/sys/kern/vfs_subr.c", line=2062)
 at /usr/src/sys/kern/kern_lock.c:384
#7  0xc0a24765 in ffs_lock (ap=0xcd737608) at /usr/src/sys/ufs/ffs/ffs_vnops.c:377
#8  0xc0b26876 in VOP_LOCK1_APV (vop=0xc0ca4740, a=0xcd737608) at vnode_if.c:1618
#9  0xc0896d76 in _vn_lock (vp=0xd417633c, flags=8194, td=0xce1146c0, file=0xc0bc20c8 "/usr/src/sys/kern/vfs_subr.c", line=2062) at vnode_if.h:851
#10 0xc0889da4 in vget (vp=0xd417633c, flags=8194, td=0xce1146c0) at /usr/src/sys/kern/vfs_subr.c:2062
#11 0xc087bd23 in vfs_hash_get (mp=0xce0962d0, hash=143173100, flags=Variable "flags" is not available.
) at /usr/src/sys/kern/vfs_hash.c:81
#12 0xc0a1e429 in ffs_vgetf (mp=0xce0962d0, ino=143173100, flags=2, vpp=0xcd737800, ffs_flags=0) at /usr/src/sys/ufs/ffs/ffs_vfsops.c:1400
#13 0xc0a1e95e in ffs_vget (mp=0xce0962d0, ino=143173100, flags=2, vpp=0xcd737800) at /usr/src/sys/ufs/ffs/ffs_vfsops.c:1380
#14 0xc0a00765 in ffs_valloc (pvp=0xd1f72678, mode=33152, cred=0xcf024700, vpp=0xcd737800) at /usr/src/sys/ufs/ffs/ffs_alloc.c:970
#15 0xc0a30945 in ufs_makeinode (mode=33152, dvp=0xd1f72678, vpp=0xcd737a64, cnp=0xcd737a78) at /usr/src/sys/ufs/ufs/ufs_vnops.c:2254
#16 0xc0a310c0 in ufs_create (ap=0xcd73799c) at /usr/src/sys/ufs/ufs/ufs_vnops.c:193
#17 0xc0b26ed2 in VOP_CREATE_APV (vop=0xc0ca4740, a=0xcd73799c) at vnode_if.c:206
#18 0xc09c02ad in nfsrv_create (nfsd=0xcde57500, slp=0xcde37000, td=0xce1146c0, mrq=0xcd737c58) at vnode_if.h:112
#19 0xc09c7a61 in nfssvc (td=0xce1146c0, uap=0xcd737cfc) at /usr/src/sys/nfsserver/nfs_syscalls.c:456
#20 0xc0b108e5 in syscall (frame=0xcd737d38) at /usr/src/sys/i386/i386/trap.c:1101
#21 0xc0af4290 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:262
#22 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)

John Hickey
jjh@deterlab.net




