Date:      Wed, 25 Apr 2001 15:08:52 +0100
From:      Oliver Cook <ollie@uk.clara.net>
To:        freebsd-hackers@freebsd.org
Subject:   open (vfs_syscalls.c:994) && NFS
Message-ID:  <20010425150852.B37512@mutare.noc.clara.net>

A little bit of background: these systems are
FreeBSD 3.x and 4.x installations running Apache
1.3.x serving webpages stored on a NetApp filer
over NFS.

One folder has a corrupt directory entry:
/clara/htdocs/clara.net/k/o/m/komunikation/webspace/

Trying to 'cat', 'cp', etc. any file in this
directory leaves the process stuck in "D"
(disk wait) state.

After about a week there are hundreds of stuck
httpd processes in exactly this state. It is not
possible to attach to them, but information can
be gleaned from a kernel backtrace:

hera[/]# ps aux|grep httpd|grep " D"|head -1
claranet 82569  0.0  0.0  2464   68  ??  D     6:47AM   0:00.01 /usr/local/apache/bin/httpd
Broken pipe
hera[/]# gdb -k /sys/compile/HERA/kernel.debug /dev/mem
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...
IdlePTD 3080192
initial pcb at be17000
panic messages:
---
---
#0  mi_switch () at ../../kern/kern_synch.c:859
859             if (switchtime.tv_sec == 0)
(kgdb) proc 82569
(kgdb) bt
#0  mi_switch () at ../../kern/kern_synch.c:859
#1  0xc01467e9 in tsleep (ident=0xe00a3aca, priority=18, wmesg=0xc024a79b "nfsvinval", timo=0) at ../../kern/kern_synch.c:468
#2  0xc01ad14f in nfs_vinvalbuf (vp=0xe0097b80, flags=1, cred=0xc691e800, p=0xe27c8220, intrflg=1) at ../../nfs/nfs_bio.c:1170
#3  0xc01d02a6 in nfs_open (ap=0xe2878e10) at ../../nfs/nfs_vnops.c:506
#4  0xc01736af in vn_open (ndp=0xe2878edc, fmode=1, cmode=420) at vnode_if.h:189
#5  0xc016f6a1 in open (p=0xe27c8220, uap=0xe2878f80) at ../../kern/vfs_syscalls.c:994
#6  0xc02238e6 in syscall (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 4, tf_esi = 672559256, tf_ebp = -1077937648,
      tf_isp = -494432300, tf_ebx = 672502180, tf_edx = 672559256, tf_ecx = 15, tf_eax = 5, tf_trapno = 12, tf_err = 2, tf_eip = 672418516,
      tf_cs = 31, tf_eflags = 659, tf_esp = -1077937692, tf_ss = 47}) at ../../i386/i386/trap.c:1073
#7  0xc0218be6 in Xint0x80_syscall () 
#8  0x8062fe0 in ?? ()
#9  0x806ccdd in ?? ()
#10 0x806618c in ?? ()
#11 0x80797f4 in ?? ()
#12 0x807985e in ?? ()
#13 0x8071027 in ?? ()
#14 0x80712ac in ?? ()
#15 0x807162c in ?? ()
#16 0x8071b41 in ?? ()
#17 0x8072144 in ?? ()
#18 0x804a159 in ?? ()
(kgdb) fr 5
#5  0xc016f6a1 in open (p=0xe27c8220, uap=0xe2878f80) at ../../kern/vfs_syscalls.c:994
994             error = vn_open(&nd, flags, cmode);
(kgdb) print nd
$1 = {ni_dirp = 0x80e6a64 "/clara/htdocs/clara.net/k/o/m/komunikation/webspace/mabel.xls", ni_segflg = UIO_USERSPACE, ni_startdir = 0x0,
  ni_rootdir = 0xdd196ec0, ni_topdir = 0x0, ni_vp = 0xe0097b80, ni_dvp = 0xe0097c20, ni_pathlen = 1, ni_next = 0xe0424036 "htm",
  ni_loopcnt = 1, ni_cnd = {cn_nameiop = 0, cn_flags = 49220, cn_proc = 0xe27c8220, cn_cred = 0xc691e800, cn_pnbuf = 0xe0424000 "",
    cn_nameptr = 0xe042402d "ce/n1nhs.htm", cn_namelen = 9, cn_consume = 0}}
(kgdb) print nd->ni_cnd->cn_nameptr
$2 = 0xe042402d "ce/n1nhs.htm"
(kgdb) print nd->ni_cnd->cn_nameptr
$3 = 0xe042402d "ce/n1nhs.htm"

The pointer ni_dirp contains a reference to a file
in the directory with the corrupt entry. This is
true for ALL the processes that are stuck in 'D'.
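
To repeat the check above for every stuck process, the PIDs can be
pulled out of `ps aux` output and fed to `proc <pid>` in kgdb one at a
time. The following is only a minimal sketch in Python: the function
name `stuck_pids` is hypothetical, and it assumes the usual 11-column
`ps aux` layout (USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME
COMMAND) with a single-token STARTED column, as in the output shown
earlier.

```python
def stuck_pids(ps_output, command_substring="httpd"):
    """Return PIDs whose STAT column starts with 'D' (disk wait).

    Assumes the 11-column `ps aux` layout:
    USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
    """
    pids = []
    for line in ps_output.splitlines():
        # Split into at most 11 fields so COMMAND keeps its arguments.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header line or malformed line
        stat, command = fields[7], fields[10]
        if stat.startswith("D") and command_substring in command:
            pids.append(int(fields[1]))
    return pids


if __name__ == "__main__":
    sample = (
        "USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND\n"
        "claranet 82569  0.0  0.0  2464   68  ??  D     6:47AM   0:00.01 "
        "/usr/local/apache/bin/httpd\n"
    )
    print(stuck_pids(sample))  # [82569]
```

Matching on the STAT column rather than grepping for " D" avoids
false hits on other fields that happen to contain that string.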

What does differ is the pointer cn_nameptr, which
changes with every web request.

I would have thought that httpd would have allocated
memory for each open(), so I am at a loss as to
why the ni_dirp pointer contains the reference to
the Excel spreadsheet in the directory with the corrupt
entry. Why does this not change from request to
request as more files are opened and closed over NFS?

Can anybody explain what is going on with open()?

Thanks.

Ollie

-- 
Oliver Cook    Systems Administrator, ClaraNET
ollie@uk.clara.net      020 7903 3000 ext. 291
