Date:      Wed, 06 Jan 1999 16:41:57 -0500
From:      "C. Stephen Gunn" <csg@physics.purdue.edu>
To:        freebsd-hackers@FreeBSD.ORG
Cc:        ajk@physics.purdue.edu, crh@physics.purdue.edu, jonsmith@physics.purdue.edu, bp@physics.purdue.edu, ab@eas.purdue.edu
Subject:   NFS problems under 3.0-RELEASE
Message-ID:  <199901062142.QAA13257@galileo.physics.purdue.edu>

We've been experiencing crashes/hangs with NFS here recently, and
finally got the chance to debug it today.  Here's what we know so
far:

 1) It happens during NFS writes. In our specific case, it happens
    when writing files to my home directory automounted off my
    machine when I roam to other workstations in our department.

 2) If you increase the frequency of the writes, you can make it
    crash (actually hang) more easily.  While our method of choice
    was to run Netscape (the DB file I/O usually kills it), I've
    had it hang a couple of times when writing files with vi, or
    sending mail.  Again, it only happens over NFS.

I finally got the chance today to install a DDB kernel, and get a
dump/backtrace after the system hung.  The backtrace showed that
the kernel was in the middle of an nfs_vinvalbuf() call which in
turn called vinvalbuf().

Here's the deal: the backtrace showed valid parameters for the
call to vinvalbuf() from inside nfs_vinvalbuf(), but somehow the
vnode parameter to vinvalbuf() apparently got smashed.

We'd hang in the middle of the tsleep() loop at the beginning of
vinvalbuf() because the loop ignored tsleep()'s ERESTART return
value.  I merged 2-3 lines from -current to check the error, and
the hangs went away.
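
For anyone who wants to see the shape of that change without
digging through the tree, here is a rough userland sketch of a wait
loop that checks the sleep result.  It is not the actual diff and
not the real vinvalbuf() code: struct sketch_vnode, sketch_tsleep(),
sketch_wait_for_output() and the SKETCH_ERESTART value are all
made-up stand-ins, and the stub only imitates "tsleep() returns
nonzero once a signal is pending".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical placeholder; the kernel's ERESTART lives in its own headers. */
#define SKETCH_ERESTART (-1)

/* Hypothetical stand-in for the few vnode fields the loop cares about. */
struct sketch_vnode {
    int v_numoutput;   /* writes still in flight */
    int signal_after;  /* test knob: sleeps allowed before a signal; <0 = never */
    int sleeps_taken;  /* how many times the stub was called */
};

/*
 * Stub standing in for tsleep(): returns 0 on a normal wakeup (and
 * pretends one write finished), or SKETCH_ERESTART once a signal is
 * pending.  After that point it never makes progress again.
 */
static int
sketch_tsleep(struct sketch_vnode *vp)
{
    vp->sleeps_taken++;
    if (vp->signal_after >= 0 && vp->sleeps_taken > vp->signal_after)
        return (SKETCH_ERESTART);
    if (vp->v_numoutput > 0)
        vp->v_numoutput--;
    return (0);
}

/*
 * Wait for pending output, but give up and hand the error back to
 * the caller if the sleep was interrupted.  Without the "if (error)"
 * test this loop would spin forever once the stub starts returning
 * SKETCH_ERESTART, since v_numoutput would never reach zero.
 */
int
sketch_wait_for_output(struct sketch_vnode *vp)
{
    int error;

    assert(vp != NULL);
    while (vp->v_numoutput > 0) {
        error = sketch_tsleep(vp);
        if (error != 0)
            return (error);
    }
    return (0);
}
```

The point is only that the caller gets the error back instead of
looping; what the real caller should then do with it is a separate
question.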

This still doesn't address the underlying problem, though.  Something
is smashing this vnode pointer, usually with 0x0100, as far as I can
tell.  I've not had the time to digest all of the nfs/vfs changes
that have happened since 3.0-RELEASE, but the CVS logs didn't seem
to indicate changes that might address this.

At least it doesn't crash now, but it's probably a bug that's still
out there.

 - Steve

--
C. Stephen Gunn, Computer Systems Engineer         <csg@physics.purdue.edu>
Physics Computer Network, Purdue University    

