Date:      Wed, 19 Feb 2003 20:43:04 -0800
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Craig Boston <craig@xfoil.gank.org>
Cc:        Lars Eggert <larse@ISI.EDU>, current@freebsd.org, Poul-Henning Kamp <phk@critter.freebsd.dk>
Subject:   Re: panic starting gnome
Message-ID:  <3E545CD8.184323A9@mindspring.com>
References:  <3E52BB14.2040309@isi.edu> <3E532F61.653A09B0@mindspring.com> <3E5408B0.9030300@isi.edu> <1045713737.612.22.camel@localhost>

Craig Boston wrote:
> Well, I haven't had much luck tracking down the exact cause.  For some
> reason I haven't been able to figure out, all of my crash dumps jump
> directly from vn_open_cred (line 185 of vfs_vnops.c) to calltrap().  The
> namei call doesn't show up in the stack at all, almost like the function
> is being inlined.  I'm only using -O, which shouldn't inline anything
> not explicitly declared as such.

Nope.  The problem is a NULL pointer dereference: the code apparently
reaches into the proc structure through a proc pointer that is NULL.

> Anyway, using a cvsup binary search I've managed to narrow it down
> some.  The problem did not exist before midnight UTC on 2003-02-15.  It
> does exist on midnight UTC 2003-02-16.  I've been digging through the
> commit logs for that day, but it seems it was a busy day for the VFS
> code with lots of commits.  Since it always happens after an fdfree(),
> I'm leaning toward a large (number of files) commit by alfred@ having to
> do with a lock order reversal and adding a mutex associated with freeing
> filedesc structures.  Just a guess, though.

FWIW, I arrived at the same place, given Lars' debugging information,
though it was only my most likely suspect.  There are some changes
that went in for KSE, as well, but I'm pretty sure they were after
last Wednesday.


> Reproducing the problem seems to be as simple as killing any process
> that has an open, locked file on an NFS volume.  A simple
> 
> gconfd-1 &
> sleep 5; killall -9 gconfd-1
> 
> does it every time for me.  I assume this would also happen if a process
> calls exit() without closing all of its fds first; probably why
> starting GNOME or booting diskless is enough to tickle it.

Yes, this is most likely.
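
For anyone who wants to reproduce this without gconfd, here is a
minimal sketch of a process that holds an open, locked file.  It
assumes the lock in question is a POSIX advisory lock taken with
fcntl(F_SETLK); I have not checked which kind of lock gconfd-1 takes.
The function name open_and_lock() is made up for illustration.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Open (creating if necessary) the file at `path` and take an
 * exclusive advisory lock on the whole file.  Returns the open
 * descriptor, or -1 on failure.  The descriptor is deliberately
 * left open so that the caller can exit, or be killed, while
 * still holding the lock.
 *
 * open_and_lock() is a hypothetical helper, not an existing API.
 */
int
open_and_lock(const char *path)
{
	struct flock fl;
	int fd;

	fd = open(path, O_RDWR | O_CREAT, 0644);
	if (fd == -1) {
		perror("open");
		return (-1);
	}

	memset(&fl, 0, sizeof(fl));
	fl.l_type = F_WRLCK;	/* exclusive (write) lock */
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;		/* 0 means "through end of file" */

	if (fcntl(fd, F_SETLK, &fl) == -1) {
		perror("fcntl(F_SETLK)");
		close(fd);
		return (-1);
	}
	return (fd);
}
```

Calling open_and_lock() on a path that lives on the NFS volume, then
sitting in pause() until the process is hit with kill -9, should go
through the same exit-time descriptor cleanup as the gconfd-1 case
above, assuming the lock type is what matters.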

-- Terry
