Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 11 Dec 2001 21:52:15 EDT
From:      Eric Jacobs <eaja@erols.com>
To:        Dag-Erling Smorgrav <des@ofug.org>, fs@freebsd.org
Subject:   Re: kern/32681: Reproducable PANIC in -stable
Message-ID:  <PstOfc.3c16c65f.c67ea6@localhost>
In-Reply-To: <xzpwuztu0jt.fsf@flood.ping.uio.no>
References:  <200112111500.fBBF04J48329@freefall.freebsd.org> <xzpwuztu0jt.fsf@flood.ping.uio.no>

next in thread | previous in thread | raw e-mail | index | archive | help
Dag-Erling Smorgrav <des@ofug.org> wrote

> Stefan.Esser@o-tel-o.de writes:
> 
> > In order to check for the presence of some process %PID%, nessusd 
> > performs an opendir("/proc/%PID%"), which can lead to a panic in 
> > fstafs(), if the corresponding process just finishes just before the 
> > call to fstatfs ...
> 
> Ah, OK, I see.  First, if nessus needs to check for the existence of a 
> process with a given PID, it should kill(pid, 0) instead of relying on
> procfs.  Second, the problem seems to be insufficient locking -
> possibly in pseudofs, but equally possibly in the VFS system.  When a 
> process terminates, pseudofs automatically reclaims all vnodes 
> associated with it, which leads to an obvious race condition which can
> only be avoided through proper locking.
> 
> Looking at the pseudofs code, it doesn't seem to be at fault.  When the 
> process exits, pfs_exit() removes all vnodes associated with it
> from pseudofs' vnode cache, and vgone()s them.
> 
> I'm starting to think that the race might be in namei(), actually - 
> since the namei() call from fstatfs() succeeds, it means namei() found a 
> vnode, but that vnode's v_mount is NULL, which means it's been 
> reclaimed, which can't happen if it's referenced (can it?), and namei() 
> is supposed to reference the vnode before it returns it.  My bet is that 
> there is a small race between namei() finding a vnode and referencing 
> it, but I'm getting out of my depth.  Does anybody else
> have any ideas?

I think you're on the wrong trail here. I don't run nessus, but 
I can reproduce a very similar crash with this little snippet:
(on my FreeBSD 4.2 system)


#include <sys/param.h>
#include <sys/mount.h>
#include <fcntl.h>
#include <stdio.h>

int main(void)
{
    	int p;
    	int fd;
    	char buf[50];
    	struct statfs sf;
	
    	if (p = fork()) {
        		sprintf(buf, "/proc/%d", p);
        		fd = open(buf, O_RDONLY);
        		printf("open successful fd=%d\n", fd);
        		printf("1st fstatfs returns %d\n", fstatfs(fd, &sf));
        		sleep(1);
        		printf("2nd fstatfs returns %d\n", fstatfs(fd, &sf));
    	}
}

This should crash before the third printf, so namei is not the culprit
here.

If this is the problem in the bug report, it stems from the following
code in 
vfs_subr.c, in vgonel:

	/*
	 * Delete from old mount point vnode list, if on one.
	 */
    	if (vp->v_mount != NULL)
        		insmntque(vp, (struct mount *)0);

This ends up setting v_mount to 0, which happens to aggravate fstatfs
as it is one of the very few syscalls that makes a VFS_* virtual call
without checking the v_type of the vnode that is passed in.

This problem could be solved one of several ways. One is of course to
make a rule that says that that a vnode's v_type must be checked before
a VFS call can be made. I'm not sure what functions other than fstatfs
this would affect.

Note that this would probably still have to affect other VFS syscalls that
use namei rather than getvnode, such as statfs, as the vnode might go
VBAD after the namei but before the virtual dispatch. There might have to
be a way to hold the interlock from the v_type check until the dispatch
to rule out the possibility of a race condition in between.

Another possible solution could be modelled after the way that vgone
handles the v_op field:

	/*
	 * Delete from old mount point vnode list, if on one.
	 */
    	if (vp->v_mount != NULL)
        insmntque(vp, dead_mount_p);

In other words, all of the VBAD vnodes (vgoners) would be considered
to be owned by a special mount structure which would always be valid,
analogous to the dead_vnodeop_p structure for the VOP_* calls. This
could avoid having to add explicit checks and/or interlocking scenarios
for functions that want to call a VFS function given a vnode.





-- 
P


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?PstOfc.3c16c65f.c67ea6>