From owner-freebsd-fs Tue Dec 11 19: 3:59 2001 Delivered-To: freebsd-fs@freebsd.org Received: from smtp02.mrf.mail.rcn.net (smtp02.mrf.mail.rcn.net [207.172.4.61]) by hub.freebsd.org (Postfix) with ESMTP id B63CE37B416 for ; Tue, 11 Dec 2001 19:03:52 -0800 (PST) Received: from 66-44-4-121.s1137.apx1.lnh.md.dialup.rcn.com ([66.44.4.121] helo=localhost.) by smtp02.mrf.mail.rcn.net with smtp (Exim 3.33 #10) id 16DzgT-0001gJ-00; Tue, 11 Dec 2001 22:03:50 -0500 References: <200112111500.fBBF04J48329@freefall.freebsd.org> In-Reply-To: Date: Tue, 11 Dec 2001 21:52:15 EDT From: Eric Jacobs To: Dag-Erling Smorgrav , fs@freebsd.org Subject: Re: kern/32681: Reproducable PANIC in -stable Organization: X-Mailer: Post Office 0.7.2 build 20010211(by eric@localhost 2001/02/11 21:52:30) Message-ID: X-Mailer: PostOffice 0.7 Sender: owner-freebsd-fs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org Dag-Erling Smorgrav wrote > Stefan.Esser@o-tel-o.de writes: > > > In order to check for the presence of some process %PID%, nessusd > > performs an opendir("/proc/%PID%"), which can lead to a panic in > > fstafs(), if the corresponding process just finishes just before the > > call to fstatfs ... > > Ah, OK, I see. First, if nessus needs to check for the existence of a > process with a given PID, it should kill(pid, 0) instead of relying on > procfs. Second, the problem seems to be insufficient locking - > possibly in pseudofs, but equally possibly in the VFS system. When a > process terminates, pseudofs automatically reclaims all vnodes > associated with it, which leads to an obvious race condition which can > only be avoided through proper locking. > > Looking at the pseudofs code, it doesn't seem to be at fault. When the > process exits, pfs_exit() removes all vnodes associated with it > from pseudofs' vnode cache, and vgone()s them. > > I'm starting to think that the race might be in namei(), actually - > since the namei() call from fstatfs() succeeds, it means namei() found a > vnode, but that vnode's v_mount is NULL, which means it's been > reclaimed, which can't happen if it's referenced (can it?), and namei() > is supposed to reference the vnode before it returns it. My bet is that > there is a small race between namei() finding a vnode and referencing > it, but I'm getting out of my depth. Does anybody else > have any ideas? I think you're on the wrong trail here. I don't run nessus, but I can reproduce a very similar crash with this little snippet: (on my FreeBSD 4.2 system) #include #include #include #include int main(void) { int p; int fd; char buf[50]; struct statfs sf; if (p = fork()) { sprintf(buf, "/proc/%d", p); fd = open(buf, O_RDONLY); printf("open successful fd=%d\n", fd); printf("1st fstatfs returns %d\n", fstatfs(fd, &sf)); sleep(1); printf("2nd fstatfs returns %d\n", fstatfs(fd, &sf)); } } This should crash before the third printf, so namei is not the culprit here. If this is the problem in the bug report, it stems from the following code in vfs_subr.c, in vgonel: /* * Delete from old mount point vnode list, if on one. */ if (vp->v_mount != NULL) insmntque(vp, (struct mount *)0); This ends up setting v_mount to 0, which happens to aggravate fstatfs as it is one of the very few syscalls that makes a VFS_* virtual call without checking the v_type of the vnode that is passed in. This problem could be solved one of several ways. One is of course to make a rule that says that that a vnode's v_type must be checked before a VFS call can be made. I'm not sure what functions other than fstatfs this would affect. Note that this would probably still have to affect other VFS syscalls that use namei rather than getvnode, such as statfs, as the vnode might go VBAD after the namei but before the virtual dispatch. There might have to be a way to hold the interlock from the v_type check until the dispatch to rule out the possibility of a race condition in between. Another possible solution could be modelled after the way that vgone handles the v_op field: /* * Delete from old mount point vnode list, if on one. */ if (vp->v_mount != NULL) insmntque(vp, dead_mount_p); In other words, all of the VBAD vnodes (vgoners) would be considered to be owned by a special mount structure which would always be valid, analogous to the dead_vnodeop_p structure for the VOP_* calls. This could avoid having to add explicit checks and/or interlocking scenarios for functions that want to call a VFS function given a vnode. -- P To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message