From owner-freebsd-current Thu May 29 08:15:01 1997
Return-Path:
Received: (from root@localhost) by hub.freebsd.org (8.8.5/8.8.5) id IAA18916 for current-outgoing; Thu, 29 May 1997 08:15:01 -0700 (PDT)
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.50]) by hub.freebsd.org (8.8.5/8.8.5) with SMTP id IAA18905 for ; Thu, 29 May 1997 08:14:58 -0700 (PDT)
Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id IAA03559; Thu, 29 May 1997 08:13:53 -0700
From: Terry Lambert
Message-Id: <199705291513.IAA03559@phaeton.artisoft.com>
Subject: Re: Boom! :-)
To: rsanders@mindspring.net (Robert Sanders)
Date: Thu, 29 May 1997 08:13:53 -0700 (MST)
Cc: current@FreeBSD.ORG
In-Reply-To: from "Robert Sanders" at May 29, 97 01:57:18 am
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-current@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

> > This is a workaround for the lockstatus panic.  A better fix will
> > probably have to wait until Peter is finished with poll(2).
>
> That seems to have helped with the lockstatus panic.  Now I have a
> different problem, detailed below.
>
> Fatal trap 12: page fault while in kernel mode

[ ... ]

> #5  0xf01a7d5d in ffs_sync (mp=0xf0702a00, waitfor=2, cred=0xf04f5a80,
>     p=0xf022ee60) at ../../ufs/ffs/ffs_vfsops.c:839
> #6  0xf013408f in sync (p=0xf022ee60, uap=0x0, retval=0x0)
>     at ../../kern/vfs_syscalls.c:480

[ ... ]

> Should this vnode exist without an underlying inode?  I know very
> little about the BSD kernel, so excuse the naive question.

Without locking (the patch semi-kills it), a sync can occur on a page
for which a mapping has been created but for which the fault has not
yet completed, or on a dirty page which has been written and discarded,
but for which the mapping has not yet been removed.
With locking, you can't get this, because the vnode's "type" would be
deadfs; but you get the other panic instead, because the interface to
vnodes is not completely reflexive (it's much like an inverse of the
cn_pnbuf situation in namei(), actually).

The patch is not correct; the problem is that the VXLOCK handling is an
inherently bogus way to do vnode reclamation, smearing responsibility
between the FS instances and the kernel.  It is bad layering, and needs
a rewrite (or needs someone's existing rewrite committed).

To fix things, vnode management should go to a per-FS-type vrele(), and
the vnode should be allocated as follows:

	struct in_core_inode_for_fs_type {
		struct vnode	i_vnode;	/* the real vnode */
		...
	};

(The per-FS vrele() is necessary because the vnode is now an
FS-specific opaque inode reference.)

To do this right would require getting rid of "struct fileops" (the
abomination before God that it is), and deleting deadfs entirely as
being utterly bogus.

> This may simply be superstition, but these panics always follow within
> an update period of me killing pppd to bring down a kernel PPP
> connection.  It does seem significant that v_type = VCHR and the
> problem seems to coincide with me closing a PPP session on a character
> device (/dev/cuaa2).  Unfortunately, without knowing what inode v_data
> used to point to, I suppose there's no way to trace the origin of this
> vnode.

Yes.  This is the evil of using a common vnode pool and allocating and
reusing the things, willy-nilly.


					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.