Date:      Mon, 10 Jun 1996 18:09:24 -0700
From:      Matt Day <mday@sting.artisoft.com>
To:        freebsd-current@FreeBSD.ORG, taob@io.org
Subject:   Re: Kernel panic in fsync, 2.2-960501-SNAP
Message-ID:  <199606110109.SAA22935@sting.artisoft.com>

Brian Tao <taob@io.org> wrote:
>     Got this today (nm output follows):
>
> Fatal trap 12: page fault while in kernel mode
> fault virtual address   = 0x18
> fault code              = supervisor read, page not present
> instruction pointer     = 0x8:0xf012afc3
> stack pointer           = 0x10:0xefbfff2c
> frame pointer           = 0x10:0xefbfff58
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 6732 (emacs)
> interrupt mask          =
> panic: page fault
>
> [..]
>
> # nm -a /kernel | sort | fgrep -C f012af
> f012adc0 T _ftruncate
> f012aef0 T _otruncate
> f012af20 T _oftruncate
> f012af50 T _fsync
> f012b020 T _rename
> f012b31c T _mkdir

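It looks like your panic could very well have been caused by a bug
I reported several months ago.  Your instruction pointer (0xf012afc3)
falls inside _fsync (0xf012af50 to 0xf012b020), and a fault address
of 0x18 is what a field access through a NULL pointer would produce.
The bug has not been fixed yet in either the -CURRENT or the -STABLE
tree.  Here is my original bug report: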

> From mday Mon Feb  5 03:01:27 1996
> To: freebsd-bugs@freebsd.org, freebsd-hackers@freebsd.org
> Subject: Bad bug in ffs_sync() & friends
> 
> Hi,
> 
> I think there is a very rare, yet fatal, bug in ffs_sync() in the
> -CURRENT code (and the -STABLE code, and NetBSD 1.1, etc...).
> This bug has occurred twice on my system in the past 6 months.
> 
> Consider this scenario:
> ffs_vget() calls getnewvnode(), and then calls MALLOC() to allocate
> memory for the incore inode.  That MALLOC() blocks.
> While that MALLOC() is blocked, ffs_sync() gets called.  ffs_sync()
> finds the vnode just set up by that getnewvnode() on the mnt_vnodelist
> (because getnewvnode() put it there) and proceeds to dereference
> vp->v_data by calling VOP_ISLOCKED(), but v_data is still zero because
> that MALLOC() blocked.
> 
> It looks like this bug is lurking in many other routines as well --
> pretty much any routine that runs down the mnt_vnodelist.
> 
> What do you think?  Please e-mail me directly, as I do not subscribe to
> these mailing lists.
> 
> Thanks,
> 
> Matt Day <mday@artisoft.com>
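
To make the scenario in that report concrete, here is a minimal
user-space sketch of the ordering problem.  It is not kernel code:
the model_* names, the "sleep hook" that stands in for a blocking
MALLOC(), and the 64-byte placeholder for the inode are all
assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; NOT the real kernel structures. */
struct model_vnode {
	void *v_data;			/* NULL until the inode is attached */
	struct model_vnode *v_next;	/* link on the per-mount vnode list */
};

struct model_mount {
	struct model_vnode *mnt_vnodelist;
};

/* Called at the point where the real MALLOC() could sleep. */
typedef void (*model_sleep_hook)(struct model_mount *mp, void *arg);

/*
 * What a sync-style walker would run into: the number of vnodes on the
 * list whose v_data is still NULL.  The real ffs_sync() would dereference
 * v_data rather than count.
 */
static int
model_count_uninitialized(const struct model_mount *mp)
{
	const struct model_vnode *vp;
	int n = 0;

	assert(mp != NULL);
	for (vp = mp->mnt_vnodelist; vp != NULL; vp = vp->v_next)
		if (vp->v_data == NULL)
			n++;
	return (n);
}

/* Sleep hook playing the part of a sync firing; arg is an int counter. */
static void
model_sync_fires(struct model_mount *mp, void *arg)
{
	*(int *)arg += model_count_uninitialized(mp);
}

/* Unpatched ordering: put the vnode on the list, then allocate the inode. */
static struct model_vnode *
model_vget_publish_first(struct model_mount *mp, model_sleep_hook hook,
    void *arg)
{
	struct model_vnode *vp = calloc(1, sizeof(*vp));

	if (vp == NULL)
		return (NULL);
	vp->v_next = mp->mnt_vnodelist;	/* like getnewvnode(): now visible */
	mp->mnt_vnodelist = vp;
	if (hook != NULL)
		hook(mp, arg);		/* MALLOC() blocks; other code runs */
	vp->v_data = calloc(1, 64);	/* inode attached only now */
	return (vp);
}
```

Calling model_vget_publish_first(&m, model_sync_fires, &seen) on an
empty mount with seen starting at 0 leaves seen at 1: the walker ran
while the new vnode was on the list with a NULL v_data.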

Here is one possible bug fix to the -CURRENT FFS code (the same
bug exists in some of the other file systems as well):

*** sys/ufs/ffs/ffs_vfsops.c-	Sat Mar  2 20:43:40 1996
--- sys/ufs/ffs/ffs_vfsops.c	Mon Jun 10 17:49:30 1996
***************
*** 866,871 ****
--- 866,881 ----
  	}
  	ffs_inode_hash_lock = 1;
  
+ 	/*
+ 	 * N.B.: If this MALLOC() is performed after the getnewvnode()
+ 	 * it might block, leaving a vnode with a NULL v_data to be
+ 	 * found by ffs_sync() if a sync happens to fire right then,
+ 	 * which will cause a panic because ffs_sync() blindly
+ 	 * dereferences vp->v_data (as well it should).
+ 	 */
+ 	type = ump->um_devvp->v_tag == VT_MFS ? M_MFSNODE : M_FFSNODE; /* XXX */
+ 	MALLOC(ip, struct inode *, sizeof(struct inode), type, M_WAITOK);
+ 
  	/* Allocate a new vnode/inode. */
  	error = getnewvnode(VT_UFS, mp, ffs_vnodeop_p, &vp);
  	if (error) {
***************
*** 873,882 ****
  			wakeup(&ffs_inode_hash_lock);
  		ffs_inode_hash_lock = 0;
  		*vpp = NULL;
  		return (error);
  	}
- 	type = ump->um_devvp->v_tag == VT_MFS ? M_MFSNODE : M_FFSNODE; /* XXX */
- 	MALLOC(ip, struct inode *, sizeof(struct inode), type, M_WAITOK);
  	bzero((caddr_t)ip, sizeof(struct inode));
  	vp->v_data = ip;
  	ip->i_vnode = vp;
--- 883,891 ----
  			wakeup(&ffs_inode_hash_lock);
  		ffs_inode_hash_lock = 0;
  		*vpp = NULL;
+ 		FREE(ip, type);
  		return (error);
  	}
  	bzero((caddr_t)ip, sizeof(struct inode));
  	vp->v_data = ip;
  	ip->i_vnode = vp;

Another way to fix the bug would be to check for vp->v_data == NULL
in ffs_sync().  That approach is not very elegant, in my opinion,
since every routine that walks the list would need the same check.
I think a good, safe policy would be "if a vnode can be found on the
mnt_vnodelist by a process, the process can assume that the vnode is
fully initialized".
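
The two approaches can be compared in a small user-space sketch.
Again this is only a model under stated assumptions, not the kernel
code: the model_* names, the sleep hook standing in for a blocking
MALLOC(), and the 64-byte placeholder inode are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; NOT the real kernel structures. */
struct model_vnode {
	void *v_data;			/* NULL until the inode is attached */
	struct model_vnode *v_next;	/* link on the per-mount vnode list */
};

struct model_mount {
	struct model_vnode *mnt_vnodelist;
};

/* Called at the point where the real MALLOC() could sleep. */
typedef void (*model_sleep_hook)(struct model_mount *mp, void *arg);

/* Number of listed vnodes that a blind v_data dereference would trip on. */
static int
model_count_uninitialized(const struct model_mount *mp)
{
	const struct model_vnode *vp;
	int n = 0;

	assert(mp != NULL);
	for (vp = mp->mnt_vnodelist; vp != NULL; vp = vp->v_next)
		if (vp->v_data == NULL)
			n++;
	return (n);
}

/* Sleep hook playing the part of a sync firing; arg is an int counter. */
static void
model_sync_fires(struct model_mount *mp, void *arg)
{
	*(int *)arg += model_count_uninitialized(mp);
}

/* Patched ordering: allocate the inode first, then publish the vnode. */
static struct model_vnode *
model_vget_alloc_first(struct model_mount *mp, model_sleep_hook hook,
    void *arg)
{
	struct model_vnode *vp;
	void *ip;

	if (hook != NULL)
		hook(mp, arg);		/* MALLOC() may block; list unchanged */
	ip = calloc(1, 64);
	if (ip == NULL)
		return (NULL);
	vp = calloc(1, sizeof(*vp));
	if (vp == NULL) {
		free(ip);		/* mirrors the added FREE(ip, type) */
		return (NULL);
	}
	vp->v_next = mp->mnt_vnodelist;	/* publish ... */
	mp->mnt_vnodelist = vp;
	vp->v_data = ip;		/* ... and attach; no hook in between */
	return (vp);
}

/* Alternative: a walker that skips vnodes that are not yet set up. */
static int
model_sync_with_null_check(const struct model_mount *mp)
{
	const struct model_vnode *vp;
	int visited = 0;

	for (vp = mp->mnt_vnodelist; vp != NULL; vp = vp->v_next) {
		if (vp->v_data == NULL)
			continue;	/* would have to be repeated everywhere */
		visited++;
	}
	return (visited);
}
```

With the allocate-first ordering, a sync that fires during the
simulated sleep sees no half-built vnode, so the policy quoted above
holds without any check in the walker.  The NULL-check walker also
survives, but only because it carries the extra test itself.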

Hope that helps,

Matt Day <mday@artisoft.com>


