Date:      Tue, 4 Mar 1997 16:02:00 +1030 (CST)
From:      Michael Smith <msmith@atrad.adelaide.edu.au>
To:        jlemon@americantv.com (Jonathan Lemon)
Cc:        msmith@atrad.adelaide.edu.au, proff@iq.org, hackers@FreeBSD.ORG
Subject:   Re: xemacs crashes kernel
Message-ID:  <199703040532.QAA10831@genesis.atrad.adelaide.edu.au>
In-Reply-To: <19970303230157.25741@right.PCS> from Jonathan Lemon at "Mar 3, 97 11:01:57 pm"

Jonathan Lemon stands accused of saying:
> On Mar 03, 1997 at 03:11:23PM +1030, Michael Smith wrote:
> > Jonathan Lemon stands accused of saying:
> > > On Mar 03, 1997 at 01:03:08PM +1100, Julian Assange wrote:
> > > > 
> > > > (1) telnet into machine
> > > > (2) start up xemacs in text mode
> > > > (3) suspend xemacs
> > > > (4) remote-disconnect telnet
> > > 
> > > Bleah.  Confirmed here, on a 2.2-GAMMA machine.  Doing this causes
> > > a "Trap 12, code 0 - page fault in kernel mode".
> > 
> > Can you give us the trap message and do the nm /kernel | less thing?
> 
> Panic dump (typed by hand):
> 
> 	Fatal trap 12: page fault while in kernel mode
> 	fault virtual address	= 0x18
> 	fault code		= supervisor read, page not present

A supervisor read faulting at 0x18 looks like a read through a null
structure pointer, at a member 0x18 bytes into the structure.

> 	instruction pointer	= 0x8:0xf013753b
> 	stack pointer		= 0x10:0x3fbfff18
> 	frame pointer		= 0x10:0x3fbfff44
> 	code segment		= base 0x0, limit 0xfffff, type 0x1b
> 				= DPL 0, pres 1, def32 1, gran 1
> 	processor eflags	= interrupt enabled, resume, IOPL = 0
> 	interrupt mask		= 
> 	kernel: type 12 trap, code 0
> 
> 	stopped at _fsync+0x73, testb $0x40, 0x18(%eax)
> 
> nm /kernel | grep f0137 | sort
> 
> 	f01374c8 T _fsync

OK, here is fsync():

int
fsync(p, uap, retval)
        struct proc *p;
        struct fsync_args *uap;
        int *retval;
{
        register struct vnode *vp;
        struct file *fp;
        int error;

        error = getvnode(p->p_fd, uap->fd, &fp);
        if (error)
                return (error);
        vp = (struct vnode *)fp->f_data;
        VOP_LOCK(vp);
        if (vp->v_object) {
                vm_object_page_clean(vp->v_object, 0, 0 ,0, FALSE);
        }
        error = VOP_FSYNC(vp, fp->f_cred,
                (vp->v_mount->mnt_flag & MNT_ASYNC) ? MNT_NOWAIT : MNT_WAIT, p);

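MNT_ASYNC is 0x40, and mnt_flag looks to be at about offset 0x18 in the
mount structure.  That matches both the faulting instruction
(testb $0x40, 0x18(%eax)) and the fault address of 0x18, so vp->v_mount
was null here.  Looks like maybe someone is trying to fsync something
that's not a file, although a quick test here doesn't bear that out.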
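
To make the offset arithmetic concrete, here is a minimal sketch.  The
structure below is a made-up stand-in (mock_mount, flag_address and the
padding are my assumptions, not the real struct mount layout); it only
shows why a null base pointer plus a member at 0x18 gives a fault
address of 0x18.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct mount.  This is NOT the real layout;
 * the padding is chosen only so that mnt_flag lands at offset 0x18, as
 * the disassembly suggests.
 */
struct mock_mount {
        uint8_t pad[0x18];      /* whatever precedes mnt_flag */
        int32_t mnt_flag;
};

/*
 * Address the CPU would touch when reading mp->mnt_flag, computed
 * arithmetically so nothing is actually dereferenced.
 */
static uintptr_t
flag_address(uintptr_t mount_base)
{
        return (mount_base + offsetof(struct mock_mount, mnt_flag));
}
```

With mount_base == 0 (a null v_mount), flag_address() yields 0x18, the
same value the trap reported as the fault virtual address.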

Are non-file items supposed to have valid v_mount pointers?  Other places
in the kernel that look at vp->v_mount often check it against zero first;
should that be done here, e.g.

	(vp->v_mount && (vp->v_mount->mnt_flag & MNT_ASYNC)) ? MNT_NOWAIT...

as well?  This looks like it might have been overlooked when the async
filesystem stuff came in, as old versions of this code read:

        error = VOP_FSYNC(vp, fp->f_cred, MNT_WAIT, p);

Suggestions?  Jonathan, can you try the v_mount check above and see if
it cures your problem?
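
For reference, the guarded test pulled out into a standalone userland
sketch.  Everything prefixed fake_/FAKE_ is a stand-in I made up so it
compiles outside the kernel; only the 0x40 value for MNT_ASYNC comes
from the discussion above, the other constants are placeholders.  This
is not a tested kernel patch, just the logic being proposed.

```c
#include <stddef.h>

/* Stand-in constants; only the 0x40 value is taken from the thread. */
#define FAKE_MNT_ASYNC  0x40
#define FAKE_MNT_WAIT   1       /* placeholder value */
#define FAKE_MNT_NOWAIT 2       /* placeholder value */

/* Minimal stand-ins for the kernel structures involved. */
struct fake_mount {
        int mnt_flag;
};

struct fake_vnode {
        struct fake_mount *v_mount;     /* may be null */
};

/*
 * Pick the wait mode to hand to VOP_FSYNC.  A vnode with no mount
 * falls back to a synchronous wait, which is what the older code
 * always did.
 */
static int
fsync_waitfor(const struct fake_vnode *vp)
{
        if (vp->v_mount != NULL &&
            (vp->v_mount->mnt_flag & FAKE_MNT_ASYNC))
                return (FAKE_MNT_NOWAIT);
        return (FAKE_MNT_WAIT);
}
```

A null v_mount then takes the FAKE_MNT_WAIT path instead of being
dereferenced.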

-- 
]] Mike Smith, Software Engineer        msmith@gsoft.com.au             [[
]] Genesis Software                     genesis@gsoft.com.au            [[
]] High-speed data acquisition and      (GSM mobile)     0411-222-496   [[
]] realtime instrument control.         (ph)          +61-8-8267-3493   [[
]] Unix hardware collector.             "Where are your PEZ?" The Tick  [[


