Date:      Wed, 9 Oct 1996 13:03:19 +0800 (WST)
From:      Peter Wemm <peter@haywire.dialix.com>
To:        FreeBSD-gnats-submit@freebsd.org
Subject:   kern/1744: run queue or proc list smashed 4 times in 2 days
Message-ID:  <199610090503.NAA02004@newton.dialix.com.au>
Resent-Message-ID: <199610090510.WAA06332@freefall.freebsd.org>


>Number:         1744
>Category:       kern
>Synopsis:       run queue or proc list smashed 4 times in 2 days
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Oct  8 22:10:01 PDT 1996
>Last-Modified:
>Originator:     Peter Wemm
>Organization:
What, here? :-)
>Release:        FreeBSD 2.2-961004-SNAP i386
>Environment:

Vanilla i486 box, 16M, 2 IDE drives and one slow SCSI drive on an AHA1542CF.
FreeBSD newton.dialix.com.au 2.2-961004-SNAP FreeBSD 2.2-961004-SNAP #30: Tue Oct  8 06:34:52 WST 1996     peter@newton.dialix.com.au:/home2/src/sys/compile/NEWTON  i386

>Description:

Normally, this is a quiet machine, but it's taken a nose-dive in stability
in the last two days.

It's been faulting like this:

WARNING: / was not properly dismounted.

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x4
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0xf01aa108
stack pointer           = 0x10:0xefbffe0c
frame pointer           = 0x10:0xefbffe30
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         = Idle
interrupt mask          = net tty bio
panic: page fault

Syncing disks...

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x10
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xf012925a
stack pointer           = 0x10:0xefbffc88
frame pointer           = 0x10:0xefbffc98
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         = Idle
interrupt mask          = net tty bio
panic: page fault

dumping to dev 20001, offset 32768
dump 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1

In this particular case, it died in cpu_switch about line 364:
        /* XX update whichqs? */
        btrl    %ebx,%edi                       /* clear q full status */
        leal    _qs(,%ebx,8),%eax               /* select q */
        movl    %eax,%esi

        movl    P_FORW(%eax),%ecx               /* unlink from front of process q */
        movl    P_FORW(%ecx),%edx
        movl    %edx,P_FORW(%eax)
        movl    P_BACK(%ecx),%eax
        movl    %eax,P_BACK(%edx)
        ^^^^^^^^^^^^^^^^^^^^^^^^^
        cmpl    P_FORW(%ecx),%esi               /* q empty */
        je      3f
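
For readers who prefer C, the unlink above can be sketched roughly as
follows. This is an illustration only, not the kernel source: struct qlink
and unlink_front() are hypothetical names, and the layout assumes the
forward and back links are the first two pointer-sized fields (so P_FORW
would be offset 0 and P_BACK offset 4 on i386). Under that assumption, a
NULL forward link in the first queued process would make the marked
instruction write to address 0x4, which would be consistent with the
"supervisor write" fault at 0x4 reported above. Unlike the assembly, the
sketch checks the links before touching the queue.

```c
#include <stddef.h>

/* Hypothetical stand-in for the two queue links at the start of struct proc. */
struct qlink {
    struct qlink *forw;   /* assumed to correspond to P_FORW */
    struct qlink *back;   /* assumed to correspond to P_BACK */
};

/*
 * Unlink the first entry from the queue headed by q.
 * Returns the removed entry, or NULL if a forward link is NULL.
 * The assembly has no such checks: with a NULL forward link it would
 * store through P_BACK(0) and take a page fault.
 */
static struct qlink *unlink_front(struct qlink *q)
{
    struct qlink *p = q->forw;        /* movl P_FORW(%eax),%ecx */
    if (p == NULL)
        return NULL;

    struct qlink *next = p->forw;     /* movl P_FORW(%ecx),%edx */
    if (next == NULL)
        return NULL;                  /* smashed queue: refuse instead of faulting */

    q->forw = next;                   /* movl %edx,P_FORW(%eax) */
    next->back = p->back;             /* movl %eax,P_BACK(%edx)  <- marked line */
    return p;
}
```

On a healthy circular queue the head's forward link ends up pointing at the
second entry and that entry's back link at the head; with a zeroed forward
link the sketch returns NULL where the real code would trap.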

The backtrace looks like this:
[.. rest of trap processing ..]
#13 0xf01a2ce1 in calltrap ()
#14 0xf010e6bd in tsleep ()
#15 0xf0120327 in sbwait ()
#16 0xf011f0e3 in soreceive ()
#17 0xf0121b90 in recvit ()
#18 0xf0121dff in recvfrom ()
#19 0xf01ab0d3 in syscall ()
#20 0xf01a2d35 in Xsyscall ()

The process that was running was one of these two:
  UID   PID  PPID CPU PRI NI   VSZ  RSS WCHAN  STAT  TT       TIME COMMAND
    1   176     4   0   2  0   208    0 sbwait SWs   ??    0:00.00  (rwhod)
    0 27386     4   1   2  0   148    0 sbwait Ss    ??    0:00.00  (comsat)

This particular kernel is not running any modified code.

The other three dumps were quite similar, but I didn't have the disk space
at the time to save them for analysis.

>How-To-Repeat:

I don't think this box is doing anything unusual, apart from cvsup which
makes it sweat a fair bit.  (a 6.5MB process on a 16M machine that's doing 
other things is hard work :-)

>Fix:
	
>Audit-Trail:
>Unformatted:


