Date:      Mon, 7 Jul 2008 04:47:12 GMT
From:      Andrew Snow <andrew@modulus.org>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   kern/125356: Repeated panic in kqueue_close from kern_close
Message-ID:  <200807070447.m674lCVt008961@www.freebsd.org>
Resent-Message-ID: <200807070450.m674o0Nv067626@freefall.freebsd.org>


>Number:         125356
>Category:       kern
>Synopsis:       Repeated panic in kqueue_close from kern_close
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Jul 07 04:50:00 UTC 2008
>Closed-Date:
>Last-Modified:
>Originator:     Andrew Snow
>Release:        7.0-PRERELEASE
>Organization:
>Environment:
FreeBSD b1.octopus.com.au 7.0-PRERELEASE FreeBSD 7.0-PRERELEASE #1: Tue Mar  4 16:38:04 EST 2008     root@b5.octopus.com.au:/usr/obj/usr/src/sys/OCTO64  amd64

>Description:


Example #1:

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address   = 0x9a050
fault code              = supervisor write data, page not present
instruction pointer     = 0x8:0xffffffff8027f877
stack pointer           = 0x10:0xffffffffb4c0a9e0
frame pointer           = 0x10:0xffffff005c0ccda0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12949 (smtpd)
trap number             = 12
panic: page fault
cpuid = 2
Uptime: 6d11h8m19s
Physical memory: 8183 MB



(kgdb) bt
#0  doadump () at pcpu.h:194
#1  0x0000000000000004 in ?? ()
#2  0xffffffff8029e6ef in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#3  0xffffffff8029eb37 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:563
#4  0xffffffff803df54e in trap_fatal (frame=0xc, eva=Variable "eva" is not available.
) at /usr/src/sys/amd64/amd64/trap.c:724
#5  0xffffffff803df903 in trap_pfault (frame=0xffffffffb4c0a930, usermode=0)
    at /usr/src/sys/amd64/amd64/trap.c:641
#6  0xffffffff803e00a1 in trap (frame=0xffffffffb4c0a930) at /usr/src/sys/amd64/amd64/trap.c:410
#7  0xffffffff803c6fde in calltrap () at /usr/src/sys/amd64/amd64/exception.S:169
#8  0xffffffff8027f877 in kqueue_close (fp=0xffffff0114352000, td=0xffffff003bcb3680)
    at /usr/src/sys/kern/kern_event.c:1457
#9  0xffffffff802762cf in fdrop (fp=0xffffff0114352000, td=0xffffff003bcb3680) at file.h:297
#10 0xffffffff802775cd in closef (fp=0xffffff0114352000, td=0xffffff003bcb3680)
    at /usr/src/sys/kern/kern_descrip.c:1958
#11 0xffffffff80277d3e in kern_close (td=0xffffff003bcb3680, fd=Variable "fd" is not available.
) at /usr/src/sys/kern/kern_descrip.c:1054
#12 0xffffffff803dfaeb in syscall (frame=0xffffffffb4c0ac70) at /usr/src/sys/amd64/amd64/trap.c:852
#13 0xffffffff803c71eb in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:290
#14 0x0000000800bee16c in ?? ()
Previous frame inner to this frame (corrupt stack?)


The offending line:

#8  0xffffffff8027f877 in kqueue_close (fp=0xffffff0114352000, td=0xffffff003bcb3680)
    at /usr/src/sys/kern/kern_event.c:1457
1457                    while ((kn = SLIST_FIRST(&kq->kq_knlist[i])) != NULL) {
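
For readers unfamiliar with the idiom on that line, here is a minimal userland sketch of a "drain every list in an array" loop built on the SLIST macros from <sys/queue.h>. It is not the kernel code: struct note, struct queue, and the queue_* functions are hypothetical stand-ins for struct knote and the kq_knlist array, and the real kqueue_close also deals with locking and knote teardown, which is omitted here. The sketch only shows that each pass of the while loop dereferences the list head stored at index i, so the loop relies on the array pointer and every head in it being valid. The report itself does not establish which of these, if any, was bad.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical stand-in for the kernel's struct knote. */
struct note {
    int id;
    SLIST_ENTRY(note) link;
};

SLIST_HEAD(note_list, note);

/* Hypothetical stand-in for a kqueue holding an array of lists. */
struct queue {
    struct note_list *lists;   /* plays the role of kq_knlist */
    int nlists;
};

/* Allocate a queue with nlists empty lists. */
struct queue *queue_create(int nlists)
{
    struct queue *q = malloc(sizeof(*q));
    assert(q != NULL);
    q->lists = calloc((size_t)nlists, sizeof(*q->lists));
    assert(q->lists != NULL);
    q->nlists = nlists;
    for (int i = 0; i < nlists; i++)
        SLIST_INIT(&q->lists[i]);
    return q;
}

/* Push a new note onto the list at index slot. */
void queue_add(struct queue *q, int slot, int id)
{
    struct note *n = malloc(sizeof(*n));
    assert(n != NULL && slot >= 0 && slot < q->nlists);
    n->id = id;
    SLIST_INSERT_HEAD(&q->lists[slot], n, link);
}

/*
 * Drain every list: repeatedly take the first element until the list
 * is empty. Returns the number of notes freed.
 */
int queue_drain(struct queue *q)
{
    int freed = 0;
    for (int i = 0; i < q->nlists; i++) {
        struct note *n;
        while ((n = SLIST_FIRST(&q->lists[i])) != NULL) {
            SLIST_REMOVE_HEAD(&q->lists[i], link);
            free(n);
            freed++;
        }
    }
    return freed;
}

/* Release the list array and the queue itself. */
void queue_destroy(struct queue *q)
{
    free(q->lists);
    free(q);
}
```

Draining twice in a row is harmless in this sketch because the first pass leaves every head NULL, so the second pass frees nothing.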


The process that made the syscall was a postfix smtpd inside a jail.  It 
had been running fine for almost a week before this.



Example #2:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0xbb5050
fault code              = supervisor write data, page not present
instruction pointer     = 0x8:0xffffffff8027f877
stack pointer           = 0x10:0xffffffffb57048c0
frame pointer           = 0x10:0xffffff002575cda0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 40813 (pipe)
trap number             = 12
panic: page fault
cpuid = 1
Uptime: 18d23h44m57s
Physical memory: 8183 MB
Dumping 801 MB: 786 770 754 738 722 706 690 674 658 642 626 610 594 578 562 546 530 514 498 482 466 450 434 418 402 386 370 354 338 322 306 290 274 258 242 226 210 194 178 162 146 130 114 98 82 66 50 34 18 2

#0  doadump () at pcpu.h:194
194             __asm __volatile("movq %%gs:0,%0" : "=r" (td));
(kgdb) bt
#0  doadump () at pcpu.h:194
#1  0x0000000000000004 in ?? ()
#2  0xffffffff8029e6ef in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#3  0xffffffff8029eb37 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:563
#4  0xffffffff803df54e in trap_fatal (frame=0xc, eva=Variable "eva" is not available.
) at /usr/src/sys/amd64/amd64/trap.c:724
#5  0xffffffff803df903 in trap_pfault (frame=0xffffffffb5704810, usermode=0)
    at /usr/src/sys/amd64/amd64/trap.c:641
#6  0xffffffff803e00a1 in trap (frame=0xffffffffb5704810) at /usr/src/sys/amd64/amd64/trap.c:410
#7  0xffffffff803c6fde in calltrap () at /usr/src/sys/amd64/amd64/exception.S:169
#8  0xffffffff8027f877 in kqueue_close (fp=0xffffff01799280f0, td=0xffffff006d37d9c0)
    at /usr/src/sys/kern/kern_event.c:1457
#9  0xffffffff802762cf in fdrop (fp=0xffffff01799280f0, td=0xffffff006d37d9c0) at file.h:297
#10 0xffffffff802775cd in closef (fp=0xffffff01799280f0, td=0xffffff006d37d9c0)
    at /usr/src/sys/kern/kern_descrip.c:1958
#11 0xffffffff802785bd in fdfree (td=0xffffff006d37d9c0) at /usr/src/sys/kern/kern_descrip.c:1668
#12 0xffffffff80282c90 in exit1 (td=0xffffff006d37d9c0, rv=0) at /usr/src/sys/kern/kern_exit.c:276
#13 0xffffffff80283dd4 in sys_exit (td=Variable "td" is not available.
) at /usr/src/sys/kern/kern_exit.c:102
#14 0xffffffff803dfaeb in syscall (frame=0xffffffffb5704c70) at /usr/src/sys/amd64/amd64/trap.c:852
#15 0xffffffff803c71eb in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:290
#16 0x0000000800effa1c in ?? ()
Previous frame inner to this frame (corrupt stack?)


>How-To-Repeat:
I have not been able to work out what triggers the problem, but the server in question has now crashed at the same place several times.  It usually runs for 1-4 weeks between crashes.


>Fix:


>Release-Note:
>Audit-Trail:
>Unformatted:


