Date:      Mon, 9 Sep 2002 08:36:33 -0700 (PDT)
From:      Pawel Malachowski <pawmal@unia.3lo.lublin.pl>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   kern/42597: kernel panic, xl and bpf related
Message-ID:  <200209091536.g89FaXau093976@www.freebsd.org>


>Number:         42597
>Category:       kern
>Synopsis:       kernel panic, xl and bpf related
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Sep 09 08:40:02 PDT 2002
>Closed-Date:
>Last-Modified:
>Originator:     Pawel Malachowski
>Release:        4.6.2-RELEASE
>Organization:
ZiN
>Environment:
FreeBSD gargantua.zin.ask 4.6.2-RELEASE FreeBSD 4.6.2-RELEASE #0: Sat Sep  7 16:47:11 CEST 2002     root@gargantua.zin.ask:/usr/obj/usr/src/sys/PM-UX-AUTO  i386
>Description:
      This looks similar to kern/30952 and kern/31710, but I'm not sure.
My machine crashes from time to time (typical uptime is 1-4 days).
The machine is a 950MHz Celeron on an ASUS TUV4X board with one RealTek 8139C and two 3Com 905B NICs.
The kernel config is GENERIC with these additional options:
pseudo-device   ccd     4
device          apm
pseudo-device   speaker
pseudo-device   snp     3
options         IPFIREWALL
options         IPFIREWALL_VERBOSE
options         IPFIREWALL_FORWARD
options         IPFIREWALL_VERBOSE_LIMIT=100
options         IPFIREWALL_DEFAULT_TO_ACCEPT
options         IPFILTER
options         IPFILTER_LOG
options         IPFILTER_DEFAULT_BLOCK
options         QUOTA
options         DUMMYNET
options         SHMMAXPGS=65536
options         SEMMNI=40
options         SEMMNS=240
options         SEMUME=40
options         SEMMNU=120
options         IPX
options         NCP
options         NWFS
options         ETHER_8023
pseudo-device   tap
options         HZ=1000
All NMB* and other options are auto-sized; this machine has 256MB of memory.

panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0xa111351
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc02f5ccc
stack pointer           = 0x10:0xcdf7edec
frame pointer           = 0x10:0xcdf7edf8
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 14899 (trafshow)
interrupt mask          = net tty
trap number             = 12
panic: page fault

syncing disks... 4
done
Uptime: 1d18h3m58s

dumping to dev #ad/0x30001, offset 1573024
dump ata0: resetting devices .. done
// cut
---
#0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:487
487             if (dumping++) {
#0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:487
#1  0xc01fb2c3 in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:316
#2  0xc01fb6e8 in poweroff_wait (junk=0xc03eb16c, howto=-1069634417)
    at /usr/src/sys/kern/kern_shutdown.c:595
#3  0xc0379d92 in trap_fatal (frame=0xcdf7edac, eva=168891217)
    at /usr/src/sys/i386/i386/trap.c:966
#4  0xc0379a65 in trap_pfault (frame=0xcdf7edac, usermode=0, eva=168891217)
    at /usr/src/sys/i386/i386/trap.c:859
#5  0xc0379623 in trap (frame={tf_fs = -841351152, tf_es = 16, tf_ds = -839450608,
      tf_edi = -1058681600, tf_esi = 6687744, tf_ebp = -839389704,
      tf_isp = -839389736, tf_ebx = -1058681600, tf_edx = -1058799616,
      tf_ecx = 168891217, tf_eax = 599456, tf_trapno = 12, tf_err = 0,
      tf_eip = -1070637876, tf_cs = 8, tf_eflags = 66050, tf_esp = -1058205184,
      tf_ss = -1051000152}) at /usr/src/sys/i386/i386/trap.c:458
#6  0xc02f5ccc in xl_newbuf (sc=0xc15b0000, c=0xc15b02a8)
    at /usr/src/sys/pci/if_xl.c:1727
#7  0xc02f5e82 in xl_rxeof (sc=0xc15b0000) at /usr/src/sys/pci/if_xl.c:1826
#8  0xc02f65a4 in xl_intr (arg=0xc15b0000) at /usr/src/sys/pci/if_xl.c:2061
#9  0xc03845f9 in intr_mux (arg=0xc0e35160)
    at /usr/src/sys/i386/isa/intr_machdep.c:582
#10 0xc036c646 in vec11 ()
#11 0xc0201145 in softclock () at /usr/src/sys/kern/kern_timeout.c:131
#12 0xc036c553 in doreti_swi ()
#13 0xc036b135 in Xint0x80_syscall ()
#14 0x2809c51b in ?? ()
#15 0x280a90ab in ?? ()
#16 0x804a7c0 in ?? ()
#17 0x804ac73 in ?? ()
#18 0x804ba06 in ?? ()
#19 0x280796b9 in ?? ()
#20 0x2807932f in ?? ()
#21 0x8049aaf in ?? ()
#22 0x8049659 in ?? ()
(kgdb) up 6
#6  0xc02f5ccc in xl_newbuf (sc=0xc15b0000, c=0xc15b02a8)
    at /usr/src/sys/pci/if_xl.c:1727
1727            MCLGET(m_new, M_DONTWAIT);
(kgdb) list
1722
1723            MGETHDR(m_new, M_DONTWAIT, MT_DATA);
1724            if (m_new == NULL)
1725                    return(ENOBUFS);
1726
1727            MCLGET(m_new, M_DONTWAIT);
1728            if (!(m_new->m_flags & M_EXT)) {
1729                    m_freem(m_new);
1730                    return(ENOBUFS);
1731            }

I can provide more info from gdb upon request.

>How-To-Repeat:
      Put a sustained heavy load on a 3Com 905B NIC in a machine configured as described above.
>Fix:
      Don't know.
>Release-Note:
>Audit-Trail:
>Unformatted:
 >netstat -m -M vmcore.5 -N /usr/obj/usr/src/sys/PM-UX-AUTO/kernel.debug
 10/416/10048 mbufs in use (current/peak/max):
         10 mbufs allocated to data
 9/356/2512 mbuf clusters in use (current/peak/max)
 816 Kbytes allocated to network (10% of mb_map in use)
 0 requests for memory denied
 0 requests for memory delayed
 0 calls to protocol drain routines
 
 There are input errors (Ierrs) on xl0:
 >netstat -i -M vmcore.5 -N /usr/obj/usr/src/sys/PM-UX-AUTO/kernel.debug | grep Link#2
 xl0   1500  <Link#2>    00:a0:24:aa:1b:95 103961268   970 94181546     0     0
 This NIC was replaced with another one, but the Ierrors are still there. The NIC is connected to an NWay store-and-forward switch, which was also replaced with another store-and-forward one. Even if this is a cable fault, it should not be able to cause a kernel panic. ;)  It is easy to reach 9-10 MBytes/s in both directions on this interface even while the Ierrors counter is increasing.
 
 There were 5 kernel panics in the recent past; 4 of them were similar to this one and 1 was acquire_lock() related. Looking at those 4: the current process in the panic message always points to a trafd or trafshow process (note that both are bpf processes). Trafshow was recompiled with an increased #define MAX_PAGES; there are usually about 30-40 trafshow pages while the program is running. The instruction pointer in the panic message always points to xl_newbuf().
 The machine acts as a router. Most of the traffic goes from the xl0 interface to the rl0 interface. xl0 runs at 100Mbit full-duplex; rl0 is forced to 10Mbit full-duplex.




