Date:      Fri, 20 Apr 2007 12:32:35 +0200
From:      Thomas <freebsdlists@bsdunix.ch>
To:        fs@freebsd.org
Subject:   FS (gjournal?) related crashes with current?
Message-ID:  <462896C3.6040402@bsdunix.ch>

Hello

I have triggered several crashes with 7-CURRENT from 2007-04-19. The
system mostly crashes while I am syncing data with 4-5 parallel rsync
processes.

Most debug options are disabled in my kernel and malloc was compiled
with MALLOC_PRODUCTION. I use GJournal (/dev/da0.journal) on a SATA
RAID 6 volume created with an Areca ARC-1230 controller. The RAID
status is fine.

# mount
/dev/ad4s1a on / (ufs, local)
devfs on /dev (devfs, local)
/dev/ad4s1g on /disk1 (ufs, local, soft-updates)
/dev/ad4s1d on /tmp (ufs, local, soft-updates)
/dev/ad4s1f on /usr (ufs, local, soft-updates)
/dev/ad4s1e on /var (ufs, local, soft-updates)
/dev/da0.journal on /usr/local/data (ufs, asynchronous, local, noatime, gjournal)


After every crash /dev/da0.journal is marked as clean, but when I run a
full fsck I get:

# umount /usr/local/data
# fsck -y /usr/local/data
** /dev/da0.journal
** Last Mounted on /usr/local/data
** Phase 1 - Check Blocks and Sizes
PARTIALLY TRUNCATED INODE I=150149446
SALVAGE? yes

-4415861736689041919 BAD I=150149446
6180257590692086610 BAD I=150149446
7624567997605723585 BAD I=150149446
8268956604991674674 BAD I=150149446
2342221461849545187 BAD I=150149446
-292497344028865874 BAD I=150149446
-5568323556661920569 BAD I=150149446
-7916380230741665943 BAD I=150149446
4170928977557909368 BAD I=150149446
4450577158601375817 BAD I=150149446
1180086702901020396 BAD I=150149446
EXCESSIVE BAD BLKS I=150149446
CONTINUE? yes

INCORRECT BLOCK COUNT I=150149446 (1856 should be 736)
CORRECT? yes

PARTIALLY TRUNCATED INODE I=151138150
SALVAGE? yes
....
....
and many more.
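
As a quick sanity check on the fsck output above, the sketch below compares the reported BAD block numbers with the size of da0 taken from the dmesg further down (4294920192 512-byte sectors). It assumes that a filesystem fragment is never smaller than one 512-byte sector, so no valid block number can reach the sector count; the helper name is_plausible is made up for this illustration and is not part of fsck.

```python
# Total 512-byte sectors reported for da0 in the dmesg output of this message.
DA0_SECTORS = 4294920192

# Block numbers fsck reported as BAD for inode 150149446.
bad_blocks = [
    -4415861736689041919,
    6180257590692086610,
    7624567997605723585,
    8268956604991674674,
    2342221461849545187,
    -292497344028865874,
    -5568323556661920569,
    -7916380230741665943,
    4170928977557909368,
    4450577158601375817,
    1180086702901020396,
]


def is_plausible(block: int, limit: int = DA0_SECTORS) -> bool:
    """Assumption: a valid block number is non-negative and below the sector count."""
    return 0 <= block < limit


for block in bad_blocks:
    # Mask to 64 bits so negative values show their raw two's-complement pattern.
    raw = block & 0xFFFFFFFFFFFFFFFF
    print(f"{block:>21d}  0x{raw:016x}  plausible={is_plausible(block)}")
```

None of the listed values falls inside the device, which would be consistent with the block pointers of that inode having been overwritten with unrelated data rather than being slightly off.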



I have two core dumps:
lisa# cat /var/crash/info.6
Dump header from device /dev/ad4s1b
  Architecture: i386
  Architecture Version: 2
  Dump Length: 328253440B (313 MB)
  Blocksize: 512
  Dumptime: Fri Apr 20 08:51:51 2007
  Hostname: lisa.mlan.solnet.ch
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 7.0-CURRENT #0: Thu Apr 19 09:14:51 UTC 2007
    root@lisa.mlan.solnet.ch:/usr/obj/usr/src/sys/UP7_SATA
  Panic String: ffs_valloc: dup alloc
  Dump Parity: 949538821
  Bounds: 6
  Dump Status: good

lisa# kgdb kernel.debug /var/crash/vmcore.6
kgdb: kvm_nlist(_stopped_cpus):
kgdb: kvm_nlist(_stoppcbs):
[GDB will not be able to debug user-mode threads:
/usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:
mode = 0100644, inum = 154219448, fs = /usr/local/data
panic: ffs_valloc: dup alloc
Uptime: 7h47m45s
Physical memory: 3445 MB
Dumping 313 MB: 298 282 266 250 234 218 202 186 170 154 138 122 106 90
74 58 42 26 10

#0  doadump () at pcpu.h:172
172             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) backtrace
#0  doadump () at pcpu.h:172
#1  0xc0597df8 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#2  0xc0598088 in panic (fmt=0xc0773469 "ffs_valloc: dup alloc")
    at /usr/src/sys/kern/kern_shutdown.c:563
#3  0xc06aa4f8 in ffs_valloc (pvp=0xc7a1daa0, mode=33152, cred=0xcbd66300,
    vpp=0xe9040888) at /usr/src/sys/ufs/ffs/ffs_alloc.c:966
#4  0xc06d552f in ufs_makeinode (mode=33152, dvp=0xc7a1daa0,
vpp=0xe9040b98,
    cnp=0xe9040bac) at /usr/src/sys/ufs/ufs/ufs_vnops.c:2238
#5  0xc06d24e5 in ufs_create (ap=0x0) at
/usr/src/sys/ufs/ufs/ufs_vnops.c:188
#6  0xc0730294 in VOP_CREATE_APV (vop=0x0, a=0xe9040a1c) at vnode_if.c:206
#7  0xc0603644 in vn_open_cred (ndp=0xe9040b84, flagp=0xe9040c84,
cmode=384,
    cred=0xcbd66300, fdidx=0) at vnode_if.h:111
#8  0xc060346a in vn_open (ndp=0x0, flagp=0xe9040c84, cmode=384, fdidx=6)
    at /usr/src/sys/kern/vfs_vnops.c:93
#9  0xc05fdbc7 in kern_open (td=0xc91026c0, path=0x0,
pathseg=UIO_USERSPACE,
    flags=2563, mode=384) at /usr/src/sys/kern/vfs_syscalls.c:987
#10 0xc05fdb0c in open (td=0xc91026c0, uap=0x0)
    at /usr/src/sys/kern/vfs_syscalls.c:954
#11 0xc07200e2 in syscall (frame=0xe9040d38)
    at /usr/src/sys/i386/i386/trap.c:1016
#12 0xc0710440 in Xint0x80_syscall () at
/usr/src/sys/i386/i386/exception.s:196
#13 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
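
The mode values in this trace are printed in decimal, while the kernel message buffer prints the mode in octal. The following is a small sketch, using only Python's standard stat module, that decodes the values shown above; the function name describe_mode is chosen only for this example.

```python
import stat


def describe_mode(mode: int) -> str:
    """Render a file mode as octal plus the ls-style permission string."""
    return f"{oct(mode)} {stat.filemode(mode)}"


# mode=33152 as passed to ffs_valloc() and ufs_makeinode() in the backtrace.
requested = 33152
# mode = 0100644 as printed in the kernel message buffer before the panic.
logged = 0o100644
# cmode=384 as passed to vn_open() and kern_open().
cmode = 384

print("requested:", describe_mode(requested))
print("logged:   ", describe_mode(logged))
print("cmode:    ", oct(cmode))
```

Decoded this way, the requested mode is a regular file with permissions 0600 (matching cmode=384), while the mode in the message buffer is a regular file with permissions 0644, so the two values are not the same.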


lisa# cat /var/crash/info.7
Dump header from device /dev/ad4s1b
  Architecture: i386
  Architecture Version: 2
  Dump Length: 298967040B (285 MB)
  Blocksize: 512
  Dumptime: Fri Apr 20 09:55:56 2007
  Hostname: lisa.mlan.solnet.ch
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 7.0-CURRENT #0: Thu Apr 19 09:14:51 UTC 2007
    root@lisa.mlan.solnet.ch:/usr/obj/usr/src/sys/UP7_SATA
  Panic String: sbdrop
  Dump Parity: 487915010
  Bounds: 7
  Dump Status: good

lisa# kgdb kernel.debug /var/crash/vmcore.7
kgdb: kvm_nlist(_stopped_cpus):
kgdb: kvm_nlist(_stoppcbs):
[GDB will not be able to debug user-mode threads:
/usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:
panic: sbdrop
Uptime: 1h1m12s
Physical memory: 3445 MB
Dumping 285 MB: 270 254 238 222 206 190 174 158 142 126 110 94 78 62 46
30 14

#0  doadump () at pcpu.h:172
172             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) backtrace
#0  doadump () at pcpu.h:172
#1  0xc0597df8 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#2  0xc0598088 in panic (fmt=0xc0768744 "sbdrop") at
/usr/src/sys/kern/kern_shutdown.c:563
#3  0xc05d8278 in sbdrop_internal (sb=0xc6ecf8ec, len=432) at
/usr/src/sys/kern/uipc_sockbuf.c:846
#4  0xc05d8442 in sbdrop_locked (sb=0xc6ecf8ec, len=492) at
/usr/src/sys/kern/uipc_sockbuf.c:896
#5  0xc0646828 in tcp_do_segment (m=0xc6a75100, th=0xc6a45824,
so=0xc6ecf828, tp=0xccbd616c, drop_hdrlen=40, tlen=0)
    at /usr/src/sys/netinet/tcp_input.c:2191
#6  0xc0645439 in tcp_input (m=0xc6a75100, off0=20) at
/usr/src/sys/netinet/tcp_input.c:994
#7  0xc063def1 in ip_input (m=0xc6a75100) at
/usr/src/sys/netinet/ip_input.c:662
#8  0xc06184b8 in netisr_dispatch (num=2, m=0x0) at
/usr/src/sys/net/netisr.c:278
#9  0xc06108f1 in ether_demux (ifp=0xc66bdc00, m=0xc6a75100) at
/usr/src/sys/net/if_ethersubr.c:843
#10 0xc0610763 in ether_input (ifp=0xc66bdc00, m=0xc6a75100) at
/usr/src/sys/net/if_ethersubr.c:701
#11 0xc04dc535 in bge_rxeof (sc=0xc66c8000) at
/usr/src/sys/dev/bge/if_bge.c:2949
#12 0xc04dca0c in bge_intr (xsc=0xc66c8000) at
/usr/src/sys/dev/bge/if_bge.c:3127
#13 0xc05819c6 in ithread_execute_handlers (p=0xc6682480, ie=0xc65ce600)
at /usr/src/sys/kern/kern_intr.c:682
#14 0xc0581ad8 in ithread_loop (arg=0xc66a9a50) at
/usr/src/sys/kern/kern_intr.c:766
#15 0xc05809aa in fork_exit (callout=0xc0581a84 <ithread_loop>,
arg=0xc66a9a50, frame=0xe6c6fd38)
    at /usr/src/sys/kern/kern_fork.c:814
#16 0xc0710450 in fork_trampoline () at
/usr/src/sys/i386/i386/exception.s:205



System information:

uname -a
FreeBSD lisa.mlan.solnet.ch 7.0-CURRENT FreeBSD 7.0-CURRENT #0: Thu Apr
19 09:14:51 UTC 2007
root@lisa.mlan.solnet.ch:/usr/obj/usr/src/sys/UP7_SATA  i386

dmesg:
Copyright (c) 1992-2007 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.0-CURRENT #0: Thu Apr 19 09:14:51 UTC 2007
    root@lisa.mlan.solnet.ch:/usr/obj/usr/src/sys/UP7_SATA
module_register: module g_journal already exists!
Module g_journal failed to register: 17
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 3.00GHz (3000.13-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf62  Stepping = 2

Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0xe41d<SSE3,RSVD2,MON,DS_CPL,CNXT-ID,CX16,xTPR,<b15>>
  AMD Features=0x20100000<NX,LM>
  AMD Features2=0x1<LAHF>
  Logical CPUs per core: 2
real memory  = 3622305792 (3454 MB)
avail memory = 3545190400 (3380 MB)
kbd1 at kbdmux0
cpu0 on motherboard
pcib0: <Host to PCI bridge> pcibus 0 on motherboard
pir0: <PCI Interrupt Routing Table: 17 Entries> on motherboard
pci0: <PCI bus> on pcib0
pcib1: <PCIBIOS PCI-PCI bridge> irq 10 at device 1.0 on pci0
pci1: <PCI bus> on pcib1
pcib2: <PCI-PCI bridge> at device 0.0 on pci1
pci2: <PCI bus> on pcib2
arcmsr0: <Areca SATA Host Adapter RAID Controller (RAID6 capable)> mem
0xdc500000-0xdc500fff,0xdc000000-0xdc3fffff irq 11 at device 14.0 on pci2
ARECA RAID ADAPTER0: Driver Version 1.20.00.14 2007-2-05
ARECA RAID ADAPTER0: FIRMWARE VERSION V1.42 2006-10-13
arcmsr0: [ITHREAD]
pcib3: <PCI-PCI bridge> at device 0.2 on pci1
pci3: <PCI bus> on pcib3
pcib4: <PCIBIOS PCI-PCI bridge> irq 10 at device 28.0 on pci0
pci4: <PCI bus> on pcib4
pcib5: <PCIBIOS PCI-PCI bridge> at device 28.4 on pci0
pci5: <PCI bus> on pcib5
pci5:0:0: bad VPD cksum, remain 14
bge0: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x4101>
mem 0xdc600000-0xdc60ffff irq 10 at device 0.0 on pci5
miibus0: <MII bus> on bge0
brgphy0: <BCM5750 10/100/1000baseTX PHY> PHY 1 on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX,
1000baseTX-FDX, auto
bge0: Ethernet address: 00:e0:81:5d:b8:7b
bge0: [ITHREAD]
pcib6: <PCIBIOS PCI-PCI bridge> at device 28.5 on pci0
pci6: <PCI bus> on pcib6
pci6:0:0: bad VPD cksum, remain 14
bge1: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x4101>
mem 0xdc700000-0xdc70ffff irq 11 at device 0.0 on pci6
miibus1: <MII bus> on bge1
brgphy1: <BCM5750 10/100/1000baseTX PHY> PHY 1 on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX,
1000baseTX-FDX, auto
bge1: Ethernet address: 00:e0:81:5d:b8:7c
bge1: [ITHREAD]
uhci0: <UHCI (generic) USB controller> port 0x3000-0x301f irq 5 at
device 29.0 on pci0
uhci0: [GIANT-LOCKED]
uhci0: [ITHREAD]
usb0: <UHCI (generic) USB controller> on uhci0
usb0: USB revision 1.0
uhub0: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb0
uhub0: 2 ports with 2 removable, self powered
uhci1: <UHCI (generic) USB controller> port 0x3020-0x303f irq 10 at
device 29.1 on pci0
uhci1: [GIANT-LOCKED]
uhci1: [ITHREAD]
usb1: <UHCI (generic) USB controller> on uhci1
usb1: USB revision 1.0
uhub1: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb1
uhub1: 2 ports with 2 removable, self powered
uhci2: <UHCI (generic) USB controller> port 0x3040-0x305f irq 11 at
device 29.2 on pci0
uhci2: [GIANT-LOCKED]
uhci2: [ITHREAD]
usb2: <UHCI (generic) USB controller> on uhci2
usb2: USB revision 1.0
uhub2: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb2
uhub2: 2 ports with 2 removable, self powered
uhci3: <UHCI (generic) USB controller> port 0x3060-0x307f irq 10 at
device 29.3 on pci0
uhci3: [GIANT-LOCKED]
uhci3: [ITHREAD]
usb3: <UHCI (generic) USB controller> on uhci3
usb3: USB revision 1.0
uhub3: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb3
uhub3: 2 ports with 2 removable, self powered
ehci0: <Intel 82801GB/R (ICH7) USB 2.0 controller> mem
0xdca00000-0xdca003ff irq 5 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
ehci0: [ITHREAD]
usb4: waiting for BIOS to give up control
usb4: timed out waiting for BIOS
usb4: EHCI version 1.0
usb4: companion controllers, 2 ports each: usb0 usb1 usb2 usb3
usb4: <Intel 82801GB/R (ICH7) USB 2.0 controller> on ehci0
usb4: USB revision 2.0
uhub4: <Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1> on usb4
uhub4: 8 ports with 8 removable, self powered
pcib7: <PCIBIOS PCI-PCI bridge> at device 30.0 on pci0
pci10: <PCI bus> on pcib7
vgapci0: <VGA-compatible display> port 0x4000-0x407f mem
0xd8000000-0xdbffffff,0xdc400000-0xdc43ffff at device 1.0 on pci10
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH7 UDMA100 controller> port
0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x30a0-0x30af at device 31.1 on pci0
ata0: <ATA channel 0> on atapci0
ata0: [ITHREAD]
ata1: <ATA channel 1> on atapci0
ata1: [ITHREAD]
atapci1: <Intel ICH7 SATA300 controller> port
0x30e8-0x30ef,0x30dc-0x30df,0x30e0-0x30e7,0x30d8-0x30db,0x30b0-0x30bf
mem 0xdca00400-0xdca007ff irq 10 at device 31.2 on pci0
atapci1: [ITHREAD]
ata2: <ATA channel 0> on atapci1
ata2: [ITHREAD]
ata3: <ATA channel 1> on atapci1
ata3: [ITHREAD]
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xe0000-0xe17ff pnpid
ORM0000 on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
fdc0: <Enhanced floppy controller> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2
on isa0
fdc0: [FILTER]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (EPP/NIBBLE) in COMPATIBLE mode
ppbus0: <Parallel port bus> on ppc0
plip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
ppc0: [GIANT-LOCKED]
ppc0: [ITHREAD]
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A, console
sio0: [FILTER]
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
sio1: [FILTER]
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
unknown: <PNP0c01> can't assign resources (memory)
unknown: <PNP0303> can't assign resources (port)
unknown: <PNP0c02> can't assign resources (memory)
unknown: <INT0800> can't assign resources (memory)
unknown: <PNP0501> can't assign resources (port)
unknown: <PNP0700> can't assign resources (port)
unknown: <PNP0501> can't assign resources (port)
unknown: <PNP0400> can't assign resources (port)
Timecounter "TSC" frequency 3000130965 Hz quality 800
Timecounters tick every 1.000 msec
ipfw2 (+ipv6) initialized, divert enabled, rule-based forwarding
enabled, default to accept, logging limited to 100 packets/entry by default
Waiting 5 seconds for SCSI devices to settle
The GEOM class JOURNAL is already loaded.
acd0: CDROM <LG CD-ROM CRD-8522B/2.00> at ata0-master PIO4
ad4: 238475MB <Hitachi HDT725025VLA380 V5DOA52A> at ata2-master SATA150
da0 at arcmsr0 bus 0 target 0 lun 0
da0: <Areca ARC-1230-VOL#00 R001> Fixed Direct Access SCSI-5 device
da0: 166.666MB/s transfers (83.333MHz DT, offset 32, 16bit)
da0: 2097129MB (4294920192 512 byte sectors: 255H 63S/T 267346C)
cd0 at ata0 bus 0 target 0 lun 0
cd0: <LG CD-ROM CRD-8522B 2.00> Removable CD-ROM SCSI-0 device
cd0: 16.000MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present
GEOM_JOURNAL: Journal 1974089085: da0 contains data.
GEOM_JOURNAL: Journal 1974089085: da0 contains journal.
GEOM_JOURNAL: Journal da0 clean.
Trying to mount root from ufs:/dev/ad4s1a
bge0: link state changed to UP
bge1: link state changed to UP

/boot/loader.conf:
geom_journal_load="YES"
kern.dfldsiz="1G"
kern.maxdsiz="1G"

/etc/sysctl.conf:
net.inet.ip.random_id=1
net.inet.tcp.blackhole=1
net.inet.udp.blackhole=1
net.inet.icmp.drop_redirect=1
net.inet.ip.fw.one_pass=0
kern.maxfiles=65536
kern.maxfilesperproc=32768


Is more information needed?

Cheers,
Thomas



