From owner-freebsd-fs@FreeBSD.ORG Sat Apr 21 08:29:00 2007
Message-ID: <4629CB46.4010102@bsdunix.ch>
Date: Sat, 21 Apr 2007 10:28:54 +0200
From: Thomas
User-Agent: Thunderbird 2.0.0.0 (Macintosh/20070326)
To: fs@freebsd.org
References: <462896C3.6040402@bsdunix.ch>
In-Reply-To: <462896C3.6040402@bsdunix.ch>
Content-Type: text/plain; charset=ISO-8859-15
Subject: Re: FS (gjournal?) releated crashes with current?
List-Id: Filesystems

Hello

I recompiled the kernel with WITNESS and all the other debug options. Now I see several errors:

login: Apr 21 01:51:53 lisa proftpd[680]: lisa.mlan.solnet.ch - error: unable to accept an incoming connection (Software caused connection abort)
fsync: giving up on dirty
0xc6944220: tag devfs, type VCHR
    usecount 1, writecount 0, refcount 1415 mountedhere 0xc68ee600
    flags ()
    v_object 0xc1036ac8 ref 0 pages 32986
    lock type devfs: EXCL (count 1) by thread 0xc65d5bd0 (pid 37)
    dev da0.journal
GEOM_JOURNAL: Cannot suspend file system /usr/local/data (error=35).
fsync: giving up on dirty
0xc6944220: tag devfs, type VCHR
    usecount 1, writecount 0, refcount 1139 mountedhere 0xc68ee600
    flags ()
    v_object 0xc1036ac8 ref 0 pages 32986
    lock type devfs: EXCL (count 1) by thread 0xc65d5bd0 (pid 37)
    dev da0.journal
GEOM_JOURNAL: Cannot suspend file system /usr/local/data (error=35).
fsync: giving up on dirty
0xc6944220: tag devfs, type VCHR
    usecount 1, writecount 0, refcount 1298 mountedhere 0xc68ee600
    flags ()
    v_object 0xc1036ac8 ref 0 pages 33098
    lock type devfs: EXCL (count 1) by thread 0xc65d5bd0 (pid 37)
    dev da0.journal

I created the FS with newfs -J /dev/da0.journal.

The RAID controller and smarttools don't show any problems on the disks
that hold /usr/local/data:

Ch01 Raid Set # 00 400.1GB WDC WD4000KS-00MNB0
Ch02 Raid Set # 00 400.1GB WDC WD4000YS-01MPB1
Ch03 Raid Set # 00 400.1GB WDC WD4000KS-00MNB0
Ch04 Raid Set # 00 400.1GB WDC WD4000YS-01MPB1
Ch05 Raid Set # 00 400.1GB WDC WD4000KS-00MNB0
Ch06 Raid Set # 00 400.1GB WDC WD4000KS-00MNB0
Ch07 Raid Set # 00 400.1GB WDC WD4000YS-01MPB1
Ch08 Raid Set # 00 400.1GB WDC WD4000KS-00MNB0

The WD4000YS-01MPB1 drives are brand new.

Does anyone have experience with this kind of issue?

Cheers,
Tom

Thomas wrote:
> Hello
>
> I triggered several crashes with 7-Current from 2007-04-19. The system
> mostly crashes if I'm syncing data with 4-5 parallel rsync processes.
>
> Most debug options are disabled in my kernel and malloc was compiled
> with MALLOC_PRODUCTION.
> I use GJournal (/dev/da0.journal) on a SATA
> RAID6 created with an Areca 1230 controller. The RAID status is fine.
>
> # mount
> /dev/ad4s1a on / (ufs, local)
> devfs on /dev (devfs, local)
> /dev/ad4s1g on /disk1 (ufs, local, soft-updates)
> /dev/ad4s1d on /tmp (ufs, local, soft-updates)
> /dev/ad4s1f on /usr (ufs, local, soft-updates)
> /dev/ad4s1e on /var (ufs, local, soft-updates)
> /dev/da0.journal on /usr/local/data (ufs, asynchronous, local, noatime,
> gjournal)
>
>
> After every crash /dev/da0.journal is marked as clean, but when I run a full
> fsck I get:
>
> # umount /usr/local/data
> # fsck -y /usr/local/data
> ** /dev/da0.journal
> ** Last Mounted on /usr/local/data
> ** Phase 1 - Check Blocks and Sizes
> PARTIALLY TRUNCATED INODE I=150149446
> SALVAGE? yes
>
> -4415861736689041919 BAD I=150149446
> 6180257590692086610 BAD I=150149446
> 7624567997605723585 BAD I=150149446
> 8268956604991674674 BAD I=150149446
> 2342221461849545187 BAD I=150149446
> -292497344028865874 BAD I=150149446
> -5568323556661920569 BAD I=150149446
> -7916380230741665943 BAD I=150149446
> 4170928977557909368 BAD I=150149446
> 4450577158601375817 BAD I=150149446
> 1180086702901020396 BAD I=150149446
> EXCESSIVE BAD BLKS I=150149446
> CONTINUE? yes
>
> INCORRECT BLOCK COUNT I=150149446 (1856 should be 736)
> CORRECT? yes
>
> PARTIALLY TRUNCATED INODE I=151138150
> SALVAGE? yes
> ....
> ....
> and many more.
>
>
>
> I have 2 core dumps:
> lisa# cat /var/crash/info.6
> Dump header from device /dev/ad4s1b
> Architecture: i386
> Architecture Version: 2
> Dump Length: 328253440B (313 MB)
> Blocksize: 512
> Dumptime: Fri Apr 20 08:51:51 2007
> Hostname: lisa.mlan.solnet.ch
> Magic: FreeBSD Kernel Dump
> Version String: FreeBSD 7.0-CURRENT #0: Thu Apr 19 09:14:51 UTC 2007
> root@lisa.mlan.solnet.ch:/usr/obj/usr/src/sys/UP7_SATA
> Panic String: ffs_valloc: dup alloc
> Dump Parity: 949538821
> Bounds: 6
> Dump Status: good
>
> lisa# kgdb kernel.debug /var/crash/vmcore.6
> kgdb: kvm_nlist(_stopped_cpus):
> kgdb: kvm_nlist(_stoppcbs):
> [GDB will not be able to debug user-mode threads:
> /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "i386-marcel-freebsd".
>
> Unread portion of the kernel message buffer:
> mode = 0100644, inum = 154219448, fs = /usr/local/data
> panic: ffs_valloc: dup alloc
> Uptime: 7h47m45s
> Physical memory: 3445 MB
> Dumping 313 MB: 298 282 266 250 234 218 202 186 170 154 138 122 106 90
> 74 58 42 26 10
>
> #0 doadump () at pcpu.h:172
> 172 __asm __volatile("movl %%fs:0,%0" : "=r" (td));
> (kgdb) backtrace
> #0 doadump () at pcpu.h:172
> #1 0xc0597df8 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
> #2 0xc0598088 in panic (fmt=0xc0773469 "ffs_valloc: dup alloc")
> at /usr/src/sys/kern/kern_shutdown.c:563
> #3 0xc06aa4f8 in ffs_valloc (pvp=0xc7a1daa0, mode=33152, cred=0xcbd66300,
> vpp=0xe9040888) at /usr/src/sys/ufs/ffs/ffs_alloc.c:966
> #4 0xc06d552f in ufs_makeinode (mode=33152, dvp=0xc7a1daa0,
> vpp=0xe9040b98,
> cnp=0xe9040bac) at /usr/src/sys/ufs/ufs/ufs_vnops.c:2238
> #5 0xc06d24e5 in ufs_create (ap=0x0) at
> /usr/src/sys/ufs/ufs/ufs_vnops.c:188
> #6 0xc0730294 in VOP_CREATE_APV (vop=0x0, a=0xe9040a1c) at vnode_if.c:206
> #7 0xc0603644 in vn_open_cred (ndp=0xe9040b84, flagp=0xe9040c84,
> cmode=384,
> cred=0xcbd66300, fdidx=0) at vnode_if.h:111
> #8 0xc060346a in vn_open (ndp=0x0, flagp=0xe9040c84, cmode=384, fdidx=6)
> at /usr/src/sys/kern/vfs_vnops.c:93
> #9 0xc05fdbc7 in kern_open (td=0xc91026c0, path=0x0,
> pathseg=UIO_USERSPACE,
> flags=2563, mode=384) at /usr/src/sys/kern/vfs_syscalls.c:987
> #10 0xc05fdb0c in open (td=0xc91026c0, uap=0x0)
> at /usr/src/sys/kern/vfs_syscalls.c:954
> #11 0xc07200e2 in syscall (frame=0xe9040d38)
> at /usr/src/sys/i386/i386/trap.c:1016
> #12 0xc0710440 in Xint0x80_syscall () at
> /usr/src/sys/i386/i386/exception.s:196
> #13 0x00000033 in ?? ()
> Previous frame inner to this frame (corrupt stack?)
>
>
> lisa# cat /var/crash/info.7
> Dump header from device /dev/ad4s1b
> Architecture: i386
> Architecture Version: 2
> Dump Length: 298967040B (285 MB)
> Blocksize: 512
> Dumptime: Fri Apr 20 09:55:56 2007
> Hostname: lisa.mlan.solnet.ch
> Magic: FreeBSD Kernel Dump
> Version String: FreeBSD 7.0-CURRENT #0: Thu Apr 19 09:14:51 UTC 2007
> root@lisa.mlan.solnet.ch:/usr/obj/usr/src/sys/UP7_SATA
> Panic String: sbdrop
> Dump Parity: 487915010
> Bounds: 7
> Dump Status: good
>
> lisa# kgdb kernel.debug /var/crash/vmcore.7
> kgdb: kvm_nlist(_stopped_cpus):
> kgdb: kvm_nlist(_stoppcbs):
> [GDB will not be able to debug user-mode threads:
> /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "i386-marcel-freebsd".
>
> Unread portion of the kernel message buffer:
> panic: sbdrop
> Uptime: 1h1m12s
> Physical memory: 3445 MB
> Dumping 285 MB: 270 254 238 222 206 190 174 158 142 126 110 94 78 62 46
> 30 14
>
> #0 doadump () at pcpu.h:172
> 172 __asm __volatile("movl %%fs:0,%0" : "=r" (td));
> (kgdb) backtrace
> #0 doadump () at pcpu.h:172
> #1 0xc0597df8 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
> #2 0xc0598088 in panic (fmt=0xc0768744 "sbdrop") at
> /usr/src/sys/kern/kern_shutdown.c:563
> #3 0xc05d8278 in sbdrop_internal (sb=0xc6ecf8ec, len=432) at
> /usr/src/sys/kern/uipc_sockbuf.c:846
> #4 0xc05d8442 in sbdrop_locked (sb=0xc6ecf8ec, len=492) at
> /usr/src/sys/kern/uipc_sockbuf.c:896
> #5 0xc0646828 in tcp_do_segment (m=0xc6a75100, th=0xc6a45824,
> so=0xc6ecf828, tp=0xccbd616c, drop_hdrlen=40, tlen=0)
> at /usr/src/sys/netinet/tcp_input.c:2191
> #6 0xc0645439 in tcp_input (m=0xc6a75100, off0=20) at
> /usr/src/sys/netinet/tcp_input.c:994
> #7 0xc063def1 in ip_input (m=0xc6a75100) at
> /usr/src/sys/netinet/ip_input.c:662
> #8 0xc06184b8 in netisr_dispatch (num=2, m=0x0) at
> /usr/src/sys/net/netisr.c:278
> #9 0xc06108f1 in ether_demux (ifp=0xc66bdc00, m=0xc6a75100) at
> /usr/src/sys/net/if_ethersubr.c:843
> #10 0xc0610763 in ether_input (ifp=0xc66bdc00, m=0xc6a75100) at
> /usr/src/sys/net/if_ethersubr.c:701
> #11 0xc04dc535 in bge_rxeof (sc=0xc66c8000) at
> /usr/src/sys/dev/bge/if_bge.c:2949
> #12 0xc04dca0c in bge_intr (xsc=0xc66c8000) at
> /usr/src/sys/dev/bge/if_bge.c:3127
> #13 0xc05819c6 in ithread_execute_handlers (p=0xc6682480, ie=0xc65ce600)
> at /usr/src/sys/kern/kern_intr.c:682
> #14 0xc0581ad8 in ithread_loop (arg=0xc66a9a50) at
> /usr/src/sys/kern/kern_intr.c:766
> #15 0xc05809aa in fork_exit (callout=0xc0581a84 ,
> arg=0xc66a9a50, frame=0xe6c6fd38)
> at /usr/src/sys/kern/kern_fork.c:814
> #16 0xc0710450 in fork_trampoline () at
> /usr/src/sys/i386/i386/exception.s:205
>
>
>
> System information:
>
> uname -a
> FreeBSD lisa.mlan.solnet.ch
> 7.0-CURRENT FreeBSD 7.0-CURRENT #0: Thu Apr
> 19 09:14:51 UTC 2007
> root@lisa.mlan.solnet.ch:/usr/obj/usr/src/sys/UP7_SATA i386
>
> dmesg:
> Copyright (c) 1992-2007 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 7.0-CURRENT #0: Thu Apr 19 09:14:51 UTC 2007
> root@lisa.mlan.solnet.ch:/usr/obj/usr/src/sys/UP7_SATA
> module_register: module g_journal already exists!
> Module g_journal failed to register: 17
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Intel(R) Pentium(R) 4 CPU 3.00GHz (3000.13-MHz 686-class CPU)
> Origin = "GenuineIntel" Id = 0xf62 Stepping = 2
>
> Features=0xbfebfbff
> Features2=0xe41d>
> AMD Features=0x20100000
> AMD Features2=0x1
> Logical CPUs per core: 2
> real memory = 3622305792 (3454 MB)
> avail memory = 3545190400 (3380 MB)
> kbd1 at kbdmux0
> cpu0 on motherboard
> pcib0: pcibus 0 on motherboard
> pir0: on motherboard
> pci0: on pcib0
> pcib1: irq 10 at device 1.0 on pci0
> pci1: on pcib1
> pcib2: at device 0.0 on pci1
> pci2: on pcib2
> arcmsr0: >> mem 0xdc500000-0xdc500fff,0xdc000000-0xdc3fffff irq 11 at device 14.0
> on pci2
> ARECA RAID ADAPTER0: Driver Version 1.20.00.14 2007-2-05
> ARECA RAID ADAPTER0: FIRMWARE VERSION V1.42 2006-10-13
> arcmsr0: [ITHREAD]
> pcib3: at device 0.2 on pci1
> pci3: on pcib3
> pcib4: irq 10 at device 28.0 on pci0
> pci4: on pcib4
> pcib5: at device 28.4 on pci0
> pci5: on pcib5
> pci5:0:0: bad VPD cksum, remain 14
> bge0:
> mem 0xdc600000-0xdc60ffff irq 10 at device 0.0 on pci5
> miibus0: on bge0
> brgphy0: PHY 1 on miibus0
> brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX,
> 1000baseTX-FDX, auto
> bge0: Ethernet address: 00:e0:81:5d:b8:7b
> bge0: [ITHREAD]
> pcib6: at device 28.5 on pci0
> pci6: on pcib6
> pci6:0:0: bad VPD cksum, remain 14
> bge1:
> mem 0xdc700000-0xdc70ffff irq 11 at
> device 0.0 on pci6
> miibus1: on bge1
> brgphy1: PHY 1 on miibus1
> brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX,
> 1000baseTX-FDX, auto
> bge1: Ethernet address: 00:e0:81:5d:b8:7c
> bge1: [ITHREAD]
> uhci0: port 0x3000-0x301f irq 5 at
> device 29.0 on pci0
> uhci0: [GIANT-LOCKED]
> uhci0: [ITHREAD]
> usb0: on uhci0
> usb0: USB revision 1.0
> uhub0: on usb0
> uhub0: 2 ports with 2 removable, self powered
> uhci1: port 0x3020-0x303f irq 10 at
> device 29.1 on pci0
> uhci1: [GIANT-LOCKED]
> uhci1: [ITHREAD]
> usb1: on uhci1
> usb1: USB revision 1.0
> uhub1: on usb1
> uhub1: 2 ports with 2 removable, self powered
> uhci2: port 0x3040-0x305f irq 11 at
> device 29.2 on pci0
> uhci2: [GIANT-LOCKED]
> uhci2: [ITHREAD]
> usb2: on uhci2
> usb2: USB revision 1.0
> uhub2: on usb2
> uhub2: 2 ports with 2 removable, self powered
> uhci3: port 0x3060-0x307f irq 10 at
> device 29.3 on pci0
> uhci3: [GIANT-LOCKED]
> uhci3: [ITHREAD]
> usb3: on uhci3
> usb3: USB revision 1.0
> uhub3: on usb3
> uhub3: 2 ports with 2 removable, self powered
> ehci0: mem
> 0xdca00000-0xdca003ff irq 5 at device 29.7 on pci0
> ehci0: [GIANT-LOCKED]
> ehci0: [ITHREAD]
> usb4: waiting for BIOS to give up control
> usb4: timed out waiting for BIOS
> usb4: EHCI version 1.0
> usb4: companion controllers, 2 ports each: usb0 usb1 usb2 usb3
> usb4: on ehci0
> usb4: USB revision 2.0
> uhub4: on usb4
> uhub4: 8 ports with 8 removable, self powered
> pcib7: at device 30.0 on pci0
> pci10: on pcib7
> vgapci0: port 0x4000-0x407f mem
> 0xd8000000-0xdbffffff,0xdc400000-0xdc43ffff at device 1.0 on pci10
> isab0: at device 31.0 on pci0
> isa0: on isab0
> atapci0: port
> 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x30a0-0x30af at device 31.1 on pci0
> ata0: on atapci0
> ata0: [ITHREAD]
> ata1: on atapci0
> ata1: [ITHREAD]
> atapci1: port
> 0x30e8-0x30ef,0x30dc-0x30df,0x30e0-0x30e7,0x30d8-0x30db,0x30b0-0x30bf
> mem 0xdca00400-0xdca007ff irq 10 at device 31.2 on pci0
> atapci1: [ITHREAD]
> ata2: on atapci1
> ata2: [ITHREAD]
> ata3: on atapci1
> ata3: [ITHREAD]
> pci0: at device 31.3 (no driver attached)
> pmtimer0 on isa0
> orm0: at iomem 0xc0000-0xc7fff,0xe0000-0xe17ff pnpid
> ORM0000 on isa0
> atkbdc0: at port 0x60,0x64 on isa0
> atkbd0: irq 1 on atkbdc0
> kbd0 at atkbd0
> atkbd0: [GIANT-LOCKED]
> atkbd0: [ITHREAD]
> fdc0: at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2
> on isa0
> fdc0: [FILTER]
> fd0: <1440-KB 3.5" drive> on fdc0 drive 0
> ppc0: at port 0x378-0x37f irq 7 on isa0
> ppc0: Generic chipset (EPP/NIBBLE) in COMPATIBLE mode
> ppbus0: on ppc0
> plip0: on ppbus0
> lpt0: on ppbus0
> lpt0: Interrupt-driven port
> ppi0: on ppbus0
> ppc0: [GIANT-LOCKED]
> ppc0: [ITHREAD]
> sc0: at flags 0x100 on isa0
> sc0: VGA <16 virtual consoles, flags=0x300>
> sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
> sio0: type 16550A, console
> sio0: [FILTER]
> sio1 at port 0x2f8-0x2ff irq 3 on isa0
> sio1: type 16550A
> sio1: [FILTER]
> vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> unknown: can't assign resources (memory)
> unknown: can't assign resources (port)
> unknown: can't assign resources (memory)
> unknown: can't assign resources (memory)
> unknown: can't assign resources (port)
> unknown: can't assign resources (port)
> unknown: can't assign resources (port)
> unknown: can't assign resources (port)
> Timecounter "TSC" frequency 3000130965 Hz quality 800
> Timecounters tick every 1.000 msec
> ipfw2 (+ipv6) initialized, divert enabled, rule-based forwarding
> enabled, default to accept, logging limited to 100 packets/entry by default
> Waiting 5 seconds for SCSI devices to settle
> The GEOM class JOURNAL is already loaded.
> acd0: CDROM at ata0-master PIO4
> ad4: 238475MB at ata2-master SATA150
> da0 at arcmsr0 bus 0 target 0 lun 0
> da0: Fixed Direct Access SCSI-5 device
> da0: 166.666MB/s transfers (83.333MHz DT, offset 32, 16bit)
> da0: 2097129MB (4294920192 512 byte sectors: 255H 63S/T 267346C)
> cd0 at ata0 bus 0 target 0 lun 0
> cd0: Removable CD-ROM SCSI-0 device
> cd0: 16.000MB/s transfers
> cd0: Attempt to query device size failed: NOT READY, Medium not present
> GEOM_JOURNAL: Journal 1974089085: da0 contains data.
> GEOM_JOURNAL: Journal 1974089085: da0 contains journal.
> GEOM_JOURNAL: Journal da0 clean.
> Trying to mount root from ufs:/dev/ad4s1a
> bge0: link state changed to UP
> bge1: link state changed to UP
>
> boot/loader.conf:
> geom_journal_load="YES"
> kern.dfldsiz="1G"
> kern.maxdsiz="1G"
>
> sysctl.conf:
> net.inet.ip.random_id=1
> net.inet.tcp.blackhole=1
> net.inet.udp.blackhole=1
> net.inet.icmp.drop_redirect=1
> net.inet.ip.fw.one_pass=0
> kern.maxfiles=65536
> kern.maxfilesperproc=32768
>
>
> More information needed?
>
> Cheers,
> Thomas
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"