From owner-freebsd-stable@FreeBSD.ORG Sun Jul 7 12:13:57 2013
Date: Sun, 7 Jul 2013 14:13:54 +0200
From: Andre Albsmeier
To: Konstantin Belousov
Subject: Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found
Message-ID: <20130707121354.GA39055@bali>
References: <20130531122611.GA6607@bali> <201305311051.03157.jhb@freebsd.org>
 <20130616063942.GA72803@bali> <201306171530.31208.jhb@freebsd.org>
 <20130704051409.GA22021@bali> <20130704052440.GG91021@kib.kiev.ua>
 <20130704052659.GA23398@bali> <20130704061550.GI91021@kib.kiev.ua>
 <20130707072553.GA38133@bali> <20130707074112.GD91021@kib.kiev.ua>
In-Reply-To: <20130707074112.GD91021@kib.kiev.ua>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: "freebsd-stable@freebsd.org", John Baldwin
List-Id: Production branch of FreeBSD source code

On Sun, 07-Jul-2013 at 09:41:12 +0200, Konstantin Belousov wrote:

> On Sun, Jul 07, 2013 at 09:25:53AM +0200, Andre Albsmeier wrote:
> > OK, here we go (looks better now):
> >
> > GNU gdb 6.1.1 [FreeBSD]
> > Copyright 2004 Free Software Foundation, Inc.
> > GDB is free software, covered by the GNU General Public License, and you are
> > welcome to change it and/or distribute copies of it under certain conditions.
> > Type "show copying" to see the conditions.
> > There is absolutely no warranty for GDB. Type "show warranty" for details.
> > This GDB was configured as "i386-marcel-freebsd"...
> >
> > Unread portion of the kernel message buffer:
> > dev = stripe/p, block = 592, fs = /palveli
> > panic: ffs_blkfree_cg: freeing free block
> > KDB: stack backtrace:
> > db_trace_self_wrapper(c08207eb,d70fc924,c05fdfc9,c081df13,c08a82e0,...) at db_trace_self_wrapper+0x26/frame 0xd70fc8f4
> > kdb_backtrace(c081df13,c08a82e0,c0833a0b,d70fc930,d70fc930,...) at kdb_backtrace+0x29/frame 0xd70fc900
> > panic(c0833a0b,c2aae178,250,0,c2af80d4,...) at panic+0xc9/frame 0xd70fc924
> > ffs_blkfree_cg(250,0,8000,49f,d70fcad0,...) at ffs_blkfree_cg+0x399/frame 0xd70fc9c8
> > ffs_blkfree(c2b35100,c2af8000,c2b0d470,250,0,...) at ffs_blkfree+0xad/frame 0xd70fca00
> > indir_trunc(fffa3ff4,ffffffff,0,8000,0,...) at indir_trunc+0x658/frame 0xd70fcae0
> > indir_trunc(ffffdff3,ffffffff,c072df0a,c2d68d00,c087abd8,...) at indir_trunc+0x514/frame 0xd70fcbc0
> > handle_workitem_freeblocks(0,d70fcc4c,2,246,c2ab1000,...) at handle_workitem_freeblocks+0x2dc/frame 0xd70fcc24
> > process_worklist_item(0,0,0,c086ae78,0,...) at process_worklist_item+0x27a/frame 0xd70fcc6c
> > softdep_process_worklist(c2b36548,0,54,c0835825,64,...) at softdep_process_worklist+0x91/frame 0xd70fcc9c
> > softdep_flush(0,d70fcd08,0,c2aac2f0,0,...) at softdep_flush+0x3e4/frame 0xd70fcccc
> > fork_exit(c0738bb0,0,d70fcd08) at fork_exit+0xa2/frame 0xd70fccf4
> > fork_trampoline() at fork_trampoline+0x8/frame 0xd70fccf4
> > --- trap 0, eip = 0, esp = 0xd70fcd40, ebp = 0 ---
> > Uptime: 2d16h29m37s
> > Physical memory: 503 MB
> > Dumping 95 MB: 80 64 48 32 16
> >
> > No symbol "stopped_cpus" in current context.
> > No symbol "stoppcbs" in current context.
> > #0 doadump (textdump=1) at pcpu.h:249
> > 249 pcpu.h: No such file or directory.
> > in pcpu.h
> > (kgdb) where
> > #0 doadump (textdump=1) at pcpu.h:249
> > #1 0xc05fdddd in kern_reboot (howto=260) at /src/src-9/sys/kern/kern_shutdown.c:449
> > #2 0xc05fe028 in panic (fmt=) at /src/src-9/sys/kern/kern_shutdown.c:637
> > #3 0xc0717899 in ffs_blkfree_cg (ump=0xc2b35100, fs=0xc2af8000, devvp=0xc2b0d470, bno=592,
> >     size=32768, inum=1183, dephd=0xd70fcad0) at /src/src-9/sys/ufs/ffs/ffs_alloc.c:2151
> > #4 0xc0717c8d in ffs_blkfree (ump=0xc2b35100, fs=0xc2af8000, devvp=0xc2b0d470, bno=592,
> >     size=32768, inum=1183, vtype=VREG, dephd=0xd70fcad0) at /src/src-9/sys/ufs/ffs/ffs_alloc.c:2280
> > #5 0xc0730348 in indir_trunc (freework=0xc2f99100, dbn=1642816, lbn=-376844)
> >     at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7965
> > #6 0xc0730204 in indir_trunc (freework=0xc2f99100, dbn=1639680, lbn=-8205)
> >     at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7946
> > #7 0xc07324bc in handle_workitem_freeblocks (freeblks=0xc2fc1e00, flags=512)
> >     at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7588
> > #8 0xc0730dfa in process_worklist_item (mp=0xc2b36548, target=10, flags=512)
> >     at /src/src-9/sys/ufs/ffs/ffs_softdep.c:1774
> > #9 0xc07360c1 in softdep_process_worklist (mp=0xc2b36548, full=0)
> >     at /src/src-9/sys/ufs/ffs/ffs_softdep.c:1558
> > #10 0xc0738f94 in softdep_flush () at /src/src-9/sys/ufs/ffs/ffs_softdep.c:1414
> > #11 0xc05d1b82 in fork_exit (callout=0xc0738bb0 , arg=0x0, frame=0xd70fcd08)
> >     at /src/src-9/sys/kern/kern_fork.c:988
> > #12 0xc07ba904 in fork_trampoline () at /src/src-9/sys/i386/i386/exception.s:279
> > (kgdb) up 10
> > #10 0xc0738f94 in softdep_flush () at /src/src-9/sys/ufs/ffs/ffs_softdep.c:1414
> > 1414 progress += softdep_process_worklist(mp, 0);
> >
> > -Andre
>
> This looks unrelated, and this exact panic usually has one of two
> causes:
> - corrupted filesystem, run fsck to recheck it;

root@palveli:~>fsck /dev/stripe/p
** /dev/stripe/p
** Last Mounted on /palveli
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
9895 files, 2039706 used, 15697693 free (5397 frags, 1961537 blocks, 0.0% fragmentation)

***** FILE SYSTEM IS CLEAN *****

> - faulty hardware, most likely RAM, but might be CPU/CPU cache/bus.

Well, of course I cannot prove that this is not the case. But
the box runs flawlessly otherwise. RAM is ECC monitored, PSU
is OK and airflow is OK. Sure, I can't look inside the CPU etc.

>
> Is it the same machine where the bcopy panic occurred?

Yes. Let's see what it does over the next few days...

	-Andre
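
For anyone cross-checking the numbers above, here is a minimal sketch. It assumes that the KDB backtrace prints arguments in hexadecimal and that the values 250, 8000 and 49f in the ffs_blkfree_cg frame correspond to bno, size and inum in kgdb frame #3. It also assumes 8 fragments per block for the fsck summary, which is not stated anywhere in this thread. The function and variable names are illustrative only.

```python
# Hex values as printed in the KDB backtrace line for ffs_blkfree_cg
# (assumption: these are the bno, size and inum arguments).
ddb_args = {"bno": "250", "size": "8000", "inum": "49f"}

# Decimal values kgdb reports for the same call in frame #3.
kgdb_args = {"bno": 592, "size": 32768, "inum": 1183}


def ddb_matches_kgdb(ddb, kgdb):
    """Return True if every hex string in ddb equals the decimal value in kgdb."""
    return all(int(ddb[name], 16) == kgdb[name] for name in kgdb)


def free_frags(free_blocks, loose_frags, frags_per_block):
    """Total free fragments: whole free blocks plus loose free fragments."""
    return free_blocks * frags_per_block + loose_frags


if __name__ == "__main__":
    print("trace values agree:", ddb_matches_kgdb(ddb_args, kgdb_args))
    # fsck summary: "15697693 free (5397 frags, 1961537 blocks, ...)"
    # frags_per_block=8 is an assumption, not taken from the thread.
    print("free fragments:", free_frags(1961537, 5397, 8))
```

Under these assumptions both traces describe the same block (592, also the block number in the panic message), and the fsck free count is consistent with 8 fragments per 32768-byte block.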