Date:      Tue, 16 Dec 1997 10:51:44 -0500 (EST)
From:      Thomas David Rivers <rivers@dignus.com>
To:        ivt@gamma.ru, tlambert@primenet.com
Cc:        freebsd-hackers@FreeBSD.ORG
Subject:   Re: panic: blkfree: freeling free block/frag
Message-ID:  <199712161551.KAA13808@lakes.dignus.com>

"Igor Timkin" <ivt@gamma.ru> writes:
> 
> Terry Lambert writes:
> > > Every 4-8 days my news server (~10 full incoming feeds, ~50
> > > outgoing feeds) crash:
> > > panic: blkfree: freeing free block
> > > or
> > > panic: blkfree: freeing free frag
> > 
> > Long Answer:
> > 
> > Generally, this type of problem means you should rebuild the news spool,
> > since *any* corruption could result in invalid information on the disk
> > that could result in a panic.
> > 
> > Most likely, you crashed once, and you expected fsck to do something
> > that it can't do reliably: recover an async mounted partition.  The
> > partition was "recovered" and marked clean, but when you reference
> > a particular disk metadata construct, it goes off into the weeds
> > because the recovery was imperfect.
> 
> Unfortunately, I still have this problem.
> I ran newfs 5 days ago, but yesterday I got the same panic
> (uptime 5.2 days):


Yep - this looks _exactly_ like the "daily panic" problem I had
on a full news feed...

I wonder - did you happen to see John Dyson's recent post regarding
a small VM patch to 2.2.5?  It seems unrelated, but it could be a contributor
to the problem.  I haven't yet tried it on my stand-alone reproduction,
but it might be worthwhile for you to try.

A newfs, etc. isn't going to fix this.  For example, you can probably
recreate it rather quickly by scribbling all over the disk (i.e.
write 0xff all over the partition), doing a newfs - which should initialize
things to their proper values - and then doing an fsck, which finds that some
of the 0xff values incorrectly remain.  (That's my stand-alone reproduction.)
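
For anyone who wants to try the scribble-and-check part without sacrificing
a real partition, here is a minimal sketch in Python.  It assumes a scratch
image file; the helper names (fill_with_pattern, blocks_still_pattern) and
the 4096-byte block size are made up for illustration.  The newfs and fsck
steps are not shown and would still be run by hand between the two calls.
Note that this only reports where the 0xff pattern survives: data blocks that
newfs never writes will legitimately keep it, so fsck remains the real judge
of whether metadata was left uninitialized.

```python
import os
import tempfile

PATTERN = 0xFF  # the byte value scribbled over the scratch area


def fill_with_pattern(path, size, block_size=4096):
    """Overwrite `path` with `size` bytes of 0xff.  DESTRUCTIVE."""
    block = bytes([PATTERN]) * block_size
    with open(path, "wb") as f:
        remaining = size
        while remaining > 0:
            n = min(block_size, remaining)
            f.write(block[:n])
            remaining -= n


def blocks_still_pattern(path, block_size=4096):
    """Return the indices of blocks that consist entirely of 0xff."""
    full = bytes([PATTERN]) * block_size
    hits = []
    with open(path, "rb") as f:
        index = 0
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            # A short final chunk is compared against a pattern of equal length.
            if chunk == full[:len(chunk)]:
                hits.append(index)
            index += 1
    return hits


if __name__ == "__main__":
    # Demo on a temporary file standing in for the scratch partition.
    fd, image = tempfile.mkstemp()
    os.close(fd)
    fill_with_pattern(image, 8 * 4096)
    print(blocks_still_pattern(image))
    os.remove(image)
```

Comparing the surviving block numbers against the areas newfs is expected to
write (superblocks, cylinder group headers, inode tables) is left to the
reader.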

The "bug" is that somehow a buffer is being incorrectly reused, but
it seems to be _very_ timing dependent: when I put nice printf()s in the
kernel to try and track this down, the problem goes away, which makes it
extremely difficult to find...

I've demonstrated it on an aha1542, a 2748, and an IDE disk drive, and
with a 486, a 386, and a 386sx, so I don't believe it's hardware dependent
(although the two SCSI versions were with the same SCSI disk drive - the
IDE one, of course, was not).

Sorry I don't have good news to add - but I wanted to add my experiences
to help any curious readers...

Also - if anyone has time and feels like debugging this, I have
a serial-console setup on the machine that reproduces it.  I can
make accounts available to anyone to connect via the internet to this
setup and have a "whack" at it (i.e. log into the machine to which
the serial console is connected).

	- Dave Rivers -



> 
> ivt@news:/var/tmp/innfeed:2:306>gdb -k /sys/compile/NEWS/kernel /usr/local/news/crash/vmcore.1
> GDB is free software and you are welcome to distribute copies of it
>  under certain conditions; type "show copying" to see the conditions.
> There is absolutely no warranty for GDB; type "show warranty" for details.
> GDB 4.16 (i386-unknown-freebsd), 
> Copyright 1996 Free Software Foundation, Inc...
> IdlePTD 1f0000
> current pcb at 1d79c0
> panic: blkfree: freeing free block
> #0  boot (howto=256) at ../../kern/kern_shutdown.c:266
> 266                                     dumppcb.pcb_cr3 = rcr3();
> (kgdb) where
> #0  boot (howto=256) at ../../kern/kern_shutdown.c:266
> #1  0xe01105e2 in panic (fmt=0xe0188a85 "blkfree: freeing free block")
>     at ../../kern/kern_shutdown.c:390
> #2  0xe0188c57 in ffs_blkfree (ip=0xe3e50500, bno=10, size=4096)
>     at ../../ufs/ffs/ffs_alloc.c:1230
> #3  0xe018b09a in ffs_indirtrunc (ip=0xe3e50500, lbn=-12, dbn=394248, 
>     lastbn=-1, level=0, countp=0xdfbffd9c) at ../../ufs/ffs/ffs_inode.c:500
> #4  0xe018aac8 in ffs_truncate (ap=0xdfbffe74) at ../../ufs/ffs/ffs_inode.c:317
> #5  0xe018e6a5 in ufs_inactive (ap=0xdfbffea0) at vnode_if.h:1003
> #6  0xe012fb3f in vrele (vp=0xe3b12800) at vnode_if.h:699
> #7  0xe012fa33 in vput (vp=0xe3b12800) at ../../kern/vfs_subr.c:858
> #8  0xe0191e80 in ufs_remove (ap=0xdfbffef4) at ../../ufs/ufs/ufs_vnops.c:697
> #9  0xe0131d25 in unlink (p=0xe4dc6e00, uap=0xdfbfff94, retval=0xdfbfff84)
>     at vnode_if.h:459
> #10 0xe01ac1ff in syscall (frame={tf_es = 39, tf_ds = -541130713, tf_edi = 1, 
>       tf_esi = 28736, tf_ebp = -541074456, tf_isp = -541065244, 
>       tf_ebx = 28944, tf_edx = 0, tf_ecx = 41472, tf_eax = 10, tf_trapno = 7, 
>       tf_err = 7, tf_eip = 268950145, tf_cs = 31, tf_eflags = 582, 
>       tf_esp = -541074568, tf_ss = 39}) at ../../i386/i386/trap.c:890
> #11 0x1007da81 in ?? ()
> #12 0x2453 in ?? ()
> #13 0x2914 in ?? ()
> #14 0x1095 in ?? ()
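
For readers unfamiliar with the panic in frame #2 of the trace above: it is
a consistency check.  The file system keeps a map of free blocks, and
releasing a block that the map already shows as free means the on-disk or
in-core metadata is inconsistent.  The following is a toy model of that
check in Python, not the kernel code; the class and method names are
hypothetical, and the real ffs_blkfree() works on cylinder group bitmaps
rather than a Python list.

```python
class FreeMap:
    """Toy free-block map: one flag per block, True meaning 'free'."""

    def __init__(self, nblocks):
        self.free = [True] * nblocks

    def alloc(self, bno):
        # Hand out a block; it must currently be free.
        if not self.free[bno]:
            raise RuntimeError("alloc: block already in use")
        self.free[bno] = False

    def blkfree(self, bno):
        # Release a block.  If the map already says it is free, something
        # upstream (here: the caller) has a stale or corrupt view.
        if self.free[bno]:
            raise RuntimeError("blkfree: freeing free block")
        self.free[bno] = True


if __name__ == "__main__":
    fm = FreeMap(16)
    fm.alloc(10)
    fm.blkfree(10)      # fine: block 10 was in use
    try:
        fm.blkfree(10)  # second release of the same block trips the check
    except RuntimeError as err:
        print(err)
```

In the trace, the second release would correspond to the truncate path
handing ffs_blkfree() a block number (bno=10) that the free map already
lists as free.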



