Date:      Wed, 15 Aug 2001 16:42:21 -0700 (PDT)
From:      John Baldwin <jhb@FreeBSD.org>
To:        Michael Lucas <mwlucas@blackhelicopters.org>
Cc:        current@FreeBSD.org, Poul-Henning Kamp <phk@FreeBSD.org>, Greg Lehey <grog@FreeBSD.org>
Subject:   Re: devfs and Vinum (was: any -current && vinum problems?)
Message-ID:  <XFMail.010815164221.jhb@FreeBSD.org>
In-Reply-To: <20010815181100.A48748@blackhelicopters.org>


On 15-Aug-01 Michael Lucas wrote:
> On Wed, Aug 15, 2001 at 10:21:39AM +0930, Greg Lehey wrote:
>> To help localize this problem, could you please try this same thing on
>> a kernel without devfs?  The dump you sent me did not look like a
>> Vinum bug, as I said in my reply.
> 
> Sorry, it happens on a non-devfs kernel as well.  Since it doesn't
> appear to be a Vinum bug, I'm taking the liberty of sending the whole
> thing to -current.  (I sent my first dump to Greg in particular, since
> a Vinum command triggered whatever this is.)
> 

> Script started on Wed Aug 15 17:57:48 2001
> magpire/var/crash;file /boot/kernel/vinum.ko 
> /boot/kernel/vinum.ko: ELF 32-bit LSB shared object, Intel 80386, version 1
> (FreeBSD), not stripped
> magpire/var/crash;file kernel.debug.nodevfs 
> kernel.debug.nodevfs: ELF 32-bit LSB executable, Intel 80386, version 1
> (FreeBSD), dynamically linked (uses shared libs), not stripped
> magpire/var/crash;gdb -k kernel.debug.nodevfs vmcore.3 
> GNU gdb 4.18
> Copyright 1998 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i386-unknown-freebsd"...
> IdlePTD 4284416
> initial pcb at 34b860
> panicstr: bremfree: bp 0xcc2a1ae4 not locked

Unfortunately, this is a secondary panic message from later on, during the
"syncing disks" stage, not the real panic. :(

>#15 0xc01f0783 in witness_destroy (lock=0xc1ec4e68) at ../../../kern/subr_witness.c:395

This is the real problem:

        mtx_lock(&all_mtx);
        lock_cur_cnt--;
        STAILQ_REMOVE(&all_locks, lock, lock_object, lo_list);
        lock->lo_flags &= ~LO_INITIALIZED;
        mtx_unlock(&all_mtx);

It panics in the STAILQ_REMOVE().  I've seen this a couple of times but have no
idea how that list pointer is getting corrupted.  My guess is that a mutex is
being destroyed twice or something similar; however, I'm not sure how.  The
LO_INITIALIZED flag and its checks are supposed to catch that case.  I suppose
there is a chance we could be preempted between the LO_INITIALIZED check and
the actual removal, have the lock freed in the meantime, and get in trouble
that way.  Hmm.  Try moving the mtx_lock() of all_mtx before the check for
LO_INITIALIZED and see if you get a different panic.  It may be a bug in the
ucred code.  (At least several other panics of this type have been the result
of crfree() calls.)
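
To make the suggested reordering concrete, here is a rough userland sketch of
the idea: take the list mutex first, then test LO_INITIALIZED, so the check
and the STAILQ_REMOVE() happen as one step.  This is not the actual
subr_witness.c code.  The pthread mutex stands in for all_mtx, the
sketch_init()/sketch_destroy() names and the LO_INITIALIZED value are made up
for illustration, and returning false on a repeated destroy is just a
convenient way to show the check firing.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

#define LO_INITIALIZED 0x1u	/* flag value assumed for this sketch */

/* Cut-down stand-in for the kernel's lock_object. */
struct lock_object {
	unsigned int lo_flags;
	STAILQ_ENTRY(lock_object) lo_list;
};

static STAILQ_HEAD(, lock_object) all_locks =
    STAILQ_HEAD_INITIALIZER(all_locks);
static pthread_mutex_t all_mtx = PTHREAD_MUTEX_INITIALIZER;
static int lock_cur_cnt;

/* Hypothetical helper: register a lock on the global list. */
static void
sketch_init(struct lock_object *lock)
{
	assert(lock != NULL);
	pthread_mutex_lock(&all_mtx);
	lock->lo_flags |= LO_INITIALIZED;
	STAILQ_INSERT_TAIL(&all_locks, lock, lo_list);
	lock_cur_cnt++;
	pthread_mutex_unlock(&all_mtx);
}

/*
 * Hypothetical helper: the flag check sits inside the locked region, so
 * nothing can slip in between the check and the list removal.  Returns
 * false when the lock was not (or is no longer) initialized.
 */
static bool
sketch_destroy(struct lock_object *lock)
{
	bool removed = false;

	assert(lock != NULL);
	pthread_mutex_lock(&all_mtx);
	if (lock->lo_flags & LO_INITIALIZED) {
		lock_cur_cnt--;
		STAILQ_REMOVE(&all_locks, lock, lock_object, lo_list);
		lock->lo_flags &= ~LO_INITIALIZED;
		removed = true;
	}
	pthread_mutex_unlock(&all_mtx);
	return (removed);
}
```

With this ordering a second sketch_destroy() on the same object sees the
cleared flag and leaves the list alone.  It does nothing for a lock whose
memory has already been freed and reused, so it only narrows the window.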

-- 

John Baldwin <jhb@FreeBSD.org> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message



