From owner-freebsd-current Wed Aug 15 16:42:56 2001
Delivered-To: freebsd-current@freebsd.org
Received: from mail.wrs.com (unknown-1-11.windriver.com [147.11.1.11])
	by hub.freebsd.org (Postfix) with ESMTP id AD02637B406;
	Wed, 15 Aug 2001 16:42:49 -0700 (PDT)
	(envelope-from jhb@FreeBSD.org)
Received: from laptop.baldwin.cx (john@[147.11.46.201])
	by mail.wrs.com (8.9.3/8.9.1) with ESMTP id QAA13623;
	Wed, 15 Aug 2001 16:42:17 -0700 (PDT)
Message-ID:
X-Mailer: XFMail 1.4.0 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <20010815181100.A48748@blackhelicopters.org>
Date: Wed, 15 Aug 2001 16:42:21 -0700 (PDT)
From: John Baldwin
To: Michael Lucas
Subject: Re: devfs and Vinum (was: any -current && vinum problems?)
Cc: current@FreeBSD.org, Poul-Henning Kamp, Greg Lehey
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

On 15-Aug-01 Michael Lucas wrote:
> On Wed, Aug 15, 2001 at 10:21:39AM +0930, Greg Lehey wrote:
>> To help localize this problem, could you please try this same thing on
>> a kernel without devfs?  The dump you sent me did not look like a
>> Vinum bug, as I said in my reply.
>
> Sorry, it happens on a non-devfs kernel as well.  Since it doesn't
> appear to be a Vinum bug, I'm taking the liberty of sending the whole
> thing to -current.  (I sent my first dump to Greg in particular, since
> a Vinum command triggered whatever this is.)
>
> Script started on Wed Aug 15 17:57:48 2001
> magpire/var/crash;file /boot/kernel/vinum.ko
> /boot/kernel/vinum.ko: ELF 32-bit LSB shared object, Intel 80386, version 1
> (FreeBSD), not stripped
> magpire/var/crash;file kernel.debug.nodevfs
> kernel.debug.nodevfs: ELF 32-bit LSB executable, Intel 80386, version 1
> (FreeBSD), dynamically linked (uses shared libs), not stripped
> magpire/var/crash;gdb -k kernel.debug.nodevfs vmcore.3
> GNU gdb 4.18
> Copyright 1998 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i386-unknown-freebsd"...
> IdlePTD 4284416
> initial pcb at 34b860
> panicstr: bremfree: bp 0xcc2a1ae4 not locked

Unfortunately this is the panic message from later on, during the syncing
disks stage, not the real panic. :(

>#15 0xc01f0783 in witness_destroy (lock=0xc1ec4e68) at
>#../../../kern/subr_witness.c:395

This is the real problem:

        mtx_lock(&all_mtx);
        lock_cur_cnt--;
        STAILQ_REMOVE(&all_locks, lock, lock_object, lo_list);
        lock->lo_flags &= ~LO_INITIALIZED;
        mtx_unlock(&all_mtx);

It panics in the STAILQ_REMOVE().  I've seen this a couple of times but have
no idea how that list pointer is getting corrupted.  My guess is that a mutex
is being destroyed twice or something dumb like that; however, I'm not sure
how.  The LO_INITIALIZED flags and checks are supposed to catch that case.  I
suppose there is a chance we could preempt in between the LO_INITIALIZED check
and the actual removal and then free it and get in trouble that way.  Hmm.
Try moving the mtx_lock of &all_mtx before the check for LO_INITIALIZED and
see if you can get a different panic.

It may be a bug in the ucred stuff.  (At least several other panics of this
type have been the result of crfree's.)
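For illustration, the reordering suggested above can be sketched in userland C.  This is not the code in subr_witness.c: it is a minimal sketch that assumes a pthread mutex in place of the kernel's all_mtx, an arbitrary value for LO_INITIALIZED, and two hypothetical helpers, sketch_init() and sketch_destroy().  The only point it shows is that the flag test and the STAILQ_REMOVE() sit under the same lock, so a second destroy of the same object finds the flag already cleared and never touches the list.

```c
/* Userland sketch only; NOT the FreeBSD kernel's witness code. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <sys/queue.h>

#define LO_INITIALIZED 0x1	/* flag name from the mail; value is assumed */

struct lock_object {
	unsigned lo_flags;
	STAILQ_ENTRY(lock_object) lo_list;
};

/* Stand-ins for the kernel's global lock list, its mutex and its counter. */
static STAILQ_HEAD(, lock_object) all_locks =
    STAILQ_HEAD_INITIALIZER(all_locks);
static pthread_mutex_t all_mtx = PTHREAD_MUTEX_INITIALIZER;
static int lock_cur_cnt;

/* Hypothetical helper: put a lock on the global list and mark it live. */
static void
sketch_init(struct lock_object *lock)
{
	pthread_mutex_lock(&all_mtx);
	lock->lo_flags |= LO_INITIALIZED;
	STAILQ_INSERT_TAIL(&all_locks, lock, lo_list);
	lock_cur_cnt++;
	pthread_mutex_unlock(&all_mtx);
}

/*
 * Hypothetical helper: take all_mtx BEFORE testing LO_INITIALIZED, so the
 * check and the list removal cannot be separated by another destroyer.
 * Returns true if the lock was removed, false if it was already gone.
 */
static bool
sketch_destroy(struct lock_object *lock)
{
	bool removed = false;

	pthread_mutex_lock(&all_mtx);
	if (lock->lo_flags & LO_INITIALIZED) {
		lock_cur_cnt--;
		STAILQ_REMOVE(&all_locks, lock, lock_object, lo_list);
		lock->lo_flags &= ~LO_INITIALIZED;
		removed = true;
	}
	pthread_mutex_unlock(&all_mtx);
	return (removed);
}
```

Calling sketch_destroy() twice on the same object returns true and then false here.  If the flag were tested outside the lock, two callers could both pass the test and the second STAILQ_REMOVE() would search the list for an element that is no longer on it.  The sketch does nothing about the object being freed between the two calls; that would still need to be tracked down separately.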
-- 

John Baldwin -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/