From owner-freebsd-current  Tue Jan 11 18:34:56 2000
Delivered-To: freebsd-current@freebsd.org
Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2])
	by hub.freebsd.org (Postfix) with ESMTP id 43E9615467
	for <current@FreeBSD.ORG>; Tue, 11 Jan 2000 18:34:51 -0800 (PST)
	(envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost)
	by apollo.backplane.com (8.9.3/8.9.1) id SAA66268;
	Tue, 11 Jan 2000 18:34:50 -0800 (PST)
	(envelope-from dillon)
Date: Tue, 11 Jan 2000 18:34:50 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <200001120234.SAA66268@apollo.backplane.com>
To: asami@cs.berkeley.edu (Satoshi Asami)
Cc: current@FreeBSD.ORG
Subject: Re: crashes
References:  <200001120008.QAA89430@silvia.hip.berkeley.edu>
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


:Have people seen these?
:
:#0  0xc012fbd0 in boot ()
:(kgdb) bt
:#0  0xc012fbd0 in boot ()
:#1  0xc012ff54 in poweroff_wait ()
:#2  0xc01c75d5 in ffs_blkfree ()
:#3  0xc01cb626 in handle_workitem_freeblocks ()
:#4  0xc01c9bbf in softdep_process_worklist ()
:#5  0xc0156ff3 in sched_sync ()
:#6  0xc0204620 in fork_trampoline ()
:Cannot access memory at address 0x2000.
:(kgdb) quit
:===

    I have occasionally seen this failure when setting vfs.vmiodirenable
    to 1.  I have not seen it otherwise.

    Please upgrade your system to the latest ffs_softdep.c, vers 1.47.  This
    will fix a number of other problems but may not fix this one.

    If you can reproduce this bug reliably with the latest ffs_softdep.c
    then we may have a chance of finding it.

:#5  0xc01c9832 in acquire_lock ()
:#6  0xc01cc45e in softdep_disk_io_initiation ()
:#7  0xc0167173 in spec_strategy ()
:#8  0xc0166cad in spec_vnoperate ()
:#9  0xc01d813d in ufs_vnoperatespec ()
:#10 0xc01d7bbd in ufs_strategy ()
:#11 0xc01d810d in ufs_vnoperate ()
:#12 0xc014f958 in bwrite ()
:#13 0xc0154a0e in vop_stdbwrite ()
:#14 0xc0154869 in vop_defaultop ()
:#15 0xc01d810d in ufs_vnoperate ()
:#16 0xc0150772 in vfs_bio_awrite ()
:#17 0xc01d1921 in ffs_fsync ()
:#18 0xc01d03d0 in ffs_sync ()
:#19 0xc01595e3 in sync ()
:#20 0xc012f9a3 in boot ()
:#21 0xc012ff54 in poweroff_wait ()
:#22 0xc01ccba1 in handle_allocdirect_partdone ()
:#23 0xc01cc952 in softdep_disk_write_complete ()
:#24 0xc0151bca in biodone ()
:#25 0xc01ed8d2 in ad_interrupt ()
:#26 0xc01eadf6 in ataintr ()
:(kgdb) quit

    I don't know about this one.


:panicstr: softdep_lock: lock held by -2
:panic messages:
:---
:panic: handle_allocdirect_partdone: lost dep
:
:syncing disks... panic: softdep_lock: lock held by -2
:Uptime: 9d1h24m57s
:
:dumping to dev #wd/0x20001, offset 917632
:dump ata0: resetting devices .. done
:63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 
:---
:#0  0xc012fbd0 in boot ()
:(kgdb) bt
:#0  0xc012fbd0 in boot ()
:#1  0xc012ff54 in poweroff_wait ()
:#2  0xc01c9851 in acquire_lock ()
:#3  0xc01cc516 in initiate_write_filepage ()
:#4  0xc01cc3e3 in softdep_disk_io_initiation ()
:#5  0xc0167173 in spec_strategy ()
:#6  0xc0166cad in spec_vnoperate ()
:#7  0xc01d813d in ufs_vnoperatespec ()
:#8  0xc01d7bbd in ufs_strategy ()
:#9  0xc01d810d in ufs_vnoperate ()
:#10 0xc014f958 in bwrite ()
:#11 0xc0154a0e in vop_stdbwrite ()
:#12 0xc0154869 in vop_defaultop ()
:#13 0xc01d810d in ufs_vnoperate ()
:#14 0xc014fb46 in bawrite ()
:#15 0xc01d188f in ffs_fsync ()
:#16 0xc01d03d0 in ffs_sync ()
:#17 0xc01595e3 in sync ()
:#18 0xc012f9a3 in boot ()
:#19 0xc012ff54 in poweroff_wait ()
:#20 0xc01ccba1 in handle_allocdirect_partdone ()
:#21 0xc01cc952 in softdep_disk_write_complete ()
:#22 0xc0151bca in biodone ()
:#23 0xc01ed8d2 in ad_interrupt ()
:#24 0xc01eadf6 in ataintr ()

    This is new to me too.

    Any chance you can get a kernel core dump on these machines instead
    of just rebooting, and run a debug kernel on them?
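
    A minimal sketch of the configuration that request implies, assuming
    the dump goes to a swap partition at least as large as RAM.  The
    device name below is a placeholder, not taken from the report:

```sh
# /etc/rc.conf (fragment)
# Placeholder device: substitute the machine's actual swap partition.
dumpdev="/dev/wd0s1b"
```

    With dumpdev set, savecore(8) should copy the dump into /var/crash
    on the next boot.  For the debug kernel, building with "config -g"
    should keep the symbols that kgdb needs to show arguments and line
    numbers.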

:Both machines are of -current, Dec 26 vintage.  These machines died
:building -current packages.  (Gee, sounds like real heroes.)  These
:crashes usually result in fsck -p failing during subsequent reboot.
:
:Satoshi

    Yes, the fsck failure happens to me too.  If this is a Dec 26 kernel,
    though, you should definitely update it.  I seem to recall something
    related to fsck being fixed around that time frame, and a softupdates
    fix was committed afterwards for a bug that predates that time frame
    (separate from the current spate of softupdates issues).

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

