Date: Sat, 28 Feb 1998 21:29:16 -0500 (EST)
From: Thomas David Rivers
Message-Id: <199803010229.VAA01614@lakes.dignus.com>
To: hackers@FreeBSD.ORG, julian@whistle.com
Subject: Re: The 'dave rivers' memorial panic.

> I've gone some considerable distance towards tracking down a crash
> that seems to resemble the problem dave's been seeing.

Julian - As you can tell - I'm not ready to be 'memorialized' just yet -
I'm just *way* behind in mail reading (I seem to float at about 6000
messages behind...)

Anyway - I did a quick scan of Subject: lines and found my own name!
Kinda surprising!

>
> Basically, a particular setup (running on several
> pieces of hardware) is incapable of doing a full news expire,
> We've managed to simplify the case down to a reproducible test, that
> involves copying the contents of one disk to a newly newfs'd
> partition on another.
>
>
> The exact symptom that we see is that bits in the cylinder group
> bitmap get set by "something" after the cg has been queued for write.
>
> adding a test confirms that everything is alright immediately before
> the write, but the next time the cg is accessed, there are some extra
> bits set. The changes are present on the disk.
> It's not hardware.. we've changed everything, but it's reproducible.
> with this particular setup..
>
> the more I write here the more it sounds like flaky hardware..
> <\hmmmmm> but the patterns seen on disk do not
> act like hardware..

Yep - tell me about it...

> it looks like a reallocation..

That's the path I went down for a long time; but I couldn't see it.
Also, when I added printf()s, of course; it didn't occur. I wondered
if something was getting reallocated because of a critical-region issue...

>
> some file or more likely, directory, is extended,
> and the cg summary info is never updated, though the
> bitmaps are..

Yes! My reproduction goes similarly - particularly the info
being never updated... (i.e. write some trash on the disk; do a newfs -
and - whoops; the trash is still there - the 0's were never written...)

>
>
> so the question is:
>
> does anyone know of any 'covert' paths where the cg structs
> (including bitmaps) are accessed other than in ffs_alloc.c?
> I'd love to be able to mark the pages concerned 'read-only' when I queue
> them for write. that'd catch the other writer,, :)
>
> anyone have any ideas on how I'd do that for a bdwrite(bp)?

It sure does sound like my problem!

It's been some time since you sent this. I didn't find any responses -
did you happen to get anywhere? Was this in 2.2.x or 3.0? The last
3.0 I tried this with (a snap from last summer) continued to demonstrate
the problem...

	- The not-quite-memorial Dave Rivers -
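
The check described in the quoted text (the bitmap is fine just before the write, but extra bits are set the next time the cg is accessed) can be sketched in plain user-space C. This is only an illustration, not code from ffs_alloc.c or from either poster: the function names bitmap_snapshot() and bitmap_new_bits() are hypothetical, and the bit numbering (bit i lives in byte i/8, least-significant bit first) is an assumption. The idea is to copy the bitmap when the buffer is queued, then list exactly which bits have appeared since.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: remember what the bitmap looked like at the
 * moment the buffer was queued for write. */
void bitmap_snapshot(uint8_t *snap, const uint8_t *live, size_t nbytes)
{
    assert(snap != NULL && live != NULL);
    memcpy(snap, live, nbytes);
}

/* Hypothetical helper: find bits that are set in `live` but were clear
 * in `snap`. Up to `max` bit indices are stored in `out`; the return
 * value is the total number of newly set bits (it may exceed `max`).
 * Bits that were cleared since the snapshot are not reported.
 * Assumed numbering: bit i is (byte[i / 8] >> (i % 8)) & 1. */
size_t bitmap_new_bits(const uint8_t *snap, const uint8_t *live,
                       size_t nbytes, size_t *out, size_t max)
{
    size_t found = 0;

    for (size_t i = 0; i < nbytes; i++) {
        /* Bits present now that were absent in the snapshot. */
        uint8_t added = (uint8_t)(live[i] & ~snap[i]);

        for (unsigned bit = 0; bit < 8; bit++) {
            if (added & (1u << bit)) {
                if (found < max)
                    out[found] = i * 8 + bit;
                found++;
            }
        }
    }
    return found;
}
```

Taking the snapshot where the buffer is queued and running the comparison at the next access would at least say which bits appeared in between, though not who set them; catching the writer in the act would still need something like the read-only page trick asked about above.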