From owner-freebsd-hackers Sun Mar 2 09:50:43 1997
Return-Path:
Received: (from root@localhost) by freefall.freebsd.org (8.8.5/8.8.5) id JAA22986 for hackers-outgoing; Sun, 2 Mar 1997 09:50:43 -0800 (PST)
Received: from dg-rtp.dg.com (dg-rtp.rtp.dg.com [128.222.1.2]) by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id JAA22981 for ; Sun, 2 Mar 1997 09:50:39 -0800 (PST)
Received: by dg-rtp.dg.com (5.4R3.10/dg-rtp-v02) id AA07896; Sun, 2 Mar 1997 12:50:04 -0500
Received: from ponds by dg-rtp.dg.com.rtp.dg.com; Sun, 2 Mar 1997 12:50 EST
Received: from lakes.water.net (lakes [10.0.0.3]) by ponds.water.net (8.8.3/8.7.3) with ESMTP id IAA04344; Sun, 2 Mar 1997 08:12:04 -0500 (EST)
Received: (from rivers@localhost) by lakes.water.net (8.8.3/8.6.9) id IAA13157; Sun, 2 Mar 1997 08:17:26 -0500 (EST)
Date: Sun, 2 Mar 1997 08:17:26 -0500 (EST)
From: Thomas David Rivers
Message-Id: <199703021317.IAA13157@lakes.water.net>
To: ponds!root.com!dg, ponds!lakes.water.net!rivers, ponds!lambert.org!terry
Subject: Re: Another installment of the "dup alloc"/"bad dir" panic problems.
Cc: ponds!freebsd.org!freebsd-hackers
Content-Type: text
Sender: owner-hackers@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

>
> > Yes, I wish it was that, I'd love to be done with this :-)
> >
> > However, this particular reproduction of the dup-alloc problem
> > is with an AHA 1542B and Micropolis ~500meg drive...
> >
> > So, now the question I'm considering is "what could be some
> > timing dependent that it affects both IDE and SCSI drivers?"
>
> 1542B?
>
> How much RAM do you have?
>
> If you have more than 16M ... it's bouncing. Try backing down to
> 16M and not bouncing and see if that's where it is...
>
>
> Terry Lambert

Another good idea - but I only have 12 meg in this particular machine.

Also, you should recall that I am experiencing this problem on an 8-meg 386dx (intel 387) with an IDE drive... that kinda points to something "higher-level" than the physical device drivers...

Right now, I'm mulling over race conditions in disksort(). Something along the lines of:

	start to add buf to beginning of queue
	take an interrupt indicating previous I/O was complete
	remove partially added buf
	wow - lost buffer...

disksort() appears to be run at splbio() [it's not obvious from the SCSI code that's what's going on, but the wd.c code definitely does that.]

If the interrupt comes in at just the right time, it seems there is a potential to lose a buffer... which I think is what I'm seeing. [That would also explain why adding a printf() to disksort masked the problem.]

I'm going to play with this idea a while and see if I can verify it...

	- Dave Rivers -
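
To make the suspected window concrete, here is a small userland sketch of the sequence above. It is only a model under stated assumptions, not the actual disksort() or driver code: struct req, struct queue, completion_intr(), insert_head() and run_scenario() are all hypothetical names, the two-store insert order is assumed purely to open a window, and the "blocked" flag merely stands in for what splbio()/splx() would do if held across the whole insert.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userland model; none of these names are kernel APIs. */
struct req {
    int id;
    struct req *next;
};

struct queue {
    struct req *head;   /* pending requests, next to run first */
    int blocked;        /* stands in for splbio(): interrupts held off */
    int intr_pending;   /* an interrupt arrived while blocked */
    int started;        /* requests handed to the "device" */
};

/* Models the completion interrupt: dequeue and start the head request. */
static void completion_intr(struct queue *q)
{
    if (q->blocked) {           /* held off until the block is lifted */
        q->intr_pending = 1;
        return;
    }
    if (q->head != NULL) {
        struct req *r = q->head;
        q->head = r->next;      /* trusts r->next, even if not yet set */
        q->started++;
    }
}

/* Insert at the head in two stores, with an interrupt landing between them. */
static void insert_head(struct queue *q, struct req *r, int protect)
{
    struct req *old_head = q->head;

    assert(r != NULL);
    if (protect)
        q->blocked = 1;         /* like s = splbio() */
    q->head = r;                /* store 1: r is visible, r->next not set */
    completion_intr(q);         /* interrupt arrives inside the window */
    r->next = old_head;         /* store 2: too late if r was already removed */
    if (protect) {
        q->blocked = 0;         /* like splx(s) */
        if (q->intr_pending) {
            q->intr_pending = 0;
            completion_intr(q); /* deliver the deferred interrupt */
        }
    }
}

/* Count the requests still reachable from the queue head. */
static int queued(const struct queue *q)
{
    int n = 0;
    for (const struct req *r = q->head; r != NULL; r = r->next)
        n++;
    return n;
}

/* Returns how many of the two requests are still accounted for. */
static int run_scenario(int protect)
{
    struct req a = { 1, NULL }, b = { 2, NULL };
    struct queue q = { &a, 0, 0, 0 };

    insert_head(&q, &b, protect);
    return q.started + queued(&q);
}
```

In this model, run_scenario(0) leaves one of the two requests unaccounted for: the interrupt removes the half-linked request and the queue head ends up empty, so the older request is never reached. run_scenario(1) accounts for both, because the interrupt is deferred until the insert has finished. Whether the real code has a comparable unprotected window is exactly what still needs verifying.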