From owner-freebsd-hackers  Sun Mar  2 09:50:43 1997
Return-Path: <owner-hackers>
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.5/8.8.5) id JAA22986
          for hackers-outgoing; Sun, 2 Mar 1997 09:50:43 -0800 (PST)
Received: from dg-rtp.dg.com (dg-rtp.rtp.dg.com [128.222.1.2])
          by freefall.freebsd.org (8.8.5/8.8.5) with SMTP id JAA22981
          for <freebsd-hackers@freebsd.org>; Sun, 2 Mar 1997 09:50:39 -0800 (PST)
Received: by dg-rtp.dg.com (5.4R3.10/dg-rtp-v02)
	id AA07896; Sun, 2 Mar 1997 12:50:04 -0500
Received: from ponds by dg-rtp.dg.com.rtp.dg.com; Sun,  2 Mar 1997 12:50 EST
Received: from lakes.water.net (lakes [10.0.0.3]) by ponds.water.net (8.8.3/8.7.3) with ESMTP id IAA04344; Sun, 2 Mar 1997 08:12:04 -0500 (EST)
Received: (from rivers@localhost) by lakes.water.net (8.8.3/8.6.9) id IAA13157; Sun, 2 Mar 1997 08:17:26 -0500 (EST)
Date: Sun, 2 Mar 1997 08:17:26 -0500 (EST)
From: Thomas David Rivers <ponds!rivers@dg-rtp.dg.com>
Message-Id: <199703021317.IAA13157@lakes.water.net>
To: ponds!root.com!dg, ponds!lakes.water.net!rivers, ponds!lambert.org!terry
Subject: Re: Another installment of the "dup alloc"/"bad dir" panic problems.
Cc: ponds!freebsd.org!freebsd-hackers
Content-Type: text
Sender: owner-hackers@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

> 
> >  Yes, I wish it was that, I'd love to be done with this :-)
> > 
> >  However, this particular reproduction of the dup-alloc problem
> > is with an AHA 1542B and Micropolis ~500meg drive...
> > 
> >  So, now the question I'm considering is "what could be so
> > timing dependent that it affects both IDE and SCSI drivers?"
> 
> 1542B?
> 
> How much RAM do you have?
> 
> If you have more than 16M ... it's bouncing.  Try backing down to
> 16M and not bouncing and see if that's where it is...
> 
> 
> 					Terry Lambert

 Another good idea - but I only have 12 meg in this particular
machine.

 Also, you should recall that I am experiencing this problem on an
8-meg 386dx (intel 387) with an IDE drive... that kinda points to
something "higher-level" than the physical device drivers...

 Right now, I'm mulling over race conditions in disksort().  Something
along the lines of:

	start to add buf to beginning of queue
	take an interrupt indicating previous I/O was complete
	remove partially added buf
	wow - lost buffer...

 disksort() appears to be run at splbio() [it's not obvious from
the SCSI code that's what's going on, but the wd.c code definitely
does that.]  If the interrupt comes in at just the right time, it
seems there is a potential to lose a buffer... which I think is
what I'm seeing.  [That would also explain why adding a printf()
to disksort() masked the problem.]  I'm going to play with this idea
for a while and see if I can verify it...
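
 To make the suspected sequence concrete, here is a minimal user-space
sketch in C.  It is not the kernel's disksort() and not the wd.c or SCSI
code; struct buf, struct queue, fake_intr(), insert_head() and queue_len()
are hypothetical stand-ins, and the "interrupt" is just a function call
placed either in the middle of the insert (unprotected) or after it (as
if it had been held off by splbio()).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much simplified stand-ins for the kernel structures. */
struct buf {
    int id;
    struct buf *next;
};

struct queue {
    struct buf *head;
};

/* Simulated "I/O complete" handler: unlinks whatever is at the head. */
static void fake_intr(struct queue *q)
{
    if (q->head != NULL)
        q->head = q->head->next;
}

/*
 * Insert bp at the head using two separate stores.
 * intr_mid != 0: the fake interrupt fires between the two stores.
 * intr_mid == 0: it fires only after the insert has finished, which is
 *                what raising the priority level is meant to guarantee.
 */
static void insert_head(struct queue *q, struct buf *bp, int intr_mid)
{
    struct buf *old;

    assert(bp != NULL);
    old = q->head;
    bp->next = NULL;
    q->head = bp;        /* new buf is visible, but not yet linked */
    if (intr_mid)
        fake_intr(q);    /* handler removes the partially added buf */
    bp->next = old;      /* link is made, but the head may have moved */
    if (!intr_mid)
        fake_intr(q);    /* deferred interrupt sees a consistent list */
}

/* Count the buffers still reachable from the queue head. */
static int queue_len(const struct queue *q)
{
    int n = 0;

    for (const struct buf *bp = q->head; bp != NULL; bp = bp->next)
        n++;
    return n;
}
```

 Starting from a queue holding one buffer, insert_head() with intr_mid
set to 0 leaves that original buffer on the queue (the handler took
only the new head).  With intr_mid set to 1 the queue ends up empty and
the original buffer is no longer reachable from it, which is the "lost
buffer" case described above.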

	- Dave Rivers -