Date:      Mon, 18 Oct 1999 07:26:35 -0700 (PDT)
From:      Matthew Jacob <mjacob@feral.com>
To:        Andrew Gallatin <gallatin@cs.duke.edu>, sos@freebsd.org
Cc:        alpha@freebsd.org
Subject:   Re: workaround for ata driver woes on alpha 
Message-ID:  <Pine.BSF.4.05.9910180702290.14549-100000@semuta.feral.com>
In-Reply-To: <14347.6330.820928.627692@grits.cs.duke.edu>

(I have not yet looked at the actual source...)

> I think there is a serious problem with the ad_timeout() function in
> the case where the request has actually completed & the timeout was
> too short.  ad_timeout() has no way to know if the request it has been
> passed is still valid, or has been deallocated.  Wrapping the function
> in splbio() will only narrow the race, not close it because we're
> still going to be at splsoftclock when the function is called.  I
> think setting the timeout to a reasonable value is a good workaround,
> but I'm still concerned about very slow hardware..

No. Lengthening the timeout may well be correct for getting the same
length of timeout on alpha as on i386, but it will *never* solve window
problems. It just makes them less frequent, which is, in fact, far more
dangerous to an OS than an outright panic. (Why? Think about it: if you
make a problem merely *rare* instead of making it really go away, you
curse the platform it occurs on with an aura of unreliability, so that
people are just too uneasy to depend on it.)

(goes off and looks at source....)

Yep. This is broken. The timeout can still run when a request has been
deallocated. This whole area of the code needs to be rewritten/rethought.
I wouldn't run it, even with the timeout extended, without that.

I would recommend hanging requests off the softc in a list (if there is
more than one per ata instance) or just as a pointer *which gets nulled if
untimeout is called*, so that splbio protection can provide mutual
exclusion between the callout and the IDE interrupt thread when either
looks through the currently active list. If ad_interrupt runs and calls
untimeout on a callout that is already active, the callout thread will
then find nothing to whine about (but only if the callout thread knows
where to look). You should note that this would still be problematic if
there ever were identical request block pointers. Also, IMO, using a
timeout per I/O request is a heavy load for a system unless you need the
precise accuracy. I prefer a general periodic timer per device instance,
with timeout counts for all active commands for that device.
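
A minimal userland sketch of that idea, for illustration only. It is not
the ata driver's code: every name in it (ad_softc_sketch,
ad_request_sketch, sketch_start, sketch_interrupt, sketch_tick) is
hypothetical, it assumes one outstanding request per device instance, and
the splbio()/splx() calls that would bracket each routine in the kernel
appear only as comments. The point it tries to show is that the timer
side never holds a request pointer of its own: it only looks in the
softc, so a request that has completed (and may have been freed) is
simply not there to be found.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request: carries a timeout count in timer ticks. */
struct ad_request_sketch {
    int id;
    int ticks_left;
};

/* Hypothetical per-device softc: the only place the active request is published. */
struct ad_softc_sketch {
    struct ad_request_sketch *active;   /* NULL when nothing is outstanding */
    int timeouts_seen;
};

/* Start a request: give it a tick budget and publish it in the softc. */
void
sketch_start(struct ad_softc_sketch *sc, struct ad_request_sketch *req, int ticks)
{
    assert(sc != NULL && req != NULL && ticks > 0);
    /* kernel: s = splbio(); */
    req->ticks_left = ticks;
    sc->active = req;
    /* kernel: splx(s); */
}

/*
 * Interrupt side: the request completed.  Clear the softc pointer before
 * the caller is allowed to free the request.  Returns the completed
 * request, or NULL if the timer side already claimed it.
 */
struct ad_request_sketch *
sketch_interrupt(struct ad_softc_sketch *sc)
{
    struct ad_request_sketch *req;

    /* kernel: s = splbio(); */
    req = sc->active;
    sc->active = NULL;
    /* kernel: splx(s); */
    return req;
}

/*
 * Timer side: one periodic tick per device instance.  Returns 1 if the
 * active request just ran out of ticks, 0 otherwise.
 */
int
sketch_tick(struct ad_softc_sketch *sc)
{
    struct ad_request_sketch *req;

    /* kernel: s = splbio(); */
    req = sc->active;
    if (req == NULL)
        return 0;               /* already completed: nothing to whine about */
    if (--req->ticks_left > 0)
        return 0;               /* still within its budget */
    sc->active = NULL;          /* claim it so the interrupt side cannot */
    sc->timeouts_seen++;
    /* kernel: splx(s); */
    return 1;
}
```

With several commands outstanding, the single pointer would become a list
walked by the same tick routine. Whichever side clears the pointer first
owns the request, so the other side sees NULL and does nothing.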

Soren, this is your stuff, isn't it? What have we misunderstood?

-matt








