Date:      Fri, 05 Jun 1998 15:59:19 -0400 (EDT)
From:      Simon Shapiro <shimon@simon-shapiro.org>
To:        Bob Willcox <bob@luke.pmr.com>
Cc:        Stefan Esser <se@FreeBSD.ORG>, Greg Lehey <grog@lemis.com>, Mike Smith <mike@smith.net.au>, Karl Pielorz <kpielorz@tdx.co.uk>, tcobb <tcobb@staff.circle.net>, "freebsd-current@freebsd.org" <freebsd-current@FreeBSD.ORG>, Michael Hancock <michaelh@cet.co.jp>
Subject:   Re: DPT driver fails and panics with Degraded Array
Message-ID:  <XFMail.980605155919.shimon@simon-shapiro.org>
In-Reply-To: <19980605085459.A9510@pmr.com>


On 05-Jun-98 Bob Willcox wrote:
> On Fri, Jun 05, 1998 at 12:01:01AM +0200, Stefan Esser wrote:
>> On 1998-06-04 12:12 -0400, Simon Shapiro <shimon@simon-shapiro.org>
>> wrote:
>> > Many of these problems are actually (arguably?) induced by timing
>> > problems on the PCI bus.  Certain PCI-PCI bridges (or even motherboard
>> > ``main'' chipsets) will deliver interrupts, I/O bus transactions and
>> > memory transactions out of order when hammered very rapidly, under
>> > heavy load, or both.  We proved it clearly with certain ``industrial''
>> > computers, and certain motherboards, by making the symptoms go away
>> > (or drastically change) as you move the DPT, video cards, Ethernet
>> > cards, etc. from slot to slot.
>> 
>> This is a design "feature" of PCI, actually, and well documented.
>> 
>> The interrupt lines are directly connected to the chip-set (or 
>> possibly the CPU in non-Intel PCI systems) and for that reason,
>> there may for example be as many outstanding memory writes in 
>> write-buffers as their FIFO depths allow, when the end-of-transfer
>> interrupt is recognized by the CPU.
>> 
>> There is a documented protocol to flush all write buffers: Just
>> read some device register at the start of the interrupt handler
>> (i.e. before trying to access common data structures in memory,
>> that are used for communication between CPU and an intelligent 
>> device). The read will be blocked until all buffers are flushed.
> 
> Hmm, well my AIX device driver did this.  The first thing it did was to
> read the HA_AUX_STATUS register on the adapter to see if an interrupt
> was pending for it (a pretty common thing to do I think).  Note that I
> never saw the problem running on my (Motorola) UP test system.  Bull saw
> it quite regularly on their MP systems, though.

So does the DPT driver for FreeBSD.  It is in the publicly available source
code :-)
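
The read-first protocol Stefan describes can be illustrated with a small,
self-contained simulation.  This is only a sketch under stated assumptions,
not code from the DPT driver: the sim_* names, the FIFO depth, and the
SIM_IRQ_PENDING bit are all hypothetical, and the aux_status field merely
stands in for the adapter's HA_AUX_STATUS register.  The model assumes, as
described above, that a register read cannot complete until the writes
posted ahead of it have reached memory.

```c
#include <stddef.h>
#include <stdint.h>

#define SIM_FIFO_DEPTH  4      /* hypothetical write-buffer depth */
#define SIM_IRQ_PENDING 0x01   /* hypothetical bit, not the real register layout */

/* One device-to-memory write still sitting in a bridge write buffer. */
struct sim_posted_write {
    uint32_t *dst;
    uint32_t  val;
};

struct sim_bridge {
    struct sim_posted_write fifo[SIM_FIFO_DEPTH];
    size_t  count;
    uint8_t aux_status;        /* stands in for HA_AUX_STATUS */
};

/* The device writes toward host memory; the data is buffered, not yet visible. */
static int sim_post_write(struct sim_bridge *b, uint32_t *dst, uint32_t val)
{
    if (b->count == SIM_FIFO_DEPTH)
        return -1;             /* buffer full */
    b->fifo[b->count].dst = dst;
    b->fifo[b->count].val = val;
    b->count++;
    return 0;
}

/* Drain every buffered write to memory, in order. */
static void sim_flush(struct sim_bridge *b)
{
    for (size_t i = 0; i < b->count; i++)
        *b->fifo[i].dst = b->fifo[i].val;
    b->count = 0;
}

/* A register read is held off until all posted writes have been flushed. */
static uint8_t sim_read_aux_status(struct sim_bridge *b)
{
    sim_flush(b);
    return b->aux_status;
}

/*
 * Interrupt handler: read the device register FIRST, and only then look at
 * the memory shared with the adapter.  Returns 1 if the interrupt was ours.
 */
static int sim_intr(struct sim_bridge *b, const uint32_t *status_packet,
                    uint32_t *out)
{
    uint8_t aux = sim_read_aux_status(b);

    if ((aux & SIM_IRQ_PENDING) == 0)
        return 0;              /* not our interrupt */
    *out = *status_packet;     /* safe: buffers were drained by the read */
    b->aux_status &= (uint8_t)~SIM_IRQ_PENDING;
    return 1;
}
```

If the handler copied the status packet before calling
sim_read_aux_status(), it would see whatever was in memory before the
device's last write, which is the stale-data symptom discussed in this
thread.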

DPT's position was that they are PCI compliant.  This position proved
correct when the failed system was replaced with one sporting a later
generation PCI-PCI bridge.  It worked correctly.

I have a feeling that anyone who has written a DPT driver has suffered to
some degree through the same issues.  These low-level hardware dependencies
were ironed out long ago, in no small part due to excellent support from
Mark Salyzyn of DPT and Mike Neuffer, who wrote the Linux DPT drivers.

As to the relevance of this thread to the original problem, my experience
tells me that if these issues are part of the equation, they show up very
early and very severely, and not in the manner posted here.  The
conditional code in the FreeBSD driver (in particular the 2.2 version) is
well documented by now and can be used to verify the source of the
problem, if any.

Simon


---


Sincerely Yours, 

Simon Shapiro                                           Shimon@Simon-Shapiro.ORG
                                                        770.265.7340



