Date:      Mon, 06 Dec 1999 14:41:57 -0800
From:      Parag Patel <parag@cgt.com>
To:        Mike Smith <msmith@FreeBSD.ORG>
Cc:        Gerard Roudier <groudier@club-internet.fr>, Ed Hall <edhall@screech.weirdnoise.com>, freebsd-hackers@FreeBSD.ORG
Subject:   Re: PCI DMA lockups in 3.2 (3.3 maybe?) 
Message-ID:  <6268.944520117@pinhead.parag.codegen.com>
In-Reply-To: Message from Mike Smith <msmith@FreeBSD.ORG>  of "Mon, 06 Dec 1999 13:28:40 PST." <199912062128.NAA01671@mass.cdrom.com> 


Regarding the PCI DMA problems and corruption, it reminds me of a
similar PCI and DMA-related problem we had when porting OpenBSD to a
now-defunct NKK MIPS chipset.  It may not be related, but here it is.

The port was up and running, but under heavy load, say a compile,
running apps (specifically one of the compiler passes) would start dying
with seg-faults, and then the system would lock up.  We finally had to
get out the logic-analyzer and MIPS probe, and even then we still
couldn't watch everything due to the MIPS on-chip cache.

The support chipset was locking up.  This chip had to handle memory
access from the MIPS CPU, handle DRAM directly, and handle DMA access
from the PCI bus.  It bridged all three (CPU, RAM, PCI) and seemed to us
to be hosing itself in some funky meta-stable condition.  Heavy
simultaneous memory access, typically PCI DMA bursts from different
devices, usually triggered the lockup.

So it's quite possible that the host-to-PCI-to-memory controller chipset
may be the real culprit and not the drivers or specific PCI devices.


In the process, we discovered a very interesting thing about the
NCR/Symbios chips, at least the 810 and 825 series.  Turns out that when
they are executing their scripts, and the scripts access an on-board PCI
register, that access actually negotiates for the PCI bus and uses it to
read the register!  That's right - it uses the PCI bus to talk to
itself - even when it's not DMA-ing anything!

Freaked us out when we saw it, 'cause the CPU wasn't anywhere near any
code that was accessing the NCR's registers.  Of course it slows down
script execution, and it could also slow down the PCI bus depending on
the script.  And this is all without the CPU being involved.  Certainly
it'll cause more PCI-bus activity than most other chips, and perhaps
this is why NCR controllers tend to trigger the DMA condition.

It seems that whoever designed the NCR's script-engine glommed it onto
the original programmed I/O SCSI core using the PCI bus instead of
redesigning the chip.  Cheap short-cut.  Dunno if any other NCR chips
exhibit this behavior, but I wouldn't be surprised.


	-- Parag Patel





