From owner-freebsd-scsi  Tue Mar 23  0:36:49 1999
Delivered-To: freebsd-scsi@freebsd.org
Received: from Sisyphos.MI.Uni-Koeln.DE (Sisyphos.MI.Uni-Koeln.DE [134.95.212.10])
	by hub.freebsd.org (Postfix) with ESMTP
	id 3471214D2C; Tue, 23 Mar 1999 00:34:08 -0800 (PST)
	(envelope-from se@dialup124.zpr.uni-koeln.de)
Received: from dialup124.zpr.Uni-Koeln.DE (dialup124.zpr.Uni-Koeln.DE [134.95.219.124])
	by Sisyphos.MI.Uni-Koeln.DE (8.8.7/8.8.7) with ESMTP id JAA19900;
	Tue, 23 Mar 1999 09:33:48 +0100 (MET)
Received: (from se@localhost) by dialup124.zpr.Uni-Koeln.DE (8.9.3/8.6.9) id JAA00480; Tue, 23 Mar 1999 09:36:34 +0100 (CET)
Date: Tue, 23 Mar 1999 09:36:34 +0100
From: Stefan Esser <se@mi.uni-koeln.de>
To: Christian Weisgerber <naddy@mips.rhein-neckar.de>
Cc: freebsd-scsi@freebsd.org, Stefan Esser <se@freebsd.org>
Subject: Re: Crash: what happened?
Message-ID: <19990323093634.A425@dialup124.mi.uni-koeln.de>
Reply-To: se@freebsd.org
References: <7d6obe$ne4$1@mips.rhein-neckar.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-Mailer: Mutt 0.95.4i
In-Reply-To: <7d6obe$ne4$1@mips.rhein-neckar.de>; from Christian Weisgerber on Tue, Mar 23, 1999 at 01:47:42AM +0100
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On 1999-03-23 01:47 +0100, Christian Weisgerber <naddy@mips.rhein-neckar.de> wrote:
> Our favorite unstable 2.2.8 box crashed with the following. Any idea
> what could have caused this?
> 
> ncr0:1: ERROR (1:0) (8-0-800) (8/13) @ (mem 159e0:00000000).

This looks like a memory read error (or the delayed result
thereof). In the error message above, (1:0) = (dstat:sist):

dstat=1 (Illegal Instruction)
sist=0
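
For anyone who wants to decode such a line mechanically, here is a minimal sketch. It assumes the two numbers in the first parenthesized group are printed in hex, and the DSTAT bit table is written from memory of the 53C8xx register layout, so please check it against the data manual before relying on it. The function names are made up for this example.

```python
import re

# DSTAT bit names as recalled from the 53C8xx register layout.
# ASSUMPTION: verify against the chip's data manual before trusting this table.
DSTAT_BITS = {
    0x80: "DFE (DMA FIFO empty)",
    0x40: "MDPE (master data parity error)",
    0x20: "BF (bus fault)",
    0x10: "ABRT (aborted)",
    0x08: "SSI (single step interrupt)",
    0x04: "SIR (SCRIPTS interrupt instruction)",
    0x01: "IID (illegal instruction detected)",
}


def parse_status(line):
    """Return (dstat, sist) from the "(x:y)" group that follows "ERROR".

    ASSUMPTION: both values are hexadecimal.
    """
    m = re.search(r"ERROR \(([0-9a-fA-F]+):([0-9a-fA-F]+)\)", line)
    if m is None:
        raise ValueError("no (dstat:sist) field found")
    return int(m.group(1), 16), int(m.group(2), 16)


def decode_dstat(dstat):
    """List the names of all bits from DSTAT_BITS that are set in dstat."""
    return [name for bit, name in DSTAT_BITS.items() if dstat & bit]


line = "ncr0:1: ERROR (1:0) (8-0-800) (8/13) @ (mem 159e0:00000000)."
dstat, sist = parse_status(line)
print("dstat=%#x sist=%#x" % (dstat, sist), decode_dstat(dstat))
```

For the line quoted above this reports only the illegal-instruction bit in dstat and nothing in sist.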

Offset of next instruction is 0x159e0 (way outside the "official"
NCR SCRIPTS code).
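
The offset and the value read there come from the last group of the message. A small sketch of pulling them out, assuming both numbers are hex and that the word in front of them is just a label for the region the address falls in (the helper name is invented for this example):

```python
import re


def parse_location(line):
    """Return (label, offset, value) from the trailing "@ (label off:val)" group.

    ASSUMPTION: offset and value are hexadecimal; the label is passed
    through unchanged without interpreting it.
    """
    m = re.search(r"@ \((\w+) ([0-9a-fA-F]+):([0-9a-fA-F]+)\)", line)
    if m is None:
        raise ValueError("no location field found")
    return m.group(1), int(m.group(2), 16), int(m.group(3), 16)


msg = "ncr0:1: ERROR (1:0) (8-0-800) (8/13) @ (mem 159e0:00000000)."
label, offset, value = parse_location(msg)
print(label, hex(offset), hex(value))
```

Here the offset comes out as 0x159e0 and the value as 0, matching the reading given above.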

At that address, a value of 0 was read, which is not a valid
instruction for the CPU in the NCR chip. It stopped working,
and the driver did not manage to recover from that state. It
is possible that the memory range holding the "micro-code"
was corrupted. I can't tell what made the NCR jump to that
invalid address, where the "Illegal Instruction Detected"
interrupt made it stop. (It may have been a soft error; I just
had one a few days ago, when one bit flipped during a kernel
build ...)

Since I assume that this is a single occurrence in an otherwise
reliable system, I'd not consider this to be a major problem.

If something like that happens again, I'd rather guess it is
hardware going bad than software. (I have heard of NCR chips
failing after years of reliable operation, and this may also be
a memory chip running under marginal conditions ...)

Regards, STefan

