Date:      Fri, 10 Mar 1995 11:35:56 -0600 (CST)
From:      pritc003@maroon.tc.umn.edu
To:        bugs@FreeBSD.org
Subject:   Parity error on SCSI tape causes panic w/Adaptec 2842 controller
Message-ID:  <2f608dfe74be002@maroon.tc.umn.edu>


To: FreeBSD-gnats-submit@freebsd.org
Subject: 
From: pritc003@maroon.tc.umn.edu
Reply-To: pritc003@maroon.tc.umn.edu


>Submitter-Id:   current-users
>Originator:     Mike Pritchard
>Organization:   None
>Confidential:   no
>Synopsis:       Parity error from SCSI tape causes panic w/Adaptec 2842
>Severity:       serious
>Priority:       medium
>Category:       kern
>Release:        FreeBSD 2.0-950210-SNAP i386
>Class:          sw-bug
>Environment: 

	Adaptec 2842VL SCSI controller
	Archive 2150S tape drive

>Description: 

	If a tape parity error is detected by the Adaptec 2842 driver
	software (sys/i386/scsi/aic7xxx.c), the system will panic
	with the following messages:

	ahc1: parity error on channel A target 0, lun 0
	ahc1: Unknown SCSIINT. Status = 0x17
	panic: ahc1: brkaddrint, Illegal Host Access at seqaddr = 0x0

	Examining the code shows that the parity error handling code
	incorrectly falls through into the "Unknown SCSIINT" code, which
	eventually leads to the panic.  A fix for this is attached, but
	that fix uncovers another problem that causes repeated
	SCSI device timeouts on sd0.
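
	The fall-through can be illustrated with a small stand-alone
	sketch.  It is not the driver code: the bit values below are
	assumed for illustration, and the shape of the unpatched code
	(independent tests with an else attached only to the last one)
	is inferred from the patch context.  handle_old() and
	handle_new() are hypothetical names.

```c
/* Assumed status bit values, for illustration only; the real
 * definitions live in the driver's register header. */
#define SELTO    0x80
#define BUSFREE  0x08
#define SCSIPERR 0x04

/* Flags recording which handler paths were taken. */
enum { SAW_SELTO = 1, SAW_PARITY = 2, SAW_BUSFREE = 4, SAW_UNKNOWN = 8 };

/* Assumed shape of the unpatched logic: the trailing else binds only
 * to the last test, so a parity error without BUSFREE also reaches
 * the "unknown" path. */
static int handle_old(unsigned status)
{
    int seen = 0;

    if (status & SELTO)
        seen |= SAW_SELTO;
    if (status & SCSIPERR)
        seen |= SAW_PARITY;
    if (status & BUSFREE)
        seen |= SAW_BUSFREE;
    else
        seen |= SAW_UNKNOWN;
    return seen;
}

/* Shape of the patched logic: reject unknown status up front, then
 * handle each known condition without a trailing else. */
static int handle_new(unsigned status)
{
    int seen = 0;

    if ((status & (SELTO | SCSIPERR | BUSFREE)) == 0)
        return SAW_UNKNOWN;
    if (status & SELTO)
        seen |= SAW_SELTO;
    if (status & SCSIPERR)
        seen |= SAW_PARITY;
    if (status & BUSFREE)
        seen |= SAW_BUSFREE;
    return seen;
}
```

	Under these assumed bit values, the status 0x17 from the panic
	messages makes handle_old() take both the parity path and the
	unknown path, while handle_new() takes only the parity path.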


>How-To-Repeat: 

	Find a QIC-150 tape with a parity error, and try reading the
	tape.  The system will panic when the parity error is detected.

>Fix: 
	
	Here is a partial fix to the problem to help someone get
	started, but there is still some other underlying problem
	that shows up with this fix installed.  With this fix installed,
	the parity error is detected, and the machine will not panic,
	but then it starts complaining about scsi device timeouts on
	sd0 and keeps doing that forever, so the machine hangs up
	anyway.  I've seen the scsi device timeout problem a few
	other times before, so it probably does need to be addressed,
	although in this case it may just be happening because the
	parity error code is just plain broken in some fashion.

	I'm also willing to help test out any fixes.


*** old/aic7xxx.c	Fri Mar 10 10:53:44 1995
--- ./aic7xxx.c	Fri Mar 10 10:56:42 1995
***************
*** 1141,1146 ****
--- 1141,1155 ----
                  }
  		xs = scb->xs;
  
+ 		if ((status & (SELTO | SCSIPERR | BUSFREE)) == 0) {
+                       printf("ahc%d: Unknown SCSIINT. Status = 0x%x\n", 
+ 			     unit, status);
+                       outb(CLRSINT1 + iobase, status);
+                       UNPAUSE_SEQUENCER(ahc);
+                       outb(CLRINT + iobase, CLRINTSTAT);
+ 		      scb = NULL;
+ 		      goto cmdcomplete;
+ 		}
  		if (status & SELTO) { 
  			u_char active;
  			u_char flags;
***************
*** 1196,1209 ****
  #endif
  		}
  
- 		else {
-                       printf("ahc%d: Unknown SCSIINT. Status = 0x%x\n", 
- 			     unit, status);
-                       outb(CLRSINT1 + iobase, status);
-                       UNPAUSE_SEQUENCER(ahc);
-                       outb(CLRINT + iobase, CLRINTSTAT);
- 		      scb = NULL;
-                 }
  		if(scb != NULL) {
  		    /* We want to process the command */
                      untimeout(ahc_timeout, (caddr_t)scb);
--- 1205,1210 ----


