Date:      Sun, 12 Mar 1995 07:43:41 -0500 (EST)
From:      Peter Dufault <dufault@hda.com>
To:        oli@devsoft.com
Cc:        freebsd-bugs@freefall.cdrom.com
Subject:   Re: kern/238: failed assertion in ncr.c --> no more scsi disk access
Message-ID:  <199503121243.HAA16757@hda.com>
In-Reply-To: <199503111840.KAA14351@freefall.cdrom.com> from "oli@devsoft.com" at Mar 11, 95 10:40:01 am

oli@devsoft.com writes:

I think we have three bugs here:

> 
> 
> >Number:         238
> >Category:       kern
> >Synopsis:       failed assertion in ncr.c --> no more scsi disk access
> >Confidential:   no
> >Severity:       serious
> >Priority:       medium
> >Responsible:    freebsd-bugs (FreeBSD bugs mailing list)
> >State:          open
> >Class:          sw-bug
> >Submitter-Id:   current-users
> >Arrival-Date:   Sat Mar 11 10:40:00 1995
> >Originator:     Oliver Adler &
> >Organization:
> no current org
> >Release:        FreeBSD 2.0-RELEASE i386 also in current snapshot
> >Environment:
> 
> 	The hardware: ASUS PCI/I-486SP3G
> 		      Intel 486 DX 4 100
> 		      32MB Ram 70ns in two sims
> 		      WD8003EP AT-BUS
> 		      Adaptec AHA1542CF with disabled Floppy and disabled BIOS
> 		      ELSA Winner100pro PCI 2MB
> 		      QUANTUM Empire 2100S  Target 0 on ncr bus
> 			  "QUANTUM EMPIRE_2100S     1200"
> 		      WANG DAT 3400 Target 3 on ncr bus
> 			  "WangDAT Model 3400DX     1.10"
> 
> >Description:
> 
> 	If you try to access /dev/rst0.0 with dd bs=128k if=/dev/rst0.0
> 	you get on the first access:
> Mar 10 09:00:28 boheme kernel: st0: bad request, must be between 0 and 0
>         (I think this message doesn't do any harm. It also starts the first
> 	 access on the adaptec and does no harm there.)

Bug 1: The tape driver prints this message on the first access.  I
think this is a minor bug that is always present and shows up with
the first tape access.

>         If you try a second time you get the following:
> 	 st0:      ncr.c assertion "cp = np->header.cp" line 5171 failed
> 	 st0:      ncr.c assertion "cp" line 5172 failed
> 	 .
> 	 .
> 	 ncr0:    restart

Bug 2: The ncr driver trips the assertions shown above and then resets
the SCSI bus.  When this is addressed you'll be able to move your DAT
back to this SCSI bus.

> 	 .
> 	 .
> 	 sd0: unit attention
> 	 sd0: oops not qeued

Bug 3: The disk driver can't live through a bus reset.

Bug 3 is partly addressed in -current.

In -current the disk driver will usually retry the disk access and
you'll usually be OK.  You can still lose disk transfers completely and
wedge the system - I think the path where you have a timeout (as opposed
to a detected failure) doesn't retry the access, and you'll start getting

> sd0: timeout

messages until you reset.

About SCSI bus resets:

Host adapter drivers should not reset the SCSI bus.  Bus reset
policy should be moved up out of the host adapter code and into
some common code in the sys/scsi directory, with a "reset scsi bus"
entry provided in the host adapter drivers.
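
A rough sketch of that split, in user-space C so it can be compiled on
its own.  Every name here (scsi_adapter_ops, scsi_bus, scsi_bus_fault,
the fault codes, the max_resets limit) is hypothetical and is not taken
from sys/scsi; the rule for which faults justify a reset is only an
assumption made for illustration.  The point is that the adapter
supplies the mechanism and the common code alone decides when to use it:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-adapter entry table; not the real sys/scsi interface. */
struct scsi_adapter_ops {
    const char *name;
    int (*reset_bus)(void *softc);  /* the "reset scsi bus" entry; 0 on success */
};

/* Hypothetical per-bus state kept by the common SCSI code. */
struct scsi_bus {
    const struct scsi_adapter_ops *ops;
    void *softc;        /* adapter private data, passed back to the entry */
    int resets_done;
    int max_resets;     /* assumed policy limit, for illustration only */
};

/* Assumed fault classes an adapter might report upward. */
enum scsi_fault { FAULT_SELECTION_TIMEOUT, FAULT_COMMAND_TIMEOUT, FAULT_BUS_HUNG };

/*
 * Common-code policy: the adapter only reports the fault, and this
 * function decides whether the bus gets reset.
 * Returns 1 if a reset was issued, 0 otherwise.
 */
int scsi_bus_fault(struct scsi_bus *bus, enum scsi_fault fault)
{
    assert(bus != NULL && bus->ops != NULL);

    if (fault != FAULT_BUS_HUNG)
        return 0;               /* assumed: only a hung bus warrants a reset */
    if (bus->ops->reset_bus == NULL)
        return 0;               /* adapter provides no reset entry */
    if (bus->resets_done >= bus->max_resets)
        return 0;               /* give up rather than reset forever */
    if (bus->ops->reset_bus(bus->softc) != 0)
        return 0;               /* adapter could not reset the bus */

    bus->resets_done++;
    return 1;
}

/* Stand-in adapter used only to exercise the policy code. */
struct fake_softc { int reset_calls; };

int fake_reset_bus(void *softc)
{
    ((struct fake_softc *)softc)->reset_calls++;
    return 0;
}
```

With this layout a host adapter driver never calls its own reset
routine; it reports the fault and lets the common code choose.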

Type drivers have to retry on the "power on, media changed, or
bus device reset occurred" error - the SCSI spec ENCOURAGES devices
to reset the bus on power up under the assumption that any transfer
that was going on when the device was powered up is likely to be
trash.
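
As a sketch of what that retry could look like in a type driver, again
in stand-alone C with hypothetical names (xfer, xfer_action,
type_driver_check_sense are made up for this example).  It assumes
fixed-format SCSI-2 sense data, where the sense key is the low nibble
of byte 2 and the additional sense code is byte 12.  Retrying only on
additional sense code 0x29 (power on, reset, or bus device reset
occurred), and not on a medium change, is a choice made for the
example, not something the drivers are known to do:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SENSE_KEY_NO_SENSE        0x00
#define SENSE_KEY_UNIT_ATTENTION  0x06
#define ASC_POWER_ON_OR_RESET     0x29  /* power on, reset, or bus device reset occurred */

enum xfer_action { XFER_DONE, XFER_RETRY, XFER_FAIL };

/* Hypothetical per-transfer bookkeeping. */
struct xfer {
    int retries_left;
};

/*
 * Decide what to do with a transfer that came back with sense data.
 * A unit attention caused by a reset means the device dropped the
 * command, so the transfer is reissued while retries remain.
 */
enum xfer_action type_driver_check_sense(struct xfer *x,
                                         const uint8_t *sense, size_t len)
{
    assert(x != NULL);

    /* Need current-error, fixed-format sense data reaching byte 12. */
    if (sense == NULL || len < 13 || (sense[0] & 0x7f) != 0x70)
        return XFER_FAIL;

    uint8_t key = sense[2] & 0x0f;
    uint8_t asc = sense[12];

    if (key == SENSE_KEY_NO_SENSE)
        return XFER_DONE;

    if (key == SENSE_KEY_UNIT_ATTENTION && asc == ASC_POWER_ON_OR_RESET &&
        x->retries_left > 0) {
        x->retries_left--;
        return XFER_RETRY;
    }

    return XFER_FAIL;   /* anything else is left to other error handling */
}
```

The retry budget keeps a device that reports unit attention forever
from looping the driver; the count itself is arbitrary.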


-- 
Peter Dufault               Real Time Machine Control and Simulation
HD Associates, Inc.         Voice: 508 433 6936
dufault@hda.com             Fax:   508 433 5267


