From owner-freebsd-bugs Sun Mar 12 04:47:11 1995
Return-Path: bugs-owner
Received: (from majordom@localhost) by freefall.cdrom.com (8.6.10/8.6.6) id EAA15218 for bugs-outgoing; Sun, 12 Mar 1995 04:47:11 -0800
Received: from hda.com (hda.com [199.232.40.182]) by freefall.cdrom.com (8.6.10/8.6.6) with ESMTP id EAA15212 for ; Sun, 12 Mar 1995 04:47:08 -0800
Received: (dufault@localhost) by hda.com (8.6.9/8.3) id HAA16757; Sun, 12 Mar 1995 07:43:41 -0500
From: Peter Dufault
Message-Id: <199503121243.HAA16757@hda.com>
Subject: Re: kern/238: failed assertion in ncr.c --> no more scsi disk access
To: oli@devsoft.com
Date: Sun, 12 Mar 1995 07:43:41 -0500 (EST)
Cc: freebsd-bugs@freefall.cdrom.com
In-Reply-To: <199503111840.KAA14351@freefall.cdrom.com> from "oli@devsoft.com" at Mar 11, 95 10:40:01 am
X-Mailer: ELM [version 2.4 PL24]
Content-Type: text
Content-Length: 3014
Sender: bugs-owner@FreeBSD.org
Precedence: bulk

oli@devsoft.com writes:

I think we have three bugs here:

>
>
> >Number: 238
> >Category: kern
> >Synopsis: failed assertion in ncr.c --> no more scsi disk access
> >Confidential: no
> >Severity: serious
> >Priority: medium
> >Responsible: freebsd-bugs (FreeBSD bugs mailing list)
> >State: open
> >Class: sw-bug
> >Submitter-Id: current-users
> >Arrival-Date: Sat Mar 11 10:40:00 1995
> >Originator: Oliver Adler &
> >Organization:
> no current org
> >Release: FreeBSD 2.0-RELEASE i386 also in current snapshot
> >Environment:
>
> The hardware: ASUS PCI/I-486SP3G
>               Intel 486 DX 4 100
>               32MB Ram 70ns in two sims
>               WD8003EP AT-BUS
>               Adaptec AHA1542CF with disabled Floppy and disabled BIOS
>               ELSA Winner100pro PCI 2MB
>               QUANTUM Empire 2100S Target 0 on ncr bus
>               "QUANTUM EMPIRE_2100S 1200"
>               WANG DAT 3400 Target 3 on ncr bus
>               "WangDAT Model 3400DX 1.10"
>
> >Description:
>
> If you try to access /dev/rst0.0 with dd bs=128k if=/dev/rst0.0
> you get on the first access:
> Mar 10 09:00:28 boheme kernel: st0: bad request, must be between 0 and 0
> (I think this message doesn't do any harm. It also starts the first
> access on the adaptec and does no harm there.)

Bug 1: The tape driver prints this message on first access. I think
this is a minor bug that is always there and shows up with the first
tape access.

> If you try a second time you get the following:
> st0: ncr.c assertion "cp = np->header.cp" line 5171 failed
> st0: ncr.c assertion "cp" line 5172 failed
> .
> .
> ncr0: restart

Bug 2: The ncr driver does something funny and then resets the SCSI
bus. When this is addressed you'll be able to move your DAT back to
this SCSI bus.

> .
> .
> sd0: unit attention
> sd0: oops not qeued

Bug 3: The disk driver can't live through a bus reset.

Bug 3 is partly addressed in -current. In -current it will usually
retry the disk access and you'll usually be OK. You can still lose
disk transfers completely and wedge the system - I think the path
where you have a timeout (versus a detected failure) doesn't retry
the access, and you'll start getting

> sd0: timeout

messages until you reset.

About SCSI bus resets:

Host adapter drivers should not reset the SCSI bus on their own. Bus
reset policy should be moved up out of the host adapter code and into
some common code in the sys/scsi directory, with a "reset scsi bus"
entry provided in the host adapter drivers.

Type drivers have to retry on the "power on, media changed, or bus
device reset" occurred error - the SCSI spec ENCOURAGES devices to
reset the bus on power up under the assumption that any transfer that
was going on when the device was powered up is likely to be trash.

--
Peter Dufault               Real Time Machine Control and Simulation
HD Associates, Inc.         Voice: 508 433 6936
dufault@hda.com             Fax:   508 433 5267
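
A minimal sketch of the arrangement described under "About SCSI bus
resets" above, in plain user-space C rather than kernel code. Every
name in it (host_adapter, scsi_do_cmd, the xfer_status codes, the
retry limits, the fake adapter) is hypothetical and is not the actual
sys/scsi interface. The assumption is that the host adapter only
supplies a "reset scsi bus" entry, while the common code decides when
to call it and reissues commands after a unit attention or a timeout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical completion codes reported by a host adapter. */
enum xfer_status {
    XFER_OK,
    XFER_UNIT_ATTENTION, /* device saw a power on or bus device reset */
    XFER_TIMEOUT,        /* command never completed */
    XFER_FAILED          /* unrecoverable error */
};

/* Hypothetical adapter entry points: mechanism only, no reset policy. */
struct host_adapter {
    enum xfer_status (*start_cmd)(struct host_adapter *ha, int target);
    void (*reset_bus)(struct host_adapter *ha); /* "reset scsi bus" entry */
    void *priv;
};

#define MAX_RETRIES           3 /* assumed limit, for illustration only */
#define TIMEOUTS_BEFORE_RESET 2 /* assumed limit, for illustration only */

/*
 * Common-layer policy: reissue the command after a unit attention or a
 * timeout, and reset the bus only from here, after repeated timeouts.
 */
enum xfer_status scsi_do_cmd(struct host_adapter *ha, int target)
{
    int timeouts = 0;

    assert(ha != NULL && ha->start_cmd != NULL && ha->reset_bus != NULL);

    for (int attempt = 0; attempt <= MAX_RETRIES; attempt++) {
        enum xfer_status st = ha->start_cmd(ha, target);

        switch (st) {
        case XFER_OK:
        case XFER_FAILED:
            return st;
        case XFER_UNIT_ATTENTION:
            continue; /* the transfer was dropped by a reset: retry it */
        case XFER_TIMEOUT:
            if (++timeouts >= TIMEOUTS_BEFORE_RESET) {
                ha->reset_bus(ha);
                timeouts = 0;
            }
            continue; /* retry the timeout path too */
        }
    }
    return XFER_FAILED; /* retries exhausted */
}

/* Fake adapter that replays a scripted list of results, for trying it out. */
struct fake_adapter {
    const enum xfer_status *script;
    size_t len;
    size_t pos;
    int resets;
};

static enum xfer_status fake_start(struct host_adapter *ha, int target)
{
    struct fake_adapter *f = ha->priv;

    (void)target;
    return f->pos < f->len ? f->script[f->pos++] : XFER_OK;
}

static void fake_reset(struct host_adapter *ha)
{
    struct fake_adapter *f = ha->priv;

    f->resets++;
}
```

With the fake adapter scripted to return a unit attention followed by
success, scsi_do_cmd() reissues the command and never touches the bus.
With two timeouts in a row it calls reset_bus once and then retries.
Keeping the counters in the common code is the point of the sketch:
the adapter never decides on its own to reset.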