Date:      Wed, 18 Apr 2001 22:53:55 +0200
From:      J Wunsch <j@uriah.heep.sax.de>
To:        freebsd-scsi@FreeBSD.ORG
Subject:   Re: Problem with current sa(4) driver
Message-ID:  <20010418225355.U688@uriah.heep.sax.de>
In-Reply-To: <200104150504.f3F544s00932@aslan.scsiguy.com>; from gibbs@scsiguy.com on Sat, Apr 14, 2001 at 11:04:04PM -0600
References:  <20010414203925.A63281@uriah.heep.sax.de> <200104150504.f3F544s00932@aslan.scsiguy.com>

As Justin T. Gibbs wrote:

> While it is true that the sa driver should be filtering out this
> particular case because there is no error, returning ERESTART for
> NO_SENSE is also wrong.  You should be able to fix that by changing
> the table entry for that sense code in cam_periph.c.

You mean, like this?

Index: cam_periph.c
===================================================================
RCS file: /home/ncvs/src/sys/cam/cam_periph.c,v
retrieving revision 1.34
diff -c -r1.34 cam_periph.c
*** cam_periph.c	2001/04/04 18:24:35	1.34
--- cam_periph.c	2001/04/17 17:46:11
***************
*** 1369,1374 ****
--- 1369,1376 ----
  
  		switch (err_action & SS_MASK) {
  		case SS_NOP:
+ 			error = 0;
+ 			break;
  		case SS_RETRY:
  			action_string = "Retrying Command";
  			error = ERESTART;
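
For illustration only, here is a small userland model of what the two
added lines change.  It is a sketch, not the kernel code: the enum, the
MODEL_* constants and both function names are made up for this example,
and the real switch in cam_periph.c does more work in its retry branch
than is modelled here.

```c
/* Stand-ins for this sketch only; the real values live in kernel headers. */
enum model_action { MODEL_SS_NOP, MODEL_SS_RETRY, MODEL_SS_FAIL };

#define MODEL_ERESTART (-1)  /* "command was re-queued, try again" */
#define MODEL_EIO      5     /* generic I/O error */

/* Before the patch: the NOP case has no body and falls into the retry case. */
int model_error_before_patch(enum model_action action)
{
    int error = MODEL_EIO;

    switch (action) {
    case MODEL_SS_NOP:
        /* falls through */
    case MODEL_SS_RETRY:
        error = MODEL_ERESTART;
        break;
    case MODEL_SS_FAIL:
        error = MODEL_EIO;
        break;
    }
    return error;
}

/* After the patch: the NOP case reports success and stops there. */
int model_error_after_patch(enum model_action action)
{
    int error = MODEL_EIO;

    switch (action) {
    case MODEL_SS_NOP:
        error = 0;
        break;
    case MODEL_SS_RETRY:
        error = MODEL_ERESTART;
        break;
    case MODEL_SS_FAIL:
        error = MODEL_EIO;
        break;
    }
    return error;
}
```

In this model a "no operation" action yields the restart code before
the patch and 0 after it, while the retry action is unaffected.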

I tried this, and it fixes the problem with ILI: sa(4) now properly
returns a short read.  However, it uncovers a new bug that was just
waiting around...

% dd if=/dev/sa0 of=/dev/null bs=10k
dd: /dev/sa0: Input/output error
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1+0 records in
1+0 records out
10240 bytes transferred in 70.992489 secs (144 bytes/sec)

/var/log/messages says:

(sa0:sym0:0:1:0): READ(06). CDB: 8 0 0 28 0 0 
(sa0:sym0:0:1:0): CAM Status: SCSI Status Error
(sa0:sym0:0:1:0): SCSI Status: Check Condition
(sa0:sym0:0:1:0): NO SENSE info:2800 asc:0,1
(sa0:sym0:0:1:0): Filemark detected
(sa0:sym0:0:1:0): Retries Exhausted

So now, when hitting the EOM filemark, we get an EIO.  Again, this
looks like something where sa(4) should IMHO special-case the error
decision, instead of relying on cam_periph_error() to DTRT (which it
cannot).
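
To make concrete what I mean by special-casing, here is a rough
userland sketch of the kind of decision a tape driver could make on its
own before handing anything to the generic error code.  It only relies
on the standard fixed-format sense layout (flag bits and sense key in
byte 2, information field in bytes 3..6).  The enum, the struct and the
function names are hypothetical and are not taken from sa(4).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte 2 of fixed-format sense data. */
#define SENSE_FILEMARK     0x80
#define SENSE_EOM          0x40
#define SENSE_ILI          0x20
#define SENSE_KEY_MASK     0x0f
#define SENSE_KEY_NO_SENSE 0x00

/* Hypothetical verdicts for this sketch; not kernel definitions. */
enum tape_verdict {
    TAPE_DEFER_TO_GENERIC,  /* let the generic error handling decide */
    TAPE_SHORT_READ,        /* ILI: block was not the requested length */
    TAPE_FILEMARK           /* filemark hit: report end of file */
};

struct tape_result {
    enum tape_verdict verdict;
    int32_t residual;       /* information field, 0 if not valid */
};

/* Return the information field (bytes 3..6, big-endian) if marked valid. */
static int32_t sense_info(const uint8_t *sense)
{
    if ((sense[0] & 0x80) == 0)
        return 0;
    uint32_t v = ((uint32_t)sense[3] << 24) | ((uint32_t)sense[4] << 16) |
                 ((uint32_t)sense[5] << 8) | (uint32_t)sense[6];
    return (int32_t)v;
}

/* Classify sense data with the tape-specific cases handled first. */
struct tape_result classify_tape_sense(const uint8_t *sense, size_t len)
{
    struct tape_result r = { TAPE_DEFER_TO_GENERIC, 0 };

    assert(sense != NULL);
    /* Only current-error, fixed-format sense data is handled here. */
    if (len < 7 || (sense[0] & 0x7f) != 0x70)
        return r;
    /* Anything other than NO SENSE is a real error condition. */
    if ((sense[2] & SENSE_KEY_MASK) != SENSE_KEY_NO_SENSE)
        return r;

    if (sense[2] & SENSE_FILEMARK) {
        r.verdict = TAPE_FILEMARK;
        r.residual = sense_info(sense);
    } else if (sense[2] & SENSE_ILI) {
        r.verdict = TAPE_SHORT_READ;
        r.residual = sense_info(sense);
    }
    return r;
}
```

Fed with sense data shaped like the log above (NO SENSE, filemark bit
set, information field 0x2800, assuming the log prints that field in
hex), this sketch reports a filemark with a residual of 10240 bytes,
i.e. a read that transferred nothing, rather than an error.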

I tried to manually patch the return value of cam_periph_error() to 0
in kgdb, but this just gets me back at the second problem:

% ps axl
  UID   PID  PPID CPU PRI NI   VSZ  RSS WCHAN  STAT  TT       TIME COMMAND
...
  107   373     1   0  -8  0   244   33 cbwait DWE   p0-   0:00.00 dd if=/dev/sa0 of=/dev/null bs=10

It sits there and waits indefinitely.  I'm at a loss to see why
this happens. :-(

> ERESTART means the error recovery code has already re-queued the
> CCB to retry the operation.  By ignoring this code, you are telling
> the caller of saerror() to complete the command normally resulting in
> an eventual release of this particular ccb back to the free pool.

OK, understood, thanks for the explanation!

-- 
cheers, J"org               .-.-.   --... ...--   -.. .  DL8DTL

http://www.sax.de/~joerg/                        NIC: JW11-RIPE
Never trust an operating system you don't have sources for. ;-)




