From owner-freebsd-scsi  Mon Aug 28  2: 1:41 2000
Delivered-To: freebsd-scsi@freebsd.org
Received: from mail-service.wias-berlin.de (descartes.wias-berlin.de [192.124.249.194])
	by hub.freebsd.org (Postfix) with ESMTP id F074737B422
	for <freebsd-scsi@FreeBSD.ORG>; Mon, 28 Aug 2000 02:01:28 -0700 (PDT)
Received: from wias-berlin.de by mail-service.wias-berlin.de (8.8.8/1.1.22.3/16Mar99-0205PM)
	id LAA0000029684; Mon, 28 Aug 2000 11:01:26 +0200 (MET DST)
Received: from localhost by wias-berlin.de (8.8.8/1.1.10.5/07Oct97-0932AM)
	id LAA0000014478; Mon, 28 Aug 2000 11:01:20 +0200 (MET DST)
To: groudier@club-internet.fr, j@uriah.heep.sax.de, dkelly@hiwaay.net
Cc: freebsd-scsi@FreeBSD.ORG
Subject: Re: NCR SCSI controller problems with FreeBSD-4.1R
From: Jens Andre Griepentrog <griepent@wias-berlin.de>
Reply-To: griepent@wias-berlin.de
In-Reply-To: <Pine.LNX.4.10.10008231356240.899-100000@linux.local>
References: <20000823105725P.griepent@hilbert.wias-berlin.de>
	<Pine.LNX.4.10.10008231356240.899-100000@linux.local>
X-Mailer: Mew version 1.94.1 on Emacs 19.33 
Mime-Version: 1.0
Content-Type: Text/Plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
Message-Id: <20000828110119B.griepent@hilbert.wias-berlin.de>
Date: Mon, 28 Aug 2000 11:01:19 +0200
X-Dispatcher: imput version 990905(IM130)
Lines: 102
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org


Dear Gérard, David and Joerg!

Thank you very much for your suggestions, especially yours, Gérard!
Indeed, looking at the SC-200 board, it was configured for PCI INT B!
Setting the jumper to PCI INT A solved the problem, and I was then able
to install FreeBSD-4.1R successfully.
Reading the manuals of both the motherboard and the controller card
again, PCI INT A is the default setting; it was my mistake to trust in
the right setting without a simple check. But for years there was no
reason to look at the controller, because the system has worked
flawlessly with all previous FreeBSD releases until now.

Being happy I want to thank you again,

Jens Griepentrog

>
> On Wed, 23 Aug 2000 14:25:06, Gérard Roudier suggests the following:
>
> 1) Check that your SC-200 is configured for PCI INT A.
>    If it is not, change jumpers accordingly.
> 2) Move your board to another PCI slot.
> 3) Check/set SCSI termination properly.
>
> If either (1) and/or (2) and/or (3) does not cure, then send me any dmesg
> you have. You also could try to catch manually driver messages that look
> like complaint about something going wrong under FreeBSD and/or Linux as
> well.
>
> The fact that your system experiences the problem also under Linux and
> given that the ncr(4) driver (used in 3.4) also polls interrupts using a
> timer, it might well be that hardware interrupts are not functioning
> properly for your SC-200 (just a guessing).
>
>  Gerard.
>

>> On Wed, 23 Aug 2000, Jens Andre Griepentrog wrote:
>>
>> Dear FreeBSD and SCSI experts!
>>
>> I have tried to install FreeBSD 4.1 Release from CD-ROM (4.1-install.iso).
>> Using the boot floppies kern.flp and mfsroot.flp the system hangs up during
>> the detection and settlement of SCSI devices. None of my five SCSI devices
>> connected to the ASUS SC-200 NCR SCSI controller appears in the boot display
>> messages. Here is the short list of relevant hardware components:
>>
>> 	ASUS P6NP5 Intel Pentium Pro 200 mother board,
>> 	three IBM and Quantum SCSI hard drives,
>> 	Tandberg SCSI QIC streamer,
>> 	Plextor SCSI CD-ROM.
>>
>> Using the same hardware this problem appears for the first time in my
>> four-year experience with successfully installed releases 2.1, 2.2, 3.1 or 3.4.
>>
>> From time to time I tried to install several Linux distributions to get
>> the impression of a comparable OS but every time ran into the same
>> trouble concerning the detection of SCSI devices.
>>
>> The deeper reason could be a problem with the NCR SCSI controller because
>> of reported code modifications after release 3.4.
>>
>> It would be nice if you could give me a hint. Let me know if you need more
>> details about my hardware or BIOS settings. For instance I can send a dmesg
>> log file from my current FreeBSD-3.4 installation.
>>
>> So long,
>>
>> Jens Griepentrog
>>


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message


From owner-freebsd-scsi  Mon Aug 28  7:43: 5 2000
Delivered-To: freebsd-scsi@freebsd.org
Received: from sabre.velocet.net (sabre.velocet.net [198.96.118.66])
	by hub.freebsd.org (Postfix) with ESMTP id 47EB737B424
	for <freebsd-SCSI@freebsd.org>; Mon, 28 Aug 2000 07:43:02 -0700 (PDT)
Received: from office.tor.velocet.net (trooper.velocet.net [216.126.82.226])
	by sabre.velocet.net (Postfix) with ESMTP id 33B71137F12
	for <freebsd-SCSI@freebsd.org>; Mon, 28 Aug 2000 10:43:00 -0400 (EDT)
Received: (from dgilbert@localhost)
	by office.tor.velocet.net (8.9.3/8.9.3) id KAA71431;
	Mon, 28 Aug 2000 10:42:59 -0400 (EDT)
	(envelope-from dgilbert)
From: David Gilbert <dgilbert@velocet.ca>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <14762.31347.647187.677745@trooper.velocet.net>
Date: Mon, 28 Aug 2000 10:42:59 -0400 (EDT)
To: freebsd-SCSI@freebsd.org
Subject: SCSI disconnect with quantum Atlas IV disks.
X-Mailer: VM 6.75 under 20.4 "Emerald" XEmacs  Lucid
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

OK... round two.  As I mentioned in my posting about RAID... I'm
having trouble with my system disconnecting disks during intense usage 
(like the nightly finds that run across the disk).

The system is running two 29160 controllers (couldn't buy 2940's) with 
LVD cables and terminators (all rated for 160 operation).  With 4.1,
we're running at 40MHz (80 MB/s) ... which is fine, but I'm wondering
if this could be causing part of the problem.
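
As a rough check of the figures quoted above: a wide (16-bit) bus moves two
bytes per transfer, so 40 MHz gives 80 MB/s, and Ultra160 gets to 160 MB/s
at the same 40 MHz clock by transferring on both clock edges. A minimal
sketch of that arithmetic follows; the function name and parameters are made
up for illustration and are not part of any driver.

```python
def scsi_bandwidth_mb_s(clock_mhz, bus_width_bits=16, double_transition=False):
    """Peak synchronous transfer rate in MB/s for a parallel SCSI bus."""
    # Double-transition clocking moves data on both edges of the clock.
    transfers_per_cycle = 2 if double_transition else 1
    return clock_mhz * transfers_per_cycle * bus_width_bits // 8

print(scsi_bandwidth_mb_s(40))                          # wide bus at 40 MHz
print(scsi_bandwidth_mb_s(40, double_transition=True))  # Ultra160
```

These are peak bus rates only; sustained throughput depends on the drives.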

When might the 160 patch be MFC'd?

I've also fired off a query to Quantum to see if there are any firmware 
updates... but a search of their knowledge base doesn't indicate any.

One strange note is that I've found that I have to turn on the drives
slightly after turning on the computer lest they fail to reset
properly.  I don't know what causes this, but changing controllers,
cables and terminators hasn't helped (I also have a set of TechRAM
390F's that exhibit largely the same symptoms).

Would it be possible to code some really BIG bus resets and reprobes
into the SCSI layer instead of disconnecting the drive --- maybe even
some tunable parameter --- It would seem sensible that you could do
with some gaps in performance once you've gotten to this level of
problem.

Dave.

-- 
============================================================================
|David Gilbert, Velocet Communications.       | Two things can only be     |
|Mail:       dgilbert@velocet.net             |  equal if and only if they |
|http://www.velocet.net/~dgilbert             |   are precisely opposite.  |
=========================================================GLO================




From owner-freebsd-scsi  Mon Aug 28 20:31: 8 2000
Delivered-To: freebsd-scsi@freebsd.org
Received: from wantadilla.lemis.com (wantadilla.lemis.com [192.109.197.80])
	by hub.freebsd.org (Postfix) with ESMTP id 4141737B42C
	for <freebsd-scsi@FreeBSD.ORG>; Mon, 28 Aug 2000 20:31:04 -0700 (PDT)
Received: (from grog@localhost)
	by wantadilla.lemis.com (8.11.0/8.9.3) id e7T3Ue612616;
	Tue, 29 Aug 2000 13:00:40 +0930 (CST)
	(envelope-from grog)
Date: Tue, 29 Aug 2000 13:00:40 +0930
From: Greg Lehey <grog@lemis.com>
To: Matthew Jacob <mjacob@feral.com>
Cc: Sam <freep@thecity.sfsu.edu>, freebsd-scsi@FreeBSD.ORG
Subject: Re: "tape is now frozen"
Message-ID: <20000829130040.Q11422@wantadilla.lemis.com>
References: <20000809090336.F13974@wantadilla.lemis.com> <Pine.BSF.4.10.10008081636150.87492-100000@beppo.feral.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0i
In-Reply-To: <Pine.BSF.4.10.10008081636150.87492-100000@beppo.feral.com>; from mjacob@feral.com on Tue, Aug 08, 2000 at 04:39:03PM -0700
Organization: LEMIS, PO Box 460, Echunga SA 5153, Australia
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-418-838-708
WWW-Home-Page: http://www.lemis.com/~grog
X-PGP-Fingerprint: 6B 7B C3 8C 61 CD 54 AF  13 24 52 F8 6D A4 95 EF
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Tuesday,  8 August 2000 at 16:39:03 -0700, Matt Jacob wrote:
>>
>> Indeed, it looks like he omitted the message that we most wanted to
>> see.
>>
>> Let me repeat here that I find this *very* irritating.  It happens,
>> for example, if I try to read a block which is too long.  There's no
>> way to know the length of a tape block in advance, so this is
>> relatively easy to get, particularly with DDS-4 drives, and it
>
> This error should not occur if you're in variable block mode. If you
> set the drive in fixed block mode and read a block that's too large,
> the tape driver cannot know where the tape heads are located. It's
> that simple.

It's not that simple.  If it happens in the middle of the tape, I need
to rewind the bloody thing to recover.  At least an fsf or bsf should
be sufficient.

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers




From owner-freebsd-scsi  Mon Aug 28 20:36:57 2000
Delivered-To: freebsd-scsi@freebsd.org
Received: from feral.com (feral.com [192.67.166.1])
	by hub.freebsd.org (Postfix) with ESMTP id 1B0BD37B423
	for <freebsd-scsi@FreeBSD.ORG>; Mon, 28 Aug 2000 20:36:55 -0700 (PDT)
Received: from beppo.feral.com (beppo [192.67.166.79])
	by feral.com (8.9.3/8.9.3) with ESMTP id UAA25441;
	Mon, 28 Aug 2000 20:36:45 -0700
Date: Mon, 28 Aug 2000 20:36:42 -0700 (PDT)
From: Matthew Jacob <mjacob@feral.com>
Reply-To: mjacob@feral.com
To: Greg Lehey <grog@lemis.com>
Cc: Sam <freep@thecity.sfsu.edu>, freebsd-scsi@FreeBSD.ORG
Subject: Re: "tape is now frozen"
In-Reply-To: <20000829130040.Q11422@wantadilla.lemis.com>
Message-ID: <Pine.BSF.4.21.0008282031330.36166-100000@beppo.feral.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org


On Tue, 29 Aug 2000, Greg Lehey wrote:

> On Tuesday,  8 August 2000 at 16:39:03 -0700, Matt Jacob wrote:
> >>
> >> Indeed, it looks like he omitted the message that we most wanted to
> >> see.
> >>
> >> Let me repeat here that I find this *very* irritating.  It happens,
> >> for example, if I try to read a block which is too long.  There's no
> >> way to know the length of a tape block in advance, so this is
> >> relatively easy to get, particularly with DDS-4 drives, and it
> >
> > This error should not occur if you're in variable block mode. If you
> > set the drive in fixed block mode and read a block that's too large,
> > the tape driver cannot know where the tape heads are located. It's
> > that simple.
> 
> It's not that simple.  If it happens in the middle of the tape, I need
> to rewind the bloody thing to recover.  At least an fsf or bsf should
> be sufficient.

So you space a filemark. Where specifically on the tape are you, given you've
lost knowledge of where you are? The whole point of rewind, eom or offline is
to bring the tape to a *known* place. Spacing one filemark is not sufficient.

I thought for a while about allowing the use of rdhpos/sethpos to allow
for unfreezing.

You're also, again, begging the question as to why it has occurred. It has
occurred because there was an I/O error, or you're not using the tape
correctly (fixed block mode and you don't issue the correct read size).

I keep on getting pushback on this. Perhaps an 'unfreeze' ioctl, call it
MTIOSUICIDE, is in order. Not.


-matt




From owner-freebsd-scsi  Mon Aug 28 20:47:43 2000
Delivered-To: freebsd-scsi@freebsd.org
Received: from wantadilla.lemis.com (wantadilla.lemis.com [192.109.197.80])
	by hub.freebsd.org (Postfix) with ESMTP id 2ED5A37B43C
	for <freebsd-scsi@FreeBSD.ORG>; Mon, 28 Aug 2000 20:47:39 -0700 (PDT)
Received: (from grog@localhost)
	by wantadilla.lemis.com (8.11.0/8.9.3) id e7T3lLm16220;
	Tue, 29 Aug 2000 13:17:21 +0930 (CST)
	(envelope-from grog)
Date: Tue, 29 Aug 2000 13:17:21 +0930
From: Greg Lehey <grog@lemis.com>
To: Matthew Jacob <mjacob@feral.com>
Cc: Sam <freep@thecity.sfsu.edu>, freebsd-scsi@FreeBSD.ORG
Subject: Re: "tape is now frozen"
Message-ID: <20000829131721.R11422@wantadilla.lemis.com>
References: <20000829130040.Q11422@wantadilla.lemis.com> <Pine.BSF.4.21.0008282031330.36166-100000@beppo.feral.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0i
In-Reply-To: <Pine.BSF.4.21.0008282031330.36166-100000@beppo.feral.com>; from mjacob@feral.com on Mon, Aug 28, 2000 at 08:36:42PM -0700
Organization: LEMIS, PO Box 460, Echunga SA 5153, Australia
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-418-838-708
WWW-Home-Page: http://www.lemis.com/~grog
X-PGP-Fingerprint: 6B 7B C3 8C 61 CD 54 AF  13 24 52 F8 6D A4 95 EF
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Monday, 28 August 2000 at 20:36:42 -0700, Matt Jacob wrote:
> On Tue, 29 Aug 2000, Greg Lehey wrote:
>> On Tuesday,  8 August 2000 at 16:39:03 -0700, Matt Jacob wrote:
>>>>
>>>> Indeed, it looks like he omitted the message that we most wanted to
>>>> see.
>>>>
>>>> Let me repeat here that I find this *very* irritating.  It happens,
>>>> for example, if I try to read a block which is too long.  There's no
>>>> way to know the length of a tape block in advance, so this is
>>>> relatively easy to get, particularly with DDS-4 drives, and it
>>>
>>> This error should not occur if you're in variable block mode. If you
>>> set the drive in fixed block mode and read a block that's too large,
>>> the tape driver cannot know where the tape heads are located. It's
>>> that simple.
>>
>> It's not that simple.  If it happens in the middle of the tape, I need
>> to rewind the bloody thing to recover.  At least an fsf or bsf should
>> be sufficient.
>
> So you space a filemark. Where specifically on the tape are you, given you've
> lost knowledge of where you are?

I've lost the *exact* position.  In fact, I'm not even sure I've lost
the exact position, but I'm prepared to concede that.  Within a block
or so I know where I am.

> The whole point of rewind, eom or offline is to bring the tape to a
> *known* place. Spacing one filemark is not sufficient.

Why not?  We know we're at the beginning (or end) of a file.

> I thought for a while about allowing the use of rdhpos/sethpos to allow
> for unfreezing.
>
> You're also, again, begging the question as to why it has occurred.

Well, no, I state it above.

> It has occurred because there was an I/O error, or you're not using
> the tape correctly (fixed block mode and you don't issue the correct
> read size).

Precisely.  But that's not a reason to penalize people by making them
wind from one end of the tape to the other.

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers




From owner-freebsd-scsi  Mon Aug 28 22:16:43 2000
Delivered-To: freebsd-scsi@freebsd.org
Received: from feral.com (feral.com [192.67.166.1])
	by hub.freebsd.org (Postfix) with ESMTP id A8DDA37B43C
	for <freebsd-scsi@FreeBSD.ORG>; Mon, 28 Aug 2000 22:16:40 -0700 (PDT)
Received: from beppo.feral.com (beppo [192.67.166.79])
	by feral.com (8.9.3/8.9.3) with ESMTP id WAA25626;
	Mon, 28 Aug 2000 22:15:44 -0700
Date: Mon, 28 Aug 2000 22:15:41 -0700 (PDT)
From: Matthew Jacob <mjacob@feral.com>
Reply-To: mjacob@feral.com
To: Greg Lehey <grog@lemis.com>
Cc: Sam <freep@thecity.sfsu.edu>, freebsd-scsi@FreeBSD.ORG
Subject: Re: "tape is now frozen"
In-Reply-To: <20000829131721.R11422@wantadilla.lemis.com>
Message-ID: <Pine.BSF.4.21.0008282211050.36166-100000@beppo.feral.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org


> 
> I've lost the *exact* position.  In fact, I'm not even sure I've lost
> the exact position, but I'm prepared to concede that.  Within a block
> or so I know where I am.
> 

You cannot know that.


> > The whole point of rewind, eom or offline is to bring the tape to a
> > *known* place. Spacing one filemark is not sufficient.
> 
> Why not?  We know we're at the beginning (or end) of a file.

I have fifty files of *almost* identical data. If I think I'm at file 43, but
I'm at file 42, that's wrong. If tape position has been lost, it's been lost.
It's more than a "little bit" pregnant.

It's just barely possible to concede that a (successful) report of hardware
block position could be construed as "setting location". But this also means
that all other notions of file location can not be known any more.

> Well, no, I state it above.
> 
> > It has occurred because there was an I/O error, or you're not using
> > the tape correctly (fixed block mode and you don't issue the correct
> > read size).
> 
> Precisely.  But that's not a reason to penalize people by making them
> wind from one end of the tape to the other.

Argh!!!!! I just plain don't agree, Greg. I'm sorry. If you don't know what
the blocksize is, switch to variable mode. Even then, what's the big deal?
You're not in the middle of the tape; you're at the front of the tape if you
don't know.

I really don't want to change this behaviour. It's wrong to allow people to do
things that would encourage data lossage.

-matt




From owner-freebsd-scsi  Mon Aug 28 22:22:40 2000
Delivered-To: freebsd-scsi@freebsd.org
Received: from wantadilla.lemis.com (wantadilla.lemis.com [192.109.197.80])
	by hub.freebsd.org (Postfix) with ESMTP id C8DC637B43E
	for <freebsd-scsi@FreeBSD.ORG>; Mon, 28 Aug 2000 22:22:35 -0700 (PDT)
Received: (from grog@localhost)
	by wantadilla.lemis.com (8.11.0/8.9.3) id e7T5MN828566;
	Tue, 29 Aug 2000 14:52:23 +0930 (CST)
	(envelope-from grog)
Date: Tue, 29 Aug 2000 14:52:23 +0930
From: Greg Lehey <grog@lemis.com>
To: Matthew Jacob <mjacob@feral.com>
Cc: Sam <freep@thecity.sfsu.edu>, freebsd-scsi@FreeBSD.ORG
Subject: Re: "tape is now frozen"
Message-ID: <20000829145223.Z11422@wantadilla.lemis.com>
References: <20000829131721.R11422@wantadilla.lemis.com> <Pine.BSF.4.21.0008282211050.36166-100000@beppo.feral.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0i
In-Reply-To: <Pine.BSF.4.21.0008282211050.36166-100000@beppo.feral.com>; from mjacob@feral.com on Mon, Aug 28, 2000 at 10:15:41PM -0700
Organization: LEMIS, PO Box 460, Echunga SA 5153, Australia
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-418-838-708
WWW-Home-Page: http://www.lemis.com/~grog
X-PGP-Fingerprint: 6B 7B C3 8C 61 CD 54 AF  13 24 52 F8 6D A4 95 EF
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Monday, 28 August 2000 at 22:15:41 -0700, Matt Jacob wrote:
>
>>
>> I've lost the *exact* position.  In fact, I'm not even sure I've lost
>> the exact position, but I'm prepared to concede that.  Within a block
>> or so I know where I am.
>
> You cannot know that.

Of course I can.  I've just typed:

  # mt fsf 3
  # tar tv
  Cannot read 65536 byte record.
  Tape is now frozen.  Jump over your left shoulder to recover.

(yes, these aren't the original messages, but I didn't want to go to
the trouble of finding a suitable tape and actually doing the
experiment).

At this point, I know that in all likelihood the tape has got to the
end of the record and is (logically) positioned on the IRG.  It may be
somewhere in the middle, but that doesn't matter.  If I do the
following:

  # mt bsf 1
  # mt fsf 1
  # tar tvb 128

it should work.
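
The claim above can be checked on a toy model. The sketch below treats a
tape as a plain list of records and filemarks and assumes the usual spacing
semantics (bsf stops on the BOT side of a filemark, fsf stops just past
it). All names here are invented for illustration; this is not how the sa
driver or any real drive tracks position.

```python
FM = "FM"  # marker for a filemark on the toy tape

def make_tape(n_files, records_per_file):
    """Build a list of records, with one filemark after each file."""
    tape = []
    for f in range(n_files):
        tape += [("rec", f, r) for r in range(records_per_file)]
        tape.append(FM)
    return tape

def fsf(tape, pos, count=1):
    """Space forward over `count` filemarks; stop just past the last one."""
    for _ in range(count):
        while pos < len(tape) and tape[pos] != FM:
            pos += 1
        if pos == len(tape):
            raise IOError("ran off the end of recorded media")
        pos += 1
    return pos

def bsf(tape, pos, count=1):
    """Space backward over `count` filemarks; stop on their BOT side."""
    for _ in range(count):
        pos -= 1
        while pos >= 0 and tape[pos] != FM:
            pos -= 1
        if pos < 0:
            raise IOError("hit beginning of tape")
    return pos

def recover(tape, pos):
    """The `mt bsf 1` followed by `mt fsf 1` sequence from the message."""
    return fsf(tape, bsf(tape, pos, 1), 1)

tape = make_tape(n_files=5, records_per_file=4)
start = fsf(tape, 0, 3)  # after `mt fsf 3`: start of the fourth file

# Head stopped anywhere inside the fourth file: recovery returns to its start.
same_file = [recover(tape, p) for p in range(start, start + 5)]
print(same_file)

# Head slipped past the next filemark: recovery lands in the fifth file instead.
print(recover(tape, start + 6))
```

The first case supports the bsf/fsf argument as long as the failed read
stayed inside the file. The second case shows why that assumption matters:
if the head crossed a filemark, the same two commands succeed but end up
at the start of a different file.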

>>> The whole point of rewind, eom or offline is to bring the tape to a
>>> *known* place. Spacing one filemark is not sufficient.
>>
>> Why not?  We know we're at the beginning (or end) of a file.
>
> I have fifty files of *almost* identical data. If I think I'm at
> file 43, but I'm at file 42, that's wrong. If tape position has been
> lost, it's been lost.  It's more than a "little bit" pregnant.

I don't see that in this case.  Is that the scenario you're talking
about?

> It's just barely possible to concede that a (successful) report of
> hardware block position could be construed as "setting
> location". But this also means that all other notions of file
> location can not be known any more.

I'm not even going to argue that one.  It seems to me that once you've
lost position within a file, you've lost position.  But you know
you're between two file marks.

>> Well, no, I state it above.
>>
>>> It has occurred because there was an I/O error, or you're not using
>>> the tape correctly (fixed block mode and you don't issue the correct
>>> read size).
>>
>> Precisely.  But that's not a reason to penalize people by making them
>> wind from one end of the tape to the other.
>
> Argh!!!!! I just plain don't agree, Greg. I'm sorry. If you don't know what
> the blocksize is, switch to variable mode. Even then, what's the big deal?
> You're not in the middle of the tape; you're at the front of the tape if you
> don't know.
>
> I really don't want to change this behaviour. It's wrong to allow
> people to do things that would encourage data lossage.

Do you still maintain this in the case of the example above?

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers




From owner-freebsd-scsi  Mon Aug 28 22:33: 4 2000
Delivered-To: freebsd-scsi@freebsd.org
Received: from panzer.kdm.org (panzer.kdm.org [216.160.178.169])
	by hub.freebsd.org (Postfix) with ESMTP id DD77037B422
	for <freebsd-SCSI@FreeBSD.ORG>; Mon, 28 Aug 2000 22:33:00 -0700 (PDT)
Received: (from ken@localhost)
	by panzer.kdm.org (8.9.3/8.9.1) id XAA36007;
	Mon, 28 Aug 2000 23:32:58 -0600 (MDT)
	(envelope-from ken)
Date: Mon, 28 Aug 2000 23:32:57 -0600
From: "Kenneth D. Merry" <ken@kdm.org>
To: David Gilbert <dgilbert@velocet.ca>
Cc: freebsd-SCSI@FreeBSD.ORG
Subject: Re: SCSI disconnect with quantum Atlas IV disks.
Message-ID: <20000828233257.A35815@panzer.kdm.org>
References: <14762.31347.647187.677745@trooper.velocet.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2i
In-Reply-To: <14762.31347.647187.677745@trooper.velocet.net>; from dgilbert@velocet.ca on Mon, Aug 28, 2000 at 10:42:59AM -0400
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Mon, Aug 28, 2000 at 10:42:59 -0400, David Gilbert wrote:
> OK... round two.  As I mentioned in my posting about RAID... I'm
> having trouble with my system disconnecting disks during intense usage 
> (like the nightly finds that run across the disk).

Hmm.

> The system is running two 29160 controllers (couldn't buy 2940's) with 
> LVD cables and terminators (all rated for 160 operation).  With 4.1,
> we're running at 40MHz (80 MB/s) ... which is fine, but I'm wondering
> if this could be causing part of the problem.
> 
> When might the 160 patch be MFC'd?

Justin would know, you may have to mail him directly for a guess.

> I've also fired off a query to Quantum to see if there are any firmware 
> updates... but a search of their knowledge base doesn't indicate any.
> 
> One strange note is that I've found that I have to turn on the drives
> slightly after turning on the computer lest they fail to reset
> properly.  I don't know what causes this, but changing controllers,
> cables and terminators hasn't helped (I also have a set of TechRAM
> 390F's that exhibit largely the same symptoms).

I kinda wonder if your enclosure is underpowered.  One thing that can cause
drives to behave strangely is if they aren't getting enough power.

When drives spin up, and when they're under high seek load, they use
a fair bit more power than they do when they're just spinning idle.

Are your drives alone on that power supply?  If so, you should be able to
look at the drive specs for peak current/power usage, and compare that with
what your power supply is spec'ed for.
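
The comparison described above is simple addition per supply rail. The
sketch below shows one way to do it; the current figures are placeholders
chosen for the example, not values from any drive or power supply
datasheet, and the function and key names are hypothetical.

```python
def power_headroom(drives, supply_12v_amps, supply_5v_amps):
    """Return (12 V headroom, 5 V headroom) in amps at peak drive load.

    A negative number means the drives can demand more than the supply
    is rated to deliver on that rail.
    """
    peak_12v = sum(d["peak_12v"] for d in drives)
    peak_5v = sum(d["peak_5v"] for d in drives)
    return supply_12v_amps - peak_12v, supply_5v_amps - peak_5v

# Placeholder numbers: four identical drives on one enclosure supply.
drives = [{"peak_12v": 2.0, "peak_5v": 1.0} for _ in range(4)]
headroom_12v, headroom_5v = power_headroom(drives, 6.0, 10.0)
print(headroom_12v, headroom_5v)
```

Use the peak (spin-up or heavy seek) figures from the drive specs rather
than the idle figures, since those are the cases where a marginal supply
would show up.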

> Would it be possible to code some really BIG bus resets and reprobes
> into the SCSI layer instead of disconnecting the drive --- maybe even
> some tunable parameter --- It would seem sensible that you could do
> with some gaps in performance once you've gotten to this level of
> problem.

The types of errors you're getting, timed out in data-out phase, indicate
that a signal is stuck on the SCSI bus.  It's not just stuck momentarily,
but has been stuck for 60 seconds.

I suppose that could be caused by power problems, although it is more often
a cabling and termination problem.

From your previous message, it sounds like you've looked over the cabling
and power a good bit.

This sort of problem, though, is most often a cabling issue.  If the signal
is getting stuck on the bus, there's not a whole lot we can do about it.

From your previous message, it looks like we may not be retrying things
after sending a bus reset.  It looks like Vinum bails out immediately after
the bus reset, which likely indicates that it got an I/O error of some
sort.

The error recovery code is supposed to unconditionally retry a command that
comes back up because a bus reset or a BDR was sent.  So, in theory, a bus
reset, in and of itself, shouldn't cause an error to get propagated back up
far enough that Vinum could detect it.  (Unless Vinum has some sort of
timeout mechanism, and reports that as a fatal I/O error.)
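
To make the expected behaviour concrete, here is a small sketch of the
policy described above: a command that comes back only because a bus reset
or BDR was sent is reissued, and only other failures are passed up. The
status strings, the function name and the attempt cap are assumptions of
this sketch (the cap just guarantees it terminates); it is not the actual
CAM error recovery code.

```python
RESET_STATUSES = {"BUS_RESET", "BDR_SENT"}  # hypothetical status names

def run_with_retry(issue, max_attempts=5):
    """Call issue() until it returns something other than a reset status.

    Returns (final_status, attempts_used).
    """
    status = None
    for attempt in range(1, max_attempts + 1):
        status = issue()
        if status not in RESET_STATUSES:
            return status, attempt  # success, or a genuine error to report
    return status, max_attempts     # still being reset after every attempt

replies = iter(["BUS_RESET", "BDR_SENT", "OK"])
print(run_with_retry(lambda: next(replies)))
```

Under this policy an upper layer such as Vinum would only see an error if
the command failed for some reason other than the reset itself, or if the
resets never stopped.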

Anyway, it looks a little strange to me, but I'm not sure what's going on.
I would suggest running this by Justin (directly), and see what he has to
say about it.

Ken
-- 
Kenneth Merry
ken@kdm.org




From owner-freebsd-scsi  Mon Aug 28 22:39:58 2000
Delivered-To: freebsd-scsi@freebsd.org
Received: from feral.com (feral.com [192.67.166.1])
	by hub.freebsd.org (Postfix) with ESMTP id 1CE3337B424
	for <freebsd-scsi@FreeBSD.ORG>; Mon, 28 Aug 2000 22:39:54 -0700 (PDT)
Received: from beppo.feral.com (beppo [192.67.166.79])
	by feral.com (8.9.3/8.9.3) with ESMTP id WAA25693;
	Mon, 28 Aug 2000 22:39:44 -0700
Date: Mon, 28 Aug 2000 22:39:41 -0700 (PDT)
From: Matthew Jacob <mjacob@feral.com>
Reply-To: mjacob@feral.com
To: Greg Lehey <grog@lemis.com>
Cc: Sam <freep@thecity.sfsu.edu>, freebsd-scsi@FreeBSD.ORG
Subject: Re: "tape is now frozen"
In-Reply-To: <20000829145223.Z11422@wantadilla.lemis.com>
Message-ID: <Pine.BSF.4.21.0008282223590.36166-100000@beppo.feral.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org


On Tue, 29 Aug 2000, Greg Lehey wrote:

> On Monday, 28 August 2000 at 22:15:41 -0700, Matt Jacob wrote:
> >
> >>
> >> I've lost the *exact* position.  In fact, I'm not even sure I've lost
> >> the exact position, but I'm prepared to concede that.  Within a block
> >> or so I know where I am.
> >
> > You cannot know that.
> 
> Of course I can.  I've just typed:
> 
>   # mt fsf 3
>   # tar tv
>   Cannot read 65536 byte record.
>   Tape is now frozen.  Jump over your left shoulder to recover.

But this is wrong. You should have typed, if you don't know the format of the
tape:

# mt rewind
# mt blocksize 0
# mt fsf 3
# tar tv

If *this* gives you the 'tape frozen' message, that's a bug. Let me know.

I *do* want to know what specific error you are running into here.

Is it because you're in the middle of the tape and you forgot the blocksize?
See below.

If it's because you wandered off the end of recorded media, I *think* I could
be convinced that that counts the same as an 'mt eom' unfreeze- but I'd have
to think long and hard about whether this works on *all* tapes. 

But I think I see what you're asking below.

> (yes, these aren't the original messages, but I didn't want to go to
> the trouble of finding a suitable tape and actually doing the
> experiment).
> 
> At this point, I know that in all likelihood the tape has got to the
> end of the record and is (logically) positioned on the IRG.  It may be
> somewhere in the middle, but that doesn't matter.  If I do the
> following:
> 
>   # mt bsf 1
>   # mt fsf 1
>   # tar tvb 128
> 
> it should work.

Okay. You're saying that if I've lost position that backing up over the
previous filemark and then skipping forward should get me back to where I was
at the failed operation.

In some cases this is true, but not in all cases!! It depends on what the
*actual* error that caused the failure was. That is, you're asking me to
believe that a failed read (because of block size error) does *not* in fact
skip multiple filemarks.

For all the tape drives we support would you care to bet your nuts (this is
not a joke- assume a FreeBSD system is being used to read radiologic image
data from a tape- a cancer screen for testicular cancer for *you personally*)
that this can't happen? An extreme case but......

EOM, BOT and OFFLINE are unequivocally knowable states.

Like above, I *might* be able to convince myself that a read of hardware
position after either an mt sethpos or an fsf/bsf that succeeds could be good
enough evidence that there's a closer known place to get back to than BOT. But
I'd think damn long and hard about it.

> I don't see that in this case.  Is that the scenario you're talking
> about?

If you don't know the position, you don't know the position. It's that simple.
If you're always augmenting position with human info, like a 'tar tvf', or
if, like NetWorker, you *never* assume you really know where you are on tape
when you get ready to read or write, that's fine.

But it's inappropriate for the driver to assume this type of behaviour.

> 
> > It's just barely possible to concede that a (successful) report of
> > hardware block position could be construed as "setting
> > location". But this also means that all other notions of file
> > location can not be known any more.
> 
> I'm not even going to argue that one.  It seems to me that once you've
> lost position within a file, you've lost position.  But you know
> you're between two file marks.

But which ones? It's not clear to me you can assume the failed read case
didn't skip over several. 

Probably not, but I sure wouldn't want to guess.

I still don't see why you can't remember to do 'mt blocksize 0'.

I'd like to remind you all that you can just comment out all

                softc->flags |= SA_FLAG_TAPE_FROZEN;

in scsi_sa.c if you don't like the behaviour. It's your power saw.
Watch those fingers!
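
The rule being defended in this thread can be written down as a tiny state
machine: once position is lost the tape is frozen, and only an operation
that ends at a known place (rewind, eom, offline) clears it. The sketch
below is a toy model of that rule as stated in these messages; the class
and method names are invented, and it is not taken from scsi_sa.c.

```python
class TapeState:
    """Toy model of the 'frozen' rule described in this thread."""

    # Operations that leave the tape at an unambiguous, known position.
    UNFREEZE_OPS = {"rewind", "eom", "offline"}

    def __init__(self):
        self.frozen = False

    def position_lost(self):
        """Record an error after which the head position is unknown."""
        self.frozen = True

    def request(self, op):
        """Return True if `op` is allowed in the current state."""
        if op in self.UNFREEZE_OPS:
            self.frozen = False
            return True
        return not self.frozen

tape = TapeState()
tape.position_lost()
print(tape.request("fsf"))     # refused while frozen
print(tape.request("rewind"))  # allowed, and clears the frozen state
print(tape.request("fsf"))     # allowed again
```

Commenting out the line that sets the flag, as suggested above, amounts to
making position_lost() a no-op: every request is then accepted whether or
not the position is known.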


-matt




From owner-freebsd-scsi  Mon Aug 28 22:58:45 2000
Delivered-To: freebsd-scsi@freebsd.org
Received: from wantadilla.lemis.com (wantadilla.lemis.com [192.109.197.80])
	by hub.freebsd.org (Postfix) with ESMTP id 418A837B43F
	for <freebsd-scsi@FreeBSD.ORG>; Mon, 28 Aug 2000 22:58:39 -0700 (PDT)
Received: (from grog@localhost)
	by wantadilla.lemis.com (8.11.0/8.9.3) id e7T5wNG28750;
	Tue, 29 Aug 2000 15:28:23 +0930 (CST)
	(envelope-from grog)
Date: Tue, 29 Aug 2000 15:28:23 +0930
From: Greg Lehey <grog@lemis.com>
To: Matthew Jacob <mjacob@feral.com>
Cc: Sam <freep@thecity.sfsu.edu>, freebsd-scsi@FreeBSD.ORG
Subject: Re: "tape is now frozen"
Message-ID: <20000829152823.D11422@wantadilla.lemis.com>
References: <20000829145223.Z11422@wantadilla.lemis.com> <Pine.BSF.4.21.0008282223590.36166-100000@beppo.feral.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0i
In-Reply-To: <Pine.BSF.4.21.0008282223590.36166-100000@beppo.feral.com>; from mjacob@feral.com on Mon, Aug 28, 2000 at 10:39:41PM -0700
Organization: LEMIS, PO Box 460, Echunga SA 5153, Australia
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-418-838-708
WWW-Home-Page: http://www.lemis.com/~grog
X-PGP-Fingerprint: 6B 7B C3 8C 61 CD 54 AF  13 24 52 F8 6D A4 95 EF
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Monday, 28 August 2000 at 22:39:41 -0700, Matt Jacob wrote:
>
>
> On Tue, 29 Aug 2000, Greg Lehey wrote:
>
>> On Monday, 28 August 2000 at 22:15:41 -0700, Matt Jacob wrote:
>>>
>>>>
>>>> I've lost the *exact* position.  In fact, I'm not even sure I've lost
>>>> the exact position, but I'm prepared to concede that.  Within a block
>>>> or so I know where I am.
>>>
>>> You cannot know that.
>>
>> Of course I can.  I've just typed:
>>
>>   # mt fsf 3
>>   # tar tv
>>   Cannot read 65536 byte record.
>>   Tape is now frozen.  Jump over your left shoulder to recover.
>
> But this is wrong. You should have typed, if you don't know the format of the
> tape:
>
> # mt rewind
> # mt blocksize 0
> # mt fsf 3
> # tar tv

Right, assuming I knew this in advance.

> If *this* gives you the 'tape frozen' message, that's a bug. Let me know.

I'm planning to try it, but I'm not expecting any problems there.

> I *do* want to know what specific error you are running into here.
>
> Is it because you're in the middle of the tape and you forgot the blocksize?

Yes.  Or because I didn't know it in the first place.

> See below.
>
> If it's because you wandered off the end of recorded media, I *think* I could
> be convinced that that counts the same as an 'mt eom' unfreeze- but I'd have
> to think long and hard about whether this works on *all* tapes.
>
> But I think I see what you're asking below.

Hmm.  If I've reached the end of the tape, yes, I suppose I'd want to
be able to carry on writing without first counting how many files I
had hit and rewinding.  But then I should be able to do an mt eom,
right?

>> (yes, these aren't the original messages, but I didn't want to go to
>> the trouble of finding a suitable tape and actually doing the
>> experiment).
>>
>> At this point, I know that in all likelihood the tape has got to the
>> end of the record and is (logically) positioned on the IRG.  It may be
>> somewhere in the middle, but that doesn't matter.  If I do the
>> following:
>>
>>   # mt bsf 1
>>   # mt fsf 1
>>   # tar tvb 128
>>
>> it should work.
>
> Okay. You're saying that if I've lost position that backing up over the
> previous filemark and then skipping forward should get me back to where I was
> at the failed operation.
>
> In some cases this is true- but not in all cases!! It depends on what the
> *actual* error that causes the failure is. That is- you're asking me to
> believe that a failed read (because of block size error) does *not* in fact
> skip multiple filemarks.

I'd really like to see a case where that could happen.  If the read
length is too long, we just get a short read back, right?  And if it's
too short, the furthest you can hope for is to end up at the end of
the block anyway.

> For all the tape drives we support would you care to bet your nuts (this is
> not a joke- assume a FreeBSD system is being used to read radiologic image
> data from a tape- a cancer screen for testicular cancer for *you personally*)
> that this can't happen? An extreme case but......
>
> EOM, BOT and OFFLINE are unequivocally knowable states.
>
> Like above- I *might* be able to convince myself that a read of
> hardware position after either an mt sethpos or an fsf/bsf that
> succeeds could be good enough evidence that there's a closer known
> place to get back to than BOT. But I'd think damn long and hard
> about it.

I think it's worth a good think.  And yes, I know we have a lot of
junk tapes out there, on which I wouldn't bet my nuts with or without
this feature.

>> I don't see that in this case.  Is that the scenario you're talking
>> about?
>
> If you don't know the position, you don't know the position. It's
> that simple.

Nope, you can say "I don't know the exact position, but I know it's
somewhere between here and there".

It looks to me like the real issue is understanding how far off we can
be after a tape error.  I can't see any way to end up past a file
mark, at least not on the kind of error we're seeing here.

>>> It's just barely possible to concede that a (successful) report of
>>> hardware block position could be construed as "setting
>>> location". But this also means that all other notions of file
>>> location can not be known any more.
>>
>> I'm not even going to argue that one.  It seems to me that once you've
>> lost position within a file, you've lost position.  But you know
>> you're between two file marks.
>
> But which ones? It's not clear to me you can assume the failed read case
> didn't skip over several.
>
> Probably not, but I sure wouldn't want to guess.
>
> I still don't see why you can't remember to do 'mt blocksize 0'.

Because I'm stupid.  

Seriously, though, I'm representing all the people out there who don't
think about this every time (and I certainly do that often enough
myself).

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers




From owner-freebsd-scsi  Mon Aug 28 23: 7:10 2000
Delivered-To: freebsd-scsi@freebsd.org
Received: from feral.com (feral.com [192.67.166.1])
	by hub.freebsd.org (Postfix) with ESMTP id 8C0CC37B422
	for <freebsd-scsi@FreeBSD.ORG>; Mon, 28 Aug 2000 23:07:07 -0700 (PDT)
Received: from beppo.feral.com (beppo [192.67.166.79])
	by feral.com (8.9.3/8.9.3) with ESMTP id XAA25790;
	Mon, 28 Aug 2000 23:06:59 -0700
Date: Mon, 28 Aug 2000 23:06:56 -0700 (PDT)
From: Matthew Jacob <mjacob@feral.com>
Reply-To: mjacob@feral.com
To: Greg Lehey <grog@lemis.com>
Cc: Sam <freep@thecity.sfsu.edu>, freebsd-scsi@FreeBSD.ORG
Subject: Re: "tape is now frozen"
In-Reply-To: <20000829152823.D11422@wantadilla.lemis.com>
Message-ID: <Pine.BSF.4.21.0008282259480.36166-100000@beppo.feral.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

> > # mt rewind
> > # mt blocksize 0
> > # mt fsf 3
> > # tar tv
> 
> Right, assuming I knew this in advance.
> 
> > If *this* gives you the 'tape frozen' message, that's a bug. Let me know.
> 
> I'm planning to try it, but I'm not expecting any problems there.
> 
> > I *do* want to know what specific error you are running into here.
> >
> > Is it because you're in the middle of the tape and you forgot the blocksize?
> 
> Yes.  Or because I didn't know it in the first place.

But what's wrong with always defaulting to blocksize 0 (variable)? Except
for QIC tapes, you then can always not give a hoot as to what the
actual blocksize is.
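
As a small illustration of that habit, the sequence quoted above can be
written down as data.  This is a sketch only: the function name is made up,
nothing is executed, and device selection (for example through the TAPE
environment variable) is deliberately left out because the thread does not
specify it.

```python
def variable_block_listing(file_index):
    """Return the command sequence as argv lists; nothing is executed."""
    if file_index < 0:
        raise ValueError("file_index must be non-negative")
    cmds = [
        ["mt", "rewind"],          # start from BOT, a known position
        ["mt", "blocksize", "0"],  # 0 selects variable block mode
    ]
    if file_index > 0:
        # Skip forward over that many filemarks.
        cmds.append(["mt", "fsf", str(file_index)])
    cmds.append(["tar", "tv"])     # list the archive found at that file
    return cmds


for argv in variable_block_listing(3):
    print(" ".join(argv))
```

Each list could be handed to subprocess.run() on a machine with a drive
attached; printing them first makes it easy to check the order.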

> > But I think I see what you're asking below.
> 
> Hmm.  If I've reached the end of the tape, yes, I suppose I'd want to
> be able to carry on writing without first counting how many files I
> had hit and rewinding.  But then I should be able to do an mt eom,
> right?

That's correct. 

> ..
> I'd really like to see a case where that could happen.  If the read
> length is too long, we just get a short read back, right?  And if it's
> too short, the furthest you can hope for is to end up at the end of
> the block anyway.

If all tape drives were like a 1/2" 9 track, yes, I'd buy this.
But what we typically have is a tremendous amount of emulation
occurring- and it's this emulation I'm mistrustful of.
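
To make the disagreement concrete, here is the idealised behaviour Greg
describes, written as a small Python model.  It is a sketch under exactly
that assumption (a drive that behaves like a 1/2" 9 track): the tape layout,
the function name and the return convention are all invented for the
example, and it says nothing about what an emulating drive actually does.

```python
FILEMARK = "FM"


def ideal_read(tape, pos, want):
    """Read one record in variable block mode on an idealised drive.

    tape is a list of record lengths and FILEMARK entries, pos is the
    index of the next item.  Returns (bytes_returned, new_pos, ok).
    The position advances by exactly one item, so in this model a
    failed read can never cross a filemark.
    """
    item = tape[pos]
    if item == FILEMARK:
        return 0, pos + 1, True      # filemark: zero-length read
    if want >= item:
        return item, pos + 1, True   # request too long: short read
    return 0, pos + 1, False         # request too short: error, at end of block


tape = [512, 65536, FILEMARK, 10240]
print(ideal_read(tape, 1, 131072))  # (65536, 2, True)
print(ideal_read(tape, 1, 512))     # (0, 2, False)
```

In both cases the new position is 2, just before the filemark.  The open
question in this thread is whether real drives can be trusted to follow
this model after an error.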

> ..
> > If you don't know the position, you don't know the position. It's
> > that simple.
> 
> Nope, you can say "I don't know the exact position, but I know it's
> somewhere between here and there".
> 
> It looks to me like the real issue is understanding how far off we can
> be after a tape error.  I can't see any way to end up past a file
> mark, at least not on the kind of error we're seeing here.

Like I said- if I could be sure of this, I'd buy the steps you proposed.

I still think that the 'read hardware position after an fsf or a bsf'
as a place to 'return to' might be acceptable.

> Seriously, though, I'm representing all the people out there who don't
> think about this every time (and I certainly do that often enough
> myself).

Hmm. Well, it doesn't happen to me. 

I'm about to pop the hood on the tape driver anyway again- Sam's helped
me find a couple of breakages anyway. Guess it's time to sort this
out once and for all.

-matt




From owner-freebsd-scsi  Tue Aug 29  8:10:30 2000
Delivered-To: freebsd-scsi@freebsd.org
Received: from sabre.velocet.net (sabre.velocet.net [198.96.118.66])
	by hub.freebsd.org (Postfix) with ESMTP id 1D97537B422
	for <freebsd-SCSI@FreeBSD.ORG>; Tue, 29 Aug 2000 08:10:28 -0700 (PDT)
Received: from office.tor.velocet.net (trooper.velocet.net [216.126.82.226])
	by sabre.velocet.net (Postfix) with ESMTP
	id A3E19137F0E; Tue, 29 Aug 2000 11:10:21 -0400 (EDT)
Received: (from dgilbert@localhost)
	by office.tor.velocet.net (8.9.3/8.9.3) id LAA08697;
	Tue, 29 Aug 2000 11:10:20 -0400 (EDT)
	(envelope-from dgilbert)
From: David Gilbert <dgilbert@velocet.ca>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <14763.53851.632330.327801@trooper.velocet.net>
Date: Tue, 29 Aug 2000 11:10:19 -0400 (EDT)
To: "Kenneth D. Merry" <ken@kdm.org>
Cc: David Gilbert <dgilbert@velocet.ca>, freebsd-SCSI@FreeBSD.ORG
Subject: Re: SCSI disconnect with quantum Atlas IV disks.
In-Reply-To: <20000828233257.A35815@panzer.kdm.org>
References: <14762.31347.647187.677745@trooper.velocet.net>
	<20000828233257.A35815@panzer.kdm.org>
X-Mailer: VM 6.75 under 20.4 "Emerald" XEmacs  Lucid
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

>>>>> "Kenneth" == Kenneth D Merry <ken@kdm.org> writes:

Kenneth> I kinda wonder if your enclosure is underpowered.  One thing
Kenneth> that can cause drives to behave strangely is if they aren't
Kenneth> getting enough power.

Kenneth> When drives spin up, and when they're under high seek load,
Kenneth> they use a fair bit more power than they do when they're just
Kenneth> spinning idle.

Kenneth> Are your drives alone on that power supply?  If so, you
Kenneth> should be able to look at the drive specs for peak
Kenneth> current/power usage, and compare that with what your power
Kenneth> supply is spec'ed for.

The case is designed for 8 drives.  There are two (redundant) power
supplies and there are 8 drive power connectors supplied (along with
two motherboard connectors that are unused).  These are 18G Atlas IV's 
which are 7200 RPM drives (not even the fastest out there).

Kenneth> The types of errors you're getting, timed out in data-out
Kenneth> phase, indicate that a signal is stuck on the SCSI bus.  It's
Kenneth> not just stuck momentarily, but has been stuck for 60
Kenneth> seconds.

Kenneth> I suppose that could be caused by power problems, although it
Kenneth> is more often a cabling and termination problem.

Well... The cables are the ones that come with the TekRAM 390F.
They're twisted in a funny way and they have 6 connectors (4 drives,
card and terminator).  The terminator is physically in the last
connector and the controller is in the first.

Dave.

-- 
============================================================================
|David Gilbert, Velocet Communications.       | Two things can only be     |
|Mail:       dgilbert@velocet.net             |  equal if and only if they |
|http://www.velocet.net/~dgilbert             |   are precisely opposite.  |
=========================================================GLO================




From owner-freebsd-scsi  Tue Aug 29  8:14:41 2000
Delivered-To: freebsd-scsi@freebsd.org
Received: from panzer.kdm.org (panzer.kdm.org [216.160.178.169])
	by hub.freebsd.org (Postfix) with ESMTP id 4F9F137B423
	for <freebsd-SCSI@FreeBSD.ORG>; Tue, 29 Aug 2000 08:14:38 -0700 (PDT)
Received: (from ken@localhost)
	by panzer.kdm.org (8.9.3/8.9.1) id JAA38534;
	Tue, 29 Aug 2000 09:14:36 -0600 (MDT)
	(envelope-from ken)
Date: Tue, 29 Aug 2000 09:14:36 -0600
From: "Kenneth D. Merry" <ken@kdm.org>
To: David Gilbert <dgilbert@velocet.ca>
Cc: freebsd-SCSI@FreeBSD.ORG
Subject: Re: SCSI disconnect with quantum Atlas IV disks.
Message-ID: <20000829091436.A38504@panzer.kdm.org>
References: <14762.31347.647187.677745@trooper.velocet.net> <20000828233257.A35815@panzer.kdm.org> <14763.53851.632330.327801@trooper.velocet.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2i
In-Reply-To: <14763.53851.632330.327801@trooper.velocet.net>; from dgilbert@velocet.ca on Tue, Aug 29, 2000 at 11:10:19AM -0400
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Tue, Aug 29, 2000 at 11:10:19 -0400, David Gilbert wrote:
> >>>>> "Kenneth" == Kenneth D Merry <ken@kdm.org> writes:
> 
> Kenneth> I kinda wonder if your enclosure is underpowered.  One thing
> Kenneth> that can cause drives to behave strangely is if they aren't
> Kenneth> getting enough power.
> 
> Kenneth> When drives spin up, and when they're under high seek load,
> Kenneth> they use a fair bit more power than they do when they're just
> Kenneth> spinning idle.
> 
> Kenneth> Are your drives alone on that power supply?  If so, you
> Kenneth> should be able to look at the drive specs for peak
> Kenneth> current/power usage, and compare that with what your power
> Kenneth> supply is spec'ed for.
> 
> The case is designed for 8 drives.  There are two (redundant) power
> supplies and there are 8 drive power connectors supplied (along with
> two motherboard connectors that are unused).  These are 18G Atlas IV's 
> which are 7200 RPM drives (not even the fastest out there).

Just to be sure, you should check the power rating on one of the power
supplies against the peak power usage for the drives.  (Quantum should have
specs on their web site somewhere.)
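
The comparison is simple arithmetic.  The sketch below shows the shape of
it in Python; the function names are made up, and the current and wattage
figures are placeholders, not Atlas IV or power supply specifications.

```python
def peak_load_watts(n_drives, peak_amps_12v, peak_amps_5v):
    """Worst-case draw if every drive hits its peak at the same moment."""
    per_drive = 12.0 * peak_amps_12v + 5.0 * peak_amps_5v
    return n_drives * per_drive


def supply_is_adequate(n_drives, peak_amps_12v, peak_amps_5v, supply_watts):
    """True if the summed peak load fits within the supply's rating."""
    return peak_load_watts(n_drives, peak_amps_12v, peak_amps_5v) <= supply_watts


# Placeholder figures only; substitute the numbers from the drive data
# sheet and the label on the power supply.
load = peak_load_watts(4, 2.0, 1.0)
print(load, supply_is_adequate(4, 2.0, 1.0, 150.0))  # 116.0 True
```

A total-wattage check is the coarse version.  Supplies are also rated per
rail, so the 12V peak current during spin-up is worth comparing against the
12V rating on its own.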

> Kenneth> The types of errors you're getting, timed out in data-out
> Kenneth> phase, indicate that a signal is stuck on the SCSI bus.  It's
> Kenneth> not just stuck momentarily, but has been stuck for 60
> Kenneth> seconds.
> 
> Kenneth> I suppose that could be caused by power problems, although it
> Kenneth> is more often a cabling and termination problem.
> 
> Well... The cables are the ones that come with the TekRAM 390F.
> They're twisted in a funny way and they have 6 connectors (4 drives,
> card and terminator).  The terminator is physically in the last
> connector and the controller is in the first.

LVD cables have been known to show up with bent or loose pins, which could
certainly cause problems.  Sometimes they're a little more fragile than
normal Ultra-Wide or narrow SCSI cables.

Ken
-- 
Kenneth Merry
ken@kdm.org




From owner-freebsd-scsi  Thu Aug 31 22:12:52 2000
Delivered-To: freebsd-scsi@freebsd.org
Received: from mail-03-real.cdsnet.net (mail-03-real.cdsnet.net [204.118.244.93])
	by hub.freebsd.org (Postfix) with SMTP id 9549137B424
	for <freebsd-scsi@FreeBSD.ORG>; Thu, 31 Aug 2000 22:12:50 -0700 (PDT)
Received: (qmail 36874 invoked from network); 1 Sep 2000 05:12:49 -0000
Received: from schizo.cdsnet.net (204.118.244.32)
  by mail-03-real.cdsnet.net with SMTP; 1 Sep 2000 05:12:49 -0000
Date: Thu, 31 Aug 2000 22:12:32 -0700 (PDT)
From: Jaye Mathisen <mrcpu@internetcds.com>
X-Sender: mrcpu@schizo.cdsnet.net
To: John Lengeling <johnl@raccoon.com>
Cc: freebsd-scsi@FreeBSD.ORG
Subject: Re: Different init speeds of raid5 plex subdisks under vinum
In-Reply-To: <39A21314.1107B6AD@raccoon.com>
Message-ID: <Pine.BSF.4.21.0008312211230.12837-100000@schizo.cdsnet.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org


I reported this when using ATA drives a couple months ago.

It is trivially reproducible.  I wasn't even using vinum, it can be
reproduced with a script that starts up like 8 parallel newfs's on really
big drives.

The consensus was it was the scheduler, but nobody seemed too worried about
it.

On Tue, 22 Aug 2000, John Lengeling wrote:

> I created a raid5 plex under vinum using 3 drives.  These are supposed to
> be identical drives.  They are slightly different in size.
> 
> da2 at ahc0 bus 0 target 2 lun 0
> da2: <IBM DNES-318350W SA30> Fixed Direct Access SCSI-3 device
> da2: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
> da2: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
> da1 at ahc0 bus 0 target 1 lun 0
> da1: <IBM DNES-318350W SA60> Fixed Direct Access SCSI-3 device
> da1: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
> da1: 17366MB (35566501 512 byte sectors: 255H 63S/T 2213C)
> da3 at ahc0 bus 0 target 3 lun 0
> da3: <IBM DNES-318350W SA60> Fixed Direct Access SCSI-3 device
> da3: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
> da3: 17366MB (35566501 512 byte sectors: 255H 63S/T 2213C)
> 
> When I ran the vinum init command to initialize the plex, one of the
> drives da2 completed the initialization 3-4 times faster than da1 and
> da3.  Something like 20 minutes for da2 versus 60 minutes for da1/da3.
> Both da1 and da3 initialized at the same rate.  Is this weird?  Did
> da3/da1 get transfers negotiated down to slower speeds?  Bad cabling?
> 
> Or is this normal?
> 
> johnl




From owner-freebsd-scsi  Fri Sep  1  1:40:44 2000
Delivered-To: freebsd-scsi@freebsd.org
Received: from hotmail.com (f210.pav1.hotmail.com [64.4.31.210])
	by hub.freebsd.org (Postfix) with ESMTP id BCB8637B422
	for <freebsd-scsi@FreeBSD.org>; Fri,  1 Sep 2000 01:40:42 -0700 (PDT)
Received: from mail pickup service by hotmail.com with Microsoft SMTPSVC;
	 Fri, 1 Sep 2000 01:40:42 -0700
Received: from 61.129.184.74 by pv1fd.pav1.hotmail.msn.com with HTTP;	Fri, 01 Sep 2000 08:40:42 GMT
X-Originating-IP: [61.129.184.74]
From: "bsdnewbie bsdnewbie" <bsdnewbie@hotmail.com>
To: freebsd-scsi@FreeBSD.org
Subject: _Debug problem
Date: Fri, 01 Sep 2000 16:40:42 CST
Mime-Version: 1.0
Content-Type: text/plain; format=flowed
Message-ID: <F210G10Gai4omRXf2LT000002a4@hotmail.com>
X-OriginalArrivalTime: 01 Sep 2000 08:40:42.0586 (UTC) FILETIME=[50408BA0:01C013F0]
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

To debug my SCSI driver, I added one line to my source code:
"#define CAMDEBUG"
and used the CAM_DEBUG macro to print messages.

However, when I try to load the module (I am writing a loadable module),
the system reports:
kernel: link_elf: symbol cam_dflags undefined.
can't load xxx: Exec format error

So I added another line to include the file that defines "cam_dflags",
as follows:
"#include <cam/cam_xpt.c>"

Then the nightmare begins: when execution reaches the function
"xpt_bus_register", the system crashes and reboots.

When I remove the line "#define CAMDEBUG" but keep the line "#include
<cam/cam_xpt.c>", it crashes the same way.  Why would including a .c file
make my program crash?  I don't think the execution path has changed.





From owner-freebsd-scsi  Fri Sep  1 13:34:54 2000
Delivered-To: freebsd-scsi@freebsd.org
Received: from mass.osd.bsdi.com (adsl-63-202-177-115.dsl.snfc21.pacbell.net [63.202.177.115])
	by hub.freebsd.org (Postfix) with ESMTP id 8DB6237B423
	for <freebsd-scsi@FreeBSD.org>; Fri,  1 Sep 2000 13:34:52 -0700 (PDT)
Received: from mass.osd.bsdi.com (localhost [127.0.0.1])
	by mass.osd.bsdi.com (8.9.3/8.9.3) with ESMTP id NAA05962;
	Fri, 1 Sep 2000 13:49:14 -0700 (PDT)
	(envelope-from msmith@mass.osd.bsdi.com)
Message-Id: <200009012049.NAA05962@mass.osd.bsdi.com>
X-Mailer: exmh version 2.1.1 10/15/1999
To: "bsdnewbie bsdnewbie" <bsdnewbie@hotmail.com>
Cc: freebsd-scsi@FreeBSD.org
Subject: Re: _Debug problem 
In-reply-to: Your message of "Fri, 01 Sep 2000 16:40:42 CST."
             <F210G10Gai4omRXf2LT000002a4@hotmail.com> 
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Fri, 01 Sep 2000 13:49:14 -0700
From: Mike Smith <msmith@freebsd.org>
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

> To debug my SCSI driver, I added one line to my source code:
> "#define CAMDEBUG"
> and used the CAM_DEBUG macro to print messages.
>
> However, when I try to load the module (I am writing a loadable module),
> the system reports:
> kernel: link_elf: symbol cam_dflags undefined.
> can't load xxx: Exec format error

You need to build a kernel with CAMDEBUG defined in your kernel config.
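
In the kernel configuration file that comes down to one line.  This is a
minimal sketch, assuming the usual custom kernel config workflow; the rest
of the file stays as it is.

```
# In the kernel configuration file passed to config(8)
options 	CAMDEBUG	# compile CAM debugging support into the kernel
```

With the kernel rebuilt and booted, cam_dflags exists in the running kernel,
so a module that is itself compiled with CAMDEBUG defined has something to
link against.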
 
> So I added another line to include the file that defines "cam_dflags",
> as follows:
> "#include <cam/cam_xpt.c>"

Don't do that.  That file is part of the kernel, and is not meant to be 
included in anything else.

-- 
... every activity meets with opposition, everyone who acts has his
rivals and unfortunately opponents also.  But not because people want
to be opponents, rather because the tasks and relationships force
people to take different points of view.  [Dr. Fritz Todt]

