Date:      Sun, 31 May 1998 08:09:26 -0700 (PDT)
From:      Mike Burgett <mburgett@awen.com>
To:        freebsd-scsi@FreeBSD.ORG
Subject:   wide scsi woes...
Message-ID:  <XFMail.980531080926.mburgett@awen.com>

I'm trying to bring up a new machine, and having a world of grief.  I really
don't think this is a FreeBSD problem, but hope maybe someone here has dealt
with this before.  The only system on the machine is FreeBSD, so testing with
other systems would be problematic.  I hope this is the right place to ask
these questions...

I've been bringing up a new machine, a P6 with an Adaptec 2940UW controller,
with two drives on it.  Drive 0 is an IBM DCAS 32160W, drive 1 is a DCAS
34330W.  I'm using the cable that came with the controller currently, but I
have tried another cable.

I get the same error running 2.2.6-RELEASE or the 980523-SNAP of current
(generic and custom kernels on both), but I've mainly been working with
current, since that's what I intend to run on this machine, and more
importantly right now, current recovers from the error without a hard reset. :)

First, here are my probe messages:



Copyright (c) 1992-1998 FreeBSD Inc.
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California. All rights reserved.
FreeBSD 3.0-980523-SNAP #0: Sat May 23 10:51:06 GMT 1998
    root@make.ican.net:/usr/src/sys/compile/GENERIC
Timecounter "i8254"  frequency 1193182 Hz  cost 3329 ns
Timecounter "TSC"  frequency 199432992 Hz  cost 216 ns
CPU: Pentium Pro (199.43-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x617  Stepping=7
  Features=0xfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV>
real memory  = 134217728 (131072K bytes)
avail memory = 127688704 (124696K bytes)
Probing for devices on PCI bus 0:
Correcting Natoma config for non-SMP
chip0: <Intel 82440FX (Natoma) PCI and memory controller> rev 0x02 on pci0.0.0
chip1: <Intel 82371SB PCI to ISA bridge> rev 0x01 on pci0.1.0
ide_pci0: <Intel PIIX3 Bus-master IDE controller> rev 0x00 on pci0.1.1
vga0: <Matrox MGA 2064W graphics accelerator> rev 0x01 int a irq 0 on pci0.9.0
ncr0: <ncr 53c810a fast10 scsi> rev 0x12 int a irq 9 on pci0.10.0
ncr0: waiting for scsi devices to settle
scbus0 at ncr0 bus 0
st0 at scbus0 target 5 lun 0
st0: <HP C1533A 9401> type 1 removable SCSI 2
st0: Sequential-Access 
st0: 10.0 MB/s (100 ns, offset 8)
density code 0x13, variable blocks, write-enabled
cd0 at scbus0 target 6 lun 0
cd0: <MATSHITA CD-ROM CR-8004A 2.0a> type 5 removable SCSI 2
cd0: CD-ROM 
cd0: asynchronous.
can't get the size
fxp0: <Intel EtherExpress Pro 10/100B Ethernet> rev 0x04 int a irq 10 on pci0.11.0
fxp0: Ethernet address 00:a0:c9:b7:e3:87
ahc0: <Adaptec 2940 Ultra SCSI host adapter> rev 0x01 int a irq 11 on pci0.12.0
ahc0: aic7880 Wide Channel, SCSI Id=7, 16 SCBs
ahc0: waiting for scsi devices to settle
scbus1 at ahc0 bus 0
sd0 at scbus1 target 0 lun 0
sd0: <IBM DCAS-32160W S65A> type 0 fixed SCSI 2
sd0: Direct-Access 2063MB (4226725 512 byte sectors)
Sending SDTR!!
sd1 at scbus1 target 1 lun 0
sd1: <IBM DCAS-34330W S65A> type 0 fixed SCSI 2
sd1: Direct-Access 4134MB (8467200 512 byte sectors)
[isa probing deleted .... ]

Then, after logging in, I can generate the error below at will by running
something like bonnie, or a large make (world, buildworld):

sd1: SCB 0x1 - timed out in dataout phase, SCSISIGI == 0xe6
SEQADDR = 0x129 SCSISEQ = 0x12 SSTAT0 = 0x2 SSTAT1 = 0x13
sd1: abort message in message buffer
sd1: SCB 0x0 timedout while recovery in progress
sd0: SCB 0x2 timedout while recovery in progress
sd1: SCB 0x1 - timed out in dataout phase, SCSISIGI == 0xf6
SEQADDR = 0x129 SCSISEQ = 0x12 SSTAT0 = 0x2 SSTAT1 = 0x13
sd1: no longer in timeout
ahc0: Issued Channel A Bus Reset. 4 SCBs aborted
sd1: SCB 0x0 - timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0
SEQADDR = 0x15f SCSISEQ = 0x12 SSTAT0 = 0x2 SSTAT1 = 0x0
sd1: Queueing an Abort SCB
sd1: SCB 0x1 timedout while recovery in progress
sd0: SCB 0x2 timedout while recovery in progress
sd0: SCB 0x3 timedout while recovery in progress
sd1: SCB 0x0 - timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0
SEQADDR = 0x15f SCSISEQ = 0x12 SSTAT0 = 0x2 SSTAT1 = 0x0
sd1: no longer in timeout
ahc0: Issued Channel A Bus Reset. 4 SCBs aborted
sd0: UNIT ATTENTION asc:29,0
sd0:  Power on, reset, or bus device reset occurred
, retries:2
Sending SDTR!!


The last line is related to how I have drive 1 jumpered currently, so that it
doesn't generate unit attention, and initiates wide negotiation after a reset.
I've tried it without those jumpers as well, and the errors still occur.
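
For anyone reading the timeout lines above, the SCSISIGI byte can be split
into individual bus signals.  The sketch below assumes the usual aic7xxx bit
layout (CD at 0x80 down to ACK at 0x01).  That layout and the helper name
decode_scsisigi are assumptions for illustration, not copied from the driver
source, so check them against the ahc register definitions before relying on
the result.

```python
# Assumed SCSISIGI bit layout (verify against the aic7xxx register
# definitions; this table is an illustration, not driver source).
SCSISIGI_BITS = [
    (0x80, "CD"),
    (0x40, "IO"),
    (0x20, "MSG"),
    (0x10, "ATN"),
    (0x08, "SEL"),
    (0x04, "BSY"),
    (0x02, "REQ"),
    (0x01, "ACK"),
]


def decode_scsisigi(value):
    """Return the names of the signals whose bits are set in value."""
    return [name for mask, name in SCSISIGI_BITS if value & mask]


if __name__ == "__main__":
    # The three SCSISIGI values that appear in the log above.
    for raw in (0xE6, 0xF6, 0x00):
        print(hex(raw), decode_scsisigi(raw))
```

Under that assumed layout, the only difference between the 0xe6 and 0xf6
readings is the ATN bit.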

What I've tried:

Different cables.
Each drive in turn as the end of the chain with termination enabled.
Unit Atten on POR disabled
Initiate Sync/Wide negotiation on reset enabled.
Manually setting HBA termination to 'ON/ON' as per manual. (HBA BIOS setup)
  (there is nothing hooked to the external, or internal 50 pin connectors)
Limiting speed to 10 MHz (HBA BIOS setup)
Different kernels (2.2.6-Rel, 980523-SNAP) generic and custom.

This is a first experience for me on several fronts:

FreeBSD/2940 combination
Ultra-wide SCSI with any OS
FreeBSD/P6 combination

I'm seeing this error on both drives (both new), so I don't think it's a
problem with the drives themselves.  It 'feels' like a termination problem,
though I've tried each of the drives in the last position with termination
enabled.  I suppose it could be an HBA problem, but it's also new, and this
seems like an odd failure mode.
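
To check that the timeouts really do hit both drives, the console messages
can be tallied per device.  This is only a rough sketch: the regular
expression is written against the two spellings seen in the log above
("timed out" and "timedout"), and the names tally_timeouts and SAMPLE_LOG are
made up for the example.

```python
import re
from collections import Counter

# Matches both spellings the driver prints: "timed out" and "timedout".
TIMEOUT_RE = re.compile(r"^(sd\d+): SCB 0x[0-9a-f]+ (?:- )?timed ?out\b")


def tally_timeouts(log_text):
    """Count timeout messages per device name (e.g. 'sd0', 'sd1')."""
    counts = Counter()
    for line in log_text.splitlines():
        match = TIMEOUT_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# A short excerpt of the error output quoted earlier in this message.
SAMPLE_LOG = """\
sd1: SCB 0x1 - timed out in dataout phase, SCSISIGI == 0xe6
sd1: abort message in message buffer
sd1: SCB 0x0 timedout while recovery in progress
sd0: SCB 0x2 timedout while recovery in progress
sd1: SCB 0x1 - timed out in dataout phase, SCSISIGI == 0xf6
sd1: no longer in timeout
ahc0: Issued Channel A Bus Reset. 4 SCBs aborted
"""

if __name__ == "__main__":
    for device, count in sorted(tally_timeouts(SAMPLE_LOG).items()):
        print(device, count)
```

Feeding it a saved copy of the console messages gives a per-drive count, which
makes it easier to compare runs after changing cables or termination.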

Any suggestions would be helpful; I'm kind of at the end of my interrupt chain
here.

Thanks,
Mike


