Date:      Mon, 5 Feb 2001 22:26:49 -0800
From:      "3Phase" <Phase3@worldnet.att.net>
To:        "Mark Ibell" <marki@paradise.net.nz>
Cc:        <freebsd-questions@FreeBSD.ORG>
Subject:   Re: SCSI parity error
Message-ID:  <04d601c09006$05377d20$4fa0480c@sisyphus2>
References:  <004301c08ff0$96e0c5d0$0101a8c0@evileye>


----- Original Message -----
From: "Mark Ibell" <marki@paradise.net.nz>
To: <freebsd-questions@freebsd.org>
Sent: Monday, February 05, 2001 07:55 PM
Subject: SCSI parity error


> Hi,
>
> We've just experienced a nasty server crash on a system running
> 4.1-RELEASE.
> The drive configuration is 2 x Quantum Atlas 10k2 drives running off an
> Adaptec 2940U2W controller. The relevant log entries are listed below. Any
> ideas what could have caused this - both disks appear to check out ok
> according to the SCSI BIOS 'Verify Media' option.
>
> Cheers,
> Mark
>
>
> (da1:ahc0:0:6:0): parity error detected in Data-in phase. SEQADDR(0x166) SCSIRATE(0x93)
> ahc0:A:6: unknown scsi bus phase 0.  Attempting to continue
> ahc0: WARNING no command for scb 0 (cmdcmplt)
> QOUTPOS = 195
> ahc0: WARNING no command for scb 96 (cmdcmplt)
> QOUTPOS = 196
> ...
> ahc0: WARNING no command for scb 6 (cmdcmplt)
> QOUTPOS = 219
> (da1:ahc0:0:6:0): SCB 0x13 - timed out while idle, SEQADDR == 0xb
> (da1:ahc0:0:6:0): Queuing a BDR SCB
> (da1:ahc0:0:6:0): Bus Device Reset Message Sent
> (da1:ahc0:0:6:0): no longer in timeout, status = 34c
> ahc0: Bus Device Reset on A:6. 1 SCBs aborted
> (da0:ahc0:0:5:0): SCB 0x8c - timed out while idle, SEQADDR == 0xa
> (da0:ahc0:0:5:0): Queuing a BDR SCB
> (da0:ahc0:0:5:0): Bus Device Reset Message Sent
> (da0:ahc0:0:5:0): no longer in timeout, status = 34b
> ahc0: Bus Device Reset on A:5. 7 SCBs aborted
> ...

Parity usually means hardware. Are they 10k RPM drives?
Are they separate or are you using them as a virtual volume?
What was it doing when it crashed, loafing or heavy use?

Cheap test:
Get a radio, find a frequency and listen to the machine.

Give the drives a repetitive task and you should be able to
'hear' each sub-system operate when it reads/writes data.
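
One way to set up such a repetitive task is sketched below, assuming
Python is available on the machine. The function name exercise_disk and
its parameters are made up for illustration. It writes a block of random
data into a directory on the drive under test, reads it back, and
compares checksums. The read-back may well be served from the buffer
cache rather than the platters, so treat this as a steady source of bus
activity to listen to, not as a media test.

```python
import hashlib
import os
import tempfile


def exercise_disk(directory, passes=5, size_bytes=1 << 20):
    """Repeatedly write random data under `directory` and read it back.

    Hypothetical helper for illustration only. Returns the number of
    passes whose read-back checksum differed from what was written.
    """
    mismatches = 0
    for _ in range(passes):
        data = os.urandom(size_bytes)
        expected = hashlib.sha256(data).hexdigest()
        fd, path = tempfile.mkstemp(dir=directory)
        try:
            with os.fdopen(fd, "wb") as f:
                f.write(data)
                f.flush()
                os.fsync(f.fileno())  # push the write out to the device
            with open(path, "rb") as f:
                actual = hashlib.sha256(f.read()).hexdigest()
            if actual != expected:
                mismatches += 1
        finally:
            os.remove(path)  # always clean up the scratch file
    return mismatches
```

Point it at a directory on each drive in turn, with a large passes
value, so each one can be heard separately.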

Walk away with the radio.

If you can hear it down the hall it has RF problems.
If it sounds 'different' sometimes, you have a problem but
error correction is masking it.

Assuming it's been running okay for a while, check the usual
suspects like loose connections, sockets, terminators, cables,
heat, and good power.  No one tripped over the cord or used it
as a shin-detector?

-3P







