Date:      Wed, 11 Feb 2004 13:02:43 +0100
From:      Matthias Andree <ma@dt.e-technik.uni-dortmund.de>
To:        freebsd-scsi@freebsd.org
Subject:   AIC7XXX (2940UW Pro) file system corruption
Message-ID:  <m3smhhj02k.fsf@merlin.emma.line.org>

Hi,

I have a 2940 UW Pro in a machine running FreeBSD 4-STABLE (sources
checked out and kernel built around Feb 3rd) with a Yamaha CRW4416S
(CD, USCSI), a Plextor PX-20TS (CD, USCSI) and a Micropolis 4345WS
(HDD). The external connector is unused; the 50-pin bus is terminated
internally in the Plextor at the bus end, and the 68-pin bus is
terminated internally in the Micropolis at the other bus end.

Last Friday, the SCSI subsystem in the box went haywire, dumped card
state and finally locked the machine up; I had to press the reset
button. On reboot, fsck -p aborted the boot since /var was corrupt. At
that time, the hard disk drive was running with "WCE" (write cache
enable) set to 0 in the saved and current mode pages. It's a test
machine, so I hadn't bothered to report this until now.

I'd used both a Tekram DC-390 (AMD53C974, amd(4)) and a Tekram DC-390U
(SYM53C975, sym(4)) in the same machine with one of these 50<->68
adaptor plugs without seeing such problems, but at that time, the Yamaha
was missing.

The log entries (logged across the network) are too large to post here;
the gzipped log can be downloaded from:

ftp://ftp.dt.e-technik.uni-dortmund.de/pub/people/ma/aic7xxx-hang.gz

The log is segmented: the first part, from Feb 6, is the boot-up
messages (around 15:07); I then elided the logs until 20:00. The trouble
started at 20:10:20 with:
Feb  6 20:10:20 libertas /kernel: swap_pager: indefinite wait buffer: device: #da/0x20001, blkno: 4296, size: 24576
Feb  6 20:10:41 libertas /kernel: (da0:ahc0:0:0:0): SCB 0x0 - timed out
Feb  6 20:10:42 libertas /kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
Feb  6 20:10:42 libertas /kernel: ahc0: Dumping Card State in Data-in phase, at SEQADDR 0x64
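
To pull the timeouts and card state dumps out of a log like this one, a
minimal Python sketch could look as follows. It assumes only the line
formats quoted above; the function names summarize and summarize_gzip
are hypothetical, and the path passed to summarize_gzip is wherever the
downloaded log happens to be saved.

```python
import gzip
import re

# Patterns modelled on the kernel log lines quoted in this message.
# (Assumption: other ahc(4) messages in the full log use the same layout.)
TIMEOUT_RE = re.compile(
    r"\((?P<periph>[a-z]+\d+):(?P<ctrl>ahc\d+):\d+:\d+:\d+\): "
    r"SCB (?P<scb>0x[0-9a-f]+) - timed out"
)
DUMP_RE = re.compile(r"Dump Card State Begins")


def summarize(lines):
    """Return (timeouts, dump_count) for an iterable of log lines.

    timeouts is a list of (peripheral, controller, scb) tuples, one per
    "SCB ... - timed out" line; dump_count is the number of card state
    dumps that were started.
    """
    timeouts = []
    dumps = 0
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            timeouts.append(
                (match.group("periph"), match.group("ctrl"), match.group("scb"))
            )
        elif DUMP_RE.search(line):
            dumps += 1
    return timeouts, dumps


def summarize_gzip(path):
    """Run summarize() over a gzipped log file (hypothetical helper)."""
    with gzip.open(path, "rt", errors="replace") as handle:
        return summarize(handle)
```

Calling summarize_gzip("aic7xxx-hang.gz") on a local copy of the log
would list every timed-out SCB together with the number of card state
dumps, which makes it easier to see whether the timeouts cluster on one
device.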

At that time, the machine was running portupgrade -a and was supposed
to build some big ports: gcc, XFree and others.

Is this a driver issue?

After rebooting and manually cleaning up the /var mess, which involved
force-installing some ports, another portupgrade -a completed without
problems.


I tried reading the defect lists with each of the following commands,
with no result other than an error message and further card state dumps
(not posted in full; only the first and last lines are shown below):

   camcontrol defects da0 -G -f block
   camcontrol defects da0 -G -f bfi
   camcontrol defects da0 -G -f phys

(pass0:ahc0:0:0:0): SCB 0xf - timed out
>>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
ahc0: Dumping Card State while idle, at SEQADDR 0x7
Card was paused
...
(pass0:ahc0:0:0:0): Queuing a BDR SCB
(pass0:ahc0:0:0:0): Bus Device Reset Message Sent
(pass0:ahc0:0:0:0): no longer in timeout, status = 34b
ahc0: Bus Device Reset on A:0. 11 SCBs aborted

This card dump occurred within half a second after issuing the
camcontrol command.

As an alternative, I then ran "sformat -verify dev=0,0,0", which did
not report any defects or weak blocks, so I assume the drive is fine.

-- 
Matthias Andree

Encrypt your mail: my GnuPG key ID is 0x052E7D95


