Date:      Thu, 25 Jan 2001 20:39:47 -0600
From:      Hank Marquardt <hmarq@yerpso.net>
To:        "Kenneth D. Merry" <ken@kdm.org>
Cc:        Hank Marquardt <hmarq@oscar2.yerpso.net>, freebsd-stable@FreeBSD.ORG
Subject:   Re: Dying disk ...
Message-ID:  <20010125203946.A53543@hermes.yerpso.net>
In-Reply-To: <20010125174624.A37036@panzer.kdm.org>; from ken@kdm.org on Thu, Jan 25, 2001 at 05:46:24PM -0700
References:  <20010125150722.A288@oscar2.yerpso.net> <20010125174624.A37036@panzer.kdm.org>

Thanks, heat is an interesting thought ... it's a dual P133 machine in
fairly close quarters (inside the case).  That said, it was running
solid for close to 30 days before the first lockup (4.2 Stable
update/boot around 12/15), and for another couple of months before
that with no issues.  Further, the dmesg I originally sent was from
less than 15 minutes of running after the machine had been powered
off for over 24 hours.  I don't think I'd get a heat problem that
quickly unless something was seriously screwed up, and there was no
evidence of that when I opened the case.

The drive is a Seagate ST19171W attached to a 2940 card.  The CD, also
SCSI, has functioned and been recognized throughout all the episodes.

Hank

On Thu, Jan 25, 2001 at 05:46:24PM -0700, Kenneth D. Merry wrote:
> [ Try to hit return every 70 characters or so.  If you're using vi,
>   type ESC and then :se wm=10 ]
> 
> On Thu, Jan 25, 2001 at 15:07:23 -0600, Hank Marquardt wrote:
> > If borne out, this would be my first hardware failure under BSD, so
> > I'm just looking for another set of eyes to look at this dmesg:
> > 
> > 
> > WARNING: / was not properly dismounted
> > (da0:ahc0:0:0:0): SCB 0x31 - timed out while idle, SEQADDR == 0x3
> 
> Timed out while idle, for a disk, generally means that we sent a
> command to the disk and it still hadn't returned it after 60 seconds.
> So we assume that the disk is out to lunch and hasn't come back, and
> therefore we reset it to wake it up.
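
To make the timeout logic concrete, here is a minimal Python sketch of
the behaviour described above.  It is not the ahc driver's code: the
names PendingCommand, find_timed_out and check_device are hypothetical,
and the issue times in the usage note are made up.  Only the 60-second
figure and the tags 20 and 49 (the pending list in the dmesg quoted
below) come from this thread.  The sketch assumes a reset aborts every
outstanding command on the device, not only the late one, which matches
the "2 SCBs aborted" line.

```python
from dataclasses import dataclass

TIMEOUT_SECONDS = 60.0  # figure taken from the explanation above


@dataclass
class PendingCommand:
    tag: int          # hypothetical identifier, loosely like an SCB number
    issued_at: float  # time the command was sent, in seconds


def find_timed_out(pending, now, timeout=TIMEOUT_SECONDS):
    """Return the commands outstanding for longer than `timeout`."""
    return [cmd for cmd in pending if now - cmd.issued_at > timeout]


def check_device(pending, now, timeout=TIMEOUT_SECONDS):
    """Decide what to do with a device given its outstanding commands.

    Returns ("ok", []) when nothing has timed out.  Otherwise returns
    ("reset", tags), where tags lists every pending command, since the
    assumed reset aborts all of them.
    """
    if not find_timed_out(pending, now, timeout):
        return ("ok", [])
    return ("reset", [cmd.tag for cmd in pending])
```

With commands 20 and 49 issued at t=0 and t=30, calling check_device at
t=61 reports a reset that aborts both, even though only command 20 has
actually passed the 60-second mark.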
> 
> > STACK == 0x1, 0xe8, 0x143, 0x16b
> > SXFRCTL0 == 0x80
> > SCB count = 50
> > QINFIFO entries: 
> > Waiting Queue entries: 
> > Disconnected Queue entries: 2:20 0:49 
> > QOUTFIFO entries: 
> > Sequencer Free SCB List: 13 14 15 3 9 1 10 12 5 6 11 4 7 8 
> > Pending list: 20 49 
> > Kernel Free SCB list: 33 11 24 35 34 48 17 18 14 47 45 31 22 21 27 36 16 4 5 9 12 15 25 26 28 46 2 10 29 23 7 38 13 0 8 6 32 30 3 19 37 1 44 43 42 41 40 
> > sg[0] - Addr 0x1134000 : Length 1024
> > (da0:ahc0:0:0:0): Queuing a BDR SCB
> > (da0:ahc0:0:0:0): Bus Device Reset Message Sent
> > (da0:ahc0:0:0:0): no longer in timeout, status = 34b
> > ahc0: Bus Device Reset on A:0. 2 SCBs aborted
> > 
> > The machine has locked solid under X a couple of times and then
> > refused to boot, not seeing the disk at all even under the SCSI ROM
> > startup.  I popped the case thinking it might be a loose cable or
> > something, and sure enough it booted, though you can see it fsck'd
> > the disk after the crash.  I left it be for a while, came back, and
> > saw the errors on the screen.  A quick look at syslog showed similar
> > entries at each of the crashes.
> > 
> > It seems to be running now (I'm writing this on it), but if the
> > disk is flaky, I may as well go buy a new one now rather than wonder
> > when it's going to die.
> 
> You should make sure your disk is properly cooled (is it hot to the
> touch?).  You should also check your cabling and termination.
> 
> If your disk wasn't showing up during your SCSI controller's BIOS
> probe, that could indicate a cabling or termination problem.  It could
> also mean the disk is on its way out.
> 
> What sort of disk is it?  Some disks are known to have firmware issues
> that cause them to go "out to lunch" and not come back.
> 
> Ken
> -- 
> Kenneth Merry
> ken@kdm.org
> 
> 
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-stable" in the body of the message





