Date:      Fri, 29 May 1998 00:08:53 +0900
From:      Tetsuro FURUYA <ht5t-fry@asahi-net.or.jp>
To:        mike@smith.net.au
Cc:        Tetsuro FURUYA <ht5t-fry@asahi-net.or.jp>, robinson@public.bta.net.cn, freebsd-stable@FreeBSD.ORG
Subject:   Re: Bug in wd driver 
Message-ID:  <199805281508.AAA04056@dilemma.tf.or.jp>
In-Reply-To: Your message of "Wed, 27 May 1998 14:01:47 -0700"
References:  <199805272101.OAA01902@dingo.cdrom.com>


In Message-ID: <199805272101.OAA01902@dingo.cdrom.com>
Mike Smith <mike@smith.net.au> wrote:


> Also, what are the status and error values that you see in the 
> "interrupt timeout" messages?

My output of fsck and ddb(4) is:

>fsck /usr
>.....
>wd0: interrupt timeout:
>wd0: status 50<rdy,seekdone> error 0
>wd0: interrupt timeout:
>wd0: status 50<rdy,seekdone> error 1<no_dam>

>===> hang up
>===> type 'ctrl-alt-esc'

>db>wd0s1f: hard error reading fsbn 1152850 of 1152850-1152851(wd0s1 bn
>1279826; cn 317 tn 26 sn 44)
>wd0: status 59<rdy,seekdone,drq,err> error 40<uncorr>

>===> type 'c' or 'continue'

>cannot read: BLK 1152850
>....
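
As a cross-check of the numbers in the hard-error line above, the
cylinder/track/sector triple can be converted back to a linear block
number. This is only a sketch: the geometry of 64 heads and 63 sectors
per track is not stated anywhere in this message. It is inferred, because
it is the combination that reproduces bn 1279826 from cn 317, tn 26,
sn 44 (with sn counted from zero). The function name is made up for
illustration.

```python
# Assumed geometry, inferred from the numbers in the error line,
# not stated in the message.
HEADS = 64
SECTORS_PER_TRACK = 63


def chs_to_block(cn, tn, sn, heads=HEADS, spt=SECTORS_PER_TRACK):
    """Convert cylinder/track/sector (sector counted from 0) to a block number."""
    return (cn * heads + tn) * spt + sn


slice_bn = chs_to_block(317, 26, 44)   # "wd0s1 bn" in the error line
fsbn = 1152850                         # "fsbn" in the error line

print(slice_bn)          # block number within the slice
print(slice_bn - fsbn)   # difference between the two block numbers
```

Under that assumed geometry the result matches the reported bn, and the
difference between bn and fsbn (126976) would be the offset of the
wd0s1f partition from the start of the slice.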

After continuing, fsck completes successfully ^^)
bad144 behaves the same way, except that it says "marked bad" instead of
"cannot read".
In addition, there is a cron message (are these different blocks?):

dilemma kernel log messages:
> wd0: interrupt timeout:
> wd0: status 50<rdy,seekdone> error 1<no_dam>
> swap_pager: indefinite wait buffer: device: 131073, blkno: 3224, size: 20480
> swap_pager: indefinite wait buffer: device: 131073, blkno: 3224, size: 20480
> swap_pager: indefinite wait buffer: device: 131073, blkno: 26752, size: 4096
> swap_pager: indefinite wait buffer: device: 131073, blkno: 3224, size: 20480
> swap_pager: indefinite wait buffer: device: 131073, blkno: 26752, size: 4096
.....

These messages repeat several times.
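
The device number in the swap_pager lines can be unpacked to see which
partition is involved. The sketch below assumes the 2.2-era dev_t layout
as remembered from sys/disklabel.h and sys/diskslice.h: major number in
bits 8-15, unit in bits 3-7, partition in bits 0-2, slice in bits 16-20,
with slice index 2 being the first real slice (s1). It also assumes that
block major 0 is the wd driver. All of this should be checked against
the actual source tree; the function names are hypothetical.

```python
BASE_SLICE = 2  # assumed: indexes 0 and 1 are the compatibility and whole-disk slices


def decode_disk_dev(dev):
    """Split a dev_t value into fields, using the assumed bit layout."""
    major = (dev >> 8) & 0xFF
    minor = dev & 0xFFFF00FF          # everything except the major bits
    return {
        "major": major,
        "unit": (minor >> 3) & 0x1F,
        "slice": (minor >> 16) & 0x1F,
        "part": minor & 0x07,
    }


def partition_name(dev, driver="wd"):
    """Build a name such as wd0s1b; the driver name is an assumption."""
    f = decode_disk_dev(dev)
    slice_part = ""
    if f["slice"] >= BASE_SLICE:
        slice_part = "s%d" % (f["slice"] - BASE_SLICE + 1)
    return "%s%d%s%s" % (driver, f["unit"], slice_part, "abcdefgh"[f["part"]])


print(decode_disk_dev(131073))
print(partition_name(131073))
```

If the assumed layout is right, device 131073 (0x20001) decodes to the
"b" partition of the first slice on wd0, which would fit the messages
coming from swap_pager.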

As for the wd.c source, I will try some experiments :)

===================================================================
It seems that this message did not reach the mailing list,
so I am sending it again.

In Message-ID: <199805260342.LAA02975@public.bta.net.cn> 
Michael Robinson <robinson@public.bta.net.cn> wrote:

>I wrote a message related to this problem to freebsd-questions
>yesterday, but upon further investigation, I have decided this is
>a bug, not a feature.

>I have a Tecra 510CDT (running 2.2.6-RELEASE) that suffered a
>corrupted disk when the battery power failed as it was trying to
>halt.
> 
>  1. Any I/O access to the affected sectors will cause the following
>     message:
> 
>     wd0: interrupt timeout
>     wd0: status 58<rdy,seekdone,drq> error 0
> 
>     followed by seeking noises, and the following message:
> 
>     wd0: interrupt timeout
>     wd0: status 50<rdy,seekdone> error 1<no_dam>
> 
>  2. After this, the process requesting the I/O will be completely
>     locked, but the disk will continue to make seeking noises 
>     continuously until the system is powered off.
>     Other processes are able to access the affected slice/partition
>     (ls, cat, etc.) without any difficulty, as long as they avoid 
>     the 7 affected sectors.  Any process which requires privileged
>     kernel calls (halt, ps, etc.) will lock immediately and 
>     completely.
> 
>  3. Other than the two messages above, wd produces no error messages.
> 
>  4. Hard reset is the only way to recover.
> 
> I tried to work around this problem with bad144, but rapidly discovered
> that bad144 is something of a bad joke in FreeBSD.  Does anyone have
> any recommendations for how to fix the wd driver or otherwise recover
> from this fault?
> 
> 	-Michael Robinson
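
The status and error values quoted in this thread are hexadecimal
register bytes followed by the names of the bits that are set. A minimal
sketch of that decoding is below. Only the bit names that actually
appear in the messages above are filled in; the bit positions are taken
from the standard ATA status and error register layout, which is an
assumption about what the wd driver uses. The function name is invented
for illustration.

```python
# Names as printed in this thread; positions assumed from the ATA register layout.
STATUS_BITS = {0x40: "rdy", 0x10: "seekdone", 0x08: "drq", 0x01: "err"}
ERROR_BITS = {0x40: "uncorr", 0x01: "no_dam"}


def decode_register(value, names):
    """Format a register byte like the driver output, e.g. 50<rdy,seekdone>."""
    labels = []
    for bit in (0x80, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01):
        if value & bit:
            # Bits without a known name are shown by position.
            labels.append(names.get(bit, "bit%d" % (bit.bit_length() - 1)))
    text = "%x" % value
    if labels:
        text += "<" + ",".join(labels) + ">"
    return text


for status, error in ((0x58, 0x00), (0x50, 0x01), (0x59, 0x40)):
    print("wd0: status %s error %s" % (
        decode_register(status, STATUS_BITS),
        decode_register(error, ERROR_BITS),
    ))
```

Read this way, status 50 means the drive reports ready with the seek
complete and no data request pending, and status 59 adds both the data
request bit and the error bit, with error 40 flagging an uncorrectable
data error.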

I have encountered the same problem on a Panasonic AL-N1 running
FreeBSD-2.2.2.

bad144 hung as well.
But I have found out how to get bad144, fsck, or badsect to work.

My kernel has the kernel debugger ddb(4) compiled in.
                                  ^^^^^^
So, listen to the humming sound of the wd0 drive, and when the drive
hangs, invoke the kernel debugger by typing ctrl-alt-ESC.
                                            ^^^^^^^^^^^^
A short while after the disk access stops, type 'c' or 'continue'
to go back to bad144 or fsck.
After several attempts, the identification of bad clusters may complete.
On my machine, this worked.

========================================================================
TEL: 048-852-3520    FAX: 048-858-1597
E-Mail:
     ht5t-fry@asahi-net.or.jp
     tfu@ff.iij4u.or.jp
pgp-fingerprint:
     pub  Tetsuro FURUYA <ht5t-fry@asahi-net.or.jp>
      Key fingerprint = F1 BA 5F C1 C2 48 1D C7  AE 5F 16 ED 12 17 75 38
=========================================================================

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message


