Date:      Wed, 10 Jun 1998 15:28:29 -0700
From:      Mike Smith <mike@smith.net.au>
To:        Tetsuro FURUYA <ht5t-fry@asahi-net.or.jp>
Cc:        mike@smith.net.au, robinson@public.bta.net.cn, freebsd-stable@FreeBSD.ORG, freebsd-questions@FreeBSD.ORG, Tetsuro FURUYA <tfu@ff.iij4u.or.jp>
Subject:   Re: Bug in wd driver 
Message-ID:  <199806102228.PAA00747@dingo.cdrom.com>
In-Reply-To: Your message of "Thu, 11 Jun 1998 04:41:08 +0900." <199806101941.EAA11696@dilemma.tf.or.jp>

> > > >fsck /usr
> > > >.....
> > > >wd0: interrupt timeout:
> > > >wd0: status 50<rdy,seekdone> error 0
> > > >wd0: interrupt timeout:
> > > >wd0: status 50<rdy,seekdone> error 1<no_dam>
> > > 
> > > >===> hang up
> > > >===> type 'cntrl-alt-esc'
> > 
> > This defers the interrupt timeout...
> > 
> > > >db>wd0s1f: hard error reading fsbn 1152850 of 1152850-1152851(wd0s1 bn
> > > >1279826; cn 317 tn 26 sn 44)
> > > >wd0: status 59<rdy,seekdone,drq,err> error 40<uncorr>
> > 
> > ... but not the interrupt, which finally arrives and contains real 
> > error information.  Note that the interrupt timeouts in your case 
> > *don't* have DRQ set.  Are you running in multi-block mode?
> > 
> > > As for wd.c source, I will try to experiment :)
> > 
> > Please do.  It looks like your information may lead to a result here.  
> 
> It may be too late to reply to the mailing list.

Not at all; better late than never!

> But this seems important to note-users, so I will report the result of
> my experiment with the patch to /usr/src/sys/i386/isa/wd.c
> that Mr. Mike Smith suggested:
...
> >        if (wdtab[ctrlr].b_errcnt == 0)
> >                du->dk_timeout = 1 + 10;
> >        else
> >                du->dk_timeout = 1 + 3;   <---- Only this line.
> >
> >
> >Increase the 10 and 3 values (first and subsequent timeouts).  Try 
> >raising them lots, then come down slowly.
> 
> Unfortunately, my /usr/src/sys/i386/isa/wd.c is different
> from the above source code.
> Only the last line is present in my wd.c.
> 
> So I changed only that last line, increasing 3 to 50. (Is this OK?)

It's just a number, and you're in the best position to determine 
whether it's big enough.

> So far, I have not experienced any disk crash, cannot-mount-root
> problem, or anything else bad.

Excellent!  And thanks for confirming this.  I hope that the original 
plaintiff is in a position to try this themselves - I would be more 
than happy to be completely wrong about the situation.  8)

> You have written that 
> >raising them lots, then come down slowly.
> 
> Is there any inconvenience when the du->dk_timeout value is
> very large?
> What happens if the du->dk_timeout value is too large?

The only inconvenience is in the case where the disk has truly failed 
to generate an interrupt, and the delay involved before reporting the 
failure.

> What is this du->dk_timeout ?

It determines how long a disk is allowed to take to complete a command.
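
To make the trade-off concrete, here is a minimal user-space sketch of a
countdown watchdog of this kind.  It is not the code from wd.c: the
struct, the helper names and the simulation loop are all hypothetical,
and only the field name dk_timeout and the values 1 + 3 and 1 + 50 come
from this thread.  It assumes the counter is decremented once per tick
and that the timeout is reported when it reaches zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's per-disk state; only the
 * countdown field discussed in the thread is modelled. */
struct disk {
    int dk_timeout;   /* ticks left before the watchdog fires; 0 = disarmed */
};

/* Arm the watchdog when a command is handed to the disk. */
static void start_command(struct disk *du, int timeout_ticks)
{
    assert(timeout_ticks > 0);
    du->dk_timeout = timeout_ticks;
}

/* The completion interrupt arrived: disarm the watchdog. */
static void command_done(struct disk *du)
{
    du->dk_timeout = 0;
}

/* Called once per tick.  Returns true when an armed countdown expires,
 * i.e. when an "interrupt timeout" would be reported. */
static bool watchdog_tick(struct disk *du)
{
    return du->dk_timeout != 0 && --du->dk_timeout == 0;
}

/* Simulate one command whose interrupt arrives at tick irq_after
 * (negative means it never arrives).  Returns the tick at which a
 * timeout is reported, or 0 if the command completed in time. */
static int ticks_until_timeout(int timeout_ticks, int irq_after)
{
    struct disk du = { 0 };

    start_command(&du, timeout_ticks);
    for (int tick = 1; tick <= timeout_ticks; tick++) {
        if (irq_after >= 0 && tick >= irq_after) {
            command_done(&du);
            return 0;
        }
        if (watchdog_tick(&du))
            return tick;
    }
    return 0;
}
```

In this model a slow disk that answers at tick 10 is declared timed out
at tick 4 with a budget of 1 + 3, but completes normally with 1 + 50.
The price is that a disk which never answers is only reported after 51
ticks instead of 4.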

> I've just tried 'cd /usr; badsect BAD 1152850 1215577' & 'fsck /dev/rwd0s1f',
>  but 'bad144 -s -v /dev/wd0' should work fine. 
> (I used bad144 often, but wd0 now has too many bad sectors
>  for bad144 :( )
> badsect & fsck don't take care of the swap area;
>  nevertheless, they are working fine now :)
> 
> So, Thank you Mr. Mike Smith !

No, definitely this time the thanks are for you.  I'll look at
increasing this timeout significantly for both -stable and -current, if 
someone doesn't beat me to it.

-- 
\\  Sometimes you're ahead,       \\  Mike Smith
\\  sometimes you're behind.      \\  mike@smith.net.au
\\  The race is long, and in the  \\  msmith@freebsd.org
\\  end it's only with yourself.  \\  msmith@cdrom.com





