Date: Wed, 10 Jun 1998 15:28:29 -0700
From: Mike Smith <mike@smith.net.au>
To: Tetsuro FURUYA <ht5t-fry@asahi-net.or.jp>
Cc: mike@smith.net.au, robinson@public.bta.net.cn, freebsd-stable@FreeBSD.ORG, freebsd-questions@FreeBSD.ORG, Tetsuro FURUYA <tfu@ff.iij4u.or.jp>
Subject: Re: Bug in wd driver
Message-ID: <199806102228.PAA00747@dingo.cdrom.com>
In-Reply-To: Your message of "Thu, 11 Jun 1998 04:41:08 +0900." <199806101941.EAA11696@dilemma.tf.or.jp>

> > > >fsck /usr
> > > >.....
> > > >wd0: interrupt timeout:
> > > >wd0: status 50<rdy,seekdone> error 0
> > > >wd0: interrupt timeout:
> > > >wd0: status 50<rdy,seekdone> error 1<no_dam>
> > > >
> > > >===> hang up
> > > >===> type 'cntrl-alt-esc'
> >
> > This defers the interrupt timeout...
> >
> > > >db>wd0s1f: hard error reading fsbn 1152850 of 1152850-1152851(wd0s1 bn
> > > >1279826; cn 317 tn 26 sn 44)
> > > >wd0: status 59<rdy,seekdone,drq,err> error 40<uncorr>
> >
> > ... but not the interrupt, which finally arrives and contains real
> > error information.  Note that the interrupt timeouts in your case
> > *don't* have DRQ set.  Are you running in multi-block mode?
> >
> > > As for wd.c source, I will try to experiment :)
> >
> > Please do.  It looks like your information may lead to a result here.
>
> It seems too late for writing reply to mailing list.

Not at all; better late than never!

> But, this seems important to note-users, so I dare to report the result of
> my experiment of patch to /usr/src/sys/i386/isa/wd.c
> which Mr. Mike Smith's stated, ...
> >         if (wdtab[ctrlr].b_errcnt == 0)
> >                 du->dk_timeout = 1 + 10;
> >         else
> >                 du->dk_timeout = 1 + 3;   <---- Only this line.
> >
> >
> >Increase the 10 and 3 values (first and subsequent timeouts).  Try
> >raising them lots, then come down slowly.
>
> Unfortunately, my /usr/src/sys/i386/isa/wd.c is different
> from the above source code.
> There is just only the last line in the wd.c.
>
> So, I rewrite only this last line, and increased 3 to 50. ( Is this OK?)

It's just a number, and you're in the best position to determine
whether it's big enough.

> Up to now, I have not yet experienced any disk crash, nor cannot-mount-root
> problem, nor anything bad else.

Excellent!  And thanks for confirming this.  I hope that the original
plaintiff is in a position to try this themselves - I would be more
than happy to be completely wrong about the situation. 8)

> You have written that
> >raising them lots, then come down slowly.
>
> Is there any inconvenience when du->dk_timeout value is
> very large ?
> What if du->dk_timeout value is too large ?

The only inconvenience is in the case where the disk has truly failed
to generate an interrupt, and the delay involved before reporting the
failure.

> What is this du->dk_timeout ?

It determines how long a disk is allowed to take to complete a command.

> I've just tried 'cd /usr; badsect BAD 1152850 1215577' & 'fsck /dev/rwd0s1f',
> but 'bad144 -s -v /dev/wd0' should work fine.
> ( I had often used bad144. But now, my bad sectors of wd0 become too many
> for bad144 :( )
> badsect & fsck don't take care of swap area,
> nevertheless they are working fine now :)
>
> So, Thank you Mr. Mike Smith !

No, definitely this time the thanks are for you.  I'll look at
increasing this timeout significantly for both -stable and -current, if
someone doesn't beat me to it.

-- 
\\  Sometimes you're ahead,       \\  Mike Smith
\\  sometimes you're behind.      \\  mike@smith.net.au
\\  The race is long, and in the  \\  msmith@freebsd.org
\\  end it's only with yourself.  \\  msmith@cdrom.com

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message
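
Illustration (not part of the original message): the timeout selection
quoted in this thread can be sketched as a small standalone C fragment.
This is only a sketch under stated assumptions. struct disk_state,
set_command_timeout() and timeout_tick() are hypothetical names and are
not taken from wd.c; only the "1 + first" / "1 + retry" arithmetic comes
from the quoted source. That the counter is decremented by a periodic
tick until it expires is also an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's per-disk state (not the real wd.c struct). */
struct disk_state {
    int timeout_ticks;              /* plays the role of du->dk_timeout */
};

/*
 * Choose the command timeout as in the quoted fragment: one allowance for
 * the first attempt (errcnt == 0) and another for retries. The "1 +" is
 * copied from the quoted source.
 */
void set_command_timeout(struct disk_state *du, int errcnt,
                         int first, int retry)
{
    assert(du != NULL && first >= 0 && retry >= 0);
    du->timeout_ticks = 1 + (errcnt == 0 ? first : retry);
}

/*
 * Assumed periodic tick: count the timeout down and return 1 on the tick
 * where it expires, 0 otherwise. A counter already at 0 is treated as
 * "no command outstanding" and never fires.
 */
int timeout_tick(struct disk_state *du)
{
    if (du->timeout_ticks > 0 && --du->timeout_ticks == 0)
        return 1;
    return 0;
}
```

With the values from the thread, set_command_timeout(&du, 1, 10, 50)
leaves 51 in the counter, so a retried command gets 51 ticks before a
timeout is reported. That is the cost of a large value: a disk that
really has failed to interrupt is reported correspondingly later.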