Date: Thu, 27 May 1999 20:55:48 -0500
From: David Kelly <dkelly@hiwaay.net>
To: "Kenneth D. Merry" <ken@plutotech.com>
Cc: freebsd-scsi@FreeBSD.ORG
Subject: Re: proper mode page values?
Message-ID: <199905280155.UAA53624@nospam.hiwaay.net>
In-Reply-To: Message from "Kenneth D. Merry" <ken@plutotech.com> of "Wed, 26 May 1999 20:06:06 MDT." <199905270206.UAA16406@panzer.plutotech.com>

"Kenneth D. Merry" writes:
> > swap_pager: indefinite wait buffer: device: 0x30401, blkno: 264, size: 4096
> >
> > The problem blocks are always 264, 272, and 496.
>
> You should also be getting some SCSI error message printed on the console.

Actually that is what was being written to the console and
/var/log/messages. I don't remember any other messages but the problem
is easily repeatable (might take an hour, and the machine is at work
while my email is at home).

System spit out more than 10 or 15 of those messages before it locked
up. Meanwhile it was getting slower. X was not running. Could Alt-Fn
between virtual consoles. Could control-alt-esc and get the "no kernel
debugger" message.

If I could read my whole tape then I could update the system and
install the kernel debugger too. System doesn't have sources on it at
the moment. :-(

> That's rather odd. It may be that the Anaconda is staying on the bus too
> long or something. I dunno.

That's what I'm thinking. At home I have my tape drives on a narrow
Adaptec 2940, the twin of the 2940 in the work machine. And a matching
Anaconda in both places. But at home the HD is on a wide Symbios 875.

> > Tried using camcontrol to view my bad block lists. Doesn't work on that
> > IBM drive, nor the IBM drive on this machine:
> > [...]
> > nospam: [1037] camcontrol defects -n da -u 0 -f block -P
> > error reading defect list: Input/output error
>
> You need to use the -v switch on the command line to see why the command is
> failing.

Fair enough. Doesn't look like -v adds much information:

nospam: [1045] camcontrol defects -v -n da -u 0 -f block -G
error reading defect list: Input/output error
CAM status is 0
nospam: [1046] camcontrol defects -n da -u 0 -f block -G
error reading defect list: Input/output error
nospam: [1047] id
uid=0(root) gid=0(wheel) groups=0(wheel), 2(kmem), 3(sys), 4(tty), 5(operator), 20(staff), 31(guest)
nospam: [1048]

Ah! Forgot to check /var/log/messages.
This is the output for a single attempt at "camcontrol defects", the
one listed above:

May 27 19:12:22 nospam /kernel: (pass2:ncr0:0:0:0): extraneous data discarded.
May 27 19:12:22 nospam /kernel: (pass2:ncr0:0:0:0): COMMAND FAILED (9 80) @0xc0abbe00.
May 27 19:12:22 nospam /kernel: (pass2:ncr0:0:0:0): extraneous data discarded.
May 27 19:12:22 nospam /kernel: (pass2:ncr0:0:0:0): COMMAND FAILED (9 80) @0xc0abbe00.

> > So then I go looking at mode pages to see what is set and to see if by
> > any chance the drive was told not to substitute replacements for
> > weakening blocks:
> >
> > nospam: [1038] camcontrol modepage -n da -u 0 -m 1 -P 2
> > AWRE (Auto Write Reallocation Enbld): 0
> > ARRE (Auto Read Reallocation Enbld): 0
>
> Those are the defaults.
>
> > nospam: [1039] camcontrol modepage -n da -u 0 -m 1 -P 3
> > AWRE (Auto Write Reallocation Enbld): 1
> > ARRE (Auto Read Reallocation Enbld): 1
>
> And these are the saved parameters.

Yes, I understood defaults and saved. The observation was the 9G drive
was shipped with different saved (-P 3) values than the factory
defaults (-P 2). The 9G saved values appear to be the same as the 4G
defaults which were also the same as its saved values.

So sifting thru everything, it would appear AWRE and ARRE are Good
Things To Set?

How about "TB (Transfer Block)"? That sounds like one that will attempt
to copy (transfer) the contents of a sick but not dead block. But then
AWRE and ARRE sound like they do that too.

Or maybe "EER (Enable Early Recovery)" is one that will attempt to
recover and repair before the damage is permanent?

"DTE (Disable Transfer on Error)", now why would we enable something
like TB then use a different parameter to disable it? This must mean
something totally different. I'm confused.

> 1. Enable read and write reallocation, and then do a dd to overwrite the
> entire disk. That will force any bad blocks to get remapped.

With AWRE and ARRE enabled in the first place I should never need to do
the above?
Right? The advantage of scanning the whole disk at once as above is to
verify there are no problems and/or to observe the automatic bad block
replacement doing its thing?

> > Are my modepage parameters sane? Was looking at page 0x01 because I was
> > worried about error handling. But here's the popular 0x08 too: [...]
>
> Looks okay to me. The only one you might want to play with is the WCE bit,
> which enables write caching. That won't have any effect

WCE is the only one I've played with. Had to use bonnie to tell the
difference. So I put it back the way I found it.

--
David Kelly N4HHE, dkelly@nospam.hiwaay.net
=====================================================================
The human mind ordinarily operates at only ten percent of its capacity
-- the rest is overhead for the operating system.


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message
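
For reference, the flags asked about in the message (AWRE, ARRE, TB, EER, DTE) all sit in one byte of mode page 0x01. Below is a minimal sketch, assuming the SCSI-2 layout of byte 2 of the Read-Write Error Recovery page, of how that byte splits into the named bits. The function name and the sample value 0xC0 are made up for illustration; this is not camcontrol's own code, and the value is not read from either drive discussed here.

```python
# Assumed layout (SCSI-2, Read-Write Error Recovery page, byte 2),
# listed from bit 7 down to bit 0.
ERROR_RECOVERY_BITS = ["AWRE", "ARRE", "TB", "RC", "EER", "PER", "DTE", "DCR"]


def decode_error_recovery_flags(byte2):
    """Return a dict mapping each flag name to 0 or 1 (hypothetical helper)."""
    if not 0 <= byte2 <= 0xFF:
        raise ValueError("byte2 must fit in one byte")
    # Bit 7 is the first name in the list, so shift by (7 - index).
    return {
        name: (byte2 >> (7 - index)) & 1
        for index, name in enumerate(ERROR_RECOVERY_BITS)
    }


if __name__ == "__main__":
    # 0xC0 is an illustrative value with only AWRE and ARRE set.
    for name, value in decode_error_recovery_flags(0xC0).items():
        print(f"{name}: {value}")
```

Each flag is an independent bit, so TB and DTE can be set or cleared separately from AWRE and ARRE.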
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199905280155.UAA53624>