From owner-freebsd-questions Tue Feb 13 21:42:40 2001
Date: Tue, 13 Feb 2001 23:41:47 -0600
From: David Schooley
To: Greg Lehey
Cc: freebsd-questions@FreeBSD.ORG
Subject: Re: Vinum behavior (long)
In-Reply-To: <20010212195422.S47700@wantadilla.lemis.com>

At 7:54 PM +1030 2/12/01, Greg Lehey wrote:
>On Monday, 12 February 2001 at 0:28:04 -0600, David Schooley wrote:
>> I have been doing some experimenting with vinum, primarily to
>> understand it before putting it to regular use. I have a few
>> questions, primarily due to oddities I can't explain.
>>
>> The setup consists of 4 identical 30GB ATA drives, each on its own
>> channel. One pair of channels comes off of the motherboard
>> controller; the other pair hangs off of a PCI card. I am running
>> 4.2-STABLE, cvsup'ed some time within the past week.
>>
>> The configuration file I am using is as follows and is fairly close
>> to the examples in the man page and elsewhere, although it raises
>> some questions by itself.
>> What I attempted to do was make sure each
>> drive was mirrored to the corresponding drive on the other
>> controller, i.e., 1<->3 and 2<->4:
>>
>> ***
>> drive drive1 device /dev/ad0s1d
>> drive drive2 device /dev/ad2s1d
>> drive drive3 device /dev/ad4s1d
>> drive drive4 device /dev/ad6s1d
>>
>> volume raid setupstate
>> plex org striped 300k
>> sd length 14655m drive drive1
>> sd length 14655m drive drive2
>> sd length 14655m drive drive3
>> sd length 14655m drive drive4
>> plex org striped 300k
>> sd length 14655m drive drive3
>> sd length 14655m drive drive4
>> sd length 14655m drive drive1
>> sd length 14655m drive drive2
>> ***
>>
>> I wanted to see what would happen if I lost an entire IDE controller,
>> so I set everything up, mounted the new volume, and copied over
>> everything from /usr/local. I shut the machine down, cut the power to
>> drives 3 and 4, and restarted. Upon restart, vinum reported that
>> drives 3 and 4 had failed. If my understanding is correct, then I
>> should have been OK, since any data on drives 3 and 4 would have been
>> a copy of what was on drives 1 and 2, respectively.
>
>Correct.
>
>> For the next part of the test, I attempted to duplicate a directory
>> in the raid version of /usr/local. It partially worked, but there
>> were errors
>
>What errors?

Here is part of /var/log/messages. This is at the point when I tried
to write to the RAID with two of the drives failed.
Feb 13 22:19:52 bicycle /kernel: vinum: raid.p0.s3 is stale by force
Feb 13 22:19:52 bicycle /kernel: vinum: raid.p1.s0 is stale by force
Feb 13 22:19:52 bicycle /kernel: vinum: raid.p0.s2 is stale by force
Feb 13 22:19:52 bicycle /kernel: vinum: raid.p1.s1 is stale by force
Feb 13 22:19:52 bicycle /kernel: spec_getpages:(#vinum/0) I/O read failure: (error=0) bp 0xc48b97d4 vp 0xca5bdec0
Feb 13 22:19:52 bicycle /kernel: size: 28672, resid: 28672, a_count: 28672, valid: 0x0
Feb 13 22:19:52 bicycle /kernel: nread: 0, reqpage: 0, pindex: 9, pcount: 7
Feb 13 22:19:52 bicycle /kernel: vm_fault: pager read error, pid 283 (cp)
Feb 13 22:19:52 bicycle /kernel: spec_getpages:(#vinum/0) I/O read failure: (error=0) bp 0xc48b9688 vp 0xca5bdec0
Feb 13 22:19:52 bicycle /kernel: size: 65536, resid: 65536, a_count: 65536, valid: 0x0
Feb 13 22:19:52 bicycle /kernel: nread: 0, reqpage: 0, pindex: 200, pcount: 16
Feb 13 22:19:52 bicycle /kernel: vm_fault: pager read error, pid 283 (cp)
Feb 13 22:19:52 bicycle /kernel: spec_getpages:(#vinum/0) I/O read failure: (error=0) bp 0xc48b9688 vp 0xca5bdec0
Feb 13 22:19:52 bicycle /kernel: size: 36864, resid: 36864, a_count: 36864, valid: 0x0

Here is the output of "vinum list". p0.s2, p0.s3, p1.s0, and p1.s1 are
the "failed" subdisks. I used smaller subdisks this time to keep the
recovery time down during testing, but everything else is the same as
before.
4 drives:
D drive1 State: up	Device /dev/ad0s1d	Avail: 28287/29311 MB (96%)
D drive2 State: up	Device /dev/ad2s1d	Avail: 28287/29311 MB (96%)
D drive3 State: up	Device /dev/ad4s1d	Avail: 28287/29311 MB (96%)
D drive4 State: up	Device /dev/ad6s1d	Avail: 28287/29311 MB (96%)

1 volumes:
V raid      State: up	Plexes: 2	Size: 2047 MB

2 plexes:
P raid.p0 S State: up	Subdisks: 4	Size: 2047 MB
P raid.p1 S State: up	Subdisks: 4	Size: 2047 MB

8 subdisks:
S raid.p0.s0 State: up	PO: 0 B		Size: 511 MB
S raid.p0.s1 State: up	PO: 300 kB	Size: 511 MB
S raid.p0.s2 State: up	PO: 600 kB	Size: 511 MB
S raid.p0.s3 State: up	PO: 900 kB	Size: 511 MB
S raid.p1.s0 State: up	PO: 0 B		Size: 511 MB
S raid.p1.s1 State: up	PO: 300 kB	Size: 511 MB
S raid.p1.s2 State: up	PO: 600 kB	Size: 511 MB
S raid.p1.s3 State: up	PO: 900 kB	Size: 511 MB

Vinum history file:

13 Feb 2001 22:02:50.424135 *** vinum started ***
13 Feb 2001 22:02:50.424819 create -f vinum2.conf
drive drive1 device /dev/ad0s1d
drive drive2 device /dev/ad2s1d
drive drive3 device /dev/ad4s1d
drive drive4 device /dev/ad6s1d
volume raid setupstate
plex org striped 300k
sd length 512m drive drive1
sd length 512m drive drive2
sd length 512m drive drive3
sd length 512m drive drive4
plex org striped 300k
sd length 512m drive drive3
sd length 512m drive drive4
sd length 512m drive drive1
sd length 512m drive drive2
13 Feb 2001 22:02:50.438116 *** Created devices ***
13 Feb 2001 22:15:05.884974 *** vinum started ***
13 Feb 2001 22:15:05.935591 list
13 Feb 2001 22:15:18.232052 *** vinum started ***
13 Feb 2001 22:15:19.258144 list
13 Feb 2001 22:15:25.930953 quit
13 Feb 2001 22:26:38.521465 *** vinum started ***
13 Feb 2001 22:26:39.499981 list
13 Feb 2001 22:26:53.305830 start raid.p0
13 Feb 2001 22:27:00.825452 start raid.p1
13 Feb 2001 22:27:03.218408 list

>
>> during the copy and only about two thirds of the data was
>> successfully copied.
>>
>> Question #1: Shouldn't this have worked?
>
>Answer: Yes, it should have. What went wrong?

See above.
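As an aside on how those subdisk names relate to the on-disk layout: the way a striped plex spreads data can be sketched with generic RAID-0 round-robin arithmetic. This is an illustration only, not vinum's actual code; the 300 kB stripe size and the four subdisks per plex come from the configuration above.

```shell
# Generic round-robin striping sketch (assumption: plain RAID-0 layout,
# not taken from the vinum source). With a 300 kB stripe and 4 subdisks,
# a byte offset into the plex lands on subdisk (offset / stripe) mod 4.
stripe=$((300 * 1024))   # "300k" in the config: 307200 bytes
nsd=4                    # subdisks per plex

for offset in 0 307200 614400 921600 1228800; do
    sd=$(( (offset / stripe) % nsd ))
    echo "offset $offset -> subdisk $sd"
done
```

Because the second plex lists its drives in the order 3, 4, 1, 2, the stripe that plex 0 writes to drives 3 and 4 lands on drives 1 and 2 in plex 1, which is what makes the 1<->3, 2<->4 mirroring described above hold.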
>
>> After I "fixed" the "broken" controller and restarted the machine,
>> vinum's list looked like this:
>>
>> ***
>> 4 drives:
>> D drive1 State: up	Device /dev/ad0s1d	Avail: 1/29311 MB (0%)
>> D drive2 State: up	Device /dev/ad2s1d	Avail: 1/29311 MB (0%)
>> D drive3 State: up	Device /dev/ad4s1d	Avail: 1/29311 MB (0%)
>> D drive4 State: up	Device /dev/ad6s1d	Avail: 1/29311 MB (0%)
>>
>> 1 volumes:
>> V raid      State: up	Plexes: 2	Size: 57 GB
>>
>> 2 plexes:
>> P raid.p0 S State: corrupt	Subdisks: 4	Size: 57 GB
>> P raid.p1 S State: corrupt	Subdisks: 4	Size: 57 GB
>>
>> 8 subdisks:
>> S raid.p0.s0 State: up		PO: 0 B		Size: 14 GB
>> S raid.p0.s1 State: up		PO: 300 kB	Size: 14 GB
>> S raid.p0.s2 State: stale	PO: 600 kB	Size: 14 GB
>> S raid.p0.s3 State: stale	PO: 900 kB	Size: 14 GB
>> S raid.p1.s0 State: stale	PO: 0 B		Size: 14 GB
>> S raid.p1.s1 State: stale	PO: 300 kB	Size: 14 GB
>> S raid.p1.s2 State: up		PO: 600 kB	Size: 14 GB
>> S raid.p1.s3 State: up		PO: 900 kB	Size: 14 GB
>> ***
>>
>> This makes sense. Now after restarting raid.p0 and waiting for
>> everything to resync, I got this:
>>
>> ***
>> 2 plexes:
>> P raid.p0 S State: up	Subdisks: 4	Size: 57 GB
>> P raid.p1 S State: corrupt	Subdisks: 4	Size: 57 GB
>>
>> 8 subdisks:
>> S raid.p0.s0 State: up		PO: 0 B		Size: 14 GB
>> S raid.p0.s1 State: up		PO: 300 kB	Size: 14 GB
>> S raid.p0.s2 State: up		PO: 600 kB	Size: 14 GB
>> S raid.p0.s3 State: up		PO: 900 kB	Size: 14 GB
>> S raid.p1.s0 State: stale	PO: 0 B		Size: 14 GB  <--- still stale
>
>Please don't wrap output.

Sorry. One of these days I'll ditch Eudora.

>> S raid.p1.s1 State: stale	PO: 300 kB	Size: 14 GB  <--- still stale
>> S raid.p1.s2 State: up		PO: 600 kB	Size: 14 GB
>> S raid.p1.s3 State: up		PO: 900 kB	Size: 14 GB
>> ***
>>
>> Now the only place that raid.p0.s2 and raid.p0.s3 could have gotten
>> their data is from raid.p1.s0 and raid.p1.s1, neither of which were
>> involved in the "event".
>
>Correct.
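For anyone reproducing this, the manual recovery implied by the listings above can be sketched as an interactive vinum session. This is an assumption pieced together from the `start` commands visible in the history earlier in this message, not output captured from this machine:

```
# vinum
vinum -> start raid.p0     (resyncs p0's stale subdisks from p1)
vinum -> start raid.p1     (then revives p1's remaining stale subdisks)
vinum -> list              (confirm every subdisk is back in the up state)
```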
>
>> Question #2: Since the data on raid.p0 now matches raid.p1,
>> shouldn't raid.p1 have come up automatically and without having to
>> copy data from raid.p0?
>
>No. According to the output above, raid.p1 hasn't been started yet.
>There's also no indication in your message or in the output that you
>tried to start it. If the start had died in the middle, the list
>command would have shown that.

At this point I had not started raid.p1 because I wanted to see
whether it would start itself. I thought it might, since all of the
drives have good data once raid.p0 comes up. According to your vinum
web pages, this requires a logging facility that is not implemented
yet. I don't usually unplug drives to watch things break, so I won't
lose sleep over it.

>Getting back to the first problem, my first guess is that you tried
>only 'start raid.p0', and didn't do a 'start raid.p1'. If you did,
>I'd like to see the output I ask for in the man page and at
>http://www.vinumvm.org/vinum/how-to-debug.html. It's too detailed to
>repeat here.

I think I have included all of the requested output. Kernel debugging,
if it turns out to be necessary, won't be possible until next week at
the earliest. Everything works great with RAID-1; RAID-0+1 works if I
pull only one drive. I don't think I have faulty hardware.

Thanks.

-- 
---------------------------------------------------
David C. Schooley, Ph.D.
Transmission Operations/Technical Operations Support
Commonwealth Edison Company
work phone: 630-691-4466/(472)-4466
work email: mailto:david.c.schooley@ucm.com
home email: mailto:dcschooley@ieee.org

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message