From owner-freebsd-fs@FreeBSD.ORG Mon May 26 07:30:53 2003
From: Ernie Elu
Message-Id: <200305261430.AAA19596@tinny.eis.net.au>
To: freebsd-fs@freebsd.org
Date: Tue, 27 May 2003 00:30:42 +1000 (EST)
Subject: Superblock recovery from backup block 32

I have an IDE drive running FreeBSD 4.8-RELEASE whose / partition suffered superblock corruption as a result of a faulty motherboard.

When I run fsck_ffs -o -b 32 /dev/ad2s2a, it passes all the scans. After that, however, a plain fsck_ffs /dev/ad2s2a still fails:

root # fsck_ffs /dev/ad2s2a
** /dev/ad2s2a
Cannot find file system superblock
/dev/ad2s2a: INCOMPLETE LABEL: type 4.2BSD fsize 0, frag 0, cpg 0, size 13211287

I did a fair bit of reading in the archives, and I thought fsck was meant to replace the faulty superblock automatically when you specify an alternative with the -b flag, but that is not happening.

Can anyone tell me how to replace the superblock with the copy at block 32, which still seems to be OK?
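
A note on the numbers in the message above. The -b argument to fsck counts 512-byte sectors, so "block 32" is byte 16384 of the partition. The sketch below is a read-only sanity check, not a recovery procedure. It assumes a UFS1 file system on little-endian hardware, with the primary superblock at byte 8192 and the magic number 0x011954 stored 1372 bytes into the superblock. The names sector_to_byte and has_ufs1_magic are made up for illustration, and the demo runs against a synthetic in-memory image rather than a real disk.

```python
import struct

DEV_BSIZE = 512           # fsck -b counts in 512-byte sectors
UFS1_SBLOCK_BYTE = 8192   # assumed byte offset of the primary UFS1 superblock
UFS1_MAGIC = 0x011954     # UFS1 magic number
MAGIC_OFFSET = 1372       # assumed offset of the magic field inside the superblock


def sector_to_byte(sector):
    """Convert a 512-byte sector number (as given to fsck -b) to a byte offset."""
    return sector * DEV_BSIZE


def has_ufs1_magic(image, sector):
    """Return True if a superblock starting at `sector` carries the UFS1 magic."""
    start = sector_to_byte(sector) + MAGIC_OFFSET
    field = image[start:start + 4]
    if len(field) < 4:
        return False  # image too short to hold a superblock here
    return struct.unpack("<I", field)[0] == UFS1_MAGIC


# Synthetic 32 KB image: primary superblock zeroed, backup at sector 32 intact.
image = bytearray(32768)
backup = sector_to_byte(32) + MAGIC_OFFSET
image[backup:backup + 4] = struct.pack("<I", UFS1_MAGIC)

primary_sector = UFS1_SBLOCK_BYTE // DEV_BSIZE   # sector 16
print("primary ok:", has_ufs1_magic(image, primary_sector))
print("backup ok: ", has_ufs1_magic(image, 32))
```

On a real system the same check would read from a copy of the partition, never write to it; which offsets actually apply depends on how the file system was created.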

From owner-freebsd-fs@FreeBSD.ORG Thu May 29 05:36:59 2003
Date: Thu, 29 May 2003 08:36:55 -0400 (EDT)
From: Daniel Ellard
To: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org
Subject: how to do asynchrounous I/O at the device level?

I'm not sure if this is a question for fs or hackers, so I apologize if you see this twice.

I'm writing a device driver for a "soft-mirrored" disk. The idea is similar to ordinary disk mirroring, except that the focus is entirely on higher performance instead of fault tolerance: the secondary disk need not be an exact duplicate of the first. (I have a method for keeping track of which blocks on the secondary are actually in sync with the primary, and which might contain stale data.)

What I want to do is an ordinary write to the primary disk and an asynchronous write to the secondary, so that the calling process can continue on its way before the writes are actually finished on the secondary.

I've implemented my scheme with synchronous mirror writes by hacking up the CCD driver. (This wasn't a big deal, because CCD already implements disk mirroring, but because I'm also futzing around with a bunch of other stuff, the resulting code is structured a bit differently.)

Now I want to make the secondary writes asynchronous. The challenge is that I need to make copies of whatever state the device underneath CCD needs in order to do the I/Os. As soon as the primary writes are finished, the file system is going to deallocate or reuse the structures it passed down to CCD. I can't hack the file system code to delay this, because I need to hide all this inside the device driver.

I know I need to copy the buffer, and clone the buf struct. My questions are:

1. How to properly clone the buf struct to make a "standalone" buf. Just bcopy'ing it will result in it being filled with pointers linking it to the rest of the buffer pool, which I suspect will lead to horrible problems later. I'm pretty sure that I don't want the buffer manager to know about this buf, or this buf to believe it's part of the buffer pool.

2. Whether there's any other state that I need to preserve.

If this sort of thing has already been implemented somewhere, just point me to it...

Thanks,
-Dan

From owner-freebsd-fs@FreeBSD.ORG Fri May 30 15:22:01 2003
Date: Fri, 30 May 2003 18:21:53 -0400
Message-ID: <9D7F0DF3FB16D41184010050DA90E000BF6668@neo.confluentasp.local>
From: "Michael G. Jung"
Subject: Vinum / 4.8 / Referenced disk / Recovery

After a reboot on 4.8 I ended up with a degraded raid 5 partition...

The only thing special about my setup is.... 4944 drives spread over 3 channels, running SMP kernel.....

One sub disk was down and the drive was referenced... in scouring the mailing lists I saw where a referenced disk means you have referenced a non-existent drive - I read this as one vinum didn't think was defined.. in my case it was drive29 --> /dev/da29s1e

I don't know how this got referenced !!! It's been rebooted many times and this has not happened.

So I boldly created a config file for vinum and re-created the drive.....

--- config file ----
drive drvie29 device /dev/da29e
--- end ----

but I still can not start the sub disk.....

(root@jammin) /home/staff/mikej# vinum start raid5-1.p0.s15
Can't start raid5-1.p0.s15: Drive is down (5)
(root@jammin) /home/staff/mikej#

Here is what vinum thinks...... Do I rm the sub disk and re-create it????? Will this kill my raid-5 partition ??

Thanks !!

(root@jammin) /home/staff/mikej# vinum printconfig
# Vinum configuration of jammin.mikej.com, saved at Fri May 30 18:15:48 2003
drive drive1 device /dev/da1s1e
drive drive2 device /dev/da2s1e
drive drive3 device /dev/da3s1e
drive d2 device /dev/da15s1e
drive drive16 device /dev/da16s1e
drive drive17 device /dev/da17s1e
drive drive18 device /dev/da18s1e
drive drive19 device /dev/da19s1e
drive d1 device /dev/da20s1e
drive drive21 device /dev/da21s1e
drive drive22 device /dev/da22s1e
drive drive23 device /dev/da23s1e
drive drive24 device /dev/da24s1e
drive drive25 device /dev/da25s1e
drive drive26 device /dev/da26s1e
drive drive27 device /dev/da27s1e
drive drive28 device /dev/da28s1e
drive drive29 device /dev/da29s1e
drive *invalid* device
volume mirror1
volume raid5-1
plex name mirror1.p0 org concat vol mirror1
plex name mirror1.p1 org concat vol mirror1
plex name raid5-1.p0 org raid5 62s vol raid5-1
sd name mirror.p0.s0 drive d1 plex mirror1.p0 len 35551232s driveoffset 265s plexoffset 0s
sd name mirror.p1.s0 drive d2 plex mirror1.p1 len 35551232s driveoffset 265s plexoffset 0s
sd name raid5-1.p0.s0 drive drive1 plex raid5-1.p0 len 35551172s driveoffset 265s plexoffset 0s
sd name raid5-1.p0.s1 drive drive2 plex raid5-1.p0 len 35551172s driveoffset 265s plexoffset 62s
sd name raid5-1.p0.s2 drive drive3 plex raid5-1.p0 len 35551172s driveoffset 265s plexoffset 124s
sd name raid5-1.p0.s3 drive drive16 plex raid5-1.p0 len 35551172s driveoffset 265s plexoffset 186s
sd name raid5-1.p0.s4 drive drive17 plex raid5-1.p0 len 35551172s driveoffset 265s plexoffset 248s
sd name raid5-1.p0.s5 drive drive18 plex raid5-1.p0 len 35551172s driveoffset 265s plexoffset 310s
sd name raid5-1.p0.s6 drive drive19 plex raid5-1.p0 len 35551172s driveoffset 265s plexoffset 372s
sd name raid5-1.p0.s7 drive drive21 plex raid5-1.p0 len 35551172s driveoffset 265s plexoffset 434s
sd name raid5-1.p0.s8 drive drive22 plex raid5-1.p0 len 35551172s driveoffset 265s plexoffset 496s
sd name raid5-1.p0.s9 drive drive23 plex raid5-1.p0 len 35551172s driveoffset 265s plexoffset 558s
sd name raid5-1.p0.s10 drive drive24 plex raid5-1.p0 len 35551172s driveoffset 265s plexoffset 620s
sd name raid5-1.p0.s11 drive drive25 plex raid5-1.p0 len 35551172s driveoffset 265s plexoffset 682s
sd name raid5-1.p0.s12 drive drive26 plex raid5-1.p0 len 35551172s driveoffset 265s plexoffset 744s
sd name raid5-1.p0.s13 drive drive27 plex raid5-1.p0 len 35551172s driveoffset 265s plexoffset 806s
sd name raid5-1.p0.s14 drive drive28 plex raid5-1.p0 len 35551172s driveoffset 265s plexoffset 868s
sd name raid5-1.p0.s15 drive *invalid* plex raid5-1.p0 len 35551172s driveoffset 265s plexoffset 930s

(root@jammin) /etc# vinum l
18 drives:
D drive1      State: up          Device /dev/da1s1e    Avail: 0/17359 MB (0%)
D drive2      State: up          Device /dev/da2s1e    Avail: 0/17359 MB (0%)
D drive3      State: up          Device /dev/da3s1e    Avail: 0/17359 MB (0%)
D d2          State: up          Device /dev/da15s1e   Avail: 0/17359 MB (0%)
D drive16     State: up          Device /dev/da16s1e   Avail: 0/17359 MB (0%)
D drive17     State: up          Device /dev/da17s1e   Avail: 0/17359 MB (0%)
D drive18     State: up          Device /dev/da18s1e   Avail: 0/17359 MB (0%)
D drive19     State: up          Device /dev/da19s1e   Avail: 0/17359 MB (0%)
D d1          State: up          Device /dev/da20s1e   Avail: 0/17359 MB (0%)
D drive21     State: up          Device /dev/da21s1e   Avail: 0/17359 MB (0%)
D drive22     State: up          Device /dev/da22s1e   Avail: 0/17359 MB (0%)
D drive23     State: up          Device /dev/da23s1e   Avail: 0/17359 MB (0%)
D drive24     State: up          Device /dev/da24s1e   Avail: 0/17359 MB (0%)
D drive25     State: up          Device /dev/da25s1e   Avail: 0/17359 MB (0%)
D drive26     State: up          Device /dev/da26s1e   Avail: 0/17359 MB (0%)
D drive27     State: up          Device /dev/da27s1e   Avail: 0/17359 MB (0%)
D drive28     State: up          Device /dev/da28s1e   Avail: 0/17359 MB (0%)
D drive29     State: up          Device /dev/da29s1e   Avail: 17359/17359 MB (100%)
D *invalid*   State: referenced  Device                Avail: 0/0 MB

2 volumes:
V mirror1     State: up  Plexes: 2  Size: 16 GB
V raid5-1     State: up  Plexes: 1  Size: 254 GB

3 plexes:
P mirror1.p0  C   State: up        Subdisks: 1   Size: 16 GB
P mirror1.p1  C   State: up        Subdisks: 1   Size: 16 GB
P raid5-1.p0  R5  State: degraded  Subdisks: 16  Size: 254 GB

18 subdisks:
S mirror.p0.s0     State: up     PO: 0 B     Size: 16 GB
S mirror.p1.s0     State: up     PO: 0 B     Size: 16 GB
S raid5-1.p0.s0    State: up     PO: 0 B     Size: 16 GB
S raid5-1.p0.s1    State: up     PO: 31 kB   Size: 16 GB
S raid5-1.p0.s2    State: up     PO: 62 kB   Size: 16 GB
S raid5-1.p0.s3    State: up     PO: 93 kB   Size: 16 GB
S raid5-1.p0.s4    State: up     PO: 124 kB  Size: 16 GB
S raid5-1.p0.s5    State: up     PO: 155 kB  Size: 16 GB
S raid5-1.p0.s6    State: up     PO: 186 kB  Size: 16 GB
S raid5-1.p0.s7    State: up     PO: 217 kB  Size: 16 GB
S raid5-1.p0.s8    State: up     PO: 248 kB  Size: 16 GB
S raid5-1.p0.s9    State: up     PO: 279 kB  Size: 16 GB
S raid5-1.p0.s10   State: up     PO: 310 kB  Size: 16 GB
S raid5-1.p0.s11   State: up     PO: 341 kB  Size: 16 GB
S raid5-1.p0.s12   State: up     PO: 372 kB  Size: 16 GB
S raid5-1.p0.s13   State: up     PO: 403 kB  Size: 16 GB
S raid5-1.p0.s14   State: up     PO: 434 kB  Size: 16 GB
S raid5-1.p0.s15   State: stale  PO: 465 kB  Size: 16 GB
(root@jammin) /etc#

From owner-freebsd-fs@FreeBSD.ORG Fri May 30 19:58:14 2003
Date: Sat, 31 May 2003 12:28:08 +0930
From: Greg 'groggy' Lehey
To: "Michael G. Jung"
Message-ID: <20030531025808.GF56538@wantadilla.lemis.com>
References: <9D7F0DF3FB16D41184010050DA90E000BF6668@neo.confluentasp.local>
In-Reply-To: <9D7F0DF3FB16D41184010050DA90E000BF6668@neo.confluentasp.local>
Organization: The FreeBSD Project
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-418-838-708
WWW-Home-Page: http://www.FreeBSD.org/
X-PGP-Fingerprint: 9A1B 8202 BCCE B846 F92F 09AC 22E6 F290 507A 4223
cc: freebsd-fs@freebsd.org
cc: freebsd-hackers@freebsd.org
Subject: Re: Vinum / 4.8 / Referenced disk / Recovery

On Friday, 30 May 2003 at 18:21:53 -0400, Michael G. Jung wrote:
> After a reboot on 4.8 I ended up with a degraded raid 5 partition...
>
> The only thing special about my setup is.... 4944 drives spread over 3 channels,
> running SMP kernel.....

That's a lot of drives.

> One sub disk was down and the drive was referenced... in scouring the
> mailing lists I saw where a referenced disk means you have referenced a
> non-existent drive - I read this as one vinum didn't think was defined.. in my
> case it was drive29 --> /dev/da29s1e
>
> I don't know how this got referenced !!!

It's part of your configuration. From the printconfig output:

> drive drive29 device /dev/da29s1e

> It's been rebooted many times and this has not happened.

It probably hasn't failed before.

> So I boldly created a config file for vinum and re-created the drive.....
>
> --- config file ----
> drive drvie29 device /dev/da29e
> --- end ----
>
> but I still can not start the sub disk.....
>
> (root@jammin) /home/staff/mikej# vinum start raid5-1.p0.s15
> Can't start raid5-1.p0.s15: Drive is down (5)
> (root@jammin) /home/staff/mikej#
>
> Here is what vinum thinks...... Do I rm the sub disk and re-create
> it?????

No.

> Will this kill my raid-5 partition ??

If you do enough messing around with the configuration, yes, you can kill your RAID-5 plex.

In all probability, your drive has failed and requires replacement. You'll see that from the system log file. Look at http://www.vinumvm.org/vinum/how-to-debug.html and http://www.vinumvm.org/vinum/replacing-drive.html. You don't need to submit the information if you can understand it and take the appropriate action.

Greg
--
See complete headers for address and phone numbers
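
The two vinum listings in this thread report the same plex offsets in different units: printconfig prints 512-byte sectors ("plexoffset 930s"), while the list output prints kilobytes ("PO: 465 kB"). The sketch below reproduces that arithmetic from the figures quoted above (stripe size 62s, subdisk length 35551172s, 16 subdisks). The function names are made up for illustration, and the capacity formula assumes the usual RAID-5 rule that one subdisk's worth of space goes to parity; it is a cross-check of the posted numbers, not a description of vinum internals.

```python
SECTOR_BYTES = 512           # vinum's "s" suffix: 512-byte sectors
STRIPE_SECTORS = 62          # from "org raid5 62s"
SUBDISK_SECTORS = 35551172   # from "len 35551172s"
SUBDISK_COUNT = 16           # raid5-1.p0.s0 .. raid5-1.p0.s15


def plexoffset_sectors(index):
    """Plex offset of subdisk `index`, in sectors, as printconfig shows it."""
    return index * STRIPE_SECTORS


def sectors_to_kb(sectors):
    """Convert sectors to whole kilobytes, as the list output shows them."""
    return sectors * SECTOR_BYTES // 1024


def raid5_usable_bytes(subdisks, sectors_each):
    """Usable bytes, assuming one subdisk's worth of space holds parity."""
    return (subdisks - 1) * sectors_each * SECTOR_BYTES


for i in (0, 1, 15):
    po = plexoffset_sectors(i)
    print(f"s{i}: plexoffset {po}s = {sectors_to_kb(po)} kB")

usable_gb = raid5_usable_bytes(SUBDISK_COUNT, SUBDISK_SECTORS) // 2**30
print(f"usable: {usable_gb} GB")
```

For subdisk s15 this gives 930 sectors and 465 kB, and the usable size comes out at 254 GB, matching the listing.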