Date: Wed, 17 Oct 2007 07:40:06 GMT From: Nate Eldredge <nge@cs.hmc.edu> To: freebsd-bugs@FreeBSD.org Subject: Re: bin/115174: growfs(8) needs zero-writing for safe filesystem expansion [PATCH] Message-ID: <200710170740.l9H7e6IG053292@freefall.freebsd.org>
next in thread | raw e-mail | index | archive | help
The following reply was made to PR bin/115174; it has been noted by GNATS. From: Nate Eldredge <nge@cs.hmc.edu> To: bug-followup@FreeBSD.org, etc@fluffles.net, scottl@freebsd.org Cc: Subject: Re: bin/115174: growfs(8) needs zero-writing for safe filesystem expansion [PATCH] Date: Wed, 17 Oct 2007 00:32:57 -0700 (PDT) I took a look at this. Let me say up front that I am not an fs expert, I'm just reading the code, so any of this could be wrong. It looks like this was introduced in revision 1.23 of src/sbin/growfs/growfs.c, commited by scottl. There are actually two different bugs for UFS1 and UFS2. The idea is that inodes have to be initialized to zero before they are used, and fsck complains if this isn't the case, even if those inodes are not in use according to the cylinder group bitmap. In UFS1, newfs is expected to initialize all inodes. In UFS2, inode initialization is "lazy"; there is a field cg_initediblk in the cylinder group indicating how many inodes have been initialized, starting from the first. (It looks like this and cg_niblk count inodes, not blocks, despite the name.) If more are needed, the kernel initializes them as necessary. For some reason, newfs initializes the first two blocks worth of inodes in every cg. growfs works by creating new cylinder groups in the filesystem, each of which comes with its own chunk of inodes. Prior to revision 1.23, growfs initialized the inodes in all the cgs it created, for both UFS1 and UFS2. Revision 1.23 attempted to optimize this behavior. For UFS2, it removed the inode initialization completely, but did not adjust cg_initediblk accordingly, so the kernel and fsck still thought that all inodes were initialized when actually they were not. My patch fixes this by setting cg_initedblk to zero. It would be possible to behave like newfs and initialize the first few inodes, but I don't see why that is necessary. It is simpler to skip it entirely, and in my tests it works fine. For UFS1, revision 1.23 changes it so that all *but* the first two blocks of inodes are initialized. It looks like the author was confused by some out-of-context code from newfs (which I'm not sure is exactly right itself), since this doesn't make any sense. My patch changes it back to initializing all inodes, and removes some leftover UFS2 initialization code as well, which is no longer used. I tested this patch on a md filesystem, filling it with random bytes, creating and filling a 100 MB filesystem, then growing it to 200 MB. fsck passes the new filesystem and all files remain intact, with both UFS1 and UFS2. --- /usr/src/sbin/growfs/growfs.c Tue Nov 29 23:20:16 2005 +++ ./growfs.c Tue Oct 16 23:07:47 2007 @@ -376,7 +376,6 @@ long d, dlower, dupper, blkno, start; ufs2_daddr_t i, cbase, dmax; struct ufs1_dinode *dp1; - struct ufs2_dinode *dp2; struct csum *cs; if (iobuf == NULL && (iobuf = malloc(sblock.fs_bsize)) == NULL) { @@ -401,7 +400,7 @@ acg.cg_magic = CG_MAGIC; acg.cg_cgx = cylno; acg.cg_niblk = sblock.fs_ipg; - acg.cg_initediblk = sblock.fs_ipg; + acg.cg_initediblk = 0; acg.cg_ndblk = dmax - cbase; if (sblock.fs_contigsumsize > 0) acg.cg_nclusterblks = acg.cg_ndblk / sblock.fs_frag; @@ -445,26 +444,16 @@ setbit(cg_inosused(&acg), i); acg.cg_cs.cs_nifree--; } - /* - * XXX Newfs writes out two blocks of initialized inodes - * unconditionally. Should we check here to make sure that they - * were actually written? - */ if (sblock.fs_magic == FS_UFS1_MAGIC) { bzero(iobuf, sblock.fs_bsize); - for (i = 2 * sblock.fs_frag; i < sblock.fs_ipg / INOPF(&sblock); + for (i = 0; i < sblock.fs_ipg / INOPF(&sblock); i += sblock.fs_frag) { dp1 = (struct ufs1_dinode *)iobuf; - dp2 = (struct ufs2_dinode *)iobuf; #ifdef FSIRAND - for (j = 0; j < INOPB(&sblock); j++) - if (sblock.fs_magic == FS_UFS1_MAGIC) { - dp1->di_gen = random(); - dp1++; - } else { - dp2->di_gen = random(); - dp2++; - } + for (j = 0; j < INOPB(&sblock); j++) { + dp1->di_gen = random(); + dp1++; + } #endif wtfs(fsbtodb(&sblock, cgimin(&sblock, cylno) + i), sblock.fs_bsize, iobuf, fso, Nflag); -- Nate Eldredge nge@cs.hmc.edu
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200710170740.l9H7e6IG053292>