From owner-freebsd-fs@FreeBSD.ORG Fri Feb 1 09:09:08 2013 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 68218B5A for ; Fri, 1 Feb 2013 09:09:08 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail06.syd.optusnet.com.au (mail06.syd.optusnet.com.au [211.29.132.187]) by mx1.freebsd.org (Postfix) with ESMTP id 25654698 for ; Fri, 1 Feb 2013 09:09:06 +0000 (UTC) Received: from c211-30-173-106.carlnfd1.nsw.optusnet.com.au (c211-30-173-106.carlnfd1.nsw.optusnet.com.au [211.30.173.106]) by mail06.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id r1198suE025477 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 1 Feb 2013 20:08:57 +1100 Date: Fri, 1 Feb 2013 20:08:54 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: fs@freebsd.org Subject: some fixes for msdosfs Message-ID: <20130201182606.A1492@besplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.0 cv=MscKcBme c=1 sm=1 a=kj9zAlcOel0A:10 a=PO7r1zJSAAAA:8 a=JzwRw_2MAAAA:8 a=IPjz-GnoPKUA:10 a=Nf6N-zW9uRpywVz5buAA:9 a=CjuIK1q_8ugA:10 a=ADzlCoBTbut52P-S:21 a=qp1OgrSAsFp3MxqR:21 a=TEtd8y5WR3g2ypngnwZWYw==:117 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Feb 2013 09:09:08 -0000 Please commit some of these fixes. 1. The directory entry for dotdot was corrupted in the FAT32 case when moving a directory to a subdir of the root directory from somewhere else. For all directory moves that change the parent directory, the dotdot entry must be fixed up. For msdosfs, the root directory is magic for non-FAT32. It is less magic for FAT32, but needs the same magic for the dotdot fixup. It didn't have it. chkdsk and fsck_msdosfs fix the corrupt directory entries with no problems. The fix is simple -- use the same magic for dotdot in msdosfs_rename() as in msdosfs_mkdir(). But the patch is large due to related cleanups and unrelated changes that -current already has. @ Index: msdosfs_vnops.c @ =================================================================== @ RCS file: /home/ncvs/src/sys/fs/msdosfs/msdosfs_vnops.c,v @ retrieving revision 1.147 @ diff -u -2 -r1.147 msdosfs_vnops.c @ --- msdosfs_vnops.c 4 Feb 2004 21:52:53 -0000 1.147 @ +++ msdosfs_vnops.c 31 Jan 2013 17:36:21 -0000 @ @@ -926,5 +978,5 @@ @ int doingdirectory = 0, newparent = 0; @ int error; @ - u_long cn; @ + u_long cn, pcl; @ daddr_t bn; @ struct denode *fddep; /* from file's parent directory */ This is in msdosfs_rename(). Use the same variable name as in mkdir(), instead of reusing cn. @ @@ -1199,9 +1251,13 @@ @ } @ dotdotp = (struct direntry *)bp->b_data + 1; @ - putushort(dotdotp->deStartCluster, dp->de_StartCluster); @ + pcl = dp->de_StartCluster; @ + if (FAT32(pmp) && pcl == pmp->pm_rootdirblk) @ + pcl = MSDOSFSROOT; @ + putushort(dotdotp->deStartCluster, pcl); @ if (FAT32(pmp)) @ - putushort(dotdotp->deHighClust, dp->de_StartCluster >> 16); @ - error = bwrite(bp); @ - if (error) { @ + putushort(dotdotp->deHighClust, pcl >> 16); Use the same code as in mkdir(). Don't comment on it again. @ + if (fvp->v_mount->mnt_flag & MNT_ASYNC) @ + bdwrite(bp); @ + else if ((error = bwrite(bp)) != 0) { @ /* XXX should really panic here, fs is corrupt */ @ VOP_UNLOCK(fvp, 0, td); Unrelated changes that -current already has. @ @@ -1313,6 +1369,11 @@ @ putushort(denp[0].deMTime, ndirent.de_MTime); @ pcl = pdep->de_StartCluster; @ + /* @ + * Although the root directory has a non-magic starting cluster @ + * number for FAT32, chkdsk and fsck_msdosfs still require @ + * references to it in dotdot entries to be magic. @ + */ @ if (FAT32(pmp) && pcl == pmp->pm_rootdirblk) @ - pcl = 0; @ + pcl = MSDOSFSROOT; @ putushort(denp[1].deStartCluster, pcl); @ putushort(denp[1].deCDate, ndirent.de_CDate); This is in msdosfs_mkdir(). Document the magic there. Don't hard-code 0. @ @@ -1324,9 +1385,10 @@ @ if (FAT32(pmp)) { @ putushort(denp[0].deHighClust, newcluster >> 16); @ - putushort(denp[1].deHighClust, pdep->de_StartCluster >> 16); @ + putushort(denp[1].deHighClust, pcl >> 16); @ } Don't depend on magic soft-coding of 0. For the FAT32 root directory, pdep->de_StartCluster is usually < 65536. Perhaps it is always small. The old code depended on this to get a result of 0 when the value is shifted. @ @ - error = bwrite(bp); @ - if (error) @ + if (ap->a_dvp->v_mount->mnt_flag & MNT_ASYNC) @ + bdwrite(bp); @ + else if ((error = bwrite(bp)) != 0) @ goto bad; @ Unrelated changes that -current already has. Further cleanups: the condition for being the root directory should probably be written as (DETOV(pdep)->v_vflag & VV_ROOT). msdosfs uses this in some places, but it prefers to test if a denode's first cluster number is MSDOSFSROOT. The latter is simpler and was equivalent before FAT32 existed. msdosfs still uses it a lot, but it now means that the denode is for a non-FAT32 root directory. This is quite confusing. The old test often gives the correct classification because for the FAT32 case where it differs, the root directory is not magic, and the code under the test is really handling the magic case. Comments add to the confusion because they are mostly unchanged and still say that the root directory is always magic. Cases where the root directory is magic for FAT32 are mostly classified using the (FAT32(pmp) && cn == pmp->pm_rootdirblk) condition. Apparently there is some magic that requires the FAT32(pmp) condition before pmp->pm_rootdirblk can be trusted. The VV_ROOT condition seems better for these cases. ============ 2. mountmsdosfs() had an insane sanity test. While testing the above, I tried FAT32 on a small partition. This failed to mount because pmp->pm_Sectors was nonzero. Normally, FAT32 file systems are so large that the 16-bit pm_Sectors can't hold the size. This is indicated by setting it to 0 and using only pm_HugeSectors. But at least old versions of newfs_msdos use the 16-bit field if possible, and msdosfs supports this except for breaking its own support in the sanity check. This is quite different from the handling of pm_FATsecs -- now the 16-bit value is always ignored for FAT32 except for checking that it is 0, and newfs_msdos doesn't use the 16-bit value for FAT32. I just removed the sanity test. @ Index: msdosfs_vfsops.c @ =================================================================== @ RCS file: /home/ncvs/src/sys/fs/msdosfs/msdosfs_vfsops.c,v @ retrieving revision 1.120 @ diff -u -2 -r1.120 msdosfs_vfsops.c @ --- msdosfs_vfsops.c 16 Jun 2004 09:47:03 -0000 1.120 @ +++ msdosfs_vfsops.c 31 Jan 2013 17:53:25 -0000 @ @@ -431,5 +459,4 @@ @ if (bsp->bs710.bsBootSectSig2 != BOOTSIG2 @ || bsp->bs710.bsBootSectSig3 != BOOTSIG3 @ - || pmp->pm_Sectors @ || pmp->pm_FATsecs @ || getushort(b710->bpbFSVers)) { The sanity tests of the signatures have already been removed in -current. ============ 3. Backup FATs were sometimes marked dirty by copying their first block from the primary FAT, and then they were not marked clean on unmount. This bug has been known for a long time, and always happened while testing (1), so I fixed it. My tests were mostly to create a new file system, 1 mkdir and move the new directory forth and back from the root partition to corrupt it. Since all the FAT entries are in the first block of the FAT, backing this up always marks the backups as unclean. chkdsk and fsck_msdosfs fix this, but it gives them extra work and uninspires confidence in the backups. @ Index: msdosfs_fat.c @ =================================================================== @ RCS file: /home/ncvs/src/sys/fs/msdosfs/msdosfs_fat.c,v @ retrieving revision 1.50 @ diff -u -2 -r1.50 msdosfs_fat.c @ --- msdosfs_fat.c 1 Sep 2008 13:18:16 -0000 1.50 @ +++ msdosfs_fat.c 31 Jan 2013 15:07:41 -0000 @ @@ -337,6 +338,6 @@ @ u_long fatbn; @ { @ - int i; @ struct buf *bpn; @ + int cleanfat, i; @ @ #ifdef MSDOSFS_DEBUG @ @@ -378,4 +356,10 @@ @ * bwrite()'s and really slow things down. @ */ @ + if (fatbn != pmp->pm_fatblk || FAT12(pmp)) @ + cleanfat = 0; @ + else if (FAT16(pmp)) @ + cleanfat = 16; @ + else @ + cleanfat = 32; @ for (i = 1; i < pmp->pm_FATs; i++) { @ fatbn += pmp->pm_FATsecs; @ @@ -384,5 +368,10 @@ @ 0, 0, 0); @ bcopy(bp->b_data, bpn->b_data, bp->b_bcount); @ - if (pmp->pm_flags & MSDOSFSMNT_WAITONFAT) @ + /* Force the clean bit on in the other copies. */ @ + if (cleanfat == 16) @ + ((u_int8_t *)bpn->b_data)[3] |= 0x80; @ + else if (cleanfat == 32) @ + ((u_int8_t *)bpn->b_data)[7] |= 0x08; @ + if (pmp->pm_mountp->mnt_flag & MNT_SYNCHRONOUS) @ bwrite(bpn); @ else Unrelated change for the bwrite() condition. The MSDOSFSMNT_WAITONFAT flag is bogus and broken. It does less than track the MNT_SYNCHRONOUS flag. It is set to the latter at mount time but not updated by MNT_UPDATE. You could exploit this to set it to a different value than the current MNT_SYNCHRONOUS setting, but this is undocumented and fragile. (FAT updates should be sync by default, but this is too slow, so the default is async (delayed) FAT, async (delayed or async) file data and sync metadata for denodes, which is probably a worse combination than async everything. But you could change the FAT write policy to sync by mounting with sync and then MNT_UPDATEing with nosync to get nosync (default) for file data.) @ @@ -394,11 +383,10 @@ @ * Write out the first (or current) fat last. @ */ @ - if (pmp->pm_flags & MSDOSFSMNT_WAITONFAT) @ + if (pmp->pm_mountp->mnt_flag & MNT_SYNCHRONOUS) @ bwrite(bp); @ else @ bdwrite(bp); Fixing the condition is more important for the primary FAT. The backups should probably always be written with async or even delayed writes. (async would be not much different from sync, since we don't check for success. It would still be very slow.) @ - /* @ - * Maybe update fsinfo sector here? @ - */ @ + @ + pmp->pm_fmod |= 1; Unrelated changes. (I moved all fsinfo updates from here, and use pm_fmod as a set of flags, with the 1 flag indicating the old (unused) condition of a modified FAT.) @ } @ @ @@ -1097,5 +1085,5 @@ @ * manipulating the upper bit of the FAT entry for cluster 1. Note that @ * this bit is not defined for FAT12 volumes, which are always assumed to @ - * be dirty. @ + * be clean. @ * @ * The fatentry() routine only works on cluster numbers that a file could Vaguely related -- fix a backwards comment in markvoldirty(). markvoldirty() is too specialized. The bug wouldn't have existed if updatefats() had been used instead of the direct bread()/bwrite() of a single block in markvoldirty(). There are also some locking and blocksize problems with this special i/o. However, I prefer not to write the all backups for marking the volume dirty, and it was easy to keep them marked clean in updatefats(). Writing the primary FAT when only it is dirty is already 1 too many writes. So it is a feature that markvoldirty() only writes 1 block. Bruce