Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 1 Feb 2013 20:08:54 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        fs@freebsd.org
Subject:   some fixes for msdosfs
Message-ID:  <20130201182606.A1492@besplex.bde.org>

next in thread | raw e-mail | index | archive | help
Please commit some of these fixes.

1.
The directory entry for dotdot was corrupted in the FAT32 case when moving
a directory to a subdir of the root directory from somewhere else.

For all directory moves that change the parent directory, the dotdot
entry must be fixed up.  For msdosfs, the root directory is magic for
non-FAT32.  It is less magic for FAT32, but needs the same magic for
the dotdot fixup.  It didn't have it.

chkdsk and fsck_msdosfs fix the corrupt directory entries with no problems.

The fix is simple -- use the same magic for dotdot in msdosfs_rename()
as in msdosfs_mkdir().  But the patch is large due to related cleanups
and unrelated changes that -current already has.

@ Index: msdosfs_vnops.c
@ ===================================================================
@ RCS file: /home/ncvs/src/sys/fs/msdosfs/msdosfs_vnops.c,v
@ retrieving revision 1.147
@ diff -u -2 -r1.147 msdosfs_vnops.c
@ --- msdosfs_vnops.c	4 Feb 2004 21:52:53 -0000	1.147
@ +++ msdosfs_vnops.c	31 Jan 2013 17:36:21 -0000
@ @@ -926,5 +978,5 @@
@  	int doingdirectory = 0, newparent = 0;
@  	int error;
@ -	u_long cn;
@ +	u_long cn, pcl;
@  	daddr_t bn;
@  	struct denode *fddep;	/* from file's parent directory	 */

This is in msdosfs_rename().

Use the same variable name as in mkdir(), instead of reusing cn.

@ @@ -1199,9 +1251,13 @@
@  		}
@  		dotdotp = (struct direntry *)bp->b_data + 1;
@ -		putushort(dotdotp->deStartCluster, dp->de_StartCluster);
@ +		pcl = dp->de_StartCluster;
@ +		if (FAT32(pmp) && pcl == pmp->pm_rootdirblk)
@ +			pcl = MSDOSFSROOT;
@ +		putushort(dotdotp->deStartCluster, pcl);
@  		if (FAT32(pmp))
@ -			putushort(dotdotp->deHighClust, dp->de_StartCluster >> 16);
@ -		error = bwrite(bp);
@ -		if (error) {
@ +			putushort(dotdotp->deHighClust, pcl >> 16);

Use the same code as in mkdir().  Don't comment on it again.

@ +		if (fvp->v_mount->mnt_flag & MNT_ASYNC)
@ +			bdwrite(bp);
@ +		else if ((error = bwrite(bp)) != 0) {
@  			/* XXX should really panic here, fs is corrupt */
@  			VOP_UNLOCK(fvp, 0, td);

Unrelated changes that -current already has.

@ @@ -1313,6 +1369,11 @@
@  	putushort(denp[0].deMTime, ndirent.de_MTime);
@  	pcl = pdep->de_StartCluster;
@ +	/*
@ +	 * Although the root directory has a non-magic starting cluster
@ +	 * number for FAT32, chkdsk and fsck_msdosfs still require
@ +	 * references to it in dotdot entries to be magic.
@ +	 */
@  	if (FAT32(pmp) && pcl == pmp->pm_rootdirblk)
@ -		pcl = 0;
@ +		pcl = MSDOSFSROOT;
@  	putushort(denp[1].deStartCluster, pcl);
@  	putushort(denp[1].deCDate, ndirent.de_CDate);

This is in msdosfs_mkdir().

Document the magic there.  Don't hard-code 0.

@ @@ -1324,9 +1385,10 @@
@  	if (FAT32(pmp)) {
@  		putushort(denp[0].deHighClust, newcluster >> 16);
@ -		putushort(denp[1].deHighClust, pdep->de_StartCluster >> 16);
@ +		putushort(denp[1].deHighClust, pcl >> 16);
@  	}

Don't depend on magic soft-coding of 0.  For the FAT32 root directory,
pdep->de_StartCluster is usually < 65536.  Perhaps it is always small.
The old code depended on this to get a result of 0 when the value is
shifted.

@ 
@ -	error = bwrite(bp);
@ -	if (error)
@ +	if (ap->a_dvp->v_mount->mnt_flag & MNT_ASYNC)
@ +		bdwrite(bp);
@ +	else if ((error = bwrite(bp)) != 0)
@  		goto bad;
@

Unrelated changes that -current already has.

Further cleanups: the condition for being the root directory should
probably be written as (DETOV(pdep)->v_vflag & VV_ROOT).  msdosfs uses
this in some places, but it prefers to test if a denode's first cluster
number is MSDOSFSROOT.  The latter is simpler and was equivalent before
FAT32 existed.  msdosfs still uses it a lot, but it now means that the
denode is for a non-FAT32 root directory.  This is quite confusing.
The old test often gives the correct classification because for the
FAT32 case where it differs, the root directory is not magic, and the
code under the test is really handling the magic case.  Comments add
to the confusion because they are mostly unchanged and still say that
the root directory is always magic.  Cases where the root directory
is magic for FAT32 are mostly classified using the (FAT32(pmp) && cn
== pmp->pm_rootdirblk) condition.  Apparently there is some magic that
requires the FAT32(pmp) condition before pmp->pm_rootdirblk can be
trusted.  The VV_ROOT condition seems better for these cases.

============

2.
mountmsdosfs() had an insane sanity test.

While testing the above, I tried FAT32 on a small partition.  This
failed to mount because pmp->pm_Sectors was nonzero.  Normally,
FAT32 file systems are so large that the 16-bit pm_Sectors can't
hold the size.  This is indicated by setting it to 0 and using
only pm_HugeSectors.  But at least old versions of newfs_msdos
use the 16-bit field if possible, and msdosfs supports this except
for breaking its own support in the sanity check.  This is quite
different from the handling of pm_FATsecs -- now the 16-bit value
is always ignored for FAT32 except for checking that it is 0,
and newfs_msdos doesn't use the 16-bit value for FAT32.

I just removed the sanity test.

@ Index: msdosfs_vfsops.c
@ ===================================================================
@ RCS file: /home/ncvs/src/sys/fs/msdosfs/msdosfs_vfsops.c,v
@ retrieving revision 1.120
@ diff -u -2 -r1.120 msdosfs_vfsops.c
@ --- msdosfs_vfsops.c	16 Jun 2004 09:47:03 -0000	1.120
@ +++ msdosfs_vfsops.c	31 Jan 2013 17:53:25 -0000
@ @@ -431,5 +459,4 @@
@  		if (bsp->bs710.bsBootSectSig2 != BOOTSIG2
@  		    || bsp->bs710.bsBootSectSig3 != BOOTSIG3
@ -		    || pmp->pm_Sectors
@  		    || pmp->pm_FATsecs
@  		    || getushort(b710->bpbFSVers)) {

The sanity tests of the signatures have already been removed in -current.

============

3.
Backup FATs were sometimes marked dirty by copying their first block
from the primary FAT, and then they were not marked clean on unmount.

This bug has been known for a long time, and always happened while
testing (1), so I fixed it.  My tests were mostly to create a new file
system, 1 mkdir and move the new directory forth and back from the root
partition to corrupt it.  Since all the FAT entries are in the first
block of the FAT, backing this up always marks the backups as unclean.

chkdsk and fsck_msdosfs fix this, but it gives them extra work and
uninspires confidence in the backups.

@ Index: msdosfs_fat.c
@ ===================================================================
@ RCS file: /home/ncvs/src/sys/fs/msdosfs/msdosfs_fat.c,v
@ retrieving revision 1.50
@ diff -u -2 -r1.50 msdosfs_fat.c
@ --- msdosfs_fat.c	1 Sep 2008 13:18:16 -0000	1.50
@ +++ msdosfs_fat.c	31 Jan 2013 15:07:41 -0000
@ @@ -337,6 +338,6 @@
@  	u_long fatbn;
@  {
@ -	int i;
@  	struct buf *bpn;
@ +	int cleanfat, i;
@ 
@  #ifdef MSDOSFS_DEBUG
@ @@ -378,4 +356,10 @@
@  		 * bwrite()'s and really slow things down.
@  		 */
@ +		if (fatbn != pmp->pm_fatblk || FAT12(pmp))
@ +			cleanfat = 0;
@ +		else if (FAT16(pmp))
@ +			cleanfat = 16;
@ +		else
@ +			cleanfat = 32;
@  		for (i = 1; i < pmp->pm_FATs; i++) {
@  			fatbn += pmp->pm_FATsecs;
@ @@ -384,5 +368,10 @@
@  			    0, 0, 0);
@  			bcopy(bp->b_data, bpn->b_data, bp->b_bcount);
@ -			if (pmp->pm_flags & MSDOSFSMNT_WAITONFAT)
@ +			/* Force the clean bit on in the other copies. */
@ +			if (cleanfat == 16)
@ +				((u_int8_t *)bpn->b_data)[3] |= 0x80;
@ +			else if (cleanfat == 32)
@ +				((u_int8_t *)bpn->b_data)[7] |= 0x08;
@ +			if (pmp->pm_mountp->mnt_flag & MNT_SYNCHRONOUS)
@  				bwrite(bpn);
@  			else

Unrelated change for the bwrite() condition.  The MSDOSFSMNT_WAITONFAT
flag is bogus and broken.  It does less than track the MNT_SYNCHRONOUS
flag.  It is set to the latter at mount time but not updated by
MNT_UPDATE.  You could exploit this to set it to a different value
than the current MNT_SYNCHRONOUS setting, but this is undocumented and
fragile.  (FAT updates should be sync by default, but this is too slow,
so the default is async (delayed) FAT, async (delayed or async) file
data and sync metadata for denodes, which is probably a worse
combination than async everything.  But you could change the FAT write
policy to sync by mounting with sync and then MNT_UPDATEing with nosync
to get nosync (default) for file data.)

@ @@ -394,11 +383,10 @@
@  	 * Write out the first (or current) fat last.
@  	 */
@ -	if (pmp->pm_flags & MSDOSFSMNT_WAITONFAT)
@ +	if (pmp->pm_mountp->mnt_flag & MNT_SYNCHRONOUS)
@  		bwrite(bp);
@  	else
@  		bdwrite(bp);

Fixing the condition is more important for the primary FAT.

The backups should probably always be written with async or even delayed
writes.  (async would be not much different from sync, since we don't
check for success.  It would still be very slow.)

@ -	/*
@ -	 * Maybe update fsinfo sector here?
@ -	 */
@ +
@ +	pmp->pm_fmod |= 1;

Unrelated changes.  (I moved all fsinfo updates from here, and use pm_fmod
as a set of flags, with the 1 flag indicating the old (unused) condition
of a modified FAT.)

@  }
@

@ @@ -1097,5 +1085,5 @@
@   * manipulating the upper bit of the FAT entry for cluster 1.  Note that
@   * this bit is not defined for FAT12 volumes, which are always assumed to
@ - * be dirty.
@ + * be clean.
@   *
@   * The fatentry() routine only works on cluster numbers that a file could

Vaguely related -- fix a backwards comment in markvoldirty().

markvoldirty() is too specialized.  The bug wouldn't have existed if
updatefats() had been used instead of the direct bread()/bwrite() of
a single block in markvoldirty().  There are also some locking and
blocksize problems with this special i/o.  However, I prefer not to
write the all backups for marking the volume dirty, and it was easy
to keep them marked clean in updatefats().  Writing the primary FAT
when only it is dirty is already 1 too many writes.  So it is a feature
that markvoldirty() only writes 1 block.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20130201182606.A1492>