Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 29 Jan 2003 12:26:10 +0100 (CET)
From:      Dmitry Karasik <dmitry@karasik.eu.org>
To:        FreeBSD-gnats-submit@FreeBSD.org
Subject:   kern/47628: msdosfs file corruption fix
Message-ID:  <200301291126.h0TBQAPR001353@raven.plab.ku.dk>

next in thread | raw e-mail | index | archive | help

>Number:         47628
>Category:       kern
>Synopsis:       msdosfs file corruption fix
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          change-request
>Submitter-Id:   current-users
>Arrival-Date:   Wed Jan 29 03:30:01 PST 2003
>Closed-Date:
>Last-Modified:
>Originator:     Dmitry Karasik
>Release:        FreeBSD 4.7-STABLE i386
>Organization:
>Environment:
System: FreeBSD raven.plab.ku.dk 4.7-STABLE FreeBSD 4.7-STABLE #30: Wed Jan 29 11:07:09 CET 2003 root@raven.plab.ku.dk:/usr/obj/usr/src/sys/RAVEN i386


>Description:

        MSDOSFS does not support file holes, and when application writes beyound
        the end of file, pads the gap with zeroes. Under certain conditions
        this does not happen, - the files either filled with junk, or neighbor
        files get corrupted.

        I observed the bug on FAT32 partitions, 4GB and 57GB on two machines
        with 4.7-STABLE. I didn't observe the bug on FAT32 partitions on 4.3-STABLE.

>How-To-Repeat:

        The following perl script, if run on msdosfs, writes 10 files filled with
        an unique character. The files 0, 1, 2, 3, 4 are either filled with junk at 
        at the end, or random zero chunks are present in the middle of file. In the 
        latter case the script prints "inconsistency at ... " message.

        
            use strict;

            my $i;

            my $space = 5000000;
            my $maxfd = 10;
            my $use_nofd = 1;
            $|=1;
            print "init\n";

            my $ufdlim = $use_nofd ? int( $maxfd/2) : ($maxfd-1);

            for ( $i = 0; $i < $maxfd; $i++) {
               open F, "> $i";
               my $c = ($i) x 0;
               print F $c;
               close F;
            }

            eval "open F$_, qq(+< $_)" for 0 .. $ufdlim;

            print "start\n";

            my $t = 0;

            while ( 1) {
               my $fno  = int rand $maxfd;
               my $size = int rand ( $space / 100) + 1;

               my $fn;
               if ( $fno <= $ufdlim) {
                  $fn = eval "*F$fno";
               } else {
                  $fn = \*F;
                  open F, "+< $fno" or die "Error on $fno:$!\n";
                  sysseek F, 0, 2 or die "is:$!\n";
               }
               $size = 1024;

               $t += $size;

               my $sc = ( substr($fno,-1,1)) x $size;
               print int($t*100/$space/$maxfd), "%\r";
               sysseek $fn, 2047, 1 or die "seek $fno:$!";
               syswrite $fn, "X" or die "write $fno:$!";
               sysseek $fn, -2048, 1 or die "Seek $fno:$!";
               syswrite $fn, $sc or die "Write $fno:$!";
               close F if $fno >= $ufdlim;
               last if $fn <= $ufdlim && -s $fno > $space;
            }

            eval "close F$_;" for 0 .. $ufdlim;

            print "test\n";

            my $bigp = 0;
            for ( $i = 0; $i <= $ufdlim; $i++) {
               open F, "< $i";
               my $size = -s $i;
               my $lsize = $size - 2048 - 10240;
               my $sz;
               my $r;
               my $pos = 0;
               my $ss = substr($i,-1,1);

               my $rd = 10240;
               
               while (( $sz = sysread (F, $r, $rd)) > 0) {
                  my $str = ($ss) x $sz;
                  warn sprintf("$i inconsistent at %08x\n", $pos) if $r ne $str; 
                  $pos += $sz;
                  $bigp += $sz;
                  print int( $bigp * 100 / $size / $maxfd), "%\r";
                  if ( $lsize >= 10240) {
                     $lsize -= 10240;
                  } elsif ( $lsize == 0) {
                     last;
                  } else {
                     $rd = $lsize; 
                     $lsize = 0;
                  }
               }
               close F;
            #   unlink $i;
            }

            print "done\n";

>Fix:

	Since the corruption does not happen when files are written without seek-over-eof,
        it seems that the zero-padding code in MSDOSFS is faulty. The patch with more
        thorough explanation follows, but it would be necessary for someone with knowledge
        of msdosfs and vfs look at it first. Since the code didn't change after 1997, I
        believe it is some bsd-specific bio/vfs changes that triggered the bug.

        
--- msdosfs_fat.c	Wed Jan 29 11:58:46 2003
+++ /usr/src/sys/msdosfs/msdosfs_fat.c	Wed Jan 29 11:27:47 2003
@@ -1092,7 +1092,7 @@
 					*bpp = bp;
 					bpp = NULL;
 				} else
-					bdwrite(bp);
+					bwrite(bp);
 			}
 		}
 	}


--- msdosfs_vnops.c	Wed Jan 29 11:58:46 2003
+++ /usr/src/sys/msdosfs/msdosfs_vnops.c	Wed Jan 29 11:36:23 2003
@@ -710,11 +710,45 @@
 	 * files with holes in them, DOS doesn't so we must fill the hole
 	 * with zeroed blocks.
 	 */
+        /*
+
+        This code below was commented after it was noticed that writing beyond 
+        a file produces random corruption of the current file and its neighbors. 
+        This does not happen with regular file extension though. Example code
+
+           seek( f, 2047, SEEK_END); 
+           write( f, x, 1);
+           seek( f, -2048, SEEK_END);
+           write( f, x, 1024);
+
+        repeated simultaneously on several files leaves random garbage in the end
+        of some files, which points either on a race in FAT cluster allocation code,
+        or/and that the zero chunks that suppose to fill the gap were written to
+        a wrong place.
+        
+        The probable cause is that deextend() calls extendfile(.., DE_CLEAR),
+        which flushes zeroed blocks by bdwrite(). The patch reverts it to
+        bwrite() instead, and calls extendfile() directly. Since deextend() also
+        invokes deupdat(), it is probably a place where a race could occur, so
+        this is another reason to call extendfile() directly.
+
+        OTOH I do not know if it is o.k. not to update de_FileSize as dextend()
+        does, or probably vnode_pager_setsize() call must be done also. Although
+        it seems that the patch fights the corruptions, and ever if it does 
+        successfully, I believe the patch must be reviewed by someone who has
+        deeper knowledge of kernel, primarily because the corruptions were noticed
+        on 4.7-STABLE, but not on 4.3-STABLE, so I presume some changes in vfs
+        influenced msdosfs.
+
+        Dmitry Karasik, <dmitry@karasik.eu.org>
+        29.01.2003
+
 	if (uio->uio_offset > dep->de_FileSize) {
 		error = deextend(dep, uio->uio_offset, cred);
 		if (error)
 			return (error);
 	}
+        */
 
 	/*
 	 * Remember some values in case the write fails.
@@ -726,6 +760,14 @@
 	 * If we write beyond the end of the file, extend it to its ultimate
 	 * size ahead of the time to hopefully get a contiguous area.
 	 */
+	if (uio->uio_offset > dep->de_FileSize) {
+		count = de_clcount(pmp, uio->uio_offset + resid) -
+			de_clcount(pmp, osize);
+		error = extendfile(dep, count, NULL, NULL, DE_CLEAR);
+		if (error &&  (error != ENOSPC || (ioflag & IO_UNIT)))
+			goto errexit;
+		lastcn = dep->de_fc[FC_LASTFC].fc_frcn;
+	} else
 	if (uio->uio_offset + resid > osize) {
 		count = de_clcount(pmp, uio->uio_offset + resid) -
 			de_clcount(pmp, osize);


>Release-Note:
>Audit-Trail:
>Unformatted:

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200301291126.h0TBQAPR001353>