Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 13 Aug 1998 20:40:01 -0700 (PDT)
From:      Matthew Dillon <dillon@backplane.com>
To:        freebsd-bugs@FreeBSD.ORG
Subject:   Re: kern/7418 (file corruption on mmap-based-read during file write())
Message-ID:  <199808140340.UAA15933@freefall.freebsd.org>

next in thread | raw e-mail | index | archive | help
The following reply was made to PR kern/7418; it has been noted by GNATS.

From: Matthew Dillon <dillon@backplane.com>
To: Luoqi Chen <luoqi@chen.ml.org>
Cc: freebsd-gnats-submit@freebsd.org, luoqi@watermarkgroup.com
Subject: Re: kern/7418 (file corruption on mmap-based-read during file write())
Date: Thu, 13 Aug 1998 20:33:41 -0700 (PDT)

     I'm trying to track down the second PR I sent in... kern/7418 in this
     case.  I am concentrating on what happens when a page fault from an mmap'd
     file occurs in one process while a second process is blocked in
     ufs/ufs_readwrite() on the same file.
 
     The file corruption that I am seeing is approximately this:
 
     [this represents one page of memory]
 
 	[data][data][data..00 00 00 00][data]
 
     Where 00's replace what should have been valid data in the file.
     data written into the locations where the corruption occurs is being 
     replaced by 00.  i.e. I might see 
 
 	'abcdef [00 00 00] (PAGE BOUNDRY) jklmnop'.
 
     The most interesting item is that the 00 corruption always *ends* at
     a page boundry in the file.
 
     --
 
     So here is my question.  Refer to ufs/ufs_readwrite.c around line 337,
     as shown below.   What happens if VOP_BALLOC() blocks or uiomove()
     blocks during the discrete write() and, while blocked, another process
     has a read fault on precisely the same logical page via mmap()?
 
     Is it possible for ufs_readwrite to obtain a bp, copy the write() data
     to it, but for the bp to then somehow be thrown away?  
 
     I also don't understand why B_RELBUF is being set for the bp.  Can't 
     this cause the bp to be completely thrown away (aka kern/vfs_bio.c line 
     710)??   Makes sense for a READ, but I don't understand why B_RELBUF
     is being set for the bp in the WRITE.
 
         for (error = 0; uio->uio_resid > 0;) {
                 lbn = lblkno(fs, uio->uio_offset);
                 blkoffset = blkoff(fs, uio->uio_offset);
                 xfersize = fs->fs_bsize - blkoffset;
                 if (uio->uio_resid < xfersize)
                         xfersize = uio->uio_resid;
 
                 if (uio->uio_offset + xfersize > ip->i_size)
                         vnode_pager_setsize(vp, uio->uio_offset + xfersize);
 
                 if (fs->fs_bsize > xfersize)
                         flags |= B_CLRBUF;
                 else
                         flags &= ~B_CLRBUF;
 /* XXX is uio->uio_offset the right thing here? */
                 error = VOP_BALLOC(vp, uio->uio_offset, xfersize,
                     ap->a_cred, flags, &bp);
                 if (error != 0)
                         break;
 
                 if (uio->uio_offset + xfersize > ip->i_size) {
                         ip->i_size = uio->uio_offset + xfersize;
                         extended = 1;
                 }
 
                 size = BLKSIZE(fs, ip, lbn) - bp->b_resid;
                 if (size < xfersize)
                         xfersize = size;
 
                 error =
                     uiomove((char *)bp->b_data + blkoffset, (int)xfersize, uio);
                 if ((ioflag & IO_VMIO) &&
                    (LIST_FIRST(&bp->b_dep) == NULL))
                         bp->b_flags |= B_RELBUF;
 
 		...
 	}
 
     Finally, I tried looking in other filesystem device code for comparable
     source.  The ext2fs code seems to simply copy the ufs code:
 
                 error =
                     uiomove((char *)bp->b_data + blkoffset, (int)xfersize, uio);
                 if ((ioflag & IO_VMIO) &&
                    (LIST_FIRST(&bp->b_dep) == NULL)) /* in ext2fs? */
                         bp->b_flags |= B_RELBUF; 
 
     The msdosfs does not set B_RELBUF in either its read or write, nor does
     the nfs code.   It's all very confusing.
 
 						-Matt
 
     Matthew Dillon  Engineering, HiWay Technologies, Inc. & BEST Internet 
                     Communications
     <dillon@backplane.com> (Please include original email in any response)    
 

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199808140340.UAA15933>