From owner-freebsd-bugs  Fri Aug 14 12:10:06 1998
Return-Path: <owner-freebsd-bugs@FreeBSD.ORG>
Date: Fri, 14 Aug 1998 12:10:01 -0700 (PDT)
Message-Id: <199808141910.MAA27836@freefall.freebsd.org>
To: freebsd-bugs@FreeBSD.ORG
From: Luoqi Chen <luoqi@watermarkgroup.com>
Subject: Re: kern/7418 (file corruption on mmap-based-read during file write())
Reply-To: Luoqi Chen <luoqi@watermarkgroup.com>
Sender: owner-freebsd-bugs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

The following reply was made to PR kern/7418; it has been noted by GNATS.

From: Luoqi Chen <luoqi@watermarkgroup.com>
To: dillon@backplane.com, luoqi@chen.ml.org
Cc: freebsd-gnats-submit@freebsd.org, luoqi@watermarkgroup.com
Subject: Re: kern/7418 (file corruption on mmap-based-read during file write())
Date: Fri, 14 Aug 1998 15:02:26 -0400 (EDT)

 > 
 >     I've noticed something else in regards to the corruption which might
 >     throw more light on the problem.
 > 
 >     [one page]
 > 
 >     A,B,C,D ... discrete usenet articles stored in file.
 > 
 >     [AAAAAAA][AAABBBBBB][BBBBBBBB][BBBBCCCCCCdddddd][DDDDDD...]
 >     page #1	#2	    #3		#4		#5
 > 
 >     The lower case 'd's indicate corruption... that is, areas of the file
 >     that were corrupted to 0x00.  The interesting item is that not only does
 >     the corruption end at a page boundary, it *begins* at the beginning of
 >     an article.  That is, article 'C' does not get corrupted at all, nor
 >     is there a piece of the beginning of D that is not corrupted... the
 >     corruption begins at the beginning of article D and ends at the page
 >     boundary (article D continues past the page boundary.  The portion after
 >     the page boundary is not corrupted).
 > 
 I have an interesting observation about the spool file dump in the PR 7418
 report: article 'C' (<6pip3s$f5b$1@nnrp1.dejanews.com>) was completely lost.
 So it is not just the article crossing the page boundary that is affected;
 corruption can also hit articles that precede it on the same page. I also
 notice that both in PR 7418 and in your illustration above, the corruption
 takes place on even-numbered pages, i.e., the 2nd page of a two-page FFS
 block. Is this always the case? Are you aware of any other commonalities
 among the corruption cases you have seen?
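
The even-page question above can be checked mechanically against a spool file
dump. What follows is only a sketch under stated assumptions: a 4096-byte VM
page and an 8192-byte FFS block (two pages per block) are assumed, not taken
from the PR, and the helper names locate() and
zero_runs_ending_at_page_boundary() are made up for this illustration. It
looks for runs of 0x00 that stop exactly at a page boundary and reports which
page of its block each run falls in.

```python
PAGE_SIZE = 4096    # assumed VM page size
BLOCK_SIZE = 8192   # assumed FFS block size (two pages per block)


def locate(offset):
    """Return (page_number, page_in_block) for a byte offset, both 1-based."""
    page_index = offset // PAGE_SIZE
    pages_per_block = BLOCK_SIZE // PAGE_SIZE
    return page_index + 1, page_index % pages_per_block + 1


def zero_runs_ending_at_page_boundary(data, min_len=16):
    """Return (start, end) offsets of 0x00 runs that end exactly on a page boundary.

    Runs shorter than min_len are ignored so that an occasional stray zero
    byte does not count as corruption.
    """
    runs = []
    n = len(data)
    i = data.find(b"\x00")
    while i != -1:
        start = i
        # Advance to the first non-zero byte (or the end of the data).
        while i < n and data[i] == 0:
            i += 1
        if i % PAGE_SIZE == 0 and i - start >= min_len:
            runs.append((start, i))
        i = data.find(b"\x00", i)
    return runs
```

Applying locate() to the start offset of each reported run gives the 1-based
page number and whether the damage sits in the 1st or 2nd page of its block,
which is the pattern asked about above. Zero bytes at the tail of the last,
partially written page are not reported unless the file happens to end on a
page boundary.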
 
 >     So what about this possibility (this is only a possibility, not an actual
 >     trace):
 > 
 > 	* process 1 write()'s article B to the file
 > 
 > 	* process 1 write()'s article C to the file
 > 
 > 	* process 2 mmap's and faults pages associated with C (i.e. page #4)
 > 		(at this point, 'D' has not been written yet and no corruption
 > 		has yet occurred, page #4 properly contains 00's after the end
 > 		of article C).  This is a PROT_READ,MAP_SHARED map.
 > 
 > 	* kernel starts writing page #4 to disk  (or kernel starts writing
 > 	  page #4 to disk and then process #2 faults it in for reading).
 > 
 > 	* process 1  write()'s article D to the file while page #4 is 
 > 	  dirty and the I/O is in progress.  Corruption somehow occurs.
 > 
 I doubt this could happen: the buffer containing pages #3 and #4 will be
 marked busy, so write() will block until the I/O is complete. And while the
 kernel is writing page #4, process #2 should not fault when reading it, since
 the page is already in core.
 
 >     Is there anything fishy in this sequence of events that could cause the
 >     corruption?  The corruption I see occurs at least a dozen times a day, 
 >     probably more.  But that is out of 800,000 article appends to spool files
 >     (per day).  Thus, the window of opportunity would be relatively small.
 > 
 >     The machine in question is heavily IO loaded... it has lots of memory
 >     and there isn't much pageout/swap activity, but the memory is being 
 >     exercised very heavily due to the news spool and reading functions. 
 >     There is very heavy read-only mmap'ing of the spool files as well.  I
 >     can well imagine this creating concurrency situations that would not
 >     otherwise occur in other setups.  For example, disk-write latency 
 >     increases severely, creating a larger potential window of opportunity than
 >     if the disks were less heavily loaded.
 > 
 I have written a small test program to simulate what diablo does, and so
 far have not been able to reproduce any corruption. The test was done on an
 otherwise idle machine, so heavy load is definitely a contributing factor here.
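
For readers who want to try something similar, here is a minimal sketch of the
access pattern discussed in this thread: articles are appended with write()
and then read back through a PROT_READ, MAP_SHARED mapping. This is not the
test program mentioned above, and it is not diablo's code; the names
make_article() and run_trial(), the article size, and the article count are
all assumptions made for illustration. It runs in a single process on an
idle file, so it only shows the consistency check itself, not the concurrent
load that seems to be needed to trigger the problem.

```python
import mmap
import os
import tempfile


def make_article(i, size):
    """Build a recognizable payload for article i that contains no 0x00 bytes."""
    header = b"<article-%d>\n" % i
    return header + bytes([65 + i % 26]) * (size - len(header))


def run_trial(path, count=200, size=3000):
    """Append articles with write() and re-read each through a shared read-only map.

    Returns the indices of articles whose mapped bytes differ from what was written.
    """
    bad = []
    wfd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    rfd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        for i in range(count):
            article = make_article(i, size)
            view = memoryview(article)
            while view:                      # handle short writes
                view = view[os.write(wfd, view):]
            offset += len(article)
            # Map the whole file read-only and shared, as the reader side does.
            with mmap.mmap(rfd, offset, flags=mmap.MAP_SHARED,
                           prot=mmap.PROT_READ) as m:
                if m[offset - len(article):offset] != article:
                    bad.append(i)
    finally:
        os.close(wfd)
        os.close(rfd)
    return bad


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(run_trial(os.path.join(d, "spool")))
```

The 3000-byte article size is chosen so that articles regularly straddle
4096-byte page boundaries, as in the illustration quoted earlier. An empty
result means no mismatch was seen. To get closer to the reported conditions,
the writer and the mapping reader would have to run as separate processes on
a machine under heavy I/O load.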
 
 > 						-Matt
 > 
 >     Matthew Dillon  Engineering, HiWay Technologies, Inc. & BEST Internet 
 >                     Communications
 >     <dillon@backplane.com> (Please include original email in any response)    
 > 
 > 
 -lq
