Date: Fri, 14 Aug 1998 12:10:01 -0700 (PDT)
Message-Id: <199808141910.MAA27836@freefall.freebsd.org>
To: freebsd-bugs@FreeBSD.ORG
From: Luoqi Chen
Subject: Re: kern/7418 (file corruption on mmap-based-read during file write())

The following reply was made to PR kern/7418; it has been noted by GNATS.

From: Luoqi Chen
To: dillon@backplane.com, luoqi@chen.ml.org
Cc: freebsd-gnats-submit@freebsd.org, luoqi@watermarkgroup.com
Subject: Re: kern/7418 (file corruption on mmap-based-read during file write())
Date: Fri, 14 Aug 1998 15:02:26 -0400 (EDT)

>
>     I've noticed something else in regards to the corruption which might
>     throw more light on the problem.
>
>                                           [one page]
>
>     A,B,C,D ... discrete usenet articles stored in file.
>
>     [AAAAAAA][AAABBBBBB][BBBBBBBB][BBBBCCCCCCdddddd][DDDDDD...]
>     page #1      #2        #3            #4            #5
>
>     The lower case 'd's indicate corruption... that is, areas of the file
>     that were corrupted to 0x00.  The interesting item is that not only does
>     the corruption end at a page boundary, it *begins* at the beginning of
>     an article.  That is, article 'C' does not get corrupted at all, nor
>     is there a piece of the beginning of D that is not corrupted... the
>     corruption begins at the beginning of article D and ends at the page
>     boundary (article D continues past the page boundary.
>     The portion after the page boundary is not corrupted).
>

I have an interesting observation from the spool file dump in the
PR 7418 report: article 'C' (<6pip3s$f5b$1@nnrp1.dejanews.com>) was
completely lost. So it is not just the article that crosses the page
boundary that is affected; corruption can also hit articles that precede
it on the same page. I also notice that both in PR 7418 and in your
illustration above, the corruption takes place on even-numbered pages,
i.e., the 2nd page of the two-page FFS blocks. Is this always the case?
Are you aware of any other commonalities among the corruption cases you
have seen?

>     So what about this possibility (this is only a possibility, not an actual
>     trace):
>
>         * process 1 write()'s article B to the file
>
>         * process 1 write()'s article C to the file
>
>         * process 2 mmap's and faults pages associated with C (i.e. page #4)
>           (at this point, 'D' has not been written yet and no corruption
>           has yet occurred, page #4 properly contains 00's after the end
>           of article C).  This is a PROT_READ,MAP_SHARED map.
>
>         * kernel starts writing page #4 to disk (or kernel starts writing
>           page #4 to disk and then process #2 faults it in for reading).
>
>         * process 1 write()'s article D to the file while page #4 is
>           dirty and the I/O is in progress.  Corruption somehow occurs.
>

I doubt this could happen: the buffer containing pages #3 and #4 will be
marked busy, so write() will block until the I/O is complete. And while
the kernel is writing page #4, process #2 shouldn't fault when reading it,
since the page is already in core.

>     Is there anything fishy in this sequence of events that could cause the
>     corruption?  The corruption I see occurs at least a dozen times a day,
>     probably more.  But that is out of 800,000 article appends to spool files
>     (per day).  Thus, the window of opportunity would be relatively small.
>
>     The machine in question is heavily IO loaded...
>     it has lots of memory
>     and there isn't much pageout/swap activity, but the memory is being
>     exercised very heavily due to the news spool and reading functions.
>     There is very heavy read-only mmap'ing of the spool files as well.  I
>     can well imagine this creating concurrency situations that would not
>     otherwise occur in other setups.  For example, disk-write latency
>     increases severely, creating a larger potential window of opportunity than
>     if the disks were less heavily loaded.
>

I have written a small test program to simulate what diablo does, and so
far I have not been able to get any corruption. The test was done on an
otherwise idle machine, so heavy load is definitely a contributing factor
here.

>                                               -Matt
>
>     Matthew Dillon   Engineering, HiWay Technologies, Inc. & BEST Internet
>                      Communications
>     (Please include original email in any response)
>

-lq
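
For readers who want to try a check of the kind described in this message (appending articles with write() and reading them back through a PROT_READ, MAP_SHARED mapping), a minimal sketch follows. It is not the test program mentioned above, and it is not diablo's code. The function names, the article size, and the single-process interleaving are all assumptions made for illustration. Because every synthetic article consists of non-zero bytes, any run of 0x00 seen through the mapping would indicate the kind of corruption discussed in the PR. Page numbers are reported 1-based to match the illustration in the quoted text.

```python
import mmap
import os
import tempfile


def append_article(fd, index, length):
    """Append one synthetic article of `length` non-zero bytes via write()."""
    payload = bytes([ord("A") + index % 26]) * length
    offset = os.lseek(fd, 0, os.SEEK_END)
    view = memoryview(payload)
    while view:  # os.write() may write fewer bytes than requested
        view = view[os.write(fd, view):]
    return offset


def zero_runs(buf):
    """Return (start, end) byte ranges of buf that contain only 0x00."""
    runs, pos, size = [], 0, len(buf)
    while True:
        start = buf.find(b"\x00", pos)
        if start < 0:
            return runs
        end = start
        while end < size and buf[end] == 0:
            end += 1
        runs.append((start, end))
        pos = end


def run_trial(path, articles=200, length=3000):
    """Interleave appends with read-only shared mappings (hypothetical test).

    Returns a list of (page_number, start, end) for every zero-filled run
    seen through the mapping; page_number is 1-based.
    """
    found = []
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for i in range(articles):
            append_article(fd, i, length)
            size = os.fstat(fd).st_size
            with mmap.mmap(fd, size, flags=mmap.MAP_SHARED,
                           prot=mmap.PROT_READ) as m:
                for start, end in zero_runs(m):
                    found.append((start // mmap.PAGESIZE + 1, start, end))
    finally:
        os.close(fd)
    return found


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(run_trial(os.path.join(tmp, "spool")))
```

A single process on an idle machine is unlikely to hit the window described in the PR. Reproducing it would presumably need separate writer and reader processes and heavy disk load, which this sketch does not attempt.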