Date:      Thu, 13 Aug 1998 06:06:43 -0700
From:      Mika Nystroem <mika@cs.caltech.edu>
To:        Peter Hawkins <thepish@FreeBSD.ORG>
Cc:        mika@varese.cs.caltech.edu, dillon@best.net, freebsd-bugs@FreeBSD.ORG
Subject:   Re: kern/7596: serious data integrity problem when reading WHILE writing NFSv3 client-end 
Message-ID:  <199808131306.GAA20197@varese.cs.caltech.edu>
In-Reply-To: Your message of "Thu, 13 Aug 1998 20:58:48 +1000." <Pine.BSF.3.96.980813205820.476B-100000@dana.clari.net.au> 

Peter Hawkins writes:
>See also PR 7418 - the plot thickens...
>

Hmm, this is very curious.  You don't think the page boundary 
business could have to do with stdio buffering (or some other
mechanism for delaying writes at the page boundaries)?

I've managed to whittle down my test case to the following:

(writer)

#include <math.h>
#include <stdlib.h>
#include <stdio.h>

int main(void)
{
   FILE *fp;
   int i=0;
   float d=3.01111;

   fp=fopen("x","wb");
   if (fp==NULL) { perror("x"); return 1; }
   fwrite(&d,sizeof(float),1,fp); 
   while (1) {
      int j;
      int stop;
      /* write floats until i reaches a multiple of 14 */
      while(i++%14) { fwrite(&d,sizeof(float),1,fp); }
      
      /* delay a bit */
      stop=random()%20000;
      if (random()%33) fflush(fp);   /* flush about 32 times in 33 */
      for (j=0; j<stop; j++) { volatile double e; e=10.1; e=sin(e); }
   }
}


(reader)

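#include <stdio.h>

#define LEN sizeof(float)

int main(void)
{
   FILE *fp;
   char data[LEN];

   fp=fopen("x","rb");
   if (fp==NULL) { perror("x"); return 1; }

   /* read one float-sized record at a time until fread() returns 0 */
   while (fread(data,LEN,1,fp))
      ;
   fclose(fp);
   return 0;
}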


The random stuff is actually unnecessary, but it shows that the
problem is not page-boundary-related.  The sine business is just
a delay loop, and the delay itself is necessary.  If the code looks
a bit weird, it's because it's "emulating" the I/O behavior of a
version of SPICE that exhibits the same problem.  Here's what I do:
run the writer concurrently with a shell script that keeps re-running
the reader (well, you could easily add this to the program, too!).
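
For anyone who would rather not use a shell script, the re-running
can be folded into C.  What follows is only a sketch, not the code
that produced the results in this message: read_pass and
rerun_reader are hypothetical helper names, and whether reopening
the file inside one process triggers the problem as readily as
starting a fresh reader process each time is an assumption that has
not been tested here.

```c
#include <stdio.h>

#define LEN sizeof(float)

/* One pass over the file, as in the reader above: open, read
   float-sized records until fread() returns 0, close.
   Returns the number of records read, or -1 if the open fails.
   (read_pass is a hypothetical helper, not part of the original test.) */
long read_pass(const char *path)
{
   FILE *fp;
   char data[LEN];
   long n=0;

   fp=fopen(path,"rb");
   if (fp==NULL) return -1;
   while (fread(data,LEN,1,fp)) n++;
   fclose(fp);
   return n;
}

/* Stand-in for the shell loop: re-read the file `passes` times.
   Returns the number of passes that opened the file successfully.
   (rerun_reader is likewise a hypothetical name.) */
int rerun_reader(const char *path, int passes)
{
   int i, ok=0;

   for (i=0; i<passes; i++)
      if (read_pass(path)>=0) ok++;
   return ok;
}
```

Calling rerun_reader("x", 1000) from a main() while the writer is
running would play the role of the shell loop; the pass count is
arbitrary.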

Here's a typical od of the file "x":

0000000  133007 040100 133007 040100 133007 040100 133007 040100
*
0000560  000000 000000 000000 000000 000000 000000 000000 000000
*
0001000  000000 000000 000000 000000 000000 000000 133007 040100
0001020  133007 040100 133007 040100 133007 040100 133007 040100
*
0001560  133007 040100 133007 040100 000000 000000 000000 000000
0001600  000000 000000 000000 000000 000000 000000 000000 000000
*
0001740  133007 040100 133007 040100 133007 040100 133007 040100
*
0002340  133007 040100 000000 000000 000000 000000 000000 000000
0002360  000000 000000 000000 000000 000000 000000 000000 000000
*
0002500  000000 000000 000000 000000 000000 000000 133007 040100
0002520  133007 040100 133007 040100 133007 040100 133007 040100
*
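
A note on reading the dump: od prints 16-bit words in octal, so
(assuming it was run on a little-endian machine) each pair
133007 040100 is the 32-bit pattern 0x4040b607, i.e. the float
3.01111 that the writer stores.  The 000000 stretches are records
that show up as zeros.  Rather than eyeballing od, a small scanner
can list the zero runs.  This is a sketch that assumes 32-bit IEEE
754 floats and a 32-bit unsigned int; float_bits and count_zero_runs
are hypothetical helper names, not part of the original test.

```c
#include <stdio.h>
#include <string.h>

/* Raw 32-bit pattern of a float (assumes a 32-bit IEEE 754 float
   and a 32-bit unsigned int). */
unsigned int float_bits(float f)
{
   unsigned int u;
   memcpy(&u,&f,sizeof u);
   return u;
}

/* Scan n float records and print each run of all-zero records with
   its byte offsets in octal, to line up with the od listing.
   Returns the number of zero runs found. */
int count_zero_runs(const float *rec, size_t n)
{
   size_t i, start=0;
   int runs=0, in_run=0;

   for (i=0; i<n; i++) {
      int zero = (float_bits(rec[i])==0);
      if (zero && !in_run) { in_run=1; start=i; runs++; }
      if (!zero && in_run) {
         in_run=0;
         printf("zeros at %07lo..%07lo\n",
                (unsigned long)(start*sizeof(float)),
                (unsigned long)(i*sizeof(float)-1));
      }
   }
   if (in_run)   /* file ends inside a zero run */
      printf("zeros at %07lo..%07lo\n",
             (unsigned long)(start*sizeof(float)),
             (unsigned long)(n*sizeof(float)-1));
   return runs;
}
```

To use it, fread() the contents of "x" into an array of float and
pass the array and its record count to count_zero_runs.  A file
written correctly by the writer should report no runs at all, since
every record is the same nonzero float.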

Sure, I normally run these applications against a FreeBSD NFS server
with a four-way CCD, but this particular case was on a Slowaris
2.5 (sparcstation 1) machine with perfectly standard UFS.  I haven't
been able to exhibit it on a local FFS disk.  I also haven't been
able to exhibit it with a NetBSD NFS client, which is a bit odd
because all the code that my untrained eye found in /sys/nfs that
looked suspicious was the same on NetBSD and FreeBSD :)

  Mika
  <mika@cs.caltech.edu>

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message


