Date:      Sun, 16 Dec 2001 12:24:28 -0800 (PST)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Conrad Minshall <conrad@apple.com>
Cc:        Jordan Hubbard <jkh@winston.freebsd.org>, Peter Wemm <peter@wemm.org>, Mike Smith <msmith@FreeBSD.ORG>, hackers@FreeBSD.ORG, msmith@mass.dis.org
Subject:   Re: Found NFS data corruption bug... (was Re: NFS: How to make FreeBSD   fall on its face in one easy step )
Message-ID:  <200112162024.fBGKOSt22277@apollo.backplane.com>
References:  <58885.1008217148@winston.freebsd.org> <l03130304b8413ae1338b@[17.219.180.26]> <l03130302b841fdae1ebc@[17.219.180.26]>

:Two things I've done to speed it up are to restrict the size of transfers
:(use the -o flag) and eliminate all the size checks (use the -n flag).
:
:Why would MFS be much faster than UFS?  On the server doesn't the whole
:file end up cached?  ...and the metadata changes likewise via softupdates.

    With NFSv3 it's semi-asynchronous (2-phase commit), but you still must
    eventually commit the data.  Also, ftruncate() on the FreeBSD server
    side creates a special case that requires all data to be flushed out
    to its physical media.  Kirk had so many problems trying to integrate
    ftruncate() into softupdates that he finally gave up and threw the 
    FSYNC in.

    I'm guessing that the FSYNC in ftruncate() on the server side is mostly
    responsible for the slowness.  The NFSv3 two-phase commits are reasonably
    decoupled for smallish (< 8MB) file sizes but we definitely have some
    sequencing problems with larger file sizes... I should be able to 
    saturate a 100BaseT network link doing a linear NFS write and the best I
    can get is 2-3 MBytes/sec.  I'm analyzing the problem but may not be able
    to get a patch in by the 4.5 release.
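
    For scale: 100BaseT is 100 Mbit/s, which is about 12.5 MBytes/sec
    before protocol overhead, so 2-3 MBytes/sec is roughly a fifth of the
    wire.  A rough way to measure a linear write is sketched below.  The
    function name, block size, and the idea of pointing it at a file on an
    NFS mount are illustrative assumptions, not the test actually used
    here.

```python
import os
import time

# Theoretical ceiling of a 100 Mbit/s link in MBytes/sec (no protocol overhead).
WIRE_LIMIT_MB_S = 100e6 / 8 / 1e6

def measure_linear_write(path, total_bytes, block_size=8192):
    """Write total_bytes sequentially to path, fsync, and return MBytes/sec.

    Hypothetical helper: path would normally name a file on an NFS mount.
    The fsync is included in the timing so cached but uncommitted data
    does not inflate the result.
    """
    block = b"\0" * block_size
    written = 0
    start = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        while written < total_bytes:
            # Trim the last block so exactly total_bytes are written.
            written += os.write(fd, block[: total_bytes - written])
        os.fsync(fd)
    finally:
        os.close(fd)
    elapsed = max(time.perf_counter() - start, 1e-9)
    return written / elapsed / 1e6
```

    Comparing the returned rate against WIRE_LIMIT_MB_S for small and large
    total_bytes values is one way to see where throughput falls off.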

:Running fsx during resource shortages (low memory or buf structs) has
:exposed a bug or two.
:
:Running it with operations 512 or page aligned also exposed bugs - see
:usage for  those flags.

    Yes, those are mainly the ones I found.  About 7 bug fixes are going
    to go into the next FreeBSD release related to mmap & DEV_BSIZE NFS
    interactions.  I'll set a MAXMEM limit and run the test in a low-memory
    environment as well.  So far with a fairly normal box + bug fixes the
    program runs fine in an overnight test.  We still have a known issue 
    with out-of-order operations from nfsiod's that apparently may come
    up after a week or so of testing.  I asked Jordan to try to track down
    the NeXT guy who fixed that one in the old NFS stack.  We also have
    known problems when multiple clients compete for the same file.

:Note that if you get a failure at operation 50 million there is an fsx flag
:which allows you to restart at, say, 49 million.  Of course some failures
:don't reproduce reliably at the same spot anyway.
:
:I gave out fsx source code at the recent CIFS (SMB) plugfest.  If I make
:the 2002 Connectathon I'll give it out there too.  I don't test it on
:Windows so those defines may be in need of repair.  Please send me any
:patches or cool additions.
:
:--
:Conrad Minshall, conrad@apple.com, 408 974-2749

    I've already used the restart option to good effect!  Awesome option!
    When I first ran the program it couldn't get through 10,000 operations
    without failing.  As I started fixing things in FreeBSD the number of
    operations increased.  When it got to around 30,000 I started using the
    option to reproduce the bug at the stuck-point more quickly and most
    of the time it worked.
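
    The restart idea can be sketched roughly as follows.  This is not
    fsx's code: run_ops, start_at, and the choice of operations are
    hypothetical, and it assumes a Unix system (os.pread/os.pwrite).  A
    seeded generator makes the operation sequence repeatable, operations
    before start_at are applied only to an in-memory model, and the model
    is written out once so that real file I/O begins near the failure
    point.

```python
import os
import random

def run_ops(path, seed, num_ops, start_at=0, max_size=65536):
    """Run a seeded write/truncate/read sequence and check it against a model.

    Returns the number of the first operation whose read disagrees with the
    model, or None if every check passes.  Illustrative sketch only.
    """
    rng = random.Random(seed)
    model = bytearray()          # what the file is expected to contain
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for opnum in range(num_ops):
            if opnum == start_at and start_at > 0:
                # Materialize the simulated state before real I/O begins.
                os.ftruncate(fd, 0)
                os.pwrite(fd, bytes(model), 0)
            real = opnum >= start_at
            # Always draw the same random values so the sequence is stable.
            op = rng.choice(("write", "truncate", "read"))
            offset = rng.randrange(max_size)
            length = rng.randrange(1, max_size - offset + 1)
            if op == "write":
                data = bytes([opnum % 255 + 1]) * length
                if offset > len(model):
                    model.extend(b"\0" * (offset - len(model)))  # hole
                model[offset:offset + length] = data
                if real:
                    os.pwrite(fd, data, offset)
            elif op == "truncate":
                if offset < len(model):
                    del model[offset:]
                else:
                    model.extend(b"\0" * (offset - len(model)))
                if real:
                    os.ftruncate(fd, offset)
            elif real:
                expected = bytes(model[offset:offset + length])
                if os.pread(fd, length, offset) != expected:
                    return opnum
        if num_ops > start_at:
            # Final whole-file comparison against the model.
            if os.pread(fd, len(model) + 1, 0) != bytes(model):
                return num_ops
        return None
    finally:
        os.close(fd)
```

    With the same seed, run_ops(path, 1, 50000000, start_at=49000000)
    replays the same operations as a full run but skips real I/O for the
    first 49 million.  As noted above, timing-dependent failures may
    still not reproduce this way.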

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message



