Date:      Wed, 11 May 2016 17:10:52 -0400
From:      Allan Jude <allanjude@freebsd.org>
To:        freebsd-cloud@freebsd.org
Subject:   Re: Help with a performance issue
Message-ID:  <57339FDC.4090608@freebsd.org>
In-Reply-To: <CA+in=xDhse7V3x+JoXJzueucznCAckMGPRXiNptVbF7DfgmX_Q@mail.gmail.com>
References:  <CA+in=xCqiOruEPFLut0s88XQbbsUOsZ+a0DC8acBurcdG2_8uQ@mail.gmail.com> <CA+in=xDhse7V3x+JoXJzueucznCAckMGPRXiNptVbF7DfgmX_Q@mail.gmail.com>

On 2016-05-11 16:22, Ben Howard via freebsd-cloud wrote:
> The short story is that we are seeing profound differences between UFS and
> ZFS in terms of network to disk writes.
> 
> # FreeBSD with ZFS
> $ fetch http://speedtest.bahnhof.se/1000M.zip
> 1000M.zip 27% of 1000 MB 11 MBps 01m00s
> 
> # Stock FreeBSD in NYC3 with UFS:
> $ fetch http://speedtest.bahnhof.se/1000M.zip
> 1000M.zip 1% of 1000 MB 529 kBps 32m12s
> 
> The test case is flaky, unfortunately.  Switching the backing filesystem,
> IMHO, should not elicit such a massive difference.
> 
> Any ideas?
> 
> Thanks,
> Ben

Part of this will be the way that ZFS works. For an asynchronous write
like downloading a file, ZFS buffers the writes in RAM and flushes them
to disk as one contiguous block every 5 seconds (unless the memory
buffer fills up, or some other operation forces a sync).
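
You can watch this behaviour directly. On a stock FreeBSD system with
ZFS loaded, something like:

# sysctl vfs.zfs.txg.timeout
vfs.zfs.txg.timeout: 5
# zpool iostat zroot 1

should show writes hitting the disk in bursts every few seconds rather
than as a steady stream ('zroot' is just a guess at the pool name,
substitute your own).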

Using a slightly more predictable test:
# dd if=/dev/zero of=zerofile bs=1m count=1k
1024+0 records in
1024+0 records out
1073741824 bytes transferred in 97.438103 secs (11019732 bytes/sec)

I get about the same performance on UFS as you were getting on ZFS.

However, running 'gstat' (GEOM stat; GEOM is the FreeBSD storage
abstraction layer) shows very high write latency, with some operations
taking as much as 1000ms to complete.
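
For reference, I was watching with something like 'gstat -p -I 1s'
(-p restricts the display to physical providers, -I sets the refresh
interval):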

 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
   11      7      0      0    0.0      7    884   1041  107.1| vtbd0

During the write I saw speeds as high as 25MB/s, but often much less
than that.

There also seemed to be stalls where nothing happened for a second.

Interestingly, I actually saw steadier performance when UFS was mounted
with the 'sync' option, which I would have expected to decrease
performance a great deal.
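
(To try this on a live filesystem, something like 'mount -u -o sync /'
toggles it in place, assuming / is the UFS mount under test.)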

The average latency went from 50-100ms to under 2ms:

 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0    587      0      0    0.0    587  18719    1.8   99.6| vtbd0

    1   1551      1     32    0.2   1550  49564    0.5   74.5| vtbd0

When I did the test again...
# dd if=/dev/zero of=zerofile3 bs=1m count=1k
1024+0 records in
1024+0 records out
1073741824 bytes transferred in 45.618443 secs (23537450 bytes/sec)

It was faster.

However, when mounted with 'sync', the volume of writes measured in
gstat is twice as high as the amount generated by userland.

Using mbuffer to rate-limit writes to 10 MB/s, a 'sync'-mounted / showed
20 MB/s of writes to the drive:

dd if=/dev/zero bs=1m count=1k | mbuffer -o zerofile -s 128k -b 64 -R 10m
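
(-o is the output file, -s the block size, -b the number of buffer
blocks, and -R the rate cap, so mbuffer feeds the filesystem a steady
10 MB/s here.)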

in @ 10.0 MiB/s, out @ 10.0 MiB/s,  243 MiB total, buffer 100% full
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0    722      0      0    0.0    722  23061    0.6   41.8| vtbd0


Switching back to 'noasync' (the default), the speed is suddenly very
fast.
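(To flip it back in place, something like 'mount -u -o noasync /' does
it without a fresh newfs.)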

in @  123 MiB/s, out @  123 MiB/s, 3446 MiB total, buffer  98% full
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    3   1049      0      0    0.0   1049 133702    5.4   95.9| vtbd0


Very strange. I'll do some more tests with a fresh droplet; this one was
an existing droplet.

I wonder if it has to do with lazy allocation? I know ZFS developers
have seen this issue on Amazon, where if they fill the drive with zeros
before its first use, performance is much better afterwards.
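
(The brute-force way to test that theory, on a disk you do not mind
destroying, would be something like:

# dd if=/dev/zero of=/dev/vtbd1 bs=1m

which forces every block on the device to be touched once. vtbd1 is
just an example device name here, and this wipes it completely, so only
run it against a throwaway disk.)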


-- 
Allan Jude


