Date:      Tue, 26 Nov 2013 18:41:13 -0500 (EST)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        FreeBSD FS <freebsd-fs@freebsd.org>
Cc:        Kostik Belousov <kib@freebsd.org>
Subject:   RFC: NFS client patch to reduce synchronous writes
Message-ID:  <731168702.21452440.1385509273449.JavaMail.root@uoguelph.ca>
In-Reply-To: <1139579526.21452374.1385509250511.JavaMail.root@uoguelph.ca>

------=_Part_21452438_736821265.1385509273446
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

Hi,

The current NFS client does a synchronous write
to the server when a non-contiguous write to the
same buffer cache block occurs. This is done because
the buf structure records only a single dirty byte
range, so it cannot describe two separate ranges.
This results in a lot of synchronous
writes for software builds (I believe it is the loader
that loves to write small non-contiguous chunks to
its output file). Some users disable synchronous
writing on the server to improve performance, but
this puts them at risk of data loss when the server
crashes.
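
A rough userland sketch of that check, assuming one dirty range
per block. The struct and function names are made up for
illustration; only the comparison follows the b_dirtyoff and
b_dirtyend test visible in the patch context:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the dirty-range fields of struct buf. */
struct dirty_range {
	int off;	/* first dirty byte (like b_dirtyoff) */
	int end;	/* one past the last dirty byte (like b_dirtyend) */
};

/*
 * Return true if writing n bytes at block offset "on" neither overlaps
 * nor touches the existing dirty range, so that a single range cannot
 * describe both and the old data has to be written out first.
 */
static bool
needs_sync_flush(const struct dirty_range *dr, int on, int n)
{
	return (dr->end > 0 && (on > dr->end || on + n < dr->off));
}
```

With a dirty range of [100, 200), a write at offset 200 just
extends the range, while a write at offset 4096 leaves a gap and
triggers the synchronous write.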

Long ago jhb@ emailed me a small patch that avoided
the synchronous writes by simply making the dirty byte
range a superset of the bytes written. The problem
with doing this shows up for a rare (possibly non-existent)
application that writes non-overlapping byte ranges
to the same file from multiple clients concurrently.
Some of those writes might get lost, because stale data
in the superset of the byte range gets written back to
the server and overwrites them.
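
To make the lost write concrete, here is a small userland
simulation. It is only a sketch: the 16 byte block, the struct
and all function names are invented for the example, and the
"server" is just an array.

```c
#include <string.h>

#define BLKSIZE 16

/* Hypothetical client-side cache of one block plus one dirty range. */
struct cached_block {
	char data[BLKSIZE];
	int dirtyoff;	/* first dirty byte */
	int dirtyend;	/* one past the last dirty byte; 0 means clean */
};

/* Write locally and grow the dirty range to the superset of old and new. */
static void
client_write(struct cached_block *cb, int on, const char *src, int n)
{
	memcpy(cb->data + on, src, n);
	if (cb->dirtyend == 0) {
		cb->dirtyoff = on;
		cb->dirtyend = on + n;
	} else {
		if (on < cb->dirtyoff)
			cb->dirtyoff = on;
		if (on + n > cb->dirtyend)
			cb->dirtyend = on + n;
	}
}

/* Write the whole dirty range back to the server's copy of the block. */
static void
client_flush(struct cached_block *cb, char *server)
{
	memcpy(server + cb->dirtyoff, cb->data + cb->dirtyoff,
	    cb->dirtyend - cb->dirtyoff);
	cb->dirtyoff = cb->dirtyend = 0;
}

/*
 * Client A caches the block, client B updates bytes 6..8 on the server,
 * then A writes two separate ranges and flushes. Returns 1 if B's update
 * was overwritten by A's stale cached bytes.
 */
static int
lost_write_demo(void)
{
	char server[BLKSIZE];
	struct cached_block a;

	memset(server, '.', BLKSIZE);
	memcpy(a.data, server, BLKSIZE);	/* A reads the block */
	a.dirtyoff = a.dirtyend = 0;

	memcpy(server + 6, "BBB", 3);		/* B's write reaches the server */

	client_write(&a, 0, "AAA", 3);		/* dirty range [0, 3) */
	client_write(&a, 12, "AAA", 3);		/* grows to [0, 15) */
	client_flush(&a, server);

	return (memcmp(server + 6, "BBB", 3) != 0);
}
```

lost_write_demo() returns 1: the flush covers bytes 3..11 as well,
which A never wrote, so B's bytes are replaced by A's old copy.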

I created a patch that maintained a list of dirty byte
ranges. It was complicated and I found that the list
often had to be > 100 entries to avoid the synchronous
writes.
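
For comparison, a list of dirty ranges might look roughly like the
sketch below. This is not the code from that earlier patch; the
fixed-size array, MAXRANGES and the names are assumptions chosen to
keep the example short.

```c
#define MAXRANGES 4	/* illustrative; the real list needed > 100 */

struct range {
	int off;	/* first dirty byte */
	int end;	/* one past the last dirty byte */
};

struct range_list {
	struct range r[MAXRANGES];
	int cnt;
};

/*
 * Record [on, on + n). Existing ranges that overlap or touch the new
 * one are folded into it. Returns 0 on success, or -1 if the list is
 * full, in which case the list is left unchanged and the caller would
 * have to fall back to a synchronous write.
 */
static int
range_add(struct range_list *rl, int on, int n)
{
	struct range nr = { on, on + n };
	int i, j;

	for (i = j = 0; i < rl->cnt; i++) {
		if (rl->r[i].end < nr.off || rl->r[i].off > nr.end) {
			/* Disjoint: keep it, compacting the array. */
			rl->r[j++] = rl->r[i];
		} else {
			/* Overlaps or touches: absorb it into nr. */
			if (rl->r[i].off < nr.off)
				nr.off = rl->r[i].off;
			if (rl->r[i].end > nr.end)
				nr.end = rl->r[i].end;
		}
	}
	rl->cnt = j;
	if (rl->cnt == MAXRANGES)
		return (-1);
	rl->r[rl->cnt++] = nr;
	return (0);
}
```

Every scattered write that does not touch an existing range uses
up another entry, which is why the list has to be long before it
stops forcing synchronous writes.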

So, I think his solution is preferable, although I've
added a couple of tweaks:
- The synchronous writes (old/current algorithm) are still
  used if there has been file locking done on the file.
  (I think any app. that writes a file from multiple clients
   will/should use file locking.)
- The synchronous writes (old/current algorithm) are used
  if a sysctl is set. This will avoid breakage for any app.
  (if there is one) that writes a file from multiple clients
  without doing file locking.
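
The two tweaks boil down to a decision like the following sketch.
The NHASBEENLOCKED flag and the old_noncontig_writing setting are
named after the attached patch, but the functions and their
parameters are simplified stand-ins, not the kernel code:

```c
#include <stdbool.h>

#define NHASBEENLOCKED	0x00080000	/* "has been file locked" flag */

/*
 * The superset merge is allowed only if the file has never been
 * locked and the sysctl asking for the old behaviour is not set.
 */
static bool
use_superset_merge(unsigned int n_flag, int old_noncontig_writing)
{
	return ((n_flag & NHASBEENLOCKED) == 0 &&
	    old_noncontig_writing == 0);
}

/*
 * Return true if the existing dirty range [dirtyoff, dirtyend) must be
 * written synchronously before n bytes at offset "on" can be recorded.
 */
static bool
must_flush_first(unsigned int n_flag, int old_noncontig_writing,
    int dirtyoff, int dirtyend, int on, int n)
{
	if (use_superset_merge(n_flag, old_noncontig_writing))
		return (false);
	return (dirtyend > 0 && (on > dirtyend || on + n < dirtyoff));
}
```

So a non-contiguous write only forces the synchronous write when
the file has been locked at some point or the sysctl is set.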

For testing on my very slow single core hardware, I see about
a 10% improvement in kernel build times, along with fewer I/O
RPCs:
             Read RPCs  Write RPCs
old/current  50K        122K
patched      39K         40K
--> it reduced the Read RPC count by about 20% and cut the
    Write RPC count to about one third.
I think jhb@ saw pretty good performance results with his patch.

Anyhow, the patch is attached and can also be found here:
  http://people.freebsd.org/~rmacklem/noncontig-write.patch

I'd like to invite folks to comment/review/test this patch,
since I think it is ready for head/current.

Thanks, rick
ps: Kostik, maybe you could look at it. In particular, I am
    wondering if I zeroed out the buffer the correct way, via
    vfs_bio_bzero_buf().

------=_Part_21452438_736821265.1385509273446
Content-Type: text/x-patch; name=noncontig-write.patch
Content-Disposition: attachment; filename=noncontig-write.patch
Content-Transfer-Encoding: base64

LS0tIGZzL25mc2NsaWVudC9uZnNfY2xiaW8uYy5vcmlnCTIwMTMtMDgtMjggMTg6NDU6NDEuMDAw
MDAwMDAwIC0wNDAwCisrKyBmcy9uZnNjbGllbnQvbmZzX2NsYmlvLmMJMjAxMy0xMS0yNSAyMTo0
MjoxNi4wMDAwMDAwMDAgLTA1MDAKQEAgLTcyLDYgKzcyLDEyIEBAIGV4dGVybiBpbnQgbmZzX2tl
ZXBfZGlydHlfb25fZXJyb3I7CiAKIGludCBuY2xfcGJ1Zl9mcmVlY250ID0gLTE7CS8qIHN0YXJ0
IG91dCB1bmxpbWl0ZWQgKi8KIAorU1lTQ1RMX0RFQ0woX3Zmc19uZnMpOworCitzdGF0aWMgaW50
CW5jbF9vbGRub25jb250aWd3cml0aW5nID0gMDsKK1NZU0NUTF9JTlQoX3Zmc19uZnMsIE9JRF9B
VVRPLCBvbGRfbm9uY29udGlnX3dyaXRpbmcsIENUTEZMQUdfUlcsCisJICAgJm5jbF9vbGRub25j
b250aWd3cml0aW5nLCAwLCAiTkZTIHVzZSBvbGQgbm9uY29udGlnIHdyaXRpbmcgYWxnIik7CisK
IHN0YXRpYyBzdHJ1Y3QgYnVmICpuZnNfZ2V0Y2FjaGVibGsoc3RydWN0IHZub2RlICp2cCwgZGFk
ZHJfdCBibiwgaW50IHNpemUsCiAgICAgc3RydWN0IHRocmVhZCAqdGQpOwogc3RhdGljIGludCBu
ZnNfZGlyZWN0aW9fd3JpdGUoc3RydWN0IHZub2RlICp2cCwgc3RydWN0IHVpbyAqdWlvcCwKQEAg
LTg3NCw3ICs4ODAsNyBAQCBuY2xfd3JpdGUoc3RydWN0IHZvcF93cml0ZV9hcmdzICphcCkKIAlz
dHJ1Y3QgdmF0dHIgdmF0dHI7CiAJc3RydWN0IG5mc21vdW50ICpubXAgPSBWRlNUT05GUyh2cC0+
dl9tb3VudCk7CiAJZGFkZHJfdCBsYm47Ci0JaW50IGJjb3VudDsKKwlpbnQgYmNvdW50LCBub25j
b250aWdfd3JpdGUsIG9iY291bnQ7CiAJaW50IGJwX2NhY2hlZCwgbiwgb24sIGVycm9yID0gMCwg
ZXJyb3IxOwogCXNpemVfdCBvcmlnX3Jlc2lkLCBsb2NhbF9yZXNpZDsKIAlvZmZfdCBvcmlnX3Np
emUsIHRtcF9vZmY7CkBAIC0xMDM3LDcgKzEwNDMsMTUgQEAgYWdhaW46CiAJCSAqIHVuYWxpZ25l
ZCBidWZmZXIgc2l6ZS4KIAkJICovCiAJCW10eF9sb2NrKCZucC0+bl9tdHgpOwotCQlpZiAodWlv
LT51aW9fb2Zmc2V0ID09IG5wLT5uX3NpemUgJiYgbikgeworCQlpZiAoKG5wLT5uX2ZsYWcgJiBO
SEFTQkVFTkxPQ0tFRCkgPT0gMCAmJgorCQkgICAgbmNsX29sZG5vbmNvbnRpZ3dyaXRpbmcgPT0g
MCkKKwkJCW5vbmNvbnRpZ193cml0ZSA9IDE7CisJCWVsc2UKKwkJCW5vbmNvbnRpZ193cml0ZSA9
IDA7CisJCWlmICgodWlvLT51aW9fb2Zmc2V0ID09IG5wLT5uX3NpemUgfHwKKwkJICAgIChub25j
b250aWdfd3JpdGUgIT0gMCAmJgorCQkgICAgbGJuID09IChucC0+bl9zaXplIC8gYmlvc2l6ZSkg
JiYKKwkJICAgIHVpby0+dWlvX29mZnNldCArIG4gPiBucC0+bl9zaXplKSkgJiYgbikgewogCQkJ
bXR4X3VubG9jaygmbnAtPm5fbXR4KTsKIAkJCS8qCiAJCQkgKiBHZXQgdGhlIGJ1ZmZlciAoaW4g
aXRzIHByZS1hcHBlbmQgc3RhdGUgdG8gbWFpbnRhaW4KQEAgLTEwNDUsOCArMTA1OSw4IEBAIGFn
YWluOgogCQkJICogbmZzbm9kZSBhZnRlciB3ZSBoYXZlIGxvY2tlZCB0aGUgYnVmZmVyIHRvIHBy
ZXZlbnQKIAkJCSAqIHJlYWRlcnMgZnJvbSByZWFkaW5nIGdhcmJhZ2UuCiAJCQkgKi8KLQkJCWJj
b3VudCA9IG9uOwotCQkJYnAgPSBuZnNfZ2V0Y2FjaGVibGsodnAsIGxibiwgYmNvdW50LCB0ZCk7
CisJCQlvYmNvdW50ID0gbnAtPm5fc2l6ZSAtIChsYm4gKiBiaW9zaXplKTsKKwkJCWJwID0gbmZz
X2dldGNhY2hlYmxrKHZwLCBsYm4sIG9iY291bnQsIHRkKTsKIAogCQkJaWYgKGJwICE9IE5VTEwp
IHsKIAkJCQlsb25nIHNhdmU7CkBAIC0xMDU4LDkgKzEwNzIsMTIgQEAgYWdhaW46CiAJCQkJbXR4
X3VubG9jaygmbnAtPm5fbXR4KTsKIAogCQkJCXNhdmUgPSBicC0+Yl9mbGFncyAmIEJfQ0FDSEU7
Ci0JCQkJYmNvdW50ICs9IG47CisJCQkJYmNvdW50ID0gb24gKyBuOwogCQkJCWFsbG9jYnVmKGJw
LCBiY291bnQpOwogCQkJCWJwLT5iX2ZsYWdzIHw9IHNhdmU7CisJCQkJaWYgKG5vbmNvbnRpZ193
cml0ZSAhPSAwICYmIGJjb3VudCA+IG9iY291bnQpCisJCQkJCXZmc19iaW9fYnplcm9fYnVmKGJw
LCBvYmNvdW50LCBiY291bnQgLQorCQkJCQkgICAgb2Jjb3VudCk7CiAJCQl9CiAJCX0gZWxzZSB7
CiAJCQkvKgpAQCAtMTE1OSwxOSArMTE3NiwyMyBAQCBhZ2FpbjoKIAkJICogYXJlYSwganVzdCB1
cGRhdGUgdGhlIGJfZGlydHlvZmYgYW5kIGJfZGlydHllbmQsCiAJCSAqIG90aGVyd2lzZSBmb3Jj
ZSBhIHdyaXRlIHJwYyBvZiB0aGUgb2xkIGRpcnR5IGFyZWEuCiAJCSAqCisJCSAqIElmIHRoZXJl
IGhhcyBiZWVuIGEgZmlsZSBsb2NrIGFwcGxpZWQgdG8gdGhpcyBmaWxlCisJCSAqIG9yIHZmcy5u
ZnMub2xkX25vbmNvbnRpZ193cml0aW5nIGlzIHNldCwgZG8gdGhlIGZvbGxvd2luZzoKIAkJICog
V2hpbGUgaXQgaXMgcG9zc2libGUgdG8gbWVyZ2UgZGlzY29udGlndW91cyB3cml0ZXMgZHVlIHRv
CiAJCSAqIG91ciBoYXZpbmcgYSBCX0NBQ0hFIGJ1ZmZlciAoIGFuZCB0aHVzIHZhbGlkIHJlYWQg
ZGF0YQogCQkgKiBmb3IgdGhlIGhvbGUpLCB3ZSBkb24ndCBiZWNhdXNlIGl0IGNvdWxkIGxlYWQg
dG8KIAkJICogc2lnbmlmaWNhbnQgY2FjaGUgY29oZXJlbmN5IHByb2JsZW1zIHdpdGggbXVsdGlw
bGUgY2xpZW50cywKIAkJICogZXNwZWNpYWxseSBpZiBsb2NraW5nIGlzIGltcGxlbWVudGVkIGxh
dGVyIG9uLgogCQkgKgotCQkgKiBBcyBhbiBvcHRpbWl6YXRpb24gd2UgY291bGQgdGhlb3JldGlj
YWxseSBtYWludGFpbgotCQkgKiBhIGxpbmtlZCBsaXN0IG9mIGRpc2NvbnRpbnVvdXMgYXJlYXMs
IGJ1dCB3ZSB3b3VsZCBzdGlsbAotCQkgKiBoYXZlIHRvIGNvbW1pdCB0aGVtIHNlcGFyYXRlbHkg
c28gdGhlcmUgaXNuJ3QgbXVjaAotCQkgKiBhZHZhbnRhZ2UgdG8gaXQgZXhjZXB0IHBlcmhhcHMg
YSBiaXQgb2YgYXN5bmNocm9uaXphdGlvbi4KKwkJICogSWYgdmZzLm5mcy5vbGRfbm9uY29udGln
X3dyaXRpbmcgaXMgbm90IHNldCBhbmQgdGhlcmUgaGFzCisJCSAqIG5vdCBiZWVuIGZpbGUgbG9j
a2luZyBkb25lIG9uIHRoaXMgZmlsZToKKwkJICogUmVsYXggY29oZXJlbmN5IGEgYml0IGZvciB0
aGUgc2FrZSBvZiBwZXJmb3JtYW5jZSBhbmQKKwkJICogZXhwYW5kIHRoZSBjdXJyZW50IGRpcnR5
IHJlZ2lvbiB0byBjb250YWluIHRoZSBuZXcKKwkJICogd3JpdGUgZXZlbiBpZiBpdCBtZWFucyB3
ZSBtYXJrIHNvbWUgbm9uLWRpcnR5IGRhdGEgYXMKKwkJICogZGlydHkuCiAJCSAqLwogCi0JCWlm
IChicC0+Yl9kaXJ0eWVuZCA+IDAgJiYKKwkJaWYgKG5vbmNvbnRpZ193cml0ZSA9PSAwICYmIGJw
LT5iX2RpcnR5ZW5kID4gMCAmJgogCQkgICAgKG9uID4gYnAtPmJfZGlydHllbmQgfHwgKG9uICsg
bikgPCBicC0+Yl9kaXJ0eW9mZikpIHsKIAkJCWlmIChid3JpdGUoYnApID09IEVJTlRSKSB7CiAJ
CQkJZXJyb3IgPSBFSU5UUjsKLS0tIGZzL25mc2NsaWVudC9uZnNub2RlLmgub3JpZwkyMDEzLTEx
LTE5IDE4OjE3OjM3LjAwMDAwMDAwMCAtMDUwMAorKysgZnMvbmZzY2xpZW50L25mc25vZGUuaAky
MDEzLTExLTI1IDIxOjI5OjU4LjAwMDAwMDAwMCAtMDUwMApAQCAtMTU3LDYgKzE1Nyw3IEBAIHN0
cnVjdCBuZnNub2RlIHsKICNkZWZpbmUJTkxPQ0tXQU5UCTB4MDAwMTAwMDAgIC8qIFdhbnQgdGhl
IHNsZWVwIGxvY2sgKi8KICNkZWZpbmUJTk5PTEFZT1VUCTB4MDAwMjAwMDAgIC8qIENhbid0IGdl
dCBhIGxheW91dCBmb3IgdGhpcyBmaWxlICovCiAjZGVmaW5lCU5XUklURU9QRU5FRAkweDAwMDQw
MDAwICAvKiBIYXMgYmVlbiBvcGVuZWQgZm9yIHdyaXRpbmcgKi8KKyNkZWZpbmUJTkhBU0JFRU5M
T0NLRUQJMHgwMDA4MDAwMCAgLyogSGFzIGJlZW4gZmlsZSBsb2NrZWQuICovCiAKIC8qCiAgKiBD
b252ZXJ0IGJldHdlZW4gbmZzbm9kZSBwb2ludGVycyBhbmQgdm5vZGUgcG9pbnRlcnMKLS0tIGZz
L25mc2NsaWVudC9uZnNfY2x2bm9wcy5jLm9yaWcJMjAxMy0xMS0xOSAxODoxOTo0Mi4wMDAwMDAw
MDAgLTA1MDAKKysrIGZzL25mc2NsaWVudC9uZnNfY2x2bm9wcy5jCTIwMTMtMTEtMjUgMjE6MzI6
NDcuMDAwMDAwMDAwIC0wNTAwCkBAIC0zMDc5LDYgKzMwNzksMTAgQEAgbmZzX2FkdmxvY2soc3Ry
dWN0IHZvcF9hZHZsb2NrX2FyZ3MgKmFwKQogCQkJCQlucC0+bl9jaGFuZ2UgPSB2YS52YV9maWxl
cmV2OwogCQkJCX0KIAkJCX0KKwkJCS8qIE1hcmsgdGhhdCBhIGZpbGUgbG9jayBoYXMgYmVlbiBh
Y3F1aXJlZC4gKi8KKwkJCW10eF9sb2NrKCZucC0+bl9tdHgpOworCQkJbnAtPm5fZmxhZyB8PSBO
SEFTQkVFTkxPQ0tFRDsKKwkJCW10eF91bmxvY2soJm5wLT5uX210eCk7CiAJCX0KIAkJTkZTVk9Q
VU5MT0NLKHZwLCAwKTsKIAkJcmV0dXJuICgwKTsKQEAgLTMwOTgsNiArMzEwMiwxMiBAQCBuZnNf
YWR2bG9jayhzdHJ1Y3Qgdm9wX2FkdmxvY2tfYXJncyAqYXApCiAJCQkJZXJyb3IgPSBFTk9MQ0s7
CiAJCQl9CiAJCX0KKwkJaWYgKGVycm9yID09IDAgJiYgYXAtPmFfb3AgPT0gRl9TRVRMSykgewor
CQkJLyogTWFyayB0aGF0IGEgZmlsZSBsb2NrIGhhcyBiZWVuIGFjcXVpcmVkLiAqLworCQkJbXR4
X2xvY2soJm5wLT5uX210eCk7CisJCQlucC0+bl9mbGFnIHw9IE5IQVNCRUVOTE9DS0VEOworCQkJ
bXR4X3VubG9jaygmbnAtPm5fbXR4KTsKKwkJfQogCX0KIAlyZXR1cm4gKGVycm9yKTsKIH0K
------=_Part_21452438_736821265.1385509273446--
