Date: Thu, 28 Nov 2013 22:00:36 -0800
From: Kirk McKusick <mckusick@mckusick.com>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: FreeBSD FS <freebsd-fs@freebsd.org>
Subject: Re: RFC: NFS client patch to reduce sychronous writes
Message-ID: <201311290600.rAT60aff046648@chez.mckusick.com>
In-Reply-To: <20131128071821.GH59496@kib.kiev.ua>
> Date: Thu, 28 Nov 2013 09:18:21 +0200
> From: Konstantin Belousov <kostikbel@gmail.com>
> To: Kirk McKusick <mckusick@mckusick.com>
> Cc: Rick Macklem <rmacklem@uoguelph.ca>, FreeBSD FS <freebsd-fs@freebsd.org>
> Subject: Re: RFC: NFS client patch to reduce sychronous writes
>
> On Wed, Nov 27, 2013 at 03:20:14PM -0800, Kirk McKusick wrote:
>> The ``fix'' of bzero'ing every buffer cache page was made to UFS/FFS
>> for this problem and it killed write performance of the filesystem
>> by nearly half. We corrected this by only doing the bzero when the
>> file is mmap'ed which helped things considerably (since most files
>> being written are not also mmap'ed).
>
> I am not sure that I follow.
>
> For UFS, leaving any part of the buffer with undefined garbage would
> cause the garbage to appear on the next mmap(2), since page in is
> implemented as translation of the file offsets into disk offsets and
> then reading disk blocks. The read always fetches a full page. UFS cannot
> know if the file would be mapped sometime in future, or after the
> reboot.
>
> In fact, UFS is quite plentiful WRT zeroing buffers on write. It is easy
> to see almost all places where it is done, by searching for BA_CLRBUF
> flag for UFS_BALLOC(). UFS does perform the optimization of _trying_ to
> not clear newly allocated buffer on write if uio covers the whole buffer
> range. Still, on error it falls back to clearing, which is performed by
> vfs_bio_clrbuf() call in ffs_write().

You are entirely correct in your analysis. The original "fix" was
to always clear every buffer even when it was being completely
filled (which is the most common case). I changed the completely
filled case to first try the copyin and zero the buffer only when
the copyin fails. Making that change nearly doubled the speed
of bulk writes.

	~Kirk
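The difference between the two strategies described above can be sketched in userspace C. This is only an illustration, not the ffs_write() code: sim_copyin(), write_block_always_clear(), write_block_clear_on_error(), and BLKSIZE are hypothetical names invented here, and the fault is simulated by a parameter rather than a real bad user address. The sketch also clears the whole block on failure, which is a simplification of what a real buffer-clearing routine would do.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define BLKSIZE 4096	/* illustrative block size, not a UFS constant */

/*
 * Hypothetical stand-in for a user-to-kernel copy. If fault_at is less
 * than len, only fault_at bytes are copied and EFAULT is returned,
 * imitating a bad user pointer partway through the source range.
 */
static int
sim_copyin(const char *src, char *dst, size_t len, size_t fault_at)
{
	assert(src != NULL && dst != NULL);
	if (fault_at < len) {
		memcpy(dst, src, fault_at);
		return (EFAULT);
	}
	memcpy(dst, src, len);
	return (0);
}

/* First approach: clear the block on every write, then copy over it. */
static int
write_block_always_clear(char *buf, const char *src, size_t fault_at)
{
	memset(buf, 0, BLKSIZE);
	return (sim_copyin(src, buf, BLKSIZE, fault_at));
}

/*
 * Second approach, for a write that covers the whole block: try the
 * copy first and clear the block only if the copy fails, so the common
 * successful case touches each byte once instead of twice.
 */
static int
write_block_clear_on_error(char *buf, const char *src, size_t fault_at)
{
	int error;

	error = sim_copyin(src, buf, BLKSIZE, fault_at);
	if (error != 0)
		memset(buf, 0, BLKSIZE);	/* do not expose stale contents */
	return (error);
}
```

In both versions a failed copy leaves no stale bytes in the block; the second one simply avoids the extra pass over the block when the copy succeeds.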