From: Konstantin Belousov
Date: Fri, 31 Jul 2015 04:12:52 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r286106 - head/sys/kern
Message-Id: <201507310412.t6V4CqZv007540@repo.freebsd.org>

Author: kib
Date: Fri Jul 31 04:12:51 2015
New Revision: 286106
URL: https://svnweb.freebsd.org/changeset/base/286106

Log:
  vn_io_fault() handling of the LOR (lock order reversal) for i/o into
  file-backed buffers has observable overhead when the buffer pages are
  not resident or not mapped.  The overhead comes from at least two
  factors.  One is the additional work needed to detect the situation
  and to prepare and execute the rollbacks.  The other is a consequence
  of splitting the i/o into batches of held pages, which causes
  filesystems to see a series of smaller i/o requests instead of a
  single large request.  Note that the expected case of a resident i/o
  buffer does not expose these issues.

  Provide prefaulting for userspace i/o buffers, disabled by default.
  I am wary of enabling prefaulting by default for now, since it would
  be detrimental to applications which speculatively pass extra-large
  buffers of anonymous memory to avoid dealing with buffer sizing (if
  such apps exist).

  Found and tested by:	bde, emaste
  Sponsored by:	The FreeBSD Foundation
  MFC after:	1 week

Modified:
  head/sys/kern/vfs_vnops.c

Modified: head/sys/kern/vfs_vnops.c
==============================================================================
--- head/sys/kern/vfs_vnops.c	Fri Jul 31 03:40:09 2015	(r286105)
+++ head/sys/kern/vfs_vnops.c	Fri Jul 31 04:12:51 2015	(r286106)
@@ -116,6 +116,9 @@ static const int io_hold_cnt = 16;
 static int vn_io_fault_enable = 1;
 SYSCTL_INT(_debug, OID_AUTO, vn_io_fault_enable, CTLFLAG_RW,
     &vn_io_fault_enable, 0, "Enable vn_io_fault lock avoidance");
+static int vn_io_fault_prefault = 0;
+SYSCTL_INT(_debug, OID_AUTO, vn_io_fault_prefault, CTLFLAG_RW,
+    &vn_io_fault_prefault, 0, "Enable vn_io_fault prefaulting");
 static u_long vn_io_faults_cnt;
 SYSCTL_ULONG(_debug, OID_AUTO, vn_io_faults, CTLFLAG_RD,
     &vn_io_faults_cnt, 0, "Count of vn_io_fault lock avoidance triggers");
@@ -1020,6 +1023,59 @@ vn_io_fault_doio(struct vn_io_fault_args
 	    uio->uio_rw);
 }
 
+static int
+vn_io_fault_touch(char *base, const struct uio *uio)
+{
+	int r;
+
+	r = fubyte(base);
+	if (r == -1 || (uio->uio_rw == UIO_READ && subyte(base, r) == -1))
+		return (EFAULT);
+	return (0);
+}
+
+static int
+vn_io_fault_prefault_user(const struct uio *uio)
+{
+	char *base;
+	const struct iovec *iov;
+	size_t len;
+	ssize_t resid;
+	int error, i;
+
+	KASSERT(uio->uio_segflg == UIO_USERSPACE,
+	    ("vn_io_fault_prefault userspace"));
+
+	error = i = 0;
+	iov = uio->uio_iov;
+	resid = uio->uio_resid;
+	base = iov->iov_base;
+	len = iov->iov_len;
+	while (resid > 0) {
+		error = vn_io_fault_touch(base, uio);
+		if (error != 0)
+			break;
+		if (len < PAGE_SIZE) {
+			if (len != 0) {
+				error = vn_io_fault_touch(base + len - 1, uio);
+				if (error != 0)
+					break;
+				resid -= len;
+			}
+			if (++i >= uio->uio_iovcnt)
+				break;
+			iov = uio->uio_iov + i;
+			base = iov->iov_base;
+			len = iov->iov_len;
+		} else {
+			len -= PAGE_SIZE;
+			base += PAGE_SIZE;
+			resid -= PAGE_SIZE;
+		}
+	}
+	return (error);
+}
+
 /*
  * Common code for vn_io_fault(), agnostic to the kind of i/o request.
  * Uses vn_io_fault_doio() to make the call to an actual i/o function.
@@ -1041,6 +1097,12 @@ vn_io_fault1(struct vnode *vp, struct ui
 	ssize_t adv;
 	int error, cnt, save, saveheld, prev_td_ma_cnt;
 
+	if (vn_io_fault_prefault) {
+		error = vn_io_fault_prefault_user(uio);
+		if (error != 0)
+			return (error); /* Or ignore ? */
+	}
+
 	prot = uio->uio_rw == UIO_READ ? VM_PROT_WRITE : VM_PROT_READ;
 
 	/*
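
The page-touching walk added by this commit can be tried outside the kernel. What follows is only a userspace sketch of the same idea, not the committed code. The names sketch_touch, sketch_prefault and SKETCH_PAGE_SIZE are hypothetical, the page size of 4096 is an assumption, and plain volatile loads and stores stand in for fubyte() and subyte(). Unlike the kernel loop, the sketch skips an exhausted segment before touching anything, a choice made here so that it never dereferences a pointer one past the end of a buffer.

```c
/* Userspace sketch of the prefault walk; not the kernel code. */
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Assumed page size for the sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096

/*
 * Read one byte and, if the buffer is about to be written into (the
 * read(2) direction), store the same byte back so that the page is
 * faulted in writable.  Stands in for fubyte()/subyte().
 */
static void
sketch_touch(char *p, int will_be_written)
{
	volatile char *vp = p;
	char c = *vp;

	if (will_be_written)
		*vp = c;
}

/*
 * Walk the iovec array, touching the first byte of every
 * SKETCH_PAGE_SIZE step and the last byte of each segment.
 * Returns the number of touches performed.
 */
static size_t
sketch_prefault(const struct iovec *iov, int iovcnt, ssize_t resid,
    int will_be_written)
{
	char *base;
	size_t len, touches = 0;
	int i = 0;

	assert(iov != NULL && iovcnt > 0);
	base = iov[0].iov_base;
	len = iov[0].iov_len;
	while (resid > 0) {
		if (len == 0) {
			/* Segment exhausted: advance before touching. */
			if (++i >= iovcnt)
				break;
			base = iov[i].iov_base;
			len = iov[i].iov_len;
			continue;
		}
		sketch_touch(base, will_be_written);
		touches++;
		if (len < SKETCH_PAGE_SIZE) {
			/* Tail shorter than a page: touch its last byte. */
			sketch_touch(base + len - 1, will_be_written);
			touches++;
			resid -= (ssize_t)len;
			len = 0;
		} else {
			len -= SKETCH_PAGE_SIZE;
			base += SKETCH_PAGE_SIZE;
			resid -= SKETCH_PAGE_SIZE;
		}
	}
	return (touches);
}
```

Stepping by one page from the start of each segment and then touching the segment's last byte reaches every page the segment overlaps, even when the segment is not page-aligned. The write-back is done only for the read direction because that is the case in which the kernel stores into the user buffer, which matches the UIO_READ check in vn_io_fault_touch().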