From owner-cvs-src@FreeBSD.ORG Tue Oct 4 23:51:37 2005 Return-Path: X-Original-To: cvs-src@FreeBSD.org Delivered-To: cvs-src@FreeBSD.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9CD0116A41F; Tue, 4 Oct 2005 23:51:37 +0000 (GMT) (envelope-from bde@zeta.org.au) Received: from mailout1.pacific.net.au (mailout1.pacific.net.au [61.8.0.84]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0B0B743D46; Tue, 4 Oct 2005 23:51:36 +0000 (GMT) (envelope-from bde@zeta.org.au) Received: from mailproxy1.pacific.net.au (mailproxy1.pacific.net.au [61.8.0.86]) by mailout1.pacific.net.au (8.13.4/8.13.4/Debian-3) with ESMTP id j94NpZ3H022440; Wed, 5 Oct 2005 09:51:35 +1000 Received: from katana.zip.com.au (katana.zip.com.au [61.8.7.246]) by mailproxy1.pacific.net.au (8.13.4/8.13.4/Debian-3) with ESMTP id j94NpX9L022743; Wed, 5 Oct 2005 09:51:33 +1000 Date: Wed, 5 Oct 2005 09:51:33 +1000 (EST) From: Bruce Evans X-X-Sender: bde@delplex.bde.org To: Diomidis Spinellis In-Reply-To: <200510041458.j94Eww99013781@repoman.freebsd.org> Message-ID: <20051005084913.V49706@delplex.bde.org> References: <200510041458.j94Eww99013781@repoman.freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: cvs-src@FreeBSD.org, src-committers@FreeBSD.org, cvs-all@FreeBSD.org Subject: Re: cvs commit: src/sys/vm vm_mmap.c X-BeenThere: cvs-src@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: CVS commit messages for the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 04 Oct 2005 23:51:38 -0000 On Tue, 4 Oct 2005, Diomidis Spinellis wrote: > dds 2005-10-04 14:58:58 UTC > > FreeBSD src repository > > Modified files: > sys/vm vm_mmap.c > Log: > Update the vnode's access time after an mmap operation on it. > Before this change a copy operation with cp(1) would not update the > file access times. > > According to the POSIX mmap(2) documentation: the st_atime field > of the mapped file may be marked for update at any time between the > mmap() call and the corresponding munmap() call. The initial read > or write reference to a mapped region shall cause the file's st_atime > field to be marked for update if it has not already been marked for > update. % Index: vm_mmap.c % =================================================================== % RCS file: /home/ncvs/src/sys/vm/vm_mmap.c,v % retrieving revision 1.201 % retrieving revision 1.202 % diff -u -2 -r1.201 -r1.202 % --- vm_mmap.c 20 Sep 2005 22:08:27 -0000 1.201 % +++ vm_mmap.c 4 Oct 2005 14:58:58 -0000 1.202 % @@ -1165,4 +1165,16 @@ % *objp = obj; % *flagsp = flags; % + % + /* Update access time. */ % + if ((vp->v_mount->mnt_flag & MNT_NOATIME) == 0) { % + struct vattr vattr; % + struct timespec ts; % + % + VATTR_NULL(&vattr); % + vfs_timestamp(&ts); % + vattr.va_atime = ts; % + (void)VOP_SETATTR(vp, &vattr, td->td_ucred, td); % + % + } % done: % vput(vp); This is a large pessimization for nfs and a usually-small pessimization for local file systems. It seems to be missing necesssary calls to vn_{start,finish}_write() and the locking messes needed to make these calls. For nfs, it causes an rpc call for every mmap, even when everything is cached so there are no other rpc calls, and even though nfs's support for atimes in general is quite broken (normal atime updates for read() only occur if the data is not cached so that rpcs to read the data are required; these rpcs update the atime accidentally iff the server's MNT_NOATIME flag is not set, but the above forces an update iff the client's MNT_NOATIME is not set) Thus for cp on nfs, this change gives the silly behaviour that if cp uses mmap then the atime is always updated if the client's MNT_NOATIME flag is not set, but if cp uses read() then the atime is only marked for update, only in non-determinate cases, only if the server's MNT_NOATIME flag is not set. For local file systems, it inefficiencies by updating instead of marking for update. The inefficiencies are fs-dependent (except for the cost of vfs_timestamp() which may be significant). For ffs, even if it is mounted -sync, the updates are implemented using delayed writes so they are not much more inefficient than marking for update. Unfortunately, the update marks are fs-dependent so it is not possible to just set them in the vnode. Most of these problems are avoided for the similar pessimization of execve(). My version of part of this is: % /* % * Here we should update the access time of the file. This must % * implemented by the underlying file system in the same way as % * access timestamps for VOP_READ() because we want to avoid % * blocking and/or i/o and have not called vn_start_write(). % */ % if ((imgp->vp->v_mount->mnt_flag & (MNT_NOATIME | MNT_RDONLY)) == 0) { % VATTR_NULL(&atimeattr); % atimeattr.va_vaflags |= VA_EXECVE_ATIME; % (void)VOP_SETATTR(imgp->vp, &atimeattr, td->td_ucred, td); % } This uses a new VA_ flag to tell VOP_SETATTR() to only mark the atime. Only ffs supports this flag. When only this flag is set, VOP_SETATTR() should do nothing for file systems like nfs that don't really support atimes. As mentioned in the comment, marking the atime must not involve writing so that callers don't have to worry about calling vn_{start,finish}_write(). See setutimes() for the easy part of the setup needed to call VOP_SETATTR() just for setting a time. In mmap(), things are more complicated because the vnode is already unlocked. I think you can just unlock it early. If not, see kern_mknod() for the messy locking needed for using vn_start_write() on locked vnodes (it is necessary to unlock and restart in some cases). I think you should use the VA_EXECVE_ATIME flag as above after renaming it to VA_MARK_ATIME and/or putting the code in a utility function. The patch has some style bugs (comment does less than echo the code (the update is conditional), nested declarations, unsorted declarations, misplaced blank line). Bruce