Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 5 May 2003 21:30:39 +1000 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        Kirk McKusick <mckusick@beastie.mckusick.com>
Cc:        Jens Schweikhardt <schweikh@FreeBSD.org>
Subject:   Re: Access times on executables (kern/25777) 
Message-ID:  <20030505205914.D8183@gamplex.bde.org>
In-Reply-To: <200305050617.h456HNTh017652@beastie.mckusick.com>
References:  <200305050617.h456HNTh017652@beastie.mckusick.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, 4 May 2003, Kirk McKusick wrote:

> > From: Bruce Evans <bde@zeta.org.au>
> >
> > On Sun, 4 May 2003, Kirk McKusick wrote:
> > > OK, so how about instead of use VOP_SETATTR, we just try to VOP_READ
> > > one byte of data. It will run with the speed of read (and indeed since
> > > we just mapped in the header above, the data should be in the cache).
> > > It has the benefit of speed and not requiring the user to own the file.
> > > It has the drawback of requiring that the file be readable though most
> > > executables are set to be readable.
> >
> > It would work better than that.  VOP_READ() works right here since the
> > permissions checks are done at the vfs level at open(2) time, not on every
> > VOP_READ() or read(2).
> >
> > I think VOP_SETATTR() should work similarly: do permissions checks at
> > the vfs level using VOP_ACCESS() instead of in every file system, so
> > that there is less duplicated code and the permissions checks don't
> > get in the way when you don't want them.  This change is now easy for
> > at least setting times.  In Lite2, ufs_setattr() had to dereference

> It will work right on the local filesystem, but I think that it would
> still be broken on NFS since NFS does permission checking on every RPC.

I assume that "it" is VOP_ACCESS() here.  Yes, we don't want to go near
VOP_ACCESS() or the normal path of VOP_SETATTR(set_times) for remote
file systems since the access checking would be very expensive.

> Still, I think that VOP_READ is likely to be more efficient and work in
> a higher percentage of the times (always on the local filesystem and
> most of the time on NFS). Thus, I enclose my (new) proposed change below.

> Index: sys/kern_exec.c
> ===================================================================
> RCS file: /usr/ncvs/src/sys/kern/kern_exec.c,v
> retrieving revision 1.218
> diff -c -r1.218 kern_exec.c
> *** kern_exec.c	1 Apr 2003 01:26:20 -0000	1.218
> --- kern_exec.c	5 May 2003 06:15:21 -0000
> ***************
> *** 598,603 ****
> --- 598,626 ----
>   		exec_setregs(td, imgp->entry_addr,
>   		    (u_long)(uintptr_t)stack_base, imgp->ps_strings);
>
> + 	/*
> + 	 * At this point, it counts as an access.
> + 	 * Ensure that atime gets updated if needed.
> + 	 */
> + 	if (!(textvp->v_mount->mnt_flag & MNT_NOATIME)) {

Note that this check doesn't help for the most expensive case (nfs),
since nfs doesn't implement the noatime mount flag so it never sets
MNT_NOATIME.  However, nfs doesn't actually implement setting of atimes
either -- if the read goes to the server, then the server sets the atime
(unless the file system on the server is mounted -noatime) and the change
is eventually seen by the clent, but most reads on the client are from
the buffer cache so they don't affect the atime on the server unless we
tell it, and we never tell it directly.  This seems to be required for
reasonable efficiency.

> + 		struct uio auio;
> + 		struct iovec aiov;
> + 		char ch;
> +
> + 		auio.uio_iov = &aiov;
> + 		auio.uio_iovcnt = 1;
> + 		aiov.iov_base = &ch;
> + 		aiov.iov_len = 1;
> + 		auio.uio_resid = 1;
> + 		auio.uio_offset = 0;
> + 		auio.uio_segflg = UIO_SYSSPACE;
> + 		auio.uio_rw = UIO_READ;
> + 		auio.uio_td = td;
> + 		vn_lock(textvp, LK_EXCLUSIVE | LK_RETRY, td);
> + 		(void) VOP_READ(textvp, &auio, 0, p->p_ucred);

In the nfs clase, this read is very unlikely to do anything, because
we read the header earlier so this read is very likely to be from
the buffer cache.  Then the previous read may have set the atime, but
this one won't.

> + 		VOP_UNLOCK(textvp, 0, td);
> + 	}
> +
>   done1:
>   	/*
>   	 * Free any resources malloc'd earlier that we didn't use.
>

I think this version would be OK if it skipped remote file systems
explicitly.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20030505205914.D8183>