Date:      Thu, 08 May 2003 11:18:55 -0700
From:      Kirk McKusick <mckusick@beastie.mckusick.com>
To:        Bruce Evans <bde@zeta.org.au>
Cc:        Jens Schweikhardt <schweikh@FreeBSD.org>
Subject:   Re: Access times on executables (kern/25777) 
Message-ID:  <200305081818.h48IItTh026354@beastie.mckusick.com>
In-Reply-To: Your message of "Thu, 08 May 2003 18:28:33 +1000." <20030508175350.S59316@gamplex.bde.org> 

Thanks for your very thorough analysis of the options!

Since most filesystems are not mounted with -async, I am curious
how the numbers change when you run them normally (UFS with soft
updates, and the NFS client/server with default mount options),
since that is what folks are likely to really do.  Also, does
option 1 actually result in access times being set over NFS, or
are the times rejected because the binaries are not owned by the
user?

	Kirk McKusick

=-=-=-=-=-=

Date: Thu, 8 May 2003 18:28:33 +1000 (EST)
From: Bruce Evans <bde@zeta.org.au>
To: Kirk McKusick <mckusick@mckusick.com>
cc: Jens Schweikhardt <schweikh@FreeBSD.org>, arch@FreeBSD.org,
        Brian Buhrow <buhrow@lothlorien.nfbcal.org>
Subject: Re: Access times on executables (kern/25777) 
In-Reply-To: <20030505205914.D8183@gamplex.bde.org>
X-ASK-Info: Whitelist match

I'm replying to an earlier article in this thread since I lost your last
reply.

On Sun, 4 May 2003, Kirk McKusick wrote:

> Still, I think that VOP_READ is likely to be more efficient and work in
> a higher percentage of the times (always on the local filesystem and
> most of the time on NFS). Thus, I enclose my (new) proposed change below.

> Index: sys/kern_exec.c
> ===================================================================
> RCS file: /usr/ncvs/src/sys/kern/kern_exec.c,v
> retrieving revision 1.218
> diff -c -r1.218 kern_exec.c
> *** kern_exec.c	1 Apr 2003 01:26:20 -0000	1.218
> --- kern_exec.c	5 May 2003 06:15:21 -0000
> ***************
> *** 598,603 ****
> --- 598,626 ----
>   		exec_setregs(td, imgp->entry_addr,
>   		    (u_long)(uintptr_t)stack_base, imgp->ps_strings);
>
> + 	/*
> + 	 * At this point, it counts as an access.
> + 	 * Ensure that atime gets updated if needed.
> + 	 */
> + 	if (!(textvp->v_mount->mnt_flag & MNT_NOATIME)) {
> + 		struct uio auio;
> + 		struct iovec aiov;
> + 		char ch;
> +
> + 		auio.uio_iov = &aiov;
> + 		auio.uio_iovcnt = 1;
> + 		aiov.iov_base = &ch;
> + 		aiov.iov_len = 1;
> + 		auio.uio_resid = 1;
> + 		auio.uio_offset = 0;
> + 		auio.uio_segflg = UIO_SYSSPACE;
> + 		auio.uio_rw = UIO_READ;
> + 		auio.uio_td = td;
> + 		vn_lock(textvp, LK_EXCLUSIVE | LK_RETRY, td);
> + 		(void) VOP_READ(textvp, &auio, 0, p->p_ucred);
> + 		VOP_UNLOCK(textvp, 0, td);
> + 	}
> +
>   done1:
>   	/*
>   	 * Free any resources malloc'd earlier that we didn't use.

I ran some benchmarks after fixing some bugs in the above: textvp is
stale (maybe NULL the first time?), and, as jhb pointed out, p->p_ucred
should be td->td_ucred.

I benchmarked the following methods:
- method 1: my original method using VOP_SETATTR(), updated
- method 2: a fixed version of the VOP_READ() patch above
- method 3: a version of the above that just calls vn_rdwr() instead

%%%
Index: kern_exec.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_exec.c,v
retrieving revision 1.208
diff -u -2 -r1.208 kern_exec.c
--- kern_exec.c	13 Jan 2003 23:04:31 -0000	1.208
+++ kern_exec.c	8 May 2003 07:11:05 -0000
@@ -1,2 +1,4 @@
+int exec_atime_method = -1;
+
 /*
  * Copyright (c) 1993, David Greenman
@@ -611,4 +599,49 @@
 		    (u_long)(uintptr_t)stack_base, imgp->ps_strings);

+	/*
+	 * At this point, it counts as an access.
+	 * Ensure that atime gets updated if needed.
+	 */
+	if (!(ndp->ni_vp->v_mount->mnt_flag & MNT_NOATIME)) {
+	if (exec_atime_method == 1) {
+		struct vattr vattr;
+		struct mount *mp;
+
+		if (vn_start_write(ndp->ni_vp, &mp, V_WAIT | PCATCH) != 0)
+			goto out;
+		VATTR_NULL(&vattr);
+		vfs_timestamp(&vattr.va_atime);
+		vattr.va_vaflags |= VA_EXECVE_ATIME;
+		VOP_LEASE(ndp->ni_vp, td, td->td_ucred, LEASE_WRITE);
+		vn_lock(ndp->ni_vp, LK_EXCLUSIVE | LK_RETRY, td);
+		(void) VOP_SETATTR(ndp->ni_vp, &vattr, td->td_ucred, td);
+		VOP_UNLOCK(ndp->ni_vp, 0, td);
+		vn_finished_write(mp);
+out: ;
+	} else if (exec_atime_method == 2) {
+		struct iovec aiov;
+		struct uio auio;
+		char ch;
+
+		auio.uio_iov = &aiov;
+		auio.uio_iovcnt = 1;
+		aiov.iov_base = &ch;
+		aiov.iov_len = 1;
+		auio.uio_resid = 1;
+		auio.uio_offset = 0;
+		auio.uio_segflg = UIO_SYSSPACE;
+		auio.uio_rw = UIO_READ;
+		auio.uio_td = td;
+		vn_lock(ndp->ni_vp, LK_EXCLUSIVE | LK_RETRY, td);
+		(void) VOP_READ(ndp->ni_vp, &auio, 0, td->td_ucred);
+		VOP_UNLOCK(ndp->ni_vp, 0, td);
+	} else if (exec_atime_method == 3) {
+		char ch;
+
+		(void) vn_rdwr(UIO_READ, ndp->ni_vp, &ch, 1, 0, UIO_SYSSPACE,
+		    0, td->td_ucred, td->td_ucred, NULL, td);
+	}
+	}
+
 done1:
 	/*
%%%
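
The patch does not show how exec_atime_method gets switched between
runs; a runtime knob along the following lines would do it (a minimal
sketch using the stock SYSCTL_INT macro; the kern.exec_atime_method
OID name is just an example, not anything in the patch):

%%%
#include <sys/param.h>
#include <sys/kernel.h>
#include <sys/sysctl.h>

/* The selector defined at the top of kern_exec.c in the patch above. */
extern int exec_atime_method;

/*
 * Export it as kern.exec_atime_method so the method can be flipped
 * between benchmark runs without rebooting:
 *	sysctl kern.exec_atime_method=2
 */
SYSCTL_INT(_kern, OID_AUTO, exec_atime_method, CTLFLAG_RW,
    &exec_atime_method, 0, "atime update method used by execve()");
%%%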

The VOP_READ() methods had much more overhead than I expected or would
like (about 25% for a tiny statically linked program, `main() {}').
Even the VOP_SETATTR() method was faster.

Raw output:

%%%
ufs-static-noexec
        4.32 real         0.04 user         4.13 sys
        4.23 real         0.07 user         4.01 sys
        4.30 real         0.06 user         4.10 sys
ufs-static-static
    method -1
        7.02 real         0.06 user         6.77 sys
        7.10 real         0.07 user         6.83 sys
        7.09 real         0.06 user         6.83 sys
        7.03 real         0.19 user         6.65 sys
        7.11 real         0.22 user         6.68 sys
        7.00 real         0.04 user         6.77 sys
        7.12 real         0.19 user         6.75 sys
        7.07 real         0.22 user         6.66 sys
        7.07 real         0.07 user         6.81 sys
    method 1
        7.72 real         0.23 user         7.29 sys
        7.76 real         0.25 user         7.31 sys
        7.71 real         0.30 user         7.21 sys
        7.69 real         0.29 user         7.21 sys
        7.70 real         0.31 user         7.18 sys
        7.71 real         0.19 user         7.32 sys
    method 2
        7.83 real         0.07 user         7.56 sys
        7.84 real         0.27 user         7.37 sys
        7.82 real         0.28 user         7.34 sys
    method 3
        7.90 real         0.26 user         7.43 sys
        7.89 real         0.05 user         7.62 sys
        7.92 real         0.05 user         7.66 sys
        7.86 real         0.05 user         7.61 sys
        7.88 real         0.18 user         7.49 sys
        7.84 real         0.21 user         7.43 sys
        7.88 real         0.31 user         7.36 sys
        7.82 real         0.28 user         7.34 sys
        7.86 real         0.24 user         7.42 sys

nfs-static-noexec
        4.13 real         0.05 user         3.95 sys
        4.20 real         0.17 user         3.89 sys
        4.25 real         0.25 user         3.87 sys
nfs-static-static
    method -1
       11.43 real         0.04 user         9.18 sys
       11.41 real         0.21 user         8.96 sys
       11.43 real         0.27 user         8.94 sys
    method 1
       11.54 real         0.28 user         8.88 sys
       11.64 real         0.19 user         9.08 sys
       11.53 real         0.17 user         8.97 sys
    method 2
       11.77 real         0.09 user         9.49 sys
       11.78 real         0.25 user         9.31 sys
       11.80 real         0.33 user         9.26 sys
    method 3
       11.94 real         0.24 user         9.46 sys
       11.86 real         0.32 user         9.35 sys
       11.81 real         0.28 user         9.35 sys
%%%

*fs-static-noexec tests 10000 forks with no exec of a statically linked
program on *fs.

*fs-static-static tests 10000 forks with exec of statically linked
programs on *fs.
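
A driver for such a run can be as simple as the following loop under
time(1) (a sketch of the setup, not the actual harness used; ./null
stands for the statically linked `main() {}' program):

%%%
#include <sys/types.h>
#include <sys/wait.h>

#include <err.h>
#include <unistd.h>

/*
 * Fork NITER children; with an argument, each child also execs the
 * trivial statically linked ./null program (cc -static -o null null.c,
 * with null.c containing `int main(void) { return (0); }').
 * Run as `time ./bench' for the noexec case and `time ./bench x' for
 * the fork+exec case.
 */
#define	NITER	10000

int
main(int argc, char **argv)
{
	int i;

	for (i = 0; i < NITER; i++) {
		switch (fork()) {
		case -1:
			err(1, "fork");
		case 0:
			if (argc > 1) {
				execl("./null", "null", (char *)NULL);
				err(1, "execl");
			}
			_exit(0);
		default:
			if (wait(NULL) == -1)
				err(1, "wait");
		}
	}
	return (0);
}
%%%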

The system is an old Celeron (the speed doesn't matter much; only compare
the relative times).

All file systems were mounted -async.

The exec part of the fork-exec takes an average of about 270 usec for
the ufs case: (7.06s - 4.28s of real time) / 10000 execs is roughly
278 usec.  Setting the atime adds about 60 usec to this for method 1.
Method 2 adds another 10 usec on top of method 1, and method 3 another
4 usec on top of method 2.

Plain read() takes much less than 60 usec on this machine:

	$ dd if=/kernel of=/dev/null bs=1 count=1000000
	1000000+0 records in
	1000000+0 records out
	1000000 bytes transferred in 17.703695 secs (56485 bytes/sec)

so most of the overhead for exec must be indirect (1000000 bytes in
17.7 seconds is only about 18 usec per one-byte read()).  I think it
is just close() forcing the update marks out to timestamps.  This
happens after every exec, but not after every read in the dd
benchmark.  I think method 1 turns out to be fastest because it
forces the marks to timestamps directly.
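
For readers not familiar with the term: the "update marks" are the
IN_ACCESS/IN_CHANGE/IN_UPDATE flags on the in-core inode.  A UFS read
only sets IN_ACCESS; the flag becomes a real atime later, when
something like ufs_close() runs the itimes logic.  Roughly (a
paraphrase from memory, not the exact ufs_vnops.c code):

%%%
/*
 * Paraphrase of the ufs_itimes() logic (from memory; not the exact
 * ufs_vnops.c source).  VOP_READ() merely sets IN_ACCESS on the
 * in-core inode; only here does the mark become a timestamp and the
 * inode become dirty.
 */
static void
itimes_sketch(struct vnode *vp)
{
	struct inode *ip;
	struct timespec ts;

	ip = VTOI(vp);
	if ((ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_UPDATE)) == 0)
		return;			/* no pending marks */
	vfs_timestamp(&ts);
	if (ip->i_flag & IN_ACCESS) {
		ip->i_atime = ts.tv_sec;
		ip->i_atimensec = ts.tv_nsec;
	}
	/* IN_CHANGE and IN_UPDATE are handled likewise for ctime/mtime. */
	ip->i_flag &= ~(IN_ACCESS | IN_CHANGE | IN_UPDATE);
	ip->i_flag |= IN_MODIFIED;	/* schedule the inode for disk */
}
%%%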

Bruce


