Date:      Thu, 28 Mar 2013 09:52:09 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Maksim Yevmenkin <maksim.yevmenkin@gmail.com>
Cc:        current@freebsd.org
Subject:   Re: [RFC] vfs.read_min proposal
Message-ID:  <20130328075209.GL3794@kib.kiev.ua>
In-Reply-To: <CAFPOs6rNDZTqWJZ3hK=px5RX5G44Z3hfzCLQcfceQ2n_7oU3GA@mail.gmail.com>
References:  <CAFPOs6rNDZTqWJZ3hK=px5RX5G44Z3hfzCLQcfceQ2n_7oU3GA@mail.gmail.com>

On Wed, Mar 27, 2013 at 01:43:32PM -0700, Maksim Yevmenkin wrote:
> Hello,
>
> i would like to get some reviews, opinions and/or comments on the patch below.
>
> a little bit background, as far as i understand, cluster_read() can
> initiate two disk i/o's: one for exact amount of data being requested
> (rounded up to a filesystem block size) and another for a configurable
> read ahead. read ahead data are always extra and do not super set data
> being requested. also, read ahead can be controlled via f_seqcount (on
> per descriptor basis) and/or vfs.read_max (global knob).
>
> in some cases and/or on some work loads it can be beneficial to bundle
> original data and read ahead data in one i/o request. in other words,
> read more than caller has requested, but only perform one larger i/o,
> i.e. super set data being requested and read ahead.

The totread argument to cluster_read() is supplied by the filesystem
to indicate how much data the current request actually asks for. Always
overriding this information means two things:
- you fill the buffer and page cache with potentially unused data.
  For some situations, like partial reads, it would be really bad.
- you increase the latency by forcing the reader to wait for the whole
  cluster which was not asked for.
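
To put rough numbers on the second point, here is a minimal userland
sketch of the clamp from the patch. It is not the kernel code: the
function names are made up for illustration, and it assumes a request
aligned to a block boundary, ignoring the in-block offset and the
end-of-file handling that cluster_read() has to deal with.

```c
#include <assert.h>

/*
 * Userland model of the proposed clamp (illustrative names only).
 * totread  - bytes the filesystem says the current request needs
 * size     - filesystem block size in bytes
 * read_min - value of the proposed vfs.read_min knob, in blocks
 */
static long
clamped_totread(long totread, long size, int read_min)
{
	long minread;

	assert(size > 0 && totread >= 0 && read_min >= 0);
	minread = (long)read_min * size;
	return (minread > totread ? minread : totread);
}

/* Number of blocks needed to cover nbytes, rounded up. */
static long
blocks_for(long nbytes, long size)
{
	assert(size > 0 && nbytes >= 0);
	return ((nbytes + size - 1) / size);
}

/* Blocks the reader waits for that the request did not ask for. */
static long
extra_sync_blocks(long totread, long size, int read_min)
{
	return (blocks_for(clamped_totread(totread, size, read_min), size) -
	    blocks_for(totread, size));
}
```

Under these assumptions, with a 16 KB block size and read_min set to 8,
a 4 KB request would wait for 7 blocks it never asked for, while a
256 KB request would be unaffected.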

So this looks like a very single- and special-purpose hack. Besides, the
global knob is obscure and probably would not have any use outside your
special situation. Would a file flag be acceptable for you?
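
As a rough illustration of what a per-file opt-in could look like, the
sketch below applies the same clamp only when a flag is set on the file.
Everything in it is hypothetical: the HYP_FILE_READMIN flag, the
hyp_file structure and its fields do not exist in FreeBSD, and how such
a flag would be set or stored is left open.

```c
#include <assert.h>
#include <stddef.h>

#define	HYP_FILE_READMIN	0x0001	/* hypothetical per-file opt-in flag */

/* Hypothetical per-file state; not a real kernel structure. */
struct hyp_file {
	int	flags;		/* hypothetical per-file flags */
	int	read_min;	/* hypothetical minimum read, in blocks */
};

/*
 * Return the byte count to read synchronously: the request as given,
 * unless the file has opted in to a larger minimum.
 */
static long
file_totread(const struct hyp_file *fp, long totread, long size)
{
	long minread;

	assert(fp != NULL && size > 0 && totread >= 0);
	if ((fp->flags & HYP_FILE_READMIN) == 0)
		return (totread);
	minread = (long)fp->read_min * size;
	return (minread > totread ? minread : totread);
}
```

Files that do not set the flag keep the current behaviour, so partial
reads elsewhere in the system are not penalized.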

What difference do you see in the numbers, and which numbers are they?
Is this targeted at read(2) optimization, or are you also concerned
with the read-ahead done at fault time?

>
> ===
>
> Index: trunk/cache/src/sys/kern/vfs_cluster.c
> ===================================================================
> diff -u -N -r515 -r1888
> --- trunk/cache/src/sys/kern/vfs_cluster.c	(.../vfs_cluster.c)	(revision 515)
> +++ trunk/cache/src/sys/kern/vfs_cluster.c	(.../vfs_cluster.c)	(revision 1888)
> @@ -75,6 +75,10 @@
>  SYSCTL_INT(_vfs, OID_AUTO, read_max, CTLFLAG_RW, &read_max, 0,
>      "Cluster read-ahead max block count");
>
> +static int read_min = 1;
> +SYSCTL_INT(_vfs, OID_AUTO, read_min, CTLFLAG_RW, &read_min, 0,
> +    "Cluster read min block count");
> +
>  /* Page expended to mark partially backed buffers */
>  extern vm_page_t	bogus_page;
>
> @@ -169,13 +173,21 @@
>  	} else {
>  		off_t firstread = bp->b_offset;
>  		int nblks;
> +		long minread;
>
>  		KASSERT(bp->b_offset != NOOFFSET,
>  		    ("cluster_read: no buffer offset"));
>
>  		ncontig = 0;
>
>  		/*
> +		 * Adjust totread if needed
> +		 */
> +		minread = read_min * size;
> +		if (minread > totread)
> +			totread = minread;
> +
> +		/*
>  		 * Compute the total number of blocks that we should read
>  		 * synchronously.
>  		 */
>
> ===
>
> thanks,
> max
> _______________________________________________
> freebsd-current@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"



