Date: Thu, 28 Mar 2013 09:52:09 +0200
From: Konstantin Belousov
To: Maksim Yevmenkin
Cc: current@freebsd.org
Subject: Re: [RFC] vfs.read_min proposal
Message-ID: <20130328075209.GL3794@kib.kiev.ua>

On Wed, Mar 27, 2013 at 01:43:32PM -0700, Maksim Yevmenkin wrote:
> Hello,
>
> i would like to get some reviews, opinions and/or comments on the patch below.
>
> a little bit background, as far as i understand, cluster_read() can
> initiate two disk i/o's: one for exact amount of data being requested
> (rounded up to a filesystem block size) and another for a configurable
> read ahead. read ahead data are always extra and do not super set data
> being requested. also, read ahead can be controlled via f_seqcount (on
> per descriptor basis) and/or vfs.read_max (global knob).
>
> in some cases and/or on some work loads it can be beneficial to bundle
> original data and read ahead data in one i/o request. in other words,
> read more than caller has requested, but only perform one larger i/o,
> i.e. super set data being requested and read ahead.

The totread argument to cluster_read() is supplied by the filesystem to
indicate how much data the current request asks for. Always overriding
this information means two things:
- you fill the buffer and page cache with potentially unused data.
  For some situations, like partial reads, that would be really bad.
- you increase the latency by forcing the reader to wait for the whole
  cluster, which was not asked for.

So it looks like a very single- and special-purpose hack. Besides, the
global knob is obscure and probably would not have any use outside your
special situation. Would a file flag be acceptable for you?

What is the difference in the numbers you see, and what are those numbers?
Is it targeted at read(2) optimizations, or are you also concerned with
the read-ahead done at fault time?
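To make the two costs above concrete, here is a minimal user-space sketch of the clamp the proposed patch applies to totread. It is only an illustration, not kernel code: the helper names clamp_totread() and blocks_for() are hypothetical, and only the comparison against read_min * size is taken from the diff.

```c
/*
 * User-space model of the totread clamp in the proposed patch.
 * clamp_totread() and blocks_for() are hypothetical helper names;
 * only the "minread = read_min * size" comparison comes from the diff.
 */

/* Bytes that cluster_read() would treat as requested after the clamp. */
static long
clamp_totread(long totread, int read_min, long size)
{
	long minread = (long)read_min * size;

	return (minread > totread ? minread : totread);
}

/* Number of filesystem blocks of `size` bytes needed to cover `bytes`. */
static long
blocks_for(long bytes, long size)
{
	return ((bytes + size - 1) / size);
}
```

Assuming a 16384-byte filesystem block (an example value, not one from the thread) and read_min set to 8, clamp_totread(512, 8, 16384) yields 131072, so a 512-byte partial read has to wait for eight blocks where one would have covered it. A request that is already larger than read_min * size passes through unchanged.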
>
> ===
>
> Index: trunk/cache/src/sys/kern/vfs_cluster.c
> ===================================================================
> diff -u -N -r515 -r1888
> --- trunk/cache/src/sys/kern/vfs_cluster.c (.../vfs_cluster.c) (revision 515)
> +++ trunk/cache/src/sys/kern/vfs_cluster.c (.../vfs_cluster.c) (revision 1888)
> @@ -75,6 +75,10 @@
>  SYSCTL_INT(_vfs, OID_AUTO, read_max, CTLFLAG_RW, &read_max, 0,
>      "Cluster read-ahead max block count");
>
> +static int read_min = 1;
> +SYSCTL_INT(_vfs, OID_AUTO, read_min, CTLFLAG_RW, &read_min, 0,
> +    "Cluster read min block count");
> +
>  /* Page expended to mark partially backed buffers */
>  extern vm_page_t bogus_page;
>
> @@ -169,13 +173,21 @@
>  	} else {
>  		off_t firstread = bp->b_offset;
>  		int nblks;
> +		long minread;
>
>  		KASSERT(bp->b_offset != NOOFFSET,
>  		    ("cluster_read: no buffer offset"));
>
>  		ncontig = 0;
>
>  		/*
> +		 * Adjust totread if needed
> +		 */
> +		minread = read_min * size;
> +		if (minread > totread)
> +			totread = minread;
> +
> +		/*
>  		 * Compute the total number of blocks that we should read
>  		 * synchronously.
>  		 */
>
> ===
>
> thanks,
> max