From owner-freebsd-current@FreeBSD.ORG  Wed Mar 27 20:43:38 2013
Date: Wed, 27 Mar 2013 13:43:32 -0700
Message-ID: <CAFPOs6rNDZTqWJZ3hK=px5RX5G44Z3hfzCLQcfceQ2n_7oU3GA@mail.gmail.com>
Subject: [RFC] vfs.read_min proposal
From: Maksim Yevmenkin <maksim.yevmenkin@gmail.com>
To: current@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1

Hello,

i would like to get some reviews, opinions and/or comments on the patch below.

a little bit of background: as far as i understand, cluster_read() can
initiate two disk i/o's: one for the exact amount of data being requested
(rounded up to the filesystem block size) and another for a configurable
read ahead. read ahead data are always extra, i.e. the read ahead i/o does
not cover the data being requested. also, read ahead can be controlled via
f_seqcount (on a per-descriptor basis) and/or vfs.read_max (global knob).

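in some cases and/or on some workloads it can be beneficial to bundle
the original data and the read ahead data in one i/o request. in other words,
read more than the caller has requested, but perform only one larger i/o,
i.e. one that is a superset of the data being requested and the read ahead.
the patch below adds a vfs.read_min sysctl (default 1) for this:
cluster_read() raises totread to at least read_min * size bytes before it
computes the number of blocks to read synchronously.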

===

Index: trunk/cache/src/sys/kern/vfs_cluster.c
===================================================================
diff -u -N -r515 -r1888
--- trunk/cache/src/sys/kern/vfs_cluster.c	(.../vfs_cluster.c)	(revision 515)
+++ trunk/cache/src/sys/kern/vfs_cluster.c	(.../vfs_cluster.c)	(revision 1888)
@@ -75,6 +75,10 @@
 SYSCTL_INT(_vfs, OID_AUTO, read_max, CTLFLAG_RW, &read_max, 0,
     "Cluster read-ahead max block count");

+static int read_min = 1;
+SYSCTL_INT(_vfs, OID_AUTO, read_min, CTLFLAG_RW, &read_min, 0,
+    "Cluster read min block count");
+
 /* Page expended to mark partially backed buffers */
 extern vm_page_t	bogus_page;

@@ -169,13 +173,21 @@
 	} else {
 		off_t firstread = bp->b_offset;
 		int nblks;
+		long minread;

 		KASSERT(bp->b_offset != NOOFFSET,
 		    ("cluster_read: no buffer offset"));

 		ncontig = 0;

 		/*
+		 * Adjust totread if needed
+		 */
+		minread = read_min * size;
+		if (minread > totread)
+			totread = minread;
+
+		/*
 		 * Compute the total number of blocks that we should read
 		 * synchronously.
 		 */

===
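
for illustration, the adjustment made by the patch above can be modelled
in userland roughly as follows. this is only a sketch, not kernel code:
sync_blocks() and HOWMANY() are hypothetical names, the round-up division
is assumed to match how cluster_read() turns totread into a block count,
and the later clamping against the file size and cluster size is left out.

```c
/* Round-up division; assumed to behave like the kernel's howmany(). */
#define HOWMANY(x, y)	(((x) + ((y) - 1)) / (y))

/*
 * Hypothetical userland model of the patch: raise the requested byte
 * count (totread) to at least read_min blocks of "size" bytes, then
 * convert the result to a block count.
 */
static long
sync_blocks(long totread, long size, int read_min)
{
	long minread;

	/* Widen before multiplying so the product is computed as long. */
	minread = (long)read_min * size;
	if (minread > totread)
		totread = minread;
	return (HOWMANY(totread, size));
}
```

with an assumed 16384-byte block size, a 4096-byte request maps to 1 block
when read_min is 1 and to 4 blocks when read_min is 4, while a 131072-byte
request stays at 8 blocks in both cases, since it already exceeds the minimum.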

thanks,
max