From: Maksim Yevmenkin
To: current@freebsd.org
Date: Wed, 27 Mar 2013 13:43:32 -0700
Subject: [RFC] vfs.read_min proposal

Hello,

I would like to get some reviews, opinions and/or comments on the patch below.

A little bit of background: as far as I understand, cluster_read() can initiate two disk I/Os: one for the exact amount of data being requested (rounded up to the filesystem block size) and another for a configurable read-ahead. The read-ahead data are always extra and do not include the data being requested. Read-ahead can be controlled via f_seqcount (on a per-descriptor basis) and/or vfs.read_max (a global knob).

In some cases and/or on some workloads it can be beneficial to bundle the original data and the read-ahead data into one I/O request. In other words: read more than the caller has requested, but perform only one larger I/O that is a superset of the requested data and the read-ahead.

===

Index: trunk/cache/src/sys/kern/vfs_cluster.c
===================================================================
diff -u -N -r515 -r1888
--- trunk/cache/src/sys/kern/vfs_cluster.c	(.../vfs_cluster.c)	(revision 515)
+++ trunk/cache/src/sys/kern/vfs_cluster.c	(.../vfs_cluster.c)	(revision 1888)
@@ -75,6 +75,10 @@
 SYSCTL_INT(_vfs, OID_AUTO, read_max, CTLFLAG_RW, &read_max, 0,
     "Cluster read-ahead max block count");
 
+static int read_min = 1;
+SYSCTL_INT(_vfs, OID_AUTO, read_min, CTLFLAG_RW, &read_min, 0,
+    "Cluster read min block count");
+
 /* Page expended to mark partially backed buffers */
 extern vm_page_t bogus_page;
 
@@ -169,13 +173,21 @@
 	} else {
 		off_t firstread = bp->b_offset;
 		int nblks;
+		long minread;
 
 		KASSERT(bp->b_offset != NOOFFSET,
 		    ("cluster_read: no buffer offset"));
 
 		ncontig = 0;
 
 		/*
+		 * Adjust totread if needed
+		 */
+		minread = read_min * size;
+		if (minread > totread)
+			totread = minread;
+
+		/*
 		 * Compute the total number of blocks that we should read
 		 * synchronously.
 		 */

===

Thanks,
Max