Date:      Fri, 8 Jan 2010 02:15:10 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Alexander Motin <mav@freebsd.org>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject:   Re: svn commit: r201658 - head/sbin/geom/class/stripe
Message-ID:  <20100108013737.S56162@delplex.bde.org>
In-Reply-To: <4B450F30.20705@FreeBSD.org>
References:  <201001061712.o06HCICF087127@svn.freebsd.org> <9bbcef731001060938k2b0014a2m15eef911b9922b2c@mail.gmail.com>  <4B44D8FA.2000608@FreeBSD.org> <9bbcef731001061103u33fd289q727179454b21ce18@mail.gmail.com> <4B450F30.20705@FreeBSD.org>

On Thu, 7 Jan 2010, Alexander Motin wrote:

> Ivan Voras wrote:
>> Yes, my experience which led to the post was mostly on UFS which,
>> while AFAIK it does read-ahead, it still does it serially (I think
>> this is implied by your experiments with NCQ and ZFS vs UFS) - so in
>> any case only 2 drives are hit with 64k stripe size at any moment in
>> time.
>
> I do not think it is true. On system with default MAXPHYS I've made
> gstripe with 64K block of 4 equal drives with 108MB/s of maximal read
> speed. Reads with dd from a large pre-written file on UFS showed:
>
> vfs.read_max=8 (default) - 235090074 bytes/sec
> vfs.read_max=16          - 378385148 bytes/sec
> vfs.read_max=32          - 386620109 bytes/sec

Maybe I'm wrong about it being limited by MAXPHYS.  'racluster' is
limited by MAXPHYS, but 'maxra' (vfs.read_max) is not, and these
interact confusingly.

BTW, vfs.read_max has bogus units -- fs blocks (bsize not fsize for
ffs IIRC).  The default of 8 works very badly when the fs block size
is small (512 say).  In my version, the units are DEV_BSIZE blocks and
the default is the default MAXPHYS/DEV_BSIZE (should be MAXPHYS/DEV_BSIZE).
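
To put numbers on the units problem, here is a small sketch.  It assumes
the read-ahead window is simply read_max fs blocks, and that MAXPHYS is
128K and DEV_BSIZE is 512 (the usual values, hard-coded here rather than
taken from the headers).  The helper names are made up for illustration
and are not the kernel's.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values: the usual defaults, not read from any running system. */
#define SKETCH_MAXPHYS   (128 * 1024)
#define SKETCH_DEV_BSIZE 512

/*
 * Hypothetical helper: size in bytes of the read-ahead window when the
 * limit (vfs.read_max) is counted in fs blocks of bsize bytes.
 */
static size_t
ra_window_bytes(size_t read_max, size_t bsize)
{
	assert(bsize > 0);
	return (read_max * bsize);
}

/*
 * Hypothetical helper: a default limit counted in DEV_BSIZE blocks,
 * chosen so that the window is MAXPHYS bytes whatever the fs block size.
 */
static size_t
ra_default_dev_blocks(void)
{
	return (SKETCH_MAXPHYS / SKETCH_DEV_BSIZE);
}
```

With read_max=8, ra_window_bytes(8, 512) is only 4K, while
ra_window_bytes(8, 16384) is 128K.  Counted in DEV_BSIZE blocks,
ra_default_dev_blocks() is 256 blocks, i.e. 128K independent of the fs
block size.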

> I've put some printfs into the clustering read code and found enough
> read-ahead there. So it works.
>
> One thing IMHO would be nice to see there is the alignment of the
> read-ahead requests to the array stripe size/offset. Dirty hack I've
> tried there, reduced number of requests to the array components by 30%.

ffs thinks that bsize alignment is adequate.  It doesn't try to align
files any more than that.  Then for sequential reads from the beginning
of the file, vfs read clustering tries to read MAXPHYS bytes at a time,
so it perfectly preserves any initial misalignment.  I'm not sure what
happens for large random reads.  Does seeking outside of the read-ahead
reset the alignment to the seek point?  It shouldn't, if alignment
done by the file system is to work right.  However, vfs should re-align
if the file system or user i/o doesn't, so that all of its reads of
mnt_iosize_max bytes start on an alignment boundary.
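
A toy model of why the misalignment costs requests (a sketch only, not
the clustering code).  It assumes each stripe unit touched by a read
becomes one request to a component, which holds as long as a single read
is no larger than the stripe size times the number of drives.  The
function names and the "realign" flag are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of stripe units overlapped by the byte range [off, off + len). */
static uint64_t
stripes_touched(uint64_t off, uint64_t len, uint64_t stripe)
{
	assert(stripe > 0 && len > 0);
	return ((off + len - 1) / stripe - off / stripe + 1);
}

/*
 * Count component requests for a sequential read of 'total' bytes that
 * starts at array offset 'start' and is issued 'chunk' bytes at a time.
 * If 'realign' is nonzero, a misaligned read is shortened so that the
 * next one starts on a chunk boundary; otherwise every read keeps the
 * initial misalignment.
 */
static uint64_t
component_requests(uint64_t start, uint64_t total, uint64_t chunk,
    uint64_t stripe, int realign)
{
	uint64_t off = start, end = start + total, n = 0, len;

	assert(chunk > 0);
	while (off < end) {
		len = chunk;
		if (realign && off % chunk != 0)
			len = chunk - off % chunk;	/* read up to the boundary */
		if (len > end - off)
			len = end - off;		/* last, possibly short, read */
		n += stripes_touched(off, len, stripe);
		off += len;
	}
	return (n);
}
```

For a 1MB read starting 16K into a 64K stripe with 128K reads,
component_requests(16384, 1048576, 131072, 65536, 0) counts 24 requests
(3 per read), and the same call with realign=1 counts 17.  A read that
starts aligned needs 16.  These are the toy model's numbers, not a
measurement.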

Bruce


