Date:      Thu, 20 Dec 2007 15:41:48 -0800
From:      Julian Elischer <julian@elischer.org>
To:        Peter Schuller <peter.schuller@infidyne.com>
Cc:        freebsd-fs@freebsd.org, ticso@cicely.de, Ivan Voras <ivoras@freebsd.org>
Subject:   Re: readv: parallel or sequential?
Message-ID:  <476AFDBC.9040301@elischer.org>
In-Reply-To: <200712210036.49040.peter.schuller@infidyne.com>
References:  <fjbb3v$n60$1@ger.gmane.org>	<200712202140.08367.peter.schuller@infidyne.com>	<20071220221735.GB67140@cicely12.cicely.de> <200712210036.49040.peter.schuller@infidyne.com>

Peter Schuller wrote:
>> In case the application uses serialized access there is not much to do
>> beside preread or caching writes to make use of multiple spindles.
> 
> Agreed.
> 
>> But an application has to be careful, because parallel access within
>> a single file almost always means that access is not linear anymore, so
>> many other performance tunings won't work as well as they could, so
>> this could easily outweigh the performance gain from multiple access.
> 
> For seek bound applications you don't really care anyway. If you have a 
> mixture of stream bound and seek bound I/O going on you will run into various 
> issues which are difficult to avoid without very careful application-specific 
> tuning I think. But for the simple case of doing concurrent seek-bound I/O I 
> would expect it to be handled gracefully by the OS.
> 
> And I do mean to the same file, rather than file descriptor (in response to 
> the other post on descriptors).
> 
>> Nonlinear access from within an application has to be for another reason
>> and not as a performance tuning.
> 
> Why? Again, PostgreSQL, other databases, or any file access pattern which is 
> seek bound stands to gain more or less linearly from concurrent I/O being 
> propagated to constituent devices in a non-serialized fashion. This is a 
> pretty basic assumption in my book when designing an application. Whenever 
> something is seek bound, assuming I have concurrency in my app, I look at the 
> number of constituent devices on the device and the type of RAID or similar 
> being used (including stripe sizes in relation to the size of my I/O 
> requests, etc).
> 
> I fully expect to be able to scale linearly with the number of underlying 
> devices, assuming raid0/raid10 or something equivalent, and assuming I have a 
> concurrency that is sufficiently high to keep all drives busy.
> 
> (There are valid exceptions of course, such as raidz/raidz2. But that's beyond 
> the scope of this discussion.)

Multiple reads and writes to the same file *from different file descriptors*
(same process or not) might proceed in "parallel", but readv and writev will
be issued serially to the filesystem. Now, IF THE FILESYSTEM IS NOT DOING
SYNCHRONOUS DISK ACCESSES, the reads and writes might proceed in parallel or be
grouped, clustered or otherwise rearranged.
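
To make the distinction concrete, here is a minimal sketch in Python (not
taken from the thread) that contrasts the two access patterns. One function
issues a single readv(2), which starts at one file position and fills its
buffers in order from one contiguous range. The other issues independent
pread(2) calls from several threads, each on its own descriptor, at scattered
offsets. The names CHUNK, NCHUNKS, make_test_file, read_with_readv and
read_concurrently are hypothetical and exist only for this illustration. The
sketch shows how the requests are issued; it does not measure or prove
anything about how a particular filesystem or RAID layer schedules them.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

CHUNK = 4096   # hypothetical request size in bytes
NCHUNKS = 8    # hypothetical number of requests


def make_test_file():
    """Create a temporary file whose i-th chunk is filled with byte value i."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(NCHUNKS):
            f.write(bytes([i]) * CHUNK)
    return path


def read_with_readv(path):
    """One readv(2) call: buffers are filled in order from one contiguous range."""
    bufs = [bytearray(CHUNK) for _ in range(NCHUNKS)]
    fd = os.open(path, os.O_RDONLY)
    try:
        total = os.readv(fd, bufs)  # a single request, starting at offset 0
    finally:
        os.close(fd)
    return total, [bytes(b) for b in bufs]


def read_concurrently(path, offsets):
    """Independent pread(2) calls, one thread and one descriptor per offset."""
    def one_read(offset):
        fd = os.open(path, os.O_RDONLY)
        try:
            return os.pread(fd, CHUNK, offset)
        finally:
            os.close(fd)

    # Each worker issues its own system call, so the kernel sees several
    # outstanding requests and is free to overlap or reorder them.
    with ThreadPoolExecutor(max_workers=len(offsets)) as pool:
        return list(pool.map(one_read, offsets))


if __name__ == "__main__":
    path = make_test_file()
    try:
        total, bufs = read_with_readv(path)
        print("readv returned", total, "bytes in one call")
        scattered = [i * CHUNK for i in reversed(range(NCHUNKS))]
        chunks = read_concurrently(path, scattered)
        print("concurrent pread returned", len(chunks), "chunks")
    finally:
        os.remove(path)
```

A seek-bound application that wants several spindles busy at once would
follow the second pattern (or use asynchronous I/O), since a single readv
gives the kernel only one request to work with at a time.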
