Date:      Sat, 14 Nov 1998 10:41:05 +1030
From:      Greg Lehey <grog@lemis.com>
To:        Mike Smith <mike@smith.net.au>, Bernd Walter <ticso@cicely.de>
Cc:        Peter Jeremy <peter.jeremy@auss2.alcatel.com.au>, hackers@FreeBSD.ORG
Subject:   Re: [Vinum] Stupid benchmark: newfsstone
Message-ID:  <19981114104105.S781@freebie.lemis.com>
In-Reply-To: <199811132336.PAA01117@dingo.cdrom.com>; from Mike Smith on Fri, Nov 13, 1998 at 03:36:14PM -0800
References:  <19981114002523.39363@cicely.de> <199811132336.PAA01117@dingo.cdrom.com>

On Friday, 13 November 1998 at 15:36:14 -0800, Mike Smith wrote:
>> On Fri, Nov 13, 1998 at 01:50:40PM -0800, Mike Smith wrote:
>>>> Greg Lehey <grog@lemis.com> wrote:
>>>>>  And it's almost impossible to find
>>>>> spindle synchronized disks nowadays.
>>>>
>>>> Seagate Barracudas support it; I assumed that the newer Seagates did
>>>> as well.  The impression I got was that all you had to do was wire the
>>>> `spindle sync' lines from all the disks together and then designate
>>>> all except one as a sync'd slave.  Admittedly, I've never tried
>>>> actually using it.
>>>
>>> Most modern "server class" SCSI disks support it.  It's not useful
>>> unless you turn off tagged queueing, caching and most other drive
>>> performance features.
>> What's the problem with having these options on when using spindle sync?
>
> The whole point of spindle sync is to exactly lock all the drives
> together to coordinate read/write activity.  These features in
> conjunction with sector sparing and quantum differences between disks
> mean that synchronising spindles is a complete waste of time, as the
> disks won't be mimicking each other anyway.

Apart from this, in a multiuser environment you won't just be reading
a single file.  Any write to a block device, or multiple concurrent
reads, will cause the block device subsystem to issue requests at
different times, so even synchronized spindles, no spare sectors and
identical disks wouldn't help.

>>>>>  Finally, aggregating involves a
>>>>> scatter/gather approach which, unless I've missed something, is not
>>>>> supported at a hardware level.  Each request to the driver specifies
>>>>> one buffer for the transfer, so the scatter gather would have to be
>>>>> done by allocating more memory and performing the transfer there (for
>>>>> a read) and then copying to the correct place.
>>>>
>>>> Since the actual data transfer occurs to physical memory, whilst the
>>>> kernel buffers are in VM, this should just require some imaginative
>>>> juggling of the PTEs so the physical pages (or actual scatter/gather
>>>> requests) are de-interleaved (to match the data on each spindle).
>>>
>>> You'd have to cons a new map and have it present the scattered target
>>> area as a linear region.  This is expensive, and the performance boost
>>> is likely to be low to nonexistent for optimal stripe sizes.
>>> Concatenation of multiple stripe reads is only a benefit if the stripe
>>> is small (so that concatenation significantly lowers overhead).
>> That's right - but you can't expect a high linear performance increase
>> when using large stripes.
>
> That depends on the application's read behaviour.  If reads are larger
> than the stripe size, you still win.

We're talking about stripes larger than the maximum physical I/O
length here.

I'll tell you what: anybody who wants to, go and look at the request
building code in /usr/src/lkm/vinum/request.c and rework it to perform
the "aggregation" optimizations that Bernd wants, and I'll put it into
the code.
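
To give an idea of what that copy-based aggregation amounts to, here's
a rough user-space sketch (this is not Vinum's request code; the member
disks and their transfers are faked with arrays and memcpy).  It reads
a stripe-aligned range by doing one contiguous transfer per member into
a bounce buffer and then copying each chunk to its place in the
caller's buffer:

/*
 * Sketch of copy-based aggregation for a striped read: one contiguous
 * transfer per member disk into a bounce buffer, then a scatter copy
 * into the caller's buffer.  Purely illustrative; the "disks" are
 * arrays and the transfers are memcpy.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NDISKS      4
#define STRIPESIZE  512                 /* bytes per stripe chunk */
#define DISKSIZE    (STRIPESIZE * 8)    /* per-member capacity for the demo */

static char disk[NDISKS][DISKSIZE];     /* stand-ins for the member disks */

/* Simulated single contiguous transfer from one member disk. */
static void
disk_read(int d, size_t offset, void *buf, size_t len)
{
    memcpy(buf, &disk[d][offset], len);
}

/*
 * Read 'len' bytes starting at volume offset 0 (kept simple), where the
 * volume is striped chunk by chunk across NDISKS members.
 */
static void
striped_read_aggregated(char *userbuf, size_t len)
{
    size_t chunks = len / STRIPESIZE;   /* assume stripe-aligned for the demo */
    size_t perdisk = ((chunks + NDISKS - 1) / NDISKS) * STRIPESIZE;
    char *bounce = malloc(perdisk);     /* the "extra memory" for the transfer */
    size_t c;
    int d;

    for (d = 0; d < NDISKS; d++) {
        /* One aggregated transfer per member instead of one per chunk. */
        disk_read(d, 0, bounce, perdisk);

        /* Copy each chunk to its non-contiguous place in the caller's buffer. */
        for (c = d; c < chunks; c += NDISKS)
            memcpy(userbuf + c * STRIPESIZE,
                   bounce + (c / NDISKS) * STRIPESIZE,
                   STRIPESIZE);
    }
    free(bounce);
}

int
main(void)
{
    char buf[STRIPESIZE * 8];
    size_t c;

    /* Chunk c lives on disk c % NDISKS at offset (c / NDISKS) * STRIPESIZE. */
    for (c = 0; c < 8; c++)
        memset(&disk[c % NDISKS][(c / NDISKS) * STRIPESIZE], 'A' + (int)c, STRIPESIZE);

    striped_read_aggregated(buf, sizeof(buf));
    for (c = 0; c < 8; c++)
        printf("chunk %lu starts with %c\n", (unsigned long)c, buf[c * STRIPESIZE]);
    return 0;
}

In the driver the bounce buffer would have to come from kernel memory
and the copies would go into the original transfer buffer, but the
arithmetic is the same.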

Greg
--
See complete headers for address, home page and phone numbers
finger grog@lemis.com for PGP public key



