Date:      Fri, 13 Nov 1998 16:54:22 +1030
From:      Greg Lehey <grog@lemis.com>
To:        Peter Jeremy <peter.jeremy@auss2.alcatel.com.au>, hackers@FreeBSD.ORG
Subject:   Re: [Vinum] Stupid benchmark: newfsstone
Message-ID:  <19981113165422.X781@freebie.lemis.com>
In-Reply-To: <98Nov13.140613est.40335@border.alcanet.com.au>; from Peter Jeremy on Fri, Nov 13, 1998 at 02:06:39PM +1100
References:  <98Nov13.140613est.40335@border.alcanet.com.au>

On Friday, 13 November 1998 at 14:06:39 +1100, Peter Jeremy wrote:
> Greg Lehey <grog@lemis.com> wrote:
>>  And it's almost impossible to find
>> spindle synchronized disks nowadays.
>
> Seagate Barracudas support it; I assumed that the newer Seagates did
> as well.  The impression I got was that all you had to do was wire the
> `spindle sync' lines from all the disks together and then designate
> all except one as a sync'd slave.  Admittedly, I've never tried
> actually using it.

OK, I haven't actually tried, but there were several messages out
there suggesting that spindle synchronization was on its way out.

>> Finally, aggregating involves a scatter/gather approach which,
>> unless I've missed something, is not supported at a hardware level.
>> Each request to the driver specifies one buffer for the transfer,
>> so the scatter gather would have to be done by allocating more
>> memory and performing the transfer there (for a read) and then
>> copying to the correct place.
>
> Since the actual data transfer occurs to physical memory, whilst the
> kernel buffers are in VM, this should just require some imaginative
> juggling of the PTEs so the physical pages (or actual scatter/gather
> requests) are de-interleaved (to match the data on each spindle).

Yes, I hadn't neglected that.  But VM pages are 4 kB in size, and
Bernd was talking about 512 byte stripe width.  Bring it up to 4 kB
and a page is relatively unlikely to straddle a stripe boundary.

> This does assume that the actual stripe is a multiple of the
> pagesize (if scatter/gather isn't supported).  

Oops.  But yes, agreed.
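
To illustrate the page-size point, here is a rough sketch of how a
single transfer breaks up across drives.  It assumes plain round-robin
striping; the function name and the 4-drive layout are made up for the
example and are not taken from the vinum or ccd sources.

```python
def split_transfer(offset, length, stripe_size, ndrives):
    """Split a volume transfer into per-drive pieces.

    Assumes simple round-robin striping (an assumption for this sketch).
    Returns a list of (drive, drive_offset, nbytes) tuples.
    """
    pieces = []
    end = offset + length
    while offset < end:
        stripe = offset // stripe_size          # which stripe of the volume
        within = offset % stripe_size           # offset inside that stripe
        nbytes = min(stripe_size - within, end - offset)
        drive = stripe % ndrives                # stripes rotate over the drives
        drive_offset = (stripe // ndrives) * stripe_size + within
        pieces.append((drive, drive_offset, nbytes))
        offset += nbytes
    return pieces


PAGE = 4096  # VM page size discussed above

# One page with 512 byte stripes: the page is spread over many pieces.
small = split_transfer(0, PAGE, 512, 4)

# One page with 4 kB stripes, page-aligned: a single piece on one drive.
large = split_transfer(0, PAGE, 4096, 4)
```

With 512 byte stripes the one page ends up in eight pieces, so no
amount of page remapping can de-interleave it; with page-aligned 4 kB
stripes the whole page stays on one drive.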

> And I'm not saying that implementing this would be easy or clean.

I've never been intimately involved in an implementation myself, but I
could imagine it could be made either easy or clean.

> What would be useful is some help (from vinum or ccd) to ensure that
> the cylinder group blocks (superblock + inode maps etc) don't cross
> stripes.

It's difficult, since you can do what you like with the volume.  Use
whatever parameters you like for the newfs.  Of course, it's obvious
that if your cylinder group sizes are a multiple of a relatively large
power of two, and your stripe sizes are too, then you're liable to end
up with all your superblocks on the same drive.

Greg
--
See complete headers for address, home page and phone numbers
finger grog@lemis.com for PGP public key

