Date:      Sun, 27 Jul 2008 02:55:36 +0200
From:      Ivan Voras <ivoras@freebsd.org>
To:        freebsd-questions@freebsd.org
Subject:   Re: graid3
Message-ID:  <g6gh2a$a7m$1@ger.gmane.org>
In-Reply-To: <20080725114402.G5386@wojtek.tensor.gdynia.pl>
References:  <20080725114402.G5386@wojtek.tensor.gdynia.pl>

Wojciech Puchar wrote:
> i read the graid3 manual and http://www.acnc.com/04_01_03.html to make
> sure i know what's RAID3 and i don't understand few things.
>
> 1)
>
> "The number of components must be equal to 3, 5, 9, 17, etc.
>                 (2^n + 1)."
>
> why it can't be say 5 disks+parity?

The reason lies in the definition of "RAID 3", which says that updates to
the RAID device must be atomic. In some ideal universe, RAID 3 is
implemented in hardware and operates on individual bytes, but here we
cannot write to the drives in units other than the sector size, and the
sector size is 512 bytes.

Parity needs to be calculated for each sector, so at the sector level
the minimum is three sectors: two for data and one for parity. This
means the high-level atomic sector size is 2*512 = 1024 bytes. If you
inspect your RAID 3 devices, you'll see just that:

# diskinfo -v /dev/raid3/homes
/dev/raid3/homes
         1024            # sectorsize
         107374181376    # mediasize in bytes (100G)
         104857599       # mediasize in sectors

But each drive has a normal sectorsize of 512:

# diskinfo -v /dev/ad4
/dev/ad4
         512             # sectorsize
         80026361856     # mediasize in bytes (75G)
         156301488       # mediasize in sectors
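
To illustrate the three-sector minimum described above, here is a minimal
Python sketch of XOR parity over one 1024-byte logical sector split into
two 512-byte halves. The function names and the contiguous split are
assumptions made for this example; it is not taken from the graid3 code.

```python
SECTOR = 512  # physical sector size of each drive, as reported by diskinfo

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length buffers."""
    return bytes(x ^ y for x, y in zip(a, b))

def split_with_parity(logical: bytes):
    """Split one 1024-byte logical sector into two data sectors plus parity.

    Assumption: the logical sector is cut into contiguous 512-byte pieces.
    """
    if len(logical) != 2 * SECTOR:
        raise ValueError("expected exactly one 1024-byte logical sector")
    d0, d1 = logical[:SECTOR], logical[SECTOR:]
    return d0, d1, xor_bytes(d0, d1)

logical = bytes(range(256)) * 4          # 1024 bytes of sample data
d0, d1, parity = split_with_parity(logical)

# If the drive holding d1 is lost, XOR of the two survivors rebuilds it.
rebuilt_d1 = xor_bytes(d0, parity)
```

Because all three 512-byte sectors belong to the same logical sector, a
single 1024-byte write touches every drive exactly once.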

Sector sizes cannot be arbitrary for various reasons, mostly dealing
with how memory pages and virtual memory are managed. In short, they
need to be powers of two. This restricts us to high-level ("big") sector
sizes of exactly one of the following values: 1024, 2048, 4096, 8192,
etc. Since drive sectors are fixed at 512 bytes, this means that the
number of *data* drives must also be a power of two: 2, 4, 8, 16, etc.
Add one more drive for the parity and you get the starting sequence:
3, 5, 9, 17. Five data disks plus parity would give a sector size of
5*512 = 2560 bytes, which is not a power of two.

In practice, this means that if you have 17 drives in RAID3, the
sector size of the array itself will be 16*512 = 8192 bytes. Each write
to the array will update all 17 drives before returning (one sector on
each drive, ensuring an atomic operation). Note that a file system
created on such an array will also have its characteristics adapted to
the sector size (the fragment size will be the sector size).
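
The rule above can be written down as a short check. This is a sketch
assuming 512-byte drive sectors, as in the text; the function names are
made up for the example and do not correspond to anything in graid3.

```python
DRIVE_SECTOR = 512  # bytes; the physical sector size assumed in the text

def is_power_of_two(n: int) -> bool:
    return n > 0 and (n & (n - 1)) == 0

def array_sector_size(components: int) -> int:
    """Return the logical sector size for a given number of components.

    Raises ValueError when the count is not of the form 2^n + 1 (n >= 1).
    """
    data_drives = components - 1  # one component holds the parity
    if data_drives < 2 or not is_power_of_two(data_drives):
        raise ValueError(
            f"{components} components: {data_drives} data drives "
            "is not a power of two >= 2"
        )
    return data_drives * DRIVE_SECTOR

# Component counts between 3 and 17 that satisfy the rule.
valid = [c for c in range(3, 18) if is_power_of_two(c - 1)]
sizes = {c: array_sector_size(c) for c in valid}
```

Here `valid` is `[3, 5, 9, 17]` and `sizes[17]` is 8192, while
`array_sector_size(6)` (five data disks plus parity) raises an error
because 5*512 = 2560 is not a power of two.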

> 2) "-r  Use parity component for reading in round-robin fashion.
> Without this option the parity component is not used at
> all for reading operations when the device is in a complete state.
> With this option specified random I/O read operations are even 40% faster,
> but sequential reads are slower.  One cannot use this option if the -w
> option is also specified."
>
>
> how parity disk could speed up random I/O?

It will work well only when the number of drives is small (i.e. three
drives), by using the parity drive as a valid source of data, which
avoids some seeks on the drives. I think that, theoretically, you can
save at most 1/3 (0.33) of all seeks; I don't know where the 40% number
comes from.
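
The 1/3 estimate can be checked with a toy model. The sketch below
assumes a three-component array in which any two components are enough
to serve a read, and in which -r simply rotates which component is left
idle. The real graid3 scheduling may differ, and the function name is
invented for the example.

```python
from collections import Counter

def reads_per_drive(requests: int, round_robin: bool, drives: int = 3) -> Counter:
    """Count how many read requests touch each drive in a toy model.

    Drive index drives-1 is the parity component. Without round-robin,
    every request reads all data drives and never the parity drive. With
    round-robin, each request skips one drive in rotation.
    """
    counts = Counter({d: 0 for d in range(drives)})
    for i in range(requests):
        skipped = i % drives if round_robin else drives - 1
        for d in range(drives):
            if d != skipped:
                counts[d] += 1
    return counts

plain = reads_per_drive(300, round_robin=False)
rotated = reads_per_drive(300, round_robin=True)
```

In this model each data drive handles 200 of the 300 requests with
rotation instead of all 300 without it, which is the 1/3 saving
mentioned above.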


