Date: Tue, 31 Jul 2007 10:10:06 +0100
From: Tom Evans <tevans.uk@googlemail.com>
To: Dominic Bishop <dom@bishnet.net>
Cc: freebsd-questions@freebsd.org
Subject: Re: Increasing GELI performance
Message-ID: <1185873007.1444.14.camel@localhost>
In-Reply-To: <20070728132614.2BC7F13C461@mx1.freebsd.org>
References: <20070728132614.2BC7F13C461@mx1.freebsd.org>
On Sat, 2007-07-28 at 14:26 +0100, Dominic Bishop wrote:
> I've just been testing out GELI performance on an underlying RAID using a
> 3ware 9550SXU-12 running RELENG_6 as of yesterday, and seem to be hitting a
> performance bottleneck, but I can't see where it is coming from.
>
> Testing with an unencrypted 100GB GPT partition (/dev/da0p1) gives me around
> 200-250MB/s read and write speeds, to give an idea of the capability of the
> disk device itself.
>
> Using GELI with a default 128-bit AES key seems to limit at ~50MB/s;
> changing the sector size all the way up to 128KB makes no difference
> whatsoever to the performance. If I use the threads sysctl in loader.conf
> and drop the geli threads to 1 thread only (instead of the usual 3 it spawns
> on this system), the performance still does not change at all. Monitoring
> during writes with systat confirms that it really is spawning 1 or 3 threads
> correctly in these cases.
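For reference, the tunable being adjusted above is kern.geom.eli.threads. A minimal sketch of how the boot-time settings would look, assuming GELI is loaded as a module rather than compiled in (the poster's kernel has the GELI option compiled in, so the load line would be unnecessary there):

```shell
# /boot/loader.conf -- sketch, not the poster's actual file.
# geom_eli_load is only needed when GEOM_ELI is not in the kernel config.
geom_eli_load="YES"
# Number of kernel threads g_eli spawns per provider; 0 picks a default
# based on the number of CPUs.
kern.geom.eli.threads="1"
```

After a reboot the effective value can be checked with `sysctl kern.geom.eli.threads`, and the spawned `g_eli[N]` threads observed in top or systat as shown below.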
>
> Here is a uname -a from the machine:
>
> FreeBSD 004 6.2-STABLE FreeBSD 6.2-STABLE #2: Fri Jul 27 20:10:05 CEST 2007
> dom@004:/u1/obj/u1/src/sys/004 amd64
>
> Kernel is a copy of GENERIC with the GELI option added.
>
> Encrypted partition created using: geli init -s 65536 /dev/da0p1
>
> Simple write test done with: dd if=/dev/zero of=/dev/da0p1.eli bs=1m
> count=10000 (same as I did on the unencrypted; a full test with bonnie++
> shows similar speeds)
>
> Systat output whilst writing, showing 3 threads:
>
>     /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
> Load Average   ||||
>
>     /0  /10  /20  /30  /40  /50  /60  /70  /80  /90  /100
> root   idle: cpu3 XXXXXXXXX
> root   idle: cpu1 XXXXXXXX
>            <idle> XXXXXXXX
> root   idle: cpu0 XXXXXXX
> root   idle: cpu2 XXXXXX
> root   g_eli[2] d XXX
> root   g_eli[0] d XXX
> root   g_eli[1] d X
> root          g_up
> root            dd
>
> Output from vmstat -w 5:
>
>  procs    memory      page                      disks     faults       cpu
>  r b w   avm     fre   flt re pi po    fr sr ad4 da0   in  sy   cs  us sy id
>  0 1 0 38124 3924428   208  0  1  0  9052  0   0   0 1758 451 6354   1 15 84
>  0 1 0 38124 3924428     0  0  0  0 13642  0   0 411 2613 128 9483   0 22 78
>  0 1 0 38124 3924428     0  0  0  0 13649  0   0 411 2614 130 9483   0 22 78
>  0 1 0 38124 3924428     0  0  0  0 13642  0   0 411 2612 128 9477   0 22 78
>  0 1 0 38124 3924428     0  0  0  0 13642  0   0 411 2611 128 9474   0 23 77
>
> Output from iostat -x 5:
>
>                  extended device statistics
> device     r/s   w/s    kr/s    kw/s wait svc_t  %b
> ad4        2.2   0.7    31.6     8.1    0   3.4   1
> da0        0.2 287.8     2.3 36841.5    0   0.4  10
> pass0      0.0   0.0     0.0     0.0    0   0.0   0
>                  extended device statistics
> device     r/s   w/s    kr/s    kw/s wait svc_t  %b
> ad4        0.0   0.0     0.0     0.0    0   0.0   0
> da0        0.0 411.1     0.0 52622.1    0   0.4  15
> pass0      0.0   0.0     0.0     0.0    0   0.0   0
>                  extended device statistics
> device     r/s   w/s    kr/s    kw/s wait svc_t  %b
> ad4        0.0   0.0     0.0     0.0    0   0.0   0
> da0        0.0 411.1     0.0 52616.2    0   0.4  15
> pass0      0.0   0.0     0.0     0.0    0   0.0   0
>
> Looking at these results
> myself I cannot see where the bottleneck is. Since changing the sector
> size or the number of geli threads doesn't affect performance, I would
> assume there is some other single-threaded part limiting it, but I don't
> know enough about how it works to say what.
>
> CPU in the machine is a pair of these:
> CPU: Intel(R) Xeon(R) CPU 5110 @ 1.60GHz (1603.92-MHz K8-class CPU)
>
> I've also come across some other strange issues with some other machines
> which have identical arrays but only a pair of 32-bit 3.0GHz Xeons in them
> (also using RELENG_6 as of yesterday, just i386 rather than amd64). On
> those, geli will launch a single thread by default (cores-1 seems to be the
> default); however, I cannot force it to launch 2 by using the sysctl,
> although on the 4-core machine I can successfully use it to launch 4. It
> would be nice to be able to use both cores on the 32-bit machines for geli,
> but given the results I've shown here I'm not sure it would gain me much at
> the moment.
>
> Another problem I've found is that if I use a sector size for GELI > 8192
> bytes then I'm unable to newfs the encrypted partition afterwards; it fails
> immediately with this error:
>
> newfs /dev/da0p1.eli
> increasing block size from 16384 to fragment size (65536)
> /dev/da0p1.eli: 62499.9MB (127999872 sectors) block size 65536, fragment
> size 65536
> using 5 cylinder groups of 14514.56MB, 232233 blks, 58112 inodes.
> newfs: can't read old UFS1 superblock: read error from block device:
> Invalid argument
>
> The underlying device is readable/writeable, however, as dd can read/write
> to it without any errors.
>
> If anyone has any suggestions/thoughts on any of these points it would be
> much appreciated; these machines will be performing backups over 1Gbit LAN,
> so more speed than I can currently get would be preferable.
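The "Invalid argument" from newfs is consistent with GEOM rejecting I/O that is not a multiple of the provider's sector size: newfs probes for a pre-existing UFS1 superblock with a read smaller than 64KB, which a 65536-byte-sector provider refuses, while dd with a large block size never issues such a read. A sketch of how this could be observed directly (device paths are the poster's; this needs the attached .eli device, so it is illustrative only):

```shell
# A sub-sector read is rejected by the 64KB-sector provider:
dd if=/dev/da0p1.eli of=/dev/null bs=512 count=1    # expected: Invalid argument
# A sector-aligned, sector-multiple read goes through:
dd if=/dev/da0p1.eli of=/dev/null bs=65536 count=1
```

This would explain why the failure only appears once the GELI sector size exceeds the sizes newfs happens to read in, and why the -S flag alone (shown below) does not help: -S describes the sector size to newfs but does not change the size of that probe read.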
>
> I sent this to geom@ and meant to CC here, as geom@ seems to be a pretty
> quiet list and the mail might not get seen there. I forgot the CC, so
> apologies for sending separately here. I'll add a few extra bits that were
> sent to geom@ in a response:
>
> Trying newfs with the -S option, specifying a sector size matching the -s
> option given to geli init:
>
> newfs -S 65536 /dev/da0p1.eli
> increasing block size from 16384 to fragment size (65536)
> /dev/da0p1.eli: 62499.9MB (127999872 sectors) block size 65536, fragment
> size 65536
> using 5 cylinder groups of 14514.56MB, 232233 blks, 58112 inodes.
> newfs: can't read old UFS1 superblock: read error from block device:
> Invalid argument
>
> Diskinfo reports the correct sector size for the geli layer and 512 bytes
> for the underlying GPT partition:
>
> diskinfo -v /dev/da0p1
> /dev/da0p1
>         512             # sectorsize
>         65536000000     # mediasize in bytes (61G)
>         128000000       # mediasize in sectors
>         7967            # Cylinders according to firmware.
>         255             # Heads according to firmware.
>         63              # Sectors according to firmware.
>
> diskinfo -v /dev/da0p1.eli
> /dev/da0p1.eli
>         65536           # sectorsize
>         65535934464     # mediasize in bytes (61G)
>         999999          # mediasize in sectors
>         62              # Cylinders according to firmware.
>         255             # Heads according to firmware.
>         63              # Sectors according to firmware.
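The diskinfo figures are self-consistent: GELI stores its metadata in the provider's last sector, so the .eli device comes out exactly one 64KB sector smaller than the 65536000000-byte partition. A quick check of the arithmetic:

```shell
# The 65536000000-byte partition viewed as 64KB sectors:
echo $(( 65536000000 / 65536 ))                   # 1000000 whole sectors
# GELI keeps its metadata in the last sector, leaving 999999 usable
# sectors -- matching the mediasize diskinfo reports for the .eli device:
echo $(( (65536000000 / 65536 - 1) * 65536 ))     # 65535934464 bytes
```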
>
> Testing on a onetime geli encryption of the underlying raw device, to
> bypass the GPT, shows very similar poor results:
>
> dd if=/dev/da0.eli of=/dev/null bs=1m count=1000
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes transferred in 29.739186 secs (35259069 bytes/sec)
>
> dd if=/dev/zero of=/dev/da0.eli bs=1m count=1000
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes transferred in 23.501061 secs (44618241 bytes/sec)
>
> For comparison, the same test done on the unencrypted raw device:
>
> dd if=/dev/da0 of=/dev/null bs=1m count=1000
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes transferred in 5.802704 secs (180704717 bytes/sec)
>
> dd if=/dev/zero of=/dev/da0 bs=1m count=1000
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes transferred in 4.026869 secs (260394859 bytes/sec)
>
> Looking at 'top -S -s1' whilst doing a long read/write using geli shows a
> geli thread for each core, but there only ever seems to be one in a
> running state at any given time; the others will be in a state of
> 'geli:w'. This would suggest why performance is identical with 1 geli
> thread and with 4 geli threads.
>
> Regards,
>
> Dominic Bishop

A simple solution is just to add some crypto hardware into the mix to beef
things up. Something like a Soekris VPN 1401 would do the trick. See
hifn(4) and http://www.soekris.com/vpn1401.htm
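If a hifn(4) card were added, a sketch of how one might confirm the crypto framework sees it and gauge the speedup before trusting it with GELI (from memory of the 6.x-era tooling; the engine name and flags are worth double-checking against the installed OpenSSL):

```shell
# Load the driver and the userland /dev/crypto interface, if they are
# not already in the kernel; dmesg should then show the card as hifn0.
kldload hifn
kldload cryptodev
# Compare software AES against AES routed through /dev/crypto:
openssl speed -evp aes-128-cbc                     # software baseline
openssl speed -evp aes-128-cbc -engine cryptodev   # via the card
```

GELI itself talks to the kernel crypto(9) framework directly, so once the driver attaches, hardware acceleration should be used without further GELI configuration; the openssl comparison just gives a rough idea of the card's throughput ceiling.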