Date:      Sun, 21 Jan 2007 08:09:19 +0100
From:      Max Laier <max@love2party.net>
To:        freebsd-net@freebsd.org
Subject:   Re: slow writes on nfs with bge devices
Message-ID:  <200701210809.27770.max@love2party.net>
In-Reply-To: <20070121155510.C23922@delplex.bde.org>
References:  <20070121155510.C23922@delplex.bde.org>

On Sunday 21 January 2007 07:25, Bruce Evans wrote:
> nfs writes much less well with bge NICs than with other NICs (sk, fxp,

Do you use hardware checksumming on the bge?  There is an XXX in
bge_start_locked() that looks a bit suspicious to me.
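
If so, it may be worth ruling the offload path out first.  A minimal test,
assuming a stock ifconfig and a hypothetical unit name of bge0:

    # disable TX/RX checksum offload on the bge interface
    ifconfig bge0 -txcsum -rxcsum
    # ... re-run the iozone write test, then turn offload back on
    ifconfig bge0 txcsum rxcsum

If the write rate recovers with offload off, the problem is more likely in
the bge checksum/mbuf handling than in nfs itself.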

> xl, even rl).  Sometimes writing a 20K source file from vi seems to
> take about 2 seconds instead of seeming to be instantaneous (this gets
> faster as the system warms up).  Iozone shows the problem more
> reproducibly.  E.g.:
>
> 100Mbps fxp server -> 1Gbps bge 5701 client, udp:
> %%%
>  	IOZONE: Performance Test of Sequential File I/O  --  V1.16 (10/28/92)
>  		By Bill Norcott
>
>  	Operating System: FreeBSD -- using fsync()
>
> IOZONE: auto-test mode
>
>  	MB      reclen  bytes/sec written   bytes/sec read
>  	1       512     1516885             291918639
>  	1       1024    1158783             491354263
>  	1       2048    1573651             715694105
>  	1       4096    1223692             917431957
>  	1       8192    729513              1097929467
>  	2       512     1694809             281196631
>  	2       1024    1379228             507917189
>  	2       2048    1659521             789608264
>  	2       4096    4606056             1064567574
>  	2       8192    1142288             1318131028
>  	4       512     1242214             298269971
>  	4       1024    1853545             492110628
>  	4       2048    2120136             742888430
>  	4       4096    1896792             1121799065
>  	4       8192    850210              1441812403
>  	8       512     1563847             281422325
>  	8       1024    1480844             492749552
>  	8       2048    1658649             850165954
>  	8       4096    2105283             1211348180
>  	8       8192    2098425             1554875506
>  	16      512     1508821             296842294
>  	16      1024    1966239             527850530
>  	16      2048    2036609             842656736
>  	16      4096    1666138             1200594889
>  	16      8192    2293378             1620824908
> Completed series of tests
> %%%
>
> Here bge barely reaches 10Mbps speeds (~1.2 MB/S) for writing.  Reading
> is cached well and fast.  100Mbps xl on the same client with the same
> server goes at full 100Mbps speed (11.77 MB/S for all file sizes
> including larger ones since the disk is not the limit at 100Mbps).
> 1Gbps sk on a different client with the same server goes at full
> 100Mbps speed.
>
> Switching to tcp gives full 100 Mbps speed.  However, when the bge link
> speed is reduced to 100Mbps, udp becomes about 10 times slower than the
> above and tcp becomes about as slow as the above (maybe a bit faster,
> but far below 11.77 MB/S).
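
It might also help to pin the transport and block size explicitly for both
runs, so the comparison is apples to apples.  A sketch, assuming standard
mount_nfs options and hypothetical server/export names:

    # udp mount with an explicit 8k block size
    mount -t nfs -o udp,rsize=8192,wsize=8192 server:/export /mnt
    # the same mount over tcp for the comparison run
    mount -t nfs -o tcp,rsize=8192,wsize=8192 server:/export /mnt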
>
> bge is also slow at nfs serving:
>
> 1Gbps bge 5701 server -> 1Gbps sk client:
> %%%
>
>  	IOZONE: Performance Test of Sequential File I/O  --  V1.16 (10/28/92)
>  		By Bill Norcott
>
>  	Operating System: FreeBSD -- using fsync()
>
> IOZONE: auto-test mode
>
>  	MB      reclen  bytes/sec written   bytes/sec read
>  	1       512     36255350            242114472
>  	1       1024    3051699             413319147
>  	1       2048    22406458            632021710
>  	1       4096    22447700            851162198
>  	1       8192    3522493             1047562648
>  	2       512     3270779             48125247
>  	2       1024    28992179            46693718
>  	2       2048    5956380             753318255
>  	2       4096    27616650            1053311658
>  	2       8192    5573338             48290208
>  	4       512     9004770             47435659
>  	4       1024    9576276             45601645
>  	4       2048    30348874            85116667
>  	4       4096    8635673             86150049
>  	4       8192    9356773             47100031
>  	8       512     9762446             46424146
>  	8       1024    10054027            58344604
>  	8       2048    9197430             60253061
>  	8       4096    15934077            59476759
>  	8       8192    8765470             47647937
>  	16      512     5670225             46239891
>  	16      1024    9425169             45950990
>  	16      2048    9833515             46242945
>  	16      4096    14812057            51313693
>  	16      8192    9203742             47648722
> Completed series of tests
> %%%
>
> Now the available bandwidth is 10 times larger and about 9/10 of it is
> still not used, with a high variance.  For larger files, the variance
> is lower and the average speed is about 10MB/S.  The disk can only do
> about 40MB/S and the slowest of the 1Gbps NICs (sk) can only sustain
> 80MB/S through udp and about 50MB/S through tcp (it is limited by the
> 33 MHz 32-bit PCI bus and by being less smart than the bge interface).
> When the bge NIC was on the system which is now the server with the fxp
> NIC, bge and nfs worked unsurprisingly, just slower than I would have
> liked.  The write speed was 20-30MB/S for large files and 30-40MB/S for
> medium-sized files, with low variance.  This is the only configuration
> in which nfs/bge worked as expected.
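
For what it's worth, the sk number is already close to what that bus can
do: a 32-bit/33MHz PCI slot moves at most about 132 MB/S of raw data, so
~80 MB/S of udp payload plus descriptor and header traffic is a large
fraction of it.  A quick back-of-the-envelope check:

    # theoretical 32-bit/33MHz PCI burst bandwidth, in MB/s
    echo $((33000000 * 4 / 1000000))    # -> 132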
>
> The problem is very old and not very hardware dependent.  Similar
> behaviour happens when some of the following are changed:
>
> OS -> FreeBSD-~5.2 or FreeBSD-6
> hardware -> newer amd64 CPU (Turion X2) with 5705 (iozone output for
> this below) instead of old amd64 CPU with 5701.  The newer amd64
> normally runs an i386-SMP current kernel while the old amd64 was
> running an amd64-UP current kernel in the above tests, but normally
> runs ~5.2 amd64-UP and behaves similarly with that. The combination
> that seemed to work right was an AthlonXP for the server with the same
> 5701 and any kernel.  The only strangeness with that was that current
> kernels gave a 5-10% slower nfs server despite giving a 30-90% larger
> packet rate for small packets.
>
>  	IOZONE: Performance Test of Sequential File I/O  --  V1.16 (10/28/92)
>  		By Bill Norcott
>
>  	Operating System: FreeBSD -- using fsync()
>
> 100Mbps fxp server -> 1Gbps bge 5705 client:
> %%%
> IOZONE: auto-test mode
>
>  	MB      reclen  bytes/sec written   bytes/sec read
>  	1       512     2994400             185462027
>  	1       1024    3074084             337817536
>  	1       2048    2991691             576792985
>  	1       4096    3074759             884740798
>  	1       8192    3078019             1176892296
>  	2       512     4262096             186709962
>  	2       1024    2994468             339893080
>  	2       2048    5112176             584846610
>  	2       4096    4754187             909815165
>  	2       8192    5100574             1212919611
>  	4       512     5298715             187129017
>  	4       1024    5302620             344445041
>  	4       2048    4985597             590579630
>  	4       4096    3703618             927711124
>  	4       8192    5236177             1240896243
>  	8       512     5142274             186899396
>  	8       1024    6207933             345564808
>  	8       2048    6162773             593088329
>  	8       4096    6031445             936751120
>  	8       8192    6072523             1224102288
>  	16      512     5427113             186797193
>  	16      1024    5065901             345544445
>  	16      2048    5462338             595487384
>  	16      4096    5256552             937013065
>  	16      8192    5097101             1226320870
> Completed series of tests
> %%%
>
> rl on a system with 1/20 as much CPU is faster than this.
>
> The problem doesn't seem to affect much besides writes on nfs.  The
> bge 5701 works very well for most things.  It has a much better bus
> interface than the 5705 and works even better after moving it to the
> old amd64 system (it can now saturate 1Gbps where on the AthlonXP it
> only got 3/4 of the way, while the 5705 only gets 1/4 of the way).
> I've been working on minimising network latency and maximising packet
> rate, and normally have very low network latency (60-80 uS for ping)
> and fairly high packet rates.  The changes for this are not the cause
> of the bug :-), since the behaviour is not affected by running kernels
> without these changes or by sysctl'ing the changes to be null.
> However, the problem looks like one caused by large latencies combined
> with non-streaming protocols.  To write at just 11.77 MB/S, at least
> 8000 packets/second must be sent from the client to the server.  Working
> clients sustain this rate, but on broken clients the rate is much lower
> and not sustained:
>
> Output from netstat -s 1 on server while writing a ~1GB file via
> 5701/udp: %%%
>              input        (Total)           output
>     packets  errs      bytes    packets  errs      bytes colls
>         900     0    1513334        142     0      33532     0
>        1509     0    2564836        236     0      57368     0
>        1647     0    2295802        259     0      51106     0
>        1603     0    1502736        252     0      32926     0
>        1055     0     637014        163     0      13938     0
>         558     0    1542510         86     0      34340     0
>         984     0     989854        155     0      21816     0
>         864     0    1320786        135     0      38152     0
>         883     0    1558060        165     0      34340     0
>        1177     0    3780102        203     0      85850     0
>        2087     0     954212        331     0      21210     0
>        1187     0    1413568        190     0      31310     0
>         650     0    3320604        101     0      75346     0
>        1565     0    1706542        246     0      37976     0
>        2055     0    2360620        329     0      52318     0
>        1554     0    2416996        244     0      54226     0
>        1402     0    2579894        220     0      58176     0
>        1690     0     774488        267     0      16968     0
>        1323     0    3690650        209     0      83830     0
>         591     0    4519858         92     0     103110     0
> %%%
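
The quoted packet rate is easy to sanity-check: with a 1500-byte MTU each
full-size frame carries roughly 1480 bytes of payload, so 11.77 MB/S needs
on the order of 8000 frames/second from the client.  A rough check (the
1480 figure is an approximation, not an exact nfs/udp payload size):

    echo $((11770000 / 1480))    # -> ~7950 frames/second

The bge client above is mostly pushing 600-2000 packets/second into the
server, nowhere near that rate.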
>
> There is no sign of any packet loss or switch problems.  Forcing
> 1000baseTX full-duplex has no effect.  Forcing 100baseTX full-duplex
> makes the problem more obvious.  The mtu is 1500 throughout since
> only bge-5701 and sk support jumbo frames and I want to use udp for
> nfs.
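
For reference, this is the sort of forcing meant above; a sketch, assuming
the usual ifconfig media syntax and a hypothetical unit name of bge0:

    # force gigabit full-duplex
    ifconfig bge0 media 1000baseTX mediaopt full-duplex
    # force 100baseTX full-duplex (makes the problem more obvious)
    ifconfig bge0 media 100baseTX mediaopt full-duplex
    # back to autoselect
    ifconfig bge0 media autoselect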
>
> 5705/udp is better:
> %%%
>              input        (Total)           output
>     packets  errs      bytes    packets  errs      bytes colls
>        5209     0    6607758        846     0     151702     0
>        4763     0    6684546        773     0     153520     0
>        4758     0    6618498        769     0     151298     0
>        3582     0    7057568        576     0     162498     0
>        4935     0    5115068        800     0     116756     0
>        4924     0    6622026        798     0     152802     0
>        4095     0    6018462        657     0     137450     0
>        4647     0    5270442        751     0     120594     0
>        4673     0    5451948        758     0     123624     0
>        2340     0    6001986        372     0     138168     0
>        3750     0    6150610        604     0     140996     0
> %%%
>
> sk/udp works right:
> %%%
>              input        (Total)           output
>     packets  errs      bytes    packets  errs      bytes colls
>        8638     0   12384676       1440     0     293062     0
>        8636     0   12415646       1439     0     293708     0
>        8637     0   12415646       1441     0     293708     0
>        8637     0   12415646       1439     0     293708     0
>        8637     0   12417160       1440     0     293708     0
>        8636     0   12413162       1439     0     293506     0
>        8637     0   12414132       1439     0     293708     0
>        8636     0   12417160       1440     0     293708     0
>        8637     0   12415646       1439     0     293708     0
>        8636     0   12417160       1440     0     293708     0
>        8637     0   12414676       1439     0     293506     0
> %%%
>
> sk is under ~5.2 with latency/throughput/efficiency optimizations
> that don't have much effect here.
>
> Bruce
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

--
/"\  Best regards,                      | mlaier@freebsd.org
\ /  Max Laier                          | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | mlaier@EFnet
/ \  ASCII Ribbon Campaign              | Against HTML Mail and News
