Date:      Fri, 19 Nov 2004 15:10:17 +0100
From:      Emanuel Strobl <Emanuel.Strobl@gmx.net>
To:        freebsd-current@freebsd.org
Cc:        Robert Watson <rwatson@freebsd.org>
Subject:   Re: serious networking (em) performance (ggate and NFS) problem
Message-ID:  <200411191510.23551.Emanuel.Strobl@gmx.net>
In-Reply-To: <Pine.NEB.3.96L.1041119124719.92822D-100000@fledge.watson.org>
References:  <Pine.NEB.3.96L.1041119124719.92822D-100000@fledge.watson.org>

On Friday 19 November 2004 13:56, Robert Watson wrote:
> On Fri, 19 Nov 2004, Emanuel Strobl wrote:
> > On Thursday 18 November 2004 13:27, Robert Watson wrote:
> > > On Wed, 17 Nov 2004, Emanuel Strobl wrote:
> > > > I really love 5.3 in many ways but here're some unbelievable transfer
[...]
> Well, the claim that if_em doesn't benefit from polling is inaccurate in
> the general case, but quite accurate in the specific case.  In a box with
> multiple NIC's, using polling can make quite a big difference, not just by
> mitigating interrupt load, but also by helping to prioritize and manage
> the load, preventing live lock.  As I indicated in my earlier e-mail,

I understand, thanks for the explanation.
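
For the archives: if I read the handbook right, polling needs a custom kernel
plus a sysctl, roughly like this (my summary, so I haven't retested these
exact knobs on this box):

  # kernel configuration (rebuild and reboot required)
  options DEVICE_POLLING
  options HZ=1000          # polling works better with a faster clock

  # at runtime
  sysctl kern.polling.enable=1
  sysctl kern.polling.user_frac=50   # CPU share reserved for userland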

> It looks like the netperf TCP test is getting just under 27MB/s, or
> 214Mb/s.  That does seem on the low side for the PCI bus, but it's also

Not sure if I understand that sentence correctly: does it mean the "slow"
400MHz PII is causing this limit? (low side for the PCI bus?)

> instructive to look at the netperf UDP_STREAM results, which indicate that
> the box believes it is transmitting 417Mb/s but only 67Mb/s are being
> received or processed fast enough by netserver on the remote box.  This
> means you've achieved a send rate to the card of about 54Mb/s.  Note that
> you can actually do the math on cycles/packet or cycles/byte here -- with
> TCP_STREAM, it looks like some combination of recipient CPU and latency
> overhead is the limiting factor, with netserver running at 94% busy.

Hmm, I can't quite piece a picture together from this.
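
If I try the cycles-per-byte math Robert mentions, with my 400MHz PII 94% busy
on the receiving side, I get roughly (please correct my arithmetic):

  0.94 * 400e6  =  376e6 cycles/s spent by netserver
  27 MB/s       ~  27e6 bytes/s received
  376e6 / 27e6  ~  14 cycles/byte
  14 * 1500     ~  21000 cycles per full-size Ethernet frame

So the receiver burns on the order of 20k cycles per frame, which would point
at the CPU rather than the wire as the limit.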

>
> Could you try using geom gate to export a malloc-backed md device, and see
> what performance you see there?  This would eliminate the storage round

It's a pleasure:

test2:~#15: dd if=/dev/zero of=/mdgate/testfile bs=16k count=6000
6000+0 records in
6000+0 records out
98304000 bytes transferred in 5.944915 secs (16535812 bytes/sec)
test2:~#17: dd if=/mdgate/testfile of=/dev/null bs=16k
6000+0 records in
6000+0 records out
98304000 bytes transferred in 5.664384 secs (17354755 bytes/sec)
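
For completeness, the export was set up more or less like this (from memory,
so the exact flags may be slightly off; the addresses are placeholders for the
crossover link):

  # on the server
  mdconfig -a -t malloc -s 128m -u 0
  newfs /dev/md0
  echo "192.168.1.2 RW /dev/md0" > /tmp/gg.exports
  ggated /tmp/gg.exports

  # on the client
  ggatec create -u 0 192.168.1.1 /dev/md0
  mount /dev/ggate0 /mdgate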

This time there's no difference between disk and memory filesystem, but on
another machine with an ICH2 chipset and a 3ware controller (my current
production system, which I'm trying to replace with this project) there was a
big difference. Attached is the corresponding message.

Thanks,

-Harry

> trip and guarantee the source is in memory, eliminating some possible
> sources of synchronous operation (which would increase latency, reducing
> throughput).  Looking at CPU consumption here would also be helpful, as it
> would allow us to reason about where the CPU is going.
>
> > I was aware of that and because of lacking a GbE switch anyway I decided
> > to use a simple cable ;)
>
> Yes, this is my favorite configuration :-).
>
> > > (5) Next, I'd measure CPU consumption on the end box -- in particular,
> > > use top -S and systat -vmstat 1 to compare the idle condition of the
> > > system and the system under load.
> >
> > I additionally added these values to the netperf results.
>
> Thanks for your very complete and careful testing and reporting :-).
>
> Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
> robert@fledge.watson.org      Principal Research Scientist, McAfee Research

----------  Forwarded Message  ----------

From: Emanuel Strobl <Emanuel.Strobl@gmx.net>
To: freebsd-current@freebsd.org
Subject: Re: asymmetric NFS transfer rates
Date: Mon, 8 Nov 2004 04:29:11 +0100
Cc: Doug White <dwhite@gumbysoft.com>,
 Robert Watson <rwatson@freebsd.org>
References: <Pine.NEB.3.96L.1041102131322.21044C-100000@fledge.watson.org>
	<20041102105534.K63929@carver.gumbysoft.com>
In-Reply-To: <20041102105534.K63929@carver.gumbysoft.com>
Message-Id: <200411080429.12846.Emanuel.Strobl@gmx.net>

On Tuesday 2 November 2004 19:56, Doug White wrote:
> On Tue, 2 Nov 2004, Robert Watson wrote:
> > On Tue, 2 Nov 2004, Emanuel Strobl wrote:
> > > It's an IDE RAID controller (3ware 7506-4, a real one) and the file is
> > > indeed huge, but not abnormally so. I have a harddisk video recorder, so
> > > I have lots of 700MB files. Also, if I copy my photo collection from the
> > > server it takes 5 minutes, but copying _to_ the server takes almost
> > > 15 minutes, and the average file size is 5 MB. Fast Ethernet isn't
> > > really suitable for my needs, but at least the 10MB/s should be
> > > reached. I can't imagine I'll get better speeds when I upgrade to GbE
> > > (which the important boxes already are, just not the switch), because
> > > NFS in its current state isn't able to saturate a 100baseTX line, at
> > > least in one direction. That's the really astonishing thing for me. Why
> > > does reading saturate 100BaseTX but writing reaches only a third?
> >
> > Have you tried using tcpdump/ethereal to see if there's any significant
> > packet loss (for good reasons or not) going on?  Lots of RPC retransmits
> > would certainly explain the lower performance, and if that's not it, it
> > would be good to rule out.  The traces might also provide some insight
> > into the specific I/O operations, letting you see what block sizes are in
> > use, etc.  I've found that dumping to a file with tcpdump and reading
> > with ethereal is a really good way to get a picture of what's going on
> > with NFS: ethereal does a very nice job decoding the RPCs, as well as
> > figuring out what packets are related to each other, etc.
>
> It'd also be nice to know the mount options (nfs blocksizes in
> particular).

I haven't done intensive wire-dumps yet, but I figured out some oddities.
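
(When I get around to the dumps I'll capture along the lines Robert suggested,
something like the following, with interface and server address as
placeholders:

  tcpdump -i fxp0 -s 0 -w nfs-write.pcap host 192.168.1.1 and port 2049

and then read the file back with ethereal to look at the RPC decode and any
retransmits.)
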
My main problem seems to be the 3ware controller in combination with NFS. If I
create a malloc-backed md0 I can push more than 9MB/s to it with UDP and more
than 10MB/s with TCP (both without modifying r/w-size).
I can also copy a 100M file from twed0s1d to twed0s1e (so from and to the same
RAID5 array, which is the worst case) at 15MB/s, so the array can't be the
bottleneck.
Only when I push to the RAID5 array via NFS do I get a mere 4MB/s, no matter
whether I use UDP, TCP or nonstandard r/w-sizes.
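
(The md0 here was a plain malloc-backed mdconfig as above, exported with one
line in /etc/exports, and the slice-to-slice figure came from an ordinary dd;
schematically, with made-up paths:

  echo "/mnt/md -maproot=root client" >> /etc/exports
  killall -HUP mountd
  dd if=/mnt/d/video.mpg of=/mnt/e/video.mpg bs=64k

100M in roughly 6.7 seconds is the 15MB/s quoted above.)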

The next thing I found is that if I tune -w to anything higher than the
standard 8192, the average transfer rate of one big file degrades with UDP but
increases with TCP (as I would expect).
UDP transfer seems to hiccup with -w tuned: transfer rates peak at 8MB/s, but
the next second they sit at 0-2MB/s (watched with systat -vm 1), while with
TCP everything runs smoothly, regardless of the -w value.
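
For reference, the -w tuning happens on the client mount, along these lines
(server name and paths are made up):

  # UDP mount with a larger write size (the hiccup case)
  mount_nfs -w 32768 server:/export /mnt
  # TCP mount with the same write size (runs smoothly)
  mount_nfs -T -w 32768 server:/export /mnt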

Now back to my real problem: Can you imagine that NFS and twe are blocking
each other, or something like that? Why do I get such really bad transfer
rates when both parts are in use, but every single part on its own seems to
work fine?

Thanks for any help,

-Harry



