Date:      Fri, 19 Nov 2004 13:18:40 +0100
From:      Emanuel Strobl <Emanuel.Strobl@gmx.net>
To:        freebsd-current@freebsd.org
Cc:        Robert Watson <rwatson@freebsd.org>
Subject:   Re: serious networking (em) performance (ggate and NFS) problem
Message-ID:  <200411191318.46405.Emanuel.Strobl@gmx.net>
In-Reply-To: <Pine.NEB.3.96L.1041118121834.66045B-100000@fledge.watson.org>
References:  <Pine.NEB.3.96L.1041118121834.66045B-100000@fledge.watson.org>

On Thursday, 18 November 2004 13:27, Robert Watson wrote:
> On Wed, 17 Nov 2004, Emanuel Strobl wrote:
> > I really love 5.3 in many ways but here're some unbelievable transfer

First, thanks a lot to all of you for paying attention to my problem again.
I'll use this as a cumulative answer to your many postings, first
answering Robert's questions and, at the bottom, those of the others.

I changed cables and couldn't reproduce those bad results, so I changed the
cables back, but still cannot reproduce them; in particular the ggate write,
formerly at 2.6 MB/s, now performs at 15 MB/s. I haven't done any more
polling tests, just interrupt driven, since Matt explained that em doesn't
benefit from polling in any way.

The results don't indicate a serious problem now, but they are still about a
third of what I'd expect from my hardware. Do I really need gigahertz-class
CPUs to transfer 30 MB/s over GbE?

>
> I think the first thing you want to do is to try and determine whether the
> problem is a link layer problem, network layer problem, or application
> (file sharing) layer problem.  Here's where I'd start looking:
>
> (1) I'd first off check that there wasn't a serious interrupt problem on
>     the box, which is often triggered by ACPI problems.  Get the box to be
>     as idle as possible, and then use vmstat -i or systat -vmstat to see if
>     anything is spewing interrupts.

Everything is fine
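
For the record, this is roughly what I looked at, with the box as idle as
possible (both commands are standard on 5.3):

    # per-device interrupt totals and rates
    vmstat -i
    # the same data live, refreshed every second
    systat -vmstat 1

Nothing is spewing interrupts here.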

>
> (2) Confirm that your hardware is capable of the desired rates: typically
>     this involves looking at whether you have a decent card (most if_em
>     cards are decent), whether it's 32-bit or 64-bit PCI, and so on.  For
>     unidirectional send on 32-bit PCI, be aware that it is not possible to
>     achieve gigabit performance because the PCI bus isn't fast enough, for
>     example.

I'm aware that my 32-bit/33 MHz PCI bus is a "bottleneck", but I saw almost
80 MB/s running over the bus to my test stripe set (over the HPT372), so
I'm pretty sure the system is good for 40 MB/s over the GbE line, which
would be sufficient for me.
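
For what it's worth, the back-of-the-envelope numbers behind that estimate
(theoretical peaks, ignoring protocol and arbitration overhead):

    32 bit * 33 MHz = ~1056 Mbit/s = ~132 MB/s   shared PCI bus bandwidth
    1 Gbit/s                       = ~125 MB/s   GbE line rate, one direction

So full wire speed is clearly out of reach on this bus, but 30-40 MB/s should
leave plenty of headroom.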

>
> (3) Next, I'd use a tool like netperf (see ports collection) to establish
>     three characteristics: round trip latency from user space to user
>     space (UDP_RR), TCP throughput (TCP_STREAM), and large packet
>     throughput (UDP_STREAM).  With decent boxes on 5.3, you should have no
>     trouble at all maxing out a single gig-e with if_em, assuming all is
>     working well hardware wise and there's no software problem specific to
>     your configuration.

Please find the results at http://www.schmalzbauer.de/document.php?id=21
There is also a lot of additional information and more test results there.
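
For anyone who wants to repeat the measurements, the netperf runs were
essentially of this form (the address is just an example for the peer box,
which has to run the netserver daemon from the same port):

    # round trip latency, user space to user space
    netperf -H 192.168.0.2 -t UDP_RR
    # TCP bulk throughput
    netperf -H 192.168.0.2 -t TCP_STREAM
    # large packet (UDP) throughput
    netperf -H 192.168.0.2 -t UDP_STREAM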

>
> (4) Note that router latency (and even switch latency) can have a
>     substantial impact on gigabit performance, even with no packet loss,
>     in part due to stuff like ethernet flow control.  You may want to put
>     the two boxes back-to-back for testing purposes.
>

I was aware of that, and since I lack a GbE switch anyway, I decided to
use a simple cable ;)

> (5) Next, I'd measure CPU consumption on the end box -- in particular, use
>     top -S and systat -vmstat 1 to compare the idle condition of the
>     system and the system under load.
>

I have added these values to the netperf results as well.
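
Concretely, this is what ran on the receiving box during the transfers
(the exact numbers are on the page linked above):

    # include system/kernel threads so interrupt and driver time is visible
    top -S
    # CPU, interrupt and paging statistics, updated once a second
    systat -vmstat 1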

> If you determine there is a link layer or IP layer problem, we can start
> digging into things like the error statistics in the card, negotiation
> issues, etc.  If not, you want to move up the stack to try and
> characterize where it is you're hitting the performance issue.

On Thursday, 18 November 2004 17:53, M. Warner Losh wrote:
> In message: <Pine.NEB.3.96L.1041118121834.66045B-100000@fledge.watson.org>
>
>             Robert Watson <rwatson@freebsd.org> writes:
> : (1) I'd first off check that there wasn't a serious interrupt problem on
> :     the box, which is often triggered by ACPI problems.  Get the box to
> : be as idle as possible, and then use vmstat -i or systat -vmstat to see if
> : anything is spewing interrupts.
>
> Also, make sure that you aren't sharing interrupts between
> GIANT-LOCKED and non-giant-locked cards.  This might be exposing bugs
> in the network layer that debug.mpsafenet=0 might correct.  Just
> noticed that our setup here has that setup, so I'll be looking into
> that area of things.

As you can see at the link above, there are no shared IRQs.
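
Just for completeness, should shared IRQs ever turn up here: checking is a
matter of looking for two devices on the same line in vmstat -i, and as far
as I know debug.mpsafenet is a boot-time tunable on 5.3, so Warner's
suggestion would be set in /boot/loader.conf (please correct me if it can be
toggled at runtime):

    # /boot/loader.conf -- fall back to the Giant-protected network stack
    debug.mpsafenet="0"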

