Date:      Mon, 22 Nov 2004 14:07:45 -0800
From:      Sean McNeil <sean@mcneil.com>
To:        John-Mark Gurney <gurney_j@resnet.uoregon.edu>
Cc:        Robert Watson <rwatson@freebsd.org>
Subject:   Re: Re[4]: serious networking (em) performance (ggate and NFS) problem
Message-ID:  <1101161265.3317.9.camel@server.mcneil.com>
In-Reply-To: <20041122213108.GY57546@funkthat.com>
References:  <Pine.NEB.3.96L.1041122112718.19086S-100000@fledge.watson.org> <1101154446.79991.13.camel@server.mcneil.com> <20041122213108.GY57546@funkthat.com>

Hi John-Mark,

On Mon, 2004-11-22 at 13:31 -0800, John-Mark Gurney wrote:
> Sean McNeil wrote this message on Mon, Nov 22, 2004 at 12:14 -0800:
> > On Mon, 2004-11-22 at 11:34 +0000, Robert Watson wrote:
> > > On Sun, 21 Nov 2004, Sean McNeil wrote:
> > >
> > > > I have to disagree.  Packet loss is likely according to some of my
> > > > tests.  With the re driver, no change except moving a 100BT setup
> > > > with no packet loss to a gigE setup (both Linksys switches) will
> > > > cause serious packet loss at 20 Mbps data rates.  I have discovered
> > > > the only way to get good performance with no packet loss was to
> > > >
> > > > 1) remove interrupt moderation, and
> > > > 2) m_defrag() each mbuf chain that comes into the driver.
> > >
> > > Sounds like you're bumping into a queue limit that is made worse by
> > > interrupting less frequently, resulting in bursts of packets that are
> > > relatively large, rather than a trickle of packets at a higher rate.
> > > Perhaps a limit on the number of outstanding descriptors in the
> > > driver or hardware, and/or a limit in the netisr/ifqueue queue depth.
> > > You might try changing the default IFQ_MAXLEN from 50 to 128 to
> > > increase the size of the ifnet and netisr queues.  You could also try
> > > setting net.isr.enable=1 to enable direct dispatch, which in the
> > > in-bound direction would reduce the number of context switches and
> > > queueing.  It sounds like the device driver has a limit of 256
> > > receive and transmit descriptors, which one supposes is probably
> > > derived from the hardware limit, but I have no documentation on hand
> > > so can't confirm that.
> >
> > I've tried bumping IFQ_MAXLEN and it made no difference.  I could rerun
>
> And the default for if_re is RL_IFQ_MAXLEN, which is already 512...  As
> is mentioned below, the card can do 64 segments (which usually means 32
> packets, since each packet usually has its header and payload in
> separate segments)...

It sounds like you believe this is an if_re-only problem.  I had the
feeling that the if_em driver's performance problems were related in
some way.  I noticed that if_em does not do anything with m_defrag and
thought it might be more than coincidence.
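
For reference, the workaround on my machine looks roughly like the
sketch below.  This is not the actual if_re code: everything named
foo_* (and FOO_NSEGS_MAX) is invented for illustration, and only
m_defrag(9) itself is the real API; it copies a long mbuf chain into as
few mbufs and clusters as will hold the data.

#include <sys/param.h>
#include <sys/errno.h>
#include <sys/mbuf.h>

#define	FOO_NSEGS_MAX	32	/* made-up hardware segment limit */

struct foo_softc;		/* hypothetical driver softc */
static int foo_encap(struct foo_softc *, struct mbuf *);

/*
 * Collapse an overly long chain before handing it to the DMA map.
 */
static int
foo_defrag_and_encap(struct foo_softc *sc, struct mbuf **m_head)
{
	struct mbuf *m;
	int nsegs;

	/* Count the segments in the outgoing chain. */
	nsegs = 0;
	for (m = *m_head; m != NULL; m = m->m_next)
		nsegs++;

	if (nsegs > FOO_NSEGS_MAX) {
		/* Copy the chain into as few mbufs/clusters as fit. */
		m = m_defrag(*m_head, M_DONTWAIT);
		if (m == NULL)
			return (ENOBUFS);	/* original chain intact */
		*m_head = m;
	}
	return (foo_encap(sc, *m_head));	/* hypothetical DMA setup */
}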

> > this test to be 100% certain, I suppose.  It was done a while back.  I
> > haven't tried net.isr.enable=1, but the packet loss is in the transmit
> > direction.  The device driver has been modified to have 1024 transmit
> > and 1024 receive descriptors, as that is the hardware limit.  That
> > didn't matter either.  With 1024 descriptors I still lost packets
> > without the m_defrag.
>
> hmmm...  you know, I wonder if this is a problem with the if_re not
> pulling enough data from memory before starting the transmit...  Though
> we currently have it set to unlimited...  so that doesn't seem like it
> would be it...

Right.  Plus it now has 1024 descriptors on my machine and, like I said,
that made little difference.
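
When I do get around to trying direct dispatch, I'll probably flip it
from a small C program so I can snapshot the drop counter around a test
run.  A sketch, assuming the 5.x sysctl names; note that
net.inet.ip.intr_queue_drops only counts the inbound netisr path, so it
won't explain my transmit-side loss by itself:

#include <sys/types.h>
#include <sys/sysctl.h>
#include <stdio.h>

int
main(void)
{
	int one = 1, drops;
	size_t len = sizeof(drops);

	/* Turn on direct dispatch (needs root). */
	if (sysctlbyname("net.isr.enable", NULL, NULL, &one,
	    sizeof(one)) == -1)
		perror("net.isr.enable");

	/* Drops at the inbound netisr hand-off (ipintrq overflow). */
	if (sysctlbyname("net.inet.ip.intr_queue_drops", &drops, &len,
	    NULL, 0) == -1)
		perror("net.inet.ip.intr_queue_drops");
	else
		printf("ip intr queue drops: %d\n", drops);
	return (0);
}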

> > The most difficult thing for me to understand is: if this is some sort
> > of resource limitation, why does it work perfectly with a slower phy
> > layer and not with gigE?  The only thing I could think of was that the
> > old driver was doing m_defrag calls when it filled the transmit
> > descriptor queues up to a certain point.  Understanding the effects of
> > m_defrag would be helpful in figuring this out, I suppose.
>
> maybe the chip just can't keep the transmit fifo loaded at the higher
> speeds...  is it possible vls is doing a writev for a multisegmented
> UDP packet?  I'll have to look at this again...

I suppose.  As I understand it, though, it should be sending out
1316-byte data packets at a metered pace.  Also, wouldn't it behave the
same for 100BT vs. gigE?  Shouldn't I see packet loss with 100BT if this
is the case?
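
For what it's worth, my reading of "a writev for a multisegmented UDP
packet" is something like the sketch below.  The header and payload
sizes are made up, not lifted from vls, and whether the two iovecs
actually reach the driver as separate mbuf segments depends on how
sosend() buffers the uio:

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <netinet/in.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	char hdr[12];			/* RTP-sized header, made up */
	char payload[1316];		/* the data packet size I see */
	struct iovec iov[2];
	struct sockaddr_in dst;
	int s;

	memset(hdr, 0, sizeof(hdr));
	memset(payload, 0, sizeof(payload));
	memset(&dst, 0, sizeof(dst));
	dst.sin_len = sizeof(dst);
	dst.sin_family = AF_INET;
	dst.sin_port = htons(1234);
	dst.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

	s = socket(AF_INET, SOCK_DGRAM, 0);
	if (s == -1 || connect(s, (struct sockaddr *)&dst,
	    sizeof(dst)) == -1)
		return (1);

	iov[0].iov_base = hdr;
	iov[0].iov_len = sizeof(hdr);
	iov[1].iov_base = payload;
	iov[1].iov_len = sizeof(payload);

	/* Both iovecs go out as a single 1328-byte UDP datagram. */
	if (writev(s, iov, 2) == -1)
		return (1);
	close(s);
	return (0);
}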

> > > It would be interesting on the send and receive sides to inspect the
> > > counters for drops at various points in the network stack; i.e., are
> > > we dropping packets at the ifq handoff because we're overfilling the
> > > descriptors in the driver, are packets dropped on the inbound path
> > > going into the netisr due to over-filling before the netisr is
> > > scheduled, etc.  And it's probably interesting to look at stats on
> > > filling the socket buffers for the same reason: if bursts of packets
> > > come up the stack, the socket buffers could well be over-filled
> > > before the user thread can run.
> >
> > Yes, this would be very interesting and should point out the problem.
> > I would do such a thing if I had enough knowledge of the network
> > pathways.  Alas, I am very green in this area.  The receive side has
> > no issues, though, so I would focus on transmit counters (with
> > assistance).
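
As a starting point, here is how I'd read the per-interface counters
from userland: a sketch using getifaddrs(), where the AF_LINK entry for
each interface carries a struct if_data.  As far as I know, the
if_snd.ifq_drops counter that would actually show transmit-queue drops
is not exported through if_data on 5.x (netstat -id digs it out of
kernel memory), so this only gets at the error and inbound-drop
counters:

#include <sys/types.h>
#include <sys/socket.h>
#include <net/if.h>
#include <ifaddrs.h>
#include <stdio.h>

int
main(void)
{
	struct ifaddrs *ifap, *ifa;
	struct if_data *ifd;

	if (getifaddrs(&ifap) == -1) {
		perror("getifaddrs");
		return (1);
	}
	for (ifa = ifap; ifa != NULL; ifa = ifa->ifa_next) {
		/* Only the AF_LINK entry for an interface has if_data. */
		if (ifa->ifa_addr == NULL ||
		    ifa->ifa_addr->sa_family != AF_LINK ||
		    ifa->ifa_data == NULL)
			continue;
		ifd = ifa->ifa_data;
		printf("%-8s opackets %lu oerrors %lu iqdrops %lu\n",
		    ifa->ifa_name, ifd->ifi_opackets, ifd->ifi_oerrors,
		    ifd->ifi_iqdrops);
	}
	freeifaddrs(ifap);
	return (0);
}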
