Date:      Mon, 21 Jan 2019 04:03:57 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Martin Birgmeier <d8zNeCFG@aon.at>
Cc:        Eugene Grosbein <eugen@grosbein.net>, net@freebsd.org
Subject:   Re: [Bug 235031] [em] em0: poor NFS performance, strange behavior
Message-ID:  <20190121014017.W945@besplex.bde.org>
In-Reply-To: <50a63079-4c2d-fc5c-47c5-1070b8fcd20c@aon.at>
References:  <bug-235031-7501@https.bugs.freebsd.org/bugzilla/> <bug-235031-7501-goXNmp3zVl@https.bugs.freebsd.org/bugzilla/> <20190119204156.D929@besplex.bde.org> <3e407ee7-54e3-a6ac-5535-d11aceca9558@grosbein.net> <20190120061258.X3312@besplex.bde.org> <16ce1832-13da-d7bb-cce2-6682e058b5a6@aon.at> <20190120145627.X1077@besplex.bde.org> <fd67eca6-7c1d-687d-91ae-e09138732ed1@aon.at> <20190120231915.M2326@besplex.bde.org> <50a63079-4c2d-fc5c-47c5-1070b8fcd20c@aon.at>

On Sun, 20 Jan 2019, Martin Birgmeier wrote:

> The machine A with the em0 issue is running at 1 Gbps and acts as NFS
> server. The NFS client B has a 100 Mbps interface. B gets a throughput
> of only 1 Mbyte/s when talking to A but the full 10 Mbyte/s when talking
> to another third machine C. In addition, while B is talking to A, if at
> the same time A runs an iperf to C, the situation for B improves (up to
> 5..7 Mbyte/s).
>
> All machines are connected by a DGS-1210-24 1 Gbps switch.

I see.  I get worse misbehaviour (nfs write speed 24 KB/s for 512-blocks
instead of 1 MB/s) after changing the media of the bge NIC on my server
to 1000base full-duplex (where the switch is a cheap TP-Link 1 Gbps).
ping remains fast.  Concurrent ping doesn't improve nfs.  For the em0
NIC on my client, even the null change from autoselect to 1000baseT
full-duplex often corrupts the NIC state so that even ping doesn't
work.  I got tired of that and fixed the missing stopping:

XX Index: iflib.c
XX ===================================================================
XX --- iflib.c	(revision 332488)
XX +++ iflib.c	(working copy)
XX @@ -2232,7 +2234,7 @@
XX 
XX  	CTX_LOCK(ctx);
XX  	if ((err = IFDI_MEDIA_CHANGE(ctx)) == 0)
XX -		iflib_init_locked(ctx);
XX +		iflib_if_init_locked(ctx);
XX  	CTX_UNLOCK(ctx);
XX  	return (err);
XX  }
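
A toy model of why the missing stop matters, assuming that
iflib_if_init_locked() is essentially a stop followed by
iflib_init_locked().  All toy_* names and the struct are made up for
illustration and are not iflib APIs; this is only a sketch of the
state problem, not the driver code:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a driver context, not the real if_ctx_t. */
struct toy_ctx {
	bool running;	/* hardware has been started */
	int  pending;	/* descriptors still owned by the hardware */
	bool corrupt;	/* set when init runs on top of live state */
};

/* Models quiescing the hardware and reclaiming its descriptors. */
static void
toy_stop(struct toy_ctx *ctx)
{
	ctx->running = false;
	ctx->pending = 0;
}

/* Models (re)programming the hardware; only safe when stopped. */
static void
toy_init_locked(struct toy_ctx *ctx)
{
	if (ctx->running || ctx->pending != 0)
		ctx->corrupt = true;	/* rings rewritten under the NIC */
	ctx->running = true;
	ctx->pending = 4;		/* arbitrary in-flight work */
}

/* Assumed shape of the wrapper used by the fix: stop, then init. */
static void
toy_if_init_locked(struct toy_ctx *ctx)
{
	toy_stop(ctx);
	assert(!ctx->running && ctx->pending == 0);
	toy_init_locked(ctx);
}

/* Media change with and without the stop, mirroring the patch. */
static void
toy_media_change(struct toy_ctx *ctx, bool stop_first)
{
	if (stop_first)
		toy_if_init_locked(ctx);	/* patched behaviour */
	else
		toy_init_locked(ctx);		/* unpatched behaviour */
}
```

Calling toy_media_change() with stop_first false on a context that is
already running sets corrupt, while stop_first true leaves it clean.
That corresponds to the unpatched and patched paths respectively.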

The fix works perfectly.  Now it is safe to change the media on the client.  The
null change from autoselect to 1000baseT full-duplex on the client now
doesn't corrupt the state or change the nfs or ping speeds.  Changing
the media to 100baseTX full-duplex on the client gives much the same
misbehaviour as changing the media on the server similarly (not quite
so bad).  But changing the media to 100baseTX full-duplex on both gives
much worse behaviour.  Sometimes it causes the frame error reported
by my previous patch.  Clearly there is a protocol mismatch.

This problem occurs often.  I don't know how it can occur when there
is a switch.  The switch should translate to 1000 Mbps for the em0 side.

I don't really understand this, but have a lot of code in mii/e1000phy.c
related to it, and once tested this with all combinations of speeds and
duplexes.  e1000phy.c has nothing to do with Intel e1000, but is for an
old Marvell phy.  I have one on an sk NIC, and it stopped working at
1 Gbps on cold days.  The simplest fix was to set the speed manually,
but this gave problems like the above, and gives an unnecessarily low
speed on warm days.  At least my version of e1000phy.c or sk has some
link flags which give more control over this.

Half-duplex on both sides works!  The old version of bge on the server
doesn't support mediaopt half-duplex, but seems to default to that and
ifconfig prints nothing for the duplex.  -current em0 supports it.
Working means that the nfs write speed is about 9 MB/s.  Half-duplex
is of course slightly slower than full-duplex.  Similarly for 10baseT/UTP.

I found my old tables of working combinations of duplexes and autoselects
for bge <-> switch <-> sk and bge <-> sk.  The switch affects the working
combinations.  The tables are cryptic, but seem to be as follows:

switch case:
 	bge	sk	success
 	---	--	-------
 	A	A	n/a (handling of the sk bug gives a fuzzy auto speed)
 	1	A	n/a
 	1F	A	n/a
 	1	1	OK (as above)
 	1F	1	fail
 	1	1F	OK! (1F -> 1)
 	1F	1F	fail! (as above)
 	A	1	OK (A -> 1)
 	A	1F	OK (A -> 1F)

direct case:
 	bge	sk	success
 	---	--	-------
 	A	A	n/a (handling of the sk bug gives a fuzzy auto speed)
 	1	A	OK (A -> 1)
 	1F	A	partial success (giving half-duplex!?)
 	1	1	OK (as above)
 	1F	1	fail
 	1	1F	fail (as expected, but different from switch case!)
 	1F	1F	fail! (as above)
 	A	1	OK (A -> 1)
 	A	1F	OK (A -> 1F)

Here 1 means a speed of 1000 Mbps or possibly 100 Mbps, A means autoselect,
F means full duplex, and the absence of F means half-duplex or nothing.

A for both should work and is normally used, and the only really weird case
is 1F for both not working.
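
The two tables can be put in a small C array to pick out the rows where
the switch changes the result.  The enum, struct and function names are
made up for illustration; the outcomes are copied from the tables above,
and "n/a" is treated as just another outcome when comparing:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum outcome { NA, OK, FAIL, PARTIAL };

struct combo {
	const char *bge;		/* media setting on the bge side */
	const char *sk;			/* media setting on the sk side */
	enum outcome via_switch;	/* bge <-> switch <-> sk */
	enum outcome direct;		/* bge <-> sk */
};

static const struct combo combos[] = {
	{ "A",  "A",  NA,   NA      },
	{ "1",  "A",  NA,   OK      },
	{ "1F", "A",  NA,   PARTIAL },
	{ "1",  "1",  OK,   OK      },
	{ "1F", "1",  FAIL, FAIL    },
	{ "1",  "1F", OK,   FAIL    },
	{ "1F", "1F", FAIL, FAIL    },
	{ "A",  "1",  OK,   OK      },
	{ "A",  "1F", OK,   OK      },
};

#define NCOMBOS (sizeof(combos) / sizeof(combos[0]))

/* Return the table row for a given pair of settings, or NULL. */
static const struct combo *
lookup(const char *bge, const char *sk)
{
	for (size_t i = 0; i < NCOMBOS; i++)
		if (strcmp(combos[i].bge, bge) == 0 &&
		    strcmp(combos[i].sk, sk) == 0)
			return (&combos[i]);
	return (NULL);
}

/* Collect the rows whose outcome differs between the two cases. */
static size_t
differing(const struct combo **out, size_t max)
{
	size_t n = 0;

	for (size_t i = 0; i < NCOMBOS; i++) {
		if (combos[i].via_switch != combos[i].direct) {
			assert(n < max);
			out[n++] = &combos[i];
		}
	}
	return (n);
}
```

With this encoding, differing() returns three rows: bge 1 with sk A,
bge 1F with sk A, and bge 1 with sk 1F.  The last is the one marked as
different from the switch case in the direct table.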

> ...
> I have also discovered that there is net/intel-em-kmod. What is the
> relationship between the driver in the base sources and this one? How
> advisable is it to use the driver from ports?

I don't know about that.  I guess Intel still does some development,
especially for newer chipsets.

Bruce


