Date: Sat, 31 Aug 2019 21:44:38 +1000 (EST)
From: Bruce Evans
To: Martin Birgmeier
cc: net@freebsd.org
Subject: Re: [Bug 235031] [em] em0: poor NFS performance, strange behavior
Message-ID: <20190831212849.X1183@besplex.bde.org>
List-Id: Networking and TCP/IP with FreeBSD

On Thu, 15 Aug 2019 a bug that doesn't want replies@freebsd.org wrote:

> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=235031
>
> --- Comment #36 from Martin Birgmeier ---
> I just notice that the console and syslog have about 20 messages of
>
> em: frame error: ignored
> em: frame error: ignored
> em: frame error: ignored
> em: frame error: ignored
> em: frame error: ignored
>
> Uptime is 2 1/2 hours.

You seem to be using my old patch which is not in -current:

Index: em_txrx.c
XX ===================================================================
XX --- em_txrx.c	(revision 348771)
XX +++ em_txrx.c	(working copy)
XX @@ -629,9 +629,20 @@
XX  
XX  		/* Make sure bad packets are discarded */
XX  		if (errors & E1000_RXD_ERR_FRAME_ERR_MASK) {
XX +#if 0
XX  			adapter->dropped_pkts++;
XX -			/* XXX fixup if common */
XX  			return (EBADMSG);
XX +#else
XX +			/*
XX +			 * XXX the above error handling is worse than none.
XX +			 * First it it drops 'i' packets before the current
XX +			 * one and doesn't count them.  Then it returns an
XX +			 * error.  iflib can't really handle this error.
XX +			 * It just resets, and this usually drops many more
XX +			 * packets (without counting them) and much time.
XX +			 */
XX +			printf("lem: frame error: ignored\n");
XX +#endif
XX  		}
XX  
XX  		ri->iri_frags[i].irf_flid = 0;
XX @@ -692,8 +703,12 @@
XX  
XX  		/* Make sure bad packets are discarded */
XX  		if (staterr & E1000_RXDEXT_ERR_FRAME_ERR_MASK) {
XX +#if 0
XX  			adapter->dropped_pkts++;
XX  			return EBADMSG;
XX +#else
XX +			printf("em: frame error: ignored\n");
XX +#endif
XX  		}
XX  
XX  		ri->iri_frags[i].irf_flid = 0;

Without this patch, no message is printed and the device takes a long
time to recover (when I wrote the patch, recovery was from something
like a watchdog timeout after many seconds).  With the patch, the
recovery is good enough for nfs over udp to not lose any data or time
out, but I don't trust this so I print the message.

Pre-iflib versions of [l]em handled this correctly by dropping a single
packet, which was easy to do.  Unpatched iflib makes a mess by returning
with subsequent packets unprocessed.  It apparently just stops receiving
until kicked by a watchdog.

I don't know what causes this error.  Maybe just a bad cable or switch.
I don't see it for I218V with the same cable and switch.

Bruce
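
The difference between "dropping a single packet" and "returning with
subsequent packets unprocessed" can be shown with a toy model.  The
following is only a sketch and is not driver code: struct toy_rxd,
struct toy_stats, toy_rx_abort() and toy_rx_drop_one() are hypothetical
names invented for illustration, and the model ignores fragments, iflib
and the watchdog entirely.  It only counts what happens to a batch of
receive descriptors under the two strategies.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical toy descriptor: records only whether the frame was bad. */
struct toy_rxd {
	int frame_error;	/* nonzero if a frame error was flagged */
};

/* Hypothetical counters for one pass over a batch of descriptors. */
struct toy_stats {
	size_t delivered;	/* packets passed up the stack */
	size_t dropped;		/* packets dropped and counted */
	size_t unprocessed;	/* descriptors left behind by an early return */
};

/*
 * Strategy A (assumed analogue of the "#if 0" branch): count the bad
 * packet and return EBADMSG immediately.  Everything after the bad
 * descriptor in this batch is left unprocessed.
 */
static int
toy_rx_abort(const struct toy_rxd *ring, size_t n, struct toy_stats *st)
{
	size_t i;

	st->delivered = st->dropped = st->unprocessed = 0;
	for (i = 0; i < n; i++) {
		if (ring[i].frame_error) {
			st->dropped++;
			st->unprocessed = n - i - 1;
			return (EBADMSG);
		}
		st->delivered++;
	}
	return (0);
}

/*
 * Strategy B (assumed analogue of the pre-iflib behaviour): drop only
 * the bad packet, count it, and keep going with the rest of the batch.
 */
static int
toy_rx_drop_one(const struct toy_rxd *ring, size_t n, struct toy_stats *st)
{
	size_t i;

	st->delivered = st->dropped = st->unprocessed = 0;
	for (i = 0; i < n; i++) {
		if (ring[i].frame_error) {
			st->dropped++;
			continue;
		}
		st->delivered++;
	}
	return (0);
}
```

For a batch of five descriptors with a frame error on the third one,
toy_rx_abort() delivers two packets and leaves two unprocessed, while
toy_rx_drop_one() delivers four and leaves none.  Note that the patch
above does neither: it ignores the error and lets the frame through,
printing a message instead.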