From owner-freebsd-current@freebsd.org Sat Nov 19 18:48:35 2016
Return-Path:
Delivered-To: freebsd-current@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 57CA7C4B697
 for ; Sat, 19 Nov 2016 18:48:35 +0000 (UTC)
 (envelope-from ohartman@zedat.fu-berlin.de)
Received: from outpost1.zedat.fu-berlin.de (outpost1.zedat.fu-berlin.de [130.133.4.66])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 0FCE31827;
 Sat, 19 Nov 2016 18:48:34 +0000 (UTC)
 (envelope-from ohartman@zedat.fu-berlin.de)
Received: from inpost2.zedat.fu-berlin.de ([130.133.4.69])
 by outpost.zedat.fu-berlin.de (Exim 4.85) with esmtps
 (TLSv1.2:DHE-RSA-AES256-GCM-SHA384:256) (envelope-from )
 id <1c8Acv-0022tD-2m>; Sat, 19 Nov 2016 19:44:37 +0100
Received: from x55b385d5.dyn.telefonica.de ([85.179.133.213] helo=thor.walstatt.dynvpn.de)
 by inpost2.zedat.fu-berlin.de (Exim 4.85) with esmtpsa
 (TLSv1.2:AES256-GCM-SHA384:256) (envelope-from )
 id <1c8Acu-003fqO-Nx>; Sat, 19 Nov 2016 19:44:36 +0100
Date: Sat, 19 Nov 2016 19:44:35 +0100
From: "O. Hartmann"
To: YongHyeon PYUN
Cc: FreeBSD CURRENT , gnn@FreeBSD.org, "Andrey V.
 Elsukov"
Subject: Re: CURRENT: re(4) crashing system
Message-ID: <20161119194424.6335338a@thor.walstatt.dynvpn.de>
In-Reply-To: <20161107021623.GA1557@michelle.fasterthan.co.kr>
References: <20161023132538.6bf55fb2@hermann>
 <20161024051359.GA1185@michelle.fasterthan.co.kr>
 <20161024140337.47af924e@freyja.zeit4.iv.bundesimmobilien.de>
 <20161025020538.GA1238@michelle.fasterthan.co.kr>
 <20161025070338.76ad6711@hermann>
 <20161027010004.GA1215@michelle.fasterthan.co.kr>
 <20161028212113.5c4a2ca2@hermann>
 <20161031021222.GA1252@michelle.fasterthan.co.kr>
 <20161106132036.06add6ca@hermann>
 <20161107021623.GA1557@michelle.fasterthan.co.kr>
Reply-To: ohartmann@walstatt.org
Organization: FU Berlin
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-Originating-IP: 85.179.133.213
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 19 Nov 2016 18:48:35 -0000

On Mon, 7 Nov 2016 11:16:23 +0900
YongHyeon PYUN wrote:

> On Sun, Nov 06, 2016 at 01:20:36PM +0100, Hartmann, O. wrote:
> > On Mon, 31 Oct 2016 11:12:22 +0900
> > YongHyeon PYUN wrote:
> >
> > > On Fri, Oct 28, 2016 at 09:21:13PM +0200, Hartmann, O. wrote:
> > > > On Thu, 27 Oct 2016 10:00:04 +0900
> > > > YongHyeon PYUN wrote:
> > > >
> > > > > On Tue, Oct 25, 2016 at 07:03:38AM +0200, Hartmann, O. wrote:
> > > > > > On Tue, 25 Oct 2016 11:05:38 +0900
> > > > > > YongHyeon PYUN wrote:
> > > > > >
> > > > >
> > > > > [...]
> > > > >
> > > > > > > I'm not sure but it's likely the issue is related with
> > > > > > > EEE/Green Ethernet handling. EEE is negotiated feature with
> > > > > > > link partner. If you directly connect your laptop to non-EEE
> > > > > > > capable link partner like other re(4) box without switches
> > > > > > > you may be able to tell whether the issue is EEE/Green
> > > > > > > Ethernet related one or not.
> > > > > >
> > > > > > Me either since when I discovered a problem the first time with
> > > > > > CURRENT, that was the Friday before last week's Friday, there
> > > > > > was a unlucky coicidence: I got the new switch, FreeBSD
> > > > > > introduced a serious bug and I changed the NICs.
> > > > > >
> > > > > > The laptop, the last in the row of re(4) equipted systems on
> > > > > > which I use the Realtek NIC, does well now with Green IT
> > > > > > technology, but crashes on plugging/unplugging - not on each
> > > > > > event, but at least in one of ten.
> > > > >
> > > > > Hmm, it seems you know how to trigger the issue. When you unplug
> > > > > UTP cable was there active network traffic on re(4) device?
> > > > > It would be helpful to know which event triggers the crash(e.g.
> > > > > unplugging or plugging). And would you show me backtrace of
> > > > > panic?
> > > > > > I guess the Green IT issue is more a unlucky guess of mine and
> > > > > > went hand in hand with the problem I face with CURRENT right
> > > > > > now on some older, Non UEFI machines.
> > > > > >
> > > > >
> > > > > Ok.
> > > > >
> > > > > [...]
> > > > > >
> > > > > > As requested the informations about re0 and rgephy0 on the
> > > > > > laptop (Lenovo E540)
> > > > > >
> > > > > > [...]
> > > > > >
> > > > > > rgephy0: PHY 1 on miibus0
> > > > > > rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow,
> > > > > > 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT-FDX,
> > > > > > 1000baseT-FDX-master, 1000baseT-FDX-flow,
> > > > > > 1000baseT-FDX-flow-master, auto, auto-flow
> > > > > >
> > > > > > re0:
> > > > > > port 0x3000-0x30ff mem
> > > > > > 0xf0d04000-0xf0d04fff,0xf0d00000-0xf0d03fff at device 0.0 on
> > > > > > pci2 re0: Using 1 MSI-X message re0: ASPM disabled re0: Chip
> > > > > > rev. 0x50800000 re0: MAC rev. 0x00100000
> > > > >
> > > > > This looks like 8168GU controller.
> > > > >
> > > > > [...]
> > > > >
> > > > > > I use options netmap in kernel config, but the problem is also
> > > > > > present without this option - just for the record.
> > > > > >
> > > > >
> > > > > Yup, netmap(4) has nothing to do with the crash.
> > > > >
> > > > > Thanks.
> > > >
> > > > Attached, you'll find the backtrace of the crash. This time it was
> > > > really easy - just one pull of the LAN cabling - and we are
> > > > happy :-/
> > > >
> > > > Please let me know if you need something else. I will return to
> > > > normal operations (disabling debugging) due to CURRENT is very
> > > > unstable at the moment on other hosts beyond r307157.
> > > >
> > >
> > > It seems the attachment was stripped.
> >
> > This time I hope I got it right!
> >
> > Attached you'll find the latest CURRENT's backtrace on the provoked
> > crash (plug and unplug).
> >
> > I also saved the kernel and coredump, so if you need me to do further
> > investigations,please let me know.
> >
>
> Thanks a lot for the backtrace. This backtrace is not the one I
> expected and I guess the issue is related with cached route removal
> on interface down. Quick looking over the code didn't reveal the
> cause of crash(I'm not familiar with that part code). Probably
> gnn@ may have better idea what's going on here(CCed).
>
> Thanks.

In another thread I complained about persistent crashes on several
"older" Intel architectures (Ivy Bridge and earlier). It turned out that
option FLOWTABLE in the kernel, which has been part of my custom kernels
for a long time, is the culprit on those systems. Commenting out that
option solved the problem!

Interestingly, after also commenting out this option in the kernel
config of the laptop discussed in this thread, I have not been able, as
of this writing, to reproduce the crashes. So it may be that the same
FLOWTABLE issue was being triggered by plugging and/or unplugging the
LAN cable. Usually I was able to trigger the coredump after two or three
rounds; this time I tried more than ten times with no effect.

On the other hand, the NIC of the laptop does not negotiate 1 GBit/s
with my switch; it stays at 100 MBit/s. The switch is a Netgear
GS110TP V2.

Regards,
oh
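
For reference, a rough sketch of what the kernel config change described
above might look like. The file name MYKERNEL, the ident and the amd64
path are placeholders assumed for illustration, not the actual config
used on these machines:

    # /usr/src/sys/amd64/conf/MYKERNEL  (hypothetical custom config)
    include         GENERIC
    ident           MYKERNEL

    # Disabled while investigating the crashes discussed in this thread:
    #options        FLOWTABLE

The kernel has to be rebuilt and installed, and the machine rebooted,
before the change takes effect.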