From owner-freebsd-arm@freebsd.org  Wed Jun 14 16:17:16 2017
Subject: Re: Continuing problem with RPI USB-based network adapters
To: Karl Denninger <karl@denninger.net>,
 "freebsd-arm@freebsd.org" <freebsd-arm@freebsd.org>,
 Benno Rice <benno@FreeBSD.org>, Oleksandr Tymoshenko <gonzo@freebsd.org>
References: <ddacbd01-a283-797f-a012-b765f10f1b3d@denninger.net>
From: Hans Petter Selasky <hps@selasky.org>
Message-ID: <0fab102c-6152-967e-5b70-99cbd897e757@selasky.org>
Date: Wed, 14 Jun 2017 18:15:10 +0200
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:52.0) Gecko/20100101
 Thunderbird/52.1.1
MIME-Version: 1.0
In-Reply-To: <ddacbd01-a283-797f-a012-b765f10f1b3d@denninger.net>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit

On 06/13/17 16:45, Karl Denninger wrote:
> A good long while back I tried to run down an apparent problem with
> ue-based network drivers that seemed to be linked to having more than
> one interface instance attached to a physical interface -- such as using
> "ue0" for the "base" link and "ue0.2" for VLAN 2 on the same physical wire.
> 
> The symptoms would be that the interface would "flap" every 10 or 20
> minutes; it would go down and up without apparent cause.
> 
> I can now quite-reliably report that it's not linked to VLANs; it also
> appears to show up *ANY* time there are multiple instances of an
> Ethernet interface up on the "ue" driver, irrespective of whether it's
> multiple instances on one interface (e.g. the VLAN example) OR multiple
> instances on multiple physical interfaces (e.g. ue0, ue1 on a plugged-in
> USB ethernet instance, etc.)
> 
> I have _*41 days*_ of uptime at present on a single-instance device with
> ZERO flaps.  But on a device with three instances, one with two physical
> interfaces in which one has a VLAN and base, the other just a base
> interface, it happens every few minutes.  If I "down" the VLAN interface
> /it still happens./
> 
> Jun 13 09:04:53 IPGw kernel: ue0.3: link state changed to UP
> Jun 13 09:25:46 IPGw kernel: ue0: link state changed to DOWN
> Jun 13 09:25:46 IPGw kernel: ue0.3: link state changed to DOWN
> Jun 13 09:25:46 IPGw kernel: ue0: link state changed to UP
> Jun 13 09:25:46 IPGw kernel: ue0.3: link state changed to UP
> Jun 13 09:37:50 IPGw kernel: ue0: link state changed to DOWN
> Jun 13 09:37:50 IPGw kernel: ue0.3: link state changed to DOWN
> Jun 13 09:37:50 IPGw kernel: ue0: link state changed to UP
> Jun 13 09:37:50 IPGw kernel: ue0.3: link state changed to UP
> 
> If there are logging entries that I can enable to try to find the cause
> of this it would be great -- this particular device is a RPI3 running
> -HEAD, but the issue traces back to at least 11.0 on the RPI2, where I
> saw it repeatedly close to a year ago, and there has apparently been no
> resolution.
> 
> This looks PR-worthy but without some sort of trace on the REASON for
> the flap it's not so useful, thus the question as to whether I can dig
> up a logging option that will inform as to *why* the interface was
> marked down.  It is NOT the switch port that the unit is plugged into OR
> the physical RPI3 hardware; I have swapped both the RPI3 board and
> switch port but the problem still exists and the other unit with one
> interface in service and NO flaps over 41 days of uptime is plugged into
> the same physical network switch.

Hi Karl,

The link state changed events come from the PHY on the devices you have 
shown. This is handled by the so-called MII bus code in FreeBSD.

sys/dev/mii/smscphy.c

> static void
> smscphy_status(struct mii_softc *sc)
> {
>         struct mii_data *mii;
>         uint32_t bmcr, bmsr, status;
> 
>         mii = sc->mii_pdata;
>         mii->mii_media_status = IFM_AVALID;
>         mii->mii_media_active = IFM_ETHER;
> 
>         bmsr = PHY_READ(sc, MII_BMSR) | PHY_READ(sc, MII_BMSR);
>         if ((bmsr & BMSR_LINK) != 0)
>                 mii->mii_media_status |= IFM_ACTIVE;


I see the code is reading the MII_BMSR status register multiple times. 
Maybe this is to cover up some kind of bug.

As a first step, I suggest changing the two places where MII_BMSR is read
so that each reads the register four times instead of two. Does that make
any difference?

I would also suggest adding some prints to inspect the individual value
returned by each MII_BMSR read, to see what is going on.
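
A rough userspace sketch of what such instrumentation could look like is
below. It is only an illustration, not a patch against smscphy.c:
fake_phy_read(), fake_bmsr[] and read_bmsr_logged() are made-up names, and
the register values are invented sample data standing in for what
PHY_READ(sc, MII_BMSR) might return. Only MII_BMSR and BMSR_LINK carry the
same values as in sys/dev/mii/mii.h.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define MII_BMSR   0x01    /* basic mode status register */
#define BMSR_LINK  0x0004  /* link status bit */

/*
 * Hypothetical stand-in for PHY_READ(sc, MII_BMSR). The values are
 * invented: the first read reports link down, later reads report link up.
 */
static const uint32_t fake_bmsr[] = { 0x7809, 0x780d, 0x780d, 0x780d };
static unsigned fake_idx;

static uint32_t
fake_phy_read(int reg)
{
	(void)reg;	/* only one register is modelled here */
	return (fake_bmsr[fake_idx++ % 4]);
}

/*
 * Read BMSR nreads times, print every individual value, and return the
 * OR of all reads (the same way the two reads are combined in
 * smscphy_status()).
 */
static uint32_t
read_bmsr_logged(unsigned nreads)
{
	uint32_t bmsr = 0, val;
	unsigned i;

	assert(nreads > 0);
	for (i = 0; i < nreads; i++) {
		val = fake_phy_read(MII_BMSR);
		printf("BMSR read %u: 0x%04x link=%d\n", i, (unsigned)val,
		    (val & BMSR_LINK) != 0);
		bmsr |= val;
	}
	return (bmsr);
}
```

Calling read_bmsr_logged(4) prints one line per read, so a read that
disagrees with the others becomes visible. In the driver itself the same
loop would call PHY_READ(sc, MII_BMSR) instead of the fake function, and
the kernel printf() would put the values into the system log next to the
link state messages.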

CC'ed Benno and Oleksandr in case they have any suggestions.

--HPS