Date:      Wed, 12 Oct 2011 16:17:09 -0400
From:      Karim <fodillemlinkarim@gmail.com>
To:        pyunyh@gmail.com
Cc:        freebsd-net@freebsd.org
Subject:   Re: if_msk.c link negotiation / packet drops
Message-ID:  <4E95F5C5.5050609@gmail.com>
In-Reply-To: <20111012192730.GB9138@michelle.cdnetworks.com>
References:  <4E94637A.5090607@gmail.com> <20111011171029.GA5661@michelle.cdnetworks.com> <CAN6yY1tWQZwdqgYdN3uBBdXiGQ2OFDMYbSjhEUeTimHjBnR9iA@mail.gmail.com> <4E959F06.6040906@gmail.com> <20111012170347.GA9138@michelle.cdnetworks.com> <4E95DDEB.1090500@gmail.com> <20111012192730.GB9138@michelle.cdnetworks.com>

On 11-10-12 03:27 PM, YongHyeon PYUN wrote:
> On Wed, Oct 12, 2011 at 02:35:23PM -0400, Karim wrote:
>> Hi,
>> On 11-10-12 01:03 PM, YongHyeon PYUN wrote:
>>> On Wed, Oct 12, 2011 at 10:07:02AM -0400, Karim wrote:
> [...]
>
>>> Hmm, that indicates driver lost established link. msk(4) will
>>> detect this condition and stop RX/TX MACs until it knows PHY
>>> re-established a link. This may be the reason why you see occasional
>>> packet drops. However I don't know why PHY loses established link
>>> in the middle of working.
>>>
>> Yes, I am convinced this loss of link is related to the packet drops as
>> well. At this point we can safely discard cabling issues or router
>> issues (physical ones that is) since the same happens on a different
>> network with different cables.
>>>> From the code in e1000phy_status:
>>>>
>>>> static void
>>>> e1000phy_status(struct mii_softc *sc)
>>>> {
>>>>      struct mii_data *mii = sc->mii_pdata;
>>>>      int bmcr, bmsr, ssr;
>>>>
>>>>      mii->mii_media_status = IFM_AVALID;
>>>>      mii->mii_media_active = IFM_ETHER;
>>>>
>>>>      bmsr = PHY_READ(sc, E1000_SR) | PHY_READ(sc, E1000_SR);
>>>>      bmcr = PHY_READ(sc, E1000_CR);
>>>>      ssr = PHY_READ(sc, E1000_SSR);
>>>>
>>>>      if (bmsr & E1000_SR_LINK_STATUS)
>>>>          mii->mii_media_status |= IFM_ACTIVE;
>>>>
>>>>
>>>> I can see the bmsr & E1000_SR_LINK_STATUS check failing when the problem
>>>> occurs. As a side note, why are we ORing the same call twice? Isn't that
>>>> the same thing as calling it once:
>>>>
>>>> bmsr = PHY_READ(sc, E1000_SR) | PHY_READ(sc, E1000_SR);
>>>>
>>> The E1000_SR_LINK_STATUS bit is latched low so it should be read
>>> twice. If you want to read once use E1000_SSR_LINK bit of
>>> E1000_SSR register but I remember that bit was not reliable on some
>>> PHY models.
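
To illustrate the latched-low behaviour described above, here is a minimal
sketch in C. It does not touch real hardware: struct fake_phy and the
fake_phy_* helpers are hypothetical stand-ins for the PHY and PHY_READ(),
and SR_LINK_STATUS is assumed to match E1000_SR_LINK_STATUS. The model
assumes the usual latching rule: once the link drops, the bit reads 0 until
the register has been read once, and the next read reflects the current
state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to match E1000_SR_LINK_STATUS in e1000phyreg.h. */
#define SR_LINK_STATUS 0x0004

/* Hypothetical model of a PHY whose link status bit is latched low. */
struct fake_phy {
	bool link_up;      /* current state of the link */
	bool latched_down; /* link dropped since the last status read */
};

/* Simulate a link state change as seen by the PHY. */
void
fake_phy_set_link(struct fake_phy *phy, bool up)
{
	assert(phy != NULL);
	if (!up)
		phy->latched_down = true; /* remember the drop */
	phy->link_up = up;
}

/* Simulate one read of the status register; reading clears the latch. */
uint16_t
fake_phy_read_sr(struct fake_phy *phy)
{
	uint16_t val = 0;

	assert(phy != NULL);
	if (phy->link_up && !phy->latched_down)
		val |= SR_LINK_STATUS;
	phy->latched_down = false;
	return (val);
}

/*
 * Read the status register twice and OR the results, as e1000phy_status()
 * does. The first read may return the stale latched 0; the second read
 * returns the current state. OR is commutative, so the order in which the
 * two reads are evaluated does not matter.
 */
bool
link_up_double_read(struct fake_phy *phy)
{
	uint16_t sr = fake_phy_read_sr(phy) | fake_phy_read_sr(phy);

	return ((sr & SR_LINK_STATUS) != 0);
}
```

With this model, a link that flapped and came back reads as down on a single
read but as up with the double read, while a link that is still down reads as
down either way.
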
>> Thanks for the explanation and the alternative. The ssr register seems
>> to give me the right bit (E1000_SSR_LINK) but it also gives me an extra
>> bit 0x0100 that is not defined in e1000phyreg.h. Any idea what that bit
>> would be/means?
>>
> I guess it's related with advanced power saving. It would indicate
> current Energy detect status in PHY POV.
> Generally Marvell's PHY will enter automatic power saving mode
> when it does not see any energy signal on the link. I don't know
> the exact time when it enters that mode, but it would take less
> than 10 seconds if the PHY does not see an energy signal from the
> link partner once it has initiated auto-negotiation.
> However, e1000phy(4) always disables energy detect feature in
> e1000phy_reset() so it wouldn't affect your issue, I guess.
>
> One interesting thing is that 0x100 of the E1000_SSR register indicates
> that the energy detect status is "Sleep mode", which means it didn't
> detect an energy signal (i.e. lost link). I'm not sure whether this bit
> reports the correct status when the energy detect feature is disabled
> though.
>
> Can you check whether your switch supports the energy detect feature?
> Or if your switch supports the EEE feature, try disabling it.
>
As I understand it, EEE only works on GE ports, and because this port is
set to 100 Mb/s it should be disabled. Moreover, Cisco has something
called Green Ethernet, which works on all ports, but AFAIK the switch
we're plugged into does not have those features.

I find it disconcerting that the E1000_SSR register reports both
E1000_SSR_LINK (0x400) and 'Sleep mode' (0x100) at the same time, but I
guess, as you mentioned, the sleep-mode bit might not be reported
correctly while energy detect is disabled. I will assume we can disregard
that bit for now.
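
For reference, here is a minimal sketch of how the two SSR bits discussed
above could be decoded and the odd combination flagged. The bit values come
from this thread (0x400 for E1000_SSR_LINK, 0x100 for the energy detect
"Sleep mode" status). The name SSR_ENERGY_SLEEP and the ssr_* helpers are
hypothetical, since that bit is not defined in e1000phyreg.h.

```c
#include <stdbool.h>
#include <stdint.h>

#define SSR_LINK         0x0400 /* E1000_SSR_LINK, value as quoted in this thread */
#define SSR_ENERGY_SLEEP 0x0100 /* hypothetical name; not in e1000phyreg.h */

/* Decoded view of the two SSR bits of interest. */
struct ssr_flags {
	bool link;         /* PHY reports link up */
	bool energy_sleep; /* energy detect status reads "Sleep mode" */
};

/* Split a raw SSR value into the two flags. */
struct ssr_flags
ssr_decode(uint16_t ssr)
{
	struct ssr_flags f;

	f.link = (ssr & SSR_LINK) != 0;
	f.energy_sleep = (ssr & SSR_ENERGY_SLEEP) != 0;
	return (f);
}

/*
 * Link up together with "Sleep mode" looks contradictory, because sleep
 * mode is supposed to mean no energy was detected on the wire.
 */
bool
ssr_is_contradictory(uint16_t ssr)
{
	struct ssr_flags f = ssr_decode(ssr);

	return (f.link && f.energy_sleep);
}
```

A raw value with both 0x0400 and 0x0100 set is flagged by
ssr_is_contradictory(), whereas 0x0400 alone is not. Whether the flagged case
means anything while energy detect is disabled remains an open question.
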
>>> By chance, does your back-ported driver include r222219?
>>> If yes, did you cold boot after applying the change?
>>> Warm boot does have effect.
>> I do have this patch in the back-ported driver and due to several
>> reasons I didn't cold boot the appliance. We will give that a try and see.
>>
> Ok, let me know whether that makes any difference or not.
So far so good; I will have to give it the night to see.

Again, thanks for all the help,

Karim.


