Date:      Thu, 22 Sep 2016 08:23:17 +0100
From:      Steven Hartland <killing@multiplay.co.uk>
To:        Ryan Stone <rysto32@gmail.com>, Gleb Smirnoff <glebius@freebsd.org>
Cc:        Kubilay Kocak <koobs@freebsd.org>, freebsd-net <freebsd-net@freebsd.org>,  Karl Pielorz <kpielorz_lst@tdx.co.uk>
Subject:   Re: lagg Interfaces - don't do Gratuitous ARP?
Message-ID:  <796ab1c4-3a05-1e59-f8ba-355e75080935@multiplay.co.uk>
In-Reply-To: <CAFMmRNwZBEJ9Me4FSh=W7fRNjm4344jiUGuJqX8KUB_0sWcajA@mail.gmail.com>
References:  <0D84203FAAFD0A8E7BBB24A3@10.12.30.106> <bc33560b-59bc-01be-6a5d-7994ac121258@multiplay.co.uk> <6E574F1B61786E6032824A88@10.12.30.106> <2c62f5f0-3fb4-f513-2a8f-02de3a1d552f@FreeBSD.org> <20160921235703.GG1018@cell.glebi.us> <CAFMmRNwZBEJ9Me4FSh=W7fRNjm4344jiUGuJqX8KUB_0sWcajA@mail.gmail.com>

On 22/09/2016 02:12, Ryan Stone wrote:
>
>
> On Wed, Sep 21, 2016 at 7:57 PM, Gleb Smirnoff <glebius@freebsd.org> wrote:
>
>     IMHO, the original patch was absolutely evil hack touching multiple
>     layers, for the sake of a very special problem.
>
>     I think, that in order to kick forwarding table on switches, lagg
>     should:
>
>     - allocate an mbuf itself
>     - set its source hardware address to its own
>     - set destination hardware to broadcast
>     - put some payload in there, to make packet of valid size. Why should it
>       be gratuitous ARP? A machine can be running IPv6 only, or may even use
>       whatever higher level protocol, e.g. PPPoE. We shouldn't involve IP
>       into this Layer 2 problem at all.
>     - Finally, send the prepared mbuf down the lagg member(s).
>
>     And please don't hack half of the network stack to achieve that :)
>
>
> The original report in this thread is about a system where it takes 
> almost 15 minutes for the network to start working again after a 
> failover.  That does not sound to me like a switch problem.  That 
> sounds to me like the ARP cache on the remote system.  To fix such a 
> case we have to touch L3.
15 minutes is a long time; however, we don't do ARP correctly in this
case, which is almost certainly the cause.
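
To make the difference between the two proposals concrete, here is a
minimal sketch in Python of the two frames being discussed. It is not
lagg(4) code and not the patch under review; the function names are
hypothetical, and the ethertype and zero padding used for the plain
Layer 2 frame are assumptions, since Gleb's outline does not specify
them. The first frame only refreshes switch forwarding tables; the
second is a gratuitous ARP request, which also lets remote hosts update
their ARP caches.

```python
import socket
import struct

BROADCAST = b"\xff" * 6
ETHERTYPE_ARP = 0x0806
# Assumption: IEEE 802 "Local Experimental Ethertype 1", used here only
# as a harmless placeholder for the L2-only announcement frame.
ETHERTYPE_LOCAL_EXP = 0x88B5
MIN_FRAME_LEN = 60  # minimum Ethernet frame size without the FCS


def mac_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def l2_announce_frame(src_mac: str) -> bytes:
    """Broadcast frame with our source MAC and padding only (no IP involved)."""
    header = BROADCAST + mac_bytes(src_mac) + struct.pack("!H", ETHERTYPE_LOCAL_EXP)
    return header.ljust(MIN_FRAME_LEN, b"\x00")


def gratuitous_arp_frame(src_mac: str, ip: str) -> bytes:
    """Broadcast ARP request where sender IP == target IP (gratuitous ARP)."""
    sha = mac_bytes(src_mac)
    spa = socket.inet_aton(ip)
    header = BROADCAST + sha + struct.pack("!H", ETHERTYPE_ARP)
    # htype=1 (Ethernet), ptype=0x0800 (IPv4), hlen=6, plen=4, oper=1 (request)
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += sha + spa          # sender hardware / protocol address
    arp += b"\x00" * 6 + spa  # target hardware (unknown) / protocol address
    return (header + arp).ljust(MIN_FRAME_LEN, b"\x00")
```

A switch learns the new port from the source MAC of either frame, but
only the second carries an IP-to-MAC binding, so only it can correct a
stale ARP entry on a remote host.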


