Date:      Wed, 14 Aug 2013 14:40:24 +0200
From:      Luigi Rizzo <rizzo@iet.unipi.it>
To:        "Alexander V. Chernikov" <melifaro@ipfw.ru>
Cc:        Lawrence Stewart <lstewart@freebsd.org>, Lev Serebryakov <lev@FreeBSD.org>, FreeBSD Net <net@freebsd.org>
Subject:   Re: route/arp lifetime (Re: it's the output, not ack coalescing (Re: TSO and FreeBSD vs Linux))
Message-ID:  <20130814124024.GA64548@onelab2.iet.unipi.it>
In-Reply-To: <520B74DD.1060102@ipfw.ru>
References:  <520A6D07.5080106@freebsd.org> <520AFBE8.1090109@freebsd.org> <520B24A0.4000706@freebsd.org> <520B3056.1000804@freebsd.org> <20130814102109.GA63246@onelab2.iet.unipi.it> <587579055.20130814154713@serebryakov.spb.ru> <20130814120551.GA64260@onelab2.iet.unipi.it> <520B74DD.1060102@ipfw.ru>

On Wed, Aug 14, 2013 at 04:15:25PM +0400, Alexander V. Chernikov wrote:
> On 14.08.2013 16:05, Luigi Rizzo wrote:
> > On Wed, Aug 14, 2013 at 03:47:13PM +0400, Lev Serebryakov wrote:
> >> Hello, Luigi.
> >> You wrote on 14 August 2013, 14:21:09:
> >>
> >> LR> Then the problem remains that we should keep a copy of route and
> >> LR> arp information in the socket instead of redoing the lookups on
> >> LR> every single transmission, as they consume some 25% of the time of
> >> LR> a sendto(), and probably even more when it comes to large tcp
> >> LR> segments, sendfile() and the like.
> >>    And we should invalidate this info on ARP/route changes, or connection
> >>   will be lost in such cases, am I right?.. So, on each such event code
> >>   should look into all sockets and check, if routing/ARP information is still
> >>   valid for them. Or we should store lists of sockets in routing and ARP
> >>   tables... I don't know, what is worse.
> > I think we should start by acknowledging that routing and ARP
> > information is inherently stale, and changes infrequently.
> > So it is not a disaster if we have incorrect information for some
> > short amount of time (milliseconds) because in the end the remote
> > party that decides to change it and inform us may take much longer
> > than that to distribute the update.
> You can save rte & arp; however, doing this
> gives you a perfect chance to crash your kernel if the egress interface is
> destroyed (like vlan or ng or tun).

I hope I have learned not to follow a stale ifp pointer :)
Anyway, an ARP entry is really just the MAC address, so there is no
dangling pointer issue.

For the ifp associated with the route,
I do not see a huge problem in marking the route/ifp as a
zombie and destroying it when the last reference goes away.
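
A minimal userland sketch of the zombie idea, only to make the
lifetime rules concrete. Everything here is hypothetical: struct
demo_ifp and the demo_ifp_*() functions are illustrative names, not
the kernel's struct ifnet or its API, and the plain int counter
assumes a single thread (real code would need atomics or a lock).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for an interface; NOT the kernel's struct ifnet. */
struct demo_ifp {
	int	 refs;		/* owner + cached references */
	bool	 zombie;	/* set once the interface is destroyed */
	int	*freed;		/* demo hook: set to 1 when memory is released */
};

/* Create an interface holding one reference on behalf of its owner. */
static struct demo_ifp *
demo_ifp_alloc(int *freed)
{
	struct demo_ifp *ifp = calloc(1, sizeof(*ifp));

	if (ifp == NULL)
		return (NULL);
	ifp->refs = 1;
	ifp->freed = freed;
	return (ifp);
}

/* Take an extra reference, e.g. when a socket caches the pointer. */
static void
demo_ifp_ref(struct demo_ifp *ifp)
{
	ifp->refs++;
}

/* Drop a reference; the memory goes away with the last one. */
static void
demo_ifp_rele(struct demo_ifp *ifp)
{
	assert(ifp->refs > 0);
	if (--ifp->refs == 0) {
		if (ifp->freed != NULL)
			*ifp->freed = 1;
		free(ifp);
	}
}

/* "Destroy": mark as zombie and drop only the owner's reference. */
static void
demo_ifp_destroy(struct demo_ifp *ifp)
{
	ifp->zombie = true;
	demo_ifp_rele(ifp);
}

/* Transmit path: a zombie is refused, but the pointer is still valid. */
static int
demo_ifp_output(const struct demo_ifp *ifp)
{
	return (ifp->zombie ? ENXIO : 0);
}
```

With this shape, a socket that still holds a cached pointer after
demo_ifp_destroy() gets an error from demo_ifp_output() instead of
touching freed memory, and the structure is released when that socket
finally calls demo_ifp_rele().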

Not that the current way is any better -- you need to lock/unlock
the rte while you do the lookup, and hold a refcount to the ifp
until the packet is queued. So how does my suggestion make
things worse?

cheers
luigi


> >
> >
> > Considering that each lookup takes between 100..300ns if you are
> > lucky (not many misses, relatively empty table etc.), one could
> > reasonably do the lookup at most once per millisecond or so (just
> > reading 'ticks', no need for a nanotime() if you have a slow clock),
> > or whenever we get an error related to the socket, either in the
> > forward path (e.g. ifp points to an interface that is down) or in
> > the reverse path (e.g. a dupack because we sent a packet to the
> > wrong place).
> This sounds like "Hey, the kernel lookup is slow (which is true), let's
> make a hack and not bother with lookups".
> This approach gives us mtx-locked rte refcounts which are used (misused) 
> in many places making things worse and decreasing the ability to fix the 
> things up..
> >
> > cheers
> > luigi
> > _______________________________________________
> > freebsd-net@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-net
> > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
> >
> 
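
For reference, the per-socket caching proposed in the quoted text
(redo the lookup at most once per tick, or after an error) could be
sketched in userland C roughly as follows. All names are hypothetical:
struct demo_route_cache, demo_cache_get(), demo_cache_invalidate() and
the stub demo_slow_lookup() are illustrations, not existing kernel
interfaces, and 'now' stands in for a cheap counter such as 'ticks'.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-socket cache of the route + ARP result. */
struct demo_route_cache {
	bool	 valid;
	uint32_t stamp;		/* tick counter value at the last lookup */
	uint32_t dst;		/* destination the entry was resolved for */
	int	 ifindex;	/* egress interface index */
	uint8_t	 mac[6];	/* next-hop MAC address */
};

/* Shape of the slow path, i.e. the full route + ARP lookup. */
typedef int (*demo_lookup_fn)(uint32_t dst, int *ifindex, uint8_t mac[6]);

/* Stub slow path for the demo: counts calls, returns a fake result. */
static int demo_lookup_calls;

static int
demo_slow_lookup(uint32_t dst, int *ifindex, uint8_t mac[6])
{
	demo_lookup_calls++;
	*ifindex = 1;
	memset(mac, 0, 6);
	mac[5] = (uint8_t)(dst & 0xff);
	return (0);
}

/* Call on any socket error (interface down, dupack, ...). */
static void
demo_cache_invalidate(struct demo_route_cache *c)
{
	c->valid = false;
}

/*
 * Return 0 with c->ifindex and c->mac usable, or the slow path's error.
 * The slow path runs only if the entry is missing, was resolved for a
 * different destination, or is at least max_age ticks old.
 */
static int
demo_cache_get(struct demo_route_cache *c, uint32_t dst, uint32_t now,
    uint32_t max_age, demo_lookup_fn slow)
{
	int error;

	/* Unsigned subtraction keeps the age test correct across wrap. */
	if (c->valid && c->dst == dst && (uint32_t)(now - c->stamp) < max_age)
		return (0);

	c->valid = false;
	error = slow(dst, &c->ifindex, c->mac);
	if (error != 0)
		return (error);
	c->dst = dst;
	c->stamp = now;
	c->valid = true;
	return (0);
}
```

With max_age set to 1, repeated sends within the same tick reuse the
cached entry, and the first send in a new tick (or after
demo_cache_invalidate()) pays for one full lookup.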


