Date:      Mon, 7 Jul 2008 13:04:18 -0700
From:      Alfred Perlstein <alfred@freebsd.org>
To:        Andre Oppermann <andre@freebsd.org>
Cc:        cvs-src@FreeBSD.org, src-committers@FreeBSD.org, Robert Watson <rwatson@FreeBSD.org>, cvs-all@FreeBSD.org
Subject:   Re: cvs commit: src/sys/netinet udp_usrreq.c
Message-ID:  <20080707200418.GE95574@elvis.mu.org>
In-Reply-To: <48720552.9000605@freebsd.org>
References:  <200807071057.m67Av9WD014167@repoman.freebsd.org> <20080707121042.W63144@fledge.watson.org> <48720552.9000605@freebsd.org>

* Andre Oppermann <andre@freebsd.org> [080707 05:01] wrote:
> Robert Watson wrote:
> >On Mon, 7 Jul 2008, Robert Watson wrote:
> >
> >>rwatson     2008-07-07 10:56:55 UTC
> >>
> >> FreeBSD src repository
> >>
> >> Modified files:
> >>   sys/netinet          udp_usrreq.c
> >> Log:
> >> SVN rev 180344 on 2008-07-07 10:56:55Z by rwatson
> >>
> >> First step towards parallel transmit in UDP: if neither a specific
> >> source nor a specific destination address is requested as part of a send
> >> on a UDP socket, read lock the inpcb rather than write lock it.  This
> >> will allow fully parallel transmit down to the IP layer when sending
> >> simultaneously from multiple threads on a connected UDP socket.
> >>
> >> Parallel transmit for more complex cases, such as when sendto(2) is
> >> invoked with an address and there's already a local binding, will
> >> follow.
> >
> >This change doesn't help the particularly interesting applications, such 
> >as named, etc, as they usually call sendto() with an address rather than 
> >connect() the UDP socket, but upcoming changes should address that.  
> >Once you get to the IP layer, the routing code shows up as a massive 
> >source of contention, and it would be great if someone wanted to work on 
> >improving concurrency for routing lookups.  Re-introducing the route 
> >cache for inpcbs would also help the connect() case, but not the 
> >sendto() case, but is still a good idea as it would help TCP a *lot*.  
> >Once you get below the IP layer, contention on device driver transmit 
> >locks appears to be the next major locking-related performance issue.  
> >The UDP changes I'm in the throes of merging have led to significant 
> >performance improvements for UDP applications, such as named and 
> >memcached, and hopefully can be MFC'd for 7.1 or 7.2.
> 
> Caching the route in the inpcb has a number of problems:
> 
>  - any routing table change has to walk all inpcb's to invalidate
>    and remove outdated and invalid references.
> 
>  - adding host routes again just bloats the table again and makes
>    lookups more expensive.
> 
>  - host routes (cloned) do not change when the underlying route is
>    adjusted and packets are still routed to the old gateway (for
>    example new default route).
> 
>  - We have a tangled mess of cross-pointers and dependencies again
>    precluding optimizations to the routing table and code itself.

Can't you address #1, #3 and #4 by copying the entry and using
a generation count?  When a route change happens, just bump the
generation count.  Every cached copy is then stale, and the next
time one is about to be used, the mismatch is noticed and the
copy is thrown out.
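
To make the generation count idea concrete, here is a rough userland
sketch.  Every name in it (rt_generation, pcb_route_cache, pcb_route,
rt_slow_lookup, rt_change_default) is made up for illustration and is
not the FreeBSD routing API.  It also ignores locking and assumes one
destination per socket, as on a connected socket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; not the FreeBSD routing API. */

/* Global generation counter, bumped on every routing table change. */
static uint32_t rt_generation = 1;

/* What a lookup hands back: a copy, not a pointer into the table. */
struct route_copy {
	uint32_t nexthop;	/* next-hop IPv4 address, host order */
	int	 ifindex;	/* outgoing interface index */
};

/* Per-socket cache: the copied entry plus the generation it came from. */
struct pcb_route_cache {
	struct route_copy rc;
	uint32_t	  gen;	/* 0 means "never filled" */
};

/* Stand-in for the real (expensive) routing table lookup. */
static int slow_lookups;
static uint32_t default_gw = 0x0a000001;	/* 10.0.0.1 */

static struct route_copy
rt_slow_lookup(uint32_t dst)
{
	struct route_copy rc = { .nexthop = default_gw, .ifindex = 1 };

	(void)dst;		/* the toy table only has a default route */
	slow_lookups++;
	return (rc);
}

/* A table change invalidates every cached copy without walking them. */
static void
rt_change_default(uint32_t new_gw)
{
	default_gw = new_gw;
	if (++rt_generation == 0)
		rt_generation = 1;	/* keep 0 reserved for "never filled" */
}

/* Use the cached copy if it is still current, otherwise refill it. */
static struct route_copy
pcb_route(struct pcb_route_cache *c, uint32_t dst)
{
	assert(c != NULL);
	if (c->gen != rt_generation) {
		c->rc = rt_slow_lookup(dst);
		c->gen = rt_generation;
	}
	return (c->rc);
}
```

The point is that a route change costs one increment no matter how
many inpcbs hold a copy, and a stale copy is only paid for by the
socket that next tries to use it.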

Can't comment on the rest of this as I'm not that familiar...

> 
> A different path to a reduced routing overhead may be the following:
> 
>  - move ARP out of the routing table into its own per-AF and interface
>    structure and optimized for fast perfect match lookups;  This removes
>    a lot of bloat and dependencies from the routing table.
> 
>  - prohibit any direct references to specific routes (pointers) in the
>    routing table;  Lookups take the ifp/nexthop and unlock the table
>    w/o any further references;
> 
>  - The per-route locks can be removed and a per-AF global optimized table
>    lock can be introduced.
> 
>  - A clear separation between route lookup and modify (add/remove) should
>    be made;  With this change differentiated locking strategies can be
>    used (rwlocks and/or the routing table can be replicated per-cpu).
> 
>  - Make a distinction between host and router mode to allow for different
>    optimizations  (rmlock for hosts and rwlocks for routers for example).
> 
> Our current routing code has its fingers still in too many things.  Once
> it can be untangled way more optimization and simplification is possible.

That sounds cool.

-- 
- Alfred Perlstein
