Date:      Sat, 20 Aug 2011 12:38:26 +0100 (BST)
From:      Robert Watson <rwatson@FreeBSD.org>
To:        Lev Serebryakov <lev@freebsd.org>
Cc:        freebsd-arch@freebsd.org
Subject:   10gbps scalability (was: Re: FreeBSD problems and preliminary ways to solve)
Message-ID:  <alpine.BSF.2.00.1108201234280.4529@fledge.watson.org>
In-Reply-To: <368496955.20110820101506@serebryakov.spb.ru>
References:  <slrnj4oiiq.21rg.vadim_nuclight@kernblitz.nuclight.avtf.net> <810527321.20110819123700@serebryakov.spb.ru> <201108191401.23083.pieter@degoeje.nl> <425884435.20110819175307@serebryakov.spb.ru> <20110819172252.GE88904@in-addr.com> <368496955.20110820101506@serebryakov.spb.ru>

On Sat, 20 Aug 2011, Lev Serebryakov wrote:

>> Can you honestly say the same about handling line rate packet forwarding
>> for multiple 10G cards?
>
> I agree with you. I didn't say that 10G routing is very important for many 
> users. My comment about 10G was a reply to the statement that "The niche for 
> routers & traffic analysis is still ours.". I wanted to say that this may be 
> so now, but not for long.

Part of the key here will be reworking things like ipfw(4) and pf(4) to scale 
better than they do currently.  For pf(4), it's particularly important that we 
align hardware work distribution via RSS with state management for TCP 
connections.  I've been working on this for the base system TCP implementation 
over the last few years, and got most of it into 9.x (but not the actual RSS 
driver interface, as I wasn't convinced it was a stable KPI in the form I 
prototyped it in).  Post-9.0, I'll try to get the RSS KPI cleaned up so that 
we can merge it and get our device drivers updated.
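
To illustrate what aligning with RSS means, here is a minimal sketch of the 
Toeplitz hash that RSS-capable NICs compute over the TCP/IPv4 4-tuple.  If 
software computes the same hash with the same key, it can pick the same 
bucket the hardware picked.  The names rss_hash_tcp4() and rss_bucket() are 
hypothetical, not the KPI discussed above; the key is the sample key from 
Microsoft's RSS verification suite; and the power-of-two bucket mask is an 
assumption (real NICs usually go through an indirection table).

```c
#include <stddef.h>
#include <stdint.h>

/* Sample 40-byte key from Microsoft's RSS verification suite. */
static const uint8_t rss_key[40] = {
	0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
	0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
	0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
	0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
	0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa,
};

/* Toeplitz hash: XOR together the 32-bit key windows selected by set bits. */
static uint32_t
toeplitz_hash(const uint8_t *key, size_t keylen,
    const uint8_t *data, size_t datalen)
{
	uint32_t hash = 0;
	/* v holds the 32 key bits aligned with the current input bit. */
	uint32_t v = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
	    ((uint32_t)key[2] << 8) | (uint32_t)key[3];

	for (size_t i = 0; i < datalen; i++) {
		for (int b = 7; b >= 0; b--) {	/* most significant bit first */
			if (data[i] & (1u << b))
				hash ^= v;
			v <<= 1;		/* slide the key window by one bit */
			if (i + 4 < keylen && (key[i + 4] & (1u << b)))
				v |= 1;
		}
	}
	return hash;
}

/*
 * Hypothetical helper: hash a TCP/IPv4 4-tuple.  Arguments are in host
 * byte order and are serialized big-endian in the order source address,
 * destination address, source port, destination port.
 */
static uint32_t
rss_hash_tcp4(uint32_t saddr, uint32_t daddr, uint16_t sport, uint16_t dport)
{
	uint8_t in[12] = {
		(uint8_t)(saddr >> 24), (uint8_t)(saddr >> 16),
		(uint8_t)(saddr >> 8), (uint8_t)saddr,
		(uint8_t)(daddr >> 24), (uint8_t)(daddr >> 16),
		(uint8_t)(daddr >> 8), (uint8_t)daddr,
		(uint8_t)(sport >> 8), (uint8_t)sport,
		(uint8_t)(dport >> 8), (uint8_t)dport,
	};

	return toeplitz_hash(rss_key, sizeof(rss_key), in, sizeof(in));
}

/* Assumption: nbuckets is a power of two, so the low hash bits pick a bucket. */
static unsigned
rss_bucket(uint32_t hash, unsigned nbuckets)
{
	return hash & (nbuckets - 1);
}
```

If the software-side connection tables are indexed by rss_bucket() of the same 
hash, a packet delivered to hardware queue N is looked up in table N, so the 
CPU servicing that queue does not contend with the others.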

There's also a related work-in-progress I have that teaches the network stack 
to program NIC filters, usually implemented as TCAMs (Chelsio) or hardware 
hash tables (Solarflare), with the network stack's connection affinity.  My 
plan is to work on making this substantially more real once the RSS patches 
are in. 
(Those are, themselves, fairly minor: we have connection groups already in 
9.0, and the RSS changes simply cause existing software-side hash tables to 
align with hardware-side hashing: the tricky bit is a sustainable KPI for 
device driver writers).
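
As a rough picture of the hash-table style of filter, the sketch below models 
an exact-match table that maps a connection's 4-tuple to a receive queue and 
falls back to the RSS-selected queue when no filter is installed.  Everything 
here is an assumption made for illustration: the structure and function 
names, the table size, and the mixing function are hypothetical and do not 
describe any vendor's hardware or an existing FreeBSD interface.

```c
#include <stdbool.h>
#include <stdint.h>

#define FILTER_SLOTS 64		/* hypothetical table size, power of two */

struct flow {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

struct filter_entry {
	struct flow key;
	unsigned queue;		/* receive queue this connection is pinned to */
	bool used;
};

struct filter_table {
	struct filter_entry slot[FILTER_SLOTS];
};

/* Stand-in mixing function; real hardware uses its own hash. */
static uint32_t
flow_hash(const struct flow *f)
{
	uint32_t h = f->saddr * 2654435761u;

	h ^= f->daddr * 2246822519u;
	h ^= (((uint32_t)f->sport << 16) | f->dport) * 3266489917u;
	return h ^ (h >> 15);
}

static bool
flow_equal(const struct flow *a, const struct flow *b)
{
	return a->saddr == b->saddr && a->daddr == b->daddr &&
	    a->sport == b->sport && a->dport == b->dport;
}

/* Install or update a filter; returns false if the table is full. */
static bool
filter_insert(struct filter_table *t, const struct flow *f, unsigned queue)
{
	uint32_t h = flow_hash(f);

	for (unsigned i = 0; i < FILTER_SLOTS; i++) {
		struct filter_entry *e = &t->slot[(h + i) & (FILTER_SLOTS - 1)];

		if (!e->used || flow_equal(&e->key, f)) {
			e->key = *f;
			e->queue = queue;
			e->used = true;
			return true;
		}
	}
	return false;
}

/* Return the pinned queue, or rss_queue if no filter matches. */
static unsigned
filter_lookup(const struct filter_table *t, const struct flow *f,
    unsigned rss_queue)
{
	uint32_t h = flow_hash(f);

	for (unsigned i = 0; i < FILTER_SLOTS; i++) {
		const struct filter_entry *e =
		    &t->slot[(h + i) & (FILTER_SLOTS - 1)];

		if (!e->used)
			break;		/* linear probing: empty slot ends the chain */
		if (flow_equal(&e->key, f))
			return e->queue;
	}
	return rss_queue;
}
```

The sketch never removes entries, which is what makes stopping at the first 
empty slot safe; a real table would also need deletion and a policy for what 
to do when the hardware runs out of filter slots.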

These are closely related to the issue of userspace networking, which Luigi is 
starting to explore with netmap.  Ideally, you could use the same NIC for both 
kernel network stack stuff and userspace applications, using hardware filters 
to decide whether individual packets go to a descriptor ring in the kernel or 
userspace.  Solarflare's OpenOnload is an interesting potential model there, 
although perhaps not the exact model we want (they rely on shared network 
stacks between kernel and userspace, and for most of our purposes, less 
sharing is not only sufficient, but perhaps better).
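
The kernel-versus-userspace steering decision can be sketched as a TCAM-style 
classifier: an ordered list of value/mask rules, first match wins, and 
anything unmatched goes to the kernel ring.  This is only a software model 
under assumed names (struct tcam_rule, classify(), RING_KERNEL, RING_USER); 
it is not netmap's or OpenOnload's interface, and the choice of match fields 
is arbitrary.

```c
#include <stddef.h>
#include <stdint.h>

enum ring { RING_KERNEL, RING_USER };

/* Hypothetical match fields; a real filter could match more of the header. */
struct pkt_key {
	uint32_t daddr;
	uint16_t dport;
	uint8_t proto;
};

/* A ternary rule: only bits set in mask are compared against value. */
struct tcam_rule {
	struct pkt_key value;
	struct pkt_key mask;
	enum ring dest;
};

static int
rule_matches(const struct tcam_rule *r, const struct pkt_key *k)
{
	return ((k->daddr ^ r->value.daddr) & r->mask.daddr) == 0 &&
	    ((k->dport ^ r->value.dport) & r->mask.dport) == 0 &&
	    ((k->proto ^ r->value.proto) & r->mask.proto) == 0;
}

/* First matching rule decides; unmatched packets go to the kernel ring. */
static enum ring
classify(const struct tcam_rule *rules, size_t nrules,
    const struct pkt_key *k)
{
	for (size_t i = 0; i < nrules; i++) {
		if (rule_matches(&rules[i], k))
			return rules[i].dest;
	}
	return RING_KERNEL;
}
```

For example, a single rule with proto 17 and dport 9000 fully masked and 
daddr left unmasked would send all UDP traffic for port 9000 to the 
userspace ring while every other packet keeps flowing through the kernel 
stack.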

Robert


