Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 15 Dec 2007 19:01:09 +0000 (GMT)
From:      Robert Watson <rwatson@FreeBSD.org>
To:        Kip Macy <kip.macy@gmail.com>
Cc:        FreeBSD Current <freebsd-current@freebsd.org>, freebsd-arch@freebsd.org
Subject:   Re: pending changes for TOE support
Message-ID:  <20071215184737.A85668@fledge.watson.org>
In-Reply-To: <b1fa29170712151010h5d977bb8k8584ed5b6b804c1c@mail.gmail.com>
References:  <b1fa29170712121303x537fd11fj4b8827bb353ad8e4@mail.gmail.com>  <b1fa29170712150057m690bd36bm7a1910969e92293b@mail.gmail.com>  <20071215100351.Q70617@fledge.watson.org> <20071215152959.V85668@fledge.watson.org> <b1fa29170712151010h5d977bb8k8584ed5b6b804c1c@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help

On Sat, 15 Dec 2007, Kip Macy wrote:

>> BTW, could you prepare similar comments for toedev.h?
>
> Ok, toedev.h isn't actually originally from me. However, it is a part of the 
> API and so should be documented.

Thanks -- when designing a KPI, it's important to keep in mind that the goal 
of the author using it will not be to understand our TCP code, but instead, to 
implement the driver as quickly and cheaply as possible.  Therefore we should 
make it as easy as possible for the authors of drivers to get it right.  This 
comes out a few other places in my requests for clarification or additions -- 
imagine you weren't familiar with our TCP, what could you get wrong?

>> This structure seems to introduce quite a bit more indirection for 
>> non-offloaded cases than before, especially to do things like this:
>
> This should never be called as sc_tu will only be set by the TOE interface 
> to the syncache. Also bear in mind that this and detach are only called once 
> per connection The real overhead is in calling tcp_output via an indirect 
> function call versus a direct function call. Considering the level of 
> gratuitous overhead in the output path I would be surprised if this were 
> measurable.

Undoubtably this is true for high-end systems sporting PCIe interfaces with 
10gbps cards, but we also run in other configurations.  A complaint we've had 
a fair amount in the last few years is that our work has increasingly targeted 
high-end systems where overhead is in cache misses, and decreasingly targeted 
low-end systems where overhead is in instruction count.  While you offer the 
opportunity to compile out some of this, I think we should try to make these 
things capable of being fast in both circumstances without multiple compile 
paths where it's easy.

>> This would compile to a straight call to tcp_output() when offloading isn't 
>> compiled in, and when it is compiled in and offloading isn't enabled, we do 
>> a simple flag check rather than invoking a function via a series of 
>> pointers.
>
> This is what the current code does. See tcp_ofld.h in CVS. The indirection 
> was suggested by Sam as a cleaner abstraction.

I think I prefer the CVS version of the two, although I would like to collapse 
the two ifdef parts per my example, so that ifdef and non-ifdef cases are 
side-by-side, and so that function headers are shared by the two cases. 
Ideally, we'd make the ifdef very, very small, as it's well-known that when we 
have two mutually exclusive code paths and one isn't compiled or used 
frequently, it rots.

>> +int tcp_offload_connect(struct socket *so, struct sockaddr *nam);
>>
>> This prototype appear not to be documented.
>
> Ok, it appears fairly self-documenting to me =-D

I'm sure you can find some insight to express that's not self-obvious in the 
code :-).

>> I find myself wondering if the offload option should be a TCP socket option 
>> rather than a socket-layer option, but don't have strong feelings about 
>> this.
>
> The assumption is that, if you a) pay the extra for a version of the card 
> that supports TOE and b) you load the toe module (it isn't part of the NIC 
> driver) that you want your existing software to have its connections 
> offloaded. I don't know of any customers that want to modify their user 
> applications to selectively enable TOE.

I think you mis-understood my question -- I was wondering whether it should be 
selectively disabled by an IP-layer socket option rather than a socket-layer 
socket option.

>> What do you intend the policy model to be for enabling TOE, in general? 
>> That if the TOE capability is available and the TOE capability is enabled, 
>> all TCP sockets via the enabled interface will be offloaded unless 
>> SO_NO_OFFLOAD is set?
>
> That is the current usage model. At some point we may tie it into 
> pf/ipf/ipfw to provide for offload policy. However, currently we offload all 
> connections on an interface until we run out of TCAM entries.
>
>> Are you currently anticipating enabling the capability by default?
>
> Yes, *if the TOE driver is loaded*.

I have somewhat mixed feelings about this, and feel like there should be 
something eloquent to say about having functionality administratively enabled 
rather than a default when compiled in, but can't quite figure out a nice 
expression of that.  I think it might have something to do with expecting 
vendors of 10gbps cards to like shipping two modules for every device rather 
than one, and having the right behavior out of the box if they do ship just 
one.

Robert N M Watson
Computer Laboratory
University of Cambridge



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20071215184737.A85668>