Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 15 Dec 2007 10:10:21 -0800
From:      "Kip Macy" <kip.macy@gmail.com>
To:        "Robert Watson" <rwatson@freebsd.org>
Cc:        FreeBSD Current <freebsd-current@freebsd.org>, freebsd-arch@freebsd.org
Subject:   Re: pending changes for TOE support
Message-ID:  <b1fa29170712151010h5d977bb8k8584ed5b6b804c1c@mail.gmail.com>
In-Reply-To: <20071215152959.V85668@fledge.watson.org>
References:  <b1fa29170712121303x537fd11fj4b8827bb353ad8e4@mail.gmail.com> <b1fa29170712150057m690bd36bm7a1910969e92293b@mail.gmail.com> <20071215100351.Q70617@fledge.watson.org> <20071215152959.V85668@fledge.watson.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Dec 15, 2007 8:23 AM, Robert Watson <rwatson@freebsd.org> wrote:
>
> On Sat, 15 Dec 2007, Robert Watson wrote:
>
> > More comments to follow.
>
> Some more, back to tcp_offload.c which I didn't look at last time around:
>
> +       ifp = rt->rt_ifp;
> +       tdev = TOEDEV(ifp);
> +       if (tdev == NULL)
> +               return (EINVAL);
> +
> +       if (tdev->tod_can_offload(tdev, so) == 0)
> +               return (EINVAL);
>
> I sort of expected to see a if_capenable check for TOE here, but I guess it
> doesn't make much difference if the flag is going to be checked at the lower
> level.  However, in the case of things like TCP checksum offload, we do do the
> check at the higher level, so it wouldn't be inconsistent to do that.


Yes, IFCAP_TOE should be checked here.

>
> BTW, could you prepare similar comments for toedev.h?

Ok, toedev.h isn't actually originally from me. However, it is a part
of the API and so should be documented.

>
> +struct toe_usrreqs tcp_offload_usrreqs = {
> +       .tu_send =              tcp_output,
> +       .tu_rcvd =              tcp_output,
> +       .tu_disconnect =        tcp_output,
> +       .tu_abort =             tcp_output,
> +       .tu_detach =            tcp_offload_detach_noop,
> +       .tu_syncache_event =    tcp_offload_syncache_event_noop,
> +};
>
> This structure seems to introduce quite a bit more indirection for
> non-offloaded cases than before, especially to do things like this:
>
> +static void
> +tcp_offload_syncache_event_noop(int event, void *toepcb)
> +{
> +}


This should never be called as sc_tu will only be set by the TOE
interface to the syncache. Also bear in mind that this and detach are
only called once per connection The real overhead is in calling
tcp_output via an indirect function call versus a direct function
call. Considering the level of gratuitous overhead in the output path
I would be surprised if this were measurable.


>
> The compiler can't compile these out because it likely doesn't do much in the
> way of function pointer analysis, and probably shouldn't.  However, it leaves
> quite a few code paths heavier weight than before.  I think I'd prefer a model
> in which a TF_OFFLOAD flag is set on the tcpcb and then we conditionally
> invoke tu_foo as a result.  I.e.,:
>
> static __inline int
> tcp_gen_send(struct tcpcb *tp)
> {
>
> #ifndef TCP_OFFLOAD_DISABLE
>         if (tcp->f_flag & TF_OFFLOADED)
>                 return (tp->t_tu->tu_send(tp));
>         else
> #endif
>                 return (tcp_output(tp));
> }
>
> This would compile to a straight call to tcp_output() when offloading isn't
> compiled in, and when it is compiled in and offloading isn't enabled, we do a
> simple flag check rather than invoking a function via a series of pointers.


This is what the current code does. See tcp_ofld.h in CVS. The
indirection was suggested by Sam as a cleaner abstraction.



>
> Back to tcp_offload.h:
>
> +#define        SC_ENTRY_PRESENT                1       /* 4-tuple already
> present */
> +#define        SC_DROP                         2       /* connection was
> timed out */
>
> I think you should give these a different prefix, since they're part of the
> interface between the syncache and offload, and SC_ generally are flags used
> internal to the syncache.  Later you use TCP_OFFLOAD_ as the prefix, so that
> may be appropriate here also.
>
> +#define        TCP_OFFLOAD_LISTEN_OPEN         1
> +#define        TCP_OFFLOAD_LISTEN_CLOSE        2
> +
> +typedef        void    (*tcp_offload_listen_fn)(void *, int, struct tcpcb *);
> +EVENTHANDLER_DECLARE(tcp_offload_listen, tcp_offload_listen_fn);
>
> Here and with the syncache interface, it seems like you're adding new
> operations as modes to functions, whereas elsewhere you're adding multiple
> functions.  There isn't too much difference, but I think I'd rather see a
> slightly wider interface (i.e., more functions) and avoid flags that change
> the semantics of particular functions.  That is to say: tu_syncache_add and
> tu_syncache_drop, and likewise tcp_offload_listen and tcp_offload_unlisten (or
> something similar).

I've kept the interface to as few functions as possible. I've expanded
tcp_output to 4 functions because it was necessary semantically. I'll
widen the interface if Sam agrees.


>
> +int    tcp_offload_connect(struct socket *so, struct sockaddr *nam);
>
> This prototype appear not to be documented.

Ok, it appears fairly self-documenting to me =-D

>
> +/*
> + * The tcp_gen_* routines are wrappers around the toe_usrreqs calls,
> + * in the non-offloaded case they translate to tcp_output.
>
> Not really sure about the _gen_ naming, but not sure I have a better
> suggestion in mind just yet -- maybe _output_ since they control output.  I'm
> not entirely thrilled that all of these become inline functions in include
> files, although since a couple are used in multiple tcp*.c files, it may be
> the least bad of the options.  My comments above about possibly restructuring
> this to avoid lots of indirection for non-offloaded case strengthen the
> argument for inlining, however.
>
> +#ifndef TCP_OFFLOAD_DISABLE
>
> I notice you've renamed the option, and I like the new name a lot better.
> However, I also notice that it doesn't seem to appear in conf/options or
> conf/files.  Could you make sure to add it?
>
> +/*
> + * The socket has not been marked as "do not offload"
> + */
> +#define        SO_OFFLOADABLE(so)      ((so->so_options & SO_NO_OFFLOAD) == 0)
>
> I find myself wondering if the offload option should be a TCP socket option
> rather than a socket-layer option, but don't have strong feelings about this.

The assumption is that, if you a) pay the extra for a version of the
card that supports TOE and b) you load the toe module (it isn't part
of the NIC driver) that you want your existing software to have its
connections offloaded. I don't know of any customers that want to
modify their user applications to selectively enable TOE.


>
> What do you intend the policy model to be for enabling TOE, in general?  That
> if the TOE capability is available and the TOE capability is enabled, all TCP
> sockets via the enabled interface will be offloaded unless SO_NO_OFFLOAD is
> set?

That is the current usage model. At some point we may tie it into
pf/ipf/ipfw to provide for offload policy. However, currently we
offload all connections on an interface until we run out of TCAM
entries.

>Are you currently anticipating enabling the capability by default?

Yes, *if the TOE driver is loaded*.


>
> Nitpick on style:
>
> +       /* disconnect offload device, if any */
>
> Comments that are sentences should begin with an upper case letter and end
> with a period.  :-)  And another one:
>
> +#ifdef TCP_OFFLOAD_DISABLE
> +#define TOEPCB_SET(sc) (0)
> +#else
> +#define TOEPCB_SET(sc) ((sc)->sc_toepcb != NULL)
> +#endif
> +
> +
>
> Too many blank lines there.
>
> I've noticed that in quite a few places, we prefer X_ISSET() to test for a set
> flag or other condition, in order to prevent confusion with X_SET() assigning
> a value.  I think that's pretty sensible, and should do it more myself, so
> encourage you to do the same

That sounds reasonable.


 -Kip



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?b1fa29170712151010h5d977bb8k8584ed5b6b804c1c>