Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 19 Jan 2013 17:19:19 -0800
From:      Adrian Chadd <adrian@freebsd.org>
To:        John Baldwin <jhb@freebsd.org>
Cc:        Barney Cordoba <barney_cordoba@yahoo.com>, Luigi Rizzo <rizzo@iet.unipi.it>, freebsd-net@freebsd.org
Subject:   Re: two problems in dev/e1000/if_lem.c::lem_handle_rxtx()
Message-ID:  <CAJ-Vmomd1ivZjWdiC8_O1qwim2dctq1o%2By5=UH2eivU4NdCOAQ@mail.gmail.com>
In-Reply-To: <201301191114.29959.jhb@freebsd.org>
References:  <1358610450.75691.YahooMailClassic@web121604.mail.ne1.yahoo.com> <201301191114.29959.jhb@freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On 19 January 2013 08:14, John Baldwin <jhb@freebsd.org> wrote:

> However, I did describe an alternate setup where you can fix this.  Part of
> the key is to get various NICs to share a single logical queue of tasks.  You
> could simulate this now by having all the deferred tasks share a single
> taskqueue with a pool of tasks, but that will still not fully cooperate with
> ithreads.  To do that you have to get the interrupt handlers themselves into
> the shared taskqueue.  Some changes I have in a p4 branch allow you to do that
> by letting interrupt handlers reschedule themselves (avoiding the need for a
> separate task and preventing the task from running concurrently with the
> interrupt handler) and providing some (but not yet all) of the framework to
> allow multiple devices to share a single work queue backed by a shared pool of
> threads.

How would that work when I want to pin devices to specific cores?

We at ${WORK} developed/acquired/etc a company that makes network
processors that are highly threaded, where you want to pin specific
things to specific CPUs.
If you just push network device handling to a pool of threads without
allowing for pinning, you'll end up with some very, very poor
behaviour.

Windows, for example, complains loudly (read: BSODs saying your driver
is buggy) if your tasklets take too much CPU to run without
preempting. So at ${WORK}, we do yield RX processing after a (fair)
while.

Maybe we do want a way to allow the RX taskqueue to yield itself in a
way that (a) let's us re-schedule it, and (b) tells the taskqueue to
actually yield after this point and let other things have a go.

Barney - yes I think processing 100 packets each time through the
loop, on a gige interface, is a bit silly. My point was specifically
about how to avoid livelock without introducing artificial delays in
waiting for the next mitigated interrupt to occur (because you don't
necessarily get another interrupt when you re-enable things, depending
upon what the hardware is doing / how buggy your driver is.) Ideally
you'd set some hard limit on how much CPU time the task takes before
it yields, so you specifically avoid livelock under DoS conditions.

Actually, one thing I did at a previous job, many years ago now, was
to do weighted random / tail dropping of frames in the driver RX
handling itself, rather than having it go up to the stack and take all
the extra CPU to process things. Again, my suggestion is how to avoid
livelock under highly stressful conditions, rather than just going
down the path of polling (for example.)




Adrian



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAJ-Vmomd1ivZjWdiC8_O1qwim2dctq1o%2By5=UH2eivU4NdCOAQ>