Date:      Fri, 11 Apr 2014 22:42:26 -0400
From:      Patrick Kelsey <kelsey@ieee.org>
To:        hiren panchasara <hiren.panchasara@gmail.com>
Cc:        "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, Adrian Chadd <adrian@freebsd.org>
Subject:   Re: netisr observations
Message-ID:  <CAD44qMVqm0RA0vqdvVEsh2EOJuUQpTyAxFF7kCqs4T2USFrP9Q@mail.gmail.com>
In-Reply-To: <CALCpEUFVtcs9HhEcZ=AEvEudpsc-=QyXjMg8MXFFdLiUHUf-kQ@mail.gmail.com>
References:  <CALCpEUHhUkZ9b=2ynaN5-MkxOObs+O4RTsUhmhcMeC-WDnAxKg@mail.gmail.com> <CAJ-Vmo=TUVwuoWJeTCYvC-2sYvLRh+evACukS+aNHOaz9hwkrA@mail.gmail.com> <CAD44qMUVLLw0UNTgaTZ74=Ktq46ROT9E+ssrHznHPhqujScBkA@mail.gmail.com> <CALCpEUFVtcs9HhEcZ=AEvEudpsc-=QyXjMg8MXFFdLiUHUf-kQ@mail.gmail.com>

On Fri, Apr 11, 2014 at 8:23 PM, hiren panchasara <hiren.panchasara@gmail.com> wrote:

> On Fri, Apr 11, 2014 at 11:30 AM, Patrick Kelsey <kelsey@ieee.org> wrote:
> >
> > The output of netstat -Q shows IP dispatch is set to default, which is
> > direct (NETISR_DISPATCH_DIRECT).  That means each IP packet will be
> > processed on the same CPU that the Ethernet processing for that packet
> > was performed on, so CPU selection for IP packets will not be based on
> > flowid.  The output of netstat -Q shows Ethernet dispatch is set to
> > direct (NETISR_DISPATCH_DIRECT if you wind up reading the code), so
> > the Ethernet processing for each packet will take place on the same
> > CPU that the driver receives that packet on.
> >
> > For the igb driver with queues autoconfigured and msix enabled, as the
> > sysctl output shows you have, the driver will create a number of
> > queues subject to device limitations, msix message limitations, and
> > the number of CPUs in the system, establish a separate interrupt
> > handler for each one, and bind each of those interrupt handlers to a
> > separate CPU.  It also creates a separate single-threaded taskqueue
> > for each queue.  Each queue interrupt handler sends work to its
> > associated taskqueue when the interrupt fires.  Those taskqueues are
> > where the Ethernet packets are received and processed by the driver.
> > The question is where those taskqueue threads will be run.  I don't
> > see anything in the driver that makes an attempt to bind those
> > taskqueue threads to specific CPUs, so really the location of all of
> > the packet processing is up to the scheduler (i.e., arbitrary).
> >
> > The summary is:
> >
> > 1. the hardware schedules each received packet to one of its queues
> > and raises the interrupt for that queue
> > 2. that queue interrupt is serviced on the same CPU all the time,
> > which is different from the CPUs for all other queues on that
> > interface
> > 3. the interrupt handler notifies the corresponding task queue, which
> > runs its task in a thread on whatever CPU the scheduler chooses
> > 4. that task dispatches the packet for Ethernet processing via netisr,
> > which processes it on whatever the current CPU is
> > 5. Ethernet processing dispatches that packet for IP processing via
> > netisr, which processes it on whatever the current CPU is
>
> I really appreciate you taking the time to explain this. Thank you.
>

Sure thing.  I've had my head in the netisr code frequently lately, and
it's nice to be able to share :)
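
Since the flowid-to-CPU mapping comes up again below, here is roughly
what that logic looks like.  This is a from-memory paraphrase of
netisr_select_cpuid() and netisr_default_flow2cpu() in
sys/net/netisr.c, with locking and the non-flowid policy cases
trimmed, so double-check it against the tree before relying on it:

/* nws_array[] holds the CPU IDs of the active netisr workstreams. */
static u_int
netisr_default_flow2cpu(u_int flowid)
{

	return (nws_array[flowid % nws_count]);
}

/* ...inside netisr_select_cpuid(), for a queued/deferred packet: */
	if (m->m_flags & M_FLOWID) {
		/* igb stamps the RX queue number into m_pkthdr.flowid,
		 * so every packet from one queue maps to one CPU. */
		*cpuidp = netisr_default_flow2cpu(m->m_pkthdr.flowid);
		return (m);
	}
	/* No flowid: fall back to the protocol's np_m2cpuid method. */
	return (npp->np_m2cpuid(m, source, cpuidp));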


>
> I am especially confused by the ip "Queued" column from netstat -Q
> showing 203888563 only for cpu3. Does this mean that cpu3 queues
> everything and then distributes it among the other cpus? Where, out of
> the 5 stages you mentioned above, does this queuing on cpu3 happen?
>
> This value gets populated in the snwp->snw_queued field for each cpu
> inside sysctl_netisr_work().
>

The way your system is configured, all inbound packets are being
direct-dispatched.  Those packets will bump the dispatched and handled
counters, but not the queued counter.  The queued counter only gets bumped
when something is queued to a netisr thread.  You can figure out where that
is happening, despite everything apparently being configured for direct
dispatch, by looking at where netisr_queue() and netisr_queue_src() are
called from.  netisr_queue() is called during IPv6 forwarding and
output, IPv4 output when the destination is a local address, gre
processing, routing socket processing, and if_simloop() (which is
called to loop back multicast packets, for example)...
netisr_queue_src() is called during ipsec and divert processing.
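
If it helps to see which counter moves on which path, here is a
from-memory paraphrase of the relevant bits of sys/net/netisr.c (the
names are real, but surrounding locking and signalling are elided):

/* Direct dispatch, in netisr_dispatch_src() -- your current config: */
	netisr_proto[proto].np_handler(m);	/* protocol runs in place */
	npwp->nw_handled++;
	npwp->nw_dispatched++;			/* "Dispatched" counter */

/* Queued path, in netisr_queue_workstream(), reached only via
 * netisr_queue()/netisr_queue_src(): */
	if (npwp->nw_len < npwp->nw_qlimit) {
		/* ...append m to this workstream's queue... */
		npwp->nw_queued++;		/* the "Queued" you're seeing */
	} else {
		m_freem(m);
		npwp->nw_qdrops++;		/* queue-overflow drop */
	}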

One thing to consider also when thinking about what the netisr per-cpu
counters represent is that netisr really maintains per-cpu workstream
context, not per-netisr-thread.  Direct-dispatched packets contribute to
the statistics of the workstream context of whichever CPU they are being
direct-dispatched on.  Packets handled by a netisr thread contribute to the
statistics of the workstream context of the CPU it was created for, whether
or not it was bound to, or is currently running on, that CPU.  So when you
look at the statistics in netstat -Q output for CPU 3, dispatched is the
number of packets direct-dispatched on CPU 3, queued is the number of
packets queued to the netisr thread associated with CPU 3 (but that may be
running all over the place if net.isr.bindthreads is 0), and handled is the
number of packets processed directly on CPU 3 or in the netisr thread
associated with CPU 3.
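
You can see that keying in the reporting loop itself; paraphrasing,
again from memory, the loop in sysctl_netisr_work() that you already
found (the ... marks fields I've left out):

	CPU_FOREACH(cpuid) {
		/* The workstream is per-CPU (DPCPU) state, looked up by
		 * CPU ID, not by asking where any thread is running. */
		nwsp = DPCPU_ID_PTR(cpuid, nws);
		if (nwsp->nws_intr_event == NULL)
			continue;
		for (proto = 0; proto < NETISR_MAXPROT; proto++) {
			nwp = &nwsp->nws_work[proto];
			...
			snwp->snw_wsid = cpuid;	/* the "CPU" column */
			snwp->snw_queued = nwp->nw_queued;
			snwp->snw_handled = nwp->nw_handled;
			...
		}
	}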



>
> >
> > You might want to try changing the default netisr dispatch policy to
> > 'deferred' (sysctl net.isr.dispatch=deferred).  If you do that, the
> > Ethernet processing will still happen on an arbitrary CPU chosen by
> > the scheduler, but the IP processing should then get mapped to a CPU
> > based on the flowid assigned by the driver.  Since igb assigns
> > flowids based on receive queue number, all IP (and above) processing
> > for that packet should then be performed on the same CPU the queue
> > interrupt was bound to.
>
> I will give this a try and see how things behave.
>
> I was also thinking about net.isr.bindthreads. netisr_start_swi() does
> intr_event_bind() if we have bindthreads set to 1. What would that
> gain me, if anything?
>
>
That's a good point.  If you move to deferred dispatch and bind the
threads, then you keep the interrupt processing and IP-and-above protocol
processing for packets from a given igb queue on the same CPU always.  If
you don't bind the netisr threads, then all IP-and-above protocol
processing for packets from a given igb queue will always happen in the
same netisr thread and you will get whatever locality benefits the
scheduler manages to give you.  I think the choice depends on what else you
have going on in the system and what your priorities are.  Binding the
netisr threads will get you the best locality benefits for input packet
processing, but might create hot-spot problems if you have other system
activities that you want bound to CPUs in an overlapping set.  Not
binding the netisr threads probably gives up some locality benefits in
packet processing, but the scheduler can move the network processing work
away from other workloads you might have bound to some CPUs (and might care
more about getting the locality benefit).
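
If you want to experiment with that combination, note that
net.isr.bindthreads (like net.isr.maxthreads) is a read-only tunable,
so it has to be set from the loader, while the dispatch policy can be
flipped at runtime.  Something like:

# /boot/loader.conf -- takes effect at next boot
net.isr.bindthreads=1

# at runtime (add to /etc/sysctl.conf to persist across reboots):
sysctl net.isr.dispatch=deferred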


> Would it stop moving intr{swi1: netisr 3} onto different cpus (as I am
> seeing in 'top' output) and bind it to a single cpu?
>
>
Yes, it would.
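
For reference, the binding happens once at startup.  Paraphrasing
netisr_start_swi() in sys/net/netisr.c from memory, with the error
handling trimmed:

	error = swi_add(&nwsp->nws_intr_event, "netisr", swi_net, nwsp,
	    SWI_NET, INTR_TYPE_NET | INTR_MPSAFE, &nwsp->nws_swi_cookie);
	...
	if (netisr_bindthreads) {
		/* Pin this CPU's swi thread to its CPU for good. */
		error = intr_event_bind(nwsp->nws_intr_event, cpuid);
		...
	}

With bindthreads=0 that intr_event_bind() call never happens, and the
{swi1: netisr N} threads are ordinary kernel threads the scheduler is
free to migrate, which is the movement you're seeing in top.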


> I've come across a thread discussing some side-effects of this, though:
> http://lists.freebsd.org/pipermail/freebsd-hackers/2012-January/037597.html
>
>
Looks like the suggested fix was incorporated into the kernel about a month
after that thread (so, 2 years ago) in r230984 (
http://svnweb.freebsd.org/base?view=revision&revision=230984).  That's in
10-stable as well as -current.

-Patrick


