Date:      Sat, 21 Nov 2009 08:10:14 -0900
From:      Mel Flynn <mel.flynn+fbsd.questions@mailing.thruhere.net>
To:        Brett Glass <brett@lariat.net>
Cc:        questions@freebsd.org
Subject:   Re: kern.polling.lost_polls
Message-ID:  <b491a0c45ff8b78fcd75239b31bd1c9b@sbmail.office-on-the.net>
In-Reply-To: <200911210207.TAA21572@lariat.net>
References:  <200911202135.OAA18537@lariat.net> <db2308c2d90148218fcc9209721b9920@sbmail.office-on-the.net> <200911210207.TAA21572@lariat.net>

On Fri, 20 Nov 2009 19:07:42 -0700, Brett Glass <brett@lariat.net> wrote:
> At 06:25 PM 11/20/2009, Mel Flynn wrote:
>
>>So that means that you give the kernel .25 microseconds to poll and act on
>>any pending network IO. That's probably not enough.
>
> I think that you mean ".25 milliseconds," not ".25 microseconds," above.

Yes, sorry. It should be enough, but... it depends on CPU speed and the
number of interfaces. The people on FreeBSD-net can give you better advice,
most notably on whether all 6 interfaces are handled in one poll, so that
each task needs to be completed within 1/HZ/N. I cannot say this with
certainty.
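
To put rough numbers on that 1/HZ/N guess, here is a minimal sketch. It
assumes that polling may use (100 - kern.polling.user_frac) percent of each
tick and that this share is split evenly over N handlers; neither assumption
is confirmed here, and poll_budget_us is a made-up helper name.

```sh
#!/bin/sh
# poll_budget_us HZ USER_FRAC N
# Prints the time budget per handler per tick, in microseconds.
# Assumption (not confirmed in this thread): polling may use
# (100 - USER_FRAC)% of each 1/HZ tick, split evenly over N handlers.
poll_budget_us() {
    awk -v hz="$1" -v uf="$2" -v n="$3" 'BEGIN {
        tick_us = 1000000 / hz                 # length of one tick
        kern_us = tick_us * (100 - uf) / 100   # share left for polling
        printf "%.1f\n", kern_us / n           # even split over n handlers
    }'
}

poll_budget_us 2000 50 1   # prints 250.0 (the whole polling share of a tick)
poll_budget_us 2000 50 6   # prints 41.7  (if six interfaces share one tick)
poll_budget_us 1000 50 6   # prints 83.3  (same, at HZ=1000)
```

With the HZ=2000 and user_frac=50 values quoted in this thread, the first
line is the .25 milliseconds discussed above. The other two only show how
quickly the budget would shrink if the 1/HZ/N guess turns out to be right.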

>>It is further explained by the
>>comment in sys/kern/kern_poll.c:
>>/*
>>  * Hook from hardclock. Tries to schedule a netisr, but keeps track
>>  * of lost ticks due to the previous handler taking too long.
>>  * Normally, this should not happen, because polling handler should
>>  * run for a short time. However, in some cases (e.g. when there are
>>  * changes in link status etc.) the drivers take a very long time
>>  * (even in the order of milliseconds) to reset and reconfigure the
>>  * device, causing apparent lost polls.
>>  *
>>  * The first part of the code is just for debugging purposes, and tries
>>  * to count how often hardclock ticks are shorter than they should,
>>  * meaning either stray interrupts or delayed events.
>>  */
>
> Well, even at HZ=2000, kern.polling.lost_polls and
> kern.polling.suspect are both incrementing, as is kern.polling.stalled:
>
> stargate# sysctl -a | grep polling
> kern.polling.burst: 150
> kern.polling.burst_max: 150
> kern.polling.each_burst: 5
> kern.polling.idle_poll: 0
> kern.polling.user_frac: 50
> kern.polling.reg_frac: 20
> kern.polling.short_ticks: 0
> kern.polling.lost_polls: 41229
> kern.polling.pending_polls: 0
> kern.polling.residual_burst: 0
> kern.polling.handlers: 2

That bugs me: if you have 6 devices, the number of handlers should be 6.
/*
 * Try to register routine for polling. Returns 0 if successful
 * (and polling should be enabled), error code otherwise.
 * A device is not supposed to register itself multiple times.
 *
 * This is called from within the *_ioctl() functions.
 */

Unless this should really read "drivers", but I think it's devices.

> kern.polling.enable: 0
> kern.polling.phase: 0
> kern.polling.suspect: 31653
> kern.polling.stalled: 10
> kern.polling.idlepoll_sleeping: 1
> hw.acpi.thermal.polling_rate: 10
>
> But if I slow the clock down to 1000 Hz, it's unclear if the
> machine will be able to keep up with traffic. I was already getting
> more than 1,000 network interrupts per second before I tried
> polling, and I'm not sure how many packets the interfaces (some
> fxp, some em) can buffer up. I'm going to try it, but if it doesn't
> work I will have to go back to interrupt-driven operation.

If your network architecture allows it, you might be able to bring down
the task load by increasing the MTU and enabling jumbo frames.
From em(4):
     Support for Jumbo Frames is provided via the interface MTU setting.
     Selecting an MTU larger than 1500 bytes with the ifconfig(8) utility
     configures the adapter to receive and transmit Jumbo Frames. The
     maximum MTU size for Jumbo Frames is 16114.
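
To get a feel for how much a larger MTU could cut the per-packet work, here
is a rough sketch. It assumes every packet is a full MTU-sized frame and
ignores header overhead. The 100 Mbit/s load, the 9000-byte MTU, the em0
interface name and the pkts_per_sec helper are made up for illustration, not
taken from your setup.

```sh
#!/bin/sh
# pkts_per_sec MBIT MTU
# Prints the approximate packets per second needed to carry MBIT Mbit/s
# when every packet is exactly MTU bytes (an idealisation).
pkts_per_sec() {
    awk -v mbit="$1" -v mtu="$2" 'BEGIN {
        bytes_per_sec = mbit * 1000000 / 8
        printf "%.0f\n", bytes_per_sec / mtu
    }'
}

pkts_per_sec 100 1500   # prints 8333 (standard Ethernet MTU)
pkts_per_sec 100 9000   # prints 1389 (jumbo MTU, below the em(4) limit)

# Setting it would look something like this (em0 is a placeholder name):
#   ifconfig em0 mtu 9000
```

Real traffic is a mix of packet sizes, so the actual gain would be smaller
than this idealised factor of six.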

-- 
Mel



