Date:      Fri, 26 Nov 2010 13:18:40 GMT
From:      Mykola Zubach <zuborg@gmail.com>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   kern/152599: scheduler issue - cpu overusage by 'intr' kernel thread
Message-ID:  <201011261318.oAQDIeHh040520@red.freebsd.org>
Resent-Message-ID: <201011261320.oAQDK6Gn084842@freefall.freebsd.org>

>Number:         152599
>Category:       kern
>Synopsis:       scheduler issue - cpu overusage by 'intr' kernel thread
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Nov 26 13:20:06 UTC 2010
>Closed-Date:
>Last-Modified:
>Originator:     Mykola Zubach
>Release:        8.1-RELEASE
>Organization:
AdvancedHosters.com
>Environment:
FreeBSD DS1102 8.1-RELEASE-p1 FreeBSD 8.1-RELEASE-p1 #0: Mon Oct 18 11:31:13 UTC 2010     root@DS1124:/usr/obj/usr/src/sys/Z-AMD64  amd64
>Description:
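Compare this system load, with pid 11 (the 'intr' kernel process) and pid 39283 (nginx) allowed to run on any of the four CPUs: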

# cpuset -g -p 11
pid 11 mask: 0, 1, 2, 3
# cpuset -g -p 39283
pid 39283 mask: 0, 1, 2, 3

last pid: 46295;  load averages:  0.02,  0.03,  0.02     up 20+05:19:55  12:56:31
126 processes: 5 running, 95 sleeping, 26 waiting
CPU:  2.7% user,  0.0% nice,  6.9% system,  3.4% interrupt, 86.9% idle
Mem: 112M Active, 13G Inact, 2114M Wired, 486M Cache, 1645M Buf, 96M Free
Swap: 2048M Total, 72K Used, 2048M Free
  PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
   10 root     171 ki31     0K    64K CPU2    2 438.6H 99.66% {idle: cpu2}
   10 root     171 ki31     0K    64K RUN     0 332.7H 96.48% {idle: cpu0}
   10 root     171 ki31     0K    64K RUN     3 431.1H 91.06% {idle: cpu3}
   10 root     171 ki31     0K    64K CPU1    1 444.4H 84.67% {idle: cpu1}
39283 www       53    0 78948K 72184K kqread  1   1:15 20.31% nginx
   11 root     -44    -     0K   416K CPU0    0 159.4H  5.81% {swi1: netisr 0}

and this one, with both processes bound to CPU 0:

# cpuset -g -p 11
pid 11 mask: 0
# cpuset -g -p 39283
pid 39283 mask: 0

last pid: 47792;  load averages:  1.12,  0.78,  0.59     up 20+05:26:59  13:03:35
132 processes: 8 running, 100 sleeping, 24 waiting
CPU:  0.3% user,  0.0% nice,  2.1% system,  0.4% interrupt, 97.2% idle
Mem: 115M Active, 13G Inact, 2121M Wired, 649M Cache, 1644M Buf, 96M Free
Swap: 2048M Total, 80K Used, 2048M Free
  PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
   10 root     171 ki31     0K    64K CPU0    0 332.7H 100.00% {idle: cpu0}
   10 root     171 ki31     0K    64K CPU1    1 444.6H 99.37% {idle: cpu1}
   10 root     171 ki31     0K    64K CPU3    3 431.3H 98.44% {idle: cpu3}
   10 root     171 ki31     0K    64K RUN     2 438.8H 98.10% {idle: cpu2}
39283 www       46    0 84068K 76412K kqread  0   2:17  3.61% nginx
    3 root      -8    -     0K    16K -       3 189:16  0.83% g_up
   11 root     -44    -     0K   416K WAIT    0 159.5H  0.49% {swi1: netisr 0}
    4 root      -8    -     0K    16K -       3 102:07  0.24% g_down
    6 root      44    -     0K    16K psleep  3  84:08  0.10% pagedaemon
   19 root      -8    -     0K    16K m:w1    2  76:08  0.05% g_mirror gm1

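CPU usage is significantly lower in the second case (about 3% non-idle versus about 13%), even though the load average is higher. nginx drops from 20.31% to 3.61% WCPU, and the netisr thread from 5.81% to 0.49%.
The only difference is the CPU binding of processes 11 and 39283.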
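
The non-idle figures for the two cases can be rechecked from the top(1) "CPU:" summary lines. The following is only a minimal sketch; the helper name busy_pct is hypothetical, and it assumes the summary line has the same layout as in the output pasted above.

```shell
#!/bin/sh
# busy_pct (hypothetical helper): read a top(1) "CPU:" summary line on stdin
# and print the non-idle share, computed as 100 minus the idle percentage.
busy_pct() {
    awk '/^CPU:/ {
        for (i = 2; i < NF; i++)
            if ($(i + 1) ~ /^idle/) {
                v = $i
                sub(/%/, "", v)          # strip the trailing percent sign
                printf "%.1f\n", 100 - v
            }
    }'
}

# The two summary lines from this report (unbound case, then bound case).
echo 'CPU:  2.7% user,  0.0% nice,  6.9% system,  3.4% interrupt, 86.9% idle' | busy_pct
echo 'CPU:  0.3% user,  0.0% nice,  2.1% system,  0.4% interrupt, 97.2% idle' | busy_pct
```

With the lines above this prints 13.1 for the unbound case and 2.8 for the bound case.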

This looks like a CPU cache coherence issue.
I am not sure whether scheduling can be improved in the general case, but it appears to be more efficient to bind at least the 'intr' kernel process to a single core and, perhaps, to apply some special scheduling to processes in the 'kqread' state.

Bandwidth at the moment is 650+650=1300 Mbit/s in total (the server has em0 and em1, both using POLLING).
nginx uses kqueue and aio+sendfile; the kernel is built with ZERO_COPY_SOCKETS.

On some servers, binding pid 11 to CPU 0 reduces CPU usage from 40-80% to a few percent.
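
For reference, the binding described above can be applied with cpuset(1) roughly as follows. This is a sketch, not a recommended fix: the wrapper bind_pid and the DRY_RUN switch are hypothetical, and pids 11 and 39283 are the ones from this particular system and will differ elsewhere. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# DRY_RUN (hypothetical switch): 1 = only print the commands, 0 = run them.
DRY_RUN=${DRY_RUN:-1}

# bind_pid (hypothetical wrapper): restrict process $2 to CPU $1 using
# cpuset(1), then show the resulting mask.
bind_pid() {
    cpu=$1
    pid=$2
    if [ "$DRY_RUN" = 1 ]; then
        echo "cpuset -l $cpu -p $pid"
    else
        cpuset -l "$cpu" -p "$pid" && cpuset -g -p "$pid"
    fi
}

bind_pid 0 11       # the 'intr' kernel process on this system
bind_pid 0 39283    # the nginx worker shown in the top output
```

Running it as root with DRY_RUN=0 applies the binding; the mask can be checked afterwards with "cpuset -g -p 11", as in the output pasted above.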
>How-To-Repeat:

>Fix:


>Release-Note:
>Audit-Trail:
>Unformatted:


