Date:      Sat, 15 Nov 2008 04:59:16 -0800 (PST)
From:      Won De Erick <won.derick@yahoo.com>
To:        rwatson@freebsd.org, freebsd-hackers@freebsd.org
Subject:   NET.ISR and CPU utilization performance w/ HP DL 585 using FreeBSD 7.1 Beta2
Message-ID:  <557765.55617.qm@web45804.mail.sp1.yahoo.com>

Hello,

I tested an HP DL 585 (16 CPUs, with built-in Broadcom NICs) running FreeBSD 7.1-BETA2 under heavy TCP network traffic.

SCENARIO A : Bombarded w/ TCP traffic:

When net.isr.direct=1,

  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
   52 root        1 -68    -     0K    16K CPU11  b  38:43 95.36% irq32: bce1
   51 root        1 -68    -     0K    16K CPU10  a  25:50 85.16% irq31: bce0
   16 root        1 171 ki31     0K    16K RUN    a  65:39 15.97% idle: cpu10
   28 root        1 -32    -     0K    16K WAIT   8  12:28  5.18% swi4: clock sio
   15 root        1 171 ki31     0K    16K RUN    b  52:46  3.76% idle: cpu11
   45 root        1 -64    -     0K    16K WAIT   7   7:29  1.17% irq17: uhci0
   47 root        1 -64    -     0K    16K WAIT   6   1:11  0.10% irq16: ciss0
   27 root        1 -44    -     0K    16K WAIT   0  28:52  0.00% swi1: net

When net.isr.direct=0,

   16 root        1 171 ki31     0K    16K CPU10  a 106:46 92.58% idle: cpu10
   19 root        1 171 ki31     0K    16K CPU7   7 133:37 89.16% idle: cpu7
   27 root        1 -44    -     0K    16K WAIT   0  52:20 76.37% swi1: net
   25 root        1 171 ki31     0K    16K RUN    1 132:30 70.26% idle: cpu1
   26 root        1 171 ki31     0K    16K CPU0   0 111:58 64.36% idle: cpu0
   15 root        1 171 ki31     0K    16K CPU11  b  81:09 57.76% idle: cpu11
   52 root        1 -68    -     0K    16K WAIT   b  64:00 42.97% irq32: bce1
   51 root        1 -68    -     0K    16K WAIT   a  38:22 12.26% irq31: bce0
   45 root        1 -64    -     0K    16K WAIT   7  11:31 12.06% irq17: uhci0
   47 root        1 -64    -     0K    16K WAIT   6   1:54  3.66% irq16: ciss0
   28 root        1 -32    -     0K    16K WAIT   8  16:01  0.00% swi4: clock sio

Overall CPU utilization has dropped significantly, but I noticed that swi1: net is now running on CPU0 with high utilization when net.isr.direct=0.
What does this mean?
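
For reference, I toggle the setting at runtime roughly like this (my understanding is that net.isr.direct=1 processes inbound packets directly in the interrupt thread context, while net.isr.direct=0 queues them to the netisr software interrupt thread, swi1: net):

    # dispatch inbound packets directly from the interrupt/driver context
    sysctl net.isr.direct=1
    # queue inbound packets to the netisr software interrupt thread (swi1: net)
    sysctl net.isr.direct=0
    # make the setting persist across reboots
    echo 'net.isr.direct=0' >> /etc/sysctl.conf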

SCENARIO B : Bombarded w/ more TCP traffic:

Worse, the box became unresponsive (it cannot be pinged and is inaccessible over SSH) after more traffic was added while keeping net.isr.direct=0.
This may be due to the 100% utilization of CPU0 by swi1: net (see the first line of the result below). Based on the results, the bce interrupt threads and swi1 also seem to race each other when net.isr.direct=1.
The rest of the CPUs are sitting pretty (100% idle). Can you shed some light on this?

When net.isr.direct=0:
   27 root        1 -44    -     0K    16K CPU0   0   5:45 100.00% swi1: net
   11 root        1 171 ki31     0K    16K CPU15  0   0:00 100.00% idle: cpu15
   13 root        1 171 ki31     0K    16K CPU13  0   0:00 100.00% idle: cpu13
   17 root        1 171 ki31     0K    16K CPU9   0   0:00 100.00% idle: cpu9
   18 root        1 171 ki31     0K    16K CPU8   0   0:00 100.00% idle: cpu8
   21 root        1 171 ki31     0K    16K CPU5   5 146:17 99.17% idle: cpu5
   22 root        1 171 ki31     0K    16K CPU4   4 146:17 99.07% idle: cpu4
   14 root        1 171 ki31     0K    16K CPU12  0   0:00 99.07% idle: cpu12
   16 root        1 171 ki31     0K    16K CPU10  a 109:33 98.88% idle: cpu10
   15 root        1 171 ki31     0K    16K CPU11  b  86:36 93.55% idle: cpu11
   52 root        1 -68    -     0K    16K WAIT   b  59:42 13.87% irq32: bce1

When net.isr.direct=1,
   52 root        1 -68    -     0K    16K CPU11  b  55:04 97.66% irq32: bce1
   51 root        1 -68    -     0K    16K CPU10  a  33:52 73.88% irq31: bce0
   16 root        1 171 ki31     0K    16K RUN    a 102:42 26.86% idle: cpu10
   15 root        1 171 ki31     0K    16K RUN    b  81:20  3.17% idle: cpu11
   28 root        1 -32    -     0K    16K WAIT   e  13:40  0.00% swi4: clock sio
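
In case it matters, the listings above are from top with system/kernel threads shown; to cross-check how the interrupt load is spread I can also look at the per-device interrupt counters (rough sketch, output trimmed):

    # show system/kernel threads in top (roughly what produced the listings above)
    top -SH
    # per-device interrupt totals and rates, e.g. for irq31: bce0 and irq32: bce1
    vmstat -i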

With regard to bandwidth, in all the scenarios above the measured throughput is extremely low (several hundred Mb/s was expected). Why?

  -         iface                   Rx                   Tx                Total
  ==============================================================================
             bce0:           4.69 Mb/s           10.49 Mb/s           15.18 Mb/s
             bce1:          20.66 Mb/s            4.68 Mb/s           25.34 Mb/s
              lo0:           0.00  b/s            0.00  b/s            0.00  b/s
  ------------------------------------------------------------------------------
            total:          25.35 Mb/s           15.17 Mb/s           40.52 Mb/s
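
If it helps narrow this down, these are the counters I can check as well (a rough sketch; I can post the actual output if useful):

    # per-interface packet, error, and drop counters for bce0/bce1
    netstat -i
    # mbuf and mbuf cluster usage, to check for mbuf exhaustion under load
    netstat -m
    # TCP statistics (retransmits, drops due to full socket buffers, etc.)
    netstat -s -p tcp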


Thanks,

Won
