Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 26 Jul 2014 22:12:26 -0400
From:      John Jasen <jjasen@gmail.com>
To:        Adrian Chadd <adrian@freebsd.org>
Cc:        FreeBSD Net <freebsd-net@freebsd.org>
Subject:   Re: fastforward/routing: a 3 million packet-per-second system?
Message-ID:  <53D4600A.1010505@gmail.com>
In-Reply-To: <CAJ-Vmok6vyrD-2%2BswCBPP_VeMqH6t7CXuVBPUaDfwyRrb6aiTg@mail.gmail.com>
References:  <53CE80DD.9090109@gmail.com>	<CAACLuR1r0axCYWXeLDSa-m07eAVgTMBVW5sNbt%2By_Lt2ss1r7Q@mail.gmail.com>	<CAJ-Vmonsc79ULDOT9trtOotq7mRh1XJkhL2JfDNxXP16OFWMFg@mail.gmail.com> <CAJ-Vmok6vyrD-2%2BswCBPP_VeMqH6t7CXuVBPUaDfwyRrb6aiTg@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
flowtables enabled. This is a 30 second burst of lock profiling,
processed via:
head -2 debug.lock.pref.stats.out-20140726-1 ; cat
debug.lock.pref.stats.out-20140726-1 | sort -nk4 | tail
debug.lock.prof.stats:

     max  wait_max       total  wait_total       count    avg wait_avg
cnt_hold cnt_lock name
       0         0        4758          31       22444      0      0 
0    103 /usr/src/sys/kern/sched_ule.c:886 (spin mutex:sched lock 11)
       1         1         770          45        2833      0      0 
0    115 /usr/src/sys/kern/sched_ule.c:886 (spin mutex:sched lock 6)
       3        16          50          54          30      1      1 
0      8 /usr/src/sys/dev/usb/usb_device.c:2783 (sleep mutex:Giant)
       0        14           5          61          28      0      2 
0     13 /usr/src/sys/dev/usb/usb_device.c:2758 (sleep mutex:Giant)
       1         1        3219          68       10452      0      0 
0    161 /usr/src/sys/kern/sched_ule.c:886 (spin mutex:sched lock 7)
      12         4         446         100         154      2      0 
0     46 /usr/src/sys/kern/kern_condvar.c:145 (sleep mutex:Giant)
       1         0       84164         976      405264      0      0 
0   2681 /usr/src/sys/kern/subr_turnstile.c:552 (spin mutex:turnstile chain)
       4         3    12812430      262856    68940690      0      0  0
546014 /usr/src/sys/net/route.c:439 (sleep mutex:rtentry)
       3         4     4683374     1274675    68940690      0      0  0
3433615 /usr/src/sys/netinet/ip_fastfwd.c:593 (sleep mutex:rtentry)
       5         6     3513959     2106043    68940690      0      0  0
7509422 /usr/src/sys/netinet/in_rmx.c:114 (sleep mutex:rtentry)

The high level overview of pmc output is also attached.

Profile trace for function: __rw_rlock() [19.54%]
Profile trace for function: _rw_runlock_cookie() [9.25%]
Profile trace for function: __mtx_unlock_flags() [5.97%]
Profile trace for function: memcpy() [5.68%]
Profile trace for function: bcopy() [5.53%]
Profile trace for function: bcmp() [5.09%]
Profile trace for function: _mtx_lock_spin_cookie() [4.77%]
Profile trace for function: ip_fastforward() [4.27%]
Profile trace for function: t4_eth_tx() [3.37%]
Profile trace for function: rn_match() [2.92%]
Profile trace for function: ether_output() [2.79%]
Profile trace for function: rtalloc1_fib() [2.35%]
Profile trace for function: get_scatter_segment() [1.68%]
Profile trace for function: uma_zalloc_arg() [1.65%]
Profile trace for function: rtfree() [1.58%]
Profile trace for function: bzero() [1.54%]
Profile trace for function: ether_nh_input() [1.47%]
Profile trace for function: cxgbe_transmit() [1.39%]
Profile trace for function: service_iq() [1.26%]
Profile trace for function: rtalloc_ign_fib() [1.21%]
Profile trace for function: __mtx_lock_sleep() [1.12%]
Profile trace for function: arpresolve() [1.11%]
Profile trace for function: uma_zfree_arg() [1.02%]
Profile trace for function: reclaim_tx_descs() [0.92%]
Profile trace for function: _mtx_trylock_flags_() [0.77%]
Profile trace for function: bounce_bus_dmamap_load_buffer() [0.76%]
Profile trace for function: ether_demux() [0.72%]
Profile trace for function: mb_ctor_mbuf() [0.59%]
Profile trace for function: in_localip() [0.51%]

As usual, to save everyone's mail INBOX, the full output is available on
request.

Thanks again!




On 07/25/2014 04:53 PM, Adrian Chadd wrote:
> Yeah:
>
> Adrians-MacBook-Pro:Downloads adrian$ head -2
> debug.lock.prof.stats.out.20140725  ; cat
> debug.lock.prof.stats.out.20140725 | sort -nk4 | tail -10
>
> debug.lock.prof.stats:
>
>      max  wait_max       total  wait_total       count    avg wait_avg
> cnt_hold cnt_lock name
>
>        6         3         419         145         160      2      0
> 0     63 /usr/src/sys/kern/kern_condvar.c:145 (sleep mutex:Giant)
>
>      282       133         991         215           8    123     26
> 0      2 /usr/src/sys/modules/cxgbe/if_cxgbe/../../../dev/cxgbe/t4_main.c:6657
> (sleep mutex:cxl3 txq26)
>
>       69        72          71         250           5     14     50
> 0      4 /usr/src/sys/modules/cxgbe/if_cxgbe/../../../dev/cxgbe/t4_main.c:6657
> (sleep mutex:cxl1 txq37)
>
>      281       197        1638         286          13    126     22
> 0      2 /usr/src/sys/modules/cxgbe/if_cxgbe/../../../dev/cxgbe/t4_main.c:6657
> (sleep mutex:cxl1 txq46)
>
>      351       182        2416         499          38     63     13
> 0     10 /usr/src/sys/modules/cxgbe/if_cxgbe/../../../dev/cxgbe/t4_main.c:6657
> (sleep mutex:cxl3 txq17)
>
>      276       193         802         643          10     80     64
> 0      5 /usr/src/sys/modules/cxgbe/if_cxgbe/../../../dev/cxgbe/t4_main.c:6657
> (sleep mutex:cxl3 txq27)
>
>        0         1       98578        1341      482441      0      0
> 0   3767 /usr/src/sys/kern/subr_turnstile.c:552 (spin mutex:turnstile
> chain)
>
>        7        13    11543138      470545    63952832      0      0
> 0 815777 /usr/src/sys/net/route.c:439 (sleep mutex:rtentry)
>
>        6        15     3943582     1545195    63952779      0      0
> 0 3439254 /usr/src/sys/netinet/ip_fastfwd.c:593 (sleep mutex:rtentry)
>
>        7        17     3271389     2258698    63952832      0      0
> 0 6761237 /usr/src/sys/netinet/in_rmx.c:114 (sleep mutex:rtentry)
>
> .. try FLOWTABLE.
>
> The in_rmx.c is the hook to check for temporary routes installed by
> redirect ICMP messages. It's .. not very pretty. Just use FLOWTABLE
> for now and see if it improves things.
>
> (Yes, we likely can do better on the rtentry locking..)
>
>
> -a
>
>
> On 25 July 2014 13:51, Adrian Chadd <adrian@freebsd.org> wrote:
>> Ugh, the forwarding table stupidity. Try enabling FLOWTABLE as an option.
>>
>> I really dislike how the rtentry locking works. But that isn't a
>> rwlock - i'll have to look at your full lock profiling output to see.
>>
>>
>> -a
>>
>>
>> On 25 July 2014 09:20, John Jasen <jjasen@gmail.com> wrote:
>>> Based on advice I received, I've collected lock profile debugging output,
>>> and pmcannotate'd data from the system while it was processing about 3
>>> million packets/second.
>>>
>>> Combined, the files are about 325k in size, so I'll submit highlights here.
>>> I can provide the raw files to interested parties privately.
>>>
>>> pmcannotate summary output:
>>>
>>> grep ^Profile pmcannotate.20140725
>>> Profile trace for function: __rw_rlock() [17.04%]
>>> Profile trace for function: __mtx_unlock_flags() [9.10%]
>>> Profile trace for function: _rw_runlock_cookie() [7.67%]
>>> Profile trace for function: sched_idletd() [5.73%]
>>> Profile trace for function: memcpy() [5.64%]
>>> Profile trace for function: bcopy() [5.04%]
>>> Profile trace for function: bcmp() [5.01%]
>>> Profile trace for function: __mtx_lock_flags() [3.66%]
>>> Profile trace for function: t4_eth_tx() [3.25%]
>>> Profile trace for function: lock_profile_release_lock() [2.73%]
>>> Profile trace for function: ip_fastforward() [2.68%]
>>> Profile trace for function: ether_output() [2.50%]
>>> Profile trace for function: get_scatter_segment() [1.75%]
>>> Profile trace for function: rn_match() [1.74%]
>>> Profile trace for function: _mtx_lock_spin_cookie() [1.53%]
>>> Profile trace for function: lock_profile_obtain_lock_success() [1.49%]
>>> Profile trace for function: cxgbe_transmit() [1.37%]
>>> Profile trace for function: uma_zalloc_arg() [1.31%]
>>> Profile trace for function: bzero() [1.30%]
>>> Profile trace for function: service_iq() [1.26%]
>>> Profile trace for function: ether_nh_input() [1.23%]
>>> Profile trace for function: __mtx_lock_sleep() [1.19%]
>>> Profile trace for function: arpresolve() [1.07%]
>>> Profile trace for function: uma_zfree_arg() [0.95%]
>>> Profile trace for function: reclaim_tx_descs() [0.87%]
>>> Profile trace for function: _mtx_trylock_flags_() [0.80%]
>>> Profile trace for function: bounce_bus_dmamap_load_buffer() [0.72%]
>>> Profile trace for function: ether_demux() [0.64%]
>>> Profile trace for function: mb_ctor_mbuf() [0.63%]
>>> Profile trace for function: rtalloc1_fib() [0.54%]
>>>
>>> sysctl debug.lock.prof.stats summary: (some of the highest hit counts,
>>> especially in cnt_hold:
>>>
>>>        7        17     3271389     2258698    63952832      0      0  0
>>> 6761237 /usr/src/sys/netinet/in_rmx.c:114 (sleep mutex:rtentry)
>>>
>>>        7        13    11543138      470545    63952832      0      0  0
>>> 815777 /usr/src/sys/net/route.c:439 (sleep mutex:rtentry)
>>>
>>>        6        15     3943582     1545195    63952779      0      0  0
>>> 3439254 /usr/src/sys/netinet/ip_fastfwd.c:593 (sleep mutex:rtentry
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Jul 22, 2014 at 11:18 AM, John Jasen <jjasen@gmail.com> wrote:
>>>
>>>> Feedback and/or tips and tricks more than welcome.
>>>>
>>> <snip>
>>> _______________________________________________
>>> freebsd-net@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?53D4600A.1010505>