Date:      Sun, 7 Aug 2016 02:05:09 +0700
From:      Eugene Grosbein <eugen@grosbein.net>
To:        "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Subject:   Re: 40Gbps http client benchmark
Message-ID:  <57A634E5.9060702@grosbein.net>
In-Reply-To: <57A62668.7020309@grosbein.net>
References:  <57A62668.7020309@grosbein.net>

07.08.2016 1:03, Eugene Grosbein wrote:
> Hi!
>
> Is there any high-performance benchmark acting as an http client for an external http server
> capable of receiving 40Gbps without overwhelming the CPU with an insane number of syscalls?
>
> I've tried benchmarks/wrk version 4.0.2 and it works just fine up to 20Gbps
> for my hardware: two 6-core (HT disabled) Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
> with two dual-port ix(4) 82599ES 10-Gigabit SFI/SFP+ Network Connection
> combined into a single lagg interface (lagghash l4).
>
> But each worker pthread of wrk generates too many kqueue() system calls
> polling for incoming data and eats 100% of its CPU core and cannot receive more.
> Or, it may be some kqueue() kernel level lock contention, I do not know.
> More worker threads, more overloaded CPU cores, no increase of transfer over about 20Gbps.

Hmm, it seems it's not the number of system calls that hurts me but some kernel-level problem.
This is a NUMA system running 10.3-STABLE r303291. It has two NUMA domains,
with the first physical CPU (cores 0-7) and the first dual-port ix adapter belonging to one domain
and the second CPU (cores 8-11) and the second dual-port adapter belonging to the other:

http://www.grosbein.net/img/r4.svg

Just prepending "cpuset -l 0-7" to the wrk invocation boosts transfer up to nearly 30Gbps
without any other changes. That's strange.
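
For illustration, a pinned invocation might look like the sketch below. Only
"cpuset -l 0-7" comes from the message above; the URL, thread count, connection
count and duration are placeholder assumptions, not the values used in the test.
The script prints the command by default and runs it only when RUN=1 is set.

```shell
#!/bin/sh
# CPU list taken from the message above.
CPUS="0-7"
# Everything below is a hypothetical placeholder, not from the original test.
URL="http://192.0.2.1/testfile"   # documentation-range address
THREADS=8                         # assumption: one wrk thread per pinned core
CONNS=800
DURATION=30s

# Build the wrk command line with cpuset(1) prepended.
pinned_wrk_cmd() {
    echo "cpuset -l $CPUS wrk -t $THREADS -c $CONNS -d $DURATION $URL"
}

# Dry run by default; set RUN=1 to execute on a FreeBSD host with wrk installed.
if [ "${RUN:-0}" = "1" ]; then
    $(pinned_wrk_cmd)
else
    pinned_wrk_cmd
fi
```

Comparing a run with CPUS="0-7" against one with CPUS="8-11" would show whether
the gain depends on which domain the threads are pinned to.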



