Date:      Sat, 29 Jan 2011 14:43:07 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Slawa Olhovchenkov <slw@zxy.spb.ru>
Cc:        freebsd-performance@FreeBSD.org, Julian Elischer <julian@FreeBSD.org>, Bruce Evans <brde@optusnet.com.au>, Stefan Lambrev <stefan.lambrev@moneybookers.com>
Subject:   Re: Interrupt performance
Message-ID:  <20110129133859.O967@besplex.bde.org>
In-Reply-To: <20110128215215.GJ18170@zxy.spb.ru>
References:  <20110128143355.GD18170@zxy.spb.ru> <22E77EED-6455-4164-9115-BBD359EC8CA6@moneybookers.com> <20110128161035.GF18170@zxy.spb.ru> <CDBFAB7F-1EBC-4B3A-B2F5-6162DD58A93D@moneybookers.com> <4D42F87C.7020909@freebsd.org> <20110128172516.GG18170@zxy.spb.ru> <20110129070205.Q7034@besplex.bde.org> <20110128215215.GJ18170@zxy.spb.ru>

On Sat, 29 Jan 2011, Slawa Olhovchenkov wrote:

> On Sat, Jan 29, 2011 at 07:52:11AM +1100, Bruce Evans wrote:
>>
>> To see how much CPU is actually available, run something else and see how
>> fast it runs.  A simple counting loop works well on UP systems.
>
> ===
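> #include <stdio.h>
> #include <stdlib.h>	/* atol() */
> #include <sys/time.h>
>
> /* Global counter; build without optimization, or the compiler may
>    collapse the loop. */
> int Dummy;
>
> int
> main(int argc, char *argv[])
> {
> long int count,i,dt;
> struct timeval st,et;
>
> if (argc < 2) return 1;	/* need the iteration count */
> count = atol(argv[1]);
>
> gettimeofday(&st, NULL);
> for(i=count;i;i--) Dummy++;
> gettimeofday(&et, NULL);
> /* elapsed wall-clock time in microseconds */
> dt = (et.tv_sec-st.tv_sec)*1000000 + et.tv_usec-st.tv_usec;
> printf("Elapsed %ld us\n",dt);
> return 0;
> }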
> ===
>
> Is this OK?

It's better not to compete with the interrupt handler in the kernel by
spinning making syscalls, but that will do for a start.

> ./loop 2000000000
>
> FreeBSD
> 1 process: Elapsed 7554193 us
> 2 process: Elapsed 14493692 us
> netperf + 1 process: Elapsed 21403644 us

This shows about 35% user 65% network.

> Linux
> 1 process: Elapsed 7524843 us
> 2 process: Elapsed 14995866 us
> netperf + 1 process: Elapsed 14107670 us

This shows about 53% user 47% network.

So FreeBSD has about 18% more network overhead (absolute: 65-47), or
about 38% more network overhead (relative: (65-47)/47).  Not too
surprising -- the context switches alone might cost that.
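
The percentages above come from dividing the time the loop takes alone
by the time it takes while netperf is running.  A minimal sketch of that
arithmetic follows, using the timings quoted above.  The function names
user_share() and report() are made up for illustration, and the sketch
assumes that all time not spent in the loop went to network processing.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Fraction of the CPU that the counting loop received while netperf was
 * running: time for the loop alone divided by time for the loop under load.
 */
double
user_share(long solo_us, long loaded_us)
{
	assert(solo_us > 0 && loaded_us >= solo_us);
	return (double)solo_us / (double)loaded_us;
}

/* Print the user/network split for one system (hypothetical helper). */
void
report(const char *name, long solo_us, long loaded_us)
{
	double user = user_share(solo_us, loaded_us);

	printf("%s: %.0f%% user, %.0f%% network\n", name,
	    100.0 * user, 100.0 * (1.0 - user));
}
```

Calling report("FreeBSD", 7554193, 21403644) and
report("Linux", 7524843, 14107670) prints the 35%/65% and 53%/47%
splits.  The relative difference is then (65 - 47) / 47, or about 38%.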

BTW, even -current vs my version of FreeBSD-5.2 has 10-20% more network
overhead (relative) for tx, apparently due to bloat in the network
stack.  This apparently has nothing to do with hardware.  The slowdown
is much the same with bge (heavily modified in my version) and em
(barely modified).  One thing that I modify in both drivers is increase
the tx ifq length by a massive amount (from about 512 to about 20000).
This must be bad for overhead because such a large queue cannot fit
in the L2 cache.  A large amount and perhaps more than half of NIC
overhead consists of waiting for cache misses.  The slowdown in -current
might be caused by minor bloat crossing a threshold and thus causing
just one more cache miss every packet or 2.
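
To illustrate why a 20000-entry queue cannot fit in the L2 cache, here is
a rough sketch of the footprint arithmetic.  Both constants are
assumptions that do not come from this message: 256 bytes touched per
queued packet (one mbuf header, ignoring the packet data itself) and a
1 MiB L2 cache.  The function names are made up for illustration.

```c
#include <assert.h>

/* ASSUMPTION: bytes touched per queued packet (one mbuf header only). */
#define ASSUMED_BYTES_PER_PKT	256L
/* ASSUMPTION: L2 cache size; purely illustrative. */
#define ASSUMED_L2_BYTES	(1024L * 1024L)

/* Lower bound on the memory touched by a full queue of ifq_len packets. */
long
queue_footprint(long ifq_len)
{
	assert(ifq_len >= 0);
	return ifq_len * ASSUMED_BYTES_PER_PKT;
}

/* Nonzero if a full queue would fit in the assumed L2 cache. */
int
fits_in_l2(long ifq_len)
{
	return queue_footprint(ifq_len) <= ASSUMED_L2_BYTES;
}
```

Under these assumptions a full queue of 512 entries touches 131072 bytes
(128 KiB) and fits, while a full queue of 20000 entries touches 5120000
bytes (about 4.9 MiB) and does not.  The real footprint is larger once
packet data is counted, and smaller whenever the queue is not full.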

>> Normal profiling works poorly (I see you found my old mail about high
>> resolution profiling).  Linux might be misreporting the overhead for
>
> I think the next server will support PMC.
> Is the report from PMC still poor?

It should be adequate, but I prefer my version of perfmon which can
count cache misses precisely for every function.  But without patches,
perfmon is even more broken than high resolution kernel profiling.

>> ...
>> generate too many interrupts and don't have much or any way to control
>> this.  Linux will certainly be able to handle 56K int/S better than
>> FreeBSD since it doesn't have heavyweight interrupt threads AFAIK.
>> FreeBSD also has "fast" interrupts, which are much like normal interrupts
>> used to be in FreeBSD.  I don't know if your NIC driver uses these.  I
>
> re0: [FILTER]
>
> I think this ([FILTER]) is the answer, but I don't understand it :).

[FILTER] means "fast".  re used to unconditionally use "fast" interrupts
and a task queue, which IMO is a bad way to program an interrupt
handler, but yongari@ recently overhauled re (again :-) so that it now
doesn't use fast interrupts by default for the MSI/MSIX case.  (BTW,
it still bogusly uses INTR_MPSAFE for the fast interrupt bus_setup_intr().)
The overhaul probably also reduces interrupt overhead if it works on
your hardware, just by reducing the interrupt frequency.  I don't
understand what moderates the interrupt frequency in the MSI case.

>> I don't really know if this is low-end, but guess all RealTeks are :-).
>
> FreeBSD supports interrupt moderation on this chip, and the chip supports
> TOE :)

The support was poor according to yongari@'s long messages about
improving it.  With working interrupt moderation, you just don't get
an interrupt rate of even 14KHz, except transiently.  4KHz would be
all I would be happy with on a 1-core 1.6GHz CPU.  Since re is so
primitive, yongari@ only managed to limit the rate to 20KHz.  (Is this
only for the non-MSI case, with the MSI case better even before?)
Linux with its reduced interrupt latency can handle your observed
56KHz without losing by so much.  My 2-core Athlon (Turion) with its
low-end bge 5705, whose brokenness consists mainly of:
     completely broken interrupt moderation
     can't handle full 1Gbps -- saturates at about 300Mbps
     CPU and/or DMA resources used for 300Mbps are about the same as for
       a bge 5701 at a full 1Gbps
saturates at about 100 KHz bge interrupts.  This many interrupts takes
about all of 1 CPU to handle the hardware part and 20% of another to
generate packets.  The system saturates in much the same way under
WinXP.  I only recently got a version of Linux to boot on this system
and haven't tried network performance tests under Linux on it.
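
To put the interrupt rates above in perspective, the sketch below
computes how many CPU cycles a single core has between consecutive
interrupts.  It uses the 1.6GHz 1-core CPU and the rates mentioned above;
the function name is made up for illustration, and the result is an upper
bound since it assumes every cycle is available for interrupt handling.

```c
#include <assert.h>

/* CPU cycles available between consecutive interrupts on one core. */
long
cycles_per_interrupt(long cpu_hz, long intr_hz)
{
	assert(cpu_hz > 0 && intr_hz > 0);
	return cpu_hz / intr_hz;
}
```

With cpu_hz = 1600000000, this gives 28571 cycles per interrupt at 56KHz,
80000 at 20KHz and 400000 at 4KHz.  A few cache misses in the interrupt
path, plus a context switch to an interrupt thread, take a noticeable
share of the 56KHz budget.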

Bruce


