Date:      Thu, 23 Mar 2006 00:40:17 -0500
From:      "Gary Thorpe" <gthorpe@myrealbox.com>
To:        g_jin@lbl.gov
Cc:        freebsd-performance@freebsd.org, oxy@field.hu
Subject:   Re: packet drop with intel gigabit / marwell gigabit
Message-ID:  <1143092417.c7f62afcgthorpe@myrealbox.com>

[No subject in first one, sorry for repost]
Jin Guojun [VFFS] wrote:

> You are far away from the real world. This has been explained a million
> times, just as I teach intern students every summer :-)
>
> First of all, DDR400 and a 200 MHz bus mean nothing -- a DDR266 + 500 MHz
> CPU system can outperform a DDR400 + 1.7 GHz CPU system.

Given the same chipset+motherboard, no. DDR400 has more bandwidth and a
smaller latency. Given different chipsets/motherboards, this may be true.
However, one could also say with accuracy that a 500 MHz processor can
outperform the same family running at 1.7 GHz under some conditions, but
few people will rush out to buy 500 MHz over 1.7 GHz for performance alone.
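(Back-of-envelope: at equal channel width, DDR400 does 400 million transfers/s
against DDR266's 266 MT/s, i.e. roughly 50% more peak bandwidth on paper, and
its higher clock typically trims absolute latency a little as well; how much
of that survives a given chipset is another matter.)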

Another example:
> Ixxxx 2 CPU was designed with 3 levels of cache. Supposedly
> level 1 to level 2 takes 5 cycles,
> level 2 to level 3 takes 11 cycles.
> What would you expect the CPU-to-memory time (cycles) to be -- CPU to
> level 1 is one cycle? You would expect 17 to 20 cycles in total, but it
> actually takes 210 cycles due to some design issues.
> Now your 1.6 GB/s is reduced to 16 MB/s or even worse, just based on
> this factor.

1.6 GB/s = system bus bandwidth. Cache won't affect this bandwidth. DDR400
peaks at about 3.2 GB/s (400 MT/s x 8 bytes): only attainable for long
sequential accesses of either read or write, but not a mix of both. DMA
should be able to get near this limit (long and sequential, read or write
only per transfer). A NIC with bus-mastering DMA should be able to use the
memory bandwidth effectively.
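(Presumably the 1.6 GB/s figure is the 200 MHz bus mentioned above times an
assumed 64-bit data path: 200e6 transfers/s x 8 bytes = 1.6 GB/s. That is a
peak, zero-overhead number; arbitration, refresh and bus turnaround all come
out of it.)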

> A number of other factors affect memory bandwidth, such as bus arbitration.
> Have you done any memory benchmark on a system before doing such a simple
> calculation?

No, they are just theoretical values telling you the limits of performance.
I assume that a decent implementation can get 75% of the theoretical limit
at least some of the time under good conditions (like DMA).
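(Applied to the 1.6 GB/s figure: 0.75 x 1.6 GB/s = 1.2 GB/s sustained, which
is still roughly ten times the ~125 MB/s that a saturated gigabit link can
deliver. The 75% is my rule of thumb, not a measurement.)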

>
> Secondly, DMA moves data from the NIC to an mbuf; then who moves the data
> from the mbuf to the user buffer? Not a human -- it is the CPU. While DMA
> is moving data, can the CPU move data simultaneously?
> DMA takes both I/O bandwidth and memory bandwidth. If your system has only
> 16 MB/s of memory bandwidth, your network throughput is less than 8 MB/s,
> typically below 6.4 MB/s.
> If you cannot move data away from the NIC fast enough, what happens?
> Packet loss!

True, but would this type of packet loss even be measured by the OS? Packet
loss to the OS means some packets were dropped from the software portion of
the network stack, right? That means the NIC has no problem delivering
packets to the OS, and the OS has problems delivering them to the user
process.
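One quick way to tell is to look at which counter increments. A minimal
sketch of checking the socket-layer drop counter for UDP -- untested, and
assuming the stock net.inet.udp.stats sysctl exports struct udpstat the way
the netstat sources read it:

/*
 * Print FreeBSD's "dropped due to full socket buffers" UDP counter
 * (udps_fullsock). Drops counted here happened after the NIC delivered
 * the packet, i.e. in the socket layer, not in the NIC or on the wire.
 */
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/sysctl.h>
#include <netinet/in.h>
#include <netinet/in_systm.h>
#include <netinet/ip.h>
#include <netinet/ip_var.h>
#include <netinet/udp.h>
#include <netinet/udp_var.h>
#include <stdio.h>

int
main(void)
{
	struct udpstat us;
	size_t len = sizeof(us);

	/* net.inet.udp.stats returns the kernel's struct udpstat. */
	if (sysctlbyname("net.inet.udp.stats", &us, &len, NULL, 0) == -1) {
		perror("sysctlbyname(net.inet.udp.stats)");
		return 1;
	}
	printf("dropped due to full socket buffers: %llu\n",
	    (unsigned long long)us.udps_fullsock);
	return 0;
}

"netstat -s -p udp" prints the same counter as "dropped due to full socket
buffers"; drops at the NIC/driver level show up separately in the netstat -i
interface error counts, so the two cases can be distinguished.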

You are arguing that the bandwidth is not sufficient for the processor to do
this copy out (or page loan out = zero copy, only memory management tricks),
and that the software has to drop packets from mbufs when more packets
arrive for UDP. Enough bandwidth is theoretically available for this (much
more than required); it may or may not be true that the actual sustained
bandwidth is insufficient. I don't think that only 1/4 of the theoretical
bandwidth would actually be available on any reasonable (i.e. not junk)
system.
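(A rough budget for the worst case, with no zero-copy tricks: each received
byte crosses memory about three times -- DMA write into the mbuf, CPU read of
the mbuf, CPU write into the user buffer -- so sustaining a full 1 Gb/s, about
125 MB/s of payload, needs on the order of 375 MB/s of memory bandwidth. The
three-crossings count is my simplification and ignores cache effects, but it
is well under even a conservative fraction of the theoretical figures above.)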

>
> That is why his CPU utilization was low: there was not much data crossing
> the CPU. So that is why I asked him for the CPU utilization first, then
> the chipset. These are the basic steps to diagnose network performance.
> If you know the CPU and chipset of a system, you will know the network
> performance ceiling for that system, guaranteed. But it does not guarantee
> you can get that ceiling performance, especially over OC-12 (622 Mb/s)
> high-speed networks. That requires intensive tuning knowledge for the
> current TCP stack, which is well explained on the Internet by searching
> for "TCP tuning".

In this case, bandwidth should not factor in (16 MB/s is low; disks can
regularly double this easily). The 1 Gb/s NIC is not being fully used in
this case (< 40 MB/s) and the processor is mostly idle.
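(For reference, 1 Gb/s on the wire is at most about 125 MB/s of data, so
< 40 MB/s is roughly a third of line rate with the CPU mostly idle. On the
TCP-tuning point: the usual sizing rule is window >= bandwidth x RTT, so an
OC-12 path at 622 Mb/s with, say, a 50 ms RTT -- the RTT is purely an
illustrative assumption -- needs about 622e6/8 x 0.05 ~= 3.9 MB of socket
buffer, far beyond typical defaults; hence the "TCP tuning" advice.)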




