From: Elliot Finley <efinley.lists@gmail.com>
To: Robert Watson, freebsd-current@freebsd.org
Date: Sat, 28 Nov 2009 12:36:37 -0700
Subject: Re: 8.0-RC3 network performance regression

Robert,

Here is more info that may be helpful in tracking this down. I'm now
running 8.0-R on all boxes. If I use the following settings on the box
that's running netserver:

kern.ipc.maxsockbuf=16777216
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.recvbuf_inc=524288
net.inet.tcp.hostcache.expire=1

and I leave the netperf box at default, then I get 932Mbps. But if I then
add the same settings to the box that I'm running netperf from, the speed
drops down to around 420Mbps again.

What other information is needed to help track this down?

TIA
Elliot
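(For anyone reproducing this: the tuning above is plain sysctls on the
receiver only, with the sender left at defaults. A minimal sketch, reusing
the server address from the transcripts later in the thread:)

    # on the box running netserver (the receiver):
    sysctl kern.ipc.maxsockbuf=16777216
    sysctl net.inet.tcp.recvbuf_max=16777216
    sysctl net.inet.tcp.recvbuf_inc=524288
    sysctl net.inet.tcp.hostcache.expire=1

    # then measure from the untouched netperf box:
    netperf -H 10.20.10.20 -t TCP_STREAM

On Thu, Nov 19, 2009 at 9:42 AM, Elliot Finley wrote:
>
> On Thu, Nov 19, 2009 at 2:11 AM, Robert Watson wrote:
>
>> On Wed, 18 Nov 2009, Elliot Finley wrote:
>>
>>> I have several boxes running 8.0-RC3 with pretty dismal network
>>> performance. I also have some 7.2 boxes with great performance. Using
>>> iperf I did some tests:
>>>
>>> server(8.0) <- client (8.0) == 420Mbps
>>> server(7.2) <- client (7.2) == 950Mbps
>>> server(7.2) <- client (8.0) == 920Mbps
>>> server(8.0) <- client (7.2) == 420Mbps
>>>
>>> So when the server is 7.2, I have good performance regardless of
>>> whether the client is 8.0 or 7.2. When the server is 8.0, I have poor
>>> performance regardless of whether the client is 8.0 or 7.2.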
>>> Has anyone else noticed this? Am I missing something simple?
>>
>> I've generally not measured regressions along these lines, but TCP
>> performance can be quite sensitive to the specific driver version and
>> hardware configuration. So far, I've generally measured significant TCP
>> scalability improvements in 8, and moderate raw TCP performance
>> improvements over real interfaces. On the other hand, I've seen
>> decreased TCP performance on the loopback due to scheduling interactions
>> with ULE on some systems (but not all -- disabling checksum
>> generate/verify has improved loopback on other systems).
>>
>> The first thing to establish is whether other similar benchmarks give
>> the same result, which might help us narrow the issue down a bit. Could
>> you try using netperf+netserver with the TCP_STREAM test and see if that
>> differs using the otherwise identical configuration?
>>
>> Could you compare the ifconfig link configuration of 7.2 and 8.0 to make
>> sure there's not a problem with the driver negotiating, for example,
>> half duplex instead of full duplex? Also confirm that the same blend of
>> LRO/TSO/checksum offloading/etc is present.
>>
>> Could you do "procstat -at | grep ifname" (where ifname is your
>> interface name) and send that to me?
>>
>> Another thing to keep an eye on is interrupt rates and pin sharing,
>> which are both sensitive to driver changes and ACPI changes. It wouldn't
>> hurt to compare vmstat -i rates not just on your network interface, but
>> also on other devices, to make sure there's not new aliasing. With a new
>> USB stack and plenty of other changes, additional driver code running
>> when your NIC interrupt fires would be highly measurable.
>>
>> Finally, two TCP tweaks to try:
>>
>> (1) Try disabling in-flight bandwidth estimation by setting
>>     net.inet.tcp.inflight.enable to 0. This often hurts low-latency,
>>     high-bandwidth local ethernet links, and is sensitive to many other
>>     issues including time-keeping. It may not be the "cause", but it's a
>>     useful thing to try.
>>
>> (2) Try setting net.inet.tcp.read_locking to 0, which disables the
>>     read-write locking strategy on global TCP locks. This setting, when
>>     enabled, significantly improves TCP scalability when dealing with
>>     multiple NICs or input queues, but is one of the non-trivial
>>     functional changes in TCP.
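(Both tweaks are runtime sysctls, so they can be toggled and re-tested
without a reboot. A minimal sketch, using the names from Robert's mail and
the server address from the transcripts:)

    # (1) disable in-flight bandwidth estimation:
    sysctl net.inet.tcp.inflight.enable=0

    # (2) disable read-write locking on the global TCP locks:
    sysctl net.inet.tcp.read_locking=0

    # re-run the stream test from the client:
    netperf -H 10.20.10.20 -t TCP_STREAM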
> Thanks for the reply. Here is some more info:
>
> netperf results:
>
> storage-price-3 root:~#>netperf -H 10.20.10.20
> TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.20.10.20
> (10.20.10.20) port 0 AF_INET
> Recv   Send    Send
> Socket Socket  Message  Elapsed
> Size   Size    Size     Time     Throughput
> bytes  bytes   bytes    secs.    10^6bits/sec
>
> 4194304 4194304 4194304    10.04     460.10
>
> The interface on both boxes is em1. Both boxes (8.0RC3) have two 4-port
> PCIe NICs in them. Trying the two TCP tweaks didn't change anything.
> While running iperf I did the procstat and vmstat:
>
> SERVER:
> storage-price-2 root:~#>ifconfig em1
> em1: flags=8843 metric 0 mtu 1500
>         options=19b
>         ether 00:15:17:b2:31:3d
>         inet 10.20.10.20 netmask 0xffffff00 broadcast 10.20.10.255
>         media: Ethernet autoselect (1000baseT )
>         status: active
>
> storage-price-2 root:~#>procstat -at | grep em1
>     0 100040 kernel           em1 taskq          3   16 run     -
>
> storage-price-2 root:~#>vmstat -i
> interrupt                          total       rate
> irq14: ata0                        22979          0
> irq15: ata1                        23157          0
> irq16: aac0 uhci0*                  1552          0
> irq17: uhci2+                         37          0
> irq18: ehci0 uhci+                    43          0
> cpu0: timer                    108455076       2000
> irq257: em1                      2039287         37
> cpu2: timer                    108446955       1999
> cpu1: timer                    108447018       1999
> cpu3: timer                    108447039       1999
> cpu7: timer                    108447061       1999
> cpu5: timer                    108447061       1999
> cpu6: timer                    108447054       1999
> cpu4: timer                    108447061       1999
> Total                          869671380      16037
>
> CLIENT:
> storage-price-3 root:~#>ifconfig em1
> em1: flags=8843 metric 0 mtu 1500
>         options=19b
>         ether 00:15:17:b2:31:49
>         inet 10.20.10.30 netmask 0xffffff00 broadcast 10.20.10.255
>         media: Ethernet autoselect (1000baseT )
>         status: active
>
> storage-price-3 root:~#>procstat -at | grep em1
>     0 100040 kernel           em1 taskq          3   16 run     -
>
> storage-price-3 root:~#>vmstat -i
> interrupt                          total       rate
> irq1: atkbd0                           2          0
> irq14: ata0                        22501          0
> irq15: ata1                        22395          0
> irq16: aac0 uhci0*                  5091          0
> irq17: uhci2+                        125          0
> irq18: ehci0 uhci+                    43          0
> cpu0: timer                    108421132       1999
> irq257: em1                      1100465         20
> cpu3: timer                    108412973       1999
> cpu1: timer                    108412987       1999
> cpu2: timer                    108413010       1999
> cpu7: timer                    108413048       1999
> cpu6: timer                    108413048       1999
> cpu5: timer                    108413031       1999
> cpu4: timer                    108413045       1999
> Total                          868462896      16020
>
> 7.2 BOX:
> dns1 root:~#>ifconfig em0
> em0: flags=8843 metric 0 mtu 1500
>         options=9b
>         ether 00:13:72:5a:ff:48
>         inet X.Y.Z.7 netmask 0xffffffc0 broadcast X.Y.Z.63
>         media: Ethernet autoselect (1000baseTX )
>         status: active
>
> The 8.0RC3 boxes are being used for testing right now (production 2nd
> week of December). If you want access to them, that wouldn't be a
> problem.
>
> TIA
> Elliot
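(For completeness: the four-way matrix at the top of the thread was
measured with iperf. A minimal sketch of one cell, assuming stock iperf
defaults and the addresses used above:)

    # on the box acting as server:
    iperf -s

    # on the box acting as client, pointed at the server:
    iperf -c 10.20.10.20 -t 10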