Date: Fri, 23 Jul 2010 09:40:47 +0200
From: John Hay
To: freebsd-net@freebsd.org, jfvogel@gmail.com
Subject: packet loss on ixgbe using vlans and routing
Message-ID: <20100723074047.GA47514@zibbi.meraka.csir.co.za>

Hi,

(Jack any chance that you can look at this please?)

It looks like there are 2 problems with the ixgbe driver on FreeBSD-8.

I have a Dell T710 with 4 X 10G ethernet interfaces (2 X Dual port Intel 82599 cards). It is running FreeBSD RELENG_8.

1 - When routing (using vlans) there is heavy packet loss that goes away when you do "ifconfig ix2 -rxcsum". The packet loss seems to be on the receive side, because I do not see the packets on the receiving interface with tcpdump. This seems to impact both ipv4 and ipv6.

My test setup is the Dell T710 with its ix2 connected to a 10G port of a Nortel 4526GTX. On that port I have 2 vlans configured, with half of the 1G ports in the one vlan and the other half in the other vlan.
If I test with iperf from one of the machines on a 1G port to the T710, I get 920Mbit/s. If I do it simultaneously from a few machines connected to the 1G ports, all of them basically saturate their 1G links. If I now try to route from the one vlan to the other, i.e. doing an iperf from a 1G connected machine, through the T710, to another 1G connected machine, I see packet loss; sometimes iperf is only able to do 100kbits/s. (Configuring a tcp relay, like socat, on the T710, and working through it, I again get 900Mbit/s and more.) So it seems that as long as the T710 with the 10G card is the start or end point of the connection, I get no packet loss, but as soon as it has to route, something goes wrong.

2 - I see packet loss (0 - 40%) on IPv6 packets in vlans, when the machine is not the originator of the packets. This happens even with "ifconfig ix2 -rxcsum".

Let me try to describe a little more. If a neighbouring machine pings it with ping6, there will be packet loss. If it acts as a router for ipv6, there will be packet loss. This happens even when the network is pretty idle and with different switches (Nortel and Cisco equipment). The packet loss fluctuates a lot. Pinging 1000 packets might lose 1% one time and 30% the next time. Looking with tcpdump, I can see the packets arriving and going out, but they never arrive at the next machine. (My feeling is that they get lost inside the card.) The error counters on the switch do not increment.

I do not see packet loss if the machine originates the packets, for example ping6 from the machine. Also ipv4 packets do not have any packet loss. If I do not use vlans, I don't see packet loss with ipv6 either. The machine also has bce 1G interfaces and I do not see the packet loss on them.

Here is some info about the machine / setup. The numbers are pretty low because I rebooted after compiling a kernel with IPFIREWALL, ROUTETABLES, MROUTING and FLOWTABLE removed.
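For anyone who wants to put numbers on the fluctuating ping6 loss in problem 2, here is a rough Python sketch that pulls the loss percentage out of the summary line of several ping6 runs and reports the spread. It is only an illustration: the function names are made up, the sample outputs are hypothetical rather than captured from this machine, and it assumes the usual "N packets transmitted, M packets received" summary line.

```python
import re
import statistics

# Assumed summary line format, e.g.
# "1000 packets transmitted, 700 packets received, 30.0% packet loss"
SUMMARY_RE = re.compile(r"(\d+) packets transmitted, (\d+) packets received")


def loss_percent(ping_output: str) -> float:
    """Return the packet loss in percent for one ping6 run."""
    match = SUMMARY_RE.search(ping_output)
    if match is None:
        raise ValueError("no ping summary line found")
    sent, received = int(match.group(1)), int(match.group(2))
    if sent == 0:
        raise ValueError("no packets were transmitted")
    return 100.0 * (sent - received) / sent


def summarize(runs):
    """Return min, max and mean loss (percent) over several runs."""
    losses = [loss_percent(run) for run in runs]
    return {
        "min": min(losses),
        "max": max(losses),
        "mean": statistics.mean(losses),
    }


if __name__ == "__main__":
    # Hypothetical outputs, for illustration only.
    sample_runs = [
        "1000 packets transmitted, 990 packets received, 1.0% packet loss",
        "1000 packets transmitted, 700 packets received, 30.0% packet loss",
    ]
    print(summarize(sample_runs))
```

Feeding it the saved output of a handful of "ping6 -c 1000" runs against the vlan address should make the run-to-run variation easy to compare with and without vlans.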
I'll add my kernel config file with empty and commented out lines removed.

pciconf -lvc

ix0@pci0:129:0:0: class=0x020000 card=0x00038086 chip=0x10fb8086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    class      = network
    subclass   = ethernet
    cap 01[40] = powerspec 3 supports D0 D3 current D0
    cap 05[50] = MSI supports 1 message, 64 bit, vector masks
    cap 11[70] = MSI-X supports 64 messages in map 0x20 enabled
    cap 10[a0] = PCI-Express 2 endpoint max data 256(512) link x8(x8)
ix1@pci0:129:0:1: class=0x020000 card=0x00038086 chip=0x10fb8086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    class      = network
    subclass   = ethernet
    cap 01[40] = powerspec 3 supports D0 D3 current D0
    cap 05[50] = MSI supports 1 message, 64 bit, vector masks
    cap 11[70] = MSI-X supports 64 messages in map 0x20 enabled
    cap 10[a0] = PCI-Express 2 endpoint max data 256(512) link x8(x8)
ix2@pci0:131:0:0: class=0x020000 card=0x00038086 chip=0x10fb8086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    class      = network
    subclass   = ethernet
    cap 01[40] = powerspec 3 supports D0 D3 current D0
    cap 05[50] = MSI supports 1 message, 64 bit, vector masks
    cap 11[70] = MSI-X supports 64 messages in map 0x20 enabled
    cap 10[a0] = PCI-Express 2 endpoint max data 256(512) link x8(x8)
ix3@pci0:131:0:1: class=0x020000 card=0x00038086 chip=0x10fb8086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    class      = network
    subclass   = ethernet
    cap 01[40] = powerspec 3 supports D0 D3 current D0
    cap 05[50] = MSI supports 1 message, 64 bit, vector masks
    cap 11[70] = MSI-X supports 64 messages in map 0x20 enabled
    cap 10[a0] = PCI-Express 2 endpoint max data 256(512) link x8(x8)

output of vmstat -i

interrupt                          total       rate
irq19: ehci0                       28371          0
irq21: uhci2 uhci4+                   48          0
irq23: atapci0                        46          0
irq34: mpt0                       146954          2
cpu0: timer                    112205297       1999
irq256: bce0                       52063          0
irq257: bce1                           1          0
irq258: bce2                           1          0
irq259: bce3                           1          0
irq260: ix0:que 0                 142258          2
irq261: ix0:que 1                  56464          1
irq262: ix0:que 2                  56199          1
irq263: ix0:que 3                  56198          1
irq264: ix0:que 4                  66569          1
irq265: ix0:que 5                  56148          1
irq266: ix0:que 6                  56217          1
irq267: ix0:que 7                  56311          1
irq268: ix0:que 8                  56169          1
irq269: ix0:que 9                  69485          1
irq270: ix0:que 10                 56176          1
irq271: ix0:que 11                 56205          1
irq272: ix0:que 12                 56281          1
irq273: ix0:que 13                 56359          1
irq274: ix0:que 14                 56292          1
irq275: ix0:que 15                 56197          1
irq276: ix0:link                       2          0
irq277: ix1:que 0                 107873          1
irq278: ix1:que 1                  56094          0
irq279: ix1:que 2                  56097          0
irq280: ix1:que 3                  56096          0
irq281: ix1:que 4                  65439          1
irq282: ix1:que 5                  56091          0
irq283: ix1:que 6                  56092          0
irq284: ix1:que 7                  56098          0
irq285: ix1:que 8                  56091          0
irq286: ix1:que 9                  56096          0
irq287: ix1:que 10                 56093          0
irq288: ix1:que 11                 56091          0
irq289: ix1:que 12                 56096          0
irq290: ix1:que 13                 56095          0
irq291: ix1:que 14                 57125          1
irq292: ix1:que 15                 56093          0
irq293: ix1:link                       1          0
irq294: ix2:que 0                 231250          4
irq295: ix2:que 1                  57784          1
irq296: ix2:que 2                  69956          1
irq297: ix2:que 3                  59498          1
irq298: ix2:que 4                  58201          1
irq299: ix2:que 5                  58599          1
irq300: ix2:que 6                  57813          1
irq301: ix2:que 7                  60075          1
irq302: ix2:que 8                  68639          1
irq303: ix2:que 9                  58194          1
irq304: ix2:que 10                 60752          1
irq305: ix2:que 11                 57628          1
irq306: ix2:que 12                 66796          1
irq307: ix2:que 13                 63307          1
irq308: ix2:que 14                 60788          1
irq309: ix2:que 15                 59102          1
irq310: ix2:link                       5          0
irq311: ix3:que 0                  56090          0
irq312: ix3:que 1                  56090          0
irq313: ix3:que 2                  56090          0
irq314: ix3:que 3                  56090          0
irq315: ix3:que 4                  56090          0
irq316: ix3:que 5                  56090          0
irq317: ix3:que 6                  56090          0
irq318: ix3:que 7                  56090          0
irq319: ix3:que 8                  56090          0
irq320: ix3:que 9                  56090          0
irq321: ix3:que 10                 56090          0
irq322: ix3:que 11                 56090          0
irq323: ix3:que 12                 56090          0
irq324: ix3:que 13                 56090          0
irq325: ix3:que 14                 56090          0
irq326: ix3:que 15                 56090          0
cpu1: timer                    112196134       1999
cpu10: timer                   112196179       1999
cpu3: timer                    112196135       1999
cpu8: timer                    112196108       1999
cpu4: timer                    112196161       1999
cpu11: timer                   112196179       1999
cpu5: timer                    112196161       1999
cpu13: timer                   112196179       1999
cpu6: timer                    112196161       1999
cpu14: timer                   112196179       1999
cpu2: timer                    112196106       1999
cpu12: timer                   112196179       1999
cpu7: timer                    112196161       1999
cpu9: timer                    112196155       1999
cpu15: timer                   112196179       1999
Total                         1799390156      32072

netstat -m

133178/4042/137220 mbufs in use (current/cache/total)
133112/2062/135174/262144 mbuf clusters in use (current/cache/total/max)
133112/2056 mbuf+clusters out of packet secondary zone in use (current/cache)
0/20/20/131072 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/65536 9k jumbo clusters in use (current/cache/total/max)
0/0/0/32768 16k jumbo clusters in use (current/cache/total/max)
299518K/5214K/304733K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

kernel config file, basically started with 64 bit and removed the stuff I do not need.

cpu             HAMMER
ident           SEEKAT
device          ipmi
makeoptions     DEBUG=-g                # Build kernel with gdb(1) debug symbols
options         SCHED_ULE               # ULE scheduler
options         PREEMPTION              # Enable kernel thread preemption
options         INET                    # InterNETworking
options         INET6                   # IPv6 communications protocols
options         SCTP                    # Stream Control Transmission Protocol
options         FFS                     # Berkeley Fast Filesystem
options         SOFTUPDATES             # Enable FFS soft updates support
options         UFS_DIRHASH             # Improve performance on big directories
options         CD9660                  # ISO 9660 Filesystem
options         PROCFS                  # Process filesystem (requires PSEUDOFS)
options         PSEUDOFS                # Pseudo-filesystem framework
options         GEOM_PART_GPT           # GUID Partition Tables.
options         GEOM_LABEL              # Provides labelization
options         COMPAT_43TTY            # BSD 4.3 TTY compat (sgtty)
options         COMPAT_IA32             # Compatible with i386 binaries
options         COMPAT_FREEBSD4         # Compatible with FreeBSD4
options         COMPAT_FREEBSD5         # Compatible with FreeBSD5
options         COMPAT_FREEBSD6         # Compatible with FreeBSD6
options         COMPAT_FREEBSD7         # Compatible with FreeBSD7
options         KTRACE                  # ktrace(1) support
options         STACK                   # stack(9) support
options         SYSVSHM                 # SYSV-style shared memory
options         SYSVMSG                 # SYSV-style message queues
options         SYSVSEM                 # SYSV-style semaphores
options         P1003_1B_SEMAPHORES     # POSIX-style semaphores
options         _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
options         PRINTF_BUFR_SIZE=128    # Prevent printf output being interspersed.
options         KBD_INSTALL_CDEV        # install a CDEV entry in /dev
options         HWPMC_HOOKS             # Necessary kernel hooks for hwpmc(4)
options         INCLUDE_CONFIG_FILE     # Include this file in kernel
options         SMP                     # Symmetric MultiProcessor Kernel
device          cpufreq
device          acpi
device          pci
device          ata
device          atapicd                 # ATAPI CDROM drives
device          mpt                     # LSI-Logic MPT-Fusion
device          scbus                   # SCSI bus (required for SCSI)
device          da                      # Direct Access (disks)
device          pass                    # Passthrough device (direct SCSI access)
device          atkbdc                  # AT keyboard controller
device          atkbd                   # AT keyboard
device          psm                     # PS/2 mouse
device          kbdmux                  # keyboard multiplexer
device          vga                     # VGA video card driver
device          splash                  # Splash screen and screen saver support
device          sc
device          agp                     # support several AGP chipsets
device          uart                    # Generic UART driver
device          loop                    # Network loopback
device          random                  # Entropy device
device          ether                   # Ethernet support
device          pty                     # BSD-style compatibility pseudo ttys
device          bpf                     # Berkeley packet filter
device          uhci                    # UHCI PCI->USB interface
device          ehci                    # EHCI PCI->USB interface (USB 2.0)
device          usb                     # USB Bus (required)
device          uhid                    # "Human Interface Devices"
device          ukbd                    # Keyboard
device          umass                   # Disks/Mass storage - Requires scbus and da
device          ums                     # Mouse

kldstat

Id Refs Address            Size     Name
 1   55 0xffffffff80100000 6ea290   kernel
 2    1 0xffffffff807eb000 19e088   zfs.ko
 3    2 0xffffffff8098a000 3860     opensolaris.ko
 4    2 0xffffffff8098e000 20448    krpc.ko
 5    1 0xffffffff809af000 21100    geom_mirror.ko
 6    1 0xffffffff809d1000 66c0     if_vlan.ko
 7    1 0xffffffff809d8000 506c8    if_bce.ko
 8    2 0xffffffff80a29000 3ec20    miibus.ko
 9    1 0xffffffff80a68000 243e0    if_ixgbe.ko
10    1 0xffffffff80a8d000 1e08     coretemp.ko

ifconfig ix2 (with -rxcsum and global addrs modified)

ix2: flags=8843 metric 0 mtu 1500
        options=5b8
        ether 00:1b:21:57:ef:7c
        inet6 fe80::21b:21ff:fe57:ef7c%ix2 prefixlen 64 scopeid 0x3
        nd6 options=3
        media: Ethernet autoselect (10Gbase-SR )
        status: active

ifconfig ix2.1

ix2.1: flags=8843 metric 0 mtu 1500
        ether 00:1b:21:57:ef:7c
        inet 10.0.28.2 netmask 0xffffff00 broadcast 10.0.28.255
        inet6 fe80::21b:21ff:fe57:b420%ix2.1 prefixlen 64 scopeid 0x9
        inet6 2001:0:0:3:21b:21ff:fe57:b420 prefixlen 64
        inet6 2001:0:0:3:: prefixlen 64 anycast
        nd6 options=3
        media: Ethernet autoselect (10Gbase-SR )
        status: active
        vlan: 1 parent interface: ix2

ifconfig ix2.8

ix2.8: flags=8843 metric 0 mtu 1500
        ether 00:1b:21:57:ef:7c
        inet 10.0.8.50 netmask 0xffffff00 broadcast 10.0.8.255
        inet6 fe80::21b:21ff:fe57:b420%ix2.8 prefixlen 64 scopeid 0xa
        inet6 2001:0:0:1:21b:21ff:fe57:b420 prefixlen 64
        inet6 2001:0:0:1:: prefixlen 64 anycast
        nd6 options=3
        media: Ethernet autoselect (10Gbase-SR )
        status: active
        vlan: 8 parent interface: ix2

John
--
John Hay -- jhay@meraka.csir.co.za / jhay@FreeBSD.org