Date: Wed, 27 Apr 2016 18:10:26 -0300
From: Zé Claudio Pastore <zclaudio@bsd.com.br>
To: Ryan Stone <rysto32@gmail.com>
Cc: freebsd-net <freebsd-net@freebsd.org>
Subject: Re: Regression? VLAN packet drop after upgrading from r281235
Message-ID: <CAEGk6G4SxNfb8Ph=Cq0rRATPvFwFqF9jgg%2BsMvMUhc8z554osw@mail.gmail.com>
In-Reply-To: <CAFMmRNyY67RGyb8%2BaS=HCLEpzki3n0JiT5QYXO5xnjz5vyYxMA@mail.gmail.com>
References: <CAEGk6G4rq=yE14rDcxhJZZ0drstr=fse%2B9aemVYqdt68Gg=bpQ@mail.gmail.com> <CAFMmRNyY67RGyb8%2BaS=HCLEpzki3n0JiT5QYXO5xnjz5vyYxMA@mail.gmail.com>
Hello Ryan,

2016-04-27 17:28 GMT-03:00 Ryan Stone <rysto32@gmail.com>:

> From a quick look at the vlan code, I can identify a few cases that might
> cause that counter to increment:
>
> 1) Error from the underlying ixgbe device.  Does "netstat -dI ix0" show
> that the driver has been dropping packets?
>

No, the drop counters do not increase on the ix port, only on the vlan device.

>
> 2) Link down events on the underlying NIC.  I believe that link flaps will
> be logged to /var/log/messages and dmesg; do you see anything there that
> might correspond to the time of the packet drops?
>

No, dmesg is clean. There are only a couple of link down/up events from when
I actually disconnected the port, and no other message in /var/log/messages
that grabs my attention.

>
> 3) If VLAN_HWTAGGING is disabled through ifconfig on the port, then in
> theory a low memory event could cause the packet to be dropped.  Does
> "netstat -m" show that "requests for mbufs denied" increasing?
>

Here is the ifconfig -v output for vlan6 on the 10.1-STABLE system:

vlan6: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=303<RXCSUM,TXCSUM,TSO4,TSO6>
        ether a0:36:9f:2a:6d:ae
        inet6 fe80::a236:9fff:fe2a:6dae%vlan6 prefixlen 64 scopeid 0x19
        inet6 2804:1054:bad:b1fe::1 prefixlen 64
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (10Gbase-SR <full-duplex>)
        status: active
        vlan: 3005 parent interface: ix3
        groups: vlan

And here it is on the 10.3-STABLE system. I don't know why, but the only
difference is that no options are printed on the newer system; everything
else is the same.

vlan6: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether a0:36:9f:2a:6d:ae
        inet6 fe80::a236:9fff:fe2a:6dae%vlan6 prefixlen 64 scopeid 0x19
        inet6 2804:1054:bad:b1fe::1 prefixlen 64
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (10Gbase-SR <full-duplex>)
        status: active
        vlan: 3005 parent interface: ix3
        groups: vlan

This is the netstat -m output taken while the system has packet loss. The
denied and delayed counters are all zero.

% netstat -m
12365/21040/33405 mbufs in use (current/cache/total)
12310/14530/26840/505076 mbuf clusters in use (current/cache/total/max)
12310/14508 mbuf+clusters out of packet secondary zone in use (current/cache)
0/225/225/252538 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/74826 9k jumbo clusters in use (current/cache/total/max)
0/0/0/42089 16k jumbo clusters in use (current/cache/total/max)
27711K/35220K/62931K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile

>
> On Wed, Apr 27, 2016 at 2:41 PM, Zé Claudio Pastore <zclaudio@bsd.com.br>
> wrote:
>
>> Hello,
>>
>> On a BGP border router I help manage, we run FreeBSD 10.1-STABLE,
>> version r281235 and it works fine for several years now.
>>
>> We have around 4Gbit/s and 1.8Mpps routed on peak while per port interface
>> we peak at 300Kpps.
>>
>> Our quality metrics are measured with:
>>
>> ping -s 1472 -i 0.1 <our-other-ibgp-router>
>>
>> As well as iperf bidirectional.
>>
>> This metric is similar to what Speedy Test and SIMET tests are done and
>> our customers reference.
>>
>> Systems working w/o problem:
>> - 10.1-STABLE / r281235
>>
>> Systems tested with drops:
>> - 10.2-STABLE / r292035M
>> - 10.3-STABLE / r298705
>> - 11.0-CURRENT / r295683 (downloaded snapshot from ftp.freebsd.org)
>> - 11.0-CURRENT Melifaro Routing Branch / r297731M
>>
>> While testing, when errors happen I can see output errs on the vlan port
>> in the output from "netstat -w1 -I vlan6"
>>
>>             input          vlan6           output
>>    packets  errs idrops      bytes    packets  errs      bytes colls
>>          1     0      0         66      30557     2   33310968     0
>>          1     0      0        105      31458     3   33912219     0
>>          2     0      0       2954      32001     8   34983986     0
>>          1     0      0       1512      33150     6   35942558     0
>>          1     0      0       1512      33654     4   37311862     0
>>          1     0      0       1512      34825     3   38213793     0
>>          3     0      0       1683      35376     4   39488912     0
>>          5     0      0       7280      32423     3   35551869     0
>>
>> Problems may happen under high load (~200Kpps) or low load (~30Kpps) on a
>> vlan port. The observed frame loss never happens on untagged ports, only
>> vlan related. The observed loss happens with packets sized 900 bytes and
>> above but noticeably loss rate is higher with packets close to 1400 (1472
>> is my reference size).
>>
>> Loss rate on all listed systems different from r281235 is 9-19% with
>> ping(1) and iperf, while it's 0% on r281235.
>>
>> First I believed it to be an Intel driver error on systems newer than 10.1.
>> My reference cards are dual port 82599EB 10-Gigabit SFI/SFP+ Network
>> Connection (2x2 on x8 PCIe bus, total 4x10G). But yesterday I replaced
>> Intel by Chelsio T5 and the problem is still exactly the same, so it's not
>> related to card vendor.
>>
>> I always test the very same hardware, I have two SSD drives in this
>> router, one for the 10.1 which just runs fine and the other disk to test
>> the various versions of FreeBSD.
>>
>> Only minor loader and sysctl confs are tweaked:
>>
>> kern.hz=2000
>> net.inet.ip.redirect=1 # do not send IP redirects
>> net.inet.ip.accept_sourceroute=0 # drop source routed packets since they ca
>> net.inet.ip.sourceroute=0 # if source routed packets are accepted th
>> net.inet.tcp.drop_synfin=1 # SYN/FIN packets get dropped on initial c
>> net.inet.udp.blackhole=1 # drop udp packets destined for closed soc
>> net.inet.tcp.blackhole=2 # drop tcp packets destined for closed por
>> security.bsd.see_other_uids=0
>>
>> Can anyone suggest what might be a fix/tuning for this behavior? Was there
>> any relevant change on vlan code from particular revisions close to the one
>> I run on 10.1 and later which would lead to such a big difference?
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>
>
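
For anyone comparing revisions, the per-second "netstat -w1 -I vlan6" samples quoted in this message can be totalled with a short script to get an output error ratio for the vlan interface. This is only a minimal sketch: it assumes the eight-column layout shown in the quoted output (input packets, errs, idrops, bytes, then output packets, errs, bytes, colls), and the names SAMPLE and output_error_totals are illustrative, not anything from FreeBSD or from this thread.

```python
# Sketch only: totals the "output" packets and errs columns from
# pasted "netstat -w1 -I <ifname>" rows. The column order is an
# assumption taken from the output quoted in this thread.

SAMPLE = """\
            input          vlan6           output
   packets  errs idrops      bytes    packets  errs      bytes colls
         1     0      0         66      30557     2   33310968     0
         1     0      0        105      31458     3   33912219     0
         2     0      0       2954      32001     8   34983986     0
         1     0      0       1512      33150     6   35942558     0
         1     0      0       1512      33654     4   37311862     0
         1     0      0       1512      34825     3   38213793     0
         3     0      0       1683      35376     4   39488912     0
         5     0      0       7280      32423     3   35551869     0
"""


def output_error_totals(text):
    """Return (output_packets, output_errs) summed over all data rows."""
    packets = 0
    errs = 0
    for line in text.splitlines():
        fields = line.split()
        # Skip the two header lines and anything that is not
        # exactly eight integer columns.
        if len(fields) != 8 or not all(f.isdigit() for f in fields):
            continue
        packets += int(fields[4])  # output packets
        errs += int(fields[5])     # output errs
    return packets, errs


if __name__ == "__main__":
    packets, errs = output_error_totals(SAMPLE)
    ratio = errs / packets if packets else 0.0
    print(f"output packets={packets} errs={errs} ratio={ratio:.4%}")
```

Running the same script against samples captured on r281235 and on a newer revision under similar load would give a like-for-like comparison of the vlan output errs counter alongside the ping(1) and iperf loss figures.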