From: Ze Claudio Pastore
To: Ryan Stone
Cc: freebsd-net
Date: Fri, 6 May 2016 21:11:29 -0300
Subject: Re: Regression? VLAN packet drop after upgrading from r281235

OK, I submitted a bug report, in case someone else gets a similar problem.

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209351

2016-04-27 18:10 GMT-03:00 Zé Claudio Pastore:

> Hello Ryan,
>
> 2016-04-27 17:28 GMT-03:00 Ryan Stone:
>
>> From a quick look at the vlan code, I can identify a few cases that might
>> cause that counter to increment:
>>
>> 1) Error from the underlying ixgbe device. Does "netstat -dI ix0" show
>> that the driver has been dropping packets?
>>
>
> No, the drop counters do not increase on the ix port, only on the vlan
> device.
>
>>
>> 2) Link down events on the underlying NIC. I believe that link flaps
>> will be logged to /var/log/messages and dmesg; do you see anything there
>> that might correspond to the time of the packet drops?
>>
>
> No, dmesg is clean: only a couple of down/up link events from when I
> actually disconnected the port, and no other message in /var/log/messages
> that grabs my attention.
>
>>
>> 3) If VLAN_HWTAGGING is disabled through ifconfig on the port, then in
>> theory a low memory event could cause the packet to be dropped. Does
>> "netstat -m" show the "requests for mbufs denied" counter increasing?
>>
>
> Here is the ifconfig -v output for vlan6 on the 10.1-STABLE system:
>
> vlan6: flags=8843 metric 0 mtu 1500
>         options=303
>         ether a0:36:9f:2a:6d:ae
>         inet6 fe80::a236:9fff:fe2a:6dae%vlan6 prefixlen 64 scopeid 0x19
>         inet6 2804:1054:bad:b1fe::1 prefixlen 64
>         nd6 options=21
>         media: Ethernet autoselect (10Gbase-SR )
>         status: active
>         vlan: 3005 parent interface: ix3
>         groups: vlan
>
> And here it is on the 10.3-STABLE system. I don't know why, but the only
> difference is that no options were printed on the newer system; everything
> else is the same.
>
> vlan6: flags=8843 metric 0 mtu 1500
>         ether a0:36:9f:2a:6d:ae
>         inet6 fe80::a236:9fff:fe2a:6dae%vlan6 prefixlen 64 scopeid 0x19
>         inet6 2804:1054:bad:b1fe::1 prefixlen 64
>         nd6 options=21
>         media: Ethernet autoselect (10Gbase-SR )
>         status: active
>         vlan: 3005 parent interface: ix3
>         groups: vlan
>
> This is the netstat -m output while the system has packet loss. The denied
> and delayed counters are all zero.
>
> % netstat -m
> 12365/21040/33405 mbufs in use (current/cache/total)
> 12310/14530/26840/505076 mbuf clusters in use (current/cache/total/max)
> 12310/14508 mbuf+clusters out of packet secondary zone in use (current/cache)
> 0/225/225/252538 4k (page size) jumbo clusters in use (current/cache/total/max)
> 0/0/0/74826 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/42089 16k jumbo clusters in use (current/cache/total/max)
> 27711K/35220K/62931K bytes allocated to network (current/cache/total)
> 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
>
>>
>> On Wed, Apr 27, 2016 at 2:41 PM, Zé Claudio Pastore wrote:
>>
>>> Hello,
>>>
>>> On a BGP border router I help manage, we run FreeBSD 10.1-STABLE,
>>> version r281235, and it has worked fine for several years now.
>>>
>>> We route around 4Gbit/s and 1.8Mpps at peak, while each port interface
>>> peaks at 300Kpps.
>>>
>>> Our quality metrics are measured with:
>>>
>>> ping -s 1472 -i 0.1
>>>
>>> As well as bidirectional iperf.
>>>
>>> This metric is similar to how the Speedy Test and SIMET tests are done,
>>> which is what our customers use as a reference.
>>>
>>> Systems working w/o problem:
>>> - 10.1-STABLE / r281235
>>>
>>> Systems tested with drops:
>>> - 10.2-STABLE / r292035M
>>> - 10.3-STABLE / r298705
>>> - 11.0-CURRENT / r295683 (downloaded snapshot from ftp.freebsd.org)
>>> - 11.0-CURRENT Melifaro Routing Branch / r297731M
>>>
>>> While testing, when errors happen I can see output errs on the vlan port
>>> in the output of "netstat -w1 -I vlan6":
>>>
>>>             input          vlan6           output
>>>    packets  errs idrops      bytes    packets  errs      bytes colls
>>>          1     0      0         66      30557     2   33310968     0
>>>          1     0      0        105      31458     3   33912219     0
>>>          2     0      0       2954      32001     8   34983986     0
>>>          1     0      0       1512      33150     6   35942558     0
>>>          1     0      0       1512      33654     4   37311862     0
>>>          1     0      0       1512      34825     3   38213793     0
>>>          3     0      0       1683      35376     4   39488912     0
>>>          5     0      0       7280      32423     3   35551869     0
>>>
>>> Problems may happen under high load (~200Kpps) or low load (~30Kpps) on a
>>> vlan port. The observed frame loss never happens on untagged ports, only
>>> on vlan ports. The loss happens with packets of 900 bytes and above, but
>>> the loss rate is noticeably higher with packets close to 1400 bytes (1472
>>> is my reference size).
>>>
>>> Loss rate on all listed systems other than r281235 is 9-19% with
>>> ping(1) and iperf, while it's 0% on r281235.
>>>
>>> At first I believed it to be an Intel driver error on systems newer than
>>> 10.1. My reference cards are dual port 82599EB 10-Gigabit SFI/SFP+ Network
>>> Connection (2x2 on x8 PCIe bus, total 4x10G). But yesterday I replaced
>>> the Intel cards with Chelsio T5 and the problem is still exactly the same,
>>> so it's not related to the card vendor.
>>>
>>> I always test on the very same hardware: I have two SSD drives in this
>>> router, one with 10.1, which just runs fine, and the other to test the
>>> various versions of FreeBSD.
>>>
>>> Only minor loader and sysctl confs are tweaked:
>>>
>>> kern.hz=2000
>>> net.inet.ip.redirect=1 # do not send IP redirects
>>> net.inet.ip.accept_sourceroute=0 # drop source routed packets since they ca
>>> net.inet.ip.sourceroute=0 # if source routed packets are accepted th
>>> net.inet.tcp.drop_synfin=1 # SYN/FIN packets get dropped on initial c
>>> net.inet.udp.blackhole=1 # drop udp packets destined for closed soc
>>> net.inet.tcp.blackhole=2 # drop tcp packets destined for closed por
>>> security.bsd.see_other_uids=0
>>>
>>> Can anyone suggest what might be a fix/tuning for this behavior? Was there
>>> any relevant change to the vlan code in revisions after the one I run on
>>> 10.1 which would lead to such a big difference?
>>> _______________________________________________
>>> freebsd-net@freebsd.org mailing list
>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
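
For anyone who wants to put a number on the output errs column quoted in this thread, here is a minimal sketch in Python. It is not part of the original report. The function names (parse_rows, summarize) are made up for illustration, and the eight-column layout (input packets, errs, idrops, bytes, then output packets, errs, bytes, colls) is assumed from the header printed in the quoted "netstat -w1 -I vlan6" sample. It only sums the counters and computes errs per output packet; it says nothing about why the counter increments.

```python
# Rows copied from the "netstat -w1 -I vlan6" sample quoted in this thread.
# Assumed column order: in_packets in_errs idrops in_bytes
#                       out_packets out_errs out_bytes colls
SAMPLE = """\
input vlan6 output
packets errs idrops bytes packets errs bytes colls
1 0 0 66 30557 2 33310968 0
1 0 0 105 31458 3 33912219 0
2 0 0 2954 32001 8 34983986 0
1 0 0 1512 33150 6 35942558 0
1 0 0 1512 33654 4 37311862 0
1 0 0 1512 34825 3 38213793 0
3 0 0 1683 35376 4 39488912 0
5 0 0 7280 32423 3 35551869 0
"""


def parse_rows(text):
    """Return one dict per numeric 8-column row; headers are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 8 or not all(f.isdigit() for f in fields):
            continue  # header line, blank line, or unexpected layout
        values = [int(f) for f in fields]
        rows.append({
            "out_packets": values[4],
            "out_errs": values[5],
            "out_bytes": values[6],
        })
    return rows


def summarize(rows):
    """Sum output packets and errs, and return errs per output packet."""
    total_packets = sum(r["out_packets"] for r in rows)
    total_errs = sum(r["out_errs"] for r in rows)
    ratio = total_errs / total_packets if total_packets else 0.0
    return total_packets, total_errs, ratio


if __name__ == "__main__":
    packets, errs, ratio = summarize(parse_rows(SAMPLE))
    print(f"output packets={packets} errs={errs} errs/packet={ratio:.4%}")
```

The same functions could be fed a longer capture saved to a file, which would make it easier to compare the counter between r281235 and the newer revisions over the same test window.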