Date: Mon, 12 Oct 2015 12:40:16 +0200
From: Tomáš Drbohlav <drb@karlov.mff.cuni.cz>
To: freebsd-net@freebsd.org
Subject: Lost packets in IPFilter 5
Message-ID: <561B8E10.2070701@karlov.mff.cuni.cz>
Hello,

we are preparing an upgrade (a new box with a fresh install) for our gateway, which serves tens to a few hundred clients (mostly office desktops plus some special servers, e.g. Nagios monitoring a thousand items) on a dozen networks, routed or NATed among themselves and NATed to the outside world. All of that is based on ipfilter. The setup has been working for us for years on 8.2-RELEASE.

We prepared the same setup on 10.2 p4, and when we put it in the wild we started to see packets going missing, apparently somewhere inside the new box (tcpdump sees them on the source machine and on the inner interface of the NAT box, but not on its outside interface). After that we stepped back and prepared a test setup (most of the config on the 10.2 box was left as it was when we saw the problems), and the problem is reproducible.

We tried a new build with LARGE_NAT and nothing changed. We also tried limiting the age of NAT mappings in the config; with that, the first occurrence only took a bit longer to appear. We also tried changing the network card: same behavior on a 10G and a 1G one.

Our NAT setup is quite simple, a few groups of rules like:

    map intA XX.XX.XX.0/24 -> YY.YY.YY.YY/32 proxy port 21 ftp/tcp
    map intA XX.XX.XX.0/24 -> YY.YY.YY.YY/32 portmap tcp/udp auto
    map intA XX.XX.XX.0/24 -> YY.YY.YY.YY/32

Test setup: an inner machine sends ICMP pings (100/s, in groups of 10; 'ping -i 0.01 -c 10'), the tested box NATs them out to a ping responder, and the ICMP response packets go back through the NAT to the inner machine.

After a while, some of the packets do not make it out. It takes a few minutes to tens of minutes for the first loss to appear, and it slowly gets worse over time. We have narrowed it down to sys/contrib/netinet/ip_nat.c line 2687, where 'exhausted out' gets incremented (we have tested and ruled out the other two places where 'exhausted out' is used). We also see that when one packet is eaten, the rest of the packets from the same group do not make it either.

Right now we cannot put the new box into production: the loss rate (we have not measured it exactly, but it is about one packet in hundreds) is too high, at least for Nagios.
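The ping test described above can be approximated with a small script. This is a minimal sketch, not the script used for the original test: the function names, the pause between bursts, and the assumption that ping prints its usual "N packets transmitted, M packets received" summary line are all illustrative. It reruns 'ping -i 0.01 -c 10' in a loop and logs every burst that loses at least one packet.

```python
import re
import subprocess
import time

# Matches the summary line printed by ping on FreeBSD
# ("10 packets transmitted, 8 packets received, 20.0% packet loss")
# and on Linux ("10 packets transmitted, 8 received, 20% packet loss").
SUMMARY_RE = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")


def parse_ping_summary(output):
    """Return (sent, received) from ping's summary line, or None if absent."""
    match = SUMMARY_RE.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def run_burst(target, count=10, interval=0.01):
    """Send one burst, mirroring 'ping -i 0.01 -c 10' from the test setup."""
    # Intervals this short usually require root privileges.
    proc = subprocess.run(
        ["ping", "-i", str(interval), "-c", str(count), target],
        capture_output=True,
        text=True,
    )
    return parse_ping_summary(proc.stdout)


def main(target, pause=0.1):
    """Loop until interrupted, logging each burst that loses a packet.

    The pause between bursts is an assumed value, not taken from the
    original test.
    """
    bursts = 0
    while True:
        bursts += 1
        result = run_burst(target)
        if result is None:
            print(f"burst {bursts}: no summary line in ping output")
        else:
            sent, received = result
            if received < sent:
                stamp = time.strftime("%H:%M:%S")
                print(f"{stamp} burst {bursts}: lost {sent - received} of {sent}")
        time.sleep(pause)
```

To use it, call main() with the address of the ping responder, for example main("192.0.2.1") (a placeholder documentation address), and stop it with Ctrl-C. Logging per burst makes it easy to check whether losses really come in whole groups, as observed above.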
I see two choices for us: find out what is wrong with ipfilter and fix it, or switch to PF (with some major config syntax challenges ahead). So: any ideas about ipfilter? I will be happy to provide any information anyone finds important. All thoughts welcome!

Bye
Tomas Drbohlav