Date:      Thu, 23 Jun 2016 11:53:47 +0100
From:      Karl Pielorz <kpielorz_lst@tdx.co.uk>
To:        freebsd-net@FreeBSD.org
Subject:   Problem with VLAN config and traffic after 10.1-R -> 10.3-R-p5 Upgrade?
Message-ID:  <2ED5D9FEB55641BF734C14F3@[10.12.30.106]>

Hi,

We're in the process of updating our boxes from 10.1 to 10.3. This has gone 
OK for the simpler cases - but I seem to have found a couple of issues with 
the way 10.3 handles both configuring VLANs and actual traffic on VLANs.


On our box to be upgraded, our /etc/rc.conf has:

cloned_interfaces="lagg0 lagg1 lagg1.30 lagg1.35"
ifconfig_bge0="up"
ifconfig_bge1="up"
ifconfig_lagg0="laggproto failover laggport bge0 laggport bge1 172.16.50.1 netmask 255.255.255.0"

ifconfig_em3="mtu 1504 up"
ifconfig_em0="mtu 1504 up"
ifconfig_lagg1="laggproto failover laggport em3 laggport em0 192.168.0.2 netmask 255.255.255.0 mtu 1504"
ifconfig_lagg1_30="inet 192.168.200.2 netmask 255.255.255.0 mtu 1500"
ifconfig_lagg1_35="inet 192.168.210.2 netmask 255.255.255.0 mtu 1500"
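
The 1504/1500 split in the MTU settings above presumably leaves room on the parent interface for the 4-byte 802.1Q tag, so that a full 1500-byte payload still fits on the VLANs. A minimal sketch of that arithmetic (parent_mtu is a hypothetical helper for illustration, not a FreeBSD tool):

```sh
#!/bin/sh
# Size of an 802.1Q VLAN tag in bytes.
VLAN_TAG_BYTES=4

# parent_mtu: print the MTU the parent interface needs so that a VLAN
# with the given MTU can still carry the extra tag.
parent_mtu() {
    echo $(( $1 + $VLAN_TAG_BYTES ))
}

parent_mtu 1500
```

For a VLAN MTU of 1500 this prints 1504, matching the value set on em0, em3 and lagg1.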


The MTU 'hackery' (1504 on lagg1 and its ports, 1500 on the VLANs) is needed 
to avoid MTU issues with the VLAN interfaces. The above worked fine under 
10.1 - but the same config under 10.3:

 - Creates lagg0 correctly, and assigns the 172.16.50.1 IP to it
 - Creates lagg1 and its VLANs
 - Does not assign 192.168.0.2 to lagg1 (it fails silently - i.e. no
   errors are logged or shown)

So when the system has finished booting you end up with:

  lagg0    = 172.16.50.1
  lagg1    = no IP assigned
  lagg1.30 = 192.168.200.2
  lagg1.35 = 192.168.210.2
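
One way to confirm this state after boot is to list every interface that has no IPv4 address. The following is only a sketch: no_inet is a hypothetical helper, and it assumes the usual ifconfig layout (interface name at the start of a line, indented "inet" lines underneath).

```sh
#!/bin/sh
# no_inet: read ifconfig-style output on stdin and print the name of
# each interface that has no indented "inet " (IPv4) line.
no_inet() {
    awk '
        # An unindented line starts a new interface block.
        /^[^ \t]/ {
            if (name != "" && !has) print name
            name = $1
            sub(/:$/, "", name)
            has = 0
        }
        # An indented "inet " line means an IPv4 address is assigned.
        /^[ \t]+inet / { has = 1 }
        END { if (name != "" && !has) print name }
    '
}

# Usage on the affected box:
#   ifconfig | no_inet
```

On the box described here, lagg1 would be expected in the output, alongside any physical ports that are deliberately left without an address (bge0, bge1, em0, em3).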

The other thing I've found is, once the box is up:

#ping 192.168.200.1
PING 192.168.200.1 (192.168.200.1): 56 data bytes
ping: sendto: Host is down
^C
--- 192.168.200.1 ping statistics ---
6 packets transmitted, 0 packets received, 100.0% packet loss

Hmm, not good. 192.168.200.1 is a host on the VLAN 30 network (and is up - 
I'm logged into it on another session). Same happens for the 
192.168.210.0/24 network.


Running tcpdump on 192.168.200.1 I see lots of:

11:31:52.956094 ARP, Request who-has 192.168.200.1 tell 192.168.200.2, length 46
11:31:52.956102 ARP, Reply 192.168.200.1 is-at x:x:x:x:x:x, length 28
11:31:53.969140 ARP, Request who-has 192.168.200.1 tell 192.168.200.2, length 46
11:31:53.969148 ARP, Reply 192.168.200.1 is-at x:x:x:x:x:x, length 28

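OK, so the other box sees the ARP requests from the 10.3 box and replies to 
them, but the 10.3 box keeps re-asking every second - which suggests the 
replies never make it into its ARP table - and it can't "ping" the host.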


This gets increasingly weird if I run tcpdump on the 10.3 box. The act of 
running 'tcpdump -i lagg1.30 -n' actually fixes the problem:


#ping 192.168.200.1
PING 192.168.200.1 (192.168.200.1): 56 data bytes
64 bytes from 192.168.200.1: icmp_seq=0 ttl=64 time=0.257 ms
64 bytes from 192.168.200.1: icmp_seq=1 ttl=64 time=0.168 ms
64 bytes from 192.168.200.1: icmp_seq=2 ttl=64 time=0.320 ms

If I ctrl-c the tcpdump on the 10.3 box at this point - pings stop dead. 
Restart the tcpdump - pings resume.
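
By default tcpdump puts the interface into promiscuous mode, so one possible way to narrow this down is to check whether promiscuous mode alone makes the pings work. This is a hypothesis, not a confirmed cause. A sketch, where promisc_check is a hypothetical helper that only prints the commands unless RUN is set to an empty string:

```sh
#!/bin/sh
# promisc_check IFNAME TARGET
# Dry-run by default: each command is echoed rather than executed.
# Run with RUN= (empty) as root to execute the commands for real.
promisc_check() {
    ifname=$1
    target=$2
    ${RUN-echo} ifconfig "$ifname" promisc    # enable promiscuous mode
    ${RUN-echo} ping -c 3 "$target"           # does ping work now?
    ${RUN-echo} ifconfig "$ifname" -promisc   # restore the original state
}

promisc_check lagg1.30 192.168.200.1
```

The converse check is 'tcpdump -p -i lagg1.30 -n', which captures without enabling promiscuous mode. If pings still fail while that is running, promiscuous mode is the likely difference.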


Restoring 10.1 on the box fixes this - but I'd obviously rather be using 
10.3 now.

Any ideas?

Thanks,

-Karl




