Date: Fri, 19 Nov 2010 18:48:51 -0500
From: "Brian A. Seklecki" <bseklecki@collaborativefusion.com>
To: freebsd-net <freebsd-questions@freebsd.org>
Subject: Restarting network & vlan interface = kernel memory corruption (if_vlan / conf/63700 redux)
Message-ID: <1290210531.26157.2412.camel@soundwave>

[Originally from freebsd-hackers@ / Feb 2008; freebsd-net Jun 2010]

All:

PR conf/63700 got the ball rolling on fixing cloned/VLAN interface
management with rc.d/netif, but a very specific problem still remains.

For example, adding an alias to a VLAN and then running:

  /etc/rc.d/netif restart && /etc/rc.d/routing restart

fails.

---

Take the following rc.conf(5) config:

  hostname="sexdrugsandunix"
  cloned_interfaces="vlan14"
  ifconfig_em0="up media 100baseTX mediaopt full-duplex -tso"
  ifconfig_vlan14="inet 1.2.3.4 netmask 255.255.255.128 vlan 14 vlandev em0 up"
  ifconfig_vlan14_alias0="inet 1.2.3.5 netmask 255.255.255.255"

Change it to include a second alias and, instead of rebooting, run
'rc.d/netif restart', as works on a physical interface:

  hostname="sexdrugsandunix"
  cloned_interfaces="vlan14"
  ifconfig_em0="up media 100baseTX mediaopt full-duplex -tso"
  ifconfig_vlan14="inet 1.2.3.4 netmask 255.255.255.128 vlan 14 vlandev em0 up"
  ifconfig_vlan14_alias0="inet 1.2.3.5 netmask 255.255.255.255"
  ifconfig_vlan14_alias1="inet 1.2.3.6 netmask 255.255.255.255"

The result will be:

  [bseklecki@sureshot ~]$ ifconfig vlan14
  vlan14: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu
          inet 1.2.3.6 netmask 0xffffffff broadcast 192.168.158.152
          inet 1.2.3.5 netmask 0xffffffff broadcast 192.168.158.255

1) I'm not sure where the .152 broadcast comes from. ?!

2) The new _alias1= data is now in the primary IP slot.

3) The primary IP is lost; there is no routable IP.

4) The original _alias0= data is now in the 1st alias slot.

5) rc.d/routing fails because the interface lacks a routable IP with
   a valid netmask/broadcast combination.

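On point 1, the broadcast address implied by an address/netmask pair can
be worked out by OR-ing each address octet with the inverted netmask
octet. What follows is a minimal sketch in POSIX sh, assuming plain
dotted-quad input; the helper name "bcast" is made up for illustration
and is not part of ifconfig(8) or rc.subr.

```shell
#!/bin/sh
# bcast ADDR NETMASK
# Print the IPv4 broadcast address implied by a dotted-quad address and
# netmask. "bcast" is a hypothetical helper for illustration only.
# The body runs in a subshell so the IFS and noglob changes stay local.
bcast() (
    set -f          # no pathname expansion while splitting
    IFS=.
    set -- $1 $2    # $1..$4 = address octets, $5..$8 = netmask octets
    # For each octet: broadcast = address OR (inverted netmask).
    echo "$(( $1 | (255 - $5) )).$(( $2 | (255 - $6) )).$(( $3 | (255 - $7) )).$(( $4 | (255 - $8) ))"
)

bcast 1.2.3.4 255.255.255.128   # primary address from the rc.conf above
bcast 1.2.3.6 255.255.255.255   # the new alias1
```

By this arithmetic the primary address gives 1.2.3.127 and a /32 alias
gives the alias address itself, so neither of the reported
192.168.158.x broadcast values matches the addresses as written here.
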
---------------------------

Problem #1: rc.d/netif::network_stop()

The core problem is that rc.d/netif::network_stop() never calls
network.subr::clone_down() in the same way that
rc.d/netif::network_start() calls network.subr::clone_up().

I'd speculate that this is a design decision not to destroy
network interfaces that certain userland daemons (DHCP, RTADVD,
BPF) may be strictly bound to; I disagree.

Even if you explicitly pass your VLAN interface to rc.d/netif, a stop
doesn't call 'ifconfig [VL] destroy', and when 'rc.d/netif start' is
called later, a SIOCSETVLAN error results.

  jail-host-80:/home/bseklecki% sudo ifconfig vlan666 destroy
  jail-host-80:/home/bseklecki% sudo ifconfig vlan666 create inet 1.2.3.4 netmask 255.255.255.0 vlan 666 vlandev em0
  jail-host-80:/home/bseklecki% sudo ifconfig vlan666 create inet 1.2.3.4 netmask 255.255.255.0 vlan 666 vlandev em0
  ifconfig: create: bad value

A simple patch to rc.d/netif::network_stop() could fix this problem if
we can avoid bikeshedding.

------------------------------------------

Problem #2: VLAN interface kernel data structures maintain configuration
data after being destroyed and re-created

  %ifconfig vlan666
  vlan666: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
          options=3<RXCSUM,TXCSUM>
          ether 00:0c:29:a1:4b:9d
          inet 192.168.15.54 netmask 0xffffff00 broadcast 192.168.15.255
          media: Ethernet 1000baseT <full-duplex>
          status: active
          vlan: 666 parent interface: em0

  %sudo ifconfig vlan666 destroy
  %sudo ifconfig vlan666 create

  %ifconfig vlan666
  vlan666: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
          options=3<RXCSUM,TXCSUM>
          ether 00:0c:29:a1:4b:9d
  !!**>>  inet 192.168.15.54 netmask 0xffffff00 broadcast 192.168.15.255  <<**!!
          media: Ethernet 1000baseT <full-duplex>
          status: active
          vlan: 666 parent interface: em0

Now, that's something you don't see every day!!

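Since the Problem #2 symptom is an "inet" line surviving a destroy and
re-create, it can be checked for mechanically. The following is a rough
sketch, not a tested reproduction script: "inet_lines" and "is_clean"
are hypothetical helper names, and they only filter ifconfig output
supplied on stdin.

```shell
#!/bin/sh
# inet_lines: read `ifconfig IFACE` output on stdin and print only the
# IPv4 address lines, with leading whitespace removed. inet6 lines are
# skipped because the pattern requires a space right after "inet".
# (Hypothetical helper, for illustration only.)
inet_lines() {
    sed -n 's/^[[:space:]]*\(inet [^[:space:]].*\)$/\1/p'
}

# is_clean: succeed only when the ifconfig output on stdin carries no
# IPv4 address at all, as expected right after a bare `create`.
is_clean() {
    [ -z "$(inet_lines)" ]
}

# Intended use on a FreeBSD host, as root (left commented out here):
#   ifconfig vlan666 destroy && ifconfig vlan666 create
#   ifconfig vlan666 | is_clean || echo "vlan666 kept an address"
```

Running the commented commands in a loop would give a count of how
often the stale address shows up, which helps when the problem does not
trigger on every attempt.
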
----------------------------------------------------

NOTE: I can't get that persistent IP data problem to happen
consistently, but it's highly reproducible.

I also have no idea about the fixes yet; I'll check this weekend. In the
meantime, I have a workaround.

To avoid destroying your routing table after adding an alias to a VLAN
interface in rc.conf(5), simply run:

  $ sudo /etc/rc.d/netif start [VLAN####]

DO NOT RESTART, and you should be okay.

~BAS

References:

http://lists.freebsd.org/pipermail/freebsd-hackers/2008-February/023440.html
http://www.freebsd.org/cgi/query-pr.cgi?pr=63700&cat= (Circa 2004)
http://lists.freebsd.org/pipermail/freebsd-net/2007-September/015447.html
http://lists.freebsd.org/pipermail/freebsd-net/2010-June/025514.html

-- 
Brian A. Seklecki <bseklecki@collaborativefusion.com>
Collaborative Fusion, Inc.