Date:      Fri, 19 Nov 2010 18:48:51 -0500
From:      "Brian A. Seklecki" <bseklecki@collaborativefusion.com>
To:        freebsd-net <freebsd-questions@freebsd.org>
Subject:   Restarting network & vlan interface = kernel memory corruption (if_vlan / conf/63700 redux)
Message-ID:  <1290210531.26157.2412.camel@soundwave>


[Originally from freebsd-hackers@ / Feb 2008; freebsd-net Jun 2010]

 All:

 PR conf/63700 got the ball rolling on fixing cloned/VLAN
 interface management with rc.d/netif, but a very specific problem
 still remains.

 For example, adding an alias to a VLAN and then running:
 /etc/rc.d/netif restart && /etc/rc.d/routing restart
 fails.

---

Take the following rc.conf(5) config:

hostname="sexdrugsandunix"
cloned_interfaces="vlan14"
ifconfig_em0="up media 100baseTX mediaopt full-duplex -tso"
ifconfig_vlan14="inet 1.2.3.4 netmask 255.255.255.128 vlan 14 vlandev em0 up"
ifconfig_vlan14_alias0="inet 1.2.3.5 netmask 255.255.255.255"

Change it to include a second alias and, instead of rebooting, run
'rc.d/netif restart', which works on a physical interface:

hostname="sexdrugsandunix"
cloned_interfaces="vlan14"
ifconfig_em0="up media 100baseTX mediaopt full-duplex -tso"
ifconfig_vlan14="inet 1.2.3.4 netmask 255.255.255.128 vlan 14 vlandev em0 up"
ifconfig_vlan14_alias0="inet 1.2.3.5 netmask 255.255.255.255"
ifconfig_vlan14_alias1="inet 1.2.3.6 netmask 255.255.255.255"

The result will be:

[bseklecki@sureshot ~]$ ifconfig vlan14
vlan14: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu
inet 1.2.3.6 netmask 0xffffffff broadcast 192.168.158.152
inet 1.2.3.5 netmask 0xffffffff broadcast 192.168.158.255


1) I'm not sure where the .152 broadcast comes from.
2) The new _alias1= data is now in the primary IP slot.
3) The primary IP is lost; there is no routable IP.
4) The original _alias0= data is now in the first alias slot.
5) rc.d/routing fails because the interface lacks a routable
   IP with a valid netmask/broadcast combination.
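
To make "a valid netmask/broadcast combination" in item 5 concrete, the
broadcast address implied by an address and netmask can be computed with a
few lines of POSIX sh. This is only a sketch: expected_broadcast is a
made-up helper name, not part of rc.d or network.subr, and it accepts
dotted-quad netmasks only (not the 0xffffffff form that ifconfig prints).

```shell
#!/bin/sh
# expected_broadcast ADDR NETMASK
# Hypothetical helper (not part of rc.d or network.subr).
# Prints the broadcast address implied by a dotted-quad address and netmask.
# The body runs in a subshell so the IFS change does not leak to the caller.
expected_broadcast() (
    set -f          # no pathname expansion on the unquoted arguments
    IFS=.
    set -- $1 $2    # $1..$4 = address octets, $5..$8 = netmask octets
    # Per octet: keep the network bits, set every host bit to 1.
    echo "$(( ($1 & $5) | (255 - $5) )).$(( ($2 & $6) | (255 - $6) )).$(( ($3 & $7) | (255 - $7) )).$(( ($4 & $8) | (255 - $8) ))"
)
```

With the addresses from the config, expected_broadcast 1.2.3.4 255.255.255.128
prints 1.2.3.127, and a host alias such as 1.2.3.6 with netmask
255.255.255.255 yields the address itself.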

 ---------------------------

 Problem #1: rc.d/netif::network_stop()

 The core problem is that rc.d/netif::network_stop() never calls
 network.subr::clone_down() in the same way that
 rc.d/netif::network_start() calls network.subr::clone_up().

 I'd speculate that this is a design decision not to destroy
 network interfaces that certain userland daemons (DHCP, RTADVD,
 BPF) may be strictly bound to; I disagree.

 Even if you explicitly pass your VLAN interface to rc.d/netif,
 a stop doesn't call 'ifconfig [VL] destroy', and when 'rc.d/netif start'
 is called later, an SIOCSETVLAN error results.

 jail-host-80:/home/bseklecki% sudo ifconfig vlan666 destroy
 jail-host-80:/home/bseklecki% sudo ifconfig vlan666 create inet 1.2.3.4 netmask 255.255.255.0 vlan 666 vlandev em0
 jail-host-80:/home/bseklecki% sudo ifconfig vlan666 create inet 1.2.3.4 netmask 255.255.255.0 vlan 666 vlandev em0
 ifconfig: create: bad value

 A simple patch to rc.d/netif::network_stop() could fix this problem if
 we can avoid bikeshedding.
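
 The idea behind such a patch can be sketched roughly as below. This is
 not the actual rc.d/netif or network.subr code: destroy_cloned and the
 IFCONFIG variable are names invented for illustration, and IFCONFIG
 exists only so the loop can be dry-run with IFCONFIG="echo ifconfig".

```shell
#!/bin/sh
# Hypothetical sketch; not the real rc.d/netif or network.subr code.
# IFCONFIG may be overridden (e.g. "echo ifconfig") for a dry run.
: "${IFCONFIG:=ifconfig}"

# destroy_cloned IFLIST
# Destroys every interface named in the space-separated IFLIST,
# e.g. the value of cloned_interfaces from rc.conf.
destroy_cloned() {
    for _ifn in $1; do
        $IFCONFIG "$_ifn" destroy
    done
}
```

 A stop path would then call something like
 destroy_cloned "${cloned_interfaces}" after the addresses are removed.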

------------------------------------------


 Problem #2: VLAN interface kernel data structures maintain configuration
             data after being destroyed and re-created

%ifconfig vlan666
vlan666: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=3<RXCSUM,TXCSUM>
	ether 00:0c:29:a1:4b:9d
	inet 192.168.15.54 netmask 0xffffff00 broadcast 192.168.15.255
	media: Ethernet 1000baseT <full-duplex>
	status: active
	vlan: 666 parent interface: em0
%sudo ifconfig vlan666 destroy
%sudo ifconfig vlan666 create
%ifconfig vlan666
vlan666: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=3<RXCSUM,TXCSUM>
	ether 00:0c:29:a1:4b:9d
!!**>>	inet 192.168.15.54 netmask 0xffffff00 broadcast 192.168.15.255 <<**!!
	media: Ethernet 1000baseT <full-duplex>
	status: active
	vlan: 666 parent interface: em0

Now, that's something you don't see every day!!
----------------------------------------------------

NOTE: I can't get that persistent IP data problem to happen
every time, but it's highly reproducible.
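
Because the problem is intermittent, a small check makes repeated
destroy/create attempts easier to score. The sketch below assumes the
usual ifconfig output layout shown above; has_inet is a made-up helper
name that reads ifconfig output on stdin.

```shell
#!/bin/sh
# has_inet: reads `ifconfig IFACE` output on stdin and succeeds if any
# IPv4 address line ("inet ...") is present. Hypothetical helper name.
# "inet6" lines do not match because whitespace must follow "inet".
has_inet() {
    grep -Eq '^[[:space:]]*inet[[:space:]]'
}

# Example (needs root; test box only):
#   ifconfig vlan666 destroy && ifconfig vlan666 create
#   ifconfig vlan666 | has_inet && echo "stale address survived re-create"
```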

I also have no idea about fixes yet; I'll check this weekend. In the
meantime, I have a work-around.

To avoid destroying your routing table after adding an alias to a VLAN
interface in rc.conf(5), simply run:

 $ sudo /etc/rc.d/netif start [VLAN####]

 DO NOT RESTART, and you should be okay.
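
 What a start pass does with the alias variables can be approximated as
 follows. This is a rough sketch, assuming aliases are applied in order
 from _alias0 until the first unset variable; it is not the real
 network.subr code. aliases_up and IFCONFIG are names invented for
 illustration, and IFCONFIG="echo ifconfig" gives a dry run.

```shell
#!/bin/sh
# Rough approximation; not the real network.subr code.
# Assumes aliases are numbered from 0 with no gaps.
: "${IFCONFIG:=ifconfig}"

# aliases_up IFACE
# Adds each ifconfig_IFACE_aliasN value to IFACE using the "alias" keyword.
aliases_up() {
    _n=0
    while :; do
        # Look up ifconfig_<IFACE>_alias<N>; empty if unset.
        eval "_args=\${ifconfig_${1}_alias${_n}:-}"
        [ -n "$_args" ] || break
        $IFCONFIG "$1" $_args alias
        _n=$((_n + 1))
    done
}
```

 Since nothing is torn down first, the primary address on the interface
 is left alone and only the alias addresses are (re-)added.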

~BAS

References:

http://lists.freebsd.org/pipermail/freebsd-hackers/2008-February/023440.html
http://www.freebsd.org/cgi/query-pr.cgi?pr=63700&cat=  (Circa 2004)
http://lists.freebsd.org/pipermail/freebsd-net/2007-September/015447.html
http://lists.freebsd.org/pipermail/freebsd-net/2010-June/025514.html


-- 
Brian A. Seklecki <bseklecki@collaborativefusion.com>
Collaborative Fusion, Inc.







