Date:      Wed, 30 Aug 2017 09:29:35 +0200
From:      "Patrick M. Hausen" <hausen@punkt.de>
To:        freebsd-stable <freebsd-stable@freebsd.org>
Subject:   Bridged networking regression in 11.0?
Message-ID:  <F70DE809-66DD-4A02-92DB-1A21FC0B017F@punkt.de>

Hi, everyone,

one of the systems on which we run our jail based "proServer" product failed
in a very odd way for the second time with a couple of days between the two
incidents.

We run VIMAGE based jails (a lot) and bridge them with the physical interface
of the machine.

---------
cloned_interfaces="bridge0 bridge1"

ifconfig_bridge0_name="inet0"
ifconfig_inet0="addm ix0 up"
ifconfig_inet0_alias0="inet 217.29.41.2/24"
ifconfig_inet0_ipv6="inet6 2a00:b580:8000:11:44e8:ab80:816:7869/64 auto_linklocal"

ifconfig_bridge1_name="mgmt0"
ifconfig_mgmt0="addm ix1 up"
ifconfig_mgmt0_alias0="inet 10.5.105.7/16"
ifconfig_mgmt0_ipv6="inet6 auto_linklocal"
---------

The rest is managed by iocage, which creates the needed epair(4) interfaces,
for some reason renames them to "vnetX" and adds them as members to
the bridge.

E.g.
---------
[ry93@ph002 ~]$ ifconfig inet0
inet0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	ether 02:50:51:fe:cc:00
	inet6 fe80::50:51ff:fefe:cc00%inet0 prefixlen 64 scopeid 0x4
	inet6 2a00:b580:8000:11:44e8:ab80:816:7869 prefixlen 64
	inet 217.29.41.2 netmask 0xffffff00 broadcast 217.29.41.255
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
	groups: bridge
	id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
	maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
	root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
	member: vnet0:69 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
	        ifmaxaddr 0 port 76 priority 128 path cost 2000
[... 50 vnet interfaces following ...]
	member: ix0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
	        ifmaxaddr 0 port 1 priority 128 path cost 2000
---------
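
For reference, the per-jail wiring behind each bridge member could be reproduced by hand roughly as in the sketch below. This is only an assumption about the equivalent ifconfig(8) calls, not iocage's actual code; the jail name "myjail", the interface name "vnet0" and the DRY_RUN switch are placeholders made up for illustration. By default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: attach one VIMAGE jail to an existing bridge through epair(4).
# Assumption: these names are placeholders, not taken from iocage.
BRIDGE="inet0"      # renamed bridge0 from rc.conf
JAIL="myjail"       # hypothetical jail name
VNET="vnet0"        # hypothetical name for the host side of the epair
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

wire_jail() {
    # "ifconfig epair create" prints the name of the new "a" side.
    if [ "$DRY_RUN" = "1" ]; then
        echo "ifconfig epair create"
        epair_a="epair0a"   # placeholder name used for the dry run
    else
        epair_a="$(ifconfig epair create)"
    fi
    epair_b="${epair_a%a}b"                 # matching "b" side

    run ifconfig "$epair_a" name "$VNET"    # rename the host side
    run ifconfig "$BRIDGE" addm "$VNET"     # add it as a bridge member
    run ifconfig "$VNET" up
    run ifconfig "$epair_b" vnet "$JAIL"    # move the other side into the jail
}

wire_jail
```

Running it with DRY_RUN=0 as root on the host would execute the commands; the jail has to exist already with VIMAGE enabled.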

When the system fails

- no jail is reachable from the outside via IP
- no jail is reachable from the host via IP
- the host itself is reachable just fine
- when we `iocage console` into a jail it can reach its own IP addresses but nothing "outside"


I tried

- ifconfig ix0 down; ifconfig ix0 up
- ifconfig inet0 down; ifconfig inet0 up # aka bridge0
- iocage stop <jail>; iocage start <jail>

The latter deletes the epair instance connected to the jail and creates a fresh one,
then adds it to the bridge. No change in connectivity ... the start of the jail takes
"forever" because various processes hang waiting for DNS timeouts (no networking ;-)

There's nothing in /var/log/messages or the dmesg buffer that relates to networking!
Rebooting the host system "fixes" the situation.


Now I'm well aware that this is too little information to draw any definite conclusions.
Hence my first question is: what should I do (commands) when the situation arises again
to gather more evidence?

Or maybe we are just lucky and there is a known problem? Yes, I know VIMAGE is still
considered experimental. We have been running this in production for months and it
looks like it could be related to upgrading host and jails from 10.3 to 11.0 *or* switching
the old shell-based iocage for Brandon's new Python-based version.
I cannot rule out iocage, yet it's not very probable - this is not a Docker-like running service
or network component, after all. Once the jails are up, iocage is done ...

And then there's the chance that it is something with the ix driver and the way we use the
interface ... so for completeness:
---------
ix0: <Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 3.1.13-k> port 0x6020-0x603f mem 0xc7c00000-0xc7dfffff,0xc7e04000-0xc7e07fff irq 26 at device 0.0 numa-domain 0 on pci3
ix0: Using MSIX interrupts with 9 vectors
ix0: Ethernet address: 0c:c4:7a:34:ec:ba
ix0: PCI Express Bus: Speed 5.0GT/s Width x8
ix0: netmap queues/slots: TX 8/2048, RX 8/2048
ix0: promiscuous mode enabled
ix0: link state changed to UP
---------


As usual thanks for any hints,
Patrick



