Date:      Sat, 22 Aug 2009 03:45:37 -0700
From:      Brian Somers <brian@FreeBSD.org>
To:        Kip Macy <kmacy@FreeBSD.org>
Cc:        freebsd-hackers@FreeBSD.org, Brian Somers <brian@FreeBSD.org>
Subject:   Re: kernel panics in in_lltable_lookup (with INVARIANTS)
Message-ID:  <20090822034537.76b16271@dev.lan.Awfulhak.org>
In-Reply-To: <20090821232313.21a9a7f9@dev.lan.Awfulhak.org>
References:  <20090821164312.641fe2bd@dev.lan.Awfulhak.org> <3c1674c90908211713j36415b96q58b0ed66cc82713f@mail.gmail.com> <20090821215503.3eec9a15@dev.lan.Awfulhak.org> <20090821224134.11d9a2a1@dev.lan.Awfulhak.org> <20090821232313.21a9a7f9@dev.lan.Awfulhak.org>

On Fri, 21 Aug 2009 23:23:13 -0700 Brian Somers <brian@FreeBSD.org> wrote:
> On Fri, 21 Aug 2009 22:41:34 -0700 Brian Somers <brian@Awfulhak.org> wrote:
> > On Fri, 21 Aug 2009 21:55:03 -0700 Brian Somers <brian@FreeBSD.org> wrote:
> > > On Fri, 21 Aug 2009 17:13:45 -0700 Kip Macy <kmacy@freebsd.org> wrote:
> > > > Try this:
> > > >
> > > > Index: sys/net/flowtable.c
> > > > ===================================================================
> > > > --- sys/net/flowtable.c (revision 196382)
> > > > +++ sys/net/flowtable.c (working copy)
> > > > @@ -688,6 +688,12 @@
> > > >                 struct rtentry *rt = ro->ro_rt;
> > > >                 struct ifnet *ifp = rt->rt_ifp;
> > > >
> > > > +               if (ifp->if_flags & IFF_POINTOPOINT) {
> > > > +                       RTFREE(rt);
> > > > +                       ro->ro_rt = NULL;
> > > > +                       return (ENOENT);
> > > > +               }
> > > > +
> > > >                 if (rt->rt_flags & RTF_GATEWAY)
> > > >                         l3addr = rt->rt_gateway;
> > > >                 else
> > > >
> > > > You'll need to apply this by hand as gmail munges the formatting.
> > > >
> > > > -Kip
> > >
> > > Hi,
> > >
> > > That certainly stops the panic, however data routed to the tun
> > > interface doesn't come out the back end and data written
> > > to the back end doesn't come out the tun interface.
> > [.....]
> > > Maybe this problem isn't a routing problem.  I'll
> > > look into it further and figure out if the packet is getting to the tun
> > > driver and if so, what it thinks it's doing with it.
> >
> > I wasn't correct - the data *IS* being read out of the back of
> > the tunnel device.  When I send the ICMP, it goes into the tun
> > device and comes out the back end as an AF_LINK packet.  ppp
> > silently discards this (ironically I have a comment noting
> > that I should really track unidentified packet counts).
> >=20
> > I'll try to figure out what in if_tun.c is corrupting the family next...
>=20
> if_tun.c is fine.  The data passed from if_output() has family
> AF_LINK - hence the original panic from flowtable_lookup().
>=20
> So the question is "why is ip_output() sending AF_LINK traffic
> instead of AF_INET traffic?".
>=20
> Still looking....

From what I can tell, this is what is happening:

ip_output() is called with ro == NULL.
ip_output() calls flowtable_lookup() with a zeroed 'ro'.
flowtable_lookup() calls ft->ft_rtalloc() (really rtalloc1_fib()) to
initialise 'ro' and ends up with ro->ro_rt->rt_gateway->sa_family
set to AF_LINK.
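
To make the family mismatch concrete, here is a small userland sketch of the
sort of check that the KASSERT in in_lltable_lookup() is making.  It is not
the kernel code: gw_family_is_inet() is a made-up helper, and the fallback
AF_LINK definition is only there so it builds on systems without that family.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * AF_LINK is a BSD address family (18 on FreeBSD, hence "sin_family 18"
 * in the panic string).  Fallback definition so the sketch builds on
 * systems that lack it.
 */
#ifndef AF_LINK
#define AF_LINK 18
#endif

/*
 * Hypothetical helper, not a kernel function: reports whether a gateway
 * address has the family that the KASSERT expects (AF_INET).
 */
bool
gw_family_is_inet(const struct sockaddr *gw)
{
	return (gw != NULL && gw->sa_family == AF_INET);
}
```

With the route described above, rt_gateway->sa_family is AF_LINK, so a check
like this returns false, which is the condition the KASSERT panics on.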

Your original patch frees ro->ro_rt and fails before calling
llentry_update() with ro->ro_rt->rt_gateway->sa_family !=
AF_INET.

Now, when flowtable_lookup() fails, ro->ro_rt is NULL and
ip_output()'s 'dst' gets set up with family AF_INET.  Unfortunately,
right after this, after checking for IP_SENDONES, IP_ROUTETOIF
and IN_MULTICAST, the ip_output() code decides to call
in_rtalloc_ign() (which eventually just calls rtalloc1_fib()) to
initialise ro->ro_rt and then sets dst to be ro->ro_rt->rt_gateway
-- which is *still* an AF_LINK address!

Finally ip_output() calls ifp->if_output() (really tunoutput()) with
dst's family set to AF_LINK, tunoutput() queues it to the tun
character device, ppp reads it and drops it on the floor 'cos it
doesn't know what to do with AF_LINK.
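
The sequence above can be modelled in userland roughly as follows.  This is
a toy under my own assumptions, not the real code path: every toy_* name is
invented, the structures carry only the fields mentioned above, and the
IP_SENDONES/IP_ROUTETOIF/IN_MULTICAST checks are left out.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

#ifndef AF_LINK
#define AF_LINK 18	/* FreeBSD's value; fallback for other systems */
#endif

/* Toy stand-ins for struct rtentry and struct route. */
struct toy_rtentry {
	struct sockaddr *rt_gateway;
};

struct toy_route {
	struct toy_rtentry *ro_rt;
};

/* The route to the tun peer, with a gateway whose family is AF_LINK. */
static struct sockaddr toy_link_gw = { .sa_family = AF_LINK };
static struct toy_rtentry toy_tun_route = { .rt_gateway = &toy_link_gw };

/* Stand-in for rtalloc1_fib(): always hands back the same route. */
struct toy_rtentry *
toy_rtalloc(void)
{
	return (&toy_tun_route);
}

/* Stand-in for the patched flowtable_lookup(): fails, leaves ro_rt NULL. */
int
toy_flowtable_lookup(struct toy_route *ro)
{
	ro->ro_rt = NULL;
	return (ENOENT);
}

/* Returns the family of the dst that would be passed to if_output(). */
int
toy_ip_output_dst_family(void)
{
	struct toy_route ro = { NULL };
	int family = AF_INET;		/* dst is first set up as AF_INET */

	(void)toy_flowtable_lookup(&ro);
	if (ro.ro_rt == NULL)
		ro.ro_rt = toy_rtalloc();	/* the in_rtalloc_ign() step */
	if (ro.ro_rt != NULL)
		family = ro.ro_rt->rt_gateway->sa_family; /* dst = rt_gateway */
	return (family);
}
```

In this model toy_ip_output_dst_family() returns AF_LINK even though the
flowtable lookup failed cleanly, because the second route allocation hands
back the same gateway.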

The tun driver is more or less the same as the -stable version,
so it seems that ip_output() is to blame.  The only relevant part
that seems substantially different is rtalloc1_fib(), so right now
I'm guessing that the RTF_CLONING code in -stable always
clones the route with a gw family of AF_INET and expectations
are met after that.

I'll look some more on the weekend...

> > > > > On Fri, Aug 21, 2009 at 16:43, Brian Somers <brian@freebsd.org> wrote:
> > > > > Hi,
> > > > >
> > > > > I've been working on a fix to address an issue that came up with
> > > > > > our update of openssh-5.  The issue is that openssh-5 now uses
> > > > > > pipe() to create stdin/stdout channels between sshd and the server
> > > > > > side program where it used to use socketpair().  Because it uses
> > > > > > pipe(), stdin is no longer bi-directional and cannot be used for both
> > > > > > input and output by a child process.  This breaks the use of ssh
> > > > > > as a tunnel with ppp on either end (set device "!ssh -e none host
> > > > > > ppp -direct label")
> > > > > >
> > > > > > I talked with des@ for a while and then with the openssh folks and
> > > > > > have not been able to resolve the issues in openssh that made them
> > > > > > choose to enforce the use of pipe() over socketpair().  I now have a
> > > > > > patch to ppp that makes ppp detect that it's connected via pipe() and
> > > > > > causes it to use stdin for input and stdout for output (usually it expects
> > > > > > just one descriptor).  Although I'm happy with the patch and planned on
> > > > > requesting permission to commit, I've bumped into a show-stopper
> > > > > that seems unrelated, so I thought I'd ask here if anyone has seen
> > > > > this or has any suggestions as to what the problem might be.
> > > > >
> > > > > The issue....
> > > > >
> > > > > I'm seeing a panic when I send traffic through a ppp link:
> > > > >
> > > > > panic string is: sin_family 18
> > > > > Stack trace starts:
> > > > > >    in_lltable_lookup()
> > > > > >    llentry_update()
> > > > > >    flowtable_lookup()
> > > > > >    ip_output()
> > > > > >    ....
> > > > >
> > > > > > The panic is due to a KASSERT in in_lltable_lookup() that expects the
> > > > > > sockaddr to be AF_INET.  Number 18 is AF_LINK.
> > > > > >
> > > > > > AFAICT this is happening while setting up a temporary route for the
> > > > > > first outbound packet.  I haven't been able to do much investigation
> > > > > > yet due to other patches in my tree that seem to have broken all my
> > > > > kernel symbols, but once I get a clean rebuild I should be back in
> > > > > business.
> > > > >
> > > > > If anyone has any suggestions, I'm all ears!
> > > > >
> > > > > Cheers.

-- 
Brian Somers                                          <brian@Awfulhak.org>
Don't _EVER_ lose your sense of humour !               <brian@FreeBSD.org>



