Date:      Sun, 20 Jun 2010 13:15:44 +0200
From:      Fabian Keil <freebsd-listen@fabiankeil.de>
To:        Lawrence Stewart <lstewart@freebsd.org>
Cc:        freebsd-current@freebsd.org
Subject:   Re: [CFT] SIFTR - Statistical Information For TCP Research: Uncle Lawrence needs YOU!
Message-ID:  <20100620131544.495ddecd@r500.local>
In-Reply-To: <4C1DED16.8020209@freebsd.org>
References:  <4C1492D0.6020704@freebsd.org> <4C1C3922.2050102@freebsd.org> <20100619195823.53a7baaa@r500.local> <4C1DED16.8020209@freebsd.org>


Lawrence Stewart <lstewart@freebsd.org> wrote:

> On 06/20/10 03:58, Fabian Keil wrote:
> > Lawrence Stewart<lstewart@freebsd.org>  wrote:
> >
> >> On 06/13/10 18:12, Lawrence Stewart wrote:
> >
> >>> The time has come to solicit some external testing for my SIFTR tool.
> >>> I'm hoping to commit it within a week or so unless problems are discovered.
> >
> >>> I'm interested in all feedback and reports of success/failure, along
> >>> with details of the architecture tested and number of CPUs if you would
> >>> be so kind.
> >
> > I got the following hand-transcribed panic maybe a second after
> > sysctl net.inet.siftr.enabled=1
> >
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 1; apic id = 01
> > [...]
> > current process = 12 (swi4: clock)
> > [ thread pid 12 tid 100006 ]
> > Stopped at	siftr_chkpkt+0xd0:	addq	$0x1,0x8(%r14)
> > db>  where
> > Tracing pid 12 tid 100006 td 0xffffff00034037e0
> > siftr_chkpkt() at siftr_chkpkt+0xd0
> > pfil_run_hooks() at pfil_run_hooks+0xb4
> > ip_output() at ip_output+0x382
> > tcp_output() at tcp_output+0xa41
> > tcp_timer_rexmt() at tcp_timer_rexmt+0x251
> > softclock() at softclock+0x291
> > intr_event_execute_handlers() at intr_event_execute_handlers+0x66
> > ithread_loop() at ithread_loop+0x8e
> > fork_exit() at fork_exit+0x112
> > fork_trampoline() at fork_trampoline+0xe
> > --- trap 0, rip = 0, rsp = 0xffffff800003ad30, rbp = 0 ---
>
> So I've tracked down the line of code where the page fault is occurring:
>
>          if (dir == PFIL_IN)
>                  ss->n_in++;
>          else
>                  ss->n_out++;
>
> ss is a DPCPU (dynamic per-cpu) variable used to keep a set of stats
> per-cpu and is initialised at the start of the function like so:
>
>          ss = DPCPU_PTR(ss);
>
> So for ss to be NULL, that implies DPCPU_PTR() is returning NULL on your
> machine. I know very little about the inner workings of the DPCPU_*
> macros, but I'm pretty sure the way I use them in SIFTR is correct or at
> least as intended.

siftr_chkpkt() passes ss to siftr_chkreinject() before dereferencing
it itself. I think if ss was NULL, the panic should already occur in
siftr_chkreinject().

To be sure I added:

diff --git a/sys/netinet/siftr.c b/sys/netinet/siftr.c
index 8bc3498..b9fdfe4 100644
--- a/sys/netinet/siftr.c
+++ b/sys/netinet/siftr.c
@@ -788,6 +788,16 @@ siftr_chkpkt(void *arg, struct mbuf **m, struct ifnet *ifp, int dir,
        if (siftr_chkreinject(*m, dir, ss))
                goto ret;
 
+       if (ss == NULL) {
+           printf("ss is NULL");
+           ss =3D DPCPU_PTR(ss);
+           if (ss == NULL) {
+              printf("ss is still NULL");
+              goto ret;
+           }
+        }
+
+
        if (dir == PFIL_IN)
                ss->n_in++;
        else

which doesn't seem to affect the problem.

> Could you please go ahead and retest using a GENERIC kernel and see if
> you can reproduce? There could be something in your custom kernel
> causing the offsets or linker set magic used by the DPCPU bits to break
> which in turn is triggering this panic in SIFTR.

I'll retry without pf first, and with GENERIC afterwards.

Fabian



