From: Fabian Keil
To: Lawrence Stewart
Cc: freebsd-current@freebsd.org
Date: Sun, 20 Jun 2010 13:15:44 +0200
Subject: Re: [CFT] SIFTR - Statistical Information For TCP Research: Uncle Lawrence needs YOU!
Message-ID: <20100620131544.495ddecd@r500.local>
In-Reply-To: <4C1DED16.8020209@freebsd.org>
List-Id: Discussions about the use of FreeBSD-current

Lawrence Stewart wrote:

> On 06/20/10 03:58, Fabian Keil wrote:
> > Lawrence Stewart wrote:
> >
> >> On 06/13/10 18:12, Lawrence Stewart wrote:
> >
> >>> The time has come to solicit some external testing for my SIFTR tool.
> >>> I'm hoping to commit it within a week or so unless problems are discovered.
> >
> >>> I'm interested in all feedback and reports of success/failure, along
> >>> with details of the architecture tested and number of CPUs if you would
> >>> be so kind.
> >
> > I got the following hand-transcribed panic maybe a second after
> > sysctl net.inet.siftr.enabled=1
> >
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 1; apic id = 01
> > [...]
> > current process = 12 (swi4: clock)
> > [ thread pid 12 tid 100006 ]
> > Stopped at siftr_chkpkt+0xd0: addq $0x1,0x8(%r14)
> > db> where
> > Tracing pid 12 tid 100006 td 0xffffff00034037e0
> > siftr_chkpkt() at siftr_chkpkt+0xd0
> > pfil_run_hooks() at pfil_run_hooks+0xb4
> > ip_output() at ip_output+0x382
> > tcp_output() at tcp_output+0xa41
> > tcp_timer_rexmt() at tcp_timer_rexmt+0x251
> > softclock() at softclock+0x291
> > intr_event_execute_handlers() at intr_event_execute_handlers+0x66
> > ithread_loop() at ithread_loop+0x8e
> > fork_exit() at fork_exit+0x112
> > fork_trampoline() at fork_trampoline+0xe
> > --- trap 0, rip = 0, rsp = 0xffffff800003ad30, rbp = 0 ---
>
> So I've tracked down the line of code where the page fault is occurring:
>
> 	if (dir == PFIL_IN)
> 		ss->n_in++;
> 	else
> 		ss->n_out++;
>
> ss is a DPCPU (dynamic per-cpu) variable used to keep a set of stats
> per-cpu and is initialised at the start of the function like so:
>
> 	ss = DPCPU_PTR(ss);
>
> So for ss to be NULL, that implies DPCPU_PTR() is returning NULL on your
> machine. I know very little about the inner workings of the DPCPU_*
> macros, but I'm pretty sure the way I use them in SIFTR is correct or at
> least as intended.

siftr_chkpkt() passes ss to siftr_chkreinject() before dereferencing
it itself. I think if ss was NULL, the panic should already occur in
siftr_chkreinject(). To be sure I added:

diff --git a/sys/netinet/siftr.c b/sys/netinet/siftr.c
index 8bc3498..b9fdfe4 100644
--- a/sys/netinet/siftr.c
+++ b/sys/netinet/siftr.c
@@ -788,6 +788,16 @@ siftr_chkpkt(void *arg, struct mbuf **m, struct ifnet *ifp, int dir,
 	if (siftr_chkreinject(*m, dir, ss))
 		goto ret;
 
+	if (ss == NULL) {
+		printf("ss is NULL");
+		ss = DPCPU_PTR(ss);
+		if (ss == NULL) {
+			printf("ss is still NULL");
+			goto ret;
+		}
+	}
+
+
 	if (dir == PFIL_IN)
 		ss->n_in++;
 	else

which doesn't seem to affect the problem.
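
For illustration only (this is not part of the patch or of siftr.c): the
faulting instruction, addq $0x1,0x8(%r14), adds 1 to a 64-bit value at
offset 8 from the pointer in %r14. A rough userland sketch of why that
would match ss->n_out++ on the output path follows. It assumes the stats
structure begins with two 64-bit counters in the order n_in, n_out. That
layout and every *_sketch name are assumptions made for this example.
The sketch also shows that a NULL test accepts any non-NULL pointer,
which is one possible reading of why the added check changes nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical direction flags standing in for PFIL_IN / PFIL_OUT. */
#define DIR_IN_SKETCH	1
#define DIR_OUT_SKETCH	2

/*
 * Assumed layout: two 64-bit counters at the start of the per-CPU
 * stats structure. Under this assumption n_out sits at offset 8,
 * the displacement seen in the faulting addq.
 */
struct stats_sketch {
	uint64_t n_in;	/* offset 0 */
	uint64_t n_out;	/* offset 8 */
};

/* Mirrors the quoted if/else: bump one counter depending on direction. */
static void
count_pkt_sketch(struct stats_sketch *ss, int dir)
{
	if (dir == DIR_IN_SKETCH)
		ss->n_in++;
	else
		ss->n_out++;
}

/*
 * Same kind of guard as in the debugging patch. It only rejects NULL;
 * a non-NULL pointer to unmapped memory passes and can still fault
 * when it is dereferenced later.
 */
static int
null_guard_passes_sketch(const void *p)
{
	return (p != NULL);
}
```

Calling count_pkt_sketch(&s, DIR_OUT_SKETCH) on a zeroed struct leaves
n_in at 0 and sets n_out to 1, and offsetof(struct stats_sketch, n_out)
is 8 under the assumed layout.
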

> Could you please go ahead and retest using a GENERIC kernel and see if
> you can reproduce? There could be something in your custom kernel
> causing the offsets or linker set magic used by the DPCPU bits to break
> which in turn is triggering this panic in SIFTR.

I'll retry without pf first, and with GENERIC afterwards.

Fabian