Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 27 Nov 2007 01:29:54 +0100
From:      Max Laier <max@love2party.net>
To:        freebsd-pf@freebsd.org
Subject:   Re: 7-STABLE panic: mtx_lock() of spin mutex %s @ %s:%d
Message-ID:  <200711270130.01165.max@love2party.net>
In-Reply-To: <474B5BD0.6040004@kasimir.com>
References:  <474B5BD0.6040004@kasimir.com>

next in thread | previous in thread | raw e-mail | index | archive | help
--nextPart4415305.DFEz4RlB8O
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

On Tuesday 27 November 2007, Florian Smeets wrote:
> Hi
>
> i was able to reproduce a hang on a 7-STABLE (csuped just after Scotts
> critical section MFC) firewall which runs mpd4 from ports and uses pf
> for packet filtering. Sometimes when i restart mpd4 the box just hangs.
> I have a up-script which calls /sbin/pfctl -f /etc/pf.conf.

It's an ALTQ problem, really.  I don't know if there is a down-script that=
=20
runs before the interface is destroyed, but if there is - adding=20
pfctl -Fq there should work around your problem.  I did have some patches=20
to have ALTQ working with dynamic interfaces, but I must have dropped=20
them.  I'll see what I can dig up in the next few days.

In short:

altq on foo0 ...

ifconfig foo0 destroy -> BOOM!

mpd has the unfortunate habit of destroying it's interface on session=20
restart - there might be ways to change that, though.

> After adding witness and invariants instead of the hang i get this LOR:
>
> Nov 26 18:06:39 fw-a kernel: lock order reversal:
> Nov 26 18:06:39 fw-a kernel: 1st 0xc0762a8c pf task mtx (pf task mtx) @
> /usr/src/sys/contrib/pf/net/pf_ioctl.c:1304
> Nov 26 18:06:39 fw-a kernel: 2nd 0xc07c470c ifnet (ifnet) @
> /usr/src/sys/net/if.c:1494
> Nov 26 18:06:39 fw-a kernel: KDB: stack backtrace:
> Nov 26 18:06:39 fw-a kernel:
> db_trace_self_wrapper(c0707544,ccdb7a3c,c054c0ae,c07098ff,c07c470c,...)
> at db_trace_self_wrapper+0x26
> Nov 26 18:06:39 fw-a kernel:
> kdb_backtrace(c07098ff,c07c470c,c0710de4,c0710de4,c0710c3d,...) at
> kdb_backtrace+0x29
> Nov 26 18:06:39 fw-a kernel:
> witness_checkorder(c07c470c,9,c0710c3d,5d6,0,...) at
> witness_checkorder+0x6de
> Nov 26 18:06:39 fw-a kernel:
> _mtx_lock_flags(c07c470c,0,c0710c3d,5d6,c205e520,...) at
> _mtx_lock_flags+0xbc
> Nov 26 18:06:39 fw-a kernel:
> ifunit(c205e520,0,c06f5a82,518,ccdb7ab8,...) at ifunit+0x2f
> Nov 26 18:06:39 fw-a kernel:
> pfioctl(c1d82600,c0104414,c205e520,3,c224da50,...) at pfioctl+0x2341
> Nov 26 18:06:39 fw-a kernel:
> devfs_ioctl_f(c1e55ca8,c0104414,c205e520,c1c97c00,c224da50,...) at
> devfs_ioctl_f+0xc9
> Nov 26 18:06:39 fw-a kernel:
> kern_ioctl(c224da50,3,c0104414,c205e520,1000000,...) at
> kern_ioctl+0x243 Nov 26 18:06:39 fw-a kernel:
> ioctl(c224da50,ccdb7cfc,c,c072bdfe,c0739810,...) at ioctl+0x134
> Nov 26 18:06:39 fw-a kernel: syscall(ccdb7d38) at syscall+0x2b3
> Nov 26 18:06:39 fw-a kernel: Xint0x80_syscall() at
> Xint0x80_syscall+0x20 Nov 26 18:06:39 fw-a kernel: --- syscall (54,
> FreeBSD ELF32, ioctl), eip =3D 0x281a6d43, esp =3D 0xbfbfde4c, ebp =3D
> 0xbfbfde78 ---
>
>
> and shortly after that this page fault:
>
> Fatal trap 12: page fault while in kernel mode
> fault virtual address	=3D 0xdeadc0de
> fault code		=3D supervisor read, page not present
> instruction pointer	=3D 0x20:0xc059e4a8
> stack pointer	        =3D 0x28:0xccd41890
> frame pointer	        =3D 0x28:0xccd41890
> code segment		=3D base 0x0, limit 0xfffff, type 0x1b
> 			=3D DPL 0, pres 1, def32 1, gran 1
> processor eflags	=3D interrupt enabled, resume, IOPL =3D 0
> current process		=3D 1474 (pfctl)
> Physical memory: 245 MB
> Dumping 60 MB: 45 29 13
>
> #0  doadump () at pcpu.h:195
> 195	pcpu.h: No such file or directory.
> 	in pcpu.h
> (kgdb) list *0xc059e4a8
> 0xc059e4a8 is in strlen (/usr/src/sys/libkern/strlen.c:41).
> 36	strlen(str)
> 37		const char *str;
> 38	{
> 39		register const char *s;
> 40
> 41		for (s =3D str; *s; ++s);
> 42		return(s - str);
> 43	}
> 44
> (kgdb) bt
> #0  doadump () at pcpu.h:195
> #1  0xc046dc59 in db_fncall (dummy1=3D-858515768, dummy2=3D0, dummy3=3D12,
> dummy4=3D0xccd41634 "") at /usr/src/sys/ddb/db_command.c:486
> #2  0xc046e1c5 in db_command_loop () at
> /usr/src/sys/ddb/db_command.c:401 #3  0xc046f935 in db_trap (type=3D12,
> code=3D0) at
> /usr/src/sys/ddb/db_main.c:222
> #4  0xc053bbe4 in kdb_trap (type=3D12, code=3D0, tf=3D0xccd41850) at
> /usr/src/sys/kern/subr_kdb.c:502
> #5  0xc06d02df in trap_fatal (frame=3D0xccd41850, eva=3D3735929054) at
> /usr/src/sys/i386/i386/trap.c:863
> #6  0xc06d0513 in trap_pfault (frame=3D0xccd41850, usermode=3D0,
> eva=3D3735929054) at /usr/src/sys/i386/i386/trap.c:785
> #7  0xc06d0eb5 in trap (frame=3D0xccd41850) at
> /usr/src/sys/i386/i386/trap.c:463
> #8  0xc06b970b in calltrap () at /usr/src/sys/i386/i386/exception.s:139
> #9  0xc059e4a8 in strlen (str=3D0xdeadc0de <Address 0xdeadc0de out of
> bounds>) at /usr/src/sys/libkern/strlen.c:41
> #10 0xc053e4e5 in kvprintf (fmt=3D0xc0703a1c " @ %s:%d", func=3D0xc053d9a0
> <snprintf_func>, arg=3D0xccd419d0, radix=3D10, ap=3D0xccd41a0c " Mo??\001=
")
>      at /usr/src/sys/kern/subr_prf.c:750
> #11 0xc053ed2b in vsnprintf (str=3D0xc0779380 "mtx_lock() of spin mutex
> ", size=3D256, format=3D0xc0703a01 "mtx_lock() of spin mutex %s @ %s:%d",
> ap=3D0xccd41a08 "???? Mo??\001") at /usr/src/sys/kern/subr_prf.c:483 #12
> 0xc05152f3 in panic (fmt=3D0xc0703a01 "mtx_lock() of spin mutex %s @
> %s:%d") at /usr/src/sys/kern/kern_shutdown.c:530
> #13 0xc0509eba in _mtx_lock_flags (m=3D0xc1d3ad0c, opts=3D0,
> file=3D0xc06f4d20 "/usr/src/sys/contrib/altq/altq/altq_priq.c", line=3D41=
6)
> at
> /usr/src/sys/kern/kern_mutex.c:180
> #14 0xc0437b83 in priq_class_destroy (cl=3D0xc22138c0) at
> /usr/src/sys/contrib/altq/altq/altq_priq.c:416
> #15 0xc0437c7d in priq_remove_altq (a=3D0xc2048a80) at
> /usr/src/sys/contrib/altq/altq/altq_priq.c:252
> #16 0xc0438635 in altq_remove (a=3D0xc2048a80) at
> /usr/src/sys/contrib/altq/altq/altq_subr.c:650
> #17 0xc04508ee in pf_commit_altq (ticket=3D5) at
> /usr/src/sys/contrib/pf/net/pf_ioctl.c:867
> #18 0xc0456341 in pfioctl (dev=3D0xc1d82600, cmd=3D3222029394,
> addr=3D0xc22d32e0 "\a", flags=3D3, td=3D0xc2014a50) at
> /usr/src/sys/contrib/pf/net/pf_ioctl.c:3170
> #19 0xc04bcfe9 in devfs_ioctl_f (fp=3D0xc20270d8, com=3D3222029394,
> data=3D0xc22d32e0, cred=3D0xc200bd00, td=3D0xc2014a50) at
> /usr/src/sys/fs/devfs/devfs_vnops.c:494
> #20 0xc054d563 in kern_ioctl (td=3D0xc2014a50, fd=3D3, com=3D3222029394,
> data=3D0xc22d32e0 "\a") at file.h:266
> #21 0xc054d6c4 in ioctl (td=3D0xc2014a50, uap=3D0xccd41cfc) at
> /usr/src/sys/kern/sys_generic.c:570
> #22 0xc06d07f3 in syscall (frame=3D0xccd41d38) at
> /usr/src/sys/i386/i386/trap.c:1008
> #23 0xc06b9770 in Xint0x80_syscall () at
> /usr/src/sys/i386/i386/exception.s:196
> #24 0x00000033 in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> (kgdb) frame 13
> #13 0xc0509eba in _mtx_lock_flags (m=3D0xc1d3ad0c, opts=3D0,
> file=3D0xc06f4d20 "/usr/src/sys/contrib/altq/altq/altq_priq.c", line=3D41=
6)
> at
> /usr/src/sys/kern/kern_mutex.c:180
> 180		KASSERT(LOCK_CLASS(&m->lock_object) =3D=3D &lock_class_mtx_sleep,
> (kgdb) list
> 175	{
> 176
> 177		MPASS(curthread !=3D NULL);
> 178		KASSERT(m->mtx_lock !=3D MTX_DESTROYED,
> 179		    ("mtx_lock() of destroyed mutex @ %s:%d", file, line));
> 180		KASSERT(LOCK_CLASS(&m->lock_object) =3D=3D &lock_class_mtx_sleep,
> 181		    ("mtx_lock() of spin mutex %s @ %s:%d",
> m->lock_object.lo_name, 182		    file, line));
> 183		WITNESS_CHECKORDER(&m->lock_object, opts | LOP_NEWORDER |
> LOP_EXCLUSIVE,
> 184		    file, line);
> (kgdb)
>
>
> As the panic/page fault seems to be connected to the altq/priq rules i
> have added the definition of the altq part
>
> altq on $ext_if priq bandwidth 960Kb qlimit 250 queue { std_out,
> smtp_out, ssh_out, tcp_ack_out }
>
> queue smtp_out     priq(default) qlimit 250
> queue std_out    priority 2 qlimit 250
> queue ssh_out     priority 5 qlimit 250
> queue tcp_ack_out priority 6 qlimit 250
>
> And of course there are some pass rules which add packets to the
> queues.
>
> Is there anything else needed from the core dump?


=2D-=20
/"\  Best regards,                      | mlaier@freebsd.org
\ /  Max Laier                          | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | mlaier@EFnet
/ \  ASCII Ribbon Campaign              | Against HTML Mail and News

--nextPart4415305.DFEz4RlB8O
Content-Type: application/pgp-signature; name=signature.asc 
Content-Description: This is a digitally signed message part.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.4 (FreeBSD)

iD8DBQBHS2UJXyyEoT62BG0RAlGPAJ9uC/UGsCpXtyYqA64Lgo+Wm5F9TACfdOeP
HfdPJuYSHoADzXhD1BYxzfw=
=rWc7
-----END PGP SIGNATURE-----

--nextPart4415305.DFEz4RlB8O--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200711270130.01165.max>