From: "Andrey V. Elsukov"
Date: Fri, 25 Apr 2014 20:42:59 +0400
To: Alan Somers, Adrian Chadd
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject: Re: svn commit: r253687 - head/sys/net
Message-ID: <535A9093.6010201@FreeBSD.org>
References: <201307261941.r6QJfEMO087844@svn.freebsd.org>

On 25.04.2014 19:58, Alan Somers wrote:
> On Fri, Jul 26, 2013 at 1:41 PM, Adrian Chadd wrote:
>> Author: adrian
>> Date: Fri Jul 26 19:41:13 2013
>> New Revision: 253687
>> URL: http://svnweb.freebsd.org/changeset/base/253687
>>
>> Log:
>>   Break out the static, global LACP debug options into a per-lagg unit
>>   sysctl tree.
>>
>>   * Create a net.link.lagg.X.lacp node
>
> I think this introduced a lock order reversal.
>
>>   * Add a debug node under that for tx_test and rx_test
>>   * Add lacp_strict_mode, defaulting to 1
>>
>>   tx_test and rx_test are still a bitmap of unit numbers for now.
>>   At some point it would be nice to create child nodes of the lagg bundle
>>   for each sub-interface, and then populate those with various knobs
>>   and statistics.
>>
>>   Sponsored by: Netflix
>>
>> Modified:
>>   head/sys/net/ieee8023ad_lacp.c
>>   head/sys/net/ieee8023ad_lacp.h
>>   head/sys/net/if_lagg.c
>>   head/sys/net/if_lagg.h
>>
>> Modified: head/sys/net/ieee8023ad_lacp.c
>> ==============================================================================
>> --- head/sys/net/ieee8023ad_lacp.c Fri Jul 26 19:11:08 2013 (r253686)
>> +++ head/sys/net/ieee8023ad_lacp.c Fri Jul 26 19:41:13 2013 (r253687)
>
>
>> @@ -765,10 +791,19 @@ lacp_attach(struct lagg_softc *sc)
>>
>>         lsc->lsc_hashkey = arc4random();
>>         lsc->lsc_active_aggregator = NULL;
>> +       lsc->lsc_strict_mode = 1;
>>         LACP_LOCK_INIT(lsc);
>>         TAILQ_INIT(&lsc->lsc_aggregators);
>>         LIST_INIT(&lsc->lsc_ports);
>>
>> +       /* Create a child of the parent lagg interface */
>> +       oid = SYSCTL_ADD_NODE(&sc->ctx, SYSCTL_CHILDREN(sc->sc_oid),
>> +           OID_AUTO, "lacp", CTLFLAG_RD, NULL, "LACP");
>
> This line grabs a sleepable lock, but we already had a nonsleepable
> lock further up the stack, acquired in lagg_ioctl().
>
>> +
>> +       /* Attach sysctl nodes */
>> +       lacp_attach_sysctl(lsc, oid);
>> +       lacp_attach_sysctl_debug(lsc, oid);
>> +
>>         callout_init_mtx(&lsc->lsc_transit_callout, &lsc->lsc_mtx, 0);
>>         callout_init_mtx(&lsc->lsc_callout, &lsc->lsc_mtx, 0);
>>
>
> Here's the warning from Witness, as well as a warning from UMA. Many
> more UMA warnings followed.
>
> lock order reversal: (sleepable after non-sleepable)
> 1st 0xfffff8000252ca08 if_lagg rmlock (if_lagg rmlock) @
> /usr/home/alans/freebsd/head/sys/modules/if_lagg/../../net/if_lagg.c:1040
> 2nd 0xffffffff814ef4e0 sysctl lock (sysctl lock) @
> /usr/home/alans/freebsd/head/sys/kern/kern_sysctl.c:474
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00977485b0
> kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe0097748660
> witness_checkorder() at witness_checkorder+0xdc2/frame 0xfffffe00977486f0
> _sx_xlock() at _sx_xlock+0x75/frame 0xfffffe0097748730
> sysctl_add_oid() at sysctl_add_oid+0x4a/frame 0xfffffe0097748780
> lacp_attach() at lacp_attach+0xf7/frame 0xfffffe00977487f0
> lagg_lacp_attach() at lagg_lacp_attach+0x88/frame 0xfffffe0097748810
> lagg_ioctl() at lagg_ioctl+0x98a/frame 0xfffffe00977488f0
> in_control() at in_control+0x38e/frame 0xfffffe0097748970
> ifioctl() at ifioctl+0xba2/frame 0xfffffe0097748a30
> kern_ioctl() at kern_ioctl+0x22b/frame 0xfffffe0097748a90
> sys_ioctl() at sys_ioctl+0x13c/frame 0xfffffe0097748ae0
> amd64_syscall() at amd64_syscall+0x25a/frame 0xfffffe0097748bf0
> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0097748bf0
> --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800fa045a, rsp =
> 0x7fffffffe118, rbp = 0x7fffffffe1a0 ---
> uma_zalloc_arg: zone "128" with the following non-sleepable locks held:
> exclusive rm if_lagg rmlock (if_lagg rmlock) r = 0
> (0xfffff8000252ca08) locked @
> /usr/home/alans/freebsd/head/sys/modules/if_lagg/../../net/if_lagg.c:1040
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0097748500
> kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe00977485b0
> witness_warn() at witness_warn+0x4b5/frame 0xfffffe0097748670
> uma_zalloc_arg() at uma_zalloc_arg+0x3b/frame 0xfffffe00977486e0
> malloc() at malloc+0x194/frame 0xfffffe0097748730
> sysctl_add_oid() at sysctl_add_oid+0x11f/frame 0xfffffe0097748780
> lacp_attach() at lacp_attach+0xf7/frame 0xfffffe00977487f0
> lagg_lacp_attach() at lagg_lacp_attach+0x88/frame 0xfffffe0097748810
> lagg_ioctl() at lagg_ioctl+0x98a/frame 0xfffffe00977488f0
> in_control() at in_control+0x38e/frame 0xfffffe0097748970
> ifioctl() at ifioctl+0xba2/frame 0xfffffe0097748a30
> kern_ioctl() at kern_ioctl+0x22b/frame 0xfffffe0097748a90
> sys_ioctl() at sys_ioctl+0x13c/frame 0xfffffe0097748ae0
> amd64_syscall() at amd64_syscall+0x25a/frame 0xfffffe0097748bf0
> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0097748bf0
> --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800fa045a, rsp =
> 0x7fffffffe118, rbp = 0x7fffffffe1a0 ---
>
>
> # uname -a
> FreeBSD alans-fbsd-head 11.0-CURRENT FreeBSD 11.0-CURRENT #49
> r264887M: Thu Apr 24 17:21:48 MDT 2014
> alans@ns1.eng.sldomain.com:/vmpool/obj/usr/home/alans/freebsd/head/sys/GENERIC
> amd64
>
> To reproduce:
> ifconfig tap0 create
> ifconfig tap1 create
> ifconfig tap2 create
> ifconfig lagg0 create
> ifconfig lagg0 up laggproto lacp laggport tap0 laggport tap1 laggport
> tap2 192.0.0.2/24
>
> If I create and destroy the lagg in a tight loop, while running
> "ifconfig -am" in a tight loop in another terminal, I eventually hit a
> general protection fault in __mtx_lock_sleep. I think it might be
> related.

Do you have a backtrace from this panic?

> Can you reproduce this? Do you have any good ideas for a solution?

I can reproduce a lot of LOR messages, but no panic.

-- 
WBR, Andrey V. Elsukov