Date: Wed, 13 May 2015 08:45:30 +0300
From: Sergey Kandaurov <pluknet@gmail.com>
To: Niclas Zeising <zeising@freebsd.org>
Cc: current <current@freebsd.org>
Subject: Re: Panic using QLogic NetXtreme II BCM57810 with latest CURRENT snapshot
Message-ID: <CAE-mSOK1o--DaYCyuGQLM4AtrQ=Wf=DEvtS2EkaxdmtPcwir-A@mail.gmail.com>
In-Reply-To: <55526EDD.4050105@freebsd.org>
References: <55526EDD.4050105@freebsd.org>

On 13 May 2015 at 00:21, Niclas Zeising <zeising@freebsd.org> wrote:
> Hi!
> I got the following panic with a QLogic NetXtreme II BCM57810 when
> trying to assign an IP address using dhclient. The network card uses
> the bxe driver. The machine in question is a HP DL380 Gen9.
>
> Kernel page fault with the following non-sleepable locks held:
> shared rw if_addr_lock (if_addr_lock) locked @ /usr/src/sys/net/if.c:1539
> exclusive sleep mutex bxe0_mcast_lock locked @
> /usr/src/sys/dev/bxe/bxe.c:12548
>
> See screenshots at the links below for details and a stack trace.
> I can provoke this panic at will, let me know if you need more details.
> Unfortunately I don't have access to a console where I can copy things
> out currently, so screenshots have to do.
>
> Screenshot 1: https://people.freebsd.org/~zeising/panic1.png
> Screenshot 2: https://people.freebsd.org/~zeising/panic2.png
>

I'm not in any way a network/bxe expert, and this is probably unrelated,
but I see at least a missing unlock on the error path there.

Index: sys/dev/bxe/bxe.c
===================================================================
--- sys/dev/bxe/bxe.c (revision 282468)
+++ sys/dev/bxe/bxe.c (working copy)
@@ -12551,6 +12551,7 @@
     rc = ecore_config_mcast(sc, &rparam, ECORE_MCAST_CMD_DEL);
     if (rc < 0) {
         BLOGE(sc, "Failed to clear multicast configuration: %d\n", rc);
+        BXE_MCAST_UNLOCK(sc);
         return (rc);
     }

BXE_MCAST_LOCK acquires two locks: the sc mutex, and if_maddr_rlock(ifp).
OTOH, in bxe_init_mcast_macs_list(), down the path, if_maddr_rlock
is acquired (and released) one more time: in the if_multiaddr_array /
if_multiaddr_count functions. Is it recursive?

Another one is bcopy under lock. It is probably inlined under
bxe_handle_rx_mode_tq() in ddb, so the actual place where it's
called is not visible.

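To illustrate the missing unlock on the error path noted above, here is a minimal userland sketch, not driver code. Everything prefixed with toy_ is hypothetical: struct toy_lock stands in for the lock taken by BXE_MCAST_LOCK, and toy_config() stands in for a call such as ecore_config_mcast() that may fail. It shows one common way to make sure every return path releases the lock, by routing the error case through a single exit label.

```c
#include <assert.h>
#include <stdio.h>

/* Toy stand-in for the driver's multicast lock (hypothetical, not bxe code). */
struct toy_lock {
    int held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
    assert(!l->held);   /* non-recursive: acquiring twice would be a bug */
    l->held = 1;
}

static void toy_lock_release(struct toy_lock *l)
{
    assert(l->held);    /* releasing a lock that is not held is a bug */
    l->held = 0;
}

/* Hypothetical operation that can fail; returns a negative value on error. */
static int toy_config(int fail)
{
    return fail ? -1 : 0;
}

/*
 * Single exit path: both the error case and the success case pass
 * through the release, so the lock cannot leak on an early return.
 */
static int toy_set_mc_list(struct toy_lock *l, int fail_first)
{
    int rc;

    toy_lock_acquire(l);

    rc = toy_config(fail_first);
    if (rc < 0) {
        fprintf(stderr, "Failed to clear multicast configuration: %d\n", rc);
        goto out;
    }

    rc = toy_config(0);

out:
    toy_lock_release(l);
    return rc;
}
```

With an early "return (rc)" in place of "goto out", the lock would still be held after the failing call, and the next acquire would trip the assertion in toy_lock_acquire().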
My guess is bcopy in bxe_init_mcast_macs_list():
    bcopy((mta + (i * ETHER_ADDR_LEN)), mc_mac->mac, ETHER_ADDR_LEN);

Previously, there was a pointer assignment, see stable/10:
    mc_mac->mac = (uint8_t *)LLADDR((struct sockaddr_dl *)ifma->ifma_addr);

mc_mac itself is malloc(M_ZERO)'ed, so that mc_mac->mac is NULL.

Probably bcopy should be restored to assignment (not even compile tested):

Index: sys/dev/bxe/bxe.c
===================================================================
--- sys/dev/bxe/bxe.c (revision 282468)
+++ sys/dev/bxe/bxe.c (working copy)
@@ -12506,7 +12506,7 @@
        to be different */
     for(i=0; i< mcnt; i++) {
-        bcopy((mta + (i * ETHER_ADDR_LEN)), mc_mac->mac, ETHER_ADDR_LEN);
+        mc_mac->mac = (uint8_t *)(mta + (i * ETHER_ADDR_LEN));
         ECORE_LIST_PUSH_TAIL(&mc_mac->link, &p->mcast_list);
         BLOGD(sc, DBG_LOAD,

-- 
wbr,
pluknet
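
The difference between copying into mc_mac->mac and assigning to it can be shown with a small userland sketch. The names toy_mac_elem, toy_build_list and TOY_ADDR_LEN are hypothetical stand-ins, and calloc() plays the role of malloc(M_ZERO). The sketch assumes, as the message above suggests, that the mac member is a pointer rather than an inline array, so a freshly zeroed element has nowhere to copy the bytes to.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_ADDR_LEN 6  /* same value as ETHER_ADDR_LEN */

/* Hypothetical element type: the address is a pointer, not an inline array. */
struct toy_mac_elem {
    uint8_t *mac;
};

/*
 * Builds n zeroed elements whose mac pointers reference consecutive
 * TOY_ADDR_LEN-byte slots in the caller's table mta.
 * Returns NULL if the allocation fails; the caller frees the result.
 */
static struct toy_mac_elem *toy_build_list(uint8_t *mta, int n)
{
    struct toy_mac_elem *elems = calloc((size_t)n, sizeof(*elems));

    if (elems == NULL)
        return NULL;

    for (int i = 0; i < n; i++) {
        /*
         * elems[i].mac is still NULL here, so copying bytes to it
         * (the bcopy/memcpy variant) would write through a NULL pointer.
         * Assigning the pointer avoids the write entirely.
         */
        assert(elems[i].mac == NULL);
        elems[i].mac = mta + (i * TOY_ADDR_LEN);
    }
    return elems;
}
```

One consequence of the assignment variant is that each element only borrows its address: the pointers stay valid only as long as the table they point into is not freed or overwritten.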