Date:      Wed, 13 May 2015 08:45:30 +0300
From:      Sergey Kandaurov <pluknet@gmail.com>
To:        Niclas Zeising <zeising@freebsd.org>
Cc:        current <current@freebsd.org>
Subject:   Re: Panic using QLogic NetXtreme II BCM57810 with latest CURRENT snapshot
Message-ID:  <CAE-mSOK1o--DaYCyuGQLM4AtrQ=Wf=DEvtS2EkaxdmtPcwir-A@mail.gmail.com>
In-Reply-To: <55526EDD.4050105@freebsd.org>
References:  <55526EDD.4050105@freebsd.org>

On 13 May 2015 at 00:21, Niclas Zeising <zeising@freebsd.org> wrote:
> Hi!
> I got the following panic with a QLogic NetXtreme II BCM57810 when
> trying to assign an IP address using dhclient.  The network card uses
> the bxe driver.  The machine in question is a HP DL380 Gen9.
>
> Kernel page fault with the following non-sleepable locks held:
> shared rw if_addr_lock (if_addr_lock) locked @ /usr/src/sys/net/if.c:1539
> exclusive sleep mutex bxe0_mcast_lock locked @
> /usr/src/sys/dev/bxe/bxe.c:12548
>
> See screenshots at the links below for details and a stack trace.
> I can provoke this panic at will, let me know if you need more details.
>  Unfortunately I don't have access to a console where I can copy things
> out currently, so screenshots have to do.
>
> Screenshot 1: https://people.freebsd.org/~zeising/panic1.png
> Screenshot 2: https://people.freebsd.org/~zeising/panic2.png
>

I'm not a network/bxe expert by any means, and this is probably unrelated,
but I do see at least one missing unlock on the error path.

Index: sys/dev/bxe/bxe.c
===================================================================
--- sys/dev/bxe/bxe.c   (revision 282468)
+++ sys/dev/bxe/bxe.c   (working copy)
@@ -12551,6 +12551,7 @@
     rc = ecore_config_mcast(sc, &rparam, ECORE_MCAST_CMD_DEL);
     if (rc < 0) {
         BLOGE(sc, "Failed to clear multicast configuration: %d\n", rc);
+        BXE_MCAST_UNLOCK(sc);
         return (rc);
     }

BXE_MCAST_LOCK acquires two locks: the sc mutex and if_maddr_rlock(ifp).
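
To illustrate the error-path point outside the kernel, here is a minimal
userland sketch using pthreads. Everything named demo_* is hypothetical and
only stands in for the driver code; it is not the bxe implementation. It
routes every exit through a single unlock, so an early failure cannot leave
the mutex held:

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver softc; not the real struct bxe_softc. */
struct demo_softc {
    pthread_mutex_t mcast_mtx;
    int             fail_del;   /* when non-zero, the "DEL" step fails */
};

/* Hypothetical stand-in for ecore_config_mcast(); returns < 0 on failure. */
static int
demo_config_mcast(struct demo_softc *sc)
{
    return (sc->fail_del ? -1 : 0);
}

/* Take the lock, do the work, and release the lock on every path. */
static int
demo_set_mc_list(struct demo_softc *sc)
{
    int rc;

    pthread_mutex_lock(&sc->mcast_mtx);

    rc = demo_config_mcast(sc);
    if (rc < 0) {
        fprintf(stderr, "Failed to clear multicast configuration: %d\n", rc);
        goto out;               /* error path still reaches the unlock */
    }
    /* ... further configuration would go here ... */
out:
    pthread_mutex_unlock(&sc->mcast_mtx);
    return (rc);
}

/* Returns 1 if the mutex can be taken right now, i.e. nobody leaked it. */
static int
demo_lock_is_free(struct demo_softc *sc)
{
    if (pthread_mutex_trylock(&sc->mcast_mtx) != 0)
        return (0);
    pthread_mutex_unlock(&sc->mcast_mtx);
    return (1);
}
```

Calling demo_set_mc_list() with fail_del set and then demo_lock_is_free()
shows the mutex is released even when the call fails. The patch above does
the same thing by hand, with an explicit unlock before the early return.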

OTOH, in bxe_init_mcast_macs_list(), further down the path, if_maddr_rlock
is acquired (and released) once more, inside the if_multiaddr_array /
if_multiaddr_count functions. Is that lock recursive?

Another suspect is the bcopy done under the lock. It is probably inlined
into bxe_handle_rx_mode_tq(), so the ddb trace does not show the actual
place where it is called.
My guess is the bcopy in bxe_init_mcast_macs_list():

         bcopy((mta + (i * ETHER_ADDR_LEN)), mc_mac->mac, ETHER_ADDR_LEN);

Previously, there was a pointer assignment, see stable/10:

        mc_mac->mac = (uint8_t *)LLADDR((struct sockaddr_dl *)ifma->ifma_addr);

mc_mac itself is malloc(M_ZERO)'ed, so mc_mac->mac is NULL and the bcopy
writes through a NULL pointer.
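
As a rough userland illustration of why the two forms differ (all demo_*
names are hypothetical, and calloc() stands in for malloc(9) with M_ZERO):
a zero-filled element has a NULL mac pointer, so copying bytes to it would
fault, while assigning the pointer just makes it refer to the right slot in
the flat address table.

```c
#include <stdint.h>
#include <stdlib.h>

#define DEMO_ADDR_LEN 6     /* same value as ETHER_ADDR_LEN */

/* Hypothetical list element: 'mac' is a pointer, not an inline array. */
struct demo_mcast_elem {
    uint8_t *mac;
};

/*
 * calloc() zero-fills the memory, much like M_ZERO does, so on common
 * platforms every elem->mac starts out as NULL.
 */
static struct demo_mcast_elem *
demo_alloc_elems(size_t n)
{
    return (calloc(n, sizeof(struct demo_mcast_elem)));
}

/*
 * Point each element at its slot in the flat table 'mta'.  No bytes are
 * copied, so 'mta' must stay valid for as long as the elements are used.
 */
static void
demo_fill_by_assignment(struct demo_mcast_elem *elems, uint8_t *mta, size_t n)
{
    for (size_t i = 0; i < n; i++)
        elems[i].mac = mta + (i * DEMO_ADDR_LEN);
}
```

The trade-off with assignment is lifetime: the address table has to outlive
the list that points into it. The alternative would be to give each element
its own storage before copying.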

The bcopy should probably be reverted to a pointer assignment (not even
compile tested):

Index: sys/dev/bxe/bxe.c
===================================================================
--- sys/dev/bxe/bxe.c   (revision 282468)
+++ sys/dev/bxe/bxe.c   (working copy)
@@ -12506,7 +12506,7 @@
                                                       to be  different */
     for(i=0; i< mcnt; i++) {

-        bcopy((mta + (i * ETHER_ADDR_LEN)), mc_mac->mac, ETHER_ADDR_LEN);
+        mc_mac->mac = (uint8_t *)(mta + (i * ETHER_ADDR_LEN));
         ECORE_LIST_PUSH_TAIL(&mc_mac->link, &p->mcast_list);

         BLOGD(sc, DBG_LOAD,

-- 
wbr,
pluknet


