From: "Alexander V. Chernikov"
Date: Wed, 10 Jul 2013 12:32:09 +0400
To: freebsd-net@freebsd.org
Cc: Luigi Rizzo
Subject: netmap receiver crashes driver on exit
Message-ID: <51DD1C09.4060606@yandex-team.ru>

Hello list!

It seems there are still some rough edges in the netmap API.

I'm currently experimenting with a netmap receiver on fresh -current (r252470) with ixgbe. Every time the receiver is killed, dumps core, or is interrupted with ^C (stock pkt-gen with -f rx can act as such a receiver), the kernel panics after a random pause (10-300 seconds):

(kgdb) bt
#0 doadump (textdump=) at pcpu.h:236
#1 0xffffffff8083ed70 in kern_reboot (howto=260) at /home/melifaro/netmap_10/sys/kern/kern_shutdown.c:447
#2 0xffffffff8083f135 in panic (fmt=) at /home/melifaro/netmap_10/sys/kern/kern_shutdown.c:754
#3 0xffffffff80bbb1b5 in trap_fatal (frame=, eva=) at /home/melifaro/netmap_10/sys/amd64/amd64/trap.c:873
#4 0xffffffff80bbb48b in trap_pfault (frame=0x0, usermode=0) at /home/melifaro/netmap_10/sys/amd64/amd64/trap.c:699
#5 0xffffffff80bbac55 in trap (frame=0xffffff9046d52810) at /home/melifaro/netmap_10/sys/amd64/amd64/trap.c:463
#6 0xffffffff80ba4ff2 in calltrap () at exception.S:232
#7 0xffffffff81a22051 in ixgbe_rxeof (que=0xfffffe0e1f6ce000) at /home/melifaro/netmap_10/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:4484
#8 0xffffffff81a22e55 in ixgbe_msix_que (arg=0xfffffe0e1f6ce000) at /home/melifaro/netmap_10/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:1515
#9 0xffffffff80812f28 in intr_event_execute_handlers (p=, ie=0xfffffe012083f300) at /home/melifaro/netmap_10/sys/kern/kern_intr.c:1263
#10 0xffffffff808133f8 in ithread_loop (arg=0xfffffe0120849820) at
/home/melifaro/netmap_10/sys/kern/kern_intr.c:1276
#11 0xffffffff80810bea in fork_exit (callout=0xffffffff808132d0 , arg=0xfffffe0120849820, frame=0xffffff9046d52ac0) at /home/melifaro/netmap_10/sys/kern/kern_fork.c:991
#12 0xffffffff80ba552e in fork_trampoline () at exception.S:606
#13 0x0000000000000000 in ?? ()

(kgdb) up 7
#7 0xffffffff81a22051 in ixgbe_rxeof (que=0xfffffe0e1f6ce000) at /home/melifaro/netmap_10/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:4484
4484 mp->m_len = len;
(kgdb) p mp
$1 = (struct mbuf *) 0x0
(kgdb) p i
$2 = 279
(kgdb) p *rxr
$3 = {adapter = 0xffffff8001245000,
  rx_mtx = {lock_object = {lo_name = 0xfffffe0d52a7f0ae "ix0:rx(0)", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 18446741879526785024},
  me = 0, rx_base = 0xffffff9046d43000,
  rxdma = {dma_paddr = 4822675456, dma_vaddr = 0xffffff9046d43000 "", dma_tag = 0xfffffe01301d8e00, dma_map = 0xffffffff812bbf08, dma_seg = {ds_addr = 0, ds_len = 0}, dma_size = 16384, dma_nseg = 0},
  lro = {ifp = 0x0, lro_queued = 0, lro_flushed = 0, lro_bad_csum = 0, lro_cnt = 0, lro_active = {slh_first = 0x0}, lro_free = {slh_first = 0x0}},
  lro_enabled = false, hw_rsc = false, discard = false, vtag_strip = false,
  next_to_refresh = 277, next_to_check = 279, num_desc = 1024, mbuf_sz = 2048, process_limit = 65535,
  mtx_name = "ix0:rx(0)\000\000\000\000\000\000",
  rx_buffers = 0xffffff81cb9e5000, ptag = 0xfffffe01301d8300,
  bytes = 174, packets = 0, rx_irq = 0, rx_copies = 522, rx_packets = 3631, rx_bytes = 196720, rx_discarded = 0, rsc_num = 0, flm = 0}
(kgdb) p rbuf
$4 = (struct ixgbe_rx_buf *) 0xffffff81cb9e7b98
(kgdb) p *rbuf
$5 = {buf = 0x0, fmp = 0x0, pmap = 0x0, flags = 0, addr = 6467809280}

More specifically:

A small traffic rate (~180 packets/s) was constantly flowing on ix0 (so each interrupt grabs 1 packet).
ix0 was opened by netmap for several seconds.
After that, the netmap program was killed.
The panic usually follows after ~30 seconds.
ix configuration: 4 queues, ring length 1024.

Some more investigation (from another, similar dump):

define list_ring
  set $rxr = (struct rx_ring *)$arg0
  set $i = 0
  while $i < $rxr->num_desc
    set $rbuf = &$rxr->rx_buffers[$i]
    if $rbuf->buf == 0
      p $i
      p *$rbuf
    end
    set $i = $i + 1
  end
  p $i
end

(kgdb) p ifindex_table[3]->ife_ifnet->if_xname
$553 = "ix0", '\0'
(kgdb) p ifindex_table[4]->ife_ifnet->if_xname
$554 = "ix1", '\0'

(kgdb) p &((struct adapter *)ifindex_table[3]->ife_ifnet->if_softc)->rx_rings[0]
$529 = (struct rx_ring *) 0xfffffe0120846800
(kgdb) p &((struct adapter *)ifindex_table[3]->ife_ifnet->if_softc)->rx_rings[1]
$530 = (struct rx_ring *) 0xfffffe0120846910
(kgdb) p &((struct adapter *)ifindex_table[3]->ife_ifnet->if_softc)->rx_rings[2]
$531 = (struct rx_ring *) 0xfffffe0120846a20
(kgdb) p &((struct adapter *)ifindex_table[3]->ife_ifnet->if_softc)->rx_rings[3]
$532 = (struct rx_ring *) 0xfffffe0120846b30

(kgdb) list_ring $529
$533 = 1024
(kgdb) list_ring $530
$534 = 591
$535 = {buf = 0x0, fmp = 0x0, pmap = 0x0, flags = 0, addr = 0}
$536 = 1024
(kgdb) list_ring $531
$537 = 274
$538 = {buf = 0x0, fmp = 0x0, pmap = 0x0, flags = 0, addr = 0}
$539 = 276
$540 = {buf = 0x0, fmp = 0x0, pmap = 0x0, flags = 0, addr = 0}
$541 = 1024
(kgdb) list_ring $532
$542 = 592
$543 = {buf = 0x0, fmp = 0x0, pmap = 0x0, flags = 0, addr = 6021709824}
$544 = 1024

(kgdb) p &((struct adapter *)ifindex_table[4]->ife_ifnet->if_softc)->rx_rings[0]
$545 = (struct rx_ring *) 0xfffffe0120845000
(kgdb) p &((struct adapter *)ifindex_table[4]->ife_ifnet->if_softc)->rx_rings[1]
$546 = (struct rx_ring *) 0xfffffe0120845110
(kgdb) p &((struct adapter *)ifindex_table[4]->ife_ifnet->if_softc)->rx_rings[2]
$547 = (struct rx_ring *) 0xfffffe0120845220
(kgdb) p &((struct adapter *)ifindex_table[4]->ife_ifnet->if_softc)->rx_rings[3]
$548 = (struct rx_ring *) 0xfffffe0120845330

(kgdb) list_ring $545
$549 = 1024
(kgdb) list_ring $546
$550 = 1024
(kgdb) list_ring $547
$551 = 1024
(kgdb) list_ring $548
$552 = 1024

What can I do to further debug/fix this issue?
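
For reference, the walk that the list_ring macro performs can be modelled in plain userland C. This is only an illustrative sketch: struct rx_buf_model, struct rx_ring_model and find_empty_slots() are hypothetical names made up for the example, not the ixgbe driver's definitions, and only the buf and addr fields seen in the dumps are modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-slot RX buffer state; only the two
 * fields inspected in the dumps are modelled. */
struct rx_buf_model {
    void *buf;                /* mbuf pointer; NULL means no mbuf attached */
    unsigned long long addr;  /* DMA address recorded for the slot */
};

/* Hypothetical stand-in for the ring: the slot array plus its length. */
struct rx_ring_model {
    struct rx_buf_model *rx_buffers;
    size_t num_desc;
};

/*
 * Same walk as the list_ring macro: visit every slot and note each index
 * whose buf pointer is NULL. Up to max_out indices are stored in out;
 * the return value is the total number of empty slots found.
 */
static size_t
find_empty_slots(const struct rx_ring_model *ring, size_t *out, size_t max_out)
{
    size_t found = 0;

    assert(ring != NULL && ring->rx_buffers != NULL);
    for (size_t i = 0; i < ring->num_desc; i++) {
        if (ring->rx_buffers[i].buf == NULL) {
            if (found < max_out)
                out[found] = i;
            found++;
        }
    }
    return found;
}
```

In this model, a ring like ix0 queue 2 in the dump would report two empty slots (274 and 276), while every ix1 ring would report none.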