Date:      Tue, 08 May 2018 19:37:55 +0200
From:      Harry Schmalzbauer <freebsd@omnilan.de>
To:        Sean Bruno <sbruno@freebsd.org>
Cc:        Kevin Bowling <kevin.bowling@kev009.com>, "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, Stephen Hurd <shurd@freebsd.org>
Subject:   Re: iflib-if_em tests with HEAD and lagg panic [Was: Re: svn commit: r333338 - in stable/11/sys: dev/bnxt kern net sys]
Message-ID:  <5AF1E073.5010701@omnilan.de>
In-Reply-To: <65972f0d-2873-42ea-464c-a3db543abafb@freebsd.org>
References:  <201805072142.w47LgN1R041002@repo.freebsd.org> <5AF16B8B.7030703@omnilan.de> <CAK7dMtBkCvLgPVnsf%2BECcrdbKNvOShONeZ=vqvg3dJ5ZeuoP5w@mail.gmail.com> <5AF17134.7020602@omnilan.de> <CAK7dMtB3V1F=2AxtsbUznn5DO81G3Zkh9UYiN3eWkyOfV_CYmg@mail.gmail.com> <5AF1CF0F.4040909@omnilan.de> <65972f0d-2873-42ea-464c-a3db543abafb@freebsd.org>

Regarding Sean Bruno's message of 08.05.2018 18:44 (localtime):
> 
> 
> On 05/08/18 10:23, Harry Schmalzbauer wrote:
>> Regarding Kevin Bowling's message of 08.05.2018 11:52 (localtime):
>> …
>>>> But if the simple iflib/hw-support test with kawela+hartwell helps, I'm
>>>> happy to do it.
>>>
>>> At this point it would be helpful, we think e1000 is nearing pretty
>>> good shape and I need to become familiar with any outstanding bugs.
>>
>> I started with hartwell:
>> em1: attach_pre capping queues at 2
>>
>> Current cap: 0x460b
>> em1: using 1024 tx descriptors and 1024 rx descriptors
>> em1: msix_init qsets capped at 2
>> em1: pxm cpus: 2 queue msgs: 4 admincnt: 1
>> em1: using 2 rx queues 2 tx queues
>> em1: Using MSIX interrupts with 3 vectors
>> em1: allocated for 2 tx_queues
>> em1: allocated for 2 rx_queues
>> em1: Ethernet address: 00:1b:21:3e:90:52
>> em1: netmap queues/slots: TX 2/1024, RX 2/1024
>> dev.em.1.iflib.driver_version: 7.6.1-k
>> dev.em.1.queue_rx_1.rx_irq: 0
>> dev.em.1.queue_rx_1.rxd_tail: 607
>> dev.em.1.queue_rx_1.rxd_head: 21
>> dev.em.1.queue_rx_0.rx_irq: 0
>> dev.em.1.queue_rx_0.rxd_tail: 410
>> dev.em.1.queue_rx_0.rxd_head: 412
>> dev.em.1.queue_tx_1.tx_irq: 0
>> dev.em.1.queue_tx_1.txd_tail: 8
>> dev.em.1.queue_tx_1.txd_head: 8
>> dev.em.1.queue_tx_0.tx_irq: 0
>> dev.em.1.queue_tx_0.txd_tail: 428
>> dev.em.1.queue_tx_0.txd_head: 428
>>
>> Looks good so far, no problems with simple line speed (NFS4) copies.
>>
>> According to the i217 (Clarkville) Datasheet, it also supports 2 queues:
>> Table 63. Intel® Ethernet Controller I217 Capability PHY Address 01,
>>           Page 776, Register 19
>> But that was probably never supported by the driver; at least I never
>> checked it pre-iflib.
>> Here's the Clarkville:
>> em0: attach_pre capping queues at 1
>> em0: using 1024 tx descriptors and 1024 rx descriptors
>> em0: msix_init qsets capped at
>> em0: PCIY_MSIX capability not found; or rid 0 == 0.
>> em0: Using an MSI interrupt
>> em0: allocated for 1 tx_queues
>> em0: allocated for 1 rx_queues
>> em0: Ethernet address: 54:be:f7:0b:d7:4e
>> em0: netmap queues/slots: TX 1/1024, RX 1/1024
>>
>> Since it's no effort here, I also tried LACP, which panicked.
>> vmcore available, but what debugger to use these days? kgdb seems to be
>> replaced...
>>
>> -harry
>> _____________
> 
> /usr/libexec/kgdb should be the old kgdb that you are used to.  Most of
> us have switched to using devel/gdb from ports.

Thanks, silly me – it's in libexec, not in my path...
Unfortunately I have no clue about those essential C tools, so it
doesn't make much sense for me to spend energy on installing devel/gdb ;-)
Though I do wonder why/how LLVM and gdb can be mixed... pure lack of
fundamentals on my side :-(

So back to the iflib-if_em panic. I set up an if_lagg(4) interface
consisting of an add-on 82574 and the on-board (PCH) i217 NIC. The i217
was assigned a locally administered Ethernet address and used as the first
laggport, so the private MAC was (successfully) set on both NICs.
Firing dhclient to get a lease then triggered this:


Sleeping on "e1000_delay" with the following non-sleepable locks held:
exclusive rm if_lagg rmlock (if_lagg rmlock) r = 0 (0xfffff80014228c08) locked @ /usr/src/sys/net/if_lagg.c:1433
stack backtrace:
#0 0xffffffff80701113 at witness_debugger+0x73
#1 0xffffffff807024f1 at witness_warn+0x461
#2 0xffffffff806a42cc at _sleep+0x6c
#3 0xffffffff806a4b34 at pause_sbt+0x144
#4 0xffffffff80440e21 at e1000_write_phy_reg_mdic+0xf1
#5 0xffffffff804446bf at e1000_enable_phy_wakeup_reg_access_bm+0x2f
#6 0xffffffff80432e0a at e1000_update_mc_addr_list_pch2lan+0x3a
#7 0xffffffff8041408f at em_if_multi_set+0x1bf
#8 0xffffffff807bc02e at iflib_if_ioctl+0xfe
#9 0xffffffff82111a15 at lagg_ioctl+0x115
#10 0xffffffff807dd348 at inm_release_task+0x218
#11 0xffffffff806dea29 at gtaskqueue_run_locked+0x139
#12 0xffffffff806de7a8 at gtaskqueue_thread_loop+0x88
#13 0xffffffff80659d84 at fork_exit+0x84
#14 0xffffffff809b767e at fork_trampoline+0xe
Sleeping thread (tid 100017, pid 0) owns a non-sleepable lock
KDB: stack backtrace of thread 100017:
sched_switch() at sched_switch+0x945/frame 0xfffffe00750dc5d0
mi_switch() at mi_switch+0x18c/frame 0xfffffe00750dc600
sleepq_switch() at sleepq_switch+0x10d/frame 0xfffffe00750dc640
sleepq_timedwait() at sleepq_timedwait+0x50/frame 0xfffffe00750dc680
_sleep() at _sleep+0x307/frame 0xfffffe00750dc730
pause_sbt() at pause_sbt+0x144/frame 0xfffffe00750dc780
e1000_write_phy_reg_mdic() at e1000_write_phy_reg_mdic+0xf1/frame 0xfffffe00750dc7c0
e1000_enable_phy_wakeup_reg_access_bm() at e1000_enable_phy_wakeup_reg_access_bm+0x2f/frame 0xfffffe00750dc7e0
e1000_update_mc_addr_list_pch2lan() at e1000_update_mc_addr_list_pch2lan+0x3a/frame 0xfffffe00750dc820
em_if_multi_set() at em_if_multi_set+0x1bf/frame 0xfffffe00750dc870
iflib_if_ioctl() at iflib_if_ioctl+0xfe/frame 0xfffffe00750dc8e0
lagg_ioctl() at lagg_ioctl+0x115/frame 0xfffffe00750dc990
inm_release_task() at inm_release_task+0x218/frame 0xfffffe00750dc9f0
gtaskqueue_run_locked() at gtaskqueue_run_locked+0x139/frame 0xfffffe00750dca40
gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0x88/frame 0xfffffe00750dca70
fork_exit() at fork_exit+0x84/frame 0xfffffe00750dcab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00750dcab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
panic: sleeping thread
cpuid = 3
time = 1525794682
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe008fe180e0
vpanic() at vpanic+0x1a3/frame 0xfffffe008fe18140
panic() at panic+0x43/frame 0xfffffe008fe181a0
propagate_priority() at propagate_priority+0x335/frame 0xfffffe008fe181e0
turnstile_wait() at turnstile_wait+0x38d/frame 0xfffffe008fe18230
__mtx_lock_sleep() at __mtx_lock_sleep+0x1e1/frame 0xfffffe008fe182b0
__mtx_lock_flags() at __mtx_lock_flags+0xf9/frame 0xfffffe008fe18300
_rm_rlock() at _rm_rlock+0x280/frame 0xfffffe008fe18330
_rm_rlock_debug() at _rm_rlock_debug+0x14c/frame 0xfffffe008fe18380
lagg_transmit() at lagg_transmit+0x38/frame 0xfffffe008fe183f0
ether_output_frame() at ether_output_frame+0xaa/frame 0xfffffe008fe18420
ether_output() at ether_output+0x68b/frame 0xfffffe008fe184c0
arprequest() at arprequest+0x474/frame 0xfffffe008fe185c0
arp_ifinit() at arp_ifinit+0x58/frame 0xfffffe008fe18600
ether_ioctl() at ether_ioctl+0x1d1/frame 0xfffffe008fe18630
lagg_ioctl() at lagg_ioctl+0x602/frame 0xfffffe008fe186e0
in_control() at in_control+0x8f5/frame 0xfffffe008fe18780
ifioctl() at ifioctl+0x19c6/frame 0xfffffe008fe18850
kern_ioctl() at kern_ioctl+0x2b9/frame 0xfffffe008fe188b0
sys_ioctl() at sys_ioctl+0x168/frame 0xfffffe008fe18980
amd64_syscall() at amd64_syscall+0x2cc/frame 0xfffffe008fe18ab0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe008fe18ab0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x8004820ba, rsp = 0x7fffffffe1c8, rbp = 0x7fffffffe210 ---
KDB: enter: panic
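
For reference, the setup described before the trace corresponds roughly to
an rc.conf fragment like the one below. This is only a sketch under
assumptions, not the configuration actually used: the MAC address is a
made-up locally administered placeholder, the names em0/em1 are taken from
the dmesg output quoted above (em0 = i217, em1 = 82574), and the DHCP
keyword stands in for starting dhclient by hand.

```shell
# /etc/rc.conf fragment (sketch, values are assumptions)

# i217 on-board NIC; placeholder locally administered MAC (hypothetical value)
ifconfig_em0="ether 02:00:00:00:00:01 up"

# add-on 82574 NIC
ifconfig_em1="up"

# LACP lagg with the i217 as first laggport, lease requested via DHCP
cloned_interfaces="lagg0"
ifconfig_lagg0="laggproto lacp laggport em0 laggport em1 DHCP"
```

Listing em0 first matters here only because the report says the i217 was the
first laggport, which is why its MAC ended up on both NICs.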


Hope this helps,

-harry


