Date:      Mon, 21 Jan 2013 19:30:01 GMT
From:      George Neville-Neil <gnn@FreeBSD.org>
To:        freebsd-net@FreeBSD.org
Subject:   Re: kern/172113: [panic] [e1000] [patch] 9.1-RC1/amd64 panices in igb(4): m_getjcl: invalid cluster type
Message-ID:  <201301211930.r0LJU1jl057381@freefall.freebsd.org>

The following reply was made to PR kern/172113; it has been noted by GNATS.

From: George Neville-Neil <gnn@FreeBSD.org>
To: John Baldwin <jhb@FreeBSD.org>
Cc: bug-followup@FreeBSD.org,
 egrosbein@rdtc.ru,
 jfv@FreeBSD.org
Subject: Re: kern/172113: [panic] [e1000] [patch] 9.1-RC1/amd64 panices in igb(4): m_getjcl: invalid cluster type
Date: Mon, 21 Jan 2013 14:25:00 -0500

 On Jan 19, 2013, at 23:26 , John Baldwin <jhb@FreeBSD.org> wrote:
 
 > I was able to finally reproduce this panic today.  It seems to require
 > a server configured for PXE but that receives no DHCP reply (and
 > possibly with the requisite SuperMicro X8 board).  I was able to
 > prevent the panic with a subset of the referenced patch by only adding
 > the 'if_drv_flags & IFF_DRV_RUNNING' check to the start of
 > igb_msix_que().  The rest of the patch was unnecessary.  I also added
 > some debugging to print out the ICR, EICR, IMS, and EIMS registers in
 > this case.  It does look like the hardware is sending an interrupt that
 > is not enabled in the interrupt mask (specifically LSC).  In fact, the
 > 82576 datasheet specifically mentions masking LSC until initialization
 > is complete to avoid spurious interrupts during boot and AFAICT igb(4)
 > does this since e1000_reset_hw() clears the interrupt mask via writes
 > to IMC and doesn't re-enable interrupts until igb_init_locked() is
 > invoked via 'ifconfig up'.  Here is my debug output:
 >
 > SMP: AP CPU #6 Launched!
 > SMP: AP CPU #4 Launched!
 > stray irq0
 > igb0: interrupt on que 0: icr 0x1000004 eicr 0
 >     ims 0 eims 0x80000000
 >
 > Hmmm.   Nothing clears EIMS.  After some more debugging, I determined
 > that e1000_reset_hw() always turns this bit in EIMS on, even if it is
 > off before e1000_reset_hw() is called(!).  I added explicit calls to
 > igb_disable_intr() to clear EIMS after each call to e1000_reset_hw().
 > This removes the 'stray irq0', but I still get a spurious interrupt
 > during boot (albeit with eims 0).  I can use the IFF_DRV_RUNNING hack
 > for now, but I think the real fix is something else.
 >
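
 A minimal sketch of the two observations quoted above, written as
 userland C rather than driver code.  It shows (a) how the debug output
 "icr 0x1000004 ... ims 0" reads as LSC being asserted while masked, and
 (b) the shape of an early-return guard on the running flag.  The struct,
 the function names, and the counters are hypothetical stand-ins for
 illustration; the bit values are my understanding of the e1000 register
 layout and should be checked against the datasheet.  This is not the
 patch from the PR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit values (verify against the 82576 datasheet / e1000 headers). */
#define ICR_LSC    0x00000004u  /* link status change cause in ICR/IMS */
#define EIMS_OTHER 0x80000000u  /* "other causes" bit in EICR/EIMS */

/* Hypothetical model of the interface state; not the real softc. */
struct fake_if {
    bool     running;  /* models (if_drv_flags & IFF_DRV_RUNNING) != 0 */
    unsigned handled;  /* interrupts that were serviced */
    unsigned dropped;  /* interrupts ignored because the interface is down */
};

/* True if 'cause' is asserted in ICR although IMS does not enable it. */
static bool cause_fired_while_masked(uint32_t icr, uint32_t ims, uint32_t cause)
{
    return (icr & cause) != 0 && (ims & cause) == 0;
}

/* Hypothetical queue handler: bail out before touching any rings
 * unless the interface has been brought up. */
static void fake_msix_que(struct fake_if *ifp)
{
    if (!ifp->running) {
        ifp->dropped++;
        return;
    }
    ifp->handled++;
}
```

 With the values from the debug output, cause_fired_while_masked(0x1000004,
 0, ICR_LSC) is true, which matches the description of an LSC interrupt
 arriving while it is not enabled in the mask.  The guard only hides the
 symptom; it does not explain why the interrupt is raised.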
 
 I think Jack will have to chime in on this one.  Do you think it's all
 SM X8 boards or just the one we happen to have?  I wonder if Jack or
 Jeffrey (the testing guy he works with) have access to the right board.
 
 Best,
 George
 
 


