Date:      Mon, 21 Jan 2013 20:30:03 GMT
From:      Jack Vogel <jfvogel@gmail.com>
To:        freebsd-net@FreeBSD.org
Subject:   Re: kern/172113: [panic] [e1000] [patch] 9.1-RC1/amd64 panices in igb(4): m_getjcl: invalid cluster type
Message-ID:  <201301212030.r0LKU3wi068337@freefall.freebsd.org>

The following reply was made to PR kern/172113; it has been noted by GNATS.

From: Jack Vogel <jfvogel@gmail.com>
To: George Neville-Neil <gnn@freebsd.org>
Cc: John Baldwin <jhb@freebsd.org>, bug-followup@freebsd.org, egrosbein@rdtc.ru, 
	jfv@freebsd.org
Subject: Re: kern/172113: [panic] [e1000] [patch] 9.1-RC1/amd64 panices in
 igb(4): m_getjcl: invalid cluster type
Date: Mon, 21 Jan 2013 12:28:40 -0800

 --f46d04339ce484676004d3d24e43
 Content-Type: text/plain; charset=ISO-8859-1
 
 Well, do you have a more complete designation of the motherboard? We can
 look into it, although if the one check stops the problem it may be a low
 priority.
 
 Jack
 
 
 On Mon, Jan 21, 2013 at 11:25 AM, George Neville-Neil <gnn@freebsd.org> wrote:
 
 >
 > On Jan 19, 2013, at 23:26 , John Baldwin <jhb@FreeBSD.org> wrote:
 >
 > > I was able to finally reproduce this panic today.  It seems to require
 > > a server configured for PXE but that receives no DHCP reply (and
 > > possibly with the requisite SuperMicro X8 board).  I was able to
 > > prevent the panic with a subset of the referenced patch by only adding
 > > the 'if_drv_flags & IFF_DRV_RUNNING' check to the start of
 > > igb_msix_que().  The rest of the patch was unnecessary.  I also added
 > > some debugging to print out the ICR, EICR, IMS, and EIMS registers in
 > > this case.  It does look like the hardware is sending an interrupt that
 > > is not enabled in the interrupt mask (specifically LSC).  In fact, the
 > > 82576 datasheet specifically mentions masking LSC until initialization
 > > is complete to avoid spurious interrupts during boot and AFAICT igb(4)
 > > does this since e1000_reset_hw() clears the interrupt mask via writes
 > > to IMC and doesn't re-enable interrupts until igb_init_locked() is
 > > invoked via 'ifconfig up'.  Here is my debug output:
 > >
 > > SMP: AP CPU #6 Launched!
 > > SMP: AP CPU #4 Launched!
 > > stray irq0
 > > igb0: interrupt on que 0: icr 0x1000004 eicr 0
 > >     ims 0 eims 0x80000000
 > >
 > > Hmmm.   Nothing clears EIMS.  After some more debugging, I determined
 > > that e1000_reset_hw() always turns this bit in EIMS on, even if it is
 > > off before e1000_reset_hw() is called(!).  I added explicit calls to
 > > igb_disable_intr() to clear EIMS after each call to e1000_reset_hw().
 > > This removes the 'stray irq0', but I still get a spurious interrupt
 > > during boot (albeit with eims 0).  I can use the IFF_DRV_RUNNING hack
 > > for now, but I think the real fix is something else.
 > >
 >
 > I think Jack will have to chime in on this one.  Do you think it's all SM
 > X8 boards or just the one we happen to have?  I wonder if Jack or Jeffrey
 > (the testing guy he works with) have access to the right board.
 >
 > Best,
 > George
 >
 >
 >
 
 --f46d04339ce484676004d3d24e43--
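
Illustration of the workaround discussed in the quoted message: a minimal,
self-contained C sketch of an interrupt handler that returns early unless the
interface is marked running, plus a helper that shows why the logged values
(icr 0x1000004, ims 0) point at an LSC cause that is not enabled in the mask.
This is not the igb(4) source. Every name prefixed sketch_ or SKETCH_ is
hypothetical, and the constant values are assumptions recalled from FreeBSD's
net/if.h and the e1000 shared code; check them against the tree before
relying on them.

```c
#include <stdint.h>

/* Assumed values, see the note above; verify against the real headers. */
#define SKETCH_IFF_DRV_RUNNING 0x40         /* interface resources allocated */
#define SKETCH_ICR_LSC         0x00000004u  /* link status change cause bit */

/* Hypothetical stand-ins for the ifnet and per-queue driver structures. */
struct sketch_ifnet {
    int if_drv_flags;
};

struct sketch_queue {
    struct sketch_ifnet *ifp;
    unsigned handled;   /* number of interrupts actually processed */
};

/*
 * Model of a per-queue MSI-X handler with the guard at the very start.
 * Returns 1 if the interrupt was processed, 0 if it was ignored because
 * the interface has not been brought up yet.
 */
int
sketch_msix_que(struct sketch_queue *que)
{
    if ((que->ifp->if_drv_flags & SKETCH_IFF_DRV_RUNNING) == 0)
        return (0);     /* spurious interrupt before 'ifconfig up' */

    que->handled++;     /* real RX/TX processing would happen here */
    return (1);
}

/* Cause bits that are raised in ICR but not enabled in IMS. */
uint32_t
sketch_unmasked_causes(uint32_t icr, uint32_t ims)
{
    return (icr & ~ims);
}
```

With the values from the debug output, sketch_unmasked_causes(0x1000004, 0)
has SKETCH_ICR_LSC set, which matches the observation that LSC fired while
the interrupt mask was clear. The guard only hides the symptom; as the quoted
message says, the real fix is probably something else.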



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201301212030.r0LKU3wi068337>