Date:      Mon, 4 Jun 2007 10:01:02 +0900
From:      Pyun YongHyeon <pyunyh@gmail.com>
To:        Arne H Juul <arnej@yahoo-inc.com>
Cc:        current@freebsd.org
Subject:   Re: panic in tulip_rx_intr after recent changes
Message-ID:  <20070604010102.GA6456@cdnetworks.co.kr>
In-Reply-To: <Pine.BSO.4.64.0706031537040.19155@murphy.trondheim.corp.yahoo.com>
References:  <Pine.BSO.4.64.0706031537040.19155@murphy.trondheim.corp.yahoo.com>


--Qxx1br4bt0+wmkIi
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Sun, Jun 03, 2007 at 03:39:56PM +0200, Arne H Juul wrote:
 > (this mail didn't make it to the list from my private
 > address, so I'm resending it from work instead; my
 > apologies if it suddenly appears multiple times)
 > 
 > 
 > I'm getting a kernel panic during network startup with the
 > "de" driver.  Here's the messages from the crash dump:
 > 
 > <118>Mounting local file systems:
 > <118>.
 > <118>Setting hostname: bluebox.trondheim.corp.yahoo.com.
 > <118>net.inet6.ip6.auto_linklocal:
 > <118>1
 > <118> ->
 > <118>0
 > <118>
 > de0: unable to load rx map, error = 27
 > panic: tulip_rx_intr
 > cpuid = 0
 > KDB: enter: panic
 > Uptime: 13s
 > 
 > I think this must have been introduced during the last week
 > or so on -CURRENT; my old kernel works OK:
 > 
 > arnej@bluebox:~ $ uname -a
 > FreeBSD bluebox 7.0-CURRENT FreeBSD 7.0-CURRENT #13: Tue May 29 08:02:41 
 > CEST 2007 root@bluebox:/usr/obj/home/src.cur/sys/GENERIC amd64
 > 
 > as you can see this is on amd64 platform.
 > 
 > it crashes here (in if_de.c):
 > 
 > 3557                error = bus_dmamap_load_mbuf(ri->ri_data_tag, *nextout->di_map, ms,
 > 3558                    tulip_dma_map_rxbuf, nextout->di_desc, BUS_DMA_NOWAIT);
 > 3559                if (error) {
 > 3560                    device_printf(sc->tulip_dev,
 > 3561                        "unable to load rx map, error = %d\n", error);
 > 3562                    panic("tulip_rx_intr");         /* XXX */
 > 3563                }
 > 
 > errno 27 is EFBIG, and indeed the mbuf is MCLBYTES:
 > 
 > (kgdb) print ms[0].M_dat.MH.MH_pkthdr.len
 > $22 = 2048
 > 
 > while the tag has a lower limit:
 > 
 > (kgdb) print ri->ri_data_tag[0].maxsegsz
 > $21 = 2032
 > 
 > it looks like this is the triggering change:
 > 
 > RCS file: /usr/cvs/src/sys/amd64/amd64/busdma_machdep.c,v
 > ----------------------------
 > revision 1.81
 > date: 2007/05/29 06:30:25;  author: yongari;  state: Exp;  lines: +2 -0
 > Honor maxsegsz of less than a page size in a DMA tag. Previously it
 > used to return PAGE_SIZE without respect to restrictions of a DMA tag.
 > This affected all of the busdma load functions that use
 > _bus_dmamap_loader_buffer() as their back-end.
 > 
 > so the questions are...
 > 
 > Is the above change wrong?
 > or is the "de" driver buggy?
 > or should bus_dmamap_load_mbuf handle this somehow?
 > and does it cause problems other places too?
 > 

I'm not familiar with de(4), but it seems to need a big cleanup.
All busdma load functions can fail, so it is the driver's job to
recover from a busdma load failure. I think explicitly invoking panic(9)
is a really bad idea.

de(4) sets the maximum size of a DMA segment to
TULIP_DATA_PER_DESC in tulip_busdma_allocring(). I don't know why
the author limited the segment size to TULIP_DATA_PER_DESC, but I guess
it comes from a limit of the hardware's DMA engine (e.g. the
hardware can DMA at most TULIP_DATA_PER_DESC bytes per segment in SG
operations).
In the Rx path it allocates an mbuf with m_getcl(9), so the length of
the mbuf is MCLBYTES, which is greater than the segment size supported by
the hardware.
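
To make the mismatch concrete, here is a small userland model of the
check. It is only a sketch: struct mock_tag and mock_load() are made-up
names, not busdma code, and the single-segment limit (nsegments = 1) is
my assumption about the rx data tag. The 2032 and 2048 values are the
ones from your kgdb output.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a DMA tag; only the limits that matter here. */
struct mock_tag {
	size_t	maxsegsz;	/* largest single DMA segment (2032 in kgdb) */
	size_t	nsegments;	/* segments allowed per load; assumed to be 1 */
};

/*
 * Model of a load that honors maxsegsz: the buffer is split into chunks
 * of at most maxsegsz bytes, and the load fails with EFBIG when that
 * takes more segments than the tag allows.
 */
static int
mock_load(const struct mock_tag *tag, size_t len)
{
	size_t needed;

	assert(tag->maxsegsz > 0);
	needed = (len + tag->maxsegsz - 1) / tag->maxsegsz;
	return (needed > tag->nsegments ? EFBIG : 0);
}
```

With a tag of { 2032, 1 }, mock_load(&tag, 2048) returns EFBIG because
the buffer needs two segments, while mock_load(&tag, 2032) returns 0.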

I guess we have two possible ways to fix de(4).

1. Nuke TULIP_DATA_PER_DESC and use MCLBYTES instead. Of course, this
   assumes the hardware can support segments of that size in DMA
   operations.
2. Set the mbuf length to TULIP_DATA_PER_DESC in the Rx path after
   allocating an mbuf with m_getcl(9). See the attached patch (I don't have
   de(4) hardware so it's just guesswork, but you should get the idea).

However, it still lacks code to recover from a busdma load
failure. :-(
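
For what it's worth, the kind of recovery I have in mind looks roughly
like the userland sketch below. Every name in it (rx_slot, rx_ring,
mock_dmamap_load, rx_slot_refill) is hypothetical and nothing here is
taken from if_de.c; malloc(3) stands in for mbuf allocation and the
load is modelled as a simple size check against a single-segment tag.
The point is only the shape: on a load failure, free the buffer, count
the error, leave the slot empty and let the caller retry later.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define RX_SLOTS 4

/* Hypothetical rx ring model; not the real de(4) data structures. */
struct rx_slot {
	void	*buf;		/* buffer loaded for DMA, or NULL if empty */
	size_t	 len;
};

struct rx_ring {
	struct rx_slot	slots[RX_SLOTS];
	size_t		maxsegsz;	/* per-segment limit of the DMA tag */
	unsigned long	load_errors;	/* counted instead of panicking */
};

/* Stand-in for a busdma load on a tag assumed to allow one segment. */
static int
mock_dmamap_load(const struct rx_ring *ring, size_t len)
{
	return (len > ring->maxsegsz ? EFBIG : 0);
}

/*
 * Try to fill one slot. On failure nothing is leaked and the slot keeps
 * whatever it had before; the error is returned so the caller can retry.
 */
static int
rx_slot_refill(struct rx_ring *ring, size_t idx, size_t len)
{
	void *buf;
	int error;

	assert(idx < RX_SLOTS);
	buf = malloc(len);
	if (buf == NULL)
		return (ENOMEM);
	error = mock_dmamap_load(ring, len);
	if (error != 0) {
		free(buf);		/* release the buffer, do not panic */
		ring->load_errors++;
		return (error);
	}
	free(ring->slots[idx].buf);	/* free(NULL) is a no-op */
	ring->slots[idx].buf = buf;
	ring->slots[idx].len = len;
	return (0);
}
```

In a real driver the retry would presumably happen on the next rx
interrupt or from a callout, but that part is beyond this sketch.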

-- 
Regards,
Pyun YongHyeon

--Qxx1br4bt0+wmkIi
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="if_de.patch"

Index: if_de.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/de/if_de.c,v
retrieving revision 1.182
diff -u -r1.182 if_de.c
--- if_de.c	23 Feb 2007 12:18:37 -0000	1.182
+++ if_de.c	4 Jun 2007 00:47:16 -0000
@@ -3553,7 +3553,7 @@
 	    M_ASSERTPKTHDR(ms);
 	    KASSERT(ms->m_data == ms->m_ext.ext_buf,
 		("rx mbuf data doesn't point to cluster"));	    
-	    ms->m_len = ms->m_pkthdr.len = MCLBYTES;
+	    ms->m_len = ms->m_pkthdr.len = TULIP_RX_BUFLEN;
 	    error = bus_dmamap_load_mbuf(ri->ri_data_tag, *nextout->di_map, ms,
 		tulip_dma_map_rxbuf, nextout->di_desc, BUS_DMA_NOWAIT);
 	    if (error) {

--Qxx1br4bt0+wmkIi--


