Date:      Thu, 18 Feb 2010 16:19:13 -0800
From:      Pyun YongHyeon <pyunyh@gmail.com>
To:        Slawa Olhovchenkov <slw@zxy.spb.ru>
Cc:        Nick Rogers <ncrogers@gmail.com>, stable@freebsd.org
Subject:   Re: trap 12: page fault while in kernel mode on 8.0-RELEASE (possibly bge(4) related)
Message-ID:  <20100219001913.GE11675@michelle.cdnetworks.com>
In-Reply-To: <20100218215039.GK55307@zxy.spb.ru>
References:  <147432021002141004o6c1412b7gd548b87709532ef9@mail.gmail.com> <20100216175719.GB1394@michelle.cdnetworks.com> <20100218143822.GA8380@zxy.spb.ru> <20100218193612.GB11675@michelle.cdnetworks.com> <20100218212428.GJ55307@zxy.spb.ru> <20100218213213.GD11675@michelle.cdnetworks.com> <20100218215039.GK55307@zxy.spb.ru>


--U+BazGySraz5kW0T
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Fri, Feb 19, 2010 at 12:50:39AM +0300, Slawa Olhovchenkov wrote:
> On Thu, Feb 18, 2010 at 01:32:13PM -0800, Pyun YongHyeon wrote:
> 
> > > > dmesg output(only bge(4) related one).
> > > 
> > > dmesg from boot:
> > > 
> > > bge0: <HP NC7782 Gigabit Server Adapter, ASIC rev. 0x002100> mem 0xfdf70000-0xfdf7ffff irq 25 at device 2.0 on pci2
> > > miibus0: <MII bus> on bge0
> > > brgphy0: <BCM5704 10/100/1000baseTX PHY> PHY 1 on miibus0
> > > brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
> > > bge0: Ethernet address: 00:14:c2:3d:e5:52
> > > bge0: [ITHREAD]
> > > bge1: <HP NC7782 Gigabit Server Adapter, ASIC rev. 0x002100> mem 0xfdf60000-0xfdf6ffff irq 26 at device 2.1 on pci2
> > > miibus1: <MII bus> on bge1
> > > brgphy1: <BCM5704 10/100/1000baseTX PHY> PHY 1 on miibus1
> > > brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
> > > bge1: Ethernet address: 00:14:c2:3d:e5:51
> > > bge1: [ITHREAD]
> > > bge1: link state changed to UP
> > > bge0: link state changed to UP
> > > 
> > > Nothing in dmesg before trap.
> > > 
> > 
> > Is this PCI-X controller? It would be even better if you can post
> 
> This integrated controller (HP DL360-G4)
> 
> > bge(4) related dmesg output of verbosed boot and the output of
> 
> Preloaded elf kernel "/boot/kernel/kernel" at 0xffffffff8088e000.
> Preloaded elf obj module "/boot/kernel/if_bge.ko" at 0xffffffff8088e1d0.
> Preloaded elf obj module "/boot/kernel/miibus.ko" at 0xffffffff8088e7f8.
> pci0:2:2:0: bad VPD cksum, remain 19
> bge0: <HP NC7782 Gigabit Server Adapter, ASIC rev. 0x002100> mem 0xfdf70000-0xfdf7ffff irq 25 at device 2.0 on pci2
> bge0: Reserved 0x10000 bytes for rid 0x10 type 3 at 0xfdf70000
> bge0: CHIP ID 0x00002100; ASIC REV 0x02; CHIP REV 0x21; PCI-X
> miibus0: <MII bus> on bge0
> brgphy0: <BCM5704 10/100/1000baseTX PHY> PHY 1 on miibus0
> brgphy0: OUI 0x000818, model 0x0019, rev. 0
> brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
> bge0: bpf attached
> bge0: Ethernet address: 00:14:c2:3d:e5:52
> ioapic1: routing intpin 1 (PCI IRQ 25) to lapic 0 vector 50
> bge0: [MPSAFE]
> bge0: [ITHREAD]
> pci0:2:2:1: bad VPD cksum, remain 19
> bge1: <HP NC7782 Gigabit Server Adapter, ASIC rev. 0x002100> mem 0xfdf60000-0xfdf6ffff irq 26 at device 2.1 on pci2
> bge1: Reserved 0x10000 bytes for rid 0x10 type 3 at 0xfdf60000
> bge1: CHIP ID 0x00002100; ASIC REV 0x02; CHIP REV 0x21; PCI-X
> miibus1: <MII bus> on bge1
> brgphy1: <BCM5704 10/100/1000baseTX PHY> PHY 1 on miibus1
> brgphy1: OUI 0x000818, model 0x0019, rev. 0
> brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
> bge1: bpf attached
> bge1: Ethernet address: 00:14:c2:3d:e5:51
> ioapic1: routing intpin 2 (PCI IRQ 26) to lapic 0 vector 51
> bge1: [MPSAFE]
> bge1: [ITHREAD]
> bge1: link UP
> bge1: link state changed to UP
> 
> 
> > "pciconf -lcv".
> 

[...]

> bge0@pci0:2:2:0:        class=0x020000 card=0x00d00e11 chip=0x164814e4 rev=0x10 hdr=0x00
>     vendor     = 'Broadcom Corporation'
>     device     = 'NetXtreme Dual Gigabit Adapter (BCM5704)'
>     class      = network
>     subclass   = ethernet
>     cap 07[40] = PCI-X 64-bit supports 133MHz, 2048 burst read, 1 split transaction
>     cap 01[48] = powerspec 2  supports D0 D3  current D0
>     cap 03[50] = VPD
>     cap 05[58] = MSI supports 8 messages, 64 bit 
> bge1@pci0:2:2:1:        class=0x020000 card=0x00d00e11 chip=0x164814e4 rev=0x10 hdr=0x00
>     vendor     = 'Broadcom Corporation'
>     device     = 'NetXtreme Dual Gigabit Adapter (BCM5704)'
>     class      = network
>     subclass   = ethernet
>     cap 07[40] = PCI-X 64-bit supports 133MHz, 2048 burst read, 1 split transaction
>     cap 01[48] = powerspec 2  supports D0 D3  current D0
>     cap 03[50] = VPD
>     cap 05[58] = MSI supports 8 messages, 64 bit 

I'm still not sure whether the panic is related to bge(4), but
bge(4) is missing a couple of workarounds for silicon bugs in the
PCI-X BCM5704. Did you also see the panic before updating to
stable/8?
Anyway, try the attached patch and let me know how it works.
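
For anyone following along, the register arithmetic in the attached
patch can be tried in isolation. What follows is only a rough sketch,
not driver code: the constant values are assumed to mirror
sys/dev/pci/pcireg.h and sys/dev/bge/if_bgereg.h and should be checked
against your tree, and pcix_command_fixup() is a hypothetical helper
name that does not exist in bge(4). It decodes the ASIC and chip
revision from the CHIP ID printed in the verbose boot and applies the
same bit operations to a PCI-X command register value.

```c
#include <stdint.h>

/*
 * Assumed values, meant to mirror FreeBSD's pcireg.h and if_bgereg.h.
 * Verify them against your own source tree before relying on them.
 */
#define PCIXM_COMMAND_ERO		0x0002	/* enable relaxed ordering */
#define PCIXM_COMMAND_MAX_READ		0x000c	/* max read byte count field */
#define PCIXM_COMMAND_MAX_READ_2048	0x0008
#define PCIXM_COMMAND_MAX_SPLITS	0x0070	/* max split transactions field */

#define BGE_ASICREV(x)			((x) >> 12)
#define BGE_CHIPREV(x)			((x) >> 8)
#define BGE_ASICREV_BCM5703		0x01
#define BGE_ASICREV_BCM5704		0x02
#define BGE_CHIPREV_5704_BX		0x21

/*
 * Hypothetical helper: the same bit operations the patch performs on
 * the 16-bit PCI-X command register, without any config space access.
 */
static uint16_t
pcix_command_fixup(uint16_t devctl, uint32_t asicrev)
{
	/* Always turn off relaxed ordering. */
	devctl &= (uint16_t)~PCIXM_COMMAND_ERO;
	if (asicrev == BGE_ASICREV_BCM5703) {
		/* Replace the max read field with the 2048 byte setting. */
		devctl &= (uint16_t)~PCIXM_COMMAND_MAX_READ;
		devctl |= PCIXM_COMMAND_MAX_READ_2048;
	} else if (asicrev == BGE_ASICREV_BCM5704) {
		/* Also zero the max outstanding split transactions field. */
		devctl &= (uint16_t)~(PCIXM_COMMAND_MAX_SPLITS |
		    PCIXM_COMMAND_MAX_READ);
		devctl |= PCIXM_COMMAND_MAX_READ_2048;
	}
	return (devctl);
}
```

With the CHIP ID 0x00002100 from the verbose boot, BGE_ASICREV() gives
0x02 and BGE_CHIPREV() gives 0x21, which agrees with the "ASIC REV 0x02;
CHIP REV 0x21" in the same dmesg line, so under these assumed values
both the BGE_CHIPREV_5704_BX check and the BCM5704 branch would apply
to this controller.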

--U+BazGySraz5kW0T
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="bge.5704.diff"

Index: sys/dev/bge/if_bge.c
===================================================================
--- sys/dev/bge/if_bge.c	(revision 204011)
+++ sys/dev/bge/if_bge.c	(working copy)
@@ -1342,6 +1342,7 @@
 bge_chipinit(struct bge_softc *sc)
 {
 	uint32_t dma_rw_ctl;
+	uint16_t val;
 	int i;
 
 	/* Set endianness before we access any non-PCI registers. */
@@ -1362,6 +1363,17 @@
 	    i < BGE_STATUS_BLOCK_END + 1; i += sizeof(uint32_t))
 		BGE_MEMWIN_WRITE(sc, i, 0);
 
+	if (sc->bge_chiprev == BGE_CHIPREV_5704_BX) {
+		/*
+		 *  Fix data corruption caused by non-qword write with WB.
+		 *  Fix master abort in PCI mode.
+		 *  Fix PCI latency timer.
+		 */
+		val = pci_read_config(sc->bge_dev, BGE_PCI_MSI_DATA + 2, 2);
+		val |= (1 << 10) | (1 << 12) | (1 << 13);
+		pci_write_config(sc->bge_dev, BGE_PCI_MSI_DATA + 2, val, 2);
+	}
+
 	/*
 	 * Set up the PCI DMA control register.
 	 */
@@ -3157,6 +3169,26 @@
 	pci_write_config(dev, BGE_PCI_CMD, command, 4);
 	write_op(sc, BGE_MISC_CFG, BGE_32BITTIME_66MHZ);
 
+	/*
+	 * Disable PCI-X relaxed ordering to ensure the status block update
+	 * comes before packet buffer DMA. Otherwise the driver may
+	 * read a stale status block.
+	 */
+	if (sc->bge_flags & BGE_FLAG_PCIX) {
+		devctl = pci_read_config(dev,
+		    sc->bge_pcixcap + PCIXR_COMMAND, 2);
+		devctl &= ~PCIXM_COMMAND_ERO;
+		if (sc->bge_asicrev == BGE_ASICREV_BCM5703) {
+			devctl &= ~PCIXM_COMMAND_MAX_READ;
+			devctl |= PCIXM_COMMAND_MAX_READ_2048;
+		} else if (sc->bge_asicrev == BGE_ASICREV_BCM5704) {
+			devctl &= ~(PCIXM_COMMAND_MAX_SPLITS |
+			    PCIXM_COMMAND_MAX_READ);
+			devctl |= PCIXM_COMMAND_MAX_READ_2048;
+		}
+		pci_write_config(dev, sc->bge_pcixcap + PCIXR_COMMAND,
+		    devctl, 2);
+	}
 	/* Re-enable MSI, if neccesary, and enable the memory arbiter. */
 	if (BGE_IS_5714_FAMILY(sc)) {
 		/* This chip disables MSI on reset. */

--U+BazGySraz5kW0T--


