Date:      Fri, 25 Jun 2004 23:32:33 GMT
From:      Vadim Mikhailov <freebsd-bugs@REMOVEMEmikhailov.org>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   i386/68347: bge0 watchdog timeout on 5.2.1 and later, 5.1 is ok
Message-ID:  <200406252332.i5PNWXt2062765@www.freebsd.org>
Resent-Message-ID: <200406252340.i5PNeOOQ038322@freefall.freebsd.org>

>Number:         68347
>Category:       i386
>Synopsis:       bge0 watchdog timeout on 5.2.1 and later, 5.1 is ok
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-i386
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Jun 25 23:40:24 GMT 2004
>Closed-Date:
>Last-Modified:
>Originator:     Vadim Mikhailov
>Release:        FreeBSD 5.2.1-RELEASE-p8 i386
>Organization:
>Environment:
FreeBSD xxx 5.2.1-RELEASE-p8 FreeBSD 5.2.1-RELEASE-p8 #0: Thu Jun 25 11:57:42 PST 2004     xxx  i386
>Description:
      I have a Dell PowerEdge 1750 server with two 3 GHz Xeon CPUs and two onboard gigabit Ethernet ports:
bge0: <Broadcom BCM5704C Dual Gigabit Ethernet, ASIC rev. 0x2002> mem 0xfcd20000-0xfcd2ffff,0xfcd30000-0xfcd3ffff irq 17 at device 0.0 on pci2
bge1: <Broadcom BCM5704C Dual Gigabit Ethernet, ASIC rev. 0x2002> mem 0xfcd00000-0xfcd0ffff,0xfcd10000-0xfcd1ffff irq 18 at device 0.1 on pci2

I also use jumbo frames (my gigabit switch supports them):
bge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 9000
        options=1b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING>
        inet 172.xx.xx.xx netmask 0xfffff800 broadcast 172.xx.xx.255
        ether 00:06:5b:ef:63:e6
        media: Ethernet autoselect (1000baseTX <full-duplex>)
        status: active

This box also has these SCSI adapters:

mpt0: <LSILogic 1030 Ultra4 Adapter> port 0xbc00-0xbcff mem 0xfcb20000-0xfcb2ffff,0xfcb30000-0xfcb3ffff irq 18 at device 5.0 on pci4

ahc0: <Adaptec 3960D Ultra160 SCSI adapter> port 0xdc00-0xdcff mem 0xfcf01000-0xfcf01fff irq 19 at device 4.0 on pci1

Each adapter has disks attached to it. The firmware on the motherboard and all peripheral devices has been upgraded to the latest versions from Dell.

  This setup works more or less fine under FreeBSD 5.1-RELEASE-p8 (GENERIC kernel with SMP enabled), but once every month or two the machine reboots under load for unknown reasons, so I decided to upgrade it to 5.2.1-RELEASE.
   But when I boot a 5.2.1-RELEASE or later (-current) kernel on this box, the network adapter locks up and I see these messages on the console and in the logs:

Jun 25 15:25:22 vortex kernel: bge0: watchdog timeout -- resetting

The only way to get the network connection back is to bring the interface down and up:
ifconfig bge0 down up

After this the network is available for a few seconds, and then the box is not pingable again :(. I ran systat -v at that time and noticed that ping stops working exactly when an interrupt arrives for mpt or ahc (i.e. on any disk activity).
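
To get a rough idea of how often the interface resets, the log can be scanned for the watchdog message quoted earlier. The following is only a sketch in Python: it assumes the syslog line format shown in that message, and the names WATCHDOG_RE and find_watchdog_resets are made up for illustration.

```python
import re

# Matches lines like:
#   Jun 25 15:25:22 vortex kernel: bge0: watchdog timeout -- resetting
# The format is assumed from the single log line quoted in this report.
WATCHDOG_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+kernel:\s+"
    r"(?P<ifname>[a-z]+\d+): watchdog timeout -- resetting\s*$"
)


def find_watchdog_resets(lines):
    """Return a list of (timestamp, interface) for each watchdog reset line."""
    hits = []
    for line in lines:
        m = WATCHDOG_RE.match(line)
        if m:
            hits.append((m.group("stamp"), m.group("ifname")))
    return hits


if __name__ == "__main__":
    sample = [
        "Jun 25 15:25:22 vortex kernel: bge0: watchdog timeout -- resetting\n",
        "Jun 25 15:25:30 vortex kernel: some unrelated message\n",
    ]
    for stamp, ifname in find_watchdog_resets(sample):
        print(stamp, ifname)
```

Passing the lines of the system log file (for example an open file object) instead of the sample list would list every reset along with its timestamp.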

One visible difference between 5.1 (where it works) and 5.2.1/-current (where it doesn't) is that interrupts are assigned to the PCI devices differently. This is the IRQ map for 5.1:
mpt0 13, mpt1 16, bge0 17, bge1 18, ahc0 19, ahc1 20
and this is for 5.2.1:
mpt0 18, mpt1 19, bge0 16, bge1 17, ahc0 20, ahc1 21.
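
The two maps can also be compared mechanically. This is a minimal sketch in Python using the numbers listed above; the names IRQ_51, IRQ_521, irq_changes and shared_irqs are illustrative, and the second bge entry on the 5.1 line is assumed to be bge1.

```python
# IRQ assignments copied from the two maps in this report.
# Assumption: the second "bge" entry for 5.1 refers to bge1.
IRQ_51 = {"mpt0": 13, "mpt1": 16, "bge0": 17, "bge1": 18, "ahc0": 19, "ahc1": 20}
IRQ_521 = {"mpt0": 18, "mpt1": 19, "bge0": 16, "bge1": 17, "ahc0": 20, "ahc1": 21}


def irq_changes(old, new):
    """Return {device: (old_irq, new_irq)} for devices whose IRQ differs."""
    return {
        dev: (old[dev], new[dev])
        for dev in old
        if dev in new and old[dev] != new[dev]
    }


def shared_irqs(irq_map):
    """Return {irq: [devices]} for IRQs assigned to more than one device."""
    by_irq = {}
    for dev, irq in irq_map.items():
        by_irq.setdefault(irq, []).append(dev)
    return {irq: devs for irq, devs in by_irq.items() if len(devs) > 1}


if __name__ == "__main__":
    for dev, (old, new) in sorted(irq_changes(IRQ_51, IRQ_521).items()):
        print(f"{dev}: irq {old} -> irq {new}")
    print("shared under 5.1:", shared_irqs(IRQ_51))
    print("shared under 5.2.1:", shared_irqs(IRQ_521))
```

With these numbers, all six devices get a different IRQ under 5.2.1, and neither map as listed shows two devices sharing an IRQ.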

I have tried changing the IRQ order in the BIOS, but it didn't change anything from FreeBSD's point of view.
I have also tried booting 5.2.1 with ACPI disabled; the result is the same.

>How-To-Repeat:
      Install FreeBSD 5.2.1-RELEASE (or -current) on a Dell PowerEdge 1750, connect bge0 to a gigabit switch, and you will see watchdog timeouts.
5.1-RELEASE works fine on the same hardware.

>Fix:
      Not known.
>Release-Note:
>Audit-Trail:
>Unformatted:


