Date:      Wed, 30 Jun 2004 18:57:39 -0700
From:      Vadim Mikhailov <freebsd-bugs@mikhailov.org>
To:        John Baldwin <jhb@FreeBSD.org>
Cc:        freebsd-bugs@mikhailov.org
Subject:   Re: kern/68351: bge0 watchdog timeout on 5.2.1 and -current, 5.1 is ok
Message-ID:  <40E36F93.3020108@mikhailov.org>
In-Reply-To: <200406291158.19613.jhb@FreeBSD.org>
References:  <678213ABF77E5D4F9E6CF1DA61A4E2D518413E@usmilm005.palm1.palmone.com> <200406291158.19613.jhb@FreeBSD.org>

On Tuesday, 6/29/2004 8:58 AM, John Baldwin wrote:
> On Monday 28 June 2004 01:32 pm, Vadim Mikhailov wrote:
>> I have a Dell PowerEdge 1750 server with 2 Xeon 3.0 GHZ CPUs,
>> 4 GB RAM and 2 onboard gigabit ethernet ports:
>>
>> bge0: <Broadcom BCM5704C Dual Gigabit Ethernet, ASIC rev.
>> This setup works more or less ok under FreeBSD 5.1-RELEASE-p8
>> (GENERIC kernel with SMP enabled), but once a month or two
>> machine reboots under load, so I want to upgrade it to
>> 5.2.1-RELEASE.
>> But when I boot 5.2.1-RELEASE or later kernel (-current) on
>> this box, network adapter locks up.
>> I see these messages on console and in the logs:
>>
>> Jun 25 15:25:22 vortex kernel: bge0: watchdog timeout -- resetting
>>
>> If I do "ifconfig bge0 down up", network becomes available
>> for few seconds and then machine is not pingable again.
>> I ran "systat -v" and have noticed that ping stops working
>> exactly when I see any interrupt coming to mpt or ahc
>> (i.e. on any disk activity).
>>
>> One visible difference between 5.1 (where it works) and
>> 5.2.1/current (where it doesn't) is that interrupts to PCI
>> devices are getting assigned differently:
>>
>> IRQ map under 5.1: mpt0 13, mpt1 16, bge0 17, bge0 18, ahc0 19, ahc1 20,
>>   and under 5.2.1: mpt0 18, mpt1 19, bge0 16, bge1 17, ahc0 20, ahc1 21.
> 
> The numbers mean different things under 5.1 and 5.2.1.  Can you try booting a 
> kernel from a recent snapshot of current to see if current works better?  
> There have been various APIC and ACPI fixes since 5.2.1.

   I tried this with a 10-day-old -current kernel at that time, and it
didn't help. In any case, running -current on this production box is out
of the question, but I will try to boot a fresh -current on it as soon as
I have a chance to bring it down painlessly, just to confirm whether it
works.
   Today I picked up an Intel Pro/1000 MT Desktop 32-bit card at a local
store ($45) and installed it in an available PCI slot. While FreeBSD 5.1
cannot recognize it, FreeBSD 5.2.1 sees it as em0 and everything works
just fine, jumbo frames ok! So using this cheap Intel card is a workaround
for this problem, but I would very much prefer to have a real fix for the
onboard bge0, because when the Broadcom works (under 5.1), sustained
network transfer speed is 104 MB/sec - jumbo frames rock!
With the Intel card it is only 76 MB/sec because it is only 32-bit.
So I have 2 options to solve this problem in full: either have the bge0
driver fixed, or purchase an Intel PRO/1000 XT Server adapter (64-bit) -
that's $130 extra and the only free PCI slot occupied for nothing :-(
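
   As a rough sanity check on the 32-bit explanation, here is a
back-of-the-envelope sketch of theoretical peak PCI bus bandwidth
compared with the measured figures above. The slot clocks (33 MHz for
the 32-bit slot, 66 MHz for a 64-bit slot) are assumptions, since they
are not stated in this message, and the function name is made up for
illustration only.

```python
def pci_peak_mb_per_s(bus_width_bits, clock_mhz):
    """Theoretical peak of a parallel PCI bus in MB/s (one transfer per clock)."""
    return bus_width_bits / 8 * clock_mhz


# Assumed slot parameters (not stated in the message).
peak_32 = pci_peak_mb_per_s(32, 33)   # 32-bit slot at an assumed 33 MHz
peak_64 = pci_peak_mb_per_s(64, 66)   # 64-bit slot at an assumed 66 MHz

# Sustained transfer speeds reported in the message, in MB/s.
measured_em0 = 76     # Intel 32-bit card under 5.2.1
measured_bge0 = 104   # onboard Broadcom under 5.1

print(f"32-bit peak: {peak_32:.0f} MB/s, 64-bit peak: {peak_64:.0f} MB/s")
print(f"em0 uses {measured_em0 / peak_32:.0%} of the assumed 32-bit peak")
print(f"bge0 figure is {measured_bge0 / peak_32:.0%} of that same peak")
```

   Under these assumptions the 32-bit bus tops out only slightly above
gigabit line rate (125 MB/s), so protocol and bus overhead could
plausibly account for the lower em0 number, while a 64-bit slot would
leave ample headroom.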
   BTW, when it is booted into 5.2.1 and em0 works fine, any attempt to
ifconfig bge0 or bge1 gives a watchdog timeout on it (no surprise here).
Bring bge0/bge1 down, ifconfig em0 - and everything is good again
without rebooting...

Thanks,

Vadim Mikhailov


