Date:      Tue, 14 Sep 2010 01:08:08 +0300
From:      Vlad Galu <dudu@dudu.ro>
To:        pyunyh@gmail.com
Cc:        freebsd-net@freebsd.org, Igor Sysoev <is@rambler-co.ru>
Subject:   Re: bge hangs on recent 7.3-STABLE
Message-ID:  <AANLkTikiUm4UQGcFYoYCuybddVNVzhcJQxOscAgJ0=3Q@mail.gmail.com>
In-Reply-To: <20100913180447.GA1229@michelle.cdnetworks.com>
References:  <20100909102826.GB53812@rambler-co.ru> <20100909201050.GG7203@michelle.cdnetworks.com> <20100909211808.GJ7203@michelle.cdnetworks.com> <20100913142707.GL10050@rambler-co.ru> <20100913180447.GA1229@michelle.cdnetworks.com>

On Mon, Sep 13, 2010 at 9:04 PM, Pyun YongHyeon <pyunyh@gmail.com> wrote:
> On Mon, Sep 13, 2010 at 06:27:08PM +0400, Igor Sysoev wrote:
>> On Thu, Sep 09, 2010 at 02:18:08PM -0700, Pyun YongHyeon wrote:
>>
>> > On Thu, Sep 09, 2010 at 01:10:50PM -0700, Pyun YongHyeon wrote:
>> > > On Thu, Sep 09, 2010 at 02:28:26PM +0400, Igor Sysoev wrote:
>> > > > Hi,
>> > > >
>> > > > I have several hosts running FreeBSD/amd64 7.2-STABLE updated on 11.01.2010
>> > > > and 25.02.2010. Hosts process about 10K input and 10K output packets/s
>> > > > without issues. One of them, however, is loaded more than others, so it
>> > > > processes 20K/20K packets/s.
>> > > >
>> > > > Recently, I have upgraded one host to 7.3-STABLE, 24.08.2010.
>> > > > Then bge on this host hung two times. I was able to restart it from
>> > > > console using:
>> > > >   /etc/rc.d/netif restart bge0
>> > > >
>> > > > Then I have upgraded the most loaded (20K/20K) host to 7.3-STABLE, 07.09.2010.
>> > > > After reboot bge hung every several seconds. I was able to restart it,
>> > > > but bge hung again after several seconds.
>> > > >
>> > > > Then I have downgraded this host to 7.3-STABLE, 14.08.2010, since there
>> > > > were several if_bge.c commits on 15.08.2010. The same hangs.
>> > > > Then I have downgraded this host to 7.3-STABLE, 17.03.2010, before
>> > > > the first if_bge.c commit after 25.02.2010. Now it runs without hangs.
>> > > >
>> > > > The hosts are amd64 dual-core SMP machines with 4G of memory. bge information:
>> > > >
>> > > > bge0@pci0:4:0:0:        class=0x020000 card=0x165914e4 chip=0x165914e4 rev=0x11 hdr=0x00
>> > > >     vendor     = 'Broadcom Corporation'
>> > > >     device     = 'NetXtreme Gigabit Ethernet PCI Express (BCM5721)'
>> > > >
>> > > > bge0: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x004101> mem 0xfe5f0000-0xfe5fffff irq 19 at device 0.0 on pci4
>> > > > miibus1: <MII bus> on bge0
>> > > > brgphy0: <BCM5750 10/100/1000baseTX PHY> PHY 1 on miibus1
>> > > > brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
>> > > > bge0: Ethernet address: 00:e0:81:5f:6e:8a
>> > > >
>> > >
>> > > Could you show me the verbose boot messages (bge part only)?
>> > > Also show me the output of "pciconf -lcbv".
>> > >
>> >
>> > Forgot to send a patch. Let me know whether attached patch fixes
>> > the issue or not.
>>
>> > Index: sys/dev/bge/if_bge.c
>> > ===================================================================
>> > --- sys/dev/bge/if_bge.c    (revision 212341)
>> > +++ sys/dev/bge/if_bge.c    (working copy)
>> > @@ -3386,9 +3386,11 @@
>> >     sc->bge_rx_saved_considx = rx_cons;
>> >     bge_writembx(sc, BGE_MBX_RX_CONS0_LO, sc->bge_rx_saved_considx);
>> >     if (stdcnt)
>> > -           bge_writembx(sc, BGE_MBX_RX_STD_PROD_LO, sc->bge_std);
>> > +           bge_writembx(sc, BGE_MBX_RX_STD_PROD_LO, (sc->bge_std +
>> > +               BGE_STD_RX_RING_CNT - 1) % BGE_STD_RX_RING_CNT);
>> >     if (jumbocnt)
>> > -           bge_writembx(sc, BGE_MBX_RX_JUMBO_PROD_LO, sc->bge_jumbo);
>> > +           bge_writembx(sc, BGE_MBX_RX_JUMBO_PROD_LO, (sc->bge_jumbo +
>> > +               BGE_JUMBO_RX_RING_CNT - 1) % BGE_JUMBO_RX_RING_CNT);
>> >  #ifdef notyet
>> >     /*
>> >      * This register wraps very quickly under heavy packet drops.
>>
>> Thank you, it seems the patch has fixed the bug.
>> BTW, I noticed the same hangs on FreeBSD 8.1, date=2010.09.06.23.59.59
>> I will apply the patch on all my updated hosts.
>>
>
> Thanks for testing. I'm afraid bge(4) in HEAD, stable/8 and
> stable/7 (including 8.1-RELEASE and 7.3-RELEASE) may suffer from
> this issue. Let me know what other hosts work with the patch.

Hi Pyun,

Thanks for the patch. It seems to have fixed the symptom in my case,
on a card identical to Igor's, but on board of an IBM eServer 306m.

Regards,
Vlad

-- 
Good, fast & cheap. Pick any two.
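
The index arithmetic in the quoted patch can be read as plain ring-buffer math: instead of the raw index, the patch writes `(index + CNT - 1) % CNT` to the producer mailbox. The following is a minimal standalone sketch of that expression, not driver code. `ring_prev_index` and `RING_CNT` are hypothetical names, and the ring size of 512 is an assumption chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ring size, chosen only for illustration. */
#define RING_CNT 512u

/*
 * Return the ring slot immediately before idx, wrapping from 0 to cnt - 1.
 * Adding cnt before subtracting 1 keeps the unsigned value from underflowing
 * when idx is 0.
 */
static uint32_t
ring_prev_index(uint32_t idx, uint32_t cnt)
{
	assert(cnt > 0 && idx < cnt);
	return ((idx + cnt - 1) % cnt);
}
```

For example, `ring_prev_index(0, RING_CNT)` wraps around to the last slot of the ring, while `ring_prev_index(1, RING_CNT)` gives slot 0.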


