Date:      Fri, 21 Jan 2011 01:21:42 -0800
From:      Jan Koum <jan@whatsapp.com>
To:        Mike Tancsa <mike@sentex.net>, Jack Vogel <jfvogel@gmail.com>
Cc:        "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, Sean Bruno <seanbru@yahoo-inc.com>, Ivan Voras <ivoras@freebsd.org>, "freebsd-hardware@freebsd.org" <freebsd-hardware@freebsd.org>
Subject:   Re: em driver, 82574L chip, and possibly ASPM
Message-ID:  <AANLkTimFzYZOkwdExm5JPRB7BaN8Am8pPcgrMT0wVZqy@mail.gmail.com>
In-Reply-To: <4D2C636B.5040003@sentex.net>
References:  <icgd44$89l$1@dough.gmane.org> <1290533941.3173.50.camel@home-yahoo> <4CEC0548.1080801@sentex.net> <AANLkTim82pWyf_X%2Bu72uj8RkWeRUb_4KSQ8B_HpNYsP9@mail.gmail.com> <AANLkTinO1yfN--_K63-yD1LY3wusOF7wB2wwG8DUd5Z4@mail.gmail.com> <4D2C636B.5040003@sentex.net>

On Tue, Jan 11, 2011 at 6:04 AM, Mike Tancsa <mike@sentex.net> wrote:

> On 12/24/2010 5:44 PM, Jan Koum wrote:
> > hi Ivan and Mike,
> >
> > wanted to follow up and see if you found a solid long-term solution to this
> > bug. we are still seeing this problem in our 8.2 environment with ASPM
> > already disabled.  here is what we have:
> >
> > 1. motherboard is SuperMicro X8SIE-LN4F Intel Xeon:
> >
>
> Hi Jack,
>        Looks like this problem is not completely gone :(



Dear Mike and Jack,

sadly the problem is not gone for us either.  here is what we know so far:

- we are running latest e1000 drivers from 8.2
- we have ASPM disabled and we have the following bios settings:
http://camel.ethereal.net/~jkb/bios/
- we have updated our BIOS and IPMI to the latest firmware available from
our ISP
- we added hw.em.rxd=4096 and hw.em.txd=4096 to loader.conf (thanks Sean for
this tip -- these settings made the problem happen much less frequently, but
it didn't go away completely)
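
for reference, the two tunables from the last item would look roughly like this in loader.conf (a sketch of just those two lines, assuming the usual /boot/loader.conf location; loader tunables are read at boot, so they only take effect after a reboot):

```conf
# /boot/loader.conf (path assumed; only the two settings named above)
hw.em.rxd=4096
hw.em.txd=4096
```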

the ONLY thing sort of saving our butts right now is this cron job that runs
every minute:

# check connectivity; if the ping fails, log it and bounce em1
if /sbin/ping -q -c 10 www.google.com > /dev/null; then
 echo "OK" >> /tmp/em1.out
else
 date >> /tmp/em1.out
 echo "broken" >> /tmp/em1.out
 /sbin/ifconfig em1 down
 sleep 3
 /sbin/ifconfig em1 up
fi
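
the same check can be sketched as a small parameterized function, which makes it easier to point at a different interface or ping target and to try out without touching a live NIC. this is only an illustration of the workaround above, not a fix: the function name and the IFACE, TARGET, LOG, PING, IFCONFIG and PAUSE variables are hypothetical, with defaults taken from the cron job.

```shell
#!/bin/sh
# Sketch only: the ping-and-bounce check wrapped in a function.
# watchdog and every variable name below are illustrative assumptions;
# the defaults mirror the cron job in this message.

watchdog() {
    iface="${IFACE:-em1}"                       # interface to reset
    target="${TARGET:-www.google.com}"          # host to ping
    log="${LOG:-/tmp/em1.out}"                  # log file
    ping_cmd="${PING:-/sbin/ping}"              # overridable for testing
    ifconfig_cmd="${IFCONFIG:-/sbin/ifconfig}"  # overridable for testing
    pause="${PAUSE:-3}"                         # seconds between down and up

    if "$ping_cmd" -q -c 10 "$target" > /dev/null 2>&1; then
        echo "OK" >> "$log"
    else
        # ping failed: record when, then bounce the interface
        date >> "$log"
        echo "broken" >> "$log"
        "$ifconfig_cmd" "$iface" down
        sleep "$pause"
        "$ifconfig_cmd" "$iface" up
    fi
}
```

saved as a script with a final `watchdog` line, it could be run from cron every minute like the original. setting TARGET to something closer than www.google.com (the default gateway, for example) might avoid bouncing the interface on an unrelated DNS or upstream failure, though that is a guess and not something we have tested.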

this is obviously no way to run high performance web servers in the long
term.

as mentioned in my previous emails, let us know if you need any further
information or if you have any debug code you want us to run in production.
we are willing to do pretty much anything to get this issue fixed.

thanks


