Date:      Thu, 12 Nov 2009 14:18:00 -0900
From:      Royce Williams <royce.williams@gmail.com>
To:        Jeremy Chadwick <freebsd@jdc.parodius.com>, Jack Vogel <jfvogel@gmail.com>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: 82573 xfers pause, no watchdog timeouts, DCGDIS ineffective  (7.2-R)
Message-ID:  <9dd082310911121518l24adaa23jdb41ff567374d11c@mail.gmail.com>
In-Reply-To: <20091112204736.GA29095@icarus.home.lan>
References:  <4AFC63B0.5020707@alaska.net> <20091112204736.GA29095@icarus.home.lan>

On Thu, Nov 12, 2009 at 11:47 AM, Jeremy Chadwick
<freebsd@jdc.parodius.com> wrote:
> Please define "low-throughput" and "high-volume" if you could; it might
> help folks determine where the threshold is for problems.

My definitions are pretty subjective/operational, but for what it's worth:

- "low" is interactive SSH, DNS lookups, and pings;
- "high" is a single unthrottled rsync session.

>> rand# sysctl dev.em
>> dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 6.9.6

>> dev.em.0.%pnpinfo: vendor=0x8086 device=0x108c subvendor=0x15d9 subdevice=0x108c class=0x020000

>> kenv:
>>
>> rand# kenv | grep smbios | egrep -v 'socket|serial|uuid|tag|0123456789'
>> smbios.bios.reldate="03/05/2008"

> For what it's worth as a comparison base:
>
> We use the following Supermicro SuperServers, and can confirm that no
> such issues occur for us using RELENG_6 nor RELENG_7 on the following
> hardware:

[good cross-check list snipped]

The problem system is a 5015M-MF.  We are running 5015M-MT+ and
5015T-PR on RELENG_6 and 7, both without the symptom.

> Relevant server configuration and network setup details:
>
> - All machines use pf(4).
> - All emX devices are configured for autoneg.
> - All emX devices use RXCSUM, TXCSUM, and TSO4.
> - We do not use polling.
> - All machines use both NICs simultaneously at all times.
> - All machines connected to an HP ProCurve 2626 switch (100mbit,
>  full-duplex ports, all autoneg).
> - We do not use Jumbo frames.
> - No add-in cards (PCI, PCI-X, nor PCIe) are used in the systems.
> - All of the systems had DCGDIS.EXE run on them; no EEPROM settings
>  were changed, indicating the from-the-Intel-factory MANC register
>  in question was set properly.

No firewall is active on the problem system, and none of this batch
has been DCGDIS-ified, but otherwise, our setup is identical.

> I've compared your sysctl dev.em output to that of our 5015M-T+B systems
> (which use the PDSMi+, not the PDSMi, but whatever), and ours is 100%
> identical.
>
> All of our 5015M-T+B systems are using BIOS 1.3, and the 5015B-MTB
> system is using BIOS 1.30.

The repurposed system is at 1.3 (03/05/2008), flashed prior to
install. The production 6.3 systems are using 1.1 (or possibly 1.1A; I
would have to reboot to check, but the BIOS date is 10/27/2005).

> If you'd like, I can provide the exact BIOS settings we use on the
> machines in question; they do deviate from the factory defaults a slight
> bit, but none of the adjustments are "tweaks" for performance or
> otherwise (just disabling things which we don't use, etc.).

We're running similarly as well.

I might be able to retire another system of this batch and install
7.2, but leave the BIOS update off, to see if it makes a difference.

Royce
