From owner-freebsd-hackers Wed Jun 14 9: 3:35 2000
Delivered-To: freebsd-hackers@freebsd.org
Received: from ns3.safety.net (ns3.safety.net [216.200.162.38])
	by hub.freebsd.org (Postfix) with ESMTP id 3686637C161
	for <freebsd-hackers@freebsd.org>; Wed, 14 Jun 2000 09:03:31 -0700 (PDT)
	(envelope-from les@ns3.safety.net)
Received: (from les@localhost)
	by ns3.safety.net (8.9.3/8.9.3) id JAA80849
	for freebsd-hackers@freebsd.org; Wed, 14 Jun 2000 09:03:29 -0700 (MST)
	(envelope-from les)
From: Les Biffle <les@safety.net>
Message-Id: <200006141603.JAA80849@ns3.safety.net>
Subject: Conflict between Intel 82558/9 and VIA MVP4?
To: freebsd-hackers@freebsd.org
Date: Wed, 14 Jun 2000 09:03:24 -0700 (MST)
Reply-To: les@safety.net
X-Mailer: ELM [version 2.4ME+ PL54 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

We're having problems with the Intel EtherExpress 10/100 NICs in our
product platform.  We suspect an unfavorable interaction between the
82558 and 82559 Intel parts and our motherboard chipset.  Here are
some specifics:

We're using 3.4-STABLE, with the "latest" fxp driver code:

$FreeBSD: src/sys/pci/if_fxp.c,v 1.59.2.7 2000/04/01 19:04:21 dg Exp $
$FreeBSD: src/sys/pci/if_fxpreg.h,v 1.13.2.3 1999/12/06 20:11:53 peter Exp $
$FreeBSD: src/sys/pci/if_fxpvar.h,v 1.6.2.2 2000/04/01 19:04:22 dg Exp $

The platform is a small PC designed for the point-of-sale folks, and
uses the VIA Apollo MVP4 chipset.  From dmesg:

chip0: rev 0x02 on pci0.0.0
chip1: rev 0x00 on pci0.1.0
chip2: rev 0x14 on pci0.7.0
chip3: rev 0x10 on pci0.7.4

We use an AMD K6-2 at 350 or 450 MHz, 32MB of RAM, and boot from
Compact Flash.  The two PCI slots are on a riser card.  Also on the
riser card is a RealTek 8139 10/100 interface, which works quite well:

rl0: rev 0x10 int a irq 12 on pci0.13.0

We can install other RealTek-based NICs in either or both riser-card
PCI slots, and they work well, as do WAN cards.  The problem comes
when we install a NIC based on the Intel 82558 or 82559 parts.

When the NIC is in the "top" slot on the riser (pci0.1.19), the
kernel panics in if_fxp.c at fxp_add_rfabuf + 0xc4.  The backtrace
says fxp_add_rfabuf was called from fxp_intr.

With the NIC in the "bottom" slot (pci0.1.17), there is no panic, but
the card gets choked up and seems not to listen reliably.  For
example, it will hear an ARP reply if it sent the ARP request, but
will ignore an inbound ARP request.  My sniffer shows the packets on
the link, but "netstat -i" gives no indication that the NIC saw them.

Further watching of a "netstat -i -w 1" display shows something very
puzzling and troubling.  When the card _is_ working, the transmitted
and received byte counts get updated in the display, but the
associated packet counts don't go up for one or two seconds.  When
the card is NOT working right (doesn't hear), the bytes-received
counts will increment and the packets-received counts WON'T.

Here's the display for a "working" NIC on a quiet subnet that has a
single machine sending broadcasts every 3 seconds, plus a quick
100-packet flood ping of that machine.  Note the two-second delay
before the packet counts catch up:

            input         (fxp0)          output
   packets  errs     bytes    packets  errs     bytes colls
         1     0         0          0     0         0     0
         0     0        71          0     0         0     0
         0     0         0          0     0         0     0
         1     0      9800          0     0      9800     0
         0     0        71          0     0         0     0
       100     0         0        100     0         0     0
         1     0         0          0     0         0     0
         0     0        71          0     0         0     0
         0     0         0          0     0         0     0
         1     0         0          0     0         0     0

Our mbuf levels are hitting really high peaks, and I suspect that
whatever is hanging onto the packets is responsible for that.
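Returning to the panic for a moment: I can't point at a line of code,
but the shape of the refill path makes me suspicious.  Below is a toy
model of a receive-ring refill step of the general kind that
fxp_add_rfabuf performs.  Every name in it is invented and it is NOT
the driver code; the point is the tail-link step.  The refill path is
the first software to dereference the old tail buffer after the
hardware has been DMAing into that memory, so if a DMA went astray
(our chipset-interaction suspicion), this is exactly where the damage
would first blow up:

#include <stddef.h>

struct rxbuf {
	struct rxbuf	*link;		/* next buffer handed to the card */
	char		 data[1536];	/* storage the card DMAs into */
};

static struct rxbuf *rx_tail;		/* last buffer given to the card */

/*
 * Add one buffer to the tail of the receive ring.  "fresh" is a
 * newly allocated buffer, or NULL if allocation failed; "recycled"
 * is the just-emptied old buffer we can reuse in that case.
 */
static int
rx_refill(struct rxbuf *fresh, struct rxbuf *recycled)
{
	struct rxbuf *nb = (fresh != NULL) ? fresh : recycled;

	if (nb == NULL)
		return (1);		/* nothing to chain in; retry later */
	nb->link = NULL;
	if (rx_tail != NULL)
		rx_tail->link = nb;	/* blows up here if a stray DMA
					   clobbered the tail buffer or
					   the pointer leading to it */
	rx_tail = nb;
	return (0);
}

That would at least be consistent with the crash landing in
fxp_add_rfabuf (called from fxp_intr) even if the actual corruption
happens somewhere else entirely.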
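On the counting weirdness: the only model I've come up with that
reproduces the display is one where bytes are counted at interrupt
time but packets only when frames are finally delivered, with the
frames (and their mbufs) parked on a queue in between.  Here's a toy
sketch of that model; again, every name in it is made up, and I'm not
claiming this is how if_fxp.c or the stack actually divides the work:

#include <stddef.h>

struct frame {
	struct frame	*next;
	unsigned	 len;
};

static struct frame	*rxq_head;	/* received, not yet delivered */
static unsigned long	 if_ipackets;	/* bumped only on delivery */
static unsigned long	 if_ibytes;	/* bumped at "interrupt" time */

/*
 * Interrupt side: the byte count is bumped as soon as the frame is
 * off the wire, then the frame is parked on a queue instead of being
 * handed up immediately.
 */
static void
rx_intr(struct frame *f)
{
	if_ibytes += f->len;		/* netstat sees the bytes at once */
	f->next = rxq_head;		/* frame and its buffer now sit */
	rxq_head = f;			/* in limbo on this queue */
}

/*
 * Deferred side: only when the queue is drained -- maybe a second or
 * two later, or never if the card wedges -- does the packet count
 * catch up and the buffer get released.
 */
static void
rx_drain(void)
{
	struct frame *f;

	while ((f = rxq_head) != NULL) {
		rxq_head = f->next;
		if_ipackets++;		/* packet count finally catches up */
		/* deliver f and free its buffer here */
	}
}

If our frames really are sitting in some such in-between spot for a
second or two, that alone would explain both the lagging packet
counts and the mbuf peaks.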
Other NICs in the same situation (including the much-maligned
RealTek) don't exhibit this delayed-count symptom, and don't run up
our peak mbufs.

In addition to causing massive mbuf peaks, the Intel NICs do
something else ugly.  They appear to get choked up when they can't
get rid of queued output as quickly as they would like.  A 10Mbps
shared-media segment will have many, many collisions when
transferring a file or doing a flood ping between two fast FreeBSD
boxes, and a bunch of the queued output mbufs wind up in limbo.
Changing to a full-duplex 100Mbps connection between the boxes
eliminates the buffer-loss problem, but does not stop the NIC from
having its receive or panic problems.  We see the mbuf-peak symptoms
on other motherboards as well, but not the ignored received packets.

The NICs we have tried are the Intel EtherExpress Pro/100B, Pro/100+,
and the new EtherExpress Pro/100+ Management Adapters.

The Management Adapters have another side effect on our platform.
They have the Wake-on-LAN function integrated, as well as a net-boot
ROM installed and enabled by default.  Intel has a utility called
"brow" that will modify the settings of these new features, and we
routinely turn off both the WOL and net-boot facilities.  We have to
do this in a PC that is not one of our shipping product platforms,
because our product won't get through the BIOS PCI scan with these
"features" enabled.

Can somebody help us here?  We're in a bit of a panic.

Best regards,

-Les

--
Les Biffle                            Community Service... Just Say NO!
(480) 778-0177   les@safety.net   http://www.les.safety.net/
Network Safety, 7802 E Gray Rd Ste 500, Scottsdale, AZ 85260


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message