From owner-freebsd-hackers Wed Jun 14 9: 3:35 2000
Delivered-To: freebsd-hackers@freebsd.org
Received: from ns3.safety.net (ns3.safety.net [216.200.162.38])
	by hub.freebsd.org (Postfix) with ESMTP id 3686637C161
	for <freebsd-hackers@freebsd.org>; Wed, 14 Jun 2000 09:03:31 -0700 (PDT)
	(envelope-from les@ns3.safety.net)
Received: (from les@localhost)
	by ns3.safety.net (8.9.3/8.9.3) id JAA80849
	for freebsd-hackers@freebsd.org; Wed, 14 Jun 2000 09:03:29 -0700 (MST)
	(envelope-from les)
From: Les Biffle <les@safety.net>
Message-Id: <200006141603.JAA80849@ns3.safety.net>
Subject: Conflict between Intel 82558/9 and VIA MVP4?
To: freebsd-hackers@freebsd.org
Date: Wed, 14 Jun 2000 09:03:24 -0700 (MST)
Reply-To: les@safety.net
X-Mailer: ELM [version 2.4ME+ PL54 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

We're having problems with the Intel EtherExpress 10/100 NICs in our
product platform.  We suspect an unfavorable interaction between the
82558 and 82559 Intel parts and our motherboard chipset.  Here are
some specifics:

We're using 3.4-STABLE, with the "latest" fxp driver code:

$FreeBSD: src/sys/pci/if_fxp.c,v 1.59.2.7 2000/04/01 19:04:21 dg Exp $
$FreeBSD: src/sys/pci/if_fxpreg.h,v 1.13.2.3 1999/12/06 20:11:53 peter Exp $
$FreeBSD: src/sys/pci/if_fxpvar.h,v 1.6.2.2 2000/04/01 19:04:22 dg Exp $

The platform is a small PC designed for the point-of-sale folks, and
uses the VIA Apollo MVP4 chipset.  From dmesg:

chip0: rev 0x02 on pci0.0.0
chip1: rev 0x00 on pci0.1.0
chip2: rev 0x14 on pci0.7.0
chip3: rev 0x10 on pci0.7.4

We use an AMD K6-2 at 350 or 450 MHz, 32MB of RAM, and boot from
Compact Flash.  The two PCI slots are on a riser card.  Also on the
riser card is a RealTek 8139 10/100 interface, which works quite well:

rl0: rev 0x10 int a irq 12 on pci0.13.0

We can install other RealTek-based NICs in either or both riser-card
PCI slots, and they work well, as do WAN cards.  The problem comes
when we install a NIC based on the Intel 82558 or 82559 parts.

When the NIC is in the "top" slot on the riser (pci0.1.19), the
kernel panics in if_fxp.c at fxp_add_rfabuf + 0xc4.  The backtrace
says fxp_add_rfabuf was called from fxp_intr.

With the NIC in the "bottom" slot (pci0.1.17), there is no panic, but
the card gets choked up and seems not to listen reliably.  For
example, it will hear an ARP reply if it sent the ARP request, but
will ignore an inbound ARP request.  My sniffer shows the packets on
the link, but "netstat -i" gives no indication that the NIC saw them.

Further watching of a "netstat -i -w 1" display shows something very
puzzling and troubling.  When the card _is_ working, the transmitted
and received byte counts get updated in the display, but the
associated packet counts don't go up for one or two seconds.  When
the card is NOT working right (doesn't hear), the bytes-received
counts will increment and the packets-received counts WON'T.

Here's the display for a "working" NIC on a quiet subnet that has a
single machine sending broadcasts every 3 seconds, plus a quick
100-packet flood ping of that machine.  Note the two-second delay
before the packet counts catch up:

            input         (fxp0)          output
   packets  errs     bytes    packets  errs     bytes colls
         1     0         0          0     0         0     0
         0     0        71          0     0         0     0
         0     0         0          0     0         0     0
         1     0      9800          0     0      9800     0
         0     0        71          0     0         0     0
       100     0         0        100     0         0     0
         1     0         0          0     0         0     0
         0     0        71          0     0         0     0
         0     0         0          0     0         0     0
         1     0         0          0     0         0     0

Our mbuf levels are hitting really high peaks, and I suspect that
whatever is hanging onto the packets is responsible for that.
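Returning to the panic for a moment: I can't point at a line of code,
but the shape of the refill path makes me suspicious.  Below is a toy
model of a receive-ring refill step of the general kind that
fxp_add_rfabuf performs.  Every name in it is invented and it is NOT
the driver code; the point is the tail-link step.  The refill path is
the first software to dereference the old tail buffer after the
hardware has been DMAing into that memory, so if a DMA went astray
(our chipset-interaction suspicion), this is exactly where the damage
would first blow up:

#include <stddef.h>

struct rxbuf {
	struct rxbuf	*link;		/* next buffer handed to the card */
	char		 data[1536];	/* storage the card DMAs into */
};

static struct rxbuf *rx_tail;		/* last buffer given to the card */

/*
 * Add one buffer to the tail of the receive ring.  "fresh" is a
 * newly allocated buffer, or NULL if allocation failed; "recycled"
 * is the just-emptied old buffer we can reuse in that case.
 */
static int
rx_refill(struct rxbuf *fresh, struct rxbuf *recycled)
{
	struct rxbuf *nb = (fresh != NULL) ? fresh : recycled;

	if (nb == NULL)
		return (1);		/* nothing to chain in; retry later */
	nb->link = NULL;
	if (rx_tail != NULL)
		rx_tail->link = nb;	/* blows up here if a stray DMA
					   clobbered the tail buffer or
					   the pointer leading to it */
	rx_tail = nb;
	return (0);
}

That would at least be consistent with the crash landing in
fxp_add_rfabuf (called from fxp_intr) even if the actual corruption
happens somewhere else entirely.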
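On the counting weirdness: the only model I've come up with that
reproduces the display is one where bytes are counted at interrupt
time but packets only when frames are finally delivered, with the
frames (and their mbufs) parked on a queue in between.  Here's a toy
sketch of that model; again, every name in it is made up, and I'm not
claiming this is how if_fxp.c or the stack actually divides the work:

#include <stddef.h>

struct frame {
	struct frame	*next;
	unsigned	 len;
};

static struct frame	*rxq_head;	/* received, not yet delivered */
static unsigned long	 if_ipackets;	/* bumped only on delivery */
static unsigned long	 if_ibytes;	/* bumped at "interrupt" time */

/*
 * Interrupt side: the byte count is bumped as soon as the frame is
 * off the wire, then the frame is parked on a queue instead of being
 * handed up immediately.
 */
static void
rx_intr(struct frame *f)
{
	if_ibytes += f->len;		/* netstat sees the bytes at once */
	f->next = rxq_head;		/* frame and its buffer now sit */
	rxq_head = f;			/* in limbo on this queue */
}

/*
 * Deferred side: only when the queue is drained -- maybe a second or
 * two later, or never if the card wedges -- does the packet count
 * catch up and the buffer get released.
 */
static void
rx_drain(void)
{
	struct frame *f;

	while ((f = rxq_head) != NULL) {
		rxq_head = f->next;
		if_ipackets++;		/* packet count finally catches up */
		/* deliver f and free its buffer here */
	}
}

If our frames really are sitting in some such in-between spot for a
second or two, that alone would explain both the lagging packet
counts and the mbuf peaks.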
Other NICs in the same situation (including the much-maligned
RealTek) don't exhibit this delayed-count symptom, and don't run up
our peak mbufs.

In addition to causing massive mbuf peaks, the Intel NICs do
something else ugly.  They appear to get choked up when they can't
get rid of queued output as quickly as they would like.  A 10Mbps
shared-media segment will have many, many collisions when
transferring a file or doing a flood ping between two fast FreeBSD
boxes, and a bunch of the queued output mbufs wind up in limbo.
Changing to a full-duplex 100Mbps connection between the boxes
eliminates the buffer-loss problem, but does not stop the NIC from
having its receive or panic problems.  We see the mbuf-peak symptoms
on other motherboards as well, but not the ignored received packets.

The NICs we have tried are the Intel EtherExpress Pro/100B, Pro/100+,
and the new EtherExpress Pro/100+ Management Adapters.

The Management Adapters have another side effect on our platform.
They have the Wake-on-LAN function integrated, as well as a net-boot
ROM installed and enabled by default.  Intel has a utility called
"brow" that will modify the settings of these new features, and we
routinely turn off both the WOL and net-boot facilities.  We have to
do this in a PC that is not one of our shipping product platforms,
because our product won't get through the BIOS PCI scan with these
"features" enabled.

Can somebody help us here?  We're in a bit of a panic.

Best regards,

-Les

--
Les Biffle                            Community Service... Just Say NO!
(480) 778-0177   les@safety.net   http://www.les.safety.net/
Network Safety, 7802 E Gray Rd Ste 500, Scottsdale, AZ 85260


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message