Date:      Tue, 30 Mar 1999 09:33:38 -0500 (EST)
From:      wpaul@ctr.columbia.edu (Bill Paul)
To:        thyerm@camtech.com.au
Cc:        current@freebsd.org, shocking@prth.pgs.com
Subject:   Re: RealTek driver woes
Message-ID:  <199903301433.JAA00854@sirius.ctr.columbia.edu>
In-Reply-To: <Pine.BSF.4.10.9903302142340.480-100000@localhost> from "Matthew Thyer" at Mar 30, 99 09:53:36 pm

Of all the gin joints in all the towns in all the world, Matthew Thyer
had to walk into mine and say:
 
> There are certain RealTek chipsets that perform very badly in both Windows
> and FreeBSD in my experience.  This is due to poor hardware design as far
> as the FreeBSD driver author could see.

It happens though that I only have one RealTek 8139 board for testing,
which worked fine in all the machines I tried it in. However, I do not
have any AMD 486dx/100 machines, so I can't easily duplicate the operating
environment in order to reproduce the problem. And if I can't reproduce
the problem, I can't fix it. I don't think people realize this: telling
me that the board locks up under condition foo doesn't do me any good
unless I can reproduce the hang myself under condition foo.

I've certainly never had the whole machine wedge up while using the
RealTek driver. From experience, having the whole machine freeze
usually indicates one of two things: the kernel is stuck in an infinite
loop somewhere with interrupts disabled (like in an interrupt handler),
or there's some sort of weird hardware interaction that's actually
hanging the CPU or bus. There shouldn't be any loop conditions in the
driver if the hardware functions correctly -- there might be if it
functions incorrectly, like if it asserts its interrupt and the
interrupt handler is unable to clear it. Usually you can detect this
by putting in a loop counter and bailing out if it reaches a certain
ridiculously large value.
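
The loop-counter idea can be sketched roughly as follows. This is not
code from the rl driver: struct fake_nic, fake_nic_ack(), fake_intr() and
FAKE_INTR_LOOP_MAX are hypothetical names made up for illustration, and
the "device" is just a struct in memory. The point is only that a handler
which loops while the interrupt status is non-zero should count its
passes and give up when the count becomes absurd.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bound: far more passes than a healthy device would need. */
#define FAKE_INTR_LOOP_MAX 10000

/* Simulated device (an assumption for this sketch, not real hardware). */
struct fake_nic {
    uint16_t status;     /* non-zero means the interrupt is asserted */
    int      ack_broken; /* non-zero: acknowledging never clears status */
};

/* Acknowledge the given status bits, unless the fake hardware is broken. */
static void fake_nic_ack(struct fake_nic *nic, uint16_t bits)
{
    if (!nic->ack_broken)
        nic->status &= (uint16_t)~bits;
}

/*
 * Service the device until its status reads zero.
 * Returns the number of passes taken, or -1 if the loop counter
 * tripped because the status would not clear.
 */
int fake_intr(struct fake_nic *nic)
{
    int passes = 0;

    assert(nic != NULL);
    while (nic->status != 0) {
        if (++passes > FAKE_INTR_LOOP_MAX) {
            fprintf(stderr,
                "fake_intr: status 0x%04x will not clear, bailing out\n",
                (unsigned)nic->status);
            return -1;
        }
        fake_nic_ack(nic, nic->status);
    }
    return passes;
}
```

With ack_broken set to 0 and a non-zero status, fake_intr() returns after
one pass. With ack_broken set to 1 it returns -1 instead of spinning
forever, and the message it prints tells you which loop was stuck.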

If the machine does wedge up, then don't just sit there like a bump
on a log looking at it. If you have DDB compiled into the kernel, try
to break into the debugger on the console. If the system is caught in
a tight loop somewhere then you probably won't be able to do it, but
if you can, you can type 'trace' and get some idea of where the loop
is occurring by examining the stack trace.

If the interface just stops communicating with the other side, check
for an obvious reason. Look at netstat -in and check for input or output
errors. Do an 'ifconfig rl0' and see if the OACTIVE flag is set. This
usually indicates a stuck transmitter (although in this case you'll
probably also see a watchdog timeout error on the console).
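
If you want to watch for this from a script or a small monitoring
program, one rough way is to look for the OACTIVE name in the flags line
that ifconfig prints. The helper below is only a sketch:
ifconfig_has_flag() is a hypothetical name, and it assumes the flags are
printed as a comma-separated list between '<' and '>'. It matches whole
flag names only.

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if `flag` appears as a complete name inside the <...> list
 * of an ifconfig-style flags line, 0 otherwise.
 */
int ifconfig_has_flag(const char *line, const char *flag)
{
    const char *p, *end;
    size_t want;

    assert(line != NULL && flag != NULL);
    want = strlen(flag);

    p = strchr(line, '<');
    if (p == NULL)
        return 0;
    end = strchr(p, '>');
    if (end == NULL)
        return 0;

    p++; /* step past '<' */
    while (p < end) {
        /* Each name runs up to the next comma or the closing '>'. */
        const char *comma = memchr(p, ',', (size_t)(end - p));
        const char *stop = comma ? comma : end;

        if ((size_t)(stop - p) == want && strncmp(p, flag, want) == 0)
            return 1;
        p = stop + 1;
    }
    return 0;
}
```

For example, given an illustrative line such as
"rl0: flags=8c43<UP,BROADCAST,RUNNING,OACTIVE,SIMPLEX,MULTICAST> mtu 1500"
(made up for this sketch, not captured from a real machine),
ifconfig_has_flag(line, "OACTIVE") returns 1.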

Lastly, don't leave out important information about the operating
environment and conditions at the time the problem occurs. People
sometimes forget to mention important things because they think
they're obvious. This is wrong: nothing is obvious when I can't actually
see the machine. This includes simple things like: were you running some
program that places the interface in promiscuous mode, or were you
running X11, or is the machine configured as a router, or were you
using NAT or firewalling. Once you start noting down these things
for your problem report, take it upon yourself to investigate what
happens when you eliminate these conditions.

> Replace your network card with a $30 PCI 10/100 card that is not a RealTek
> such as the VIA Technologies VT3043 `Rhine I' and VT86C100A `Rhine II'
> chips and you'll get much better performance (FreeBSD 'vr' driver).

But not much better. :) At 10Mbps they're okay, but the VIA Rhine is
still a little brain damaged.
 
> Search the cvs-all mailing list archives for mail re: the 'rl' driver.
> 
> In my experience:
> 
> No one could get files from a friend of mine's Windows 95 box, but he
> could copy them to other people's machines.
> 
> If you did manage to get a small file it was corrupt.
> 
> Under FreeBSD we had the same lockups you are having (using UTP).
> If we used coax (BNC) it worked fine.  This was all at 10 Mbps.
> We didn't test Windows on coax.

I tested my sample board at both 10Mbps and 100Mbps and never had any
lockups. This is what's frustrating. Now, if I actually had access to
the machines that were experiencing the trouble I might actually be able
to debug the problem.
 
> Now that we have put a Rhine based card in his machine both Windows and
> FreeBSD are working fine at ~1 MB/s throughput (at 10 Mbps) and NFS is
> working fine.
> 
> On Thu, 25 Mar 1999, Stephen Hocking-Senior Programmer PGS Tensor Perth wrote:
> 
> > I'm running a RealTek ethernet card in a 486dx4-100 machine and am having some 
> > problems. Firstly, doing an ls on a nfs mounted directory exported from the 
> > RealTek machine hangs. According to tcpdump it is receiving the readdir 
> > packets. Secondly, it will hang solidly when acting as the receiver (haven't
> > tried it as the sender) running the netpipe tests (NPtcp -s -r receiving, the 
> > sender runs NP -t -h host_rl -s) - no DDB, just a solid hang. An ISA SMC card 
> > in the same machine is fine. I've tried it with RL_USEIOSPACE defined and 
> > undefined. This is running a very current system, with the id string
> > 
> > $Id: if_rl.c,v 1.12 1999/02/23 15:38:25 wpaul Exp$
> > 
> > Here's the dmesg output.
> > 
> > Copyright (c) 1992-1999 The FreeBSD Project.
> > Copyright (c) 1982, 1986, 1989, 1991, 1993
> >         The Regents of the University of California. All rights reserved.
> > FreeBSD 4.0-CURRENT #1: Thu Mar 25 21:37:03 WST 1999
> >     toor@bloop.craftncomp.com:/data/src/sys/compile/bleep
> > Timecounter "i8254"  frequency 1193182 Hz
> > CPU: AMD Enhanced Am486DX4 Write-Through (486-class CPU)
> >   Origin = "AuthenticAMD"  Id = 0x484  Stepping=4
> >   Features=0x1<FPU>
> > real memory  = 16777216 (16384K bytes)
> > avail memory = 13750272 (13428K bytes)
> > Preloaded elf kernel "kernel" at 0xc02c3000.
> > Preloaded elf module "linux.ko" at 0xc02c309c.
> > Probing for devices on PCI bus 0:
> > chip0: <Host to PCI bridge (vendor=10b9 device=1445)> rev 0x00 on pci0.0.0
> > rl0: <RealTek 8139 10/100BaseTX> rev 0x10 int a irq 9 on pci0.4.0
> > rl0: Ethernet address: 00:00:e8:53:a2:3e
> > rl0: autoneg complete, link status good (half-duplex, 10Mbps)

I'm going to go _way_ out on a limb here and suggest that you try and
coerce your system BIOS to assign the RealTek interface an IRQ other
than 9. I say 'try' because sometimes you aren't given the option to
configure this. Usually there's some configuration menu that lets
you 'reserve IRQs for legacy/ISA devices.' You should put IRQ 9 on the
reserved list so that the BIOS will pick another. Hopefully it will
be an IRQ that isn't shared with another device. If there aren't any
free ones left, you can try disabling one of the serial ports in order
to free up an IRQ (e.g. turn off COM2).

-Bill

-- 
=============================================================================
-Bill Paul            (212) 854-6020 | System Manager, Master of Unix-Fu
Work:         wpaul@ctr.columbia.edu | Center for Telecommunications Research
Home:  wpaul@skynet.ctr.columbia.edu | Columbia University, New York City
=============================================================================
"Mulder, toads just fell from the sky!" "I guess their parachutes didn't open."
=============================================================================

