Date:      Sun, 20 Jul 2003 12:21:59 -0700 (PDT)
From:      wpaul@FreeBSD.ORG (Bill Paul)
To:        shawn@buzzardnews.com (Shawn Ramsey)
Cc:        freebsd-net@FreeBSD.ORG
Subject:   Re: Lots of input errors...
Message-ID:  <20030720192159.8778037B405@hub.freebsd.org>
In-Reply-To: <009201c34e94$7272fa50$d3db75d8@shawn> from Shawn Ramsey at "Jul 20, 2003 00:56:36 am"


> > I would say the physical wire is probably bad.  Seeing unidirectional
> > errors in this case wouldn't be uncommon; one of the pair of the
> > receive wires may have issues.  Have you swapped the cable?  Most of
> > the time you won't see framing errors related to duplex mismatching.
> 
> It turned out to be a NIC issue. It was a 3COM 3c980C-TXM or something like
> that... I switched it to the braindead rl0 onboard adapter and the errors,
> and problems went away... pretty sad that "The worst ethernet adapter ever
> made" (according to the person who wrote the driver) beats out a pricey 3com
> adapter, although its probably just a driver issue.

It doesn't really 'beat out' the 3Com. It just happens not to be
flagging any RX errors in this one particular case. Sadly, it looks
like we'll never know the real reason for the RX errors with the 3Com
now that you've swapped it out.

There are three reasons why the xl driver might be reporting
input errors:

1) Unable to allocate a new mbuf in xl_rxeof()
2) XL_RXSTAT_UP_ERROR bit was set in an RX DMA descriptor in xl_rxeof()
3) RX overrun errors detected when reading the internal stats counter
   registers on the NIC in xl_stats_update().

I don't think you ran out of mbufs (you would have noticed) so that
rules out case #1. Checking cases #2 and #3 requires adding a little
instrumentation to the driver. If the XL_RXSTAT_UP_ERROR bit is being
detected in xl_rxeof(), you can print out the status word and see
if any of the following bits are also set:

#define XL_RXSTAT_UP_OVERRUN    0x00010000
#define XL_RXSTAT_RUNT          0x00020000
#define XL_RXSTAT_ALIGN         0x00040000
#define XL_RXSTAT_CRC           0x00080000
#define XL_RXSTAT_OVERSIZE      0x00100000
#define XL_RXSTAT_DRIBBLE       0x00800000
#define XL_RXSTAT_UP_OFLOW      0x01000000
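
A minimal sketch of that decoding step, written as a standalone user-space
C helper rather than an actual driver patch. The function name
xl_rxstat_decode() and the name table are hypothetical and are not part of
the xl driver; only the bit definitions come from the list above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Error bits of the RX status word, copied from the list above. */
#define XL_RXSTAT_UP_OVERRUN    0x00010000
#define XL_RXSTAT_RUNT          0x00020000
#define XL_RXSTAT_ALIGN         0x00040000
#define XL_RXSTAT_CRC           0x00080000
#define XL_RXSTAT_OVERSIZE      0x00100000
#define XL_RXSTAT_DRIBBLE       0x00800000
#define XL_RXSTAT_UP_OFLOW      0x01000000

/* Hypothetical lookup table pairing each bit with a printable name. */
struct xl_rxstat_name {
	uint32_t	 bit;
	const char	*name;
};

static const struct xl_rxstat_name xl_rxstat_names[] = {
	{ XL_RXSTAT_UP_OVERRUN,	"UP_OVERRUN" },
	{ XL_RXSTAT_RUNT,	"RUNT" },
	{ XL_RXSTAT_ALIGN,	"ALIGN" },
	{ XL_RXSTAT_CRC,	"CRC" },
	{ XL_RXSTAT_OVERSIZE,	"OVERSIZE" },
	{ XL_RXSTAT_DRIBBLE,	"DRIBBLE" },
	{ XL_RXSTAT_UP_OFLOW,	"UP_OFLOW" },
};

/*
 * Hypothetical helper (not in the xl driver): write the names of the
 * error bits set in 'status' into 'buf', separated by single spaces.
 * The text is truncated if 'buf' is too small. Returns the number of
 * error bits that were set, whether or not their names fit.
 */
static int
xl_rxstat_decode(uint32_t status, char *buf, size_t len)
{
	size_t i, used = 0;
	int n, count = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	for (i = 0; i < sizeof(xl_rxstat_names) / sizeof(xl_rxstat_names[0]);
	    i++) {
		if ((status & xl_rxstat_names[i].bit) == 0)
			continue;
		count++;
		if (used >= len - 1)
			continue;	/* buffer full, keep counting only */
		n = snprintf(buf + used, len - used, "%s%s",
		    used > 0 ? " " : "", xl_rxstat_names[i].name);
		if (n < 0)
			continue;
		/* On truncation snprintf() wrote len - used - 1 bytes. */
		if ((size_t)n < len - used)
			used += (size_t)n;
		else
			used = len - 1;
	}
	return (count);
}
```

In the driver itself, the same idea would presumably amount to printing
the raw status word plus the decoded names at the point in xl_rxeof()
where the error bit is detected, ideally rate-limited so a burst of bad
frames doesn't flood the console.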

You can also add some instrumentation to the xl_stats_update() routine.
Something tells me the problem is RX overruns, which means some of
the DMA parameters may need to be adjusted a little. However, after
going back and trying to dig up the previous e-mails from this thread
in the archives, I was unable to locate the following important info:

- How fast is this machine? (What CPU, speed, etc...)
- You say there's a gigE NIC in this machine too. What kind is it?
  (Driver, chipset, etc...)

I never did see any dmesg output from this box, which would have
answered these questions. Also, if you really want to provide some
idea about interrupt load, you should run systat -vmstat 1 while
the system is busy and note the interrupts per second handled by
each device.

What I did see was a lot of people holding forth about duplex mismatches
which, while they can be annoying, are not the only source of RX errors.
A duplex mismatch typically yields very low overall throughput and very
bursty traffic patterns.

-Bill

--
=============================================================================
-Bill Paul            (510) 749-2329 | Senior Engineer, Master of Unix-Fu
                 wpaul@windriver.com | Wind River Systems
=============================================================================
      "If stupidity were a handicap, you'd have the best parking spot."
=============================================================================


