Date:      Thu, 12 Nov 1998 09:50:12 -0500 (EST)
From:      Bill Paul <wpaul@skynet.ctr.columbia.edu>
To:        svd@kbtelecom.nalnet.ru (Sergey V.Dorokhov)
Cc:        freebsd-bugs@FreeBSD.ORG
Subject:   Re: xl ethernet driver bug
Message-ID:  <199811121450.JAA28758@skynet.ctr.columbia.edu>
In-Reply-To: <199811121131.OAA24109@kbtelecom.nalnet.ru> from "Sergey V.Dorokhov" at Nov 12, 98 02:31:24 pm

Of all the gin joints in all the towns in all the world, Sergey
V.Dorokhov had to walk into mine and say:

> Hello !
> 
> I have "3Com 3c905 Fast Etherlink XL 10/100BaseTX" ethernet card.
> Under 3.0-980520-SNAP  this card worked with 'vx0' driver. Now
> it work with a new driver xl0 (which uses bus master DMA).
> But really it work unstable. From time to time xl driver output
> error message and computer lose link to other computers.
> Commands 'ifconfig xl0 down; ifconfig xl0 up' fix trouble for short time
> but then kernel output message 'panic....'

No, it didn't output a message 'panic....' The panic message contained
much more than this, but since you chose not to share this information
with us, there's no way to tell what caused the panic. Don't assume
that we all know what each and every panic message looks like and that
we can tell instantly what's wrong without ever seeing it. If you see
a panic, WRITE IT DOWN.

> and computer reboot.
> I send you file '/var/log/messages'.
> Can you fix this trouble?

Not yet, no, because I don't know what the trouble is yet. I can tell
you what the messages mean:

/*
 * TX status codes
 */
#define XL_TXSTATUS_RECLAIM     0x02 /* 3c905B only */
#define XL_TXSTATUS_OVERFLOW    0x04
#define XL_TXSTATUS_MAXCOLS     0x08
#define XL_TXSTATUS_UNDERRUN    0x10
#define XL_TXSTATUS_JABBER      0x20
#define XL_TXSTATUS_INTREQ      0x40
#define XL_TXSTATUS_COMPLETE    0x80

The 0x90 message means 'DMA transfer was complete' and 'there was a
transmit underrun error' (0x80|0x10). The 0xd0 message means that,
plus the driver requested an interrupt to be triggered on transmission
(0x80|0x10|0x40). The main problem seems to be transmit underruns.
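
If you want to decode other status values the same way, here is a minimal
sketch. It reuses the bit definitions quoted above; the helper name
xl_txstatus_decode() and the short flag names it prints are made up for
illustration and are not part of the driver.

```c
#include <stdio.h>
#include <string.h>

/* TX status bits, as quoted above. */
#define XL_TXSTATUS_RECLAIM     0x02 /* 3c905B only */
#define XL_TXSTATUS_OVERFLOW    0x04
#define XL_TXSTATUS_MAXCOLS     0x08
#define XL_TXSTATUS_UNDERRUN    0x10
#define XL_TXSTATUS_JABBER      0x20
#define XL_TXSTATUS_INTREQ      0x40
#define XL_TXSTATUS_COMPLETE    0x80

/*
 * Hypothetical helper (not in the xl driver): write the names of all
 * bits set in 'status' into 'buf', separated by '|', lowest bit first.
 * 'len' must be at least 1. Returns 'buf'.
 */
static const char *
xl_txstatus_decode(unsigned char status, char *buf, size_t len)
{
	static const struct {
		unsigned char	bit;
		const char	*name;
	} flags[] = {
		{ XL_TXSTATUS_RECLAIM,	"RECLAIM" },
		{ XL_TXSTATUS_OVERFLOW,	"OVERFLOW" },
		{ XL_TXSTATUS_MAXCOLS,	"MAXCOLS" },
		{ XL_TXSTATUS_UNDERRUN,	"UNDERRUN" },
		{ XL_TXSTATUS_JABBER,	"JABBER" },
		{ XL_TXSTATUS_INTREQ,	"INTREQ" },
		{ XL_TXSTATUS_COMPLETE,	"COMPLETE" },
	};
	size_t i;

	buf[0] = '\0';
	for (i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
		if ((status & flags[i].bit) == 0)
			continue;
		/* Add a separator before every name except the first. */
		if (buf[0] != '\0')
			strncat(buf, "|", len - strlen(buf) - 1);
		strncat(buf, flags[i].name, len - strlen(buf) - 1);
	}
	return (buf);
}
```

Called with a buffer of 128 bytes or so, xl_txstatus_decode(0x90, buf,
sizeof(buf)) yields "UNDERRUN|COMPLETE" and 0xd0 yields
"UNDERRUN|INTREQ|COMPLETE", which matches the reading given above.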

What I can't figure out is why there would be transmit underruns. I
never got any on my test equipment. (I really need to grab my 3c905
card back from the guy I loaned it to.)

There's something I find very confusing. I've gotten a couple of bug
reports that all seem to have the same things in common:

- The people sending the reports are all in .ru.
- The reports all concern 3c905 cards (as opposed to the 3c905B).
- I'm never able to reproduce _any_ of the reported problems locally.

I'm starting to wonder about a few things. I hope you won't be offended
if this sounds patronizing, but I have to ask:

- I notice you have a bunch of disk drives. Are these internal? If so,
  are you sure the power supply in your system is able to provide enough
  power to run all these devices?

- What sort of cable are you using to connect this machine to its link
  partner? (Hub, switch, whatever.) Did you buy it pre-made or did you
  make it yourself? Are you sure it's terminated correctly for ethernet
  use? I ask this because recently I've encountered people who have been
  operating under the mistaken assumption that it's okay to use twisted
  pair cabling that's wired 'straight through' (i.e. all four pairs
  wired immediately adjacent to each other: first pair on pins 1 and 2,
  second pair on pins 3 and 4, third pair on 5 and 6 and fourth pair on
  7 and 8). In fact, I have also seen pre-made patch cables wired
  this way. This is actually wrong: ethernet cabling has to be wired in
  a special way: 'straight through' will not work. It may appear to work
  at low speeds and a continuity check will show the cable to be okay,
  but it will cause all kinds of problems if you try it at 100Mbps. (It
  may not even work right at 10Mbps, depending on the exact circumstances.)

  Another question is: are you certain the cables are category 5? Cat 3
  cable won't cut it at 100Mbps.

Assuming the problem isn't with cabling, then there's only one thing
I can suggest that you try. In /sys/pci/if_xlreg.h, you'll see a couple
of #defines like this:

#define XL_RX_LIST_CNT          16
#define XL_TX_LIST_CNT          16

Change these to:

#define XL_RX_LIST_CNT          64
#define XL_TX_LIST_CNT          64

This will increase the size of the transmit and receive rings. I don't
know for sure that this will help, but it couldn't hurt.

Also, TELL US EXACTLY WHAT THE PANIC MESSAGE SAID! Without knowing
the exact message, it's hard to tell if the panic was actually caused
by the driver.

-Bill

-- 
=============================================================================
-Bill Paul            (212) 854-6020 | System Manager, Master of Unix-Fu
Work:         wpaul@ctr.columbia.edu | Center for Telecommunications Research
Home:  wpaul@skynet.ctr.columbia.edu | Columbia University, New York City
=============================================================================
 "It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness"
=============================================================================

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message


