From owner-freebsd-current@FreeBSD.ORG Mon Nov 28 09:17:58 2005
X-Original-To: freebsd-current@FreeBSD.org
Delivered-To: freebsd-current@FreeBSD.org
Received: by hub.freebsd.org (Postfix, from userid 618)
	id BD80316A424; Mon, 28 Nov 2005 09:17:58 +0000 (GMT)
In-Reply-To: <20051128052238.GH6610@cs.rmit.edu.au>
	from Emil Mikulic at "Nov 28, 2005 04:22:39 pm"
To: emil@cs.rmit.edu.au (Emil Mikulic)
Date: Mon, 28 Nov 2005 09:17:58 +0000 (GMT)
X-Mailer: ELM [version 2.4ME+ PL54 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Message-Id: <20051128091758.BD80316A424@hub.freebsd.org>
From: wpaul@FreeBSD.ORG (Bill Paul)
Cc: freebsd-current@FreeBSD.org, glebius@FreeBSD.org
Subject: Re: bge driver autoneg failure and system-wide stalls
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
X-List-Received-Date: Mon, 28 Nov 2005 09:17:58 -0000

> On Fri, Nov 25, 2005 at 04:22:28PM +0300, Gleb Smirnoff wrote:
> > On Fri, Nov 25, 2005 at 01:20:41PM +1100, Emil Mikulic wrote:
> > E> The other problem is that bge will never negotiate a working link speed.
> > E> ifconfig will always return "status: no carrier"
> > E>
> > E> If I force the media to 10baseT/UTP or 100baseTX (either mediaopt
> > E> full-duplex or not), it will issue a couple more MII_TICKs then stop,
> > E> ifconfig will return "status: active", there will be no more stalls,
> > E> and, most importantly, the network connection will actually work.
> >
> > Please try out the attached patch.
>
> No effect.

In your original e-mail, you write:

> I have a network port with bad wiring in the walls - a cable tester
> shows only wires 1,2,3 and 6 are actually connected.

Actually, this is not 'bad' wiring.
It's correct for 10/100 ethernet as long as a) the cabling is actually
cat5, and not moldy old cat3 or something, and b) the four wires are
actually connected in the right sequence. Pins 1 and 2 form one pair,
and pins 3 and 6 form the second pair. A typical installation may have
the orange/orange+white pair on pins 1 and 2, and the blue/blue+white
pair on 3 and 6. And both sides must match. If it's not done this way,
then while you may have a DC path between all 4 pins on each side, you
won't be getting the proper noise cancellation effect of twisted pair
cabling. This can cause signal distortion, dropped packets, and possibly
botched autoneg.

You didn't say if you checked for this though, so we can't speculate
if this is really the problem. If the pairs are wrong, then that could
be why autoneg is failing. It's also the least of your worries, since
even if you could convince the software to establish a link, you might
end up with rotten performance.

A couple things you neglected to mention (and which Gleb failed to ask
you about):

- Exactly what kind of switch is on the other end of this wiring?

- Is the port that corresponds to this wall jack a gigabit ethernet
  port, or just 10/100?

If it is a gigE port, then you're being silly. 4 pairs are required for
gigE. Period. The NWAY autonegotiation exchange can take place over just
2 pairs, but the gigE signalling scheme requires all 4 pairs to be
present in order to establish a link. If there's just two pairs
connected, both sides can announce that they support gigabit speeds,
and both sides will try configuring themselves for gigE operation, but
no link will ever be established.

If you manually override the autonegotiation in this case, you should
do "ifconfig bge0 media 100baseTX" only. Do not specify full duplex.
This won't work.
When you manually select the mode, autoneg will be turned off, and the
other side will rely on parallel detection to select the appropriate
link speed, but it won't be able to sense if the link partner is in
full or half duplex mode, so it will default to half. If you manually
specify full, this will create a duplex mismatch, and you'll get rotten
throughput.

If the switch port is 10/100 and not gigE, then autoneg should be
working properly, and I don't know why it isn't.

As an aside, I really don't understand the purpose of the brgphy_loop()
function. (I didn't write it.) It looks like it tries to put the PHY
into loopback mode, and then waits for the PHY to report that there's
a good link. I'm not really sure of the point here. I mean, you can do
that, but I don't understand why. Also, the DELAY(10) here can probably
be replaced with a tsleep() or something, which will allow the CPU to do
other work while waiting for the PHY instead of hard busywaiting and
blocking up the whole system (allowing a reschedule here should not
hurt).

> It still can't autoselect a working media, and I still get loops in
> miibus.
[...]
> Anything else I can try?
>
> --Emil

If the switch port really is 10/100, then maybe, just maybe, you can
try increasing the autoneg timeout. In brgphy_service(), you'll see
this:

		/*
		 * Only retry autonegotiation every 5 seconds.
		 */
		if (++sc->mii_ticks <= 5)
			break;

Change the 5 to a 10 and see if that helps. But if you really are trying
to autoneg a link with a gigE switch port, this won't make any
difference.

If the switch is managed, and you have the password to it, you can try
programming it to only announce 10/100 support on that port until such
time as you can recable the place for gigE. Alternatively, you can
attempt to steal two pairs from a neighboring cable that leads to the
same jack.
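
For reference, the pair-count and parallel-detection reasoning earlier
in this message can be written down as a small user-space model. This
is only a sketch under stated assumptions: it is not driver code, every
name in it (struct port, resolve(), duplex_mismatch(), and so on) is
made up for illustration, and it deliberately ignores things real
hardware may do, such as retrying at a lower speed.

```c
#include <stdbool.h>

/* Hypothetical model of link resolution. None of these names come from
 * the bge or brgphy drivers; they exist only for this illustration. */
enum speed { SPEED_NONE = 0, SPEED_10 = 10, SPEED_100 = 100, SPEED_1000 = 1000 };

struct port {
	bool autoneg;      /* true: NWAY autoneg on; false: media forced by hand */
	enum speed speed;  /* best advertised speed, or the forced speed */
	bool full_duplex;  /* advertised (autoneg) or forced duplex */
};

struct link {
	enum speed speed;  /* SPEED_NONE stands for "no carrier" */
	bool a_full;       /* duplex that side A ends up using */
	bool b_full;       /* duplex that side B ends up using */
};

/* 10/100 signalling uses 2 pairs; gigE needs all 4. */
static int pairs_needed(enum speed s)
{
	return s == SPEED_1000 ? 4 : 2;
}

static struct link resolve(struct port a, struct port b, int pairs)
{
	struct link l = { SPEED_NONE, false, false };

	if (a.autoneg && b.autoneg) {
		/* The negotiation itself works over 2 pairs and settles on
		 * the best speed both ends announced... */
		enum speed s = a.speed < b.speed ? a.speed : b.speed;
		/* ...but the link only comes up if the wiring can carry it. */
		if (pairs < pairs_needed(s))
			return l;
		l.speed = s;
		l.a_full = l.b_full = a.full_duplex && b.full_duplex;
		return l;
	}
	if (!a.autoneg && !b.autoneg) {
		/* Both ends forced: they must agree on speed by hand. */
		if (a.speed != b.speed || pairs < pairs_needed(a.speed))
			return l;
		l.speed = a.speed;
		l.a_full = a.full_duplex;
		l.b_full = b.full_duplex;
		return l;
	}

	/* One end forced, the other autonegotiating: the autoneg end falls
	 * back to parallel detection. It can sense the 10/100 speed but not
	 * the duplex, so it defaults to half. */
	const struct port *forced = a.autoneg ? &b : &a;
	const struct port *nway = a.autoneg ? &a : &b;

	if (forced->speed == SPEED_1000 || forced->speed > nway->speed || pairs < 2)
		return l;
	l.speed = forced->speed;
	l.a_full = a.autoneg ? false : a.full_duplex;
	l.b_full = b.autoneg ? false : b.full_duplex;
	return l;
}

/* A link that is up but with the two ends disagreeing on duplex. */
static bool duplex_mismatch(struct link l)
{
	return l.speed != SPEED_NONE && l.a_full != l.b_full;
}
```

In this model, two gigE-capable autonegotiating ports over 2 pairs come
back with SPEED_NONE, while forcing one end to 100baseTX gives a
SPEED_100 link. duplex_mismatch() is true only if the forced end also
specified full duplex.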
-Bill

--
=============================================================================
-Bill Paul            (510) 749-2329 | Senior Engineer, Master of Unix-Fu
                 wpaul@windriver.com | Wind River Systems
=============================================================================
              you're just BEGGING to face the moose
=============================================================================