From owner-freebsd-current@FreeBSD.ORG Mon Nov 24 15:26:16 2003
Return-Path:
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 7BCE916A4CE
	for ; Mon, 24 Nov 2003 15:26:16 -0800 (PST)
Received: from kai.xtaz.net (82-32-25-111.cable.ubr04.azte.blueyonder.co.uk [82.32.25.111])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 529DC43FBD
	for ; Mon, 24 Nov 2003 15:26:14 -0800 (PST)
	(envelope-from matt@xtaz.net)
Received: from xtaz.net (xai.xtaz.net [10.0.0.2])
	by kai.xtaz.net (Postfix) with ESMTP id A00188FD54;
	Mon, 24 Nov 2003 23:26:12 +0000 (GMT)
Message-ID: <3FC29394.8010400@xtaz.net>
Date: Mon, 24 Nov 2003 23:26:12 +0000
From: Matt Smith
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.5) Gecko/20031101
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Matt Smith
References: <1069446169.15019.46.camel@janitor>
	<20031121202949.GA5181@xor.obsecurity.org>
	<1069454437.15019.75.camel@janitor>
	<3FBEA355.8080800@xtaz.co.uk>
	<3FBF6DB1.2070209@xtaz.co.uk>
In-Reply-To: <3FBF6DB1.2070209@xtaz.co.uk>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
cc: freebsd-current@freebsd.org
cc: Jimmy Selgen
Subject: Re: xl0: watchdog timeout
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 24 Nov 2003 23:26:16 -0000

Matt Smith wrote:
> Matt Smith wrote:
>
>> Jimmy Selgen wrote:
>>
>>> On Fri, 2003-11-21 at 21:29, Kris Kennaway wrote:
>>>
>>>> On Fri, Nov 21, 2003 at 09:22:49PM +0100, Jimmy Selgen wrote:
>>>> I saw this with some of sam's locking changes that (temporarily) broke
>>>> DUMMYNET. I see you're using ipfilter - it's possible that this
>>>> configuration has not been well-tested. Are you passing much traffic
>>>> through ipfilter on this box?
>>>
>>>
>>>
>>> The box in question is my workstation, so I guess i'm not passing that
>>> much traffic through ipfilter. Also, when I said that the NIC still
>>> worked, I might have mislead you a bit. I had about 5-10 timeouts while
>>> scp'ing the dmesg output to my other workstation.
>>> Data seems to move from userland to the kernel, then get stuck in
>>> buffers there for 10-15 seconds, "generating" timeouts, before they're
>>> shipped off. I assume this is expected behaviour when a NIC isnt
>>> behaving correctly.
>>>
>>>
>>>> It would be helpful if you can do a binary search to narrow down when
>>>> the problem started.
>>>
>>>
>>>
>>> What would you have me search ? I'm a faily seasoned C programmer (12
>>> years experience, some of them doing RTOS kernel work), but dont know
>>> much about FreeBSD kernel development, or the process of checking out
>>> different kernel revisions.
>>>
>>>
>>> I've tried a build without IPFILTER, and the problem still exists.
>>> I've also tried booting with ACPI disabled, and the problem is still
>>> there.
>>>
>>> I have attached a copy of my kernel config file, in case i'm doing
>>> something wrong.
>>>
>>
>>
>>
>> I have just noticed that my xl0 card is misbehaving as well. I have a
>> 3c905c in my desktop and noticed that an ftp of a file from another
>> machine on the lan (100 meg switched) was only going at around
>> 70KB/sec. Normally I get around 9MB/sec.
>>
>> A netstat -bi xl0 shows lots of errors:
>>
>> Name Mtu Network Address Ipkts Ierrs Ibytes
>> Opkts Oerrs Obytes Coll
>> xl0 1500 00:04:76:8d:c5:fd 3081878 217616 3778632119
>> 2451968 6 368229701 0
>>
>> I also have this in my messages file:
>>
>> xl0: transmission error: 90
>> xl0: tx underrun, increasing tx start threshold to 180 bytes
>> xl0: transmission error: 90
>> xl0: tx underrun, increasing tx start threshold to 240 bytes
>> xl0: transmission error: 90
>> xl0: tx underrun, increasing tx start threshold to 300 bytes
>> xl0: transmission error: 90
>> xl0: tx underrun, increasing tx start threshold to 360 bytes
>> xl0: transmission error: 90
>> xl0: tx underrun, increasing tx start threshold to 420 bytes
>>
>> I do not currently have any debugging options compiled into this kernel.
>>
>> FreeBSD fraggle.xtaz.co.uk 5.1-CURRENT FreeBSD 5.1-CURRENT #0: Tue Nov
>> 18 20:05:52 GMT 2003
>> root@fraggle.xtaz.co.uk:/usr/obj/usr/src/sys/FRAGGLE i386
>>
>> I am actually in the process of building a new world/kernel to update
>> it again as I thought it might be something that's fixed. I
>> unfortunatly can not boot the old kernel to see if it works fine in
>> that because of the statfs changes so it *could* possibly be the NIC
>> has gone funny.
>>
>> I also have a 3c905a and a 3c905b in my router machine and this is
>> showing no issues at all with the same dated kernel.
>>http://xtaz.net/
>> Matt.
>>
>
> I am now running a 5.2-BETA kernel from today and still have the problem
> with my xl0 card here. I can only get a max throughput of around
> 110KB/sec through it. And I am getting huge amounts of errors in the
> interface stats (5 minutes after booting):
>
> Name Mtu Network Address Ipkts Ierrs Ibytes
> Opkts Oerrs Obytes Coll
> xl0 1500 00:04:76:8d:c5:fd 217042 1290 57669634
> 309460 0 208178476 0
>
> So the question is, is this my network card has died and I need to throw
> it out or is it related to Jimmy Selgen's email about the watchdog
> timeouts?
>
> It's a shame I can't boot an old kernel to test really.
>
> Matt.
>

I have done some testing on this. I've changed the network cable, switch
port etc. No effect.

I've found though that if I ftp to this box and GET a file it goes at
around 6MB/sec. But if I PUT a file it goes at 100KB/sec. Previously this
has worked at around 9-10MB/sec both ways. I can't place a date on it
though, because I've not tried to do large file transfers for a long time
and only just noticed it this week.

So it looks like it is driver related, I guess. The "buffer" scenario
Jimmy reported looks likely.

Matt.
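
For anyone wanting to compare the two netstat samples quoted above, here is a minimal sketch that turns the Ipkts/Ierrs counters into an input error rate. The function name `input_error_rate` and the sample labels are made up for illustration, and it assumes Ierrs can simply be divided by Ipkts; whether Ipkts already includes the errored frames is not something this thread establishes.

```python
def input_error_rate(ipkts: int, ierrs: int) -> float:
    """Return Ierrs / Ipkts as a fraction (hypothetical helper, not part of netstat)."""
    if ipkts <= 0:
        raise ValueError("ipkts must be positive")
    return ierrs / ipkts


# Counters copied from the two `netstat -bi xl0` samples quoted in this thread.
# Assumption: each tuple is (Ipkts, Ierrs) as printed by netstat.
samples = {
    "5.1-CURRENT (Nov 18 kernel)": (3081878, 217616),
    "5.2-BETA (5 minutes after boot)": (217042, 1290),
}

for label, (ipkts, ierrs) in samples.items():
    rate = input_error_rate(ipkts, ierrs)
    print(f"{label}: {rate:.2%} input errors")
```

On those numbers the first sample works out to roughly 7% input errors and the second to roughly 0.6%. The two samples cover very different uptimes, so the rates are only a rough comparison, not evidence either way about the driver.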