From owner-freebsd-stable@FreeBSD.ORG Sat Dec 30 20:51:57 2006
X-Original-To: freebsd-stable@freebsd.org
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
    by hub.freebsd.org (Postfix) with ESMTP id 32E7D16A403
    for ; Sat, 30 Dec 2006 20:51:57 +0000 (UTC)
    (envelope-from sam@errno.com)
Received: from ebb.errno.com (ebb.errno.com [69.12.149.25])
    by mx1.freebsd.org (Postfix) with ESMTP id DB2AE13C428
    for ; Sat, 30 Dec 2006 20:51:56 +0000 (UTC)
    (envelope-from sam@errno.com)
Received: from [10.0.0.248] (trouble.errno.com [10.0.0.248])
    (authenticated bits=0)
    by ebb.errno.com (8.13.6/8.12.6) with ESMTP id kBUKKhi1020821
    (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
    Sat, 30 Dec 2006 12:20:43 -0800 (PST)
    (envelope-from sam@errno.com)
Message-ID: <4596CA1A.9040906@errno.com>
Date: Sat, 30 Dec 2006 12:20:42 -0800
From: Sam Leffler
User-Agent: Thunderbird 1.5.0.8 (X11/20061115)
MIME-Version: 1.0
To: JoaoBR
References: <200612282002.11562.joao@matik.com.br>
In-Reply-To: <200612282002.11562.joao@matik.com.br>
X-Enigmail-Version: 0.94.0.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-stable@freebsd.org
Subject: Re: ath0 timeout problem - again
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Sat, 30 Dec 2006 20:51:57 -0000

JoaoBR wrote:
> I need some help here, this is not a single case, I get this on several
> machines, this is releng_6 , recent, but old problem getting ugly
>
>
> first I get this kind of events in messages, independent if it is client mode
> or hostap or adhoc
>
> Dec 28 16:50:53 ap1-cds kernel: ath0: discard oversize frame (ether type 5e4
> flags 3 len 1522 > max 1514)
> Dec 28 16:51:01 ap1-cds kernel: ath0: discard oversize frame (ether type 5e4
> flags 3 len 1522 > max 1514)
> Dec 28 16:58:16 ap1-cds kernel: ath0: device timeout
> ... timeout event repeats
>
> I really do not know what this event means (ether type 5e4), for my
> understandings it is vague in the source, so I am lost here

Seems pretty clear: it's the type field extracted from the ethernet header of
the oversized packet.  A quick check of sys/net/ethernet.h shows no such
ETHERTYPE defined.  So something in your network is transmitting packets that
are either being rx'd incorrectly or, more likely, corrupted in transit.

>
> {
> I get continuously:
>
> kernel: ath0: link state changed to DOWN
> kernel: ath0: link state changed to UP
>
> when WL client but it recovers when the AP comes back to normal
> so wl-cli mode is not the issue
> }

Sorry, this is hard to understand.  Are you saying that when you see packets
discarded on the ap the client stations lose their association to the ap?
You've said nothing about your environment but I'd guess you've got some heavy
interference like a microwave oven operating.

>
>
> but when the machine is running hostap the link state up/down events do not
> come up but transmission is interrupted, or better, goes slow and stops
> then - and stops forever until cold reboot, no chance to get this card back,
> not even unload ath and reload the driver (that was a try but I use it
> compiled into the kernel)
> this is not related to any WEP settings or any rate, this problem is coming up
> with either rate-sample or rate_onoe
>
>
> this is not related to the "tx stopped" problem (OACTIVE) and it is not
> related to any [TX|RX]BUF value (whatever it is set to)
>
> this problem is not a single case and not hardware related, here I mean MB,
> CPU, memory but is related in a certain way to the ath drv - same machine,
> but wi0 (prism card) and it does NOT happen this way
>
>
> I am with this problem since 6.0 and would be glad if somebody could convince
> Mr. Sam L. to attend this since it is a serious issue - any FreeBSD releng_6
> has this problem but releng_5 does not

Well "Mr. Sam L" has other things to do that are more important to him.  If you
want help I can try to provide it but this is not exactly a problem one can
diagnose from afar.  I suggest you sniff traffic from a separate station and
try to identify what is going on in the network when you see this event occur.
It would also help to do the obvious things like swap ath cards.  You've also
said nothing about your environment such as the mac+phy revs for the card and
the computer this is operating in.

>
> depending on the amount of traffic I get this any hour ( when 2-3Mbit/s or
> more) or several times a day (when 1-2Mbit/s)
>
> it gets worse when I have more than one ath card installed

Sounds like you've got radio/antenna issues that manifest themselves as noise
that drives the radios into silence.  Diagnosing something like that may
require tools like a spectrum analyzer.

>
>
> ath stats:
>
> 70777 data frames received
> 71551 data frames transmit
> 420 tx frames with an alternate rate
> 10821 long on-chip tx retries
> 260 tx failed 'cuz too many retries
> 11M current transmit rate
> 10489 tx management frames
> 1 tx frames discarded prior to association
> 786 tx frames with no ack marked
> 80516 tx frames with short preamble
> 54395 rx failed 'cuz of bad CRC
> 146438 rx failed 'cuz of PHY err
>     145013 CCK timing
>     1425 CCK restart
> 5295 beacons transmitted
> 19 periodic calibrations
> 42 rssi of last ack
> 31 avg recv rssi
> -98 rx noise floor
> 572 cabq frames transmitted
> 11 cabq xmit overflowed beacon interval

This should not happen.  You have stations in power save mode in your bss and
the transmission of queued multicast frames overflowed the interval following
the beacon frame.  This should be handled (I explicitly tested it) but you
might want to observe if this occurs when you have problems.
> 1525 switched default/rx antenna
> Antenna profile:
> [1] tx 41285 rx 4

This makes no sense; you rx'd 4 frames total?  That's inconsistent with the
"data frames received" counter and makes me question whether these numbers are
meaningful.

>
>
> ifconfig
>
> ath0: flags=8943 mtu 1500
> ether 00:13:46:8b:f1:86
> media: IEEE 802.11 Wireless Ethernet DS/11Mbps mode 11b
> status: associated
> ssid omegasul channel 1 (2412) bssid 00:13:46:8b:f1:86
> authmode OPEN privacy ON deftxkey 1
> wepkey 1:40-bit
> wepkey 2:40-bit
> wepkey 3:40-bit
> wepkey 4:40-bit powersavemode OFF powersavesleep 100 txpowmax 36
> txpower 63 rtsthreshold 2346 mcastrate 1 fragthreshold 2346 bmiss 7
> -pureg protmode CTS -wme burst ssid HIDE -apbridge dtimperiod 1
> bintval 100

Unfortunately you've not provided critical info like the mac+phy of the card
and the platform (e.g. is this a soekris box).  As I said I can try to _HELP_
you but I cannot fix your problem.  You need to diagnose what is happening.

    Sam
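
For anyone wanting to sanity-check the "ether type 5e4" value from the log, here
is a rough sketch in Python.  It assumes the usual IEEE 802.3 convention for the
16-bit type/length field (values up to 1500 are payload lengths, values from
0x0600 up are EtherTypes, anything in between is undefined).  The function
classify_type_field and the constant ETHERTYPE_MIN are made up for illustration;
this is not driver code.

```python
# Illustrative sketch only: classify the 16-bit type/length field of an
# Ethernet header.  Names are hypothetical; thresholds follow IEEE 802.3.

ETHERMTU = 1500          # largest value that can be a payload length
ETHERTYPE_MIN = 0x0600   # smallest value treated as an EtherType (1536)
ETHER_HDR_LEN = 14       # dst MAC (6) + src MAC (6) + type/length (2)


def classify_type_field(value: int) -> str:
    """Return 'length', 'ethertype', or 'undefined' for a type/length value."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("type/length field is 16 bits")
    if value <= ETHERMTU:
        return "length"
    if value >= ETHERTYPE_MIN:
        return "ethertype"
    return "undefined"


reported = 0x5E4                      # value from the kernel log message
print(reported, classify_type_field(reported))

# The "max 1514" in the log matches MTU plus header length (no CRC).
max_frame = ETHERMTU + ETHER_HDR_LEN
print(max_frame, 1522 > max_frame)
```

0x5e4 is 1508 decimal, which lands in the undefined gap between a valid length
and a valid EtherType.  That would be consistent with a corrupted or misreceived
frame rather than some unknown protocol, though it does not prove it.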