From owner-freebsd-net@FreeBSD.ORG Sun Mar 27 02:18:26 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D80B51065670; Sun, 27 Mar 2011 02:18:26 +0000 (UTC) (envelope-from david.somayajulu@qlogic.com) Received: from VA3EHSOBE001.bigfish.com (va3ehsobe001.messaging.microsoft.com [216.32.180.11]) by mx1.freebsd.org (Postfix) with ESMTP id 80AA98FC17; Sun, 27 Mar 2011 02:18:26 +0000 (UTC) Received: from mail84-va3-R.bigfish.com (10.7.14.236) by VA3EHSOBE001.bigfish.com (10.7.40.21) with Microsoft SMTP Server id 14.1.225.22; Sun, 27 Mar 2011 02:03:23 +0000 Received: from mail84-va3 (localhost.localdomain [127.0.0.1]) by mail84-va3-R.bigfish.com (Postfix) with ESMTP id 8F456B5830C; Sun, 27 Mar 2011 02:03:23 +0000 (UTC) X-SpamScore: -7 X-BigFish: VPS-7(zz14ffOzz1202hzz8275bh8275dhz2fh2a8h668h61h) X-Spam-TCS-SCL: 0:0 X-Forefront-Antispam-Report: KIP:(null); UIP:(null); IPVD:NLI; H:avexcashub1.qlogic.com; RD:avexcashub1.qlogic.com; EFVD:NLI Received: from mail84-va3 (localhost.localdomain [127.0.0.1]) by mail84-va3 (MessageSwitch) id 1301191402811608_14295; Sun, 27 Mar 2011 02:03:22 +0000 (UTC) Received: from VA3EHSMHS028.bigfish.com (unknown [10.7.14.250]) by mail84-va3.bigfish.com (Postfix) with ESMTP id A4846186004F; Sun, 27 Mar 2011 02:03:22 +0000 (UTC) Received: from avexcashub1.qlogic.com (198.70.193.61) by VA3EHSMHS028.bigfish.com (10.7.99.38) with Microsoft SMTP Server (TLS) id 14.1.225.22; Sun, 27 Mar 2011 02:03:21 +0000 Received: from avexmb1.qlogic.org ([fe80::9545:3a4f:c131:467d]) by avexcashub1.qlogic.org ([::1]) with mapi; Sat, 26 Mar 2011 19:03:20 -0700 From: David Somayajulu To: "freebsd-net@freebsd.org" , "freebsd-current@freebsd.org" Date: Sat, 26 Mar 2011 19:03:18 -0700 Thread-Topic: Questions on LRO and Delayed ACK Thread-Index: AcvsIyO9944te0QiTC65oaG4IotW/g== Message-ID: 
<75E1A2A7D185F841A975979B0906BBA6774E0CE8CA@AVEXMB1.qlogic.org> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: acceptlanguage: en-US MIME-Version: 1.0 X-OriginatorOrg: qlogic.com Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Subject: Questions on LRO and Delayed ACK X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Mar 2011 02:18:26 -0000

Hi All,

1. If there is hardware support for LRO (where the hardware coalesces a bunch of consecutive TCP segments into one large TCP segment), is it enough for the driver to simply post the segment to the host stack via ifp->if_input()? I mean, is there a need to run through tcp_lro_rx() followed by tcp_lro_flush()?
2. What kind of performance improvement does one get using soft LRO via tcp_lro_init(); tcp_lro_rx(); tcp_lro_flush()?
3. In the absence of LRO, is there any way to increase the number of inbound frames for which an ACK is transmitted to a value greater than 2?

Thanks
david S.

________________________________
This message and any attached documents contain information from QLogic Corporation or its wholly-owned subsidiaries that may be confidential. If you are not the intended recipient, you may not read, copy, distribute, or use this information. If you have received this transmission in error, please notify the sender immediately by reply e-mail and then delete this message.
From owner-freebsd-net@FreeBSD.ORG Sun Mar 27 19:30:12 2011 Return-Path: Delivered-To: freebsd-net@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 66099106566B for ; Sun, 27 Mar 2011 19:30:12 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 3A3308FC14 for ; Sun, 27 Mar 2011 19:30:12 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p2RJUC8m039194 for ; Sun, 27 Mar 2011 19:30:12 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p2RJUCKH039189; Sun, 27 Mar 2011 19:30:12 GMT (envelope-from gnats) Date: Sun, 27 Mar 2011 19:30:12 GMT Message-Id: <201103271930.p2RJUCKH039189@freefall.freebsd.org> To: freebsd-net@FreeBSD.org From: Kevin Cc: Subject: Re: kern/155714: [zyd] [panic] zyd_bulk_write_callback panic in 8.2-RELEASE [regression] X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Kevin List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , 
List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Mar 2011 19:30:12 -0000 The following reply was made to PR kern/155714; it has been noted by GNATS. From: Kevin To: bug-followup@FreeBSD.org Cc: Subject: Re: kern/155714: [zyd] [panic] zyd_bulk_write_callback panic in 8.2-RELEASE [regression] Date: Sun, 27 Mar 2011 14:26:46 -0500 This panic appears to be fixed with the following patch: Author: kevlo Date: Fri Mar 25 05:01:13 2011 New Revision: 219982 URL: http://svn.freebsd.org/changeset/base/219982 Log: Fix panic while associating access point. While here, add the SMC SMCWUSB-G Modified: head/sys/dev/usb/wlan/if_zyd.c From owner-freebsd-net@FreeBSD.ORG Mon Mar 28 01:26:15 2011 Return-Path: Delivered-To: freebsd-net@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B99A41065672; Mon, 28 Mar 2011 01:26:15 +0000 (UTC) (envelope-from kevlo@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 8C3E88FC16; Mon, 28 Mar 2011 01:26:15 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p2S1QF0r064555; Mon, 28 Mar 2011 01:26:15 GMT (envelope-from kevlo@freefall.freebsd.org) Received: (from kevlo@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p2S1QFs8064551; Mon, 28 Mar 2011 01:26:15 GMT (envelope-from kevlo) Date: Mon, 28 Mar 2011 01:26:15 GMT Message-Id: <201103280126.p2S1QFs8064551@freefall.freebsd.org> To: kevin@your.org, kevlo@FreeBSD.org, freebsd-net@FreeBSD.org From: kevlo@FreeBSD.org Cc: Subject: Re: kern/155714: [zyd] [panic] zyd_bulk_write_callback panic in 8.2-RELEASE [regression] X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Mon, 28 Mar 2011 01:26:15 -0000 Synopsis: [zyd] [panic] zyd_bulk_write_callback panic in 8.2-RELEASE [regression] State-Changed-From-To: open->closed State-Changed-By: kevlo State-Changed-When: Mon Mar 28 01:25:26 UTC 2011 State-Changed-Why: Fixed in r219982. Thanks for testing! http://www.freebsd.org/cgi/query-pr.cgi?pr=155714 From owner-freebsd-net@FreeBSD.ORG Mon Mar 28 01:39:28 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 11D781065672 for ; Mon, 28 Mar 2011 01:39:28 +0000 (UTC) (envelope-from jhellenthal@gmail.com) Received: from mail-iy0-f182.google.com (mail-iy0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id BC8AA8FC0C for ; Mon, 28 Mar 2011 01:39:27 +0000 (UTC) Received: by iyj12 with SMTP id 12so3900108iyj.13 for ; Sun, 27 Mar 2011 18:39:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:sender:date:from:to:cc:subject:in-reply-to :message-id:references:user-agent:x-openpgp-key-id :x-openpgp-key-fingerprint:mime-version:content-type; bh=BvKaeZcyaShabrGSSQqKZziW0VGvdUrFRmnN3NDQmBE=; b=DSUnoMuSqt2qe43Jfp1sB3WVIb5o0FGktrlR0T7e6B+jwvKPraa/W2dveQVWwCUPe6 rZvZ8ML1qFCSrLjSqikVdSudo34mFQfsYNV01k/yPKmVohxybkidGJPUqJfGtr3xmNRs X3KB5ul2kCunNOxRuCROVBWOZfImaSY9RY7cA= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=sender:date:from:to:cc:subject:in-reply-to:message-id:references :user-agent:x-openpgp-key-id:x-openpgp-key-fingerprint:mime-version :content-type; b=L9jB+hRwY4hK6duDLdJuXRwxLXldqYCXUzIo2KddNNvuzOlZIg3x34zuY286uW4shS HRQTtutMl+j87C1/yvyhtAWlsDw0rr1dFCXTsZAHj5FmWr5VMGLuWzWLDoLbYWAaF6PL u0EE1h+svV0INPNxnNjTPbW1FSzfGM957nCzg= Received: by 10.231.52.209 with SMTP id j17mr3401887ibg.163.1301276367039; Sun, 27 Mar 2011 18:39:27 -0700 (PDT) Received: from disbatch.dataix.local 
(adsl-99-181-153-110.dsl.klmzmi.sbcglobal.net [99.181.153.110]) by mx.google.com with ESMTPS id u9sm2584749ibe.2.2011.03.27.18.39.22 (version=TLSv1/SSLv3 cipher=OTHER); Sun, 27 Mar 2011 18:39:23 -0700 (PDT) Sender: "J. Hellenthal" Date: Sun, 27 Mar 2011 21:38:57 -0400 From: "J. Hellenthal" To: Stefan `Sec` Zehl In-Reply-To: <20110326224340.GB23803@ice.42.org> Message-ID: References: <4D8B99B4.4070404@FreeBSD.org> <201103250825.10674.jhb@freebsd.org> <20110325194109.GB25392@ice.42.org> <201103251640.16147.jhb@freebsd.org> <20110326140212.GB45402@ice.42.org> <20110326224340.GB23803@ice.42.org> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) X-OpenPGP-Key-Id: 0x89D8547E X-OpenPGP-Key-Fingerprint: 85EF E26B 07BB 3777 76BE B12A 9057 8789 89D8 547E MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="80310268-1666522583-1301276363=:9813" Cc: freebsd-net@freebsd.org Subject: Re: The tale of a TCP bug X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Mar 2011 01:39:28 -0000 This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. --80310268-1666522583-1301276363=:9813 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

On Sat, 26 Mar 2011 18:43, sec@ wrote:

> Hi,
>
>> On Fri, Mar 25, 2011 at 16:40 -0400, John Baldwin wrote:
>>> And the problem is that the code that uses 'adv' to determine if it
>>> should send a window update to the remote end is falsely succeeding due
>>> to the overflow causing tcp_output() to 'goto send' but that it then
>>> fails to send any data because it thinks the remote window is full?
>
> On a whim I wanted to find out how often that overflow is triggered in
> normal operation, and whipped up a quick counter-sysctl.
>
> --- sys/netinet/tcp_output.c.org 2011-01-04 19:27:00.000000000 +0100
> +++ sys/netinet/tcp_output.c 2011-03-26 18:49:30.000000000 +0100
> @@ -87,6 +87,11 @@
>  extern struct mbuf *m_copypack();
>  #endif
>
> +VNET_DEFINE(int, adv_neg) = 0;
> +SYSCTL_VNET_INT(_net_inet_tcp, OID_AUTO, adv_neg, CTLFLAG_RD,
> +    &VNET_NAME(adv_neg), 1,
> +    "How many times adv got negative");
> +
>  VNET_DEFINE(int, path_mtu_discovery) = 1;
>  SYSCTL_VNET_INT(_net_inet_tcp, OID_AUTO, path_mtu_discovery, CTLFLAG_RW,
>      &VNET_NAME(path_mtu_discovery), 1,
> @@ -573,6 +578,10 @@
>          long adv = min(recwin, (long)TCP_MAXWIN << tp->rcv_scale) -
>              (tp->rcv_adv - tp->rcv_nxt);
>
> +        if(min(recwin, (long)TCP_MAXWIN << tp->rcv_scale) <
> +            (tp->rcv_adv - tp->rcv_nxt))
> +            adv_neg++;
> +
>          if (adv >= (long) (2 * tp->t_maxseg))
>              goto send;
>          if (2 * adv >= (long) so->so_rcv.sb_hiwat)
>
> I booted my main (web/shell) box with (only) this patch:
>
> 11:36PM up 3:50, 1 user, load averages: 2.29, 1.51, 0.73
> net.inet.tcp.adv_neg: 2466
>
> That's approximately once every 5 seconds. That's way more often than I
> suspected.
>
> CU,
> Sec
>

With this patch applied together with John's on a 32-bit box, I can repeatedly bump this sysctl with an SSL connection to another destination. It doesn't seem to matter what the destination is.

curl -q https://www.changeip.com/ip.asp

It also bumps on SSL connections over other protocols. This behavior does not seem to happen with non-SSL connections.

Attached is a script that I am using to monitor the sysctl here, just for reference.

L = Last value
C = Current value
D = Difference
I = Log interval
S = Seconds since last change
* = marked changed line

/bin/sh ./adv_neg_mon.sh 7 |tee -a adv_neg.log
[...]
L:41 C:41 D:0 I:7 S:7.000000e+01
L:41 C:41 D:0 I:7 S:7.700000e+01
L:41 C:43 D:2 I:7 S:8.400000e+01 *
L:43 C:88 D:45 I:7 S:7.000000e+00 *

- -- Regards, J.
Hellenthal (0x89D8547E) JJH48-ARIN -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (FreeBSD) Comment: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x89D8547E iQEcBAEBAgAGBQJNj+a7AAoJEJBXh4mJ2FR+VssIAI7QSUUb6jvZdMWxxVGPpr6o vPGDqPfWxNcih4D5SZxJJtsslnunpAcOjSWK8YGvOCINt8XhexVOSklyHuyvjIWd 4ijywngx5H2RT22c6wTdNPOfsZzoBkvLZZ2mj2cUF1ISxrvgy5syMp/TnANE3kul Mqf29HA8t3qYQCfb6zuFoWGdYI5Ahfsks4rljZJy/5bRQfNceJwBjUGnSlL0651m Bl4GpcNWA0fbuJeUgEzIK6mOpNdoI+PrZv6GEG7LErLaVtr+43gET/YITuGv1jY3 dlQ1WkHZSnaG/S7vpWbb2W/cuJ8ak6esbM74x8KakiOnLeJgy0MYK8oqYJyN3aI= =l+iW -----END PGP SIGNATURE----- --80310268-1666522583-1301276363=:9813 Content-Type: TEXT/PLAIN; charset=US-ASCII; name=adv_neg_mon.sh Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: adv_neg monitor Content-Disposition: attachment; filename=adv_neg_mon.sh IyEvYmluL3NoDQoNCnRyYXAgJ2V4aXQgMScgMg0KDQpVUERBVEU9JDEgOzog JHtVUERBVEU6PTV9DQoNCndoaWxlIHRydWU7IGRvDQoJTlZBTD0kKHN5c2N0 bCAtbiBuZXQuaW5ldC50Y3AuYWR2X25lZykNCglpZiBbIC16ICIkTFZBTCIg XTsgdGhlbg0KCQlMVkFMPSR7TlZBTH0NCglmaQ0KCWlmIFsgIiROVkFMIiAt Z3QgIiRMVkFMIiBdOyB0aGVuDQoJCWVjaG8gIkw6JExWQUwgQzokTlZBTCBE OiQoKCR7TlZBTH0tJHtMVkFMfSkpIEk6JHtVUERBVEV9IFM6JChwcmludGYg JWUgJHtVU0VDU30pICoiDQoJCVVTRUNTPSR7VVBEQVRFfQ0KCWVsc2UNCgkJ ZWNobyAiTDokTFZBTCBDOiROVkFMIEQ6JCgoJHtOVkFMfS0ke0xWQUx9KSkg SToke1VQREFURX0gUzokKHByaW50ZiAlZSAke1VTRUNTfSkiDQoJCVVTRUNT PSQoKCR7VVNFQ1N9KyR7VVBEQVRFfSkpDQoJZmkNCglMVkFMPSR7TlZBTH0N CglzbGVlcCAkVVBEQVRFDQpkb25lDQo= --80310268-1666522583-1301276363=:9813-- From owner-freebsd-net@FreeBSD.ORG Mon Mar 28 11:07:01 2011 Return-Path: Delivered-To: freebsd-net@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 314C01065678 for ; Mon, 28 Mar 2011 11:07:01 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 1CF5F8FC0A for ; Mon, 28 
Mar 2011 11:07:01 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p2SB71j0026707 for ; Mon, 28 Mar 2011 11:07:01 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p2SB70Fa026705 for freebsd-net@FreeBSD.org; Mon, 28 Mar 2011 11:07:00 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 28 Mar 2011 11:07:00 GMT Message-Id: <201103281107.p2SB70Fa026705@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-net@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-net@FreeBSD.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Mar 2011 11:07:01 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/155772  net        ifconfig(8): ioctl (SIOCAIFADDR): File exists on direc
o kern/155680  net        [multicast] problems with multicast
s kern/155642  net        [request] Add driver for Realtek RTL8191SE/RTL8192SE W
o kern/155636  net        [msk] msk driver locks marvel yukon 88E8057 NIC
o kern/155604  net        [flowtable] Flowtable excessively caches dest MAC addr
o kern/155597  net        [panic] Kernel panics with "sbdrop" message
o kern/155585  net        [tcp] [panic] tcp_output tcp_mtudisc loop until kernel
o kern/155498  net        [ral] ral(4) needs to be resynced with OpenBSD's to ga
o kern/155420  net        [vlan] adding vlan break existent vlan
o bin/155365   net        [routed] [patch] if.c in routed fails to compile if ti
o kern/155177  net        [route] [panic] Panic when inject routes in kernel
o kern/155030  net        [igb] igb(4) DEVICE_POLLING does not work with carp(4)
o kern/155010  net        [msk] ntfs-3g via iscsi using msk driver cause kernel
o kern/155004  net        [bce] [panic] kernel panic in bce0 driver
o kern/154943  net        [gif] ifconfig gifX create on existing gifX clears IP
s kern/154851  net        [request]: Port brcm80211 driver from Linux to FreeBSD
o kern/154850  net        [netgraph] [patch] ng_ether fails to name nodes when t
o kern/154831  net        [arp] [patch] arp sysctl setting log_arp_permanent_mod
o kern/154679  net        [em] Fatal trap 12: "em1 taskq" only at startup (8.1-R
o kern/154676  net        [netgraph] [panic] HEAD, 8.1-RELEASE panic after some
o kern/154600  net        [tcp] [panic] Random kernel panics on tcp_output
o kern/154567  net        [ath] ath(4) lot of bad series(0)
o kern/154557  net        [tcp] Freeze tcp-session of the clients, if in the gat
o kern/154443  net        [if_bridge] Kernel module bridgestp.ko missing after u
o kern/154286  net        [netgraph] [panic] 8.2-PRERELEASE panic in netgraph
o kern/154284  net        [ath] Modern ath wifi cards (such as AR9285) have miss
o kern/154255  net        [nfs] NFS not responding
o kern/154214  net        [stf] [panic] Panic when creating stf interface
o kern/154185  net        race condition in mb_dupcl
o kern/154169  net        [multicast] [ip6] Node Information Query multicast add
o kern/154134  net        [ip6] stuck kernel state in LISTEN on ipv6 daemon whic
o kern/154091  net        [netgraph] [panic] netgraph, unaligned mbuf?
o conf/154062  net        [vlan] [patch] change to way of auto-generatation of v
o kern/154006  net        [tcp] [patch] tcp "window probe" bug on 64bit
o kern/153937  net        [ral] ralink panics the system (amd64 freeBSDD 8.X) wh
o kern/153936  net        [ixgbe] [patch] MPRC workaround incorrectly applied to
o kern/153816  net        [ixgbe] ixgbe doesn't work properly with the Intel 10g
o kern/153772  net        [ixgbe] [patch] sysctls reference wrong XON/XOFF varia
o kern/153671  net        [em] [panic] 8.2-PRERELEASE repeatable kernel in if_em
o kern/153497  net        [netgraph] netgraph panic due to race conditions
o kern/153454  net        [patch] [wlan] [urtw] Support ad-hoc and hostap modes
o kern/153308  net        [em] em interface use 100% cpu
o kern/153255  net        [panic] 8.2-PRERELEASE repeatable kernel panic under h
o kern/153244  net        [em] em(4) fails to send UDP to port 0xffff
o kern/152893  net        [netgraph] [panic] 8.2-PRERELEASE panic in netgraph
o kern/152853  net        [em] tftpd (and likely other udp traffic) fails over e
o kern/152828  net        [em] poor performance on 8.1, 8.2-PRE
o kern/152569  net        [net]: Multiple ppp connections and routing table prob
o kern/152360  net        [dummynet] [panic] Crash related to dummynet.
o kern/152235  net        [arp] Permanent local ARP entries are not properly upd
o kern/152141  net        [vlan] [patch] encapsulate vlan in ng_ether before out
o kern/151690  net        [ep] network connectivity won't work until dhclient is
o kern/151681  net        [nfs] NFS mount via IPv6 leads to hang on client with
o kern/151593  net        [igb] [panic] Kernel panic when bringing up igb networ
o kern/150920  net        [ixgbe][igb] Panic when packets are dropped with heade
o bin/150642   net        netstat(1) doesn't print anything for SCTP sockets
o kern/150557  net        [igb] igb0: Watchdog timeout -- resetting
o kern/150251  net        [patch] [ixgbe] Late cable insertion broken
o kern/150249  net        [ixgbe] Media type detection broken
o bin/150224   net        ppp(8) does not reassign static IP after kill -KILL co
f kern/149969  net        [wlan] [ral] ralink rt2661 fails to maintain connectio
o kern/149937  net        [ipfilter] [patch] kernel panic in ipfilter IP fragmen
o kern/149786  net        [bwn] bwn on Dell Inspiron 1150: connections stall
o kern/149643  net        [rum] device not sending proper beacon frames in ap mo
o kern/149609  net        [panic] reboot after adding second default route
o kern/149539  net        [ath] atheros ar9287 is not supported by ath_hal
o kern/149516  net        [ath] ath(4) hostap with fake MAC/BSSID results in sta
o kern/149373  net        [realtek/atheros]: None of my network card working
o kern/149307  net        [ath] Doesn't work Atheros 9285
o kern/149306  net        [alc] Doesn't work Atheros AR8131 PCIe Gigabit Etherne
o kern/149117  net        [inet] [patch] in_pcbbind: redundant test
o kern/149086  net        [multicast] Generic multicast join failure in 8.1
o kern/148322  net        [ath] Triggering atheros wifi beacon misses in hostap
o kern/148317  net        [ath] FreeBSD 7.x hostap memory leak in net80211 or At
o kern/148078  net        [ath] wireless networking stops functioning
o kern/148018  net        [flowtable] flowtable crashes on ia64
o kern/147894  net        [ipsec] IPv6-in-IPv4 does not work inside an ESP-only
o kern/147155  net        [ip6] setfb not work with ipv6
o kern/146845  net        [libc] close(2) returns error 54 (connection reset by
f kern/146792  net        [flowtable] flowcleaner 100% cpu's core load
o kern/146719  net        [pf] [panic] PF or dumynet kernel panic
o kern/146534  net        [icmp6] wrong source address in echo reply
o kern/146427  net        [mwl] Additional virtual access points don't work on m
o kern/146426  net        [mwl] 802.11n rates not possible on mwl
o kern/146425  net        [mwl] mwl dropping all packets during and after high u
f kern/146394  net        [vlan] IP source address for outgoing connections
o bin/146377   net        [ppp] [tun] Interface doesn't clear addresses when PPP
o kern/146358  net        [vlan] wrong destination MAC address
o kern/146165  net        [wlan] [panic] Setting bssid in adhoc mode causes pani
o kern/146082  net        [ng_l2tp] a false invaliant check was performed in ng_
o kern/146037  net        [panic] mpd + CoA = kernel panic
o bin/145934   net        [patch] add count option to netstat(1)
o kern/145826  net        [ath] Unable to configure adhoc mode on ath0/wlan0
o kern/145825  net        [panic] panic: soabort: so_count
o kern/145728  net        [lagg] Stops working lagg between two servers.
o kern/144987  net        [wpi] [panic] injecting packets with wlaninject using
f kern/144917  net        [flowtable] [panic] flowtable crashes system [regressi
o kern/144882  net        MacBookPro =>4.1 does not connect to BSD in hostap wit
o kern/144874  net        [if_bridge] [patch] if_bridge frees mbuf after pfil ho
o conf/144700  net        [rc.d] async dhclient breaks stuff for too many people
o kern/144642  net        [rum] [panic] Enabling rum interface causes panic
o kern/144616  net        [nat] [panic] ip_nat panic FreeBSD 7.2
o kern/144572  net        [carp] CARP preemption mode traffic partially goes to
f kern/144315  net        [ipfw] [panic] freebsd 8-stable reboot after add ipfw
o kern/144231  net        bind/connect/sendto too strict about sockaddr length
o kern/143939  net        [ipfw] [em] ipfw nat and em interface rxcsum problem
o kern/143874  net        [wpi] Wireless 3945ABG error. wpi0 could not allocate
o kern/143868  net        [ath] [patch] [request] allow Atheros watchdog timeout
o kern/143846  net        [gif] bringing gif3 tunnel down causes gif0 tunnel to
s kern/143673  net        [stf] [request] there should be a way to support multi
s kern/143666  net        [ip6] [request] PMTU black hole detection not implemen
o kern/143622  net        [pfil] [patch] unlock pfil lock while calling firewall
o kern/143593  net        [ipsec] When using IPSec, tcpdump doesn't show outgoin
o kern/143591  net        [ral] RT2561C-based DLink card (DWL-510) fails to work
o kern/143208  net        [ipsec] [gif] IPSec over gif interface not working
o conf/143079  net        hostapd(8) startup missing multi wlan functionality
o kern/143034  net        [panic] system reboots itself in tcp code [regression]
o kern/142877  net        [hang] network-related repeatable 8.0-STABLE hard hang
o kern/142774  net        Problem with outgoing connections on interface with mu
o kern/142772  net        [libc] lla_lookup: new lle malloc failed
o kern/142018  net        [iwi] [patch] Possibly wrong interpretation of beacon-
o kern/141861  net        [wi] data garbled with WEP and wi(4) with Prism 2.5
f kern/141741  net        Etherlink III NIC won't work after upgrade to FBSD 8,
o kern/141023  net        [carp] CARP arp replays with wrong src mac
o kern/140796  net        [ath] [panic] privileged instruction fault
o kern/140742  net        rum(4) Two asus-WL167G adapters cannot talk to each ot
o kern/140682  net        [netgraph] [panic] random panic in netgraph
o kern/140634  net        [vlan] destroying if_lagg interface with if_vlan membe
o kern/140619  net        [ifnet] [patch] refine obsolete if_var.h comments desc
o kern/140346  net        [wlan] High bandwidth use causes loss of wlan connecti
o kern/140245  net        [ath] [panic] Kernel panic during network activity on
o kern/140142  net        [ip6] [panic] FreeBSD 7.2-amd64 panic w/IPv6
o kern/140066  net        [bwi] install report for 8.0 RC 2 (multiple problems)
o kern/139565  net        [ipfilter] ipfilter ioctl SIOCDELST broken
o kern/139387  net        [ipsec] Wrong lenth of PF_KEY messages in promiscuous
o bin/139346   net        [patch] arp(8) add option to remove static entries lis
o kern/139268  net        [if_bridge] [patch] allow if_bridge to forward just VL
p kern/139204  net        [arp] DHCP server replies rejected, ARP entry lost bef
o kern/139117  net        [lagg] + wlan boot timing (EBUSY)
o kern/139058  net        [ipfilter] mbuf cluster leak on FreeBSD 7.2
o kern/138850  net        [dummynet] dummynet doesn't work correctly on a bridge
o kern/138782  net        [panic] sbflush_internal: cc 0 || mb 0xffffff004127b00
o kern/138688  net        [rum] possibly broken on 8 Beta 4 amd64: able to wpa a
o kern/138678  net        [lo] FreeBSD does not assign linklocal address to loop
o kern/138620  net        [lagg] [patch] lagg port bpf-writes blocked
o kern/138407  net        [gre] gre(4) interface does not come up after reboot
o kern/138332  net        [tun] [lor] ifconfig tun0 destroy causes LOR if_adata/
o kern/138266  net        [panic] kernel panic when udp benchmark test used as r
o kern/138177  net        [ipfilter] FreeBSD crashing repeatedly in ip_nat.c:257
o kern/137881  net        [netgraph] [panic] ng_pppoe fatal trap 12
o bin/137841   net        [patch] wpa_supplicant(8) cannot verify SHA256 signed
p kern/137776  net        [rum] panic in rum(4) driver on 8.0-BETA2
o bin/137641   net        ifconfig(8): various problems with "vlan_device.vlan_i
o kern/137592  net        [ath] panic - 7-STABLE (Aug 7, 2009 UTC) crashes on ne
o bin/137484   net        [patch] Integer overflow in wpa_supplicant(8) base64 e
o kern/137392  net        [ip] [panic] crash in ip_nat.c line 2577
o kern/137372  net        [ral] FreeBSD doesn't support wireless interface from
o kern/137089  net        [lagg] lagg falsely triggers IPv6 duplicate address de
o bin/136994   net        [patch] ifconfig(8) print carp mac address
o kern/136943  net        [wpi] [lor] wpi0_com_lock / wpi0
o kern/136911  net        [netgraph] [panic] system panic on kldload ng_bpf.ko t
o kern/136836  net        [ath] atheros card stops functioning after about 12 ho
o bin/136661   net        [patch] ndp(8) ignores -f option
o kern/136618  net        [pf][stf] panic on cloning interface without unit numb
o kern/136426  net        [panic] spawning several dhclients in parallel panics
o kern/135502  net        [periodic] Warning message raised by rtfree function i
o kern/134931  net        [route] Route messages sent to all socket listeners re
o kern/134583  net        [hang] Machine with jail freezes after random amount o
o kern/134531  net        [route] [panic] kernel crash related to routes/zebra
o kern/134168  net        [ral] ral driver problem on RT2525 2.4GHz transceiver
o kern/134157  net        [dummynet] dummynet loads cpu for 100% and make a syst
o kern/133969  net        [dummynet] [panic] Fatal trap 12: page fault while in
o kern/133968  net        [dummynet] [panic] dummynet kernel panic
o kern/133736  net        [udp] ip_id not protected ...
o kern/133595  net        [panic] Kernel Panic at pcpu.h:195
o kern/133572  net        [ppp] [hang] incoming PPTP connection hangs the system
o kern/133490  net        [bpf] [panic] 'kmem_map too small' panic on Dell r900
o kern/133235  net        [netinet] [patch] Process SIOCDLIFADDR command incorre
o kern/133218  net        [carp] [hang] use of carp(4) causes system to freeze
f kern/133213  net        arp and sshd errors on 7.1-PRERELEASE
o kern/133060  net        [ipsec] [pfsync] [panic] Kernel panic with ipsec + pfs
o kern/132889  net        [ndis] [panic] NDIS kernel crash on load BCM4321 AGN d
o conf/132851  net        [patch] rc.conf(5): allow to setfib(1) for service run
o kern/132734  net        [ifmib] [panic] panic in net/if_mib.c
o kern/132722  net        [ath] Wifi ath0 associates fine with AP, but DHCP or I
o kern/132705  net        [libwrap] [patch] libwrap - infinite loop if hosts.all
o kern/132672  net        [ndis] [panic] ndis with rt2860.sys causes kernel pani
o kern/132554  net        [ipl] There is no ippool start script/ipfilter magic t
o kern/132354  net        [nat] Getting some packages to ipnat(8) causes crash
o kern/132285  net        [carp] alias gives incorrect hash in dmesg
o kern/132277  net        [crypto] [ipsec] poor performance using cryptodevice f
o kern/132107  net        [carp] carp(4) advskew setting ignored when carp IP us
o kern/131781  net        [ndis] ndis keeps dropping the link
o kern/131776  net        [wi] driver fails to init
o kern/131753
net [altq] [panic] kernel panic in hfsc_dequeue o bin/131567 net [socket] [patch] Update for regression/sockets/unix_cm o kern/131549 net ifconfig(8) can't clear 'monitor' mode on the wireless o bin/131365 net route(8): route add changes interpretation of network f kern/130820 net [ndis] wpa_supplicant(8) returns 'no space on device' o kern/130628 net [nfs] NFS / rpc.lockd deadlock on 7.1-R o conf/130555 net [rc.d] [patch] No good way to set ipfilter variables a o kern/130525 net [ndis] [panic] 64 bit ar5008 ndisgen-erated driver cau o kern/130311 net [wlan_xauth] [panic] hostapd restart causing kernel pa o kern/130109 net [ipfw] Can not set fib for packets originated from loc f kern/130059 net [panic] Leaking 50k mbufs/hour f kern/129750 net [ath] Atheros AR5006 exits on "cannot map register spa f kern/129719 net [nfs] [panic] Panic during shutdown, tcp_ctloutput: in o kern/129517 net [ipsec] [panic] double fault / stack overflow o kern/129508 net [carp] [panic] Kernel panic with EtherIP (may be relat o kern/129219 net [ppp] Kernel panic when using kernel mode ppp o kern/129197 net [panic] 7.0 IP stack related panic o bin/128954 net ifconfig(8) deletes valid routes o bin/128602 net [an] wpa_supplicant(8) crashes with an(4) o kern/128448 net [nfs] 6.4-RC1 Boot Fails if NFS Hostname cannot be res o conf/128334 net [request] use wpa_cli in the "WPA DHCP" situation o bin/128295 net [patch] ifconfig(8) does not print TOE4 or TOE6 capabi o bin/128001 net wpa_supplicant(8), wlan(4), and wi(4) issues o kern/127826 net [iwi] iwi0 driver has reduced performance and connecti o kern/127815 net [gif] [patch] if_gif does not set vlan attributes from o kern/127724 net [rtalloc] rtfree: 0xc5a8f870 has 1 refs f bin/127719 net [arp] arp: Segmentation fault (core dumped) f kern/127528 net [icmp]: icmp socket receives icmp replies not owned by o bin/127192 net routed(8) removes the secondary alias IP of interface f kern/127145 net [wi]: prism (wi) driver crash at bigger traffic o 
kern/127057 net [udp] Unable to send UDP packet via IPv6 socket to IPv o kern/127050 net [carp] ipv6 does not work on carp interfaces [regressi o kern/126945 net [carp] CARP interface destruction with ifconfig destro o kern/126895 net [patch] [ral] Add antenna selection (marked as TBD) o kern/126874 net [vlan]: Zebra problem if ifconfig vlanX destroy o kern/126714 net [carp] CARP interface renaming makes system no longer o kern/126695 net rtfree messages and network disruption upon use of if_ o kern/126475 net [ath] [panic] ath pcmcia card inevitably panics under o kern/126339 net [ipw] ipw driver drops the connection o kern/126214 net [ath] txpower problem with Atheros wifi card o kern/126075 net [inet] [patch] internet control accesses beyond end of o bin/125922 net [patch] Deadlock in arp(8) o kern/125920 net [arp] Kernel Routing Table loses Ethernet Link status o kern/125845 net [netinet] [patch] tcp_lro_rx() should make use of hard o kern/125816 net [carp] [if_bridge] carp stuck in init when using bridg o kern/125721 net [ath] Terrible throughput/high ping latency with Ubiqu o kern/125617 net [ath] [panic] ath(4) related panic o kern/125501 net [ath] atheros cardbus driver hangs f kern/125442 net [carp] [lagg] CARP combined with LAGG causes system pa f kern/125332 net [ath] [panic] crash under any non-tiny networking unde o kern/125258 net [socket] socket's SO_REUSEADDR option does not work o kern/125239 net [gre] kernel crash when using gre o kern/124767 net [iwi] Wireless connection using iwi0 driver (Intel 220 o kern/124341 net [ral] promiscuous mode for wireless device ral0 looses o kern/124225 net [ndis] [patch] ndis network driver sometimes loses net o kern/124160 net [libc] connect(2) function loops indefinitely o kern/124021 net [ip6] [panic] page fault in nd6_output() o kern/123968 net [rum] [panic] rum driver causes kernel panic with WPA. 
o kern/123892 net [tap] [patch] No buffer space available o kern/123890 net [ppp] [panic] crash & reboot on work with PPP low-spee o kern/123858 net [stf] [patch] stf not usable behind a NAT o kern/123796 net [ipf] FreeBSD 6.1+VPN+ipnat+ipf: port mapping does not o kern/123758 net [panic] panic while restarting net/freenet6 o bin/123633 net ifconfig(8) doesn't set inet and ether address in one o kern/123559 net [iwi] iwi periodically disassociates/associates [regre o bin/123465 net [ip6] route(8): route add -inet6 -interfac o kern/123463 net [ipsec] [panic] repeatable crash related to ipsec-tool o conf/123330 net [nsswitch.conf] Enabling samba wins in nsswitch.conf c o kern/123160 net [ip] Panic and reboot at sysctl kern.polling.enable=0 f kern/123045 net [ng_mppc] ng_mppc_decompress - disabling node o kern/122989 net [swi] [panic] 6.3 kernel panic in swi1: net o kern/122954 net [lagg] IPv6 EUI64 incorrectly chosen for lagg devices f kern/122780 net [lagg] tcpdump on lagg interface during high pps wedge o kern/122697 net [ath] Atheros card is not well supported o kern/122685 net It is not visible passing packets in tcpdump(1) o kern/122319 net [wi] imposible to enable ad-hoc demo mode with Orinoco o kern/122290 net [netgraph] [panic] Netgraph related "kmem_map too smal o kern/122033 net [ral] [lor] Lock order reversal in ral0 at bootup ieee o bin/121895 net [patch] rtsol(8)/rtsold(8) doesn't handle managed netw s kern/121774 net [swi] [panic] 6.3 kernel panic in swi1: net o kern/121555 net [panic] Fatal trap 12: current process = 12 (swi1: net o kern/121443 net [gif] [lor] icmp6_input/nd6_lookup o kern/121437 net [vlan] Routing to layer-2 address does not work on VLA o bin/121359 net [patch] [security] ppp(8): fix local stack overflow in o kern/121257 net [tcp] TSO + natd -> slow outgoing tcp traffic o kern/121181 net [panic] Fatal trap 3: breakpoint instruction fault whi o kern/120966 net [rum] kernel panic with if_rum and WPA encryption p docs/120945 net [patch] 
ip6(4) man page lacks documentation for TCLASS o kern/120566 net [request]: ifconfig(8) make order of arguments more fr o kern/120304 net [netgraph] [patch] netgraph source assumes 32-bit time o kern/120266 net [udp] [panic] gnugk causes kernel panic when closing U o kern/120130 net [carp] [panic] carp causes kernel panics in any conste o bin/120060 net routed(8) deletes link-level routes in the presence of o kern/119945 net [rum] [panic] rum device in hostap mode, cause kernel o kern/119791 net [nfs] UDP NFS mount of aliased IP addresses from a Sol o kern/119617 net [nfs] nfs error on wpa network when reseting/shutdown f kern/119516 net [ip6] [panic] _mtx_lock_sleep: recursed on non-recursi o kern/119432 net [arp] route add -host -iface causes arp e o kern/119225 net [wi] 7.0-RC1 no carrier with Prism 2.5 wifi card [regr o kern/118727 net [netgraph] [patch] [request] add new ng_pf module s kern/117717 net [panic] Kernel panic with Bittorrent client. o kern/117448 net [carp] 6.2 kernel crash [regression] o kern/117423 net [vlan] Duplicate IP on different interfaces o bin/117339 net [patch] route(8): loading routing management commands o kern/117271 net [tap] OpenVPN TAP uses 99% CPU on releng_6 when if_tap o kern/116747 net [ndis] FreeBSD 7.0-CURRENT crash with Dell TrueMobile o bin/116643 net [patch] [request] fstat(1): add INET/INET6 socket deta o kern/116185 net [iwi] if_iwi driver leads system to reboot o kern/115239 net [ipnat] panic with 'kmem_map too small' using ipnat o kern/115019 net [netgraph] ng_ether upper hook packet flow stops on ad o kern/115002 net [wi] if_wi timeout. failed allocation (busy bit). 
ifco o kern/114915 net [patch] [pcn] pcn (sys/pci/if_pcn.c) ethernet driver f o kern/113432 net [ucom] WARNING: attempt to net_add_domain(netgraph) af o kern/112722 net [ipsec] [udp] IP v4 udp fragmented packet reject o kern/112686 net [patm] patm driver freezes System (FreeBSD 6.2-p4) i38 o bin/112557 net [patch] ppp(8) lock file should not use symlink name o kern/112528 net [nfs] NFS over TCP under load hangs with "impossible p o kern/111457 net [ral] ral(4) freeze o kern/109470 net [wi] Orinoco Classic Gold PC Card Can't Channel Hop o kern/109308 net [pppd] [panic] Multiple panics kernel ppp suspected [r o bin/108895 net pppd(8): PPPoE dead connections on 6.2 [regression] o kern/107944 net [wi] [patch] Forget to unlock mutex-locks f kern/107279 net [ath] [panic] ath_start: attempted use of a free mbuf! o conf/107035 net [patch] bridge(8): bridge interface given in rc.conf n o kern/106444 net [netgraph] [panic] Kernel Panic on Binding to an ip to o kern/106438 net [ipf] ipfilter: keep state does not seem to allow repl o kern/106316 net [dummynet] dummynet with multipass ipfw drops packets o kern/105945 net Address can disappear from network interface s kern/105943 net Network stack may modify read-only mbuf chain copies o bin/105925 net problems with ifconfig(8) and vlan(4) [regression] f kern/105348 net [ath] ath device stopps TX o kern/104851 net [inet6] [patch] On link routes not configured when usi o kern/104751 net [netgraph] kernel panic, when getting info about my tr o kern/103191 net Unpredictable reboot o kern/103135 net [ipsec] ipsec with ipfw divert (not NAT) encodes a pac o kern/102540 net [netgraph] [patch] supporting vlan(4) by ng_fec(4) o conf/102502 net [netgraph] [patch] ifconfig name does't rename netgrap o kern/102035 net [plip] plip networking disables parallel port printing o kern/101948 net [ipf] [panic] Kernel Panic Trap No 12 Page Fault - cau o kern/100709 net [libc] getaddrinfo(3) should return TTL info o kern/100519 net [netisr] 
suggestion to fix suboptimal network polling o kern/98978 net [ipf] [patch] ipfilter drops OOW packets under 6.1-Rel o kern/98597 net [inet6] Bug in FreeBSD 6.1 IPv6 link-local DAD procedu o bin/98218 net wpa_supplicant(8) blacklist not working o kern/97306 net [netgraph] NG_L2TP locks after connection with failed o conf/97014 net [gif] gifconfig_gif? in rc.conf does not recognize IPv f kern/96268 net [socket] TCP socket performance drops by 3000% if pack o kern/95519 net [ral] ral0 could not map mbuf o kern/95288 net [pppd] [tty] [panic] if_ppp panic in sys/kern/tty_subr o kern/95277 net [netinet] [patch] IP Encapsulation mask_match() return o kern/95267 net packet drops periodically appear f kern/93886 net [ath] Atheros/D-Link DWL-G650 long delay to associate f kern/93378 net [tcp] Slow data transfer in Postfix and Cyrus IMAP (wo o kern/93019 net [ppp] ppp and tunX problems: no traffic after restarti o kern/92880 net [libc] [patch] almost rewritten inet_network(3) functi s kern/92279 net [dc] Core faults everytime I reboot, possible NIC issu o kern/91859 net [ndis] if_ndis does not work with Asus WL-138 s kern/91777 net [ipf] [patch] wrong behaviour with skip rule inside an o kern/91364 net [ral] [wep] WF-511 RT2500 Card PCI and WEP o kern/91311 net [aue] aue interface hanging s kern/90086 net [hang] 5.4p8 on supermicro P8SCT hangs during boot if o kern/87521 net [ipf] [panic] using ipfilter "auth" keyword leads to k o kern/87421 net [netgraph] [panic]: ng_ether + ng_eiface + if_bridge s kern/86920 net [ndis] ifconfig: SIOCS80211: Invalid argument [regress o kern/86871 net [tcp] [patch] allocation logic for PCBs in TIME_WAIT s o kern/86427 net [lor] Deadlock with FASTIPSEC and nat o kern/86103 net [ipf] Illegal NAT Traversal in IPFilter o kern/85780 net 'panic: bogus refcnt 0' in routing/ipv6 o bin/85445 net ifconfig(8): deprecated keyword to ifconfig inoperativ p kern/85320 net [gre] [patch] possible depletion of kernel stack in ip o bin/82975 net route change 
does not parse classfull network as given o kern/82881 net [netgraph] [panic] ng_fec(4) causes kernel panic after o bin/82185 net [patch] ndp(8) can delete the incorrect entry o kern/81095 net IPsec connection stops working if associated network i o kern/79895 net [ipf] 5.4-RC2 breaks ipfilter NAT when using netgraph o kern/78968 net FreeBSD freezes on mbufs exhaustion (network interface o kern/78090 net [ipf] ipf filtering on bridged packets doesn't work if o kern/77341 net [ip6] problems with IPV6 implementation o kern/77273 net [ipf] ipfilter breaks ipv6 statefull filtering on 5.3 s kern/77195 net [ipf] [patch] ipfilter ioctl SIOCGNATL does not match o kern/75873 net Usability problem with non-RFC-compliant IP spoof prot s kern/75407 net [an] an(4): no carrier after short time a kern/71474 net [route] route lookup does not skip interfaces marked d o kern/71469 net default route to internet magically disappears with mu o kern/70904 net [ipf] ipfilter ipnat problem with h323 proxy support o kern/66225 net [netgraph] [patch] extend ng_eiface(4) control message o kern/65616 net IPSEC can't detunnel GRE packets after real ESP encryp s kern/60293 net [patch] FreeBSD arp poison patch a kern/56233 net IPsec tunnel (ESP) over IPv6: MTU computation is wrong o kern/54383 net [nfs] [patch] NFS root configurations without dynamic s bin/41647 net ifconfig(8) doesn't accept lladdr along with inet addr s kern/39937 net ipstealth issue a kern/38554 net [patch] changing interface ipaddress doesn't seem to w o kern/34665 net [ipf] [hang] ipfilter rcmd proxy "hangs". o kern/31647 net [libc] socket calls can return undocumented EINVAL o kern/30186 net [libc] getaddrinfo(3) does not handle incorrect servna o kern/27474 net [ipf] [ppp] Interactive use of user PPP and ipfilter c o conf/23063 net [arp] [patch] for static ARP tables in rc.network 392 problems total. 
From owner-freebsd-net@FreeBSD.ORG Mon Mar 28 14:21:04 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5DBC8106564A; Mon, 28 Mar 2011 14:21:04 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 168AF8FC17; Mon, 28 Mar 2011 14:21:04 +0000 (UTC) Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id A5ED346B49; Mon, 28 Mar 2011 10:21:03 -0400 (EDT) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 327AD8A02A; Mon, 28 Mar 2011 10:21:03 -0400 (EDT) From: John Baldwin To: "Stefan `Sec` Zehl" Date: Mon, 28 Mar 2011 10:21:00 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110311; KDE/4.5.5; amd64; ; ) References: <4D8B99B4.4070404@FreeBSD.org> <201103251640.16147.jhb@freebsd.org> <20110326140212.GB45402@ice.42.org> In-Reply-To: <20110326140212.GB45402@ice.42.org> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201103281021.00673.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.6 (bigwig.baldwin.cx); Mon, 28 Mar 2011 10:21:03 -0400 (EDT) Cc: freebsd-net@freebsd.org, Doug Barton Subject: Re: The tale of a TCP bug X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Mar 2011 14:21:04 -0000 On Saturday, March 26, 2011 10:02:12 am Stefan `Sec` Zehl wrote: > Hi again, > > On Fri, Mar 25, 2011 at 16:40 -0400, John Baldwin wrote: > > Reading some more. I'm trying to understand the breakage in your case. 
> >
> > You are saying that FreeBSD is the sender, who has data to send, yet is not
> > sending any window probes because it never starts the persist timer when the
> > initial window is zero? Is that correct?
>
> Yes. The receiver never sends a window update on its own, but when
> probed will "admit" to a bigger window.
>
> > And the problem is that the code that uses 'adv' to determine if it
> > should send a window update to the remote end is falsely succeeding due
> > to the overflow causing tcp_output() to 'goto send' but that it then
> > fails to send any data because it thinks the remote window is full?
>
> Yes, as far as I remember (I did that part of debugging 2 months ago,
> when I submitted the PR %-) that's what happens.
>
> > So one thing I don't quite follow is how you are having rcv_nxt >
> > rcv_adv. I saw this when the other side would send a window probe,
> > and then the receiving side would take the -1 remaining window and
> > explode it into the maximum window size when it ACKd.
>
> No, it's not rcv_nxt > rcv_adv. It's
>
> (rcv_adv - rcv_nxt) > min(recwin, (long)TCP_MAXWIN << tp->rcv_scale)
>
> My sample case has (rcv_adv - rcv_nxt) = 65536, but
> (TCP_MAXWIN << tp->rcv_scale) = 65535 (as there is no window scaling in
> effect)

Ahhhh.

> > Are you seeing the other end of the connection send a window probe, but
> > FreeBSD is not setting the persist timer so that it will send its own window
> > probes?
>
> No, the dump looks like this:
>
> | 10.42.0.25.44852 > 10.42.0.2.1516: Flags [S],
> |     seq 3339144437, win 65535, options [...], length 0
>
> FreeBSD sending the first SYN.
> [rcv_adv=0, rcv_nxt=0]
>
> | 10.42.0.2.1516 > 10.42.0.25.44852: Flags [S.],
> |     seq 42, ack 3339144438, win 0, length 0
>
> The other end SYN|ACKing with a window size of 0.
>
> | 10.42.0.25.44852 > 10.42.0.2.1516: Flags [.],
> |     seq 1, ack 1, win 65535, length 0
>
> FreeBSD ACKing, and (correctly) sending no data.
> [rcv_adv=65579, rcv_nxt=43], thus resulting in adv=-1/0xffffffff

Ahh, and this is the real bug. And this goes back to the calculation of
'rcv_wnd' in tcp_input().

How about this:

Index: tcp_input.c
===================================================================
--- tcp_input.c (revision 220098)
+++ tcp_input.c (working copy)
@@ -1694,6 +1694,8 @@ tcp_do_segment(struct mbuf *m, struct tcphdr *th,
 	win = sbspace(&so->so_rcv);
 	if (win < 0)
 		win = 0;
+	if (win > TCP_MAXWIN << tp->rcv_scale)
+		win = TCP_MAXWIN << tp->rcv_scale;
 	tp->rcv_wnd = imax(win, (int)(tp->rcv_adv - tp->rcv_nxt));
 
 	/* Reset receive buffer auto scaling when not in bulk receive mode. */

This is basically the same as your patch except that it ensures that
'rcv_wnd' is accurate for any other uses.

It looks like the syncache code is already correct as it uses a similar test
to initialize 'sc_wnd':

	/*
	 * Initial receive window: clip sbspace to [0 .. TCP_MAXWIN].
	 * win was derived from socket earlier in the function.
	 */
	win = imax(win, 0);
	win = imin(win, TCP_MAXWIN);
	sc->sc_wnd = win;

--
John Baldwin
From owner-freebsd-net@FreeBSD.ORG Mon Mar 28 18:23:56 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E5BA21065674; Mon, 28 Mar 2011 18:23:56 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id A33598FC1E; Mon, 28 Mar 2011 18:23:56 +0000 (UTC) Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id 2615346B2C; Mon, 28 Mar 2011 14:23:53 -0400 (EDT) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id A47D08A01B; Mon, 28 Mar 2011 14:23:52 -0400 (EDT) From: John Baldwin To: freebsd-net@freebsd.org Date: Mon, 28 Mar 2011 14:23:51 -0400 User-Agent: KMail/1.13.5
(FreeBSD/8.2-CBSD-20110311; KDE/4.5.5; amd64; ; ) References: <4D8B99B4.4070404@FreeBSD.org> <20110326140212.GB45402@ice.42.org> <201103281021.00673.jhb@freebsd.org> In-Reply-To: <201103281021.00673.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201103281423.52202.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.6 (bigwig.baldwin.cx); Mon, 28 Mar 2011 14:23:52 -0400 (EDT) Cc: Stefan `Sec` Zehl , Doug Barton Subject: Re: The tale of a TCP bug X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Mar 2011 18:23:57 -0000

On Monday, March 28, 2011 10:21:00 am John Baldwin wrote:
> On Saturday, March 26, 2011 10:02:12 am Stefan `Sec` Zehl wrote:
> > Hi again,
> >
> > On Fri, Mar 25, 2011 at 16:40 -0400, John Baldwin wrote:
> > > Reading some more. I'm trying to understand the breakage in your case.
> > >
> > > You are saying that FreeBSD is the sender, who has data to send, yet is not
> > > sending any window probes because it never starts the persist timer when the
> > > initial window is zero? Is that correct?
> >
> > Yes. The receiver never sends a window update on its own, but when
> > probed will "admit" to a bigger window.
> >
> > > And the problem is that the code that uses 'adv' to determine if it
> > > should send a window update to the remote end is falsely succeeding due
> > > to the overflow causing tcp_output() to 'goto send' but that it then
> > > fails to send any data because it thinks the remote window is full?
> >
> > Yes, as far as I remember (I did that part of debugging 2 months ago,
> > when I submitted the PR %-) that's what happens.
> >
> > > So one thing I don't quite follow is how you are having rcv_nxt >
> > > rcv_adv. I saw this when the other side would send a window probe,
> > > and then the receiving side would take the -1 remaining window and
> > > explode it into the maximum window size when it ACKd.
> >
> > No, it's not rcv_nxt > rcv_adv. It's
> >
> > (rcv_adv - rcv_nxt) > min(recwin, (long)TCP_MAXWIN << tp->rcv_scale)
> >
> > My sample case has (rcv_adv - rcv_nxt) = 65536, but
> > (TCP_MAXWIN << tp->rcv_scale) = 65535 (as there is no window scaling in
> > effect)
>
> Ahhhh.
>
> > > Are you seeing the other end of the connection send a window probe, but
> > > FreeBSD is not setting the persist timer so that it will send its own window
> > > probes?
> >
> > No, the dump looks like this:
> >
> > | 10.42.0.25.44852 > 10.42.0.2.1516: Flags [S],
> > |     seq 3339144437, win 65535, options [...], length 0
> >
> > FreeBSD sending the first SYN.
> > [rcv_adv=0, rcv_nxt=0]
> >
> > | 10.42.0.2.1516 > 10.42.0.25.44852: Flags [S.],
> > |     seq 42, ack 3339144438, win 0, length 0
> >
> > The other end SYN|ACKing with a window size of 0.
> >
> > | 10.42.0.25.44852 > 10.42.0.2.1516: Flags [.],
> > |     seq 1, ack 1, win 65535, length 0
> >
> > FreeBSD ACKing, and (correctly) sending no data.
> > [rcv_adv=65579, rcv_nxt=43], thus resulting in adv=-1/0xffffffff
>
> Ahh, and this is the real bug. And this goes back to the calculation of
> 'rcv_wnd' in tcp_input().
>
> How about this:
>
> Index: tcp_input.c
> ===================================================================
> --- tcp_input.c (revision 220098)
> +++ tcp_input.c (working copy)
> @@ -1694,6 +1694,8 @@ tcp_do_segment(struct mbuf *m, struct tcphdr *th,
>  	win = sbspace(&so->so_rcv);
>  	if (win < 0)
>  		win = 0;
> +	if (win > TCP_MAXWIN << tp->rcv_scale)
> +		win = TCP_MAXWIN << tp->rcv_scale;
>  	tp->rcv_wnd = imax(win, (int)(tp->rcv_adv - tp->rcv_nxt));
>  
>  	/* Reset receive buffer auto scaling when not in bulk receive mode. */
>
> This is basically the same as your patch except that it ensures that
> 'rcv_wnd' is accurate for any other uses.
No, this is not really right. Your patch from your blog is the best fix
actually. The reason we want to let 'win' be larger than TCP_MAXWIN is that if
the remote end sends more data than we've advertised but we have room in the
socket buffer, we want to go ahead and accept the data as valid and ACK it
rather than dropping the data that is beyond rcv_adv. My change above to
rcv_wnd would break this. Also, for the TCPS_SYN_SENT case we don't know what
'rcv_scale' is until just before we update 'rcv_adv'. This should be the same
as your patch:

Index: tcp_input.c
===================================================================
--- tcp_input.c (revision 220098)
+++ tcp_input.c (working copy)
@@ -1756,7 +1756,8 @@ tcp_do_segment(struct mbuf *m, struct tcphdr *th,
 		    (TF_RCVD_SCALE|TF_REQ_SCALE)) {
 			tp->rcv_scale = tp->request_r_scale;
 		}
-		tp->rcv_adv += tp->rcv_wnd;
+		tp->rcv_adv += imin(tp->rcv_wnd,
+		    TCP_MAXWIN << tp->rcv_scale);
 		tp->snd_una++;		/* SYN is acked */
 		/*
 		 * If there's data, delay ACK; if there's also a FIN

--
John Baldwin
From owner-freebsd-net@FreeBSD.ORG Mon Mar 28 18:38:12 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 91FF71065670; Mon, 28 Mar 2011 18:38:12 +0000 (UTC) (envelope-from sec@42.org) Received: from ice.42.org (v6.42.org [IPv6:2001:608:9::1]) by mx1.freebsd.org (Postfix) with ESMTP id 4500F8FC08; Mon, 28 Mar 2011 18:38:12 +0000 (UTC) Received: by ice.42.org (Postfix, from userid 1000) id 6FE1928419; Mon, 28 Mar 2011 20:38:10 +0200 (CEST) Date: Mon, 28 Mar 2011 20:38:10 +0200 From: Stefan `Sec` Zehl To: John Baldwin Message-ID: <20110328183810.GF23803@ice.42.org> Mail-Followup-To: John Baldwin , freebsd-net@freebsd.org, Doug Barton X-Current-Backlog: 3790 messages References: <4D8B99B4.4070404@FreeBSD.org> <20110326140212.GB45402@ice.42.org> <201103281021.00673.jhb@freebsd.org> <201103281423.52202.jhb@freebsd.org>
Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201103281423.52202.jhb@freebsd.org> User-Agent: Mutt/1.4.2.3i I-love-doing-this: really X-Modeline: vim:set ts=8 sw=4 smarttab tw=72 si noic notitle: Accept-Languages: de, en X-URL: http://sec.42.org/ Cc: freebsd-net@freebsd.org, Doug Barton Subject: Re: The tale of a TCP bug X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Mar 2011 18:38:12 -0000

Hi,

On Mon, Mar 28, 2011 at 14:23 -0400, John Baldwin wrote:
>
> No, this is not really right. Your patch from your blog is the best
> fix actually. The reason we want to let 'win' be larger than
> TCP_MAXWIN is that if the remote end sends more data than we've
> advertised but we have room in the socket buffer, we want to go ahead
> and accept the data as valid and ACK it rather than dropping the data
> that is beyond rcv_adv. My change above to rcv_wnd would break this.
> Also, for the TCPS_SYN_SENT case we don't know what 'rcv_scale' is
> until just before we update 'rcv_adv'. This should be the same
> as your patch:
>
> Index: tcp_input.c
> ===================================================================
> --- tcp_input.c (revision 220098)
> +++ tcp_input.c (working copy)
> @@ -1756,7 +1756,8 @@ tcp_do_segment(struct mbuf *m, struct tcphdr *th,
>  		    (TF_RCVD_SCALE|TF_REQ_SCALE)) {
>  			tp->rcv_scale = tp->request_r_scale;
>  		}
> -		tp->rcv_adv += tp->rcv_wnd;
> +		tp->rcv_adv += imin(tp->rcv_wnd,
> +		    TCP_MAXWIN << tp->rcv_scale);
>  		tp->snd_una++;		/* SYN is acked */
>  		/*
>  		 * If there's data, delay ACK; if there's also a FIN
>

I've applied this to my test-VM, and as expected it now passes my two
testcases.

As far as I'm concerned this fixes it for me.

I'm interested to see if my adv_neg counting hack together with this
patch still registers any hits. -- If nobody beats me to it, I'll try it
out on my webserver tomorrow.

CU,
    Sec
--
We may very soon have computers weighing no more than 1.5 tons.
From owner-freebsd-net@FreeBSD.ORG Mon Mar 28 20:08:42 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D885F106567C; Mon, 28 Mar 2011 20:08:42 +0000 (UTC) (envelope-from Albert.Shih@obspm.fr) Received: from spock-ext.obspm.fr (spock-ext.obspm.fr [145.238.186.3]) by mx1.freebsd.org (Postfix) with ESMTP id 6DF4A8FC26; Mon, 28 Mar 2011 20:08:42 +0000 (UTC) Received: from obspm.fr (pcjas.obspm.fr [145.238.184.233]) by spock-ext.obspm.fr (8.14.3/8.14.3/DIO Observatoire de Paris - 15/04/10) with ESMTP id p2SK8d1D005688 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Mon, 28 Mar 2011 22:08:40 +0200 Date: Mon, 28 Mar 2011 22:08:39 +0200 From: Albert Shih To: Julian Elischer Message-ID: <20110328200839.GA16611@obspm.fr> References: <20110322131435.GA5792@obspm.fr> <4D890905.9010000@freebsd.org> <20110323100504.GA8779@obspm.fr> <4D8AE4BC.4080900@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <4D8AE4BC.4080900@freebsd.org> User-Agent: Mutt/1.5.21 (2010-09-15) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.3.4 (spock-ext.obspm.fr [145.238.186.20]); Mon, 28 Mar 2011 22:08:40 +0200 (CEST) X-Virus-Scanned: clamav-milter 0.97 at spock-ext.obspm.fr X-Virus-Status: Clean Cc: freebsd-net@freebsd.org, freebsd-jail@freebsd.org Subject: Re: setfib mount X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Mar 2011 20:08:43 -0000

On 23/03/2011 at 23:29:16-0700, Julian Elischer wrote:
> >>
> > No.
> > > > The mount is on the host. > > so then I too am not sure why the mount itself would use the second FIB. > is it possible that some of the mounting is being done automatically > by rc scripts > using /etc/fstab in the jail? No, totally «impossible»... ;-) But maybe I have found something. I need to do some tests (the server is in production), but it seems to me it's more complicated than I thought. Configuration: two physical interfaces (bce0 and bce1). The «host» is on bce0 --> setfib 0. The jail is on bce1 --> setfib 1. If I put the mount in /etc/fstab, it seems to me that the connection starts from bce1. If I don't put the mount in /etc/fstab but run it manually, or from something like /etc/rc.local, with #/usr/sbin/setfib 0 mount -t nfs -o rw,tcp etc... it works... until the jail tries to access this partition; at that moment the connection starts from bce1. So to solve my problem I put the mount in /etc/fstab and made the NFS server accept connections from bce1. But... I think it's a bug... If you want me to do some other tests, tell me (and give me some time). Regards.
JAS -- Albert SHIH DIO batiment 15 Observatoire de Paris Meudon 5 Place Jules Janssen 92195 Meudon Cedex Téléphone : 01 45 07 76 26/06 86 69 95 71 Heure local/Local time: lun 28 mar 2011 22:02:00 CEST From owner-freebsd-net@FreeBSD.ORG Tue Mar 29 15:41:27 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F375B1065674 for ; Tue, 29 Mar 2011 15:41:27 +0000 (UTC) (envelope-from fbsd@opal.com) Received: from mho-01-ewr.mailhop.org (mho-01-ewr.mailhop.org [204.13.248.71]) by mx1.freebsd.org (Postfix) with ESMTP id BEA818FC19 for ; Tue, 29 Mar 2011 15:41:27 +0000 (UTC) Received: from pool-141-154-217-103.bos.east.verizon.net ([141.154.217.103] helo=homobox.opal.com) by mho-01-ewr.mailhop.org with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.72) (envelope-from ) id 1Q4ajy-000F5g-B5 for freebsd-net@freebsd.org; Tue, 29 Mar 2011 15:21:54 +0000 Received: from opal.com (localhost [IPv6:::1]) (authenticated bits=0) by homobox.opal.com (8.14.4/8.14.4) with ESMTP id p2TFfOiN059029 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Tue, 29 Mar 2011 11:41:25 -0400 (EDT) (envelope-from fbsd@opal.com) Received: from shibato.opal.com ([173.52.157.153] helo=shibato.opal.com) with IPv4:587 by opal.com; 29 Mar 2011 11:41:24 -0400 X-Mail-Handler: MailHop Outbound by DynDNS X-Originating-IP: 141.154.217.103 X-Report-Abuse-To: abuse@dyndns.com (see http://www.dyndns.com/services/mailhop/outbound_abuse.html for abuse reporting information) X-MHO-User: U2FsdGVkX19d0h+noqCQxxGMp99yjmuj Date: Tue, 29 Mar 2011 12:01:22 -0400 From: "J.R. 
Oldroyd" To: freebsd-net@freebsd.org Message-ID: <20110329120122.4f7bd980@shibato.opal.com> In-Reply-To: <20110324140752.071ed024@shibato.opal.com> References: <20110317134514.5f9d52de@shibato.opal.com> <20110324140752.071ed024@shibato.opal.com> X-Mailer: Claws Mail 3.7.6 (GTK+ 2.20.1; amd64-portbld-freebsd8.1) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Subject: Re: CFT: IPv6 DNS autoconfiguration (RFC6106 RDNSS and DNSSL) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 29 Mar 2011 15:41:28 -0000 I have updated the rtadvd patch to allow greater flexibility in configuring DNS servers and search domains in rtadvd.conf. The new patch allows comma-separated values in the "rdnss=" and "dnssl=" entries and now sends separate RA RDNSS and DNSSL options for each such entry. This allows separate lifetimes to be given for each RA option. Previously, :rdnss="2001:db8:ffff::1":rdnss0="2001:db8:ffff::2":\ :rdnssltime#1200: sent one RA RDNSS option containing two server IPs with the one lifetime. Instead, you can now say: :rdnss="2001:db8:ffff::1,2001:db8:ffff::2":rdnssltime#1200:\ :rdnss0="2001:db8:ffff::3,2001:db8:ffff::4":rdnssltime0#900: which will send two RA RDNSS options, each with two server IPs and each with the corresponding lifetime. Same goes for "dnssl=". I now also send RA RDNSS and DNSSL options with zero lifetimes when the server is shut down using a TERM signal. If you've tried this out and wish to grab this latest version, all you need is the rtadvd-rdnss.diff update from the web site. The other three diffs do not change. 
http://opal.com/jr/freebsd/rdnss/ -jr From owner-freebsd-net@FreeBSD.ORG Tue Mar 29 20:21:00 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A15221065704 for ; Tue, 29 Mar 2011 20:21:00 +0000 (UTC) (envelope-from andrey@zonov.org) Received: from mail-fx0-f54.google.com (mail-fx0-f54.google.com [209.85.161.54]) by mx1.freebsd.org (Postfix) with ESMTP id 2557C8FC18 for ; Tue, 29 Mar 2011 20:20:59 +0000 (UTC) Received: by fxm11 with SMTP id 11so681272fxm.13 for ; Tue, 29 Mar 2011 13:20:59 -0700 (PDT) Received: by 10.223.14.137 with SMTP id g9mr290647faa.1.1301428540603; Tue, 29 Mar 2011 12:55:40 -0700 (PDT) Received: from [10.254.254.77] (ppp95-165-132-192.pppoe.spdop.ru [95.165.132.192]) by mx.google.com with ESMTPS id j12sm2123492fax.9.2011.03.29.12.55.38 (version=SSLv3 cipher=OTHER); Tue, 29 Mar 2011 12:55:39 -0700 (PDT) Message-ID: <4D923931.2070606@zonov.org> Date: Tue, 29 Mar 2011 23:55:29 +0400 From: Andrey Zonov User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; ru; rv:1.8.1.24) Gecko/20100228 Thunderbird/2.0.0.24 Mnenhy/0.7.6.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Subject: igb(4) won't start with "igb0: Could not setup receive structures" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 29 Mar 2011 20:21:00 -0000 Hi, The new igb driver (and I think em too) requires too many 9k mbufs when it is configured with mtu = 9000. On a machine with 8 CPUs, the driver requires 8192 9k mbufs, but by default there are only 6400, so the network won't start. Previous versions used 4k mbufs for a large MTU; by default there are 12800 of those, and all worked fine.
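The arithmetic behind those figures can be sketched in a few lines of C. This is only an illustration, not driver code: it assumes one receive ring per CPU and 1024 receive descriptors per ring (an assumption chosen because it reproduces the 8192 figure; the message does not state the ring size), and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: clusters needed to fill every receive ring once.
 * Assumes each descriptor is backed by exactly one cluster.
 */
unsigned int
rx_clusters_needed(unsigned int rx_queues, unsigned int rx_desc_per_queue)
{
	return rx_queues * rx_desc_per_queue;
}

/* True when the pool limit covers the driver's initial demand. */
bool
pool_is_sufficient(unsigned int needed, unsigned int pool_limit)
{
	return needed <= pool_limit;
}

/* Print a one-line comparison of demand against a named pool limit. */
void
report(const char *pool, unsigned int needed, unsigned int pool_limit)
{
	printf("%s: need %u, limit %u -> %s\n", pool, needed, pool_limit,
	    pool_is_sufficient(needed, pool_limit) ? "ok" : "short");
}
```

Under those assumptions, rx_clusters_needed(8, 1024) gives 8192, which exceeds the default of 6400 9k clusters but fits within the 12800 4k clusters that earlier driver versions drew from. On FreeBSD the 9k pool limit is the kern.ipc.nmbjumbo9 tunable, so raising it above the computed demand is the local workaround.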
Maybe it's time to think about increasing default kern.maxusers/kern.ipc.nmbclusters? or use mp_ncpus for calculation these values? or just increase amount of mbuf_cluster/mbuf_jumbo_page/mbuf_jumbo_9k from that driver... I just want igb to work out-of-the-box. -- Andrey Zonov From owner-freebsd-net@FreeBSD.ORG Tue Mar 29 21:55:18 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 32F89106564A for ; Tue, 29 Mar 2011 21:55:18 +0000 (UTC) (envelope-from jfvogel@gmail.com) Received: from mail-vx0-f182.google.com (mail-vx0-f182.google.com [209.85.220.182]) by mx1.freebsd.org (Postfix) with ESMTP id D19928FC08 for ; Tue, 29 Mar 2011 21:55:17 +0000 (UTC) Received: by vxc34 with SMTP id 34so691178vxc.13 for ; Tue, 29 Mar 2011 14:55:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=Y5YcENVCW8gAUdhOlaVGeyOot2ZAOrpO/m+H9w0AQFE=; b=ISViasphV0gZax7IklT0dcn4Vqp+HrZzkqc5pn5JzAYhSJfplHMtaGF5OSZnbDNkqg UG2dbDVX8YyJHJO1O8yPHFAqYx6+FvDFpRHwoRTNuXQhTY5Gt48mCOSGLuHchx8XGtOe hHEqtiORPyoSs9F4ZTGY+OUgIgQOACVgxB7CI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=d2J4AkeUfQDUyv3VxhQUH+sQ51TfIlPi6nTv1rJDhCNYIncFi8iE1LICOtPYjw3xfj cU6d04+7ZLJcMVbJQwDPbjLrr+cajtN8j2fq3heap5tD/Y2gQsMLqmw/V2du0257PbM0 VrV/im4FuxYKO7EiO0NdboYm+J74pN+wBPtME= MIME-Version: 1.0 Received: by 10.52.95.135 with SMTP id dk7mr533684vdb.93.1301435717065; Tue, 29 Mar 2011 14:55:17 -0700 (PDT) Received: by 10.52.167.6 with HTTP; Tue, 29 Mar 2011 14:55:17 -0700 (PDT) In-Reply-To: <4D923931.2070606@zonov.org> References: <4D923931.2070606@zonov.org> Date: Tue, 29 Mar 2011 14:55:17 -0700 Message-ID: From: Jack Vogel To: Andrey Zonov Content-Type: 
text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-net@freebsd.org Subject: Re: igb(4) won't start with "igb0: Could not setup receive structures" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 29 Mar 2011 21:55:18 -0000 Our validation group has a default postinstall process, every installed system gets those changes, and these mbuf pool sizes are in that set of changes. While I'm not opposed to system default settings changing its usually necessary to have local sys changes anyway, after all you don't get 9K jumbos without manually specifying them as well :) Regards, Jack On Tue, Mar 29, 2011 at 12:55 PM, Andrey Zonov wrote: > Hi, > > New igb driver (and I think em too) is required too much 9k mbufs when it's > been configured with mtu = 9000. On machine with 8 CPUs, driver is required > 8192 9k mbufs, but by default there is only 6400 and network won't start. In > previous versions for big mtu it was used 4k mbufs, by default there is > 12800 and all worked fine. > > Maybe it's time to think about increasing default > kern.maxusers/kern.ipc.nmbclusters? or use mp_ncpus for calculation these > values? or just increase amount of > mbuf_cluster/mbuf_jumbo_page/mbuf_jumbo_9k from that driver... > > I just want igb to work out-of-the-box. 
> > -- > Andrey Zonov > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > From owner-freebsd-net@FreeBSD.ORG Tue Mar 29 23:34:06 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx2.freebsd.org (mx2.freebsd.org [IPv6:2001:4f8:fff6::35]) by hub.freebsd.org (Postfix) with ESMTP id 4D8D1106564A for ; Tue, 29 Mar 2011 23:34:06 +0000 (UTC) (envelope-from dougb@FreeBSD.org) Received: from 65-241-43-5.globalsuite.net (hub.freebsd.org [IPv6:2001:4f8:fff6::36]) by mx2.freebsd.org (Postfix) with ESMTP id D83E4150259; Tue, 29 Mar 2011 23:34:05 +0000 (UTC) Message-ID: <4D926C6D.7040308@FreeBSD.org> Date: Tue, 29 Mar 2011 16:34:05 -0700 From: Doug Barton Organization: http://SupersetSolutions.com/ User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.15) Gecko/20110319 Thunderbird/3.1.9 MIME-Version: 1.0 To: Jack Vogel References: <4D923931.2070606@zonov.org> In-Reply-To: X-Enigmail-Version: 1.1.2 OpenPGP: id=1A1ABC84 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org, Andrey Zonov Subject: Re: igb(4) won't start with "igb0: Could not setup receive structures" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 29 Mar 2011 23:34:06 -0000 It would probably be useful to document those tunables in the man page. It already has good sections for other tunables, so adding them should be easy. Doug On 03/29/2011 14:55, Jack Vogel wrote: > Our validation group has a default postinstall process, every installed > system gets those changes, > and these mbuf pool sizes are in that set of changes. 
While I'm not opposed > to system default settings > changing its usually necessary to have local sys changes anyway, after all > you don't get 9K jumbos > without manually specifying them as well :) > > Regards, > > Jack > > > On Tue, Mar 29, 2011 at 12:55 PM, Andrey Zonov wrote: > >> Hi, >> >> New igb driver (and I think em too) is required too much 9k mbufs when it's >> been configured with mtu = 9000. On machine with 8 CPUs, driver is required >> 8192 9k mbufs, but by default there is only 6400 and network won't start. In >> previous versions for big mtu it was used 4k mbufs, by default there is >> 12800 and all worked fine. >> >> Maybe it's time to think about increasing default >> kern.maxusers/kern.ipc.nmbclusters? or use mp_ncpus for calculation these >> values? or just increase amount of >> mbuf_cluster/mbuf_jumbo_page/mbuf_jumbo_9k from that driver... >> >> I just want igb to work out-of-the-box. >> >> -- >> Andrey Zonov -- Nothin' ever doesn't change, but nothin' changes much. -- OK Go Breadth of IT experience, and depth of knowledge in the DNS. Yours for the right price. 
:) http://SupersetSolutions.com/ From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 05:07:27 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0E1B3106566B for ; Wed, 30 Mar 2011 05:07:27 +0000 (UTC) (envelope-from lacombar@gmail.com) Received: from mail-iy0-f182.google.com (mail-iy0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id C963D8FC1B for ; Wed, 30 Mar 2011 05:07:26 +0000 (UTC) Received: by iyj12 with SMTP id 12so1125459iyj.13 for ; Tue, 29 Mar 2011 22:07:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=ID60uSpyXtlDa9DzEMjI8t1DfunYNvqKwxpcJWRh4Vo=; b=WKaT36M66abOF1gbc478i3DiqQg+m95dE3fscrRp2DTGsIDP8G6rqhIUYv1ntAuDkv rr872LE+C0vf44ptPD+z3UsO6zVtJ7/UkfiytvN0N2Cq46anTY/NGFqyaoJpOeZqN9o3 GoulfOShmnzetTHmn4aFz8rFeSad8yL2I//4A= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=tP7CbfRECi8l7WBXxQXV4GKzBiKg4hUqoV4RUHbBRqleTw9GsFvkCfWYdbiP8wvco5 4JYdDEvN0LNCr9avslBhF3xVCquhy9x6REKyASTKx/BcYJq0GbjkzV6VPVfjcrprJpjE T0P4i0L6KQpJpheYIE5TcV9ekoSnCO2A4xJiI= MIME-Version: 1.0 Received: by 10.42.1.70 with SMTP id 6mr623753icf.483.1301461646078; Tue, 29 Mar 2011 22:07:26 -0700 (PDT) Received: by 10.42.146.72 with HTTP; Tue, 29 Mar 2011 22:07:26 -0700 (PDT) In-Reply-To: <4D923931.2070606@zonov.org> References: <4D923931.2070606@zonov.org> Date: Wed, 30 Mar 2011 01:07:26 -0400 Message-ID: From: Arnaud Lacombe To: Andrey Zonov Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-net@freebsd.org Subject: Re: igb(4) won't start with "igb0: Could not setup receive structures" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and 
TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 05:07:27 -0000 Hi, On Tue, Mar 29, 2011 at 3:55 PM, Andrey Zonov wrote: > Hi, > > New igb driver (and I think em too) is required too much 9k mbufs when it's > been configured with mtu = 9000. On machine with 8 CPUs, driver is required > 8192 9k mbufs, but by default there is only 6400 and network won't start. In > previous versions for big mtu it was used 4k mbufs, by default there is > 12800 and all worked fine. > > Maybe it's time to think about increasing default > kern.maxusers/kern.ipc.nmbclusters? or use mp_ncpus for calculation these > values? or just increase amount of > mbuf_cluster/mbuf_jumbo_page/mbuf_jumbo_9k from that driver... > ... or maintain internal changes to the driver to make it less memory hungry and better behaved under memory pressure, no matter what Jack says, especially on systems where memory _is_ a constraint. I guess that will be the only way to use em(4) in the Real World (i.e. not some cozy Intel test lab). - Arnaud [0]: OTOH, I subscribed a few months ago :) > I just want igb to work out-of-the-box.
> > -- > Andrey Zonov > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 05:11:30 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx2.freebsd.org (mx2.freebsd.org [IPv6:2001:4f8:fff6::35]) by hub.freebsd.org (Postfix) with ESMTP id 9E393106567C for ; Wed, 30 Mar 2011 05:11:30 +0000 (UTC) (envelope-from dougb@FreeBSD.org) Received: from 65-241-43-5.globalsuite.net (hub.freebsd.org [IPv6:2001:4f8:fff6::36]) by mx2.freebsd.org (Postfix) with ESMTP id D27AD16089C; Wed, 30 Mar 2011 05:11:13 +0000 (UTC) Message-ID: <4D92BB71.5000900@FreeBSD.org> Date: Tue, 29 Mar 2011 22:11:13 -0700 From: Doug Barton Organization: http://SupersetSolutions.com/ User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.15) Gecko/20110319 Thunderbird/3.1.9 MIME-Version: 1.0 To: Arnaud Lacombe References: <4D923931.2070606@zonov.org> In-Reply-To: X-Enigmail-Version: 1.1.2 OpenPGP: id=1A1ABC84 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org, Andrey Zonov Subject: Re: igb(4) won't start with "igb0: Could not setup receive structures" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 05:11:30 -0000 On 03/29/2011 22:07, Arnaud Lacombe wrote: > ... or maintain internal changes to the driver to make it not that memory hungry/behave well under memory pressure, especially on system where memory_is_ a constraint. If you come up with patches, I'm sure everyone would like to see them. Meanwhile, there are times where memory IS a constraint, and there are some things you can't do without more of it. 
Doug -- Nothin' ever doesn't change, but nothin' changes much. -- OK Go Breadth of IT experience, and depth of knowledge in the DNS. Yours for the right price. :) http://SupersetSolutions.com/ From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 05:24:43 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C0824106564A for ; Wed, 30 Mar 2011 05:24:43 +0000 (UTC) (envelope-from mlmichael70@gmail.com) Received: from mail-ww0-f50.google.com (mail-ww0-f50.google.com [74.125.82.50]) by mx1.freebsd.org (Postfix) with ESMTP id 53FA88FC21 for ; Wed, 30 Mar 2011 05:24:42 +0000 (UTC) Received: by wwc33 with SMTP id 33so1028397wwc.31 for ; Tue, 29 Mar 2011 22:24:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:message-id:date:from:user-agent:mime-version:to :subject:content-type:content-transfer-encoding; bh=a1W0Onzx5X4AfYe7x5BLPL/bwbXIP66I8DW8RqGkzPk=; b=l7ReCW1i3bK87ZPZhPrQsSU7t0O+rAizuk0Rie2hi/muF57V2dwigHzFu27kR92IVc AClcXCH4MJ2UcyvhlssYh6cpnniuBmmICKx5Cv3NjGKktrMa2CtdWG6CHRUv7+evVwE7 q2lVHvk+2GR0eIfWwqmzLa0Whr+BvgVyZqH70= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:user-agent:mime-version:to:subject :content-type:content-transfer-encoding; b=chrw6RyLltm9qF9ziKBZUFXxVNUcH+0scorlA6qu8f3w6yLimS4eNSd0LaN9EW1rAJ 7WKGr5CsZ35gvZCqsl+vEbN0cJ1YKUyDXd0GTJ02jgwEudLXaLYI9SQ1QtrTdRMQ9t13 xadjBsf3p+Tx1QJHMX/Z79cCLssjHVoxuASHE= Received: by 10.227.176.135 with SMTP id be7mr787318wbb.0.1301462682272; Tue, 29 Mar 2011 22:24:42 -0700 (PDT) Received: from prime.nonspace (adsl-178-78-102-245.karoo.kcom.com [178.78.102.245]) by mx.google.com with ESMTPS id p5sm2800648wbg.45.2011.03.29.22.24.41 (version=SSLv3 cipher=OTHER); Tue, 29 Mar 2011 22:24:41 -0700 (PDT) Message-ID: <4D92BE98.40407@gmail.com> Date: Wed, 30 Mar 2011 06:24:40 +0100 From: Michael User-Agent: Mozilla/5.0 
(X11; U; FreeBSD amd64; en-US; rv:1.9.2.15) Gecko/20110317 Thunderbird/3.1.9 MIME-Version: 1.0 To: freebsd-net@freebsd.org Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Subject: mac_acl - how to get a list of allowed stations X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 05:24:43 -0000 Hi, To get MAC ACL I'm using wlan_acl and I'm adding stations with "ifconfig mac:add" command. It works but how can I get a list of currently allowed/denied stations? Michael From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 05:56:27 2011 Return-Path: Delivered-To: freebsd-net@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8546E106566B; Wed, 30 Mar 2011 05:56:27 +0000 (UTC) (envelope-from remko@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 598958FC0A; Wed, 30 Mar 2011 05:56:27 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p2U5uRkE002764; Wed, 30 Mar 2011 05:56:27 GMT (envelope-from remko@freefall.freebsd.org) Received: (from remko@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p2U5uRMo002760; Wed, 30 Mar 2011 05:56:27 GMT (envelope-from remko) Date: Wed, 30 Mar 2011 05:56:27 GMT Message-Id: <201103300556.p2U5uRMo002760@freefall.freebsd.org> To: remko@FreeBSD.org, freebsd-i386@FreeBSD.org, freebsd-net@FreeBSD.org From: remko@FreeBSD.org Cc: Subject: Re: kern/147912: [wifi]: [boot] FreeBSD 8 Beta won't boot on Thinkpad i1300 1171-5XU X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 05:56:27 -0000 Old Synopsis: [boot] FreeBSD 8 Beta won't boot on Thinkpad i1300 1171-5XU New Synopsis: [wifi]: [boot] FreeBSD 8 Beta won't boot on Thinkpad i1300 1171-5XU Responsible-Changed-From-To: freebsd-i386->freebsd-net Responsible-Changed-By: remko Responsible-Changed-When: Wed Mar 30 05:55:58 UTC 2011 Responsible-Changed-Why: reassign to networking team, this might have something to do with the network card that is in the system. http://www.freebsd.org/cgi/query-pr.cgi?pr=147912 From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 05:58:16 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BA0791065675 for ; Wed, 30 Mar 2011 05:58:16 +0000 (UTC) (envelope-from andrey@zonov.org) Received: from mail-fx0-f54.google.com (mail-fx0-f54.google.com [209.85.161.54]) by mx1.freebsd.org (Postfix) with ESMTP id 20E8A8FC18 for ; Wed, 30 Mar 2011 05:58:15 +0000 (UTC) Received: by fxm11 with SMTP id 11so1018512fxm.13 for ; Tue, 29 Mar 2011 22:58:15 -0700 (PDT) Received: by 10.223.17.76 with SMTP id r12mr711616faa.142.1301464694753; Tue, 29 Mar 2011 22:58:14 -0700 (PDT) Received: from [10.254.254.77] (ppp95-165-142-247.pppoe.spdop.ru [95.165.142.247]) by mx.google.com with ESMTPS id n1sm2222369fam.40.2011.03.29.22.58.13 (version=SSLv3 cipher=OTHER); Tue, 29 Mar 2011 22:58:13 -0700 (PDT) Message-ID: <4D92C673.2080107@zonov.org> Date: Wed, 30 Mar 2011 09:58:11 +0400 From: Andrey Zonov User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; ru; rv:1.8.1.24) Gecko/20100228 Thunderbird/2.0.0.24 Mnenhy/0.7.6.0 MIME-Version: 1.0 To: Jack Vogel References: <4D923931.2070606@zonov.org> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-net@freebsd.org Subject: Re: igb(4) won't start 
with "igb0: Could not setup receive structures" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 05:58:16 -0000 Hi, 9k jumbo frames are ubiquitous now. I believe we have been using 9k jumbos for the last four years or more. You get an unworkable system after upgrading from 8.1 to 8.2, and the documentation doesn't help here. It isn't good to ship a driver (or system) that doesn't work with jumbo frames by default. My point is that if you're using a machine with 8 CPUs, then maxusers/clusters/9k mbufs should be increased by the system, because such a machine has at least 2 GB of memory available. -- Andrey Zonov 30.03.2011 1:55, Jack Vogel wrote: > Our validation group has a default postinstall process, every > installed system gets those changes, > and these mbuf pool sizes are in that set of changes. While I'm not > opposed to system default settings > changing its usually necessary to have local sys changes anyway, after > all you don't get 9K jumbos > without manually specifying them as well :) > > Regards, > > Jack > > > On Tue, Mar 29, 2011 at 12:55 PM, Andrey Zonov > wrote: > > Hi, > > New igb driver (and I think em too) is required too much 9k mbufs > when it's been configured with mtu = 9000. On machine with 8 CPUs, > driver is required 8192 9k mbufs, but by default there is only > 6400 and network won't start. In previous versions for big mtu it > was used 4k mbufs, by default there is 12800 and all worked fine. > > Maybe it's time to think about increasing default > kern.maxusers/kern.ipc.nmbclusters? or use mp_ncpus for > calculation these values? or just increase amount of > mbuf_cluster/mbuf_jumbo_page/mbuf_jumbo_9k from that driver... > > I just want igb to work out-of-the-box.
> > -- > Andrey Zonov > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to > "freebsd-net-unsubscribe@freebsd.org > " > > From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 06:25:58 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3D5B31065675 for ; Wed, 30 Mar 2011 06:25:58 +0000 (UTC) (envelope-from bschmidt@techwires.net) Received: from mail-fx0-f54.google.com (mail-fx0-f54.google.com [209.85.161.54]) by mx1.freebsd.org (Postfix) with ESMTP id C79268FC15 for ; Wed, 30 Mar 2011 06:25:57 +0000 (UTC) Received: by fxm11 with SMTP id 11so1031442fxm.13 for ; Tue, 29 Mar 2011 23:25:56 -0700 (PDT) Received: by 10.223.126.140 with SMTP id c12mr797244fas.31.1301466356600; Tue, 29 Mar 2011 23:25:56 -0700 (PDT) Received: from jessie.localnet (p5B2ECD0C.dip0.t-ipconnect.de [91.46.205.12]) by mx.google.com with ESMTPS id f15sm2229159fax.34.2011.03.29.23.25.54 (version=SSLv3 cipher=OTHER); Tue, 29 Mar 2011 23:25:55 -0700 (PDT) Sender: Bernhard Schmidt From: Bernhard Schmidt To: Michael Date: Wed, 30 Mar 2011 08:25:17 +0200 User-Agent: KMail/1.13.5 (Linux/2.6.32-30-generic; KDE/4.4.5; i686; ; ) References: <4D92BE98.40407@gmail.com> In-Reply-To: <4D92BE98.40407@gmail.com> MIME-Version: 1.0 Content-Type: Text/Plain; charset="utf-8" Content-Transfer-Encoding: 7bit Message-Id: <201103300825.17860.bschmidt@freebsd.org> Cc: freebsd-net@freebsd.org Subject: Re: mac_acl - how to get a list of allowed stations X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: bschmidt@freebsd.org List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 06:25:58 -0000 On Wednesday, March 30, 2011 07:24:40 Michael wrote: > 
Hi, > > To get MAC ACL I'm using wlan_acl and I'm adding stations with "ifconfig > mac:add" command. It works but how can I get a list of currently > allowed/denied stations? Without actually trying this, I'd say the "list" command is what you are looking for. # ifconfig wlan0 list mac -- Bernhard From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 06:37:00 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BFC28106566B; Wed, 30 Mar 2011 06:37:00 +0000 (UTC) (envelope-from mlmichael70@gmail.com) Received: from mail-wy0-f182.google.com (mail-wy0-f182.google.com [74.125.82.182]) by mx1.freebsd.org (Postfix) with ESMTP id 1FB938FC15; Wed, 30 Mar 2011 06:36:59 +0000 (UTC) Received: by wyf23 with SMTP id 23so973400wyf.13 for ; Tue, 29 Mar 2011 23:36:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:message-id:date:from:user-agent:mime-version:to :cc:subject:references:in-reply-to:content-type :content-transfer-encoding; bh=N1Wq9UlMdGYQNgAUiEf3IGs7PWgfede6Qxe3LBRowfY=; b=Pe4H1cUwgvS47ixl020eEsg1+wBb7Iv/3LfOMmZ85A7xFskxZ57y5PxoGLIwTY0dsn rIvnHwLCrSHVYqK6mi5CWZKs+tocVKKAFvv3th6NEKzYuGRQXkLjeP2nHDFr2OZUQvVW xYG4rmkuCUXya6mmoQYNmEcRPSc7G9AgHUt2s= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding; b=UocClS/G/jw3EaLK3bDAQbkuZbuRGUQtv/ceazkgzVhLPSDoOyeFvNi1Pj+KpLTrl3 e+e67V01/utBRsmfKcklXTmUQGE1Yti/SE7yrnKjslLu1yPWtP8YBgmlPA58sqj9Sk44 j4W0DowURwHBKFKbRtoaK3FPzNoGjg4BTnKqg= Received: by 10.227.150.151 with SMTP id y23mr811773wbv.135.1301467018840; Tue, 29 Mar 2011 23:36:58 -0700 (PDT) Received: from prime.nonspace (adsl-178-78-102-245.karoo.kcom.com [178.78.102.245]) by mx.google.com with ESMTPS id g7sm2826097wby.48.2011.03.29.23.36.57 (version=SSLv3 cipher=OTHER); 
Tue, 29 Mar 2011 23:36:58 -0700 (PDT) Message-ID: <4D92CF89.5040405@gmail.com> Date: Wed, 30 Mar 2011 07:36:57 +0100 From: Michael User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.15) Gecko/20110317 Thunderbird/3.1.9 MIME-Version: 1.0 To: bschmidt@freebsd.org References: <4D92BE98.40407@gmail.com> <201103300825.17860.bschmidt@freebsd.org> In-Reply-To: <201103300825.17860.bschmidt@freebsd.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org Subject: Re: mac_acl - how to get a list of allowed stations X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 06:37:00 -0000 On 30/03/2011 07:25, Bernhard Schmidt wrote: > On Wednesday, March 30, 2011 07:24:40 Michael wrote: >> Hi, >> >> To get MAC ACL I'm using wlan_acl and I'm adding stations with "ifconfig >> mac:add" command. It works but how can I get a list of currently >> allowed/denied stations? > > Without actually trying this, I'd say the "list" command is what you > are looking for. > > # ifconfig wlan0 list mac > Silly me, I was looking at "mac:" commands. Thank you very much. 
Michael From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 12:38:25 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9CBD1106566B; Wed, 30 Mar 2011 12:38:25 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 750778FC15; Wed, 30 Mar 2011 12:38:25 +0000 (UTC) Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id 2769F46B55; Wed, 30 Mar 2011 08:38:25 -0400 (EDT) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id B89BF8A027; Wed, 30 Mar 2011 08:38:24 -0400 (EDT) From: John Baldwin To: "Stefan `Sec` Zehl" Date: Wed, 30 Mar 2011 08:38:09 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110325; KDE/4.5.5; amd64; ; ) References: <4D8B99B4.4070404@FreeBSD.org> <201103281423.52202.jhb@freebsd.org> <20110328183810.GF23803@ice.42.org> In-Reply-To: <20110328183810.GF23803@ice.42.org> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201103300838.09608.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.6 (bigwig.baldwin.cx); Wed, 30 Mar 2011 08:38:24 -0400 (EDT) Cc: freebsd-net@freebsd.org, Doug Barton Subject: Re: The tale of a TCP bug X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 12:38:25 -0000 On Monday, March 28, 2011 2:38:10 pm Stefan `Sec` Zehl wrote: > Hi, > > On Mon, Mar 28, 2011 at 14:23 -0400, John Baldwin wrote: > > > > No, this is not really right. Your patch from your blog is the best > > fix actually. 
The reason we want to let 'win' be larger than
> > TCP_MAXWIN is that if the remote end sends more data than we've
> > advertised but we have room in the socket buffer, we want to go ahead
> > and accept the data as valid and ACK it rather than dropping the data
> > that is beyond rcv_adv. My change above to rcv_wnd would break this.
> > Also, for the TCPS_SYN_SENT case we don't know what 'rcv_scale' is
> > until just before we update 'rcv_adv'. This should be the same
> > as your patch:
> >
> > Index: tcp_input.c
> > ===================================================================
> > --- tcp_input.c (revision 220098)
> > +++ tcp_input.c (working copy)
> > @@ -1756,7 +1756,8 @@ tcp_do_segment(struct mbuf *m, struct tcphdr *th,
> >                      (TF_RCVD_SCALE|TF_REQ_SCALE)) {
> >                          tp->rcv_scale = tp->request_r_scale;
> >                  }
> > -                tp->rcv_adv += tp->rcv_wnd;
> > +                tp->rcv_adv += imin(tp->rcv_wnd,
> > +                    TCP_MAXWIN << tp->rcv_scale);
> >                  tp->snd_una++;          /* SYN is acked */
> >                  /*
> >                   * If there's data, delay ACK; if there's also a FIN
> >
>
> I've applied this to my test-VM, and as expected it now passes my two
> testcases. As far as I'm concerned this fixes it for me.
>
> I'm interested to see if my adv_neg counting hack together with this
> patch still registers any hits. -- If nobody beats me to it, I'll try it
> out on my webserver tomorrow.

There is at least one case I know of related to a bug I reported earlier
where a window probe from a remote connection can cause rcv_nxt to advance
past rcv_adv by one. However, I think we want to know about those cases,
and we should probably be treating rcv_adv - rcv_nxt as if it is zero in
that case, not -1 (my patch in my original e-mail does just that in a
different place in tcp_output() when we calculate the window "for real").
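
The two ideas above (capping the rcv_adv increment at TCP_MAXWIN << rcv_scale, and treating a negative rcv_adv - rcv_nxt as zero) can be sketched in userland C. This is only an illustration under stated assumptions, not the committed kernel code: adv_increment() and adv_window_left() are hypothetical helper names, and the two constants are redefined locally with the values used by the TCP headers.

```c
#include <assert.h>
#include <stdint.h>

#define TCP_MAXWIN        65535  /* largest window that fits the 16-bit header field */
#define TCP_MAX_WINSHIFT  14     /* largest permitted window scale shift */

/*
 * Hypothetical helper (not a kernel function): the amount by which
 * rcv_adv may grow, i.e. min(rcv_wnd, TCP_MAXWIN << rcv_scale).
 * rcv_wnd is taken as a wide signed type so that a large socket
 * buffer cannot overflow the comparison.
 */
static uint32_t
adv_increment(int64_t rcv_wnd, unsigned rcv_scale)
{
	int64_t max_adv;

	assert(rcv_scale <= TCP_MAX_WINSHIFT);
	max_adv = (int64_t)TCP_MAXWIN << rcv_scale;
	if (rcv_wnd < 0)
		rcv_wnd = 0;
	return ((uint32_t)(rcv_wnd < max_adv ? rcv_wnd : max_adv));
}

/*
 * Hypothetical helper: advertised window still outstanding.
 * The subtraction is done in sequence space (modulo 2^32) and then
 * viewed as signed, so rcv_nxt running one byte past rcv_adv gives
 * 0 instead of -1 or a huge unsigned value.  The unsigned-to-signed
 * conversion relies on the usual two's-complement behaviour.
 */
static uint32_t
adv_window_left(uint32_t rcv_adv, uint32_t rcv_nxt)
{
	int32_t left = (int32_t)(rcv_adv - rcv_nxt);

	return (left > 0 ? (uint32_t)left : 0);
}
```

For example, adv_increment(100000, 0) is capped at 65535, while adv_window_left(1000, 1001) yields 0 rather than a wrapped value.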
-- John Baldwin From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 12:39:08 2011 Return-Path: Delivered-To: freebsd-net@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3BDFE106567D; Wed, 30 Mar 2011 12:39:08 +0000 (UTC) (envelope-from jhb@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 140CD8FC24; Wed, 30 Mar 2011 12:39:08 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p2UCd7fE003250; Wed, 30 Mar 2011 12:39:07 GMT (envelope-from jhb@freefall.freebsd.org) Received: (from jhb@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p2UCd76L003246; Wed, 30 Mar 2011 12:39:07 GMT (envelope-from jhb) Date: Wed, 30 Mar 2011 12:39:07 GMT Message-Id: <201103301239.p2UCd76L003246@freefall.freebsd.org> To: sec@42.org, jhb@FreeBSD.org, freebsd-net@FreeBSD.org, jhb@FreeBSD.org From: jhb@FreeBSD.org Cc: Subject: Re: kern/154006: [tcp] [patch] tcp "window probe" bug on 64bit X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 12:39:08 -0000 Synopsis: [tcp] [patch] tcp "window probe" bug on 64bit State-Changed-From-To: open->patched State-Changed-By: jhb State-Changed-When: Wed Mar 30 12:38:33 UTC 2011 State-Changed-Why: Fix committed to HEAD. Responsible-Changed-From-To: freebsd-net->jhb Responsible-Changed-By: jhb Responsible-Changed-When: Wed Mar 30 12:38:33 UTC 2011 Responsible-Changed-Why: Fix committed to HEAD. 
http://www.freebsd.org/cgi/query-pr.cgi?pr=154006 From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 14:20:50 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C584D106566B for ; Wed, 30 Mar 2011 14:20:50 +0000 (UTC) (envelope-from lacombar@gmail.com) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id 8C18C8FC12 for ; Wed, 30 Mar 2011 14:20:50 +0000 (UTC) Received: by iwn33 with SMTP id 33so1634941iwn.13 for ; Wed, 30 Mar 2011 07:20:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=fIxPhEtAEjiup/ZHMVK4wjxr2CjCtkTEYARmT980bQQ=; b=SdrJflvq18qV1irAAIjNSLE7g/qilu+pOHIZmf+RddpBKMXO3NUjwwTUipmoZ1Kars 8zHkS1wxxhW+V/4qeKMi5/NB/W2Dv7d/YgI1t/9dFr8SOB/u72UwIvhL4YvmgOiXzISo iM0mnqTZWt3ZqMBufkzeUW6TNX0Tjlg8IU/PI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=J34IRXxMNdqYYYQg/5Fbfb+neCjdJ7T9vGGtcxthV1oRZHT7iLRGQGZttWm3phYHoO 2k0iIUHyC/LWtrZ9j1Y8HF9GFpXIBAWRx/mlR8Dcm3qDtxeJArS1tqnRrbBLSMHz7pqp MlxXSRVZSYdTUR5ObMVZqE52ebnmjkf0L4ulw= MIME-Version: 1.0 Received: by 10.231.180.94 with SMTP id bt30mr1360548ibb.23.1301494748020; Wed, 30 Mar 2011 07:19:08 -0700 (PDT) Received: by 10.42.146.72 with HTTP; Wed, 30 Mar 2011 07:19:07 -0700 (PDT) In-Reply-To: <4D92BB71.5000900@FreeBSD.org> References: <4D923931.2070606@zonov.org> <4D92BB71.5000900@FreeBSD.org> Date: Wed, 30 Mar 2011 10:19:07 -0400 Message-ID: From: Arnaud Lacombe To: Doug Barton Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-net@freebsd.org Subject: Re: igb(4) won't start 
with "igb0: Could not setup receive structures" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 14:20:50 -0000

Hi,

On Wed, Mar 30, 2011 at 1:11 AM, Doug Barton wrote:
> On 03/29/2011 22:07, Arnaud Lacombe wrote:
>>
>> ... or maintain internal changes to the driver to make it not that memory
>> hungry/behave well under memory pressure, especially on system where
>> memory _is_ a constraint.
>
> If you come up with patches, I'm sure everyone would like to see them.
>
No, I came with a patch, Jack sent it explicitly to /dev/null, telling
me that what I was checking was not available in the mode the driver
was in. Then I took the chip documentation, quoted all the chapters
which led me to believe that what I was checking _was_ available in
the mode the driver was in. I never got an answer. Unfortunately, all
these discussions are not publicly available because Jack likes doing
things off the list.

The only things I've been able to get from Jack is "We, at Intel, test
em(4) at 256k nmbclusters. We do not have problem. If you have
problem, raise nmbcluster.". 256k nmbcluster in my environment is not
acceptable.

> Meanwhile, there are times where memory IS a constraint, and there are some
> things you can't do without more of it.
>
yes, but the driver should not need a manual reset between the time
resource are (heavily) scarce and the time it became available again.
- Arnaud From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 14:22:42 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 554F9106564A; Wed, 30 Mar 2011 14:22:42 +0000 (UTC) (envelope-from lists@yamagi.org) Received: from mail.yamagi.overkill.yamagi.org (unknown [IPv6:2a01:4f8:121:2102:1::7]) by mx1.freebsd.org (Postfix) with ESMTP id 8F7988FC0A; Wed, 30 Mar 2011 14:22:41 +0000 (UTC) Received: from [2001:5c0:110d:6600:21b:21ff:fe07:b562] (unknown [IPv6:2001:5c0:110d:6600:21b:21ff:fe07:b562]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.yamagi.overkill.yamagi.org (Postfix) with ESMTPSA id 7C52616663D1; Wed, 30 Mar 2011 16:22:36 +0200 (CEST) Date: Wed, 30 Mar 2011 16:22:23 +0200 (CEST) From: Yamagi Burmeister X-X-Sender: yamagi@saya.home.yamagi.org To: freebsd-net@freebsd.org Message-ID: User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII Cc: yongari@FreeBSD.org Subject: Kernel memory corruption(?) with age(4) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 14:22:42 -0000 Hi, I recently got four about two years old Asus M3A-H/HDMI mainboards with an integrated Attansic L2 ethernet controller. This NIC is supported by age(4) and recognized by freebsd: ---- age0: mem 0xfeac0000-0xfeafffff irq 18 at device 0.0 on pci2 age0: 1280 Tx FIFO, 2364 Rx FIFO age0: Using 1 MSI messages. age0: 4GB boundary crossed, switching to 32bit DMA addressing mode. 
miibus0: on age0
atphy0: PHY 0 on miibus0
atphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX, 1000baseT-FDX-master, auto
age0: Ethernet address: 00:23:54:31:a0:12
age0: [FILTER]
----
age0: flags=8843 metric 0 mtu 1500
        options=c319b
        ether 00:23:54:31:a0:12
        inet6 fe80::223:54ff:fe31:a012%age0 prefixlen 64 scopeid 0x1
        nd6 options=3
        media: Ethernet autoselect (none)
        status: no carrier
----

All four boxes are unstable if the Attansic NIC is in use; none of
them survived more than 60 minutes of ~20mb/s network traffic. I
managed to get some coredumps and extracted the backtraces. Since
every time one of the boxes panicked I got a different panic message
and a different backtrace with a different subsystem involved, I
suspected broken hardware. I plugged an em(4) NIC into the PCI slot
and wasn't able to reproduce the problem; in fact the boxes ran rock
solid for several days. Next I set up Windows 7, installed the
Attansic vendor driver and did another run. All went smoothly, no
crash for nearly 24 hours.

My guess is kernel memory corruption by age(4), which would explain
all the different backtraces and the different panic messages. This
problem is reproducible in at least FreeBSD 7.4 and 8.2 and with TSO4
enabled and disabled. I'm willing to debug this, but I really don't
know how. So any help or a pointer in the right direction would be
appreciated.

----

Three backtraces, all of them occurred while receiving and sending
data via NFS over the age(4) NIC:

panic: initiate_write_filepage: dir inum 50001080 != new 0
cpuid = 2
#0 doadump () at /usr/src/sys/kern/kern_shutdown.c:251
#1 0xffffffff8018604c in db_fncall (dummy1=Variable "dummy1" is not available.
) at /usr/src/sys/ddb/db_command.c:548
#2 0xffffffff80186381 in db_command (last_cmdp=0xffffffff806178c0, cmd_table=Variable "cmd_table" is not available.
) at /usr/src/sys/ddb/db_command.c:445 #3 0xffffffff801865d0 in db_command_loop () at /usr/src/sys/ddb/db_command.c:498 #4 0xffffffff80188619 in db_trap (type=Variable "type" is not available. ) at /usr/src/sys/ddb/db_main.c:229 #5 0xffffffff8024d7fe in kdb_trap (type=3, code=0, tf=0xffffff8243513720) at /usr/src/sys/kern/subr_kdb.c:546 #6 0xffffffff80424366 in trap (frame=0xffffff8243513720) at /usr/src/sys/amd64/amd64/trap.c:566 #7 0xffffffff8040c234 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224 #8 0xffffffff8024d99d in kdb_enter (why=0xffffffff80479419 "panic", msg=0xa
) at cpufunc.h:63 #9 0xffffffff8021c4f0 in panic (fmt=Variable "fmt" is not available. ) at /usr/src/sys/kern/kern_shutdown.c:575 #10 0xffffffff80c5925e in softdep_fsync_mountdev () from /boot/kernel/ufs.ko #11 0xffffff00067a0460 in ?? () #12 0x0000000000000000 in ?? () #13 0xffffff0167d49988 in ?? () #14 0xffffff000694000e in ?? () #15 0xffffff0006b32800 in ?? () #16 0xffffff81ef201bd0 in ?? () #17 0xffffff81ef201bd0 in ?? () #18 0xffffff0006b613b0 in ?? () #19 0xffffff0006b614c8 in ?? () #20 0xffffff0156024878 in ?? () #21 0xffffff8243513980 in ?? () #22 0xffffffff80c5c174 in ffs_flushfiles () from /boot/kernel/ufs.ko #23 0xffffff81ef201bd0 in ?? () #24 0xffffff013c210a80 in ?? () #25 0x0000000000000004 in ?? () #26 0x0000000000000000 in ?? () #27 0xffffff82435139b0 in ?? () #28 0xffffffff80c3ea25 in ufs_do_nfs4_acl_inheritance () from /boot/kernel/ufs.ko #29 0xffffff82435139b0 in ?? () #30 0xffffffff80459fb5 in VOP_STRATEGY_APV (vop=0xffffff00067a0460, a=0xffffff0167d49980) at vnode_if.c:2169 Previous frame inner to this frame (corrupt stack?) ---- Fatal trap 9: general protection fault while in kernel mode cpuid = 2; apic id = 02 instruction pointer = 0x20:0xffffffff8020ca0e stack pointer = 0x28:0xffffff82435139e0 frame pointer = 0x28:0xffffff8243513a00 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 21 (syncer) #0 doadump () at /usr/src/sys/kern/kern_shutdown.c:251 #1 0xffffffff8018604c in db_fncall (dummy1=Variable "dummy1" is not available. ) at /usr/src/sys/ddb/db_command.c:548 #2 0xffffffff80186381 in db_command (last_cmdp=0xffffffff806178c0, cmd_table=Variable "cmd_table" is not available. ) at /usr/src/sys/ddb/db_command.c:445 #3 0xffffffff801865d0 in db_command_loop () at /usr/src/sys/ddb/db_command.c:498 #4 0xffffffff80188619 in db_trap (type=Variable "type" is not available. 
) at /usr/src/sys/ddb/db_main.c:229 #5 0xffffffff8024d7fe in kdb_trap (type=9, code=0, tf=0xffffff8243513930) at /usr/src/sys/kern/subr_kdb.c:546 #6 0xffffffff80423d1d in trap_fatal (frame=0xffffff8243513930, eva=Variable "eva" is not available. ) at /usr/src/sys/amd64/amd64/trap.c:778 #7 0xffffffff804242f9 in trap (frame=0xffffff8243513930) at /usr/src/sys/amd64/amd64/trap.c:592 #8 0xffffffff8040c234 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224 #9 0xffffffff8020ca0e in _mtx_lock_sleep (m=0xffffff0106d57820, tid=18446742974306583648, opts=Variable "opts" is not available. ) at /usr/src/sys/kern/kern_mutex.c:369 #10 0xffffffff802b16d7 in vfs_msync (mp=0xffffff00069ad8d0, flags=2) at /usr/src/sys/kern/vfs_subr.c:3219 #11 0xffffffff802b190a in sync_fsync (ap=Variable "ap" is not available. ) at /usr/src/sys/kern/vfs_subr.c:3473 #12 0xffffffff802afabe in sync_vnode (slp=0xffffff00067688b8, bo=0xffffff8243513ba0, td=0xffffff00067a0460) at vnode_if.h:549 #13 0xffffffff802afdb1 in sched_sync () at /usr/src/sys/kern/vfs_subr.c:1836 #14 0xffffffff801f1a48 in fork_exit (callout=0xffffffff802afbe0 , arg=0x0, frame=0xffffff8243513c40) at /usr/src/sys/kern/kern_fork.c:845 #15 0xffffffff8040c6fe in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:565 #16 0x0000000000000000 in ?? () #17 0x0000000000000000 in ?? () #18 0x0000000000000001 in ?? () #19 0x0000000000000000 in ?? () #20 0x0000000000000000 in ?? () #21 0x0000000000000000 in ?? () #22 0x0000000000000000 in ?? () #23 0x0000000000000000 in ?? () #24 0x0000000000000000 in ?? () #25 0x0000000000000000 in ?? () #26 0x0000000000000000 in ?? () #27 0x0000000000000000 in ?? () #28 0x0000000000000000 in ?? () #29 0x0000000000000000 in ?? () #30 0x0000000000000000 in ?? () #31 0x0000000000000000 in ?? () #32 0x0000000000000000 in ?? () #33 0x0000000000000000 in ?? () #34 0x0000000000000000 in ?? () #35 0x0000000000000000 in ?? () #36 0x0000000000000000 in ?? () #37 0x0000000000000000 in ?? 
() #38 0x0000000000000000 in ?? () #39 0x0000000000000000 in ?? () #40 0xffffffff80637f00 in tdq_cpu () #41 0x0000000000000002 in ?? () #42 0x0000000000000000 in ?? () #43 0xffffff00067a0460 in ?? () #44 0xffffff8243513830 in ?? () #45 0xffffff82435137d8 in ?? () #46 0xffffff000347b460 in ?? () #47 0xffffffff80241b79 in sched_switch (td=0xffffffff802afbe0, newtd=0x0, flags=Variable "flags" is not available. ) at /usr/src/sys/kern/sched_ule.c:1852 ---- Fatal trap 9: general protection fault while in kernel mode cpuid = 1; apic id = 01 instruction pointer = 0x20:0xffffffff803e3b0b stack pointer = 0x28:0xffffff8245984890 frame pointer = 0x28:0xffffff82459848a0 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 3250 (nfsiod 15) #0 doadump () at /usr/src/sys/kern/kern_shutdown.c:251 #1 0xffffffff8018604c in db_fncall (dummy1=Variable "dummy1" is not available. ) at /usr/src/sys/ddb/db_command.c:548 #2 0xffffffff80186381 in db_command (last_cmdp=0xffffffff806178c0, cmd_table=Variable "cmd_table" is not available. ) at /usr/src/sys/ddb/db_command.c:445 #3 0xffffffff801865d0 in db_command_loop () at /usr/src/sys/ddb/db_command.c:498 #4 0xffffffff80188619 in db_trap (type=Variable "type" is not available. ) at /usr/src/sys/ddb/db_main.c:229 #5 0xffffffff8024d7fe in kdb_trap (type=9, code=0, tf=0xffffff82459847e0) at /usr/src/sys/kern/subr_kdb.c:546 #6 0xffffffff80423d1d in trap_fatal (frame=0xffffff82459847e0, eva=Variable "eva" is not available. ) at /usr/src/sys/amd64/amd64/trap.c:778 #7 0xffffffff804242f9 in trap (frame=0xffffff82459847e0) at /usr/src/sys/amd64/amd64/trap.c:592 #8 0xffffffff8040c234 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224 #9 0xffffffff803e3b0b in slab_alloc_item (zone=Variable "zone" is not available. 
) at /usr/src/sys/vm/uma_core.c:2321 #10 0xffffffff803e86a9 in uma_zalloc_arg (zone=0xffffff022ffe6c80, udata=0xffffff005f78a000, flags=3) at /usr/src/sys/vm/uma_core.c:2406 #11 0xffffffff81c5d4ec in nfsm_uiotombuf () from /boot/kernel/nfsclient.ko #12 0xffffff8245984a80 in ?? () #13 0x0000800000000001 in ?? () #14 0xffffff8245984a60 in ?? () #15 0x0000800006b86000 in ?? () #16 0xffffff8223ad1fcc in ?? () #17 0xffffff0006622200 in ?? () #18 0x0000000000008000 in ?? () #19 0x0000000100000000 in ?? () #20 0xffffff012ec84588 in ?? () #21 0xffffff005f84c000 in ?? () #22 0xffffff8245984b2c in ?? () #23 0xffffff8245984ae0 in ?? () #24 0xffffff005f84c000 in ?? () #25 0xffffff005f9c0b00 in ?? () #26 0x0000000000008000 in ?? () #27 0xffffff8245984ac0 in ?? () #28 0xffffffff81c64e97 in nfs_writerpc () from /boot/kernel/nfsclient.ko Previous frame inner to this frame (corrupt stack?) ---- Thanks, Yamagi From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 14:33:05 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id ED9E41065673 for ; Wed, 30 Mar 2011 14:33:05 +0000 (UTC) (envelope-from lacombar@gmail.com) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id B5D2D8FC08 for ; Wed, 30 Mar 2011 14:33:05 +0000 (UTC) Received: by iwn33 with SMTP id 33so1647410iwn.13 for ; Wed, 30 Mar 2011 07:33:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=48Akc4g8AU2erehAy2/qkTBDqBRZjhVg9qn6IWljfz8=; b=bjMhQ7A44sbP6RqV0YJLQyaC1OrjxAlIJDcWNu1xUkyDc4ocZVlH9x5/0BiHbs99tN 9JWIxo0f8hJHtyCKXdMMBmu+yqLbOzpJYnnAxmaclNkC5NKNwWKT++Ummevdvvp1ar2c Ml4YfR4N2Z3tuFHfaL9lf5jpORM1DRXUy8dvQ= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; 
h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=p7+50f/PGH6A5W7I046PHt/ZNl2nTSk4ELjNFy6wt/zMnq27LZpU7/UMKkCXMbDfEQ u1cUmBPM/HJrgju0yQXOn2q0xrx+X6eJUBVzXnxRoPulVJdnvUXnq5mOqvML1Htmf8Ju ommeSt/8lz6scS7n7QbdI6zypgc8sKUXfKA5U= MIME-Version: 1.0 Received: by 10.231.197.27 with SMTP id ei27mr1210172ibb.198.1301495585062; Wed, 30 Mar 2011 07:33:05 -0700 (PDT) Received: by 10.42.146.72 with HTTP; Wed, 30 Mar 2011 07:33:05 -0700 (PDT) In-Reply-To: <4D92C673.2080107@zonov.org> References: <4D923931.2070606@zonov.org> <4D92C673.2080107@zonov.org> Date: Wed, 30 Mar 2011 10:33:05 -0400 Message-ID: From: Arnaud Lacombe To: Andrey Zonov Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-net@freebsd.org, Jack Vogel Subject: Re: igb(4) won't start with "igb0: Could not setup receive structures" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 14:33:06 -0000 Hi, On Wed, Mar 30, 2011 at 1:58 AM, Andrey Zonov wrote: > My point is if you're using machine with 8 CPUs than maxusers/clusters/9k > mbufs should have been increased by system, because on this machine minimum > 2Gb memory is available. > I am doubtful that the number of CPU[0] or number of users (yes, I know `maxusers' is currently used to compute the default `nmbcluster'...) can be linked to any network load pattern at all. You can have a 24 CPU machine made for 4096 users with a single NIC, not requiring much memory, while a 1 CPU machine with only 1 users can have +8 NIC and require a huge quantity of memory. Available KVM space should also be taken into account, as it is rather limited on i386. - Arnaud [0]: even more today where you can have a huge number of virtual CPU. 
From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 15:56:28 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2CD15106564A for ; Wed, 30 Mar 2011 15:56:28 +0000 (UTC) (envelope-from dudu@dudu.ro) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id EE64D8FC0C for ; Wed, 30 Mar 2011 15:56:27 +0000 (UTC) Received: by iwn33 with SMTP id 33so1737980iwn.13 for ; Wed, 30 Mar 2011 08:56:27 -0700 (PDT) Received: by 10.231.34.139 with SMTP id l11mr1533030ibd.31.1301500587167; Wed, 30 Mar 2011 08:56:27 -0700 (PDT) MIME-Version: 1.0 Received: by 10.42.3.13 with HTTP; Wed, 30 Mar 2011 08:55:47 -0700 (PDT) In-Reply-To: <20110313011632.GA1621@michelle.cdnetworks.com> References: <20110313011632.GA1621@michelle.cdnetworks.com> From: Vlad Galu Date: Wed, 30 Mar 2011 17:55:47 +0200 Message-ID: To: pyunyh@gmail.com Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-net@freebsd.org, Arnaud Lacombe Subject: Re: bge(4) on RELENG_8 mbuf cluster starvation X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 15:56:28 -0000 On Sun, Mar 13, 2011 at 2:16 AM, YongHyeon PYUN wrote: > On Sat, Mar 12, 2011 at 09:17:28PM +0100, Vlad Galu wrote: > > On Sat, Mar 12, 2011 at 8:53 PM, Arnaud Lacombe > wrote: > > > > > Hi, > > > > > > On Sat, Mar 12, 2011 at 4:03 AM, Vlad Galu wrote: > > > > Hi folks, > > > > > > > > On a fairly busy recent (r219010) RELENG_8 machine I keep getting > > > > -- cut here -- > > > > 1096/1454/2550 mbufs in use (current/cache/total) > > > > 1035/731/1766/262144 mbuf clusters in use (current/cache/total/max) > > > > 1035/202 mbuf+clusters out of packet 
secondary zone in use > > > (current/cache) > > > > 0/117/117/12800 4k (page size) jumbo clusters in use > > > > (current/cache/total/max) > > > > 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max) > > > > 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max) > > > > 2344K/2293K/4637K bytes allocated to network (current/cache/total) > > > > 0/70128196/37726935 requests for mbufs denied > > > (mbufs/clusters/mbuf+clusters) > > > > ^^^^^^^^^^^^^^^^^^^^^ > > > > -- and here -- > > > > > > > > kern.ipc.nmbclusters is set to 131072. Other settings: > > > no, netstat(8) says 262144. > > > > > > > > Heh, you're right, I forgot I'd doubled it a while ago. Wrote that from > the > > top of my head. > > > > > > > Maybe can you include $(sysctl dev.bge) ? Might be useful. > > > > > > - Arnaud > > > > > > > Sure: > > [...] > > > dev.bge.1.%desc: Broadcom NetXtreme Gigabit Ethernet Controller, ASIC > rev. > > 0x004101 > > dev.bge.1.%driver: bge > > dev.bge.1.%location: slot=0 function=0 > > dev.bge.1.%pnpinfo: vendor=0x14e4 device=0x1659 subvendor=0x1014 > > subdevice=0x02c6 class=0x020000 > > dev.bge.1.%parent: pci5 > > dev.bge.1.forced_collapse: 2 > > dev.bge.1.forced_udpcsum: 0 > > dev.bge.1.stats.FramesDroppedDueToFilters: 0 > > dev.bge.1.stats.DmaWriteQueueFull: 0 > > dev.bge.1.stats.DmaWriteHighPriQueueFull: 0 > > dev.bge.1.stats.NoMoreRxBDs: 680050 > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > This indicates bge(4) encountered RX buffer shortage. Perhaps > bge(4) couldn't fill new RX buffers for incoming frames due to > other system activities. > > > dev.bge.1.stats.InputDiscards: 228755931 > > This counter indicates number of frames discarded due to RX buffer > shortage. bge(4) discards received frame if it failed to allocate > new RX buffer such that InputDiscards is normally higher than > NoMoreRxBDs. > > > dev.bge.1.stats.InputErrors: 49080818 > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > Something is wrong here. 
Too many frames were classified as error > frames. You may see poor RX performance. > > > dev.bge.1.stats.RecvThresholdHit: 0 > > dev.bge.1.stats.rx.ifHCInOctets: 2095148839247 > > dev.bge.1.stats.rx.Fragments: 47887706 > > dev.bge.1.stats.rx.UnicastPkts: 32672557601 > > dev.bge.1.stats.rx.MulticastPkts: 1218 > > dev.bge.1.stats.rx.BroadcastPkts: 2 > > dev.bge.1.stats.rx.FCSErrors: 2822217 > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > FCS errors are too high. Please check cabling again(I'm assuming > the controller is not broken here). I think you can use vendor's > diagnostic tools to verify this. > > > dev.bge.1.stats.rx.AlignmentErrors: 0 > > dev.bge.1.stats.rx.xonPauseFramesReceived: 0 > > dev.bge.1.stats.rx.xoffPauseFramesReceived: 0 > > dev.bge.1.stats.rx.ControlFramesReceived: 0 > > dev.bge.1.stats.rx.xoffStateEntered: 0 > > dev.bge.1.stats.rx.FramesTooLong: 0 > > dev.bge.1.stats.rx.Jabbers: 0 > > dev.bge.1.stats.rx.UndersizePkts: 0 > > dev.bge.1.stats.tx.ifHCOutOctets: 48751515826 > > dev.bge.1.stats.tx.Collisions: 0 > > dev.bge.1.stats.tx.XonSent: 0 > > dev.bge.1.stats.tx.XoffSent: 0 > > dev.bge.1.stats.tx.InternalMacTransmitErrors: 0 > > dev.bge.1.stats.tx.SingleCollisionFrames: 0 > > dev.bge.1.stats.tx.MultipleCollisionFrames: 0 > > dev.bge.1.stats.tx.DeferredTransmissions: 0 > > dev.bge.1.stats.tx.ExcessiveCollisions: 0 > > dev.bge.1.stats.tx.LateCollisions: 0 > > dev.bge.1.stats.tx.UnicastPkts: 281039183 > > dev.bge.1.stats.tx.MulticastPkts: 0 > > dev.bge.1.stats.tx.BroadcastPkts: 1153 > > -- and here -- > > > > And now, that I remembered about this as well: > > -- cut here -- > > Name Mtu Network Address Ipkts Ierrs Idrop Opkts > > Oerrs Coll > > bge1 1500 00:11:25:22:0d:ed 32321767025 278517070 > 37726837 > > 281068216 0 0 > > -- and here -- > > The colo provider changed my cable a couple of times so I'd not blame it > on > > that. Unfortunately, I don't have access to the port statistics on the > > switch. 
Running netstat with -w1 yields between 0 and 4 errors/second. > > > > Hardware MAC counters still show high number of FCS errors. The > service provider should have to check possible cabling issues on > the port of the switch. > After swapping cables and moving the NIC into another switch, there are some improvements. However: -- cut here -- dev.bge.1.%desc: Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x004101 dev.bge.1.%driver: bge dev.bge.1.%location: slot=0 function=0 dev.bge.1.%pnpinfo: vendor=0x14e4 device=0x1659 subvendor=0x1014 subdevice=0x02c6 class=0x020000 dev.bge.1.%parent: pci5 dev.bge.1.forced_collapse: 0 dev.bge.1.forced_udpcsum: 0 dev.bge.1.stats.FramesDroppedDueToFilters: 0 dev.bge.1.stats.DmaWriteQueueFull: 0 dev.bge.1.stats.DmaWriteHighPriQueueFull: 0 dev.bge.1.stats.NoMoreRxBDs: 243248 <- this dev.bge.1.stats.InputDiscards: 9945500 dev.bge.1.stats.InputErrors: 0 dev.bge.1.stats.RecvThresholdHit: 0 dev.bge.1.stats.rx.ifHCInOctets: 36697296701 dev.bge.1.stats.rx.Fragments: 0 dev.bge.1.stats.rx.UnicastPkts: 549334370 dev.bge.1.stats.rx.MulticastPkts: 113638 dev.bge.1.stats.rx.BroadcastPkts: 0 dev.bge.1.stats.rx.FCSErrors: 0 dev.bge.1.stats.rx.AlignmentErrors: 0 dev.bge.1.stats.rx.xonPauseFramesReceived: 0 dev.bge.1.stats.rx.xoffPauseFramesReceived: 0 dev.bge.1.stats.rx.ControlFramesReceived: 0 dev.bge.1.stats.rx.xoffStateEntered: 0 dev.bge.1.stats.rx.FramesTooLong: 0 dev.bge.1.stats.rx.Jabbers: 0 dev.bge.1.stats.rx.UndersizePkts: 0 dev.bge.1.stats.tx.ifHCOutOctets: 10578000636 dev.bge.1.stats.tx.Collisions: 0 dev.bge.1.stats.tx.XonSent: 0 dev.bge.1.stats.tx.XoffSent: 0 dev.bge.1.stats.tx.InternalMacTransmitErrors: 0 dev.bge.1.stats.tx.SingleCollisionFrames: 0 dev.bge.1.stats.tx.MultipleCollisionFrames: 0 dev.bge.1.stats.tx.DeferredTransmissions: 0 dev.bge.1.stats.tx.ExcessiveCollisions: 0 dev.bge.1.stats.tx.LateCollisions: 0 dev.bge.1.stats.tx.UnicastPkts: 64545266 dev.bge.1.stats.tx.MulticastPkts: 0 dev.bge.1.stats.tx.BroadcastPkts: 
313 and 0/1710531/2006005 requests for mbufs denied (mbufs/clusters/mbuf+clusters) -- and here -- I'll start gathering some stats/charts on this host to see if I can correlate the starvation with other system events. > However this does not explain why you have large number of mbuf > cluster allocation failure. The only wild guess I have at this > moment is some process or kernel subsystems are too slow to release > allocated mbuf clusters. Did you check various system activities > while seeing the issue? > -- Good, fast & cheap. Pick any two. From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 16:44:37 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx2.freebsd.org (mx2.freebsd.org [IPv6:2001:4f8:fff6::35]) by hub.freebsd.org (Postfix) with ESMTP id 7CCC8106564A for ; Wed, 30 Mar 2011 16:44:37 +0000 (UTC) (envelope-from dougb@FreeBSD.org) Received: from [127.0.0.1] (hub.freebsd.org [IPv6:2001:4f8:fff6::36]) by mx2.freebsd.org (Postfix) with ESMTP id 39941158A95; Wed, 30 Mar 2011 16:44:37 +0000 (UTC) Message-ID: <4D935DF6.90906@FreeBSD.org> Date: Wed, 30 Mar 2011 09:44:38 -0700 From: Doug Barton Organization: http://www.FreeBSD.org/ User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.15) Gecko/20110303 Thunderbird/3.1.9 MIME-Version: 1.0 To: Arnaud Lacombe References: <4D923931.2070606@zonov.org> <4D92BB71.5000900@FreeBSD.org> In-Reply-To: X-Enigmail-Version: 1.1.1 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org Subject: Re: igb(4) won't start with "igb0: Could not setup receive structures" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 16:44:37 -0000 On 3/30/2011 7:19 AM, Arnaud Lacombe wrote: > Hi, > > On Wed, Mar 30, 2011 at 1:11 AM, Doug Barton wrote: > The only things 
I've been able to get from Jack is "We, at Intel, test
> em(4) at 256k nmbclusters. We do not have problem. If you have
> problem, raise nmbcluster.". 256k nmbcluster in my environment is not
> acceptable.
>
>> Meanwhile, there are times where memory IS a constraint, and there are some
>> things you can't do without more of it.
>>
> yes, but the driver should not need a manual reset between the time
> resource are (heavily) scarce and the time it became available again.

If you're facing that situation then obviously your system is
constrained by hardware. It sounds like you have 3 choices:

1. Add more RAM
2. Use a different NIC
3. Set MTU lower

I'm sorry to say that just because the software is free doesn't mean
that we can guarantee that it will work on all hardware. Sometimes the
physical limits of the hardware are what need to be changed.

Good luck,

Doug

--
Nothin' ever doesn't change, but nothin' changes much. -- OK Go

Breadth of IT experience, and depth of knowledge in the DNS.
Yours for the right price.
:) http://SupersetSolutions.com/ From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 17:06:57 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EEEC11065672; Wed, 30 Mar 2011 17:06:57 +0000 (UTC) (envelope-from lacombar@gmail.com) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id 98FB38FC16; Wed, 30 Mar 2011 17:06:57 +0000 (UTC) Received: by iwn33 with SMTP id 33so1810252iwn.13 for ; Wed, 30 Mar 2011 10:06:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=VQ/D77275z+OmPcS/oezEH2bQNo+mKvWeUluwYK0sDA=; b=HyfPSY1JQ2eKIBvecrmAulN1W/CSlhFkTmwZhyHc5FQOWvR9R3JBoNHtRvekcPePlD 29RhWOLfoZwGxm5Vj7NbzmvK3vz2mwT63KuXqUT+1VqVUBXExHgrz5rPUbr+y7s5CSUI qhw1/5h+p4tMOgmsY6Jw4Jom03eRIoYdCRk90= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=gMA8Se5RxiRri7/arZSDqqMReJDKkJLXb3hQHUSuO1WxVZjk2pEmDKHWg6mklAoi3a cFH1whA/JXiOKYaU2rzkZj2379/7V0eYuN/+jKEOMuIi1zJFnKCShU6sB8uPJXtrPLE4 jgx5qpBcVRVGDl3pzE4PJn6xRX5ySW8ts5pi4= MIME-Version: 1.0 Received: by 10.43.60.71 with SMTP id wr7mr1367279icb.148.1301504817002; Wed, 30 Mar 2011 10:06:57 -0700 (PDT) Received: by 10.42.146.72 with HTTP; Wed, 30 Mar 2011 10:06:56 -0700 (PDT) In-Reply-To: <4D935DF6.90906@FreeBSD.org> References: <4D923931.2070606@zonov.org> <4D92BB71.5000900@FreeBSD.org> <4D935DF6.90906@FreeBSD.org> Date: Wed, 30 Mar 2011 13:06:56 -0400 Message-ID: From: Arnaud Lacombe To: Doug Barton Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-net@freebsd.org Subject: Re: igb(4) won't start with 
"igb0: Could not setup receive structures" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 17:06:58 -0000 Hi, On Wed, Mar 30, 2011 at 12:44 PM, Doug Barton wrote: > On 3/30/2011 7:19 AM, Arnaud Lacombe wrote: >> >> Hi, >> >> On Wed, Mar 30, 2011 at 1:11 AM, Doug Barton wrote: > >> The only things I've been able to get from Jack is "We, at Intel, test >> em(4) at 256k nmbclusters. We do not have problem. If you have >> problem, raise nmbcluster.". 256k nmbcluster in my environment is not >> acceptable. >> >>> Meanwhile, there are times where memory IS a constraint, and there are >>> some >>> things you can't do without more of it. >>> >> yes, but the driver should not need a manual reset between the time >> resource are (heavily) scarce and the time it became available again. > > If you're facing that situation then obviously your system is constrained by > hardware. No. We are taking about exceptional recoverable situation not handled by the software, it should not bring the complete system down. If you're swapping code has defect, you do not tell one to buy more RAM not to trigger the defective code, you fix the code. The situation is similar here.
- Arnaud From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 17:10:30 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx2.freebsd.org (mx2.freebsd.org [IPv6:2001:4f8:fff6::35]) by hub.freebsd.org (Postfix) with ESMTP id 1436A106566C for ; Wed, 30 Mar 2011 17:10:30 +0000 (UTC) (envelope-from dougb@FreeBSD.org) Received: from [127.0.0.1] (hub.freebsd.org [IPv6:2001:4f8:fff6::36]) by mx2.freebsd.org (Postfix) with ESMTP id E148E14EB07; Wed, 30 Mar 2011 17:10:29 +0000 (UTC) Message-ID: <4D936407.2030900@FreeBSD.org> Date: Wed, 30 Mar 2011 10:10:31 -0700 From: Doug Barton Organization: http://www.FreeBSD.org/ User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.15) Gecko/20110303 Thunderbird/3.1.9 MIME-Version: 1.0 To: Arnaud Lacombe References: <4D923931.2070606@zonov.org> <4D92BB71.5000900@FreeBSD.org> <4D935DF6.90906@FreeBSD.org> In-Reply-To: X-Enigmail-Version: 1.1.1 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org Subject: Re: igb(4) won't start with "igb0: Could not setup receive structures" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 17:10:30 -0000 On 3/30/2011 10:06 AM, Arnaud Lacombe wrote: > No. We are taking about exceptional recoverable situation not handled > by the software, it should not bring the complete system down. If > you're swapping code has defect, you do not tell one to buy more RAM > not to trigger the defective code, you fix the code. The situation is > similar here. I understand that you believe the situations to be similar, however you may make your point more clearly if you share the patches you've developed with the list so that others can review/comment/etc. There is no need to cc me on further related messages. 
Doug -- Nothin' ever doesn't change, but nothin' changes much. -- OK Go Breadth of IT experience, and depth of knowledge in the DNS. Yours for the right price. :) http://SupersetSolutions.com/ From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 17:11:39 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8CB5D106564A for ; Wed, 30 Mar 2011 17:11:39 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-pz0-f54.google.com (mail-pz0-f54.google.com [209.85.210.54]) by mx1.freebsd.org (Postfix) with ESMTP id 56FCD8FC0C for ; Wed, 30 Mar 2011 17:11:39 +0000 (UTC) Received: by pzk27 with SMTP id 27so283789pzk.13 for ; Wed, 30 Mar 2011 10:11:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:from:date:to:cc:subject:message-id:reply-to :references:mime-version:content-type:content-disposition :in-reply-to:user-agent; bh=Pc5PqTEBJqGtYS3ECTMoN+rzf8G8otcVMLSDmzq98pQ=; b=RuJlJXeoANVFb3rMxNuIc187176FbLUX5aotf1mLADeUjnKlDwwweHmyVknYCJaSwj 6aDO9ahnp/giZVd2eNIeZnzQnbkAq3x6PE+apD97PbVIDFuxRzDomul3Pju64WeSr6Br NFrirvknmS2id+fyLtWDhVNF6JfilWnUTdaEU= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:date:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; b=bKoL8if5JHvFeVeZQiLUBR+aSmqGxWXmNiXAQ1TXMBHUpNKa5f0uNDphDteF7F4D/F HSS9vS9+Q0l7Iu1uh7ZuBW4RtcaK3bq3ys64LTbnMgddqAtSy8Pde4nUqYkPf1M3hN45 7W9IKbCTCq717KOI4+4LQPLmc7+rv4D2sePf8= Received: by 10.142.133.17 with SMTP id g17mr1080399wfd.62.1301505098706; Wed, 30 Mar 2011 10:11:38 -0700 (PDT) Received: from pyunyh@gmail.com ([174.35.1.224]) by mx.google.com with ESMTPS id x11sm323011wfd.13.2011.03.30.10.11.34 (version=TLSv1/SSLv3 cipher=OTHER); Wed, 30 Mar 2011 10:11:36 -0700 (PDT) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Wed, 30 Mar 2011 10:10:23 -0700 
From: YongHyeon PYUN Date: Wed, 30 Mar 2011 10:10:23 -0700 To: Vlad Galu Message-ID: <20110330171023.GA8601@michelle.cdnetworks.com> References: <20110313011632.GA1621@michelle.cdnetworks.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i Cc: freebsd-net@freebsd.org, Arnaud Lacombe Subject: Re: bge(4) on RELENG_8 mbuf cluster starvation X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 17:11:39 -0000 On Wed, Mar 30, 2011 at 05:55:47PM +0200, Vlad Galu wrote: > On Sun, Mar 13, 2011 at 2:16 AM, YongHyeon PYUN wrote: > > > On Sat, Mar 12, 2011 at 09:17:28PM +0100, Vlad Galu wrote: > > > On Sat, Mar 12, 2011 at 8:53 PM, Arnaud Lacombe > > wrote: > > > > > > > Hi, > > > > > > > > On Sat, Mar 12, 2011 at 4:03 AM, Vlad Galu wrote: > > > > > Hi folks, > > > > > > > > > > On a fairly busy recent (r219010) RELENG_8 machine I keep getting > > > > > -- cut here -- > > > > > 1096/1454/2550 mbufs in use (current/cache/total) > > > > > 1035/731/1766/262144 mbuf clusters in use (current/cache/total/max) > > > > > 1035/202 mbuf+clusters out of packet secondary zone in use > > > > (current/cache) > > > > > 0/117/117/12800 4k (page size) jumbo clusters in use > > > > > (current/cache/total/max) > > > > > 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max) > > > > > 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max) > > > > > 2344K/2293K/4637K bytes allocated to network (current/cache/total) > > > > > 0/70128196/37726935 requests for mbufs denied > > > > (mbufs/clusters/mbuf+clusters) > > > > > ^^^^^^^^^^^^^^^^^^^^^ > > > > > -- and here -- > > > > > > > > > > kern.ipc.nmbclusters is set to 131072. Other settings: > > > > no, netstat(8) says 262144. 
> > > > > > > > > > > Heh, you're right, I forgot I'd doubled it a while ago. Wrote that from > > the > > > top of my head. > > > > > > > > > > Maybe can you include $(sysctl dev.bge) ? Might be useful. > > > > > > > > - Arnaud > > > > > > > > > > Sure: > > > > [...] > > > > > dev.bge.1.%desc: Broadcom NetXtreme Gigabit Ethernet Controller, ASIC > > rev. > > > 0x004101 > > > dev.bge.1.%driver: bge > > > dev.bge.1.%location: slot=0 function=0 > > > dev.bge.1.%pnpinfo: vendor=0x14e4 device=0x1659 subvendor=0x1014 > > > subdevice=0x02c6 class=0x020000 > > > dev.bge.1.%parent: pci5 > > > dev.bge.1.forced_collapse: 2 > > > dev.bge.1.forced_udpcsum: 0 > > > dev.bge.1.stats.FramesDroppedDueToFilters: 0 > > > dev.bge.1.stats.DmaWriteQueueFull: 0 > > > dev.bge.1.stats.DmaWriteHighPriQueueFull: 0 > > > dev.bge.1.stats.NoMoreRxBDs: 680050 > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > This indicates bge(4) encountered RX buffer shortage. Perhaps > > bge(4) couldn't fill new RX buffers for incoming frames due to > > other system activities. > > > > > dev.bge.1.stats.InputDiscards: 228755931 > > > > This counter indicates number of frames discarded due to RX buffer > > shortage. bge(4) discards received frame if it failed to allocate > > new RX buffer such that InputDiscards is normally higher than > > NoMoreRxBDs. > > > > > dev.bge.1.stats.InputErrors: 49080818 > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > Something is wrong here. Too many frames were classified as error > > frames. You may see poor RX performance. > > > > > dev.bge.1.stats.RecvThresholdHit: 0 > > > dev.bge.1.stats.rx.ifHCInOctets: 2095148839247 > > > dev.bge.1.stats.rx.Fragments: 47887706 > > > dev.bge.1.stats.rx.UnicastPkts: 32672557601 > > > dev.bge.1.stats.rx.MulticastPkts: 1218 > > > dev.bge.1.stats.rx.BroadcastPkts: 2 > > > dev.bge.1.stats.rx.FCSErrors: 2822217 > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > FCS errors are too high. 
Please check cabling again(I'm assuming > > the controller is not broken here). I think you can use vendor's > > diagnostic tools to verify this. > > > > > dev.bge.1.stats.rx.AlignmentErrors: 0 > > > dev.bge.1.stats.rx.xonPauseFramesReceived: 0 > > > dev.bge.1.stats.rx.xoffPauseFramesReceived: 0 > > > dev.bge.1.stats.rx.ControlFramesReceived: 0 > > > dev.bge.1.stats.rx.xoffStateEntered: 0 > > > dev.bge.1.stats.rx.FramesTooLong: 0 > > > dev.bge.1.stats.rx.Jabbers: 0 > > > dev.bge.1.stats.rx.UndersizePkts: 0 > > > dev.bge.1.stats.tx.ifHCOutOctets: 48751515826 > > > dev.bge.1.stats.tx.Collisions: 0 > > > dev.bge.1.stats.tx.XonSent: 0 > > > dev.bge.1.stats.tx.XoffSent: 0 > > > dev.bge.1.stats.tx.InternalMacTransmitErrors: 0 > > > dev.bge.1.stats.tx.SingleCollisionFrames: 0 > > > dev.bge.1.stats.tx.MultipleCollisionFrames: 0 > > > dev.bge.1.stats.tx.DeferredTransmissions: 0 > > > dev.bge.1.stats.tx.ExcessiveCollisions: 0 > > > dev.bge.1.stats.tx.LateCollisions: 0 > > > dev.bge.1.stats.tx.UnicastPkts: 281039183 > > > dev.bge.1.stats.tx.MulticastPkts: 0 > > > dev.bge.1.stats.tx.BroadcastPkts: 1153 > > > -- and here -- > > > > > > And now, that I remembered about this as well: > > > -- cut here -- > > > Name Mtu Network Address Ipkts Ierrs Idrop Opkts > > > Oerrs Coll > > > bge1 1500 00:11:25:22:0d:ed 32321767025 278517070 > > 37726837 > > > 281068216 0 0 > > > -- and here -- > > > The colo provider changed my cable a couple of times so I'd not blame it > > on > > > that. Unfortunately, I don't have access to the port statistics on the > > > switch. Running netstat with -w1 yields between 0 and 4 errors/second. > > > > > > > Hardware MAC counters still show high number of FCS errors. The > > service provider should have to check possible cabling issues on > > the port of the switch. > > > > After swapping cables and moving the NIC into another switch, there are some > improvements. 
However: > -- cut here -- > dev.bge.1.%desc: Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. > 0x004101 > dev.bge.1.%driver: bge > dev.bge.1.%location: slot=0 function=0 > dev.bge.1.%pnpinfo: vendor=0x14e4 device=0x1659 subvendor=0x1014 > subdevice=0x02c6 class=0x020000 > dev.bge.1.%parent: pci5 > dev.bge.1.forced_collapse: 0 > dev.bge.1.forced_udpcsum: 0 > dev.bge.1.stats.FramesDroppedDueToFilters: 0 > dev.bge.1.stats.DmaWriteQueueFull: 0 > dev.bge.1.stats.DmaWriteHighPriQueueFull: 0 > dev.bge.1.stats.NoMoreRxBDs: 243248 <- this > dev.bge.1.stats.InputDiscards: 9945500 > dev.bge.1.stats.InputErrors: 0 There are still discarded frames but I believe it's not related with any cabling issues since you don't have FCS or alignment errors. > dev.bge.1.stats.RecvThresholdHit: 0 > dev.bge.1.stats.rx.ifHCInOctets: 36697296701 > dev.bge.1.stats.rx.Fragments: 0 > dev.bge.1.stats.rx.UnicastPkts: 549334370 > dev.bge.1.stats.rx.MulticastPkts: 113638 > dev.bge.1.stats.rx.BroadcastPkts: 0 > dev.bge.1.stats.rx.FCSErrors: 0 > dev.bge.1.stats.rx.AlignmentErrors: 0 > dev.bge.1.stats.rx.xonPauseFramesReceived: 0 > dev.bge.1.stats.rx.xoffPauseFramesReceived: 0 > dev.bge.1.stats.rx.ControlFramesReceived: 0 > dev.bge.1.stats.rx.xoffStateEntered: 0 > dev.bge.1.stats.rx.FramesTooLong: 0 > dev.bge.1.stats.rx.Jabbers: 0 > dev.bge.1.stats.rx.UndersizePkts: 0 > dev.bge.1.stats.tx.ifHCOutOctets: 10578000636 > dev.bge.1.stats.tx.Collisions: 0 > dev.bge.1.stats.tx.XonSent: 0 > dev.bge.1.stats.tx.XoffSent: 0 > dev.bge.1.stats.tx.InternalMacTransmitErrors: 0 > dev.bge.1.stats.tx.SingleCollisionFrames: 0 > dev.bge.1.stats.tx.MultipleCollisionFrames: 0 > dev.bge.1.stats.tx.DeferredTransmissions: 0 > dev.bge.1.stats.tx.ExcessiveCollisions: 0 > dev.bge.1.stats.tx.LateCollisions: 0 > dev.bge.1.stats.tx.UnicastPkts: 64545266 > dev.bge.1.stats.tx.MulticastPkts: 0 > dev.bge.1.stats.tx.BroadcastPkts: 313 > > and > 0/1710531/2006005 requests for mbufs denied (mbufs/clusters/mbuf+clusters) > -- and 
here -- > > I'll start gathering some stats/charts on this host to see if I can > correlate the starvation with other system events. > Now MAC statistics counter show no abnormal things which in turn indicates the mbuf starvation came from other issues. The next thing is to identify which process or kernel subsystem consumes a lot of mbuf clusters. > > > > However this does not explain why you have large number of mbuf > > cluster allocation failure. The only wild guess I have at this > > moment is some process or kernel subsystems are too slow to release > > allocated mbuf clusters. Did you check various system activities > > while seeing the issue? > > From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 17:12:50 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A499F10656E3 for ; Wed, 30 Mar 2011 17:12:50 +0000 (UTC) (envelope-from jfvogel@gmail.com) Received: from mail-vx0-f182.google.com (mail-vx0-f182.google.com [209.85.220.182]) by mx1.freebsd.org (Postfix) with ESMTP id 4FF6C8FC29 for ; Wed, 30 Mar 2011 17:12:49 +0000 (UTC) Received: by vxc34 with SMTP id 34so1451110vxc.13 for ; Wed, 30 Mar 2011 10:12:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=XpUEaOUJCDYXjwVFruPBv7V7PE0aZrrsFbmdGcd/oBM=; b=cKhOsGpUD2T552kFW9wT5HkJ5X18m8wOuki2VAU4dLLyUmLaKb3nkHgdAIPnB5S6/n tCA1fj+5O8Ai/w9OpM+Wn/gCA0Rbepzh9E0wG3cXxJEUgQqFC6Z0e90eGeC/uDaFZOE4 XBvFrtihD+S/hRPkhW0MWyn5t2a3fQYwYBMso= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=JJjUoDks32pG8yvM0mTQJnkMkhcbm2Dj2UQxE3IwgyDhytOeAJ16mS1w4JMFr1Cbay qRExA7CwpjRAhJ2GLSKMInQegeYUT8wdSvZ54Zj/UCwecEJl+ZgUpBcmmBDAQ+iL6bNh 4PSXxibnAquxlTxglApQGLKY456mAuv3bG3QI= 
MIME-Version: 1.0 Received: by 10.52.93.177 with SMTP id cv17mr2018084vdb.133.1301505169429; Wed, 30 Mar 2011 10:12:49 -0700 (PDT) Received: by 10.52.167.6 with HTTP; Wed, 30 Mar 2011 10:12:49 -0700 (PDT) In-Reply-To: References: <4D923931.2070606@zonov.org> <4D92BB71.5000900@FreeBSD.org> <4D935DF6.90906@FreeBSD.org> Date: Wed, 30 Mar 2011 10:12:49 -0700 Message-ID: From: Jack Vogel To: Arnaud Lacombe Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-net@freebsd.org, Doug Barton Subject: Re: igb(4) won't start with "igb0: Could not setup receive structures" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 17:12:50 -0000 On Wed, Mar 30, 2011 at 10:06 AM, Arnaud Lacombe wrote: > Hi, > > On Wed, Mar 30, 2011 at 12:44 PM, Doug Barton wrote: > > On 3/30/2011 7:19 AM, Arnaud Lacombe wrote: > >> > >> Hi, > >> > >> On Wed, Mar 30, 2011 at 1:11 AM, Doug Barton wrote: > > > >> The only things I've been able to get from Jack is "We, at Intel, test > >> em(4) at 256k nmbclusters. We do not have problem. If you have > >> problem, raise nmbcluster.". 256k nmbcluster in my environment is not > >> acceptable. > >> > >>> Meanwhile, there are times where memory IS a constraint, and there are > >>> some > >>> things you can't do without more of it. > >>> > >> yes, but the driver should not need a manual reset between the time > >> resource are (heavily) scarce and the time it became available again. > > > > If you're facing that situation then obviously your system is constrained > by > > hardware. > No. We are taking about exceptional recoverable situation not handled > by the software, it should not bring the complete system down. 
If > you're swapping code has defect, you do not tell one to buy more RAM > not to trigger the defective code, you fix the code. The situation is > similar here. > > The code that got put in the driver has a response to this "unrecoverable situation", you've flamed me and the code, but you've not demonstrated it does not work. Both Beezar and myself have tried to have a civil discussion over the matter and you just have gotten rude. As demonstrated in this email thread. I don't know about you, but I have feelings, and you've been insensitive to them. So quote chapters and verses all you like, I'm DONE with this. Jack From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 17:23:03 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 798B51065670 for ; Wed, 30 Mar 2011 17:23:03 +0000 (UTC) (envelope-from dudu@dudu.ro) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id 4457B8FC13 for ; Wed, 30 Mar 2011 17:23:03 +0000 (UTC) Received: by iwn33 with SMTP id 33so1827508iwn.13 for ; Wed, 30 Mar 2011 10:23:02 -0700 (PDT) Received: by 10.43.59.13 with SMTP id wm13mr1394679icb.416.1301505782549; Wed, 30 Mar 2011 10:23:02 -0700 (PDT) MIME-Version: 1.0 Received: by 10.42.3.13 with HTTP; Wed, 30 Mar 2011 10:17:21 -0700 (PDT) In-Reply-To: <20110330171023.GA8601@michelle.cdnetworks.com> References: <20110313011632.GA1621@michelle.cdnetworks.com> <20110330171023.GA8601@michelle.cdnetworks.com> From: Vlad Galu Date: Wed, 30 Mar 2011 19:17:21 +0200 Message-ID: To: pyunyh@gmail.com Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-net@freebsd.org, Arnaud Lacombe Subject: Re: bge(4) on RELENG_8 mbuf cluster starvation X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 17:23:03 -0000 On Wed, Mar 30, 2011 at 7:10 PM, YongHyeon PYUN wrote: > On Wed, Mar 30, 2011 at 05:55:47PM +0200, Vlad Galu wrote: > > On Sun, Mar 13, 2011 at 2:16 AM, YongHyeon PYUN > wrote: > > > > > On Sat, Mar 12, 2011 at 09:17:28PM +0100, Vlad Galu wrote: > > > > On Sat, Mar 12, 2011 at 8:53 PM, Arnaud Lacombe > > > wrote: > > > > > > > > > Hi, > > > > > > > > > > On Sat, Mar 12, 2011 at 4:03 AM, Vlad Galu wrote: > > > > > > Hi folks, > > > > > > > > > > > > On a fairly busy recent (r219010) RELENG_8 machine I keep getting > > > > > > -- cut here -- > > > > > > 1096/1454/2550 mbufs in use (current/cache/total) > > > > > > 1035/731/1766/262144 mbuf clusters in use > (current/cache/total/max) > > > > > > 1035/202 mbuf+clusters out of packet secondary zone in use > > > > > (current/cache) > > > > > > 0/117/117/12800 4k (page size) jumbo clusters in use > > > > > > (current/cache/total/max) > > > > > > 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max) > > > > > > 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max) > > > > > > 2344K/2293K/4637K bytes allocated to network > (current/cache/total) > > > > > > 0/70128196/37726935 requests for mbufs denied > > > > > (mbufs/clusters/mbuf+clusters) > > > > > > ^^^^^^^^^^^^^^^^^^^^^ > > > > > > -- and here -- > > > > > > > > > > > > kern.ipc.nmbclusters is set to 131072. Other settings: > > > > > no, netstat(8) says 262144. > > > > > > > > > > > > > > Heh, you're right, I forgot I'd doubled it a while ago. Wrote that > from > > > the > > > > top of my head. > > > > > > > > > > > > > Maybe can you include $(sysctl dev.bge) ? Might be useful. > > > > > > > > > > - Arnaud > > > > > > > > > > > > > Sure: > > > > > > [...] > > > > > > > dev.bge.1.%desc: Broadcom NetXtreme Gigabit Ethernet Controller, ASIC > > > rev. 
> > > > 0x004101 > > > > dev.bge.1.%driver: bge > > > > dev.bge.1.%location: slot=0 function=0 > > > > dev.bge.1.%pnpinfo: vendor=0x14e4 device=0x1659 subvendor=0x1014 > > > > subdevice=0x02c6 class=0x020000 > > > > dev.bge.1.%parent: pci5 > > > > dev.bge.1.forced_collapse: 2 > > > > dev.bge.1.forced_udpcsum: 0 > > > > dev.bge.1.stats.FramesDroppedDueToFilters: 0 > > > > dev.bge.1.stats.DmaWriteQueueFull: 0 > > > > dev.bge.1.stats.DmaWriteHighPriQueueFull: 0 > > > > dev.bge.1.stats.NoMoreRxBDs: 680050 > > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > > This indicates bge(4) encountered RX buffer shortage. Perhaps > > > bge(4) couldn't fill new RX buffers for incoming frames due to > > > other system activities. > > > > > > > dev.bge.1.stats.InputDiscards: 228755931 > > > > > > This counter indicates number of frames discarded due to RX buffer > > > shortage. bge(4) discards received frame if it failed to allocate > > > new RX buffer such that InputDiscards is normally higher than > > > NoMoreRxBDs. > > > > > > > dev.bge.1.stats.InputErrors: 49080818 > > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > > Something is wrong here. Too many frames were classified as error > > > frames. You may see poor RX performance. > > > > > > > dev.bge.1.stats.RecvThresholdHit: 0 > > > > dev.bge.1.stats.rx.ifHCInOctets: 2095148839247 > > > > dev.bge.1.stats.rx.Fragments: 47887706 > > > > dev.bge.1.stats.rx.UnicastPkts: 32672557601 > > > > dev.bge.1.stats.rx.MulticastPkts: 1218 > > > > dev.bge.1.stats.rx.BroadcastPkts: 2 > > > > dev.bge.1.stats.rx.FCSErrors: 2822217 > > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > > FCS errors are too high. Please check cabling again(I'm assuming > > > the controller is not broken here). I think you can use vendor's > > > diagnostic tools to verify this. 
> > > > > > > dev.bge.1.stats.rx.AlignmentErrors: 0 > > > > dev.bge.1.stats.rx.xonPauseFramesReceived: 0 > > > > dev.bge.1.stats.rx.xoffPauseFramesReceived: 0 > > > > dev.bge.1.stats.rx.ControlFramesReceived: 0 > > > > dev.bge.1.stats.rx.xoffStateEntered: 0 > > > > dev.bge.1.stats.rx.FramesTooLong: 0 > > > > dev.bge.1.stats.rx.Jabbers: 0 > > > > dev.bge.1.stats.rx.UndersizePkts: 0 > > > > dev.bge.1.stats.tx.ifHCOutOctets: 48751515826 > > > > dev.bge.1.stats.tx.Collisions: 0 > > > > dev.bge.1.stats.tx.XonSent: 0 > > > > dev.bge.1.stats.tx.XoffSent: 0 > > > > dev.bge.1.stats.tx.InternalMacTransmitErrors: 0 > > > > dev.bge.1.stats.tx.SingleCollisionFrames: 0 > > > > dev.bge.1.stats.tx.MultipleCollisionFrames: 0 > > > > dev.bge.1.stats.tx.DeferredTransmissions: 0 > > > > dev.bge.1.stats.tx.ExcessiveCollisions: 0 > > > > dev.bge.1.stats.tx.LateCollisions: 0 > > > > dev.bge.1.stats.tx.UnicastPkts: 281039183 > > > > dev.bge.1.stats.tx.MulticastPkts: 0 > > > > dev.bge.1.stats.tx.BroadcastPkts: 1153 > > > > -- and here -- > > > > > > > > And now, that I remembered about this as well: > > > > -- cut here -- > > > > Name Mtu Network Address Ipkts Ierrs Idrop > Opkts > > > > Oerrs Coll > > > > bge1 1500 00:11:25:22:0d:ed 32321767025 278517070 > > > 37726837 > > > > 281068216 0 0 > > > > -- and here -- > > > > The colo provider changed my cable a couple of times so I'd not blame > it > > > on > > > > that. Unfortunately, I don't have access to the port statistics on > the > > > > switch. Running netstat with -w1 yields between 0 and 4 > errors/second. > > > > > > > > > > Hardware MAC counters still show high number of FCS errors. The > > > service provider should have to check possible cabling issues on > > > the port of the switch. > > > > > > > After swapping cables and moving the NIC into another switch, there are > some > > improvements. However: > > -- cut here -- > > dev.bge.1.%desc: Broadcom NetXtreme Gigabit Ethernet Controller, ASIC > rev. 
> > 0x004101 > > dev.bge.1.%driver: bge > > dev.bge.1.%location: slot=0 function=0 > > dev.bge.1.%pnpinfo: vendor=0x14e4 device=0x1659 subvendor=0x1014 > > subdevice=0x02c6 class=0x020000 > > dev.bge.1.%parent: pci5 > > dev.bge.1.forced_collapse: 0 > > dev.bge.1.forced_udpcsum: 0 > > dev.bge.1.stats.FramesDroppedDueToFilters: 0 > > dev.bge.1.stats.DmaWriteQueueFull: 0 > > dev.bge.1.stats.DmaWriteHighPriQueueFull: 0 > > dev.bge.1.stats.NoMoreRxBDs: 243248 <- this > > dev.bge.1.stats.InputDiscards: 9945500 > > dev.bge.1.stats.InputErrors: 0 > > There are still discarded frames but I believe it's not related > with any cabling issues since you don't have FCS or alignment > errors. > > > dev.bge.1.stats.RecvThresholdHit: 0 > > dev.bge.1.stats.rx.ifHCInOctets: 36697296701 > > dev.bge.1.stats.rx.Fragments: 0 > > dev.bge.1.stats.rx.UnicastPkts: 549334370 > > dev.bge.1.stats.rx.MulticastPkts: 113638 > > dev.bge.1.stats.rx.BroadcastPkts: 0 > > dev.bge.1.stats.rx.FCSErrors: 0 > > dev.bge.1.stats.rx.AlignmentErrors: 0 > > dev.bge.1.stats.rx.xonPauseFramesReceived: 0 > > dev.bge.1.stats.rx.xoffPauseFramesReceived: 0 > > dev.bge.1.stats.rx.ControlFramesReceived: 0 > > dev.bge.1.stats.rx.xoffStateEntered: 0 > > dev.bge.1.stats.rx.FramesTooLong: 0 > > dev.bge.1.stats.rx.Jabbers: 0 > > dev.bge.1.stats.rx.UndersizePkts: 0 > > dev.bge.1.stats.tx.ifHCOutOctets: 10578000636 > > dev.bge.1.stats.tx.Collisions: 0 > > dev.bge.1.stats.tx.XonSent: 0 > > dev.bge.1.stats.tx.XoffSent: 0 > > dev.bge.1.stats.tx.InternalMacTransmitErrors: 0 > > dev.bge.1.stats.tx.SingleCollisionFrames: 0 > > dev.bge.1.stats.tx.MultipleCollisionFrames: 0 > > dev.bge.1.stats.tx.DeferredTransmissions: 0 > > dev.bge.1.stats.tx.ExcessiveCollisions: 0 > > dev.bge.1.stats.tx.LateCollisions: 0 > > dev.bge.1.stats.tx.UnicastPkts: 64545266 > > dev.bge.1.stats.tx.MulticastPkts: 0 > > dev.bge.1.stats.tx.BroadcastPkts: 313 > > > > and > > 0/1710531/2006005 requests for mbufs denied > (mbufs/clusters/mbuf+clusters) > > -- and 
here -- > > > > I'll start gathering some stats/charts on this host to see if I can > > correlate the starvation with other system events. > > > > Now MAC statistics counter show no abnormal things which in turn > indicates the mbuf starvation came from other issues. The next > thing is to identify which process or kernel subsystem consumes a > lot of mbuf clusters. > > Thanks for the feedback. Oh, there is a BPF consumer listening on bge1. After noticing http://www.mail-archive.com/freebsd-net@freebsd.org/msg25685.html, I decided to shut it down for a while. It's pretty weird, my BPF buffer size is set to 4MB and traffic on that interface is nowhere near that high. I'll get back as soon as I have new data. > > > > > > > However this does not explain why you have large number of mbuf > > > cluster allocation failure. The only wild guess I have at this > > > moment is some process or kernel subsystems are too slow to release > > > allocated mbuf clusters. Did you check various system activities > > > while seeing the issue? > > > > -- Good, fast & cheap. Pick any two. 
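[Editorial sketch, not part of the original message.] The stats gathering mentioned in this message could start from something like the following. It is only a minimal sketch: it assumes the "requests for mbufs denied" summary line of netstat -m keeps the current/clusters/mbuf+clusters layout quoted in this thread, and the helper names parse_denied and denied_delta are hypothetical, not taken from any FreeBSD tool.

```python
import re

# Matches the "denied" summary line as quoted in this thread, e.g.
# "0/1710531/2006005 requests for mbufs denied (mbufs/clusters/mbuf+clusters)"
DENIED_RE = re.compile(r"(\d+)/(\d+)/(\d+) requests for mbufs denied")


def parse_denied(netstat_m_output):
    """Return the three denied-request counters as a dict, or None if absent."""
    match = DENIED_RE.search(netstat_m_output)
    if match is None:
        return None
    mbufs, clusters, mbuf_clusters = (int(group) for group in match.groups())
    return {"mbufs": mbufs, "clusters": clusters, "mbuf+clusters": mbuf_clusters}


def denied_delta(previous, current):
    """Per-counter increase between two samples (hypothetical helper)."""
    return {key: current[key] - previous[key] for key in current}
```

Feeding this the text of netstat -m captured at a fixed interval and logging the deltas with a timestamp would give a series that can be lined up against other system events. A negative delta most likely means the counters were reset, for example by a reboot, so such samples should be treated as a new baseline rather than as data.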
From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 17:33:01 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4FB9B106566C; Wed, 30 Mar 2011 17:33:01 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-iy0-f182.google.com (mail-iy0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id F3CCC8FC15; Wed, 30 Mar 2011 17:33:00 +0000 (UTC) Received: by iyj12 with SMTP id 12so1876848iyj.13 for ; Wed, 30 Mar 2011 10:33:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:from:date:to:cc:subject:message-id:reply-to :references:mime-version:content-type:content-disposition :in-reply-to:user-agent; bh=sM9Njcqgcu3I00sSdD0t8V4+c+hxhVTViaGo+Y5feQ0=; b=n5O655Q6MYMMElLHZSdG1dwqAVlQq5k7pYlYp3h/SmVHRIhgsA3lUX9bFCMGVfh97K ZP/V34y27U1tuQUgIetNtfCjZ6W+aE1DZvYjYTuZ26hDAU1DjuH5pHsnE7UjaKgeoPX/ LHaXszbcQg5feIAmY5w/myYqYx9VqXuaGob64= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:date:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; b=krLnc1AvSiN6C7ehqpzU6zzunsPHXsCU1enwGsE+Eax7lCRz/rkN1oHFk/1WMLj4Ww GKFKcu54RwLwLnCQWq5qHrfepFOhOYyAUao/JBSN1jw6prTKnedE3rgwkwNCFGchenjl 5xKemG3B02B4rFmfovXJpevNHESGlLCVy+TWw= Received: by 10.43.64.9 with SMTP id xg9mr1445206icb.102.1301506380429; Wed, 30 Mar 2011 10:33:00 -0700 (PDT) Received: from pyunyh@gmail.com ([174.35.1.224]) by mx.google.com with ESMTPS id gy41sm161351ibb.22.2011.03.30.10.32.57 (version=TLSv1/SSLv3 cipher=OTHER); Wed, 30 Mar 2011 10:32:59 -0700 (PDT) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Wed, 30 Mar 2011 10:31:45 -0700 From: YongHyeon PYUN Date: Wed, 30 Mar 2011 10:31:45 -0700 To: Yamagi Burmeister Message-ID: <20110330173145.GB8601@michelle.cdnetworks.com> References: Mime-Version: 1.0 Content-Type: text/plain; 
charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i Cc: freebsd-net@freebsd.org, yongari@freebsd.org Subject: Re: Kernel memory corruption(?) with age(4) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 17:33:01 -0000 On Wed, Mar 30, 2011 at 04:22:23PM +0200, Yamagi Burmeister wrote: > Hi, > I recently got four about two years old Asus M3A-H/HDMI mainboards with > an integrated Attansic L2 ethernet controller. This NIC is supported by > age(4) and recognized by freebsd: > > ---- > > age0: > mem 0xfeac0000-0xfeafffff irq 18 at device 0.0 on pci2 > age0: 1280 Tx FIFO, 2364 Rx FIFO > age0: Using 1 MSI messages. > age0: 4GB boundary crossed, switching to 32bit DMA addressing mode. > miibus0: on age0 > atphy0: PHY 0 on miibus0 > atphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX, > 1000baseT-FDX-master, auto > age0: Ethernet address: 00:23:54:31:a0:12 > age0: [FILTER] > > ---- > > age0: flags=8843 metric 0 mtu 1500 > options=c319b WOL_MCAST,WOL_MAGIC,VLAN_HWTSO,LINKSTATE> > ether 00:23:54:31:a0:12 > inet6 fe80::223:54ff:fe31:a012%age0 prefixlen 64 scopeid 0x1 > nd6 options=3 > media: Ethernet autoselect (none) > status: no carrier > > ---- > > All for boxes are unstable if the Attansic NIC is in use, no one of them > survived more than 60 minutes of ~20mb/s network traffic. I managed to > get some coredumps and extracted the backtraces. Since everytime one of > the boxes paniced I got different panic message and a different backtrace > with a different subsystem involved I suspected broken hardware. I > plugged a em(4) NIC into the PCI slot and wasn't able to reproduce the > problem, in fact the boxes run rock solid for several days. 
Next I set > up a Windows 7, installed the Attansic vendor driver and did another > run. All went smooth, no crash for nearly 24 hours. > > My guess is kernel memory corruption by age(4), which would explain all > the different backtraces and the different panic messages. This problem > is reproducible in at least FreeBSD 7.4 and 8.2 and with TSO4 enabled > and disabled. I'm willing to debug this, but I really don't know how. So > any help or a pointer into the right direction would be appreciated. > AFAIK this is the first report for possible memory corruption triggered by age(4). I'm still not sure whether it's caused by age(4) but you can disable RX checksum offloading and see whether that makes any difference. Since I have no longer access to the hardware it would be even better if you can tell me which traffic pattern triggered the issue. From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 17:44:18 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4E4DC106566C; Wed, 30 Mar 2011 17:44:18 +0000 (UTC) (envelope-from lacombar@gmail.com) Received: from mail-iy0-f182.google.com (mail-iy0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id 047EC8FC14; Wed, 30 Mar 2011 17:44:17 +0000 (UTC) Received: by iyj12 with SMTP id 12so1889201iyj.13 for ; Wed, 30 Mar 2011 10:44:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=LF1Kst5MT8mW5EGJDYS/9A1mX+POyOnUlkjxNnMXed4=; b=fNG31NTZZ6u4/Z7WTvXA1MtYiERBjBZS6hTlOdl1rshG8ODXMIZWr+Wu7tH9Keg+FZ gQtm+8WVzLpm9MBwDU7mqfgn0jfDoubx9pbxQSMSLyt4TpvBSsmMD1pmHY1PAevmkLaF /lAxljzHi9HsaTtb0xC33757kZV/nHPzdqJ4c= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to 
:cc:content-type; b=gRo6jxdOg2zf9FRt5SUIKEZnf06ATxsQ7CgP1x445TlBc1cPnchjoVOsaNa98gyFsx jFGQWBfy5ZKc67mh/lTSYpF6ZARoTS5R2Mdnh2wMA2eDhJyWx584VhUYFmjFpAX4WGzK Pv338HRezyxexcoPNcuPhTymSZGTf3sJrdVMI= MIME-Version: 1.0 Received: by 10.42.159.197 with SMTP id m5mr1468846icx.81.1301507057461; Wed, 30 Mar 2011 10:44:17 -0700 (PDT) Received: by 10.42.146.72 with HTTP; Wed, 30 Mar 2011 10:44:17 -0700 (PDT) In-Reply-To: <4D936407.2030900@FreeBSD.org> References: <4D923931.2070606@zonov.org> <4D92BB71.5000900@FreeBSD.org> <4D935DF6.90906@FreeBSD.org> <4D936407.2030900@FreeBSD.org> Date: Wed, 30 Mar 2011 13:44:17 -0400 Message-ID: From: Arnaud Lacombe To: Doug Barton Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-net@freebsd.org Subject: Re: igb(4) won't start with "igb0: Could not setup receive structures" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 17:44:18 -0000 On Wed, Mar 30, 2011 at 1:10 PM, Doug Barton wrote: > On 3/30/2011 10:06 AM, Arnaud Lacombe wrote: >> >> No. We are taking about exceptional recoverable situation not handled >> by the software, it should not bring the complete system down. If >> you're swapping code has defect, you do not tell one to buy more RAM >> not to trigger the defective code, you fix the code. The situation is >> similar here. > > I understand that you believe the situations to be similar, however you may > make your point more clearly if you share the patches you've developed with > the list so that others can review/comment/etc. > The patch has been posted on the list, please search in the archives. 
- Arnaud From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 17:56:00 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4082D106566B; Wed, 30 Mar 2011 17:56:00 +0000 (UTC) (envelope-from lacombar@gmail.com) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id E3AD38FC1A; Wed, 30 Mar 2011 17:55:59 +0000 (UTC) Received: by iwn33 with SMTP id 33so1864862iwn.13 for ; Wed, 30 Mar 2011 10:55:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=XFBJyyHeHIRBlwRBBzwxnzeyc29dsVXy1XqtljVtP3E=; b=VO7SPa7KqdE+vcQFEO0OGIeLk0pskwVOG+bzv2fk9PyFXzVVC7SYAuYkmcMfZT/ynB 9GMYu5E1T3tntPcyxZ2nS4j3oSXkfCyLdZP7agykpQLYtAMMrm18BKRjxvEmdGp5kkp+ 1CgdfhZ1F20bpx3UdMdvmvPyo4ny7W2kE5ZH8= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=abxB0JUrN8/iHkMFck+n59GaYjZm+fWC+Kvd+kDMEhMaXbEWrfqMjGUrIjTuZcv8bd KpzFYUfRqFSLfyyF2zeq7iUvBsq4wtkUeExipNKNsTxiElcMVnOgkaXXVzLN6YzsFM8V tVf/J8pZakEjkrBLb3eL2eO1yH2saIkUuGvRo= MIME-Version: 1.0 Received: by 10.42.159.197 with SMTP id m5mr1487149icx.81.1301507759159; Wed, 30 Mar 2011 10:55:59 -0700 (PDT) Received: by 10.42.146.72 with HTTP; Wed, 30 Mar 2011 10:55:59 -0700 (PDT) In-Reply-To: References: <4D923931.2070606@zonov.org> <4D92BB71.5000900@FreeBSD.org> <4D935DF6.90906@FreeBSD.org> Date: Wed, 30 Mar 2011 13:55:59 -0400 Message-ID: From: Arnaud Lacombe To: Jack Vogel Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-net@freebsd.org, Doug Barton Subject: Re: igb(4) won't start with "igb0: Could not setup receive structures" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: 
Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 17:56:00 -0000 Hi, On Wed, Mar 30, 2011 at 1:12 PM, Jack Vogel wrote: > The code that got put in the driver has a response to this "unrecoverable > situation", you've flamed me and the code, but you've not demonstrated it > does not work. > I did, in "Message-ID: ". If you want to talk code, please tell me where I was wrong. To sum up, the current code relies on em_rxeof() to refresh mbufs. This path is triggered by an RX interrupt, which never happens if the RX ring is empty. Now the question I ask you is purely technical, not criticism of any kind: how do you refresh the mbuf ring if no RX interrupt is ever triggered because the card has no descriptors left at all in its ring? Regards, - Arnaud From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 17:59:58 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 84FBE106566B for ; Wed, 30 Mar 2011 17:59:58 +0000 (UTC) (envelope-from andrey@zonov.org) Received: from mail-fx0-f54.google.com (mail-fx0-f54.google.com [209.85.161.54]) by mx1.freebsd.org (Postfix) with ESMTP id 1CC0A8FC0C for ; Wed, 30 Mar 2011 17:59:57 +0000 (UTC) Received: by fxm11 with SMTP id 11so1610673fxm.13 for ; Wed, 30 Mar 2011 10:59:57 -0700 (PDT) Received: by 10.223.101.87 with SMTP id b23mr1668324fao.97.1301507996917; Wed, 30 Mar 2011 10:59:56 -0700 (PDT) Received: from [10.254.254.77] (ppp95-165-144-57.pppoe.spdop.ru [95.165.144.57]) by mx.google.com with ESMTPS id 17sm129402far.43.2011.03.30.10.59.54 (version=SSLv3 cipher=OTHER); Wed, 30 Mar 2011 10:59:55 -0700 (PDT) Message-ID: <4D936F99.3060508@zonov.org> Date: Wed, 30 Mar 2011 21:59:53 +0400 From: Andrey Zonov User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; ru; rv:1.8.1.24) Gecko/20100228 Thunderbird/2.0.0.24 Mnenhy/0.7.6.0
MIME-Version: 1.0 To: Arnaud Lacombe References: <4D923931.2070606@zonov.org> <4D92C673.2080107@zonov.org> In-Reply-To: Content-Type: text/plain; charset=KOI8-R; format=flowed Content-Transfer-Encoding: 8bit Cc: freebsd-net@freebsd.org, Jack Vogel Subject: Re: igb(4) won't start with "igb0: Could not setup receive structures" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 17:59:58 -0000 Hi, Maybe you're right. OK, let's return the default hw.igb.rxd to 256. It seems to be enough for stable operation, and the driver uses less memory. BTW, the igb(4) man page still says that hw.igb.rxd is 256 by default. -- Andrey Zonov 30.03.2011 18:33, Arnaud Lacombe wrote: > Hi, > > On Wed, Mar 30, 2011 at 1:58 AM, Andrey Zonov wrote: >> My point is if you're using machine with 8 CPUs than maxusers/clusters/9k >> mbufs should have been increased by system, because on this machine minimum >> 2Gb memory is available. >> > I am doubtful that the number of CPU[0] or number of users (yes, I > know `maxusers' is currently used to compute the default > `nmbcluster'...) can be linked to any network load pattern at all. You > can have a 24 CPU machine made for 4096 users with a single NIC, not > requiring much memory, while a 1 CPU machine with only 1 users can > have +8 NIC and require a huge quantity of memory. Available KVM space > should also be taken into account, as it is rather limited on i386. > > - Arnaud > > [0]: even more today where you can have a huge number of virtual CPU.
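To put rough numbers on the trade-off discussed in the message above (a larger hw.igb.rxd versus the memory the driver needs), here is a minimal back-of-the-envelope sketch. It is not taken from the igb(4) source. It assumes that every RX descriptor holds exactly one standard 2 KB mbuf cluster and that each queue has its own ring; the port and queue counts are hypothetical. With 9k jumbo clusters, as mentioned in the quoted text, the per-descriptor cost would be larger.

```python
# Rough estimate of the mbuf clusters pinned by receive rings.
# Assumptions (not from the thread): one standard 2 KB cluster per RX
# descriptor, one RX ring per queue. Port/queue counts are hypothetical.

CLUSTER_BYTES = 2048  # size of a standard mbuf cluster


def rx_clusters(ports: int, queues_per_port: int, rxd: int) -> int:
    """Number of clusters needed to fill every RX ring once."""
    return ports * queues_per_port * rxd


def rx_cluster_bytes(ports: int, queues_per_port: int, rxd: int) -> int:
    """Memory in bytes taken by those clusters."""
    return rx_clusters(ports, queues_per_port, rxd) * CLUSTER_BYTES


if __name__ == "__main__":
    # Compare a small and a large ring size on a hypothetical
    # 4-port adapter with 8 queues per port.
    for rxd in (256, 4096):
        clusters = rx_clusters(ports=4, queues_per_port=8, rxd=rxd)
        mib = rx_cluster_bytes(4, 8, rxd) / (1024 * 1024)
        print(f"rxd={rxd}: {clusters} clusters, {mib:.0f} MiB")
```

Comparing the cluster count against the system-wide cluster limit gives a quick feel for whether a given ring size can be satisfied at all on a given machine.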
From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 18:22:42 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 92E7B106564A; Wed, 30 Mar 2011 18:22:42 +0000 (UTC) (envelope-from jfvogel@gmail.com) Received: from mail-vw0-f54.google.com (mail-vw0-f54.google.com [209.85.212.54]) by mx1.freebsd.org (Postfix) with ESMTP id 33A698FC18; Wed, 30 Mar 2011 18:22:41 +0000 (UTC) Received: by vws18 with SMTP id 18so1519530vws.13 for ; Wed, 30 Mar 2011 11:22:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=c1rrRLbZVNDUKtpCNiG373zPicv82/19yY40fHosz0o=; b=bgGdhJ4K5t2em5xM8HmJGO06xJqWuBiD7cBNdjh3FQBe7F9L7Me8kaarhM2+tFL78+ vT7ZUPN7S5sREC7U1BGQyDvPymeXDIQ4rrEcSthkYDKWSK+LxqoL6LarNoJ+ezoHV7VE D4i98XLvwYEp4H7oBVbXAn6can95EqxvuJyUU= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=BQkVKPqVgkfLMEBeQfBS4XPXxxbm0Hc75Fk8z6kzAVxjqqLx8b6MGn9/NrbUXNXRU3 FvUPHdEyL0SnSYQWVgHYFwTYD6ZbwYU0zLVOS3nZC3D24g3QL9yFZYYHdY1qJ+tJlkZj 54dw6+YY8d7S4eMipBEzMxOXfDP6p6p8d1uAI= MIME-Version: 1.0 Received: by 10.52.92.161 with SMTP id cn1mr2083723vdb.253.1301509361566; Wed, 30 Mar 2011 11:22:41 -0700 (PDT) Received: by 10.52.167.6 with HTTP; Wed, 30 Mar 2011 11:22:41 -0700 (PDT) In-Reply-To: References: <4D923931.2070606@zonov.org> <4D92BB71.5000900@FreeBSD.org> <4D935DF6.90906@FreeBSD.org> Date: Wed, 30 Mar 2011 11:22:41 -0700 Message-ID: From: Jack Vogel To: Arnaud Lacombe Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-net@freebsd.org, Doug Barton Subject: Re: igb(4) won't start with "igb0: Could not setup receive structures" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 
Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 18:22:42 -0000 Read the code in HEAD, em_local_timer() has a test of ALL the rx queues and will schedule a task that refreshes mbufs if they are empty. This has exactly the same effect as checking for some interrupt cause, a cause that is not available when using MSIX on 82574, but this approach works for everything. Jack On Wed, Mar 30, 2011 at 10:55 AM, Arnaud Lacombe wrote: > Hi, > > On Wed, Mar 30, 2011 at 1:12 PM, Jack Vogel wrote: > > The code that got put in the driver has a response to this "unrecoverable > > situation", you've flamed me and the code, but you've not demonstrated it > > does not work. > > > I did, in "Message-ID: > ". If > you want to talk code, please tell me where I was wrong. > > To sum up, the current code relies on em_rxeof() to refresh mbufs. > This path is triggered on RX interrupt which never happen if the RX > ring is empty. Now the question I ask you is technical, no criticize > at all of any kind: how do you refresh the mbufs' ring if no RX > interrupt is ever triggered because the card has no descriptor left at > all in its ring ? 
> > Regards, > - Arnaud > From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 19:43:54 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 217341065670 for ; Wed, 30 Mar 2011 19:43:54 +0000 (UTC) (envelope-from ulsanrub@gmail.com) Received: from mail-wy0-f182.google.com (mail-wy0-f182.google.com [74.125.82.182]) by mx1.freebsd.org (Postfix) with ESMTP id AADD38FC12 for ; Wed, 30 Mar 2011 19:43:53 +0000 (UTC) Received: by wyf23 with SMTP id 23so1690266wyf.13 for ; Wed, 30 Mar 2011 12:43:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:date:message-id:subject:from:to :content-type; bh=jOwe+10ED86pByiSw0oxGgIOpIxzk/WRXiVlxCAp/5w=; b=vMx7ilbKAE8LetCo4SqJh9o2lG9WlUyRJC80bp7GhKPVbKGcrbNbJEkkezagljbJEQ C86wIrG26UdJ8Fo7USqQh0VqdATdCFzWodatdd3kX8/z3JuNtDS9TE3+lavyIjmlnE4w vdz+SMGoCOQurnCiQtnnBuDXqkBW+MVXX5o+g= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=B/quNJpuJNu3Gtv1+qXnHQtTQ7eFXfrdnacRfFQcAq5PNZySHtKr1kJqzGnfIliqfX hikgyyapegLKI3Xbd7AjsV7dTZXsG8F0jXC/+ANhtoFLk7mTr65ugjYCvmeSYojo/nwh Y/GyywG5P6PrdNt6FAIkejKbNPq8Qn43juoJw= MIME-Version: 1.0 Received: by 10.216.87.8 with SMTP id x8mr1639987wee.46.1301514232503; Wed, 30 Mar 2011 12:43:52 -0700 (PDT) Received: by 10.216.54.143 with HTTP; Wed, 30 Mar 2011 12:43:52 -0700 (PDT) Date: Wed, 30 Mar 2011 15:43:52 -0400 Message-ID: From: Kyungsoo Lee To: freebsd-net Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: UDP on FreeBSD X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 19:43:54 -0000 Hi All, I want to check UDP on FreeBSD. 
I am using iperf on FreeBSD for wireless testing with a Proxim 8470 FC PCMCIA card on IBM T42 and T61. When I transmit data from FreeBSD to FreeBSD or CentOS using iperf with -u -b 100M, a lot of packets are lost. A sniffer near the two nodes shows that the sender could not send all of the packets. The iperf sender reported that it tried to send 85469 packets but 68824 of them were lost. I think that the UDP buffer on the sender could not handle all of the packets. But if I send data from CentOS to FreeBSD using iperf with the same -u -b 100M options, the sender attempts only 18636 packets and loses very few, just 1 or 2. As a result, both cases report similar bandwidth. I think this comes from a difference in implementation between FreeBSD and Linux. But I want to double-check whether this is normal for FreeBSD or not. If I am missing something, please let me know. Thank you! From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 19:50:17 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E63E5106566C; Wed, 30 Mar 2011 19:50:17 +0000 (UTC) (envelope-from lists@yamagi.org) Received: from mail.yamagi.overkill.yamagi.org (unknown [IPv6:2a01:4f8:121:2102:1::7]) by mx1.freebsd.org (Postfix) with ESMTP id 7982D8FC18; Wed, 30 Mar 2011 19:50:17 +0000 (UTC) Received: from [2001:5c0:110d:6600:226:c6ff:fec4:399e] (unknown [IPv6:2001:5c0:110d:6600:226:c6ff:fec4:399e]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.yamagi.overkill.yamagi.org (Postfix) with ESMTPSA id 7409416663D1; Wed, 30 Mar 2011 21:50:12 +0200 (CEST) Date: Wed, 30 Mar 2011 21:50:12 +0200 (CEST) From: Yamagi Burmeister X-X-Sender: yamagi@maka.home.yamagi.org To: YongHyeon PYUN In-Reply-To: <20110330173145.GB8601@michelle.cdnetworks.com> Message-ID: References: <20110330173145.GB8601@michelle.cdnetworks.com> User-Agent: Alpine 2.00
(BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-net@freebsd.org, Yamagi Burmeister , yongari@freebsd.org Subject: Re: Kernel memory corruption(?) with age(4) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 19:50:18 -0000 On Wed, 30 Mar 2011, YongHyeon PYUN wrote: > On Wed, Mar 30, 2011 at 04:22:23PM +0200, Yamagi Burmeister wrote: > >> All for boxes are unstable if the Attansic NIC is in use, no one of them >> survived more than 60 minutes of ~20mb/s network traffic. I managed to >> get some coredumps and extracted the backtraces. Since everytime one of >> the boxes paniced I got different panic message and a different backtrace >> with a different subsystem involved I suspected broken hardware. I >> plugged a em(4) NIC into the PCI slot and wasn't able to reproduce the >> problem, in fact the boxes run rock solid for several days. Next I set >> up a Windows 7, installed the Attansic vendor driver and did another >> run. All went smooth, no crash for nearly 24 hours. >> >> My guess is kernel memory corruption by age(4), which would explain all >> the different backtraces and the different panic messages. This problem >> is reproducible in at least FreeBSD 7.4 and 8.2 and with TSO4 enabled >> and disabled. I'm willing to debug this, but I really don't know how. So >> any help or a pointer into the right direction would be appreciated. >> > > AFAIK this is the first report for possible memory corruption > triggered by age(4). I'm still not sure whether it's caused by > age(4) but you can disable RX checksum offloading and see whether > that makes any difference. > Since I have no longer access to the hardware it would be even > better if you can tell me which traffic pattern triggered the > issue. 
Okay, I did test runs with RX checksum offloading disabled, with TX checksum offloading disabled, and with both disabled. In all three cases the crash occurs within about 20 minutes. I'm not sure either that age(4) is the problem, but it definitely has something to do with it, since with another NIC driver the same scenario is rock solid... The workload: It's an NFS3 server (FreeBSD's non-experimental implementation), serving and receiving files of about 250 to 500 megabytes at about 20mb/s. The clients are FreeBSD 7 and 8 systems and are mounting the shares via TCP. The connection is 1000mbit/s via a "dumb" gigabit switch. -- Homepage: www.yamagi.org Jabber: yamagi@yamagi.org GnuPG/GPG: 0xEFBCCBCB From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 20:30:13 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7CDDC1065675; Wed, 30 Mar 2011 20:30:13 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-pz0-f54.google.com (mail-pz0-f54.google.com [209.85.210.54]) by mx1.freebsd.org (Postfix) with ESMTP id D17678FC14; Wed, 30 Mar 2011 20:30:12 +0000 (UTC) Received: by pzk27 with SMTP id 27so323423pzk.13 for ; Wed, 30 Mar 2011 13:30:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:from:date:to:cc:subject:message-id:reply-to :references:mime-version:content-type:content-disposition :in-reply-to:user-agent; bh=5rpsYhL8OkXlYz/uv7PjipIs0rer0gwSUdcPeEhcaE4=; b=C89uFMt2FFOzjW2Axsk5pmwHCc/SfxIk6OMnK34JX5pLm4Qk2Im8kqen6xNbNkIDBx N18g1EMd6l5vMoC5uozQBz+UwLFPaJ+AI4UpiSu0BJ12ictYiQQiwzpDlJUWmO4TXU7M rww3qvGfQUYjAdoWrdh4BHMzri+fN3jvPj8Os= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:date:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; b=AiQ9yR33nNLNPDy7KREzcpMBCOn8SDOvK0cEaBeUxpYKDUhm1/V584JVQxVDN+uRIk
rcd8e6QfD8nWl/ufZwDENKpsRkZs6aVZ/AtdM2U0tuhU8rlIyM9MhL5JVX3455mx6pY+ wVYyxoWIHR4C65UlPc1Eg/s//Vm9y0nc3UGiI= Received: by 10.142.117.5 with SMTP id p5mr1296390wfc.246.1301517012335; Wed, 30 Mar 2011 13:30:12 -0700 (PDT) Received: from pyunyh@gmail.com ([174.35.1.224]) by mx.google.com with ESMTPS id 25sm505518wfb.22.2011.03.30.13.30.09 (version=TLSv1/SSLv3 cipher=OTHER); Wed, 30 Mar 2011 13:30:11 -0700 (PDT) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Wed, 30 Mar 2011 13:28:58 -0700 From: YongHyeon PYUN Date: Wed, 30 Mar 2011 13:28:58 -0700 To: Yamagi Burmeister Message-ID: <20110330202858.GC8601@michelle.cdnetworks.com> References: <20110330173145.GB8601@michelle.cdnetworks.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i Cc: freebsd-net@freebsd.org, yongari@freebsd.org Subject: Re: Kernel memory corruption(?) with age(4) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 20:30:13 -0000 On Wed, Mar 30, 2011 at 09:50:12PM +0200, Yamagi Burmeister wrote: > On Wed, 30 Mar 2011, YongHyeon PYUN wrote: > > >On Wed, Mar 30, 2011 at 04:22:23PM +0200, Yamagi Burmeister wrote: > > > >>All for boxes are unstable if the Attansic NIC is in use, no one of them > >>survived more than 60 minutes of ~20mb/s network traffic. I managed to > >>get some coredumps and extracted the backtraces. Since everytime one of > >>the boxes paniced I got different panic message and a different backtrace > >>with a different subsystem involved I suspected broken hardware. I > >>plugged a em(4) NIC into the PCI slot and wasn't able to reproduce the > >>problem, in fact the boxes run rock solid for several days. 
Next I set > >>up a Windows 7, installed the Attansic vendor driver and did another > >>run. All went smooth, no crash for nearly 24 hours. > >> > >>My guess is kernel memory corruption by age(4), which would explain all > >>the different backtraces and the different panic messages. This problem > >>is reproducible in at least FreeBSD 7.4 and 8.2 and with TSO4 enabled > >>and disabled. I'm willing to debug this, but I really don't know how. So > >>any help or a pointer into the right direction would be appreciated. > >> > > > >AFAIK this is the first report for possible memory corruption > >triggered by age(4). I'm still not sure whether it's caused by > >age(4) but you can disable RX checksum offloading and see whether > >that makes any difference. > >Since I have no longer access to the hardware it would be even > >better if you can tell me which traffic pattern triggered the > >issue. > > Okay, I did a test run with RX checksum, TX checksum and both disabled. > In all three cases the crash occurs within about 20 minutes. I'm either > not sure that age(4) is the problem but it has definedly something to do > with the problem, since with another nic driver the same scenario is > rock solid... > OK. > The workload: It's a NFS3 server (FreeBSDs non-experimental > implementation), serving and receiving file with about 250 to 500 > megabytes at about 20mb/s. The clients are FreeBSD 7 and 8 systems and > are mounting the shares via TCP. The connection is 1000mbit/s via a > "dumb" gigabit switch. > That's too broad to narrow down the issue. :-( I'm not sure but your box seem to have more than 4GB memory. Could you limit the available memory to 3GB via loader.conf and test it again? 
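The reasoning behind the 3GB test can be illustrated with a little address arithmetic. The dmesg output quoted at the start of this thread shows age0 reporting "4GB boundary crossed, switching to 32bit DMA addressing mode"; with less than 4GB of usable physical memory, every DMA buffer necessarily sits below that boundary. The sketch below only shows the two checks involved. It is not code from age(4), the function names are made up for illustration, and it makes no claim about where the actual fault lies.

```python
# Illustrative address checks, not taken from the age(4) driver.
# Function names are hypothetical.

FOUR_GB = 1 << 32


def crosses_4gb_boundary(addr: int, length: int) -> bool:
    """True if the buffer [addr, addr + length) straddles a 4 GB multiple."""
    if length <= 0:
        raise ValueError("length must be positive")
    last = addr + length - 1
    # The upper 32 bits differ exactly when a 4 GB boundary lies in between.
    return (addr >> 32) != (last >> 32)


def fits_below_4gb(addr: int, length: int) -> bool:
    """True if the whole buffer is addressable with 32-bit DMA."""
    if length <= 0:
        raise ValueError("length must be positive")
    return addr + length <= FOUR_GB


if __name__ == "__main__":
    # A buffer near the top of a 3 GB machine is always 32-bit safe.
    print(fits_below_4gb(0xBFFFF000, 0x1000))
    # On an 8 GB machine a buffer can land above or across the boundary.
    print(crosses_4gb_boundary(0xFFFFF000, 0x2000))
    print(fits_below_4gb(0x1_0000_0000, 0x1000))
```

In loader.conf the limit would be a single line such as hw.physmem="3G"; the value here is just the figure suggested in the message above.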
From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 21:41:02 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4028F1065677 for ; Wed, 30 Mar 2011 21:41:02 +0000 (UTC) (envelope-from kungfujesus06@gmail.com) Received: from mail-fx0-f54.google.com (mail-fx0-f54.google.com [209.85.161.54]) by mx1.freebsd.org (Postfix) with ESMTP id C7ED58FC1B for ; Wed, 30 Mar 2011 21:41:01 +0000 (UTC) Received: by fxm11 with SMTP id 11so1807007fxm.13 for ; Wed, 30 Mar 2011 14:41:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:date:message-id:subject:from:to :content-type; bh=MXvXV1+1+qVHPX35m6lFKp3toQl3MiLf/R2M4CXA7Wc=; b=jAsEHMW1WmnQVlJ2DZRrDX7F0MJn4sU0Ekfg/erMBhZ4WsIalBCCjCZA8biHArREbJ 7YIAOEMroGccTr877gka78gpCc1OCQmBLrCo8izSOB3eyDgFu34FLoaBpvOaImasgAnP JeXUnH2ZGy3cl+bqTAPjVZGMgtHdDwE/k7rPw= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=cCQ5f20g8sb4jOWghqKZ1n3TcOD8Ablrr66pfO9Aq7uvNWvNI55clDQkiR3J7wRrZ1 IyZtD+HFOT+jFNewsUpwoh+Y4BYkoRZUvSXkz3c43lKyI6vlEhbxXszyrlmFYDZ1psUb iL1mJyWxHgxlO/2jP3j8Lo6kzEz1Fuv3dTpn8= MIME-Version: 1.0 Received: by 10.223.121.102 with SMTP id g38mr119408far.9.1301519873278; Wed, 30 Mar 2011 14:17:53 -0700 (PDT) Received: by 10.223.110.147 with HTTP; Wed, 30 Mar 2011 14:17:53 -0700 (PDT) Date: Wed, 30 Mar 2011 17:17:53 -0400 Message-ID: From: Adam Stylinski To: freebsd-net@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: net80211 and interface requests X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 21:41:02 -0000 Hello, This list has helped 
me before, so I'll email again in the hope that somebody has an answer. All is working well with my project; however, for the life of me I cannot get the interface to inject raw frames faster than 11 Mbps. I'm following the example given in /usr/src/tools/tools/net80211/wlaninject.c, and manually specifying parameters such as ucastrate, mcastrate, and mgmtrate with ifconfig. I'm putting the card into pureg mode, and yet I still can't inject any faster. I've even gone so far as to fill in an ieee80211_txparam struct, giving values of 255 for both the mcast and ucast rates (and of course ANDing them with 0xff). I then used the ioctl call to set the flags within the interface request. Any help would be greatly appreciated. I am doing nanosleeps between transmissions because, if I don't, the bpf clone can't inject due to the buffer being too full. There's probably a better way of doing this, but I doubt the nanosleeps are the issue (after all, I get almost exactly 11 Mbps). I should probably note that I'm not doing any ACKs; this is pure transmission. If anybody cares enough to look at my unpolished code to get a better idea, look here: http://projhinternet.svn.sourceforge.net/ The idea is to allow unidirectional traffic so that with an FCC amateur license (yes, I know I'm not broadcasting the call sign yet) you can broadcast unencrypted transmissions for miles (with a linear amplifier spec'd for 2.4 GHz). With the license, FCC Part 15 no longer applies and you can operate just like in any other amateur band.
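One thing worth checking for a question like the one above is the unit of the rate values. Legacy 802.11 rates are conventionally encoded in units of 500 kb/s, so 11 Mb/s is 22 and 54 Mb/s is 108. The sketch below is a plain conversion helper under that assumption; it does not call net80211, the helper names are hypothetical, and whether a particular driver honours a requested rate is a separate question.

```python
# Conversion between Mb/s and the half-Mb/s encoding used for legacy
# 802.11b/g rates. Pure arithmetic, no net80211 calls; helper names are
# hypothetical.

RATES_11B_MBPS = (1, 2, 5.5, 11)                  # DSSS/CCK rates
RATES_11G_MBPS = (6, 9, 12, 18, 24, 36, 48, 54)   # OFDM rates


def mbps_to_rate_code(mbps: float) -> int:
    """Encode a rate in Mb/s as a count of 500 kb/s units."""
    code = mbps * 2
    if code != int(code) or code <= 0:
        raise ValueError(f"{mbps} Mb/s is not a multiple of 0.5 Mb/s")
    return int(code)


def rate_code_to_mbps(code: int) -> float:
    """Decode a count of 500 kb/s units back to Mb/s."""
    return code / 2


def is_legacy_rate(code: int) -> bool:
    """True if the code matches one of the 802.11b/g rates listed above."""
    legacy = {mbps_to_rate_code(r) for r in RATES_11B_MBPS + RATES_11G_MBPS}
    return code in legacy


if __name__ == "__main__":
    for mbps in RATES_11B_MBPS + RATES_11G_MBPS:
        print(f"{mbps} Mb/s -> {mbps_to_rate_code(mbps)}")
    # 255 decodes to 127.5 Mb/s, which is not an 802.11b/g rate.
    print(rate_code_to_mbps(255), is_legacy_rate(255))
```

Under this encoding a value of 255 does not name any 802.11b/g rate, so it may be worth retrying with an explicit code such as 108.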
From owner-freebsd-net@FreeBSD.ORG Wed Mar 30 22:01:12 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 52BEB106566B for ; Wed, 30 Mar 2011 22:01:12 +0000 (UTC) (envelope-from mike@jellydonut.org) Received: from mail-ew0-f54.google.com (mail-ew0-f54.google.com [209.85.215.54]) by mx1.freebsd.org (Postfix) with ESMTP id E65178FC17 for ; Wed, 30 Mar 2011 22:01:11 +0000 (UTC) Received: by ewy1 with SMTP id 1so608797ewy.13 for ; Wed, 30 Mar 2011 15:01:10 -0700 (PDT) MIME-Version: 1.0 Received: by 10.213.21.134 with SMTP id j6mr1200129ebb.141.1301520778254; Wed, 30 Mar 2011 14:32:58 -0700 (PDT) Received: by 10.213.105.204 with HTTP; Wed, 30 Mar 2011 14:32:58 -0700 (PDT) In-Reply-To: References: Date: Wed, 30 Mar 2011 17:32:58 -0400 Message-ID: From: Michael Proto To: Kyungsoo Lee Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-net Subject: Re: UDP on FreeBSD X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 30 Mar 2011 22:01:12 -0000 On Wed, Mar 30, 2011 at 3:43 PM, Kyungsoo Lee wrote: > Hi All, > > I want to check UDP on FreeBSD. > > I am using IPERF on FreeBSD for wireless testing with Proxim 8470 FC PCMCIA > card on IBM T42 and T61. > > When I'm transmitting data from FreeBSD to FreeBSD or CentOS using Iperf > with -u -b 100M on iperf, they had lost lots of packets. Sniffer near the > two nodes shows the sender could not send all packets. Iperf sender said > that they try to send 85469 packets but they lost 68824 packets. I think > that the UDP buffer on the sender could not handle all packets. 
> > But if I'm trying to send data from CentOS to FreeBSD using Iperf with -u -b > 100M option on iperf, the sender tries 18636 packets so they lost few > packets like 1 or 2 packets.As a result, they have similar bandwidth result > on the report. I think that it happens from different implement between > FreeBSD and Linux. > > But I want to double check that this is normal for FreeBSD or not. If I have > some missing points, let me know please. > > Thank you! > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > Just a guess, but have you tried adjusting the net.inet.udp.maxdgram sysctl? I believe the default is somewhat low for UDP transmit. I don't know what size packets iperf is using but increasing the maxdgram value might help your testing. -Proto From owner-freebsd-net@FreeBSD.ORG Thu Mar 31 07:03:29 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D836E106564A for ; Thu, 31 Mar 2011 07:03:29 +0000 (UTC) (envelope-from bschmidt@techwires.net) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 6E6AA8FC0C for ; Thu, 31 Mar 2011 07:03:28 +0000 (UTC) Received: by bwz12 with SMTP id 12so1841965bwz.13 for ; Thu, 31 Mar 2011 00:03:28 -0700 (PDT) Received: by 10.204.48.33 with SMTP id p33mr727328bkf.153.1301555007945; Thu, 31 Mar 2011 00:03:27 -0700 (PDT) Received: from jessie.localnet (p5B2ECC03.dip0.t-ipconnect.de [91.46.204.3]) by mx.google.com with ESMTPS id k5sm511447bku.4.2011.03.31.00.03.25 (version=SSLv3 cipher=OTHER); Thu, 31 Mar 2011 00:03:26 -0700 (PDT) Sender: Bernhard Schmidt From: Bernhard Schmidt To: Adam Stylinski Date: Thu, 31 Mar 2011 09:02:45 +0200 User-Agent: KMail/1.13.5 
(Linux/2.6.32-30-generic; KDE/4.4.5; i686; ; ) References: In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201103310902.46236.bschmidt@freebsd.org> Cc: freebsd-net@freebsd.org Subject: Re: net80211 and interface requests X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: bschmidt@freebsd.org List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Mar 2011 07:03:29 -0000 On Wednesday, March 30, 2011 23:17:53 Adam Stylinski wrote: > Hello, > > This list has helped me before so I'll email again with the hopes that > somebody has an answer. All is working well with my project, however for > the life of me I cannot get the interface to inject the raw frames faster > than 11mbps. I'm following the example given in > /usr/src/tools/tools/net80211/wlaninject.c, and manually specifying > parameters such as ucastrate, mcastrate, and mgmtrate within ifconfig. I'm > putting the card into pureg mode, and yet I still can't inject any faster. > I've even gone so far as to specify an ieee802211_txparam struct giving > values of 255 both mcast and ucast rates within the struct (and of course > anding them by 0xff). I then used the ioctl call to set the flags within > the interface request. Any help would be greatly appreciated. You've set the ibp_rate0 parameter right? This one is in half-mbps, so a value of 108 should give you 54m. The only thing I can think of right now is that the device (or channel) is actually configured for 11b not 11g mode. Can we rule that out? Which device are you using? > I am doing nanosleeps in between transmissions as if I don't the bpf clone > can't inject due to the buffer being too full. There's probably a better > way of doing this, but I doubt the nanosleeps are the issue (afterall, I get > almost exactly 11mbps). 
I should probably note I'm not doing any ACKs, this > is pure transmits. > > If anybody cares enough to look at my unpolished code to get a better idea, > look here: > > http://projhinternet.svn.sourceforge.net/ > > The idea is to allow unidirectional traffic so that with an FCC amateur > license (yes I know I'm not currently broadcasting the call sign as of yet) > you can broadcast unencrypted transmissions for miles (with a linear > amplifier spec'd to 2.4ghz). With the license FCC part15 no longer applies > and you can operate just like in any other amateur band. > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > -- Bernhard From owner-freebsd-net@FreeBSD.ORG Thu Mar 31 07:05:31 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 90FB7106566C; Thu, 31 Mar 2011 07:05:31 +0000 (UTC) (envelope-from lists@yamagi.org) Received: from mail.yamagi.overkill.yamagi.org (unknown [IPv6:2a01:4f8:121:2102:1::7]) by mx1.freebsd.org (Postfix) with ESMTP id 2C5428FC12; Thu, 31 Mar 2011 07:05:31 +0000 (UTC) Received: from [2001:5c0:110d:6600:21b:21ff:fe07:b562] (unknown [IPv6:2001:5c0:110d:6600:21b:21ff:fe07:b562]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.yamagi.overkill.yamagi.org (Postfix) with ESMTPSA id BFEFF16663D1; Thu, 31 Mar 2011 09:05:28 +0200 (CEST) Date: Thu, 31 Mar 2011 09:05:19 +0200 (CEST) From: Yamagi Burmeister X-X-Sender: yamagi@saya.home.yamagi.org To: YongHyeon PYUN In-Reply-To: <20110330202858.GC8601@michelle.cdnetworks.com> Message-ID: References: <20110330173145.GB8601@michelle.cdnetworks.com> <20110330202858.GC8601@michelle.cdnetworks.com> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-net@freebsd.org, Yamagi Burmeister , yongari@freebsd.org Subject: Re: Kernel memory corruption(?) with age(4) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Mar 2011 07:05:31 -0000 On Wed, 30 Mar 2011, YongHyeon PYUN wrote: >> Okay, I did a test run with RX checksum, TX checksum and both disabled. >> In all three cases the crash occurs within about 20 minutes. I'm also >> not sure that age(4) is the problem but it definitely has something to do >> with the problem, since with another nic driver the same scenario is >> rock solid... >> > > OK. > >> The workload: It's a NFS3 server (FreeBSD's non-experimental >> implementation), serving and receiving files of about 250 to 500 >> megabytes at about 20mb/s. The clients are FreeBSD 7 and 8 systems and >> are mounting the shares via TCP. The connection is 1000mbit/s via a >> "dumb" gigabit switch. >> > > That's too broad to narrow down the issue. :-( > I'm not sure but your box seems to have more than 4GB memory. Could > you limit the available memory to 3GB via loader.conf and test it > again? All boxes are quadcore machines with 8GB RAM, running FreeBSD/amd64. After limiting the memory via hw.physmem to 3GB the problems are gone. The box has been running crash-free for more than 6 hours and has served over 300GB of data via age(4).
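For anyone repeating this test, the memory limit described above could be set roughly as follows. This is only a sketch: it assumes the tunable is placed in /boot/loader.conf and that the "3G" size suffix is accepted by the loader in use; the exact line used on the machine in this thread was not posted.

```
# /boot/loader.conf (assumed location)
# Cap usable physical memory at 3 GB for the test, then reboot.
hw.physmem="3G"
```

After the reboot, `sysctl hw.physmem` should report a value close to 3 GB if the limit took effect.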
-- Homepage: www.yamagi.org Jabber: yamagi@yamagi.org GnuPG/GPG: 0xEFBCCBCB From owner-freebsd-net@FreeBSD.ORG Thu Mar 31 10:45:21 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 47BA6106564A for ; Thu, 31 Mar 2011 10:45:21 +0000 (UTC) (envelope-from julian@freebsd.org) Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16]) by mx1.freebsd.org (Postfix) with ESMTP id BC4298FC18 for ; Thu, 31 Mar 2011 10:45:20 +0000 (UTC) Received: from julian-mac.elischer.org (home-nat.elischer.org [67.100.89.137]) (authenticated bits=0) by vps1.elischer.org (8.14.4/8.14.4) with ESMTP id p2VAjGtP031570 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Thu, 31 Mar 2011 03:45:18 -0700 (PDT) (envelope-from julian@freebsd.org) Message-ID: <4D945B55.6080600@freebsd.org> Date: Thu, 31 Mar 2011 03:45:41 -0700 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X 10.4; en-US; rv:1.9.2.15) Gecko/20110303 Thunderbird/3.1.9 MIME-Version: 1.0 To: Michael Proto References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net Subject: Re: UDP on FreeBSD X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Mar 2011 10:45:21 -0000 On 3/30/11 2:32 PM, Michael Proto wrote: > On Wed, Mar 30, 2011 at 3:43 PM, Kyungsoo Lee wrote: >> Hi All, >> >> I want to check UDP on FreeBSD. >> >> I am using IPERF on FreeBSD for wireless testing with Proxim 8470 FC PCMCIA >> card on IBM T42 and T61. >> >> When I'm transmitting data from FreeBSD to FreeBSD or CentOS using Iperf >> with -u -b 100M on iperf, they had lost lots of packets. 
Sniffer near the >> two nodes shows the sender could not send all packets. Iperf sender said >> that they try to send 85469 packets but they lost 68824 packets. I think >> that the UDP buffer on the sender could not handle all packets. >> >> But if I'm trying to send data from CentOS to FreeBSD using Iperf with -u -b >> 100M option on iperf, the sender tries 18636 packets so they lost few >> packets like 1 or 2 packets. As a result, they have similar bandwidth result >> on the report. I think that it happens from a difference in implementation between >> FreeBSD and Linux. >> >> But I want to double check that this is normal for FreeBSD or not. If I have >> some missing points, let me know please. >> >> Thank you! >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> > Just a guess, but have you tried adjusting the net.inet.udp.maxdgram > sysctl? I believe the default is somewhat low for UDP transmit. I > don't know what size packets iperf is using but increasing the > maxdgram value might help your testing. This is many years out of date, but a decade or so ago FreeBSD would return ENOBUFS and Linux would block when the outgoing queues filled up. The answer then was that the programs were all written for Linux and didn't check for ENOBUFS, but that may be out of date now in many different ways.
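To illustrate the ENOBUFS point above, here is a minimal sketch in C of a sender that treats ENOBUFS from sendto(2) as a transient condition and retries after a short pause, instead of counting the datagram as sent. The names sendto_retry and backoff_ns, the 1 ms starting delay and the 64 ms cap are assumptions made for this example; none of them come from iperf or from the FreeBSD sources.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <time.h>

/*
 * Hypothetical helper: delay in nanoseconds before retry number
 * `attempt` (0-based). Starts at 1 ms, doubles each time, capped at 64 ms.
 */
long
backoff_ns(int attempt)
{
	long ns = 1000000L;		/* 1 ms */

	while (attempt-- > 0 && ns < 64000000L)
		ns *= 2;
	return (ns);
}

/*
 * Hypothetical wrapper around sendto(2). If the call fails with ENOBUFS
 * (outgoing queue full), sleep briefly and try again, up to max_retries
 * extra attempts. Any other result is returned to the caller unchanged,
 * with errno as set by sendto(2).
 */
ssize_t
sendto_retry(int fd, const void *buf, size_t len,
    const struct sockaddr *to, socklen_t tolen, int max_retries)
{
	for (int attempt = 0; ; attempt++) {
		ssize_t n = sendto(fd, buf, len, 0, to, tolen);

		if (n >= 0 || errno != ENOBUFS || attempt >= max_retries)
			return (n);

		struct timespec ts = {
			.tv_sec = 0,
			.tv_nsec = backoff_ns(attempt)
		};
		nanosleep(&ts, NULL);
	}
}
```

A test program would call sendto_retry() wherever it currently calls sendto() and count a datagram as lost only when the wrapper still returns -1. Whether a given iperf version already does something equivalent is not something this thread establishes.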
> > -Proto > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > From owner-freebsd-net@FreeBSD.ORG Thu Mar 31 12:20:43 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7D4381065672 for ; Thu, 31 Mar 2011 12:20:43 +0000 (UTC) (envelope-from kungfujesus06@gmail.com) Received: from mail-iy0-f182.google.com (mail-iy0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id 3B8E48FC26 for ; Thu, 31 Mar 2011 12:20:43 +0000 (UTC) Received: by iyj12 with SMTP id 12so2964467iyj.13 for ; Thu, 31 Mar 2011 05:20:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:date:from:to:cc:subject:message-id:references :mime-version:content-type:content-disposition:in-reply-to :user-agent; bh=2dxyWxteT8sUr+UQrLA5VRAz8OHxW0NlXo1cDhH+BSA=; b=hJPabEvT2KXgYSIN9GBOQFujpXukZYEmjxKrVxJX4hANRF6S0FqKq62d8z6+qd0L+1 R0o6OzhiNPjj+AlaTOI/4X0hpvvGMwN9zxbCjYvfECrnFvFhw8kge7vnKTwlCCL9bF1J JCuXNiXyuaH2jkIQacwxH7pt5aMhUkfCzKOEA= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; b=iYNzgGey426/qIDeGiWIZze1Iqrq6PYYeFzpudDixWfb3C7eyINn3SBXmBSZiPfPg9 j3oUemCX7HkZrnIVxSJjPhFDTfhejd8bOU1scEBWrJ0G4GSGjWkjnDJV8f14lLYDFN48 DF6EYdQFop6wDROn4MWbf5MPTkr9WYzofAB/U= Received: by 10.42.159.6 with SMTP id j6mr2861849icx.260.1301574042374; Thu, 31 Mar 2011 05:20:42 -0700 (PDT) Received: from freebsdbox.adamsnet ([72.49.234.31]) by mx.google.com with ESMTPS id xi12sm603200icb.18.2011.03.31.05.20.40 (version=TLSv1/SSLv3 cipher=OTHER); Thu, 31 Mar 2011 05:20:41 -0700 (PDT) Date: Thu, 31 Mar 2011 08:20:33 -0400 From: Adam Stylinski 
To: Bernhard Schmidt Message-ID: <20110331122033.GA66992@freebsdbox.adamsnet> References: <201103310902.46236.bschmidt@freebsd.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="FL5UXtIhxfXey3p5" Content-Disposition: inline In-Reply-To: <201103310902.46236.bschmidt@freebsd.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-net@freebsd.org Subject: Re: net80211 and interface requests X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Mar 2011 12:20:43 -0000 --FL5UXtIhxfXey3p5 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Mar 31, 2011 at 09:02:45AM +0200, Bernhard Schmidt wrote: > On Wednesday, March 30, 2011 23:17:53 Adam Stylinski wrote: > > Hello, > > > > This list has helped me before so I'll email again with the hopes that > > somebody has an answer. All is working well with my project, however for > > the life of me I cannot get the interface to inject the raw frames faster > > than 11mbps. I'm following the example given in > > /usr/src/tools/tools/net80211/wlaninject.c, and manually specifying > > parameters such as ucastrate, mcastrate, and mgmtrate within ifconfig. I'm > > putting the card into pureg mode, and yet I still can't inject any faster. > > I've even gone so far as to specify an ieee802211_txparam struct giving > > values of 255 both mcast and ucast rates within the struct (and of course > > anding them by 0xff). I then used the ioctl call to set the flags within > > the interface request. Any help would be greatly appreciated. > > You've set the ibp_rate0 parameter right? This one is in half-mbps, so > a value of 108 should give you 54m.
The only thing I can think of right > now is that the device (or channel) is actually configured for 11b not > 11g mode. Can we rule that out? Which device are you using? > > > I am doing nanosleeps in between transmissions as if I don't the bpf clone > > can't inject due to the buffer being too full. There's probably a better > > way of doing this, but I doubt the nanosleeps are the issue (afterall, I get > > almost exactly 11mbps). I should probably note I'm not doing any ACKs, this > > is pure transmits. > > > > If anybody cares enough to look at my unpolished code to get a better idea, > > look here: > > > > http://projhinternet.svn.sourceforge.net/ > > > > The idea is to allow unidirectional traffic so that with an FCC amateur > > license (yes I know I'm not currently broadcasting the call sign as of yet) > > you can broadcast unencrypted transmissions for miles (with a linear > > amplifier spec'd to 2.4ghz). With the license FCC part15 no longer applies > > and you can operate just like in any other amateur band. > > _______________________________________________ > > freebsd-net@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-net > > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > > > > -- > Bernhard I'm using an atheros AR2413 chipset, running in pure g mode, with also the card put into "mode 11g" and ucast, mcast, and mgmt rates set to 54. I think the parameter for ibp_rate0 is just for setting it in the header (but I could be wrong). Regardless I am doing this, let me give you the exact source files I'm doing this in.
Line 38 in this file: http://projhinternet.svn.sourceforge.net/viewvc/projhinternet/src/callbacks.c?revision=69&view=markup And the setup_if function in this: http://projhinternet.svn.sourceforge.net/viewvc/projhinternet/src/libinject.c?revision=69&view=markup --FL5UXtIhxfXey3p5 Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (FreeBSD) iQIcBAEBAgAGBQJNlHGRAAoJED6sRHE6Tvmn3/gQAKKD4JJs3wK5gbqEKO4DHZbg b1OMKfw3vx5w1nggBmfA0ENEF1Wpw7hfomCTNln9i8pzOML0DFJGAH3QCBVr3zAB 5ABMWyi9TkSXhGrnls1MWI8rw+5O/jP/46MLoIQiq/YHT8GfofilrvWXwj01O5HE IFBC6wR22humlmnmdMzI29QoLewTHkf8V6lNEU0pWTbAnaVnhfCOJxe65dN077gc eQ0recYdOlv5oK1w1DBGJb/q3bTqMHBALn+fnXjCdPH5m4f3muPJJi5yuoCJg7U4 i5jjtME6XVMPZxLl/tSuo6er5AlRkMioSnLKZQTZch8ru98h+sy/IPGw8JkIMlQK eYvSsd5ncheT+cEChn+cndibz1sFk3GZepIJz7g9CzX3eiwZTBKEM5auyr2KmMED VDfZkZEj28y5QTUlPHghTSI5FG+aG6u4qQZ2m+1DqMKFEe0TMLnu3Xv+cEcOeO6Q 8uzV7Ke/QoRKmRyWiYEVzdcGzWcScoNzNJrtidOGbEu9LLm0wlY1jaUy7t3CWAiv k3SetMiuLkWNuGt6WbVigC8hQTmNPXOp5ijhfJ5wVhW/zYiQddHSUCArv7FsITCd CSXpw+fJGJNcY/etfOSUvN1ETldU7E8yGy0Pgi7+Fp6hnRQZejUtUG49qUrblSed 32ArZ1FPnVpUN/0cVEzY =l/RJ -----END PGP SIGNATURE----- --FL5UXtIhxfXey3p5-- From owner-freebsd-net@FreeBSD.ORG Thu Mar 31 12:45:27 2011 Return-Path: Delivered-To: freebsd-net@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D9633106564A for ; Thu, 31 Mar 2011 12:45:27 +0000 (UTC) (envelope-from venglin@freebsd.lublin.pl) Received: from lagoon.freebsd.lublin.pl (lagoon.freebsd.lublin.pl [IPv6:2a02:2928:a::3]) by mx1.freebsd.org (Postfix) with ESMTP id 6A3D68FC0C for ; Thu, 31 Mar 2011 12:45:27 +0000 (UTC) Received: from [IPv6:2a02:2928:a:ffff:9d25:6557:6376:85c6] (unknown [IPv6:2a02:2928:a:ffff:9d25:6557:6376:85c6]) by lagoon.freebsd.lublin.pl (Postfix) with ESMTPSA id 5ED4723944C for ; Thu, 31 Mar 2011 14:45:25 +0200 (CEST) Message-ID: <4D947756.6050808@freebsd.lublin.pl> Date: Thu, 31 Mar
2011 14:45:10 +0200 From: Przemyslaw Frasunek User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; pl; rv:1.9.2.15) Gecko/20110303 Thunderbird/3.1.9 MIME-Version: 1.0 To: freebsd-net@FreeBSD.org X-Enigmail-Version: 1.1.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: Subject: mpd5/Netgraph issues after upgrading to 7.4 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Mar 2011 12:45:27 -0000

Hello,

I have upgraded one of my mpd5 based PPPoE access servers from 7.3-RELEASE to 7.4-RELEASE. Just after the upgrade, I started getting the following errors:

Mar 31 13:48:06 lsm-gw mpd: [B-150] Bundle: Interface ng149 created
Mar 31 13:48:06 lsm-gw mpd: [B-150] can't create ppp node at ".:"->"b150": Operation not permitted
Mar 31 13:48:06 lsm-gw mpd: [B-150] Bundle netgraph initialization failed
Mar 31 13:48:06 lsm-gw mpd: [zemborzyce-156] Bundle creation error
Mar 31 13:48:06 lsm-gw mpd: [zemborzyce-156] link did not validate in bundle
Mar 31 13:48:06 lsm-gw mpd: [zemborzyce-156] LCP: parameter negotiation failed
Mar 31 13:48:06 lsm-gw mpd: [zemborzyce-156] LCP: state change Opened --> Stopping

[root@lsm-gw /var/log]# grep -ic "operation not permitted" mpd.log
1756

It seems to occur only on specific bundles, and after a brief period of time session establishment eventually succeeds. Is this related to a resource shortage?
From owner-freebsd-net@FreeBSD.ORG Thu Mar 31 13:07:59 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B675F106566B for ; Thu, 31 Mar 2011 13:07:59 +0000 (UTC) (envelope-from bschmidt@techwires.net) Received: from mail-fx0-f54.google.com (mail-fx0-f54.google.com [209.85.161.54]) by mx1.freebsd.org (Postfix) with ESMTP id 527128FC13 for ; Thu, 31 Mar 2011 13:07:58 +0000 (UTC) Received: by fxm11 with SMTP id 11so2347579fxm.13 for ; Thu, 31 Mar 2011 06:07:58 -0700 (PDT) Received: by 10.223.124.7 with SMTP id s7mr99687far.123.1301576878218; Thu, 31 Mar 2011 06:07:58 -0700 (PDT) Received: from jessie.localnet (p5B2ECC03.dip0.t-ipconnect.de [91.46.204.3]) by mx.google.com with ESMTPS id n15sm415314fam.36.2011.03.31.06.07.56 (version=SSLv3 cipher=OTHER); Thu, 31 Mar 2011 06:07:56 -0700 (PDT) Sender: Bernhard Schmidt From: Bernhard Schmidt To: Adam Stylinski Date: Thu, 31 Mar 2011 15:07:15 +0200 User-Agent: KMail/1.13.5 (Linux/2.6.32-30-generic; KDE/4.4.5; i686; ; ) References: <201103310902.46236.bschmidt@freebsd.org> <20110331122033.GA66992@freebsdbox.adamsnet> In-Reply-To: <20110331122033.GA66992@freebsdbox.adamsnet> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-6" Content-Transfer-Encoding: 7bit Message-Id: <201103311507.16263.bschmidt@freebsd.org> Cc: freebsd-net@freebsd.org Subject: Re: net80211 and interface requests X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: bschmidt@freebsd.org List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Mar 2011 13:07:59 -0000 On Thursday, March 31, 2011 14:20:33 Adam Stylinski wrote: > On Thu, Mar 31, 2011 at 09:02:45AM +0200, Bernhard Schmidt wrote: > > On Wednesday, March 30, 2011 23:17:53 Adam Stylinski wrote: > > > Hello, > > > > > > This list has helped 
me before so I'll email again with the hopes that > > > somebody has an answer. All is working well with my project, however for > > > the life of me I cannot get the interface to inject the raw frames faster > > > than 11mbps. I'm following the example given in > > > /usr/src/tools/tools/net80211/wlaninject.c, and manually specifying > > > parameters such as ucastrate, mcastrate, and mgmtrate within ifconfig. I'm > > > putting the card into pureg mode, and yet I still can't inject any faster. > > > I've even gone so far as to specify an ieee802211_txparam struct giving > > > values of 255 both mcast and ucast rates within the struct (and of course > > > anding them by 0xff). I then used the ioctl call to set the flags within > > > the interface request. Any help would be greatly appreciated. > > > > You've set the ibp_rate0 parameter right? This one is in half-mbps, so > > a value of 108 should give you 54m. The only thing I can think of right > > now is that the device (or channel) is actually configured for 11b not > > 11g mode. Can we rule that out? Which device are you using? > > > > > I am doing nanosleeps in between transmissions as if I don't the bpf clone > > > can't inject due to the buffer being too full. There's probably a better > > > way of doing this, but I doubt the nanosleeps are the issue (afterall, I get > > > almost exactly 11mbps). I should probably note I'm not doing any ACKs, this > > > is pure transmits. > > > > > > If anybody cares enough to look at my unpolished code to get a better idea, > > > look here: > > > > > > http://projhinternet.svn.sourceforge.net/ > > > > > > The idea is to allow unidirectional traffic so that with an FCC amateur > > > license (yes I know I'm not currently broadcasting the call sign as of yet) > > > you can broadcast unencrypted transmissions for miles (with a linear > > > amplifier spec'd to 2.4ghz). With the license FCC part15 no longer applies > > > and you can operate just like in any other amateur band. 
> > > _______________________________________________ > > > freebsd-net@freebsd.org mailing list > > > http://lists.freebsd.org/mailman/listinfo/freebsd-net > > > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > > > > > > > I'm using an atheros AR2413 chipset, running in pure g mode, with also the card put into "mode 11g" and ucast, mcast, and mgmt rates set to 54. I think the parameter for ibp_rate0 is just for setting it in the header (but I could be wrong). Regardless I am doing this, let me give you the exact source files I'm doing this in. Well, the ath_rate_* modules afaik do not honor the fixed rate settings. At least I've heard something about those being broken. The ibp_rate0 parameter set to 108 seems to be correct though. No clue why that doesn't work, you may have to debug ath_tx_findrix(). Adding a printf of the passed over rate and ridx should shed some light on this I guess. > Line 38 in this file: > http://projhinternet.svn.sourceforge.net/viewvc/projhinternet/src/callbacks.c?revision=69&view=markup > > And the setup_if function in this: > http://projhinternet.svn.sourceforge.net/viewvc/projhinternet/src/libinject.c?revision=69&view=markup > -- Bernhard From owner-freebsd-net@FreeBSD.ORG Thu Mar 31 13:51:02 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3D053106564A for ; Thu, 31 Mar 2011 13:51:02 +0000 (UTC) (envelope-from lacombar@gmail.com) Received: from mail-yi0-f54.google.com (mail-yi0-f54.google.com [209.85.218.54]) by mx1.freebsd.org (Postfix) with ESMTP id EF30D8FC23 for ; Thu, 31 Mar 2011 13:51:01 +0000 (UTC) Received: by yie12 with SMTP id 12so1130783yie.13 for ; Thu, 31 Mar 2011 06:51:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:date:message-id:subject:from:to:cc :content-type; 
bh=X6/KdAkdds5B5SYumr54RS6nOrTuCYENRfu1eRrrwkg=; b=ZGut0t72s9uJSNcdOwnKgyAxHR4asARi8GfbCjYTxxsohrLX8/P7+DG2uNplySuBmm W/uvCcKCpGDdArSNfG0uLXTX6JSq4uCY2djbO3vgL3EzZNeX4iZAzJ5E1wHvY8q1/8tp RY55Znbdx/eGXmvZaemRSOvyR5MapYVcThjso= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:cc:content-type; b=RCGYM8CE86n4bynWRQhNL8MIOSImb0FTgJlQHj3LK/6poi4tPBXqlMOIoHPN7lP9lT VHkxq8xV5yfPoARFiy5UnfVp7jsZBK77/pyqxK47sqhlwSr4Gzsdzjm4cU04rh1uOPW8 ANzqcEhG/n78gLEFkuDDu1TVkTJJJJBsiNe8Y= MIME-Version: 1.0 Received: by 10.43.63.72 with SMTP id xd8mr2901691icb.215.1301579460835; Thu, 31 Mar 2011 06:51:00 -0700 (PDT) Received: by 10.42.146.72 with HTTP; Thu, 31 Mar 2011 06:51:00 -0700 (PDT) Date: Thu, 31 Mar 2011 09:51:00 -0400 Message-ID: From: Arnaud Lacombe To: Jack Vogel Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-net@freebsd.org Subject: em(4) hang [Was: Re: igb(4) won't start with "igb0: Could not setup receive structures"] X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Mar 2011 13:51:02 -0000 Hi [let's start a new thread :)] On Wed, Mar 30, 2011 at 2:22 PM, Jack Vogel wrote: > Read the code in HEAD, em_local_timer() has a test of ALL the rx queues and > will schedule a task that refreshes mbufs if they are empty. This has > exactly the > same effect as checking for some interrupt cause, a cause that is not > available > when using MSIX on 82574, but this approach works for everything. > ok, it took me a long time to reproduce the issue with em(4) version 7.1.9, about 3h rather than a few minutes a month ago and only got ~875 allocations failure vs. 
several thousand before, here are some stats: # sysctl -a | grep missed dev.em.0.mac_stats.missed_packets: 1917112 dev.em.1.mac_stats.missed_packets: 0 dev.em.2.mac_stats.missed_packets: 0 dev.em.3.mac_stats.missed_packets: 0 dev.em.4.mac_stats.missed_packets: 0 dev.em.5.mac_stats.missed_packets: 0 # sysctl dev.em.0.debug=1 dev.em.0.debug: I-1nterface is RUNNING and INACTIVE em0: hw tdh = 861, hw tdt = 861 em0: hw rdh = 929, hw rdt = 929 em0: Tx Queue Status = 0 em0: TX descriptors avail = 1024 em0: Tx Descriptors avail failure = 0 em0: RX discarded packets = 0 em0: RX Next to Check = 929 em0: RX Next to Refresh = 930 -> -1 I backported the -current driver to 7.1 and re-ran the test overnight. Now, the box is running 7.2.2. The box was hung this morning: dev.em.0.mac_stats.missed_packets: 25513991 dev.em.1.mac_stats.missed_packets: 0 dev.em.2.mac_stats.missed_packets: 0 dev.em.3.mac_stats.missed_packets: 0 dev.em.4.mac_stats.missed_packets: 0 dev.em.5.mac_stats.missed_packets: 0 There has been about 1000 mbuf allocation denial. I changed some relevant field of the RX soft stat in the sysctl output of the device [Of course, the only field of interest, `next_to_check' is invalid because of a typo... 
I should not change code past a certain hour :)], here it is: # sysctl dev.em.0 dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.2 dev.em.0.%driver: em dev.em.0.%location: slot=0 function=0 dev.em.0.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086 subdevice=0x0000 class=0x020000 dev.em.0.%parent: pci1 dev.em.0.nvm: -1 dev.em.0.debug: -1 dev.em.0.rx_int_delay: 0 dev.em.0.tx_int_delay: 66 dev.em.0.rx_abs_int_delay: 66 dev.em.0.tx_abs_int_delay: 66 dev.em.0.rx_processing_limit: 100 dev.em.0.flow_control: 3 dev.em.0.eee_control: 0 dev.em.0.link_irq: 11621474 dev.em.0.mbuf_alloc_fail: 0 dev.em.0.cluster_alloc_fail: 0 dev.em.0.dropped: 0 dev.em.0.tx_dma_fail: 0 dev.em.0.rx_overruns: 0 dev.em.0.watchdog_timeouts: 0 dev.em.0.device_control: 1477444168 dev.em.0.rx_control: 67141634 dev.em.0.fc_high_water: 18432 dev.em.0.fc_low_water: 16932 dev.em.0.queue0.txd_head: 904 dev.em.0.queue0.txd_tail: 904 dev.em.0.queue0.tx_irq: 10291170 dev.em.0.queue0.no_desc_avail: 0 dev.em.0.queue0.rxd_head: 766 dev.em.0.queue0.rxd_tail: 767 dev.em.0.queue0.rx_irq: 6937760 dev.em.0.queue0.rx_discarded: 0 dev.em.0.queue0.rx_forced_refill: 0 dev.em.0.queue0.next_to_check: 6937760 ^^^ this field is invalid... bad code... 
:( dev.em.0.queue0.next_to_refresh: 767 dev.em.0.mac_stats.excess_coll: 0 dev.em.0.mac_stats.single_coll: 0 dev.em.0.mac_stats.multiple_coll: 0 dev.em.0.mac_stats.late_coll: 0 dev.em.0.mac_stats.collision_count: 0 dev.em.0.mac_stats.symbol_errors: 0 dev.em.0.mac_stats.sequence_errors: 0 dev.em.0.mac_stats.defer_count: 0 dev.em.0.mac_stats.missed_packets: 25752895 dev.em.0.mac_stats.recv_no_buff: 3 dev.em.0.mac_stats.recv_undersize: 0 dev.em.0.mac_stats.recv_fragmented: 0 dev.em.0.mac_stats.recv_oversize: 0 dev.em.0.mac_stats.recv_jabber: 0 dev.em.0.mac_stats.recv_errs: 0 dev.em.0.mac_stats.crc_errs: 0 dev.em.0.mac_stats.alignment_errs: 0 dev.em.0.mac_stats.coll_ext_errs: 0 dev.em.0.mac_stats.xon_recvd: 0 dev.em.0.mac_stats.xon_txd: 0 dev.em.0.mac_stats.xoff_recvd: 0 dev.em.0.mac_stats.xoff_txd: 25752073 dev.em.0.mac_stats.total_pkts_recvd: 39996734 dev.em.0.mac_stats.good_pkts_recvd: 14243839 dev.em.0.mac_stats.bcast_pkts_recvd: 5 dev.em.0.mac_stats.mcast_pkts_recvd: 0 dev.em.0.mac_stats.rx_frames_64: 13878627 dev.em.0.mac_stats.rx_frames_65_127: 365212 dev.em.0.mac_stats.rx_frames_128_255: 0 dev.em.0.mac_stats.rx_frames_256_511: 0 dev.em.0.mac_stats.rx_frames_512_1023: 0 dev.em.0.mac_stats.rx_frames_1024_1522: 0 dev.em.0.mac_stats.good_octets_recvd: 916346006 dev.em.0.mac_stats.good_octets_txd: 21377046229 dev.em.0.mac_stats.total_pkts_txd: 44415008 dev.em.0.mac_stats.good_pkts_txd: 18661905 dev.em.0.mac_stats.bcast_pkts_txd: 24822815 dev.em.0.mac_stats.mcast_pkts_txd: 0 dev.em.0.mac_stats.tx_frames_64: 1278447 dev.em.0.mac_stats.tx_frames_65_127: 1221602 dev.em.0.mac_stats.tx_frames_128_255: 503121 dev.em.0.mac_stats.tx_frames_256_511: 770073 dev.em.0.mac_stats.tx_frames_512_1023: 1921953 dev.em.0.mac_stats.tx_frames_1024_1522: 12966709 dev.em.0.mac_stats.tso_txd: 0 dev.em.0.mac_stats.tso_ctx_fail: 0 dev.em.0.interrupts.asserts: 5297765 dev.em.0.interrupts.rx_pkt_timer: 0 dev.em.0.interrupts.rx_abs_timer: 0 dev.em.0.interrupts.tx_pkt_timer: 1 
dev.em.0.interrupts.tx_abs_timer: 1 dev.em.0.interrupts.tx_queue_empty: 0 dev.em.0.interrupts.tx_queue_min_thresh: 0 dev.em.0.interrupts.rx_desc_min_thresh: 0 dev.em.0.interrupts.rx_overrun: 3186

I added `rx_forced_refill', which counts the number of times the RX taskqueue is scheduled from em_local_timer(). Now, if I read the patched em_local_timer() correctly, the relevant code is:

	/* trigger tq to refill rx ring queue if it is empty */
	for (int i = 0; i < adapter->num_queues; i++, rxr++) {
		if (rxr->next_to_check == rxr->next_to_refresh) {
			rxr->rx_forced_refill++;
			taskqueue_enqueue(rxr->tq, &rxr->rx_task);
		}
	}

and assuming that we have the same pattern between `next_to_check' and `next_to_refresh' as we did with 7.1.9, it is understandable that the driver hangs. I'll remove part of the changes I made to keep only `rx_forced_refill' and the associated sysctl, re-run the tests and come back with correct values, hopefully in a few hours.

- Arnaud

From owner-freebsd-net@FreeBSD.ORG Thu Mar 31 14:52:01 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 759CA1065674; Thu, 31 Mar 2011 14:52:01 +0000 (UTC) (envelope-from kungfujesus06@gmail.com) Received: from mail-yw0-f54.google.com (mail-yw0-f54.google.com [209.85.213.54]) by mx1.freebsd.org (Postfix) with ESMTP id 16D078FC19; Thu, 31 Mar 2011 14:52:00 +0000 (UTC) Received: by ywf9 with SMTP id 9so1156204ywf.13 for ; Thu, 31 Mar 2011 07:52:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:date:from:to:cc:subject:message-id:references :mime-version:content-type:content-disposition:in-reply-to :user-agent; bh=bepvgrKE7a9Fxuzzxj36pZHAmeEU6yYvfN/AMu9Xpdo=; b=T/RWT8ZDEc2mlzx+T/5KqGSvLO5V1pN5CWnJIyuoi1DWfByntkbAjYqS5GO4qnd6JB JFjTszWPwCX92DTXSyIZOLaO+ghvbsRt3rES2VrGbDLN6SpO+5Rqhs/sT8JT5gInz27a h/kDF8MSE3p2+/shXMuNhspNfz1lpIKFdyJ50=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; b=Gs04bRxGjerBs5dxaHaH8eiHusuaw5nf5/QXak1cdItdlEzKubZuM2gEUGawB6alSp 1ShDHT6bhg5Su8hXDOebECTHWJMpPvyIfSLnCoiY+6vjOJuNiCFba3oYvvzzVPHbvH0k l3UqHzzzeRr5zhJHRNhr9AxWJsGvpaJjQuRRc= Received: by 10.90.13.27 with SMTP id 27mr3034536agm.188.1301583120275; Thu, 31 Mar 2011 07:52:00 -0700 (PDT) Received: from ossumpossum.geop.uc.edu ([129.137.163.184]) by mx.google.com with ESMTPS id c38sm1256191anc.18.2011.03.31.07.51.58 (version=TLSv1/SSLv3 cipher=OTHER); Thu, 31 Mar 2011 07:51:59 -0700 (PDT) Date: Thu, 31 Mar 2011 10:51:53 -0400 From: Adam Stylinski To: Bernhard Schmidt Message-ID: <20110331145153.GA2243@ossumpossum.geop.uc.edu> References: <201103310902.46236.bschmidt@freebsd.org> <20110331122033.GA66992@freebsdbox.adamsnet> <201103311507.16263.bschmidt@freebsd.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="k+w/mQv8wyuph6w0" Content-Disposition: inline In-Reply-To: <201103311507.16263.bschmidt@freebsd.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-net@freebsd.org Subject: Re: net80211 and interface requests X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Mar 2011 14:52:01 -0000 --k+w/mQv8wyuph6w0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Mar 31, 2011 at 03:07:15PM +0200, Bernhard Schmidt wrote: > On Thursday, March 31, 2011 14:20:33 Adam Stylinski wrote: > > On Thu, Mar 31, 2011 at 09:02:45AM +0200, Bernhard Schmidt wrote: > > > On Wednesday, March 30, 2011 23:17:53 Adam Stylinski wrote: > > > > Hello, > > > >=20 > > > > This list has helped me before so 
I'll email again with the hopes that > > > > somebody has an answer. All is working well with my project, however for > > > > the life of me I cannot get the interface to inject the raw frames faster > > > > than 11mbps. I'm following the example given in > > > > /usr/src/tools/tools/net80211/wlaninject.c, and manually specifying > > > > parameters such as ucastrate, mcastrate, and mgmtrate within ifconfig. I'm > > > > putting the card into pureg mode, and yet I still can't inject any faster. > > > > I've even gone so far as to specify an ieee802211_txparam struct giving > > > > values of 255 both mcast and ucast rates within the struct (and of course > > > > anding them by 0xff). I then used the ioctl call to set the flags within > > > > the interface request. Any help would be greatly appreciated. > > > > > > You've set the ibp_rate0 parameter right? This one is in half-mbps, so > > > a value of 108 should give you 54m. The only thing I can think of right > > > now is that the device (or channel) is actually configured for 11b not > > > 11g mode. Can we rule that out? Which device are you using? > > > > > > > I am doing nanosleeps in between transmissions as if I don't the bpf clone > > > > can't inject due to the buffer being too full. There's probably a better > > > > way of doing this, but I doubt the nanosleeps are the issue (afterall, I get > > > > almost exactly 11mbps). I should probably note I'm not doing any ACKs, this > > > > is pure transmits. > > > > > > > > If anybody cares enough to look at my unpolished code to get a better idea, > > > > look here: > > > > > > > > http://projhinternet.svn.sourceforge.net/ > > > > > > > > The idea is to allow unidirectional traffic so that with an FCC amateur > > > > license (yes I know I'm not currently broadcasting the call sign as of yet) > > > > you can broadcast unencrypted transmissions for miles (with a linear > > > > amplifier spec'd to 2.4ghz).
With the license FCC part15 no longer applies
> > > > and you can operate just like in any other amateur band.
> > > > _______________________________________________
> > > > freebsd-net@freebsd.org mailing list
> > > > http://lists.freebsd.org/mailman/listinfo/freebsd-net
> > > > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
> > > >
> > >
> >
> > I'm using an Atheros AR2413 chipset, running in pure g mode, with the card also put into "mode 11g" and ucast, mcast, and mgmt rates set to 54. I think the parameter for ibp_rate0 is just for setting it in the header (but I could be wrong). Regardless I am doing this, let me give you the exact source files I'm doing this in.
>
> Well, the ath_rate_* modules afaik do not honor the fixed rate
> settings. At least I've heard something about those being broken. The
> ibp_rate0 parameter set to 108 seems to be correct though.
>
> No clue why that doesn't work, you may have to debug ath_tx_findrix().
> Adding a printf of the passed over rate and ridx should shed some light
> on this I guess.
>
> > Line 38 in this file:
> > http://projhinternet.svn.sourceforge.net/viewvc/projhinternet/src/callbacks.c?revision=69&view=markup
> >
> > And the setup_if function in this:
> > http://projhinternet.svn.sourceforge.net/viewvc/projhinternet/src/libinject.c?revision=69&view=markup
> >
>
> --
> Bernhard

Is there any way to do this without using a kernel debugger? It'd be really disappointing if the whole problem is a bug in the driver.
=20 --k+w/mQv8wyuph6w0 Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (FreeBSD) iQIcBAEBAgAGBQJNlJUIAAoJED6sRHE6Tvmnxz0QANG8yMyuuwTQgxCpjGTm9HeR 739l+L0Xu+KY7Tl56m60PVcyDwMoLVh8NPd936XDm36W3CBswWHFuFRyCHyLsfkp 7gWlNp+C2NGKYxktdBzR/b1POH8AtIBDT1vD454dEjikok5h6T6VKj0DLiXfyfAo Em1dvo/nAbAgojJOUj1SqaxCIyH5CjhyJkvr7dRRR1QrVQNxd3rw1hEBe7NFBwYG aIT32d1MMqbuGr0CSobg+VYCcHZ9/aXBwSikkHrraPL6H/GDXSV1LK7kfOS4zBmt 49eqr+j4+ovmiAcGA1vdxXOtGZ0YuCtsHzoHlrEie/hH/GjDPFBSLT2DS1IVDdbu BykGb9dp6Umm8sPdgfaUjlPBQO6Z+wIlacC9mVlYMZ1CF6ftVb87tNMfzVIo0Riw /gXmHV108P8rDHushuntsj8USLt2uDT7prT7OSjgp2dvv41SvJ43DOJKvNwgkDQg 6sY4+cO0DPGf87m61+4QYlBGpwL3NThfuzHJOnoYNfLt8xotPLJEE38Bv/T4X61u MrdF5CyLWjUVrvLvUNCqjwygHd1EagiAHKaILlNSlXLBkMTL7payzKAxviViI4gn e45aRawyXxFgjHKch3uH+HS5m3nMYAQIewil90hTe5zZffn0pYp5lFJZE/J2te1x 9UGgj4ru3LW7qkH8CbO5 =mxO7 -----END PGP SIGNATURE----- --k+w/mQv8wyuph6w0-- From owner-freebsd-net@FreeBSD.ORG Thu Mar 31 16:15:59 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 07916106564A; Thu, 31 Mar 2011 16:15:58 +0000 (UTC) (envelope-from kungfujesus06@gmail.com) Received: from mail-gx0-f182.google.com (mail-gx0-f182.google.com [209.85.161.182]) by mx1.freebsd.org (Postfix) with ESMTP id 632758FC13; Thu, 31 Mar 2011 16:15:58 +0000 (UTC) Received: by gxk28 with SMTP id 28so1197382gxk.13 for ; Thu, 31 Mar 2011 09:15:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:date:from:to:cc:subject:message-id:references :mime-version:content-type:content-disposition:in-reply-to :user-agent; bh=n1nUyKIP7PpXexPL3BOvLPRmIcpI1HaGY7Mnj/hj8eU=; b=nyhLg0oV8TTAEC88cRh3XhfeXWFUJX5Ud07HaPxA6iHZ2Zt3+PlmMqKIOWn8Mtaydp gbQObhlDhKWDdWTUqGtZYRLGxwbfAqzxYqid+hWemnXRBJIMqfe15RkmLKNrdvCGxhzg LQPzlGHd0uYQ8rWLvOAMQnX3YotaKAqSo2MLE= DomainKey-Signature: a=rsa-sha1; c=nofws; 
d=gmail.com; s=gamma; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; b=e9zmmDZaXybfIZ6cAfJGrVYGLwKKdzBa2JS+HkWp4yR8h1oFQTiENq4zpYfh/dWlcF hYWi7PXjF4MzNHhshDGtYyTB1glni44AZ8nvbFwh7MOMSHSv7OTCDMV5EfwDdfQQahVo rL3bkJpzI65uBgMQ/wXZd4dY9PNdZaV8lLq1g= Received: by 10.236.185.1 with SMTP id t1mr3942268yhm.88.1301588157564; Thu, 31 Mar 2011 09:15:57 -0700 (PDT) Received: from ossumpossum.geop.uc.edu ([129.137.163.184]) by mx.google.com with ESMTPS id y5sm634080yhc.83.2011.03.31.09.15.55 (version=TLSv1/SSLv3 cipher=OTHER); Thu, 31 Mar 2011 09:15:56 -0700 (PDT) Date: Thu, 31 Mar 2011 12:15:53 -0400 From: Adam Stylinski To: Bernhard Schmidt Message-ID: <20110331161553.GB3263@ossumpossum.geop.uc.edu> References: <201103311507.16263.bschmidt@freebsd.org> <20110331151421.GA3263@ossumpossum.geop.uc.edu> <201103311735.40634.bschmidt@freebsd.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="0ntfKIWw70PvrIHh" Content-Disposition: inline In-Reply-To: <201103311735.40634.bschmidt@freebsd.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-net@freebsd.org Subject: Re: net80211 and interface requests X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Mar 2011 16:15:59 -0000 --0ntfKIWw70PvrIHh Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Mar 31, 2011 at 05:35:40PM +0200, Bernhard Schmidt wrote: > On Thursday, March 31, 2011 17:14:21 Adam Stylinski wrote: > > On Thu, Mar 31, 2011 at 03:07:15PM +0200, Bernhard Schmidt wrote: > > > On Thursday, March 31, 2011 14:20:33 Adam Stylinski wrote: > > > > On Thu, Mar 31, 2011 at 09:02:45AM +0200, Bernhard Schmidt wrote: > > > > > On Wednesday, March 30, 2011 
23:17:53 Adam Stylinski wrote:
> > > > > > Hello,
> > > > > >
> > > > > > This list has helped me before so I'll email again with the hopes that
> > > > > > somebody has an answer. All is working well with my project, however for
> > > > > > the life of me I cannot get the interface to inject the raw frames faster
> > > > > > than 11mbps. I'm following the example given in
> > > > > > /usr/src/tools/tools/net80211/wlaninject.c, and manually specifying
> > > > > > parameters such as ucastrate, mcastrate, and mgmtrate within ifconfig. I'm
> > > > > > putting the card into pureg mode, and yet I still can't inject any faster.
> > > > > > I've even gone so far as to specify an ieee80211_txparam struct giving
> > > > > > values of 255 for both mcast and ucast rates within the struct (and of course
> > > > > > ANDing them with 0xff). I then used the ioctl call to set the flags within
> > > > > > the interface request. Any help would be greatly appreciated.
> > > > >
> > > > > You've set the ibp_rate0 parameter right? This one is in half-mbps, so
> > > > > a value of 108 should give you 54m. The only thing I can think of right
> > > > > now is that the device (or channel) is actually configured for 11b not
> > > > > 11g mode. Can we rule that out? Which device are you using?
> > > > >
> > > > > > I am doing nanosleeps in between transmissions as if I don't, the bpf clone
> > > > > > can't inject due to the buffer being too full. There's probably a better
> > > > > > way of doing this, but I doubt the nanosleeps are the issue (after all, I get
> > > > > > almost exactly 11mbps). I should probably note I'm not doing any ACKs, this
> > > > > > is pure transmits.
> > > > > >
> > > > > > If anybody cares enough to look at my unpolished code to get a better idea,
> > > > > > look here:
> > > > > >
> > > > > > http://projhinternet.svn.sourceforge.net/
> > > > > >
> > > > > > The idea is to allow unidirectional traffic so that with an FCC amateur
> > > > > > license (yes I know I'm not currently broadcasting the call sign as of yet)
> > > > > > you can broadcast unencrypted transmissions for miles (with a linear
> > > > > > amplifier spec'd to 2.4ghz). With the license FCC part15 no longer applies
> > > > > > and you can operate just like in any other amateur band.
> > > > > > _______________________________________________
> > > > > > freebsd-net@freebsd.org mailing list
> > > > > > http://lists.freebsd.org/mailman/listinfo/freebsd-net
> > > > > > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
> > > > > >
> > > > >
> > > >
> > > > I'm using an Atheros AR2413 chipset, running in pure g mode, with the card also put into "mode 11g" and ucast, mcast, and mgmt rates set to 54. I think the parameter for ibp_rate0 is just for setting it in the header (but I could be wrong). Regardless I am doing this, let me give you the exact source files I'm doing this in.
> > >
> > > Well, the ath_rate_* modules afaik do not honor the fixed rate
> > > settings. At least I've heard something about those being broken. The
> > > ibp_rate0 parameter set to 108 seems to be correct though.
> > >
> > > No clue why that doesn't work, you may have to debug ath_tx_findrix().
> > > Adding a printf of the passed over rate and ridx should shed some light
> > > on this I guess.
> > >
> > > > Line 38 in this file:
> > > > http://projhinternet.svn.sourceforge.net/viewvc/projhinternet/src/callbacks.c?revision=69&view=markup
> > > >
> > > > And the setup_if function in this:
> > > > http://projhinternet.svn.sourceforge.net/viewvc/projhinternet/src/libinject.c?revision=69&view=markup
> > > >
> > >
> >
> > It turns out strange coincidences can happen. I decided to busy loop, thinking maybe it was my nanosleep call. And what do you know, 52Mb/sec. Is there some sort of call I can use to probe the fd to see if the buffer has been sent yet?
>
> Honestly, no clue. The bpf transmit path is a bunch of ugly hacks..
> What you can try though is to enable various debug options for
> net80211 and ath to figure out what's going on, especially the bits
> for xmit.
>
> On an unrelated side note, how is the ath/wlan0 interface configured?
> I mean, is it in sta mode or ahdemo? I guess most tests have been done
> in ahdemo mode. Also I'm sure that all frames are simply discarded if
> the device is currently scanning.
>
> --
> Bernhard

I'm running in ahdemo mode. Hmm, I really don't want to busyloop the CPU, but around ~90,000 for-loop assignments and comparisons for my 2.8GHz CPU yields the correct time. If we disregard any pipelining that could be occurring it would come out to around (2*90,000)/(2800*10^6) seconds. This is about 64ish microseconds. Now realistically FreeBSD is not a real-time OS by any means but is there some better way I can use other than spin locking the process?
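The estimate in the message above can be checked with a few lines of C. This is only a sketch of the arithmetic, not code from the thread: `busy_loop_ns()` and `payload_airtime_ns()` are hypothetical helper names, the two-operations-per-iteration cost model is the one assumed in the message, and the airtime figure counts payload bits only (no preamble, inter-frame spacing or contention), with the rate expressed in the same half-Mb/s units as ibp_rate0.

```c
#include <stdint.h>

/*
 * Rough duration, in nanoseconds, of a busy loop that runs `iterations`
 * passes of `ops_per_iter` operations each on a CPU clocked at `cpu_hz`,
 * assuming one operation per cycle (no pipelining or superscalar
 * effects). Hypothetical helper, for illustration only.
 */
static uint64_t
busy_loop_ns(uint64_t iterations, uint64_t ops_per_iter, uint64_t cpu_hz)
{
    if (cpu_hz == 0)
        return (0);
    return (iterations * ops_per_iter * UINT64_C(1000000000) / cpu_hz);
}

/*
 * Time, in nanoseconds, needed to clock `bytes` of payload out at
 * `rate_half_mbps` (units of 0.5 Mb/s, so 108 means 54 Mb/s).
 * Payload bits only; real frames also pay for preamble and spacing.
 */
static uint64_t
payload_airtime_ns(uint64_t bytes, uint64_t rate_half_mbps)
{
    if (rate_half_mbps == 0)
        return (0);
    /* bits / (rate / 2) gives microseconds, so bits * 2000 / rate is ns. */
    return (bytes * 8 * 2000 / rate_half_mbps);
}
```

With the numbers from the message, `busy_loop_ns(90000, 2, 2800000000)` comes out at 64285 ns, which matches the "64ish microseconds" figure. For comparison, under the payload-only assumption a 1500-byte frame takes about 222 microseconds at a rate value of 108 and about 1091 microseconds at 22.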
--0ntfKIWw70PvrIHh Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (FreeBSD) iQIcBAEBAgAGBQJNlKi4AAoJED6sRHE6TvmncMYP/3ALSiT9tqwvXNaeVdT33OZA X5Mk17zi+fo9jcT4ay0DcHszwfSozeZnWvOVaNsmYzvvxC3LCe2ICw6VAyEet8rd ID6e6i+hRd7wRKww9hBWoUoXlTtF0LI4EZaQWhf3+KNpHJ+x8PFhrRv6xOYRVZ03 ixoq1/wm7T6RfpEg4V5QcbxPw7JKAbcQJlBWKnNOUfnu/XGSe90Gf/mIGHVC5ps3 5foa18JVvuX9M/E/IvC6ols0vW/N76/Vne4Wpvq5ziW22fQlGD0qzxos4O8vlrle CTq6jTipdM6o/jo5PqsW/B7L0GIVPPhPxHTejzUADpnqQFAaFbY2sKGAQL1sH4ir zD4SCI9KXhr9hf/ecttxwpyRQ+z66ScxGEPRLuEJ23atNKdiNpPC47vpo1UJh8LJ H5uqcV/UNhIoZ7zsr+M0yRA0hjbpbapfbMxiWz2kRlLOgGkoHRMqxgIlJI5sYgsQ /2mbI5Sb0/6YkKNEAzKL4A6HGduEcHqD1cVqUda0OIAYUVmRBA19jog3cmSboKWO r0reW40Bre8ROXIKx/cIESB7TRuwTwpAkAjEHHjq4Wj8qt7ISImVrYCz/RdQqPem NMdhh5YHhuXHS25s8lLaAQN3lrzK8LOCATlZTNYT8vbdpDhANBi4ku12SdT5kuMv IsD5QBTdMlVQDKutMHUq =xrwE -----END PGP SIGNATURE----- --0ntfKIWw70PvrIHh-- From owner-freebsd-net@FreeBSD.ORG Thu Mar 31 17:14:19 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C41161065672; Thu, 31 Mar 2011 17:14:19 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id 699F78FC14; Thu, 31 Mar 2011 17:14:19 +0000 (UTC) Received: by iwn33 with SMTP id 33so3247556iwn.13 for ; Thu, 31 Mar 2011 10:14:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:from:date:to:cc:subject:message-id:reply-to :references:mime-version:content-type:content-disposition :in-reply-to:user-agent; bh=u13DP9GH/GfVlMoG7Q5+eLJlyoAg27Cv7nvkf1SPHh8=; b=Ofz6TvhR7aEHwjQPK45Gn7hQGIOBRQXzD7fbA9+Mvz1ZXwgqydI5xJ+Nqa/EqjoLb2 ARfQilG0SCqnRxZNkhGyLct/IpI7u7/XndDlVrYgjKCvI+czwA2BNQwtNiVQO9igcuMb tZS5f6KzrxdJZ3nWrcNV4HkM/XjHa0SWCNVj4= DomainKey-Signature: a=rsa-sha1; c=nofws; 
d=gmail.com; s=gamma; h=from:date:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; b=I4FSql0ZmbWfG1b3f2ZzknUBtNDk+KpJFWS7TJ3AYyXVi1fvwSNaYIITHr4+czBxBe ibLn04rNDiuqh6vo9EnyYhFh8/ZvNa32EL5Z5yQ41s5NTe+htFAieHoA18moUuxDmAsy dELGEMMDVftp+dQ+7odPCYS1WNeJu9ln2i06Y= Received: by 10.42.161.7 with SMTP id r7mr3665259icx.228.1301591658653; Thu, 31 Mar 2011 10:14:18 -0700 (PDT) Received: from pyunyh@gmail.com ([174.35.1.224]) by mx.google.com with ESMTPS id uk4sm736345icb.9.2011.03.31.10.14.14 (version=TLSv1/SSLv3 cipher=OTHER); Thu, 31 Mar 2011 10:14:16 -0700 (PDT) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Thu, 31 Mar 2011 10:13:03 -0700 From: YongHyeon PYUN Date: Thu, 31 Mar 2011 10:13:03 -0700 To: Yamagi Burmeister Message-ID: <20110331171302.GA11981@michelle.cdnetworks.com> References: <20110330173145.GB8601@michelle.cdnetworks.com> <20110330202858.GC8601@michelle.cdnetworks.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="7AUc2qLy4jB3hD7Z" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i Cc: freebsd-net@freebsd.org, yongari@freebsd.org Subject: Re: Kernel memory corruption(?) with age(4) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Mar 2011 17:14:19 -0000 --7AUc2qLy4jB3hD7Z Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Thu, Mar 31, 2011 at 09:05:19AM +0200, Yamagi Burmeister wrote: > On Wed, 30 Mar 2011, YongHyeon PYUN wrote: > > >>Okay, I did a test run with RX checksum, TX checksum and both disabled. > >>In all three cases the crash occurs within about 20 minutes. 
I'm either > >>not sure that age(4) is the problem but it has definedly something to do > >>with the problem, since with another nic driver the same scenario is > >>rock solid... > >> > > > >OK. > > > >>The workload: It's a NFS3 server (FreeBSDs non-experimental > >>implementation), serving and receiving file with about 250 to 500 > >>megabytes at about 20mb/s. The clients are FreeBSD 7 and 8 systems and > >>are mounting the shares via TCP. The connection is 1000mbit/s via a > >>"dumb" gigabit switch. > >> > > > >That's too broad to narrow down the issue. :-( > >I'm not sure but your box seem to have more than 4GB memory. Could > >you limit the available memory to 3GB via loader.conf and test it > >again? > > All boxes are quadcore machines with 8GB RAM, running FreeBSD/amd64. > After limiting the memory via hw.physmem to 3GB the problems are gone. > The box is running crashfree for more than 6 hours and has served over > 300GB of data via age(4). > Thanks for testing. Remove the hw.physmem configuration and try attached patch and let me know how it goes. --7AUc2qLy4jB3hD7Z Content-Type: text/x-diff; charset=us-ascii Content-Disposition: attachment; filename="age.dma.diff" Index: sys/dev/age/if_age.c =================================================================== --- sys/dev/age/if_age.c (revision 220116) +++ sys/dev/age/if_age.c (working copy) @@ -2452,6 +2452,9 @@ /* Update the consumer index. */ sc->age_cdata.age_rr_cons = rr_cons; + bus_dmamap_sync(sc->age_cdata.age_rx_ring_tag, + sc->age_cdata.age_rx_ring_map, + BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE); /* Sync descriptors. 
*/ bus_dmamap_sync(sc->age_cdata.age_rr_ring_tag, sc->age_cdata.age_rr_ring_map, --7AUc2qLy4jB3hD7Z-- From owner-freebsd-net@FreeBSD.ORG Thu Mar 31 18:07:37 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A1660106566C; Thu, 31 Mar 2011 18:07:37 +0000 (UTC) (envelope-from lists@yamagi.org) Received: from mail.yamagi.overkill.yamagi.org (unknown [IPv6:2a01:4f8:121:2102:1::7]) by mx1.freebsd.org (Postfix) with ESMTP id 3A5B98FC19; Thu, 31 Mar 2011 18:07:37 +0000 (UTC) Received: from [2001:5c0:110d:6600:226:c6ff:fec4:399e] (unknown [IPv6:2001:5c0:110d:6600:226:c6ff:fec4:399e]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.yamagi.overkill.yamagi.org (Postfix) with ESMTPSA id 3B89216663D3; Thu, 31 Mar 2011 20:07:31 +0200 (CEST) Date: Thu, 31 Mar 2011 20:07:17 +0200 (CEST) From: Yamagi Burmeister X-X-Sender: yamagi@maka.home.yamagi.org To: YongHyeon PYUN In-Reply-To: <20110331171302.GA11981@michelle.cdnetworks.com> Message-ID: References: <20110330173145.GB8601@michelle.cdnetworks.com> <20110330202858.GC8601@michelle.cdnetworks.com> <20110331171302.GA11981@michelle.cdnetworks.com> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-net@freebsd.org, Yamagi Burmeister , yongari@freebsd.org Subject: Re: Kernel memory corruption(?) with age(4) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Mar 2011 18:07:37 -0000 On Thu, 31 Mar 2011, YongHyeon PYUN wrote: >> All boxes are quadcore machines with 8GB RAM, running FreeBSD/amd64. >> After limiting the memory via hw.physmem to 3GB the problems are gone. 
>> The box is running crashfree for more than 6 hours and has served over >> 300GB of data via age(4). >> > > Thanks for testing. Remove the hw.physmem configuration and try > attached patch and let me know how it goes. Thanks for your help, but the patch doesn't work. Another random panic - this time "page fault in kernel mode" - with nothing age(4) or network stack related stuff in the backtrace... Maybe it'll help to know about a bug fix in the linux atl1 driver, now replaced by atlx. In git commit 5f08e46b621a769e52a9545a23ab1d5fb2aec1d4 64 bit DMA was disabled: 64-bit DMA causes data corruption with atl1. We don't know why, and Atheros is working on it. For now, just use 32-bit DMA. This is a big hack that is probably wrong, but it stops the bleeding. There was no later follow up on it. I think that this can't be problem on FreeBSD but maybe I'm reading the driver code wrong. The kernel.org gitweb URL is: http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.23.y.git;a=commitdiff;h=5f08e46b621a769e52a9545a23ab1d5fb2aec1d4 -- Homepage: www.yamagi.org Jabber: yamagi@yamagi.org GnuPG/GPG: 0xEFBCCBCB From owner-freebsd-net@FreeBSD.ORG Thu Mar 31 18:18:08 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7AFED10656ED; Thu, 31 Mar 2011 18:18:08 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-iy0-f182.google.com (mail-iy0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id 0FBFC8FC08; Thu, 31 Mar 2011 18:18:07 +0000 (UTC) Received: by iyj12 with SMTP id 12so3357226iyj.13 for ; Thu, 31 Mar 2011 11:18:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:from:date:to:cc:subject:message-id:reply-to :references:mime-version:content-type:content-disposition :in-reply-to:user-agent; bh=PzhCTNtVq86/xf1SsfG5QImhzqly/GLQJpxjhBPpk2U=; 
b=j9hcJRmq8PUGCy/wtlQ8vIbjnZZE3bEbFx3DCGU3U6y3gzeTrX8kpNehSSfUklADa6 8zNZgvhjmmb4TYSs8BZC2viqoA7Ey5DyPv2nW5lPOTFoNawAbr944KaCCerls1rRzdp6 J0nTJyE2DAAgdJC5pIRCtKQ7Lk/yGEVRxvYKc= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:date:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; b=xgEHmLTG+/0OZsUfw4Zg3h4O2rcgZjMD2K0KLg1HYlg0Ver4wlj1bW+v86MvpX1YP4 0cdsa9nESUPI3IynfL4fmax1sa1Mm98z0d7uKEF3SBzm5cx/vZfsf0HwkuAjgVCVlB5i YownWZUzN+rb4xbPEGRw9hfmc5UfSvyL5XMTg= Received: by 10.43.51.135 with SMTP id vi7mr3256371icb.336.1301595486535; Thu, 31 Mar 2011 11:18:06 -0700 (PDT) Received: from pyunyh@gmail.com ([174.35.1.224]) by mx.google.com with ESMTPS id gx2sm887576ibb.43.2011.03.31.11.18.03 (version=TLSv1/SSLv3 cipher=OTHER); Thu, 31 Mar 2011 11:18:05 -0700 (PDT) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Thu, 31 Mar 2011 11:16:52 -0700 From: YongHyeon PYUN Date: Thu, 31 Mar 2011 11:16:52 -0700 To: Yamagi Burmeister Message-ID: <20110331181651.GB11981@michelle.cdnetworks.com> References: <20110330173145.GB8601@michelle.cdnetworks.com> <20110330202858.GC8601@michelle.cdnetworks.com> <20110331171302.GA11981@michelle.cdnetworks.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="xgyAXRrhYN0wYx8y" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i Cc: freebsd-net@freebsd.org, yongari@freebsd.org Subject: Re: Kernel memory corruption(?) 
with age(4) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Mar 2011 18:18:08 -0000 --xgyAXRrhYN0wYx8y Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Thu, Mar 31, 2011 at 08:07:17PM +0200, Yamagi Burmeister wrote: > On Thu, 31 Mar 2011, YongHyeon PYUN wrote: > > >>All boxes are quadcore machines with 8GB RAM, running FreeBSD/amd64. > >>After limiting the memory via hw.physmem to 3GB the problems are gone. > >>The box is running crashfree for more than 6 hours and has served over > >>300GB of data via age(4). > >> > > > >Thanks for testing. Remove the hw.physmem configuration and try > >attached patch and let me know how it goes. > > Thanks for your help, but the patch doesn't work. Another random panic - > this time "page fault in kernel mode" - with nothing age(4) or network > stack related stuff in the backtrace... > > Maybe it'll help to know about a bug fix in the linux atl1 driver, now > replaced by atlx. In git commit 5f08e46b621a769e52a9545a23ab1d5fb2aec1d4 > 64 bit DMA was disabled: > > 64-bit DMA causes data corruption with atl1. We don't know why, and > Atheros is working on it. For now, just use 32-bit DMA. This is a big > hack that is probably wrong, but it stops the bleeding. > > There was no later follow up on it. I think that this can't be problem > on FreeBSD but maybe I'm reading the driver code wrong. The kernel.org > gitweb URL is: > > http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.23.y.git;a=commitdiff;h=5f08e46b621a769e52a9545a23ab1d5fb2aec1d4 > Thanks a lot! It seems the L1 controller has data corruption issue when 64bit DMA addressing is used. Try this one. 
--xgyAXRrhYN0wYx8y Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="age.dma.diff2" Index: sys/dev/age/if_age.c =================================================================== --- sys/dev/age/if_age.c (revision 220116) +++ sys/dev/age/if_age.c (working copy) @@ -1092,10 +1092,13 @@ * Create Tx/Rx buffer parent tag. * L1 supports full 64bit DMA addressing in Tx/Rx buffers * so it needs separate parent DMA tag. + * XXX + * It seems enabling 64bit DMA causes data corruption. Limit + * DMA address space to 32bit. */ error = bus_dma_tag_create( bus_get_dma_tag(sc->age_dev), /* parent */ - 1, 0, /* alignment, boundary */ + BUS_SPACE_MAXADDR_32BIT, 0, /* alignment, boundary */ BUS_SPACE_MAXADDR, /* lowaddr */ BUS_SPACE_MAXADDR, /* highaddr */ NULL, NULL, /* filter, filterarg */ @@ -2452,6 +2455,9 @@ /* Update the consumer index. */ sc->age_cdata.age_rr_cons = rr_cons; + bus_dmamap_sync(sc->age_cdata.age_rx_ring_tag, + sc->age_cdata.age_rx_ring_map, + BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE); /* Sync descriptors. 
*/ bus_dmamap_sync(sc->age_cdata.age_rr_ring_tag, sc->age_cdata.age_rr_ring_map, --xgyAXRrhYN0wYx8y-- From owner-freebsd-net@FreeBSD.ORG Thu Mar 31 18:32:10 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6D228106579B; Thu, 31 Mar 2011 18:32:10 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id 8F2AF8FC20; Thu, 31 Mar 2011 18:32:09 +0000 (UTC) Received: by iwn33 with SMTP id 33so3334774iwn.13 for ; Thu, 31 Mar 2011 11:32:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:from:date:to:cc:subject:message-id:reply-to :references:mime-version:content-type:content-disposition :in-reply-to:user-agent; bh=lwa7dXbQmYw3PyuXynzgHDdzlenF1/eh7AAn5sDP//Q=; b=sms9XluXm0CjN59xWlGjZXRXCUfT4KoajjCIhDmO2nByS97P6tDolMlKwB0QgxY4tb WX9CROMUpTkqWAfUv0o0q0Qgqv8X4Bh+5UVJNhUXEe4TlUzL3vHdFOffro16RtVkm1Mr RLh1yeu4m+l7Khk9Swhm9oN012e1nMeRfX9Wo= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:date:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; b=IO3irBVSr7Y6CROM0rV8BJ55G5a2WcqzEtnAsfLdm8QOORgNSp9xzYRD17p3EFvzJ0 kTCOCxNwpbiojEjggIrMDFY4DfqAkyOVasXgetaq0qgfN+6NE47/Dq1hWbsAy82qMWqC 3X0tZo+EkBoV4DRsUR/9wEkrcEmxLpIY9ZstY= Received: by 10.43.60.200 with SMTP id wt8mr3425221icb.358.1301596328824; Thu, 31 Mar 2011 11:32:08 -0700 (PDT) Received: from pyunyh@gmail.com ([174.35.1.224]) by mx.google.com with ESMTPS id uf10sm772377icb.5.2011.03.31.11.32.05 (version=TLSv1/SSLv3 cipher=OTHER); Thu, 31 Mar 2011 11:32:07 -0700 (PDT) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Thu, 31 Mar 2011 11:30:54 -0700 From: YongHyeon PYUN Date: Thu, 31 Mar 2011 11:30:54 -0700 To: Yamagi Burmeister Message-ID: 
<20110331183054.GC11981@michelle.cdnetworks.com> References: <20110330173145.GB8601@michelle.cdnetworks.com> <20110330202858.GC8601@michelle.cdnetworks.com> <20110331171302.GA11981@michelle.cdnetworks.com> <20110331181651.GB11981@michelle.cdnetworks.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="JgQwtEuHJzHdouWu" Content-Disposition: inline In-Reply-To: <20110331181651.GB11981@michelle.cdnetworks.com> User-Agent: Mutt/1.4.2.3i Cc: freebsd-net@freebsd.org, yongari@freebsd.org Subject: Re: Kernel memory corruption(?) with age(4) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Mar 2011 18:32:10 -0000 --JgQwtEuHJzHdouWu Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Thu, Mar 31, 2011 at 11:16:52AM -0700, YongHyeon PYUN wrote: > On Thu, Mar 31, 2011 at 08:07:17PM +0200, Yamagi Burmeister wrote: > > On Thu, 31 Mar 2011, YongHyeon PYUN wrote: > > > > >>All boxes are quadcore machines with 8GB RAM, running FreeBSD/amd64. > > >>After limiting the memory via hw.physmem to 3GB the problems are gone. > > >>The box is running crashfree for more than 6 hours and has served over > > >>300GB of data via age(4). > > >> > > > > > >Thanks for testing. Remove the hw.physmem configuration and try > > >attached patch and let me know how it goes. > > > > Thanks for your help, but the patch doesn't work. Another random panic - > > this time "page fault in kernel mode" - with nothing age(4) or network > > stack related stuff in the backtrace... > > > > Maybe it'll help to know about a bug fix in the linux atl1 driver, now > > replaced by atlx. In git commit 5f08e46b621a769e52a9545a23ab1d5fb2aec1d4 > > 64 bit DMA was disabled: > > > > 64-bit DMA causes data corruption with atl1. We don't know why, and > > Atheros is working on it. 
For now, just use 32-bit DMA. This is a big > > hack that is probably wrong, but it stops the bleeding. > > > > There was no later follow up on it. I think that this can't be problem > > on FreeBSD but maybe I'm reading the driver code wrong. The kernel.org > > gitweb URL is: > > > > http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.23.y.git;a=commitdiff;h=5f08e46b621a769e52a9545a23ab1d5fb2aec1d4 > > > > Thanks a lot! It seems the L1 controller has data corruption issue > when 64bit DMA addressing is used. Try this one. Oops, there was a bug in previous patch. Try this instead. --JgQwtEuHJzHdouWu Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="age.dma.diff3" Index: sys/dev/age/if_age.c =================================================================== --- sys/dev/age/if_age.c (revision 220116) +++ sys/dev/age/if_age.c (working copy) @@ -1092,11 +1092,14 @@ * Create Tx/Rx buffer parent tag. * L1 supports full 64bit DMA addressing in Tx/Rx buffers * so it needs separate parent DMA tag. + * XXX + * It seems enabling 64bit DMA causes data corruption. Limit + * DMA address space to 32bit. */ error = bus_dma_tag_create( bus_get_dma_tag(sc->age_dev), /* parent */ 1, 0, /* alignment, boundary */ - BUS_SPACE_MAXADDR, /* lowaddr */ + BUS_SPACE_MAXADDR_32BIT, /* lowaddr */ BUS_SPACE_MAXADDR, /* highaddr */ NULL, NULL, /* filter, filterarg */ BUS_SPACE_MAXSIZE_32BIT, /* maxsize */ @@ -2452,6 +2455,9 @@ /* Update the consumer index. */ sc->age_cdata.age_rr_cons = rr_cons; + bus_dmamap_sync(sc->age_cdata.age_rx_ring_tag, + sc->age_cdata.age_rx_ring_map, + BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE); /* Sync descriptors. 
*/ bus_dmamap_sync(sc->age_cdata.age_rr_ring_tag, sc->age_cdata.age_rr_ring_map, --JgQwtEuHJzHdouWu-- From owner-freebsd-net@FreeBSD.ORG Thu Mar 31 19:59:31 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 424D6106567C; Thu, 31 Mar 2011 19:59:31 +0000 (UTC) (envelope-from lists@yamagi.org) Received: from mail.yamagi.overkill.yamagi.org (unknown [IPv6:2a01:4f8:121:2102:1::7]) by mx1.freebsd.org (Postfix) with ESMTP id CA7778FC20; Thu, 31 Mar 2011 19:59:30 +0000 (UTC) Received: from [2001:5c0:110d:6600:226:c6ff:fec4:399e] (unknown [IPv6:2001:5c0:110d:6600:226:c6ff:fec4:399e]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.yamagi.overkill.yamagi.org (Postfix) with ESMTPSA id 64E2F16663D1; Thu, 31 Mar 2011 21:59:23 +0200 (CEST) Date: Thu, 31 Mar 2011 21:59:12 +0200 (CEST) From: Yamagi Burmeister X-X-Sender: yamagi@maka.home.yamagi.org To: YongHyeon PYUN In-Reply-To: <20110331183054.GC11981@michelle.cdnetworks.com> Message-ID: References: <20110330173145.GB8601@michelle.cdnetworks.com> <20110330202858.GC8601@michelle.cdnetworks.com> <20110331171302.GA11981@michelle.cdnetworks.com> <20110331181651.GB11981@michelle.cdnetworks.com> <20110331183054.GC11981@michelle.cdnetworks.com> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-net@freebsd.org, Yamagi Burmeister , yongari@freebsd.org Subject: Re: Kernel memory corruption(?) with age(4) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Mar 2011 19:59:31 -0000 On Thu, 31 Mar 2011, YongHyeon PYUN wrote: >> Thanks a lot! 
It seems the L1 controller has data corruption issue >> when 64bit DMA addressing is used. Try this one. > > Oops, there was a bug in previous patch. > Try this instead. Okay, that patch seems to do the trick. This was just a short test run of about one hour with just 50gb copied, but without the patch the system would have crashed in the first 20 minutes. I'll do a more comprehensive test over night and report back tomorrow morning. -- Homepage: www.yamagi.org Jabber: yamagi@yamagi.org GnuPG/GPG: 0xEFBCCBCB From owner-freebsd-net@FreeBSD.ORG Thu Mar 31 21:09:06 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A27CF1065670 for ; Thu, 31 Mar 2011 21:09:06 +0000 (UTC) (envelope-from lacombar@gmail.com) Received: from mail-iy0-f182.google.com (mail-iy0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id 66CB28FC16 for ; Thu, 31 Mar 2011 21:09:06 +0000 (UTC) Received: by iyj12 with SMTP id 12so3553008iyj.13 for ; Thu, 31 Mar 2011 14:09:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=w6m8P+XfFxNuI7rB6FjnbtB6dC38pJjWZmmpcsrZ3p0=; b=vE8jF29B4YpFAegcFQnhm8hpXFK90W4xMurJs7xKtN9REYN4vAZlfuGAsPPhZDku+4 Nl4KcNY3vww45agO7rHft2P2wToJOdYv5mGWVShdxJacVa8Z12BsU8SSTtB4wv2dAe6X 1Cy6udniTVWkhAUQjq3vxphoaHP68oWG/OiOk= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=UNCQ/KJKUtxuJlzj29iBxcEKSpr4dl1s/NGzlUrw3YYH8aG0EKOe4Xgm8TQYgV7anF aSrJECjD8Miznaw2hTsPF+YidurL4LUseRtY1sT2ITh2dJL6gADXp+YTvA0iF9jVpGQV jDvYUq5F9i2r/KQJg1qvrzWzMD69aT03hIebQ= MIME-Version: 1.0 Received: by 10.42.1.70 with SMTP id 6mr3946197icf.483.1301605745923; Thu, 31 
Mar 2011 14:09:05 -0700 (PDT) Received: by 10.42.146.72 with HTTP; Thu, 31 Mar 2011 14:09:05 -0700 (PDT) In-Reply-To: References: Date: Thu, 31 Mar 2011 17:09:05 -0400 Message-ID: From: Arnaud Lacombe To: Jack Vogel Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable Cc: freebsd-net@freebsd.org Subject: Re: em(4) hang [Was: Re: igb(4) won't start with "igb0: Could not setup receive structures"] X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Mar 2011 21:09:06 -0000

Hi Jack,

On Thu, Mar 31, 2011 at 9:51 AM, Arnaud Lacombe wrote:
> [...]
> I'll remove part of the changes I made to keep only `rx_forced_refill'
> and the associated sysctl, re-run the tests and come back with correct
> value, hopefully in a few hours.
>
Here it is:

# sysctl dev.em.0.%desc
dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.2

# sysctl dev.em.0.mac_stats.missed_packets
dev.em.0.mac_stats.missed_packets: 917428

# sysctl dev.em.0.debug=1
dev.em.0.debug: I-1nterface is RUNNING and INACTIVE
em0: hw tdh = 975, hw tdt = 975
em0: hw rdh = 884, hw rdt = 885
em0: Tx Queue Status = 0
em0: TX descriptors avail = 1024
em0: Tx Descriptors avail failure = 0
em0: RX discarded packets = 0
em0: RX Next to Check = 884
em0: RX Next to Refresh = 885
 -> -1

So the taskqueue cannot be scheduled to run and the driver is stuck.

> On Wed, Mar 30, 2011 at 2:22 PM, Jack Vogel wrote:
>> Read the code in HEAD, em_local_timer() has a test of ALL the rx queues and
>> will schedule a task that refreshes mbufs if they are empty. This has
>> exactly the
>> same effect as checking for some interrupt cause, a cause that is not
>> available
>> when using MSIX on 82574, but this approach works for everything.
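
A minimal sketch of the timer check described in the quoted text, assuming a ring counts as "empty" when its "Next to Check" and "Next to Refresh" indices are equal. The struct and function names are hypothetical and are not taken from the em(4) source. With the values from the debug output above (884 and 885) the two indices differ, so under this assumption no refresh task would be scheduled for that ring.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of one RX ring's software indices. */
struct rx_ring_model {
	unsigned int next_to_check;   /* next descriptor the driver will inspect */
	unsigned int next_to_refresh; /* next descriptor that needs a new mbuf */
};

/*
 * Assumed timer-side test: a ring is treated as empty only when the two
 * indices coincide, and only then would a refresh task be scheduled.
 */
static bool
timer_would_schedule_refresh(const struct rx_ring_model *ring)
{
	assert(ring != NULL);
	return (ring->next_to_check == ring->next_to_refresh);
}

/* Test every queue, as the timer is described to do; count the refreshes. */
static size_t
count_rings_to_refresh(const struct rx_ring_model *rings, size_t nrings)
{
	size_t scheduled = 0;

	for (size_t i = 0; i < nrings; i++) {
		if (timer_would_schedule_refresh(&rings[i]))
			scheduled++;
	}
	return (scheduled);
}
```

Calling timer_would_schedule_refresh() on a ring initialised with 884 and 885 returns false in this model, which matches the observation that nothing gets scheduled.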
>>
Can you please point me to a reference datasheet (or errata), provided by Intel, about the RX Overrun interrupt not being available with MSI-X on the 82574?

Currently, I only have access to [0], which states the following:

7.4 Interrupts
7.4.2 MSI-X Mode
[...]
The following configuration and parameters are involved:
• The IVAR.INT_Alloc[4:0] entries map two Tx queues, two Rx queues and other events to 5 interrupt vectors
• The ICR[24:20] bits reflect specific interrupt causes
• Five MSI-X interrupt vectors are provided (calculated based on four vectors for queues and one vector for other causes). The requested number of vectors is loaded from the MSI_X_N fields in the EEPROM into the PCIe MSI-X capability structure of the function.

10.2.4.1 Interrupt Cause Read Register - ICR (0x000C0; RC/WC)
[...]

about bit 24:

Other Interrupt. Indicates one of the following interrupts was set:
• Link Status Change.
• Receiver Overrun.
• MDIO Access Complete.
• Small Receive Packet Detected.
• Receive ACK Frame Detected.
• Manageability Event Detected.
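
As an illustration of the register layout quoted above, here is a small sketch that extracts the ICR[24:20] field and tests bit 24 ("Other Interrupt") from a raw 32-bit ICR value. The macro and function names are hypothetical; only the bit positions come from the quoted datasheet text, and nothing here is taken from the em(4) driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Per the quoted text, ICR[24:20] reflect the MSI-X related causes. */
#define ICR_MSIX_CAUSE_SHIFT	20u
#define ICR_MSIX_CAUSE_MASK	0x1Fu	/* five bits: 24 down to 20 */
#define ICR_OTHER_BIT		24u	/* "Other Interrupt" */

/* Return true if the given bit of a raw ICR value is set. */
static bool
icr_bit_set(uint32_t icr, unsigned int bit)
{
	assert(bit < 32);
	return (((icr >> bit) & 1u) != 0);
}

/* Extract ICR[24:20] as a 5-bit value (bit 20 ends up in bit 0). */
static uint32_t
icr_msix_causes(uint32_t icr)
{
	return ((icr >> ICR_MSIX_CAUSE_SHIFT) & ICR_MSIX_CAUSE_MASK);
}

/* Bit 24 groups several events, Receiver Overrun among them. */
static bool
icr_other_cause_set(uint32_t icr)
{
	return (icr_bit_set(icr, ICR_OTHER_BIT));
}
```

Note that bit 24 alone does not say which of the listed events fired; it only narrows the cause down to that group.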
Thanks in advance, - Arnaud [0]: ftp://download.intel.com/design/network/datashts/82574.pdf From owner-freebsd-net@FreeBSD.ORG Thu Mar 31 21:57:57 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 54431106566B for ; Thu, 31 Mar 2011 21:57:57 +0000 (UTC) (envelope-from jfvogel@gmail.com) Received: from mail-vx0-f182.google.com (mail-vx0-f182.google.com [209.85.220.182]) by mx1.freebsd.org (Postfix) with ESMTP id 019998FC17 for ; Thu, 31 Mar 2011 21:57:56 +0000 (UTC) Received: by vxc34 with SMTP id 34so2811426vxc.13 for ; Thu, 31 Mar 2011 14:57:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=N1e+u+M4/jMaeMrPSTJNfbqV8f3/YdB0qxf1EzH2db4=; b=KUA3peiPnbxJ0QTe2l0ffLOpWwNY5W8JwYjuv8DRX3lZK/Np00pVuQogLeoULepzkr cySk4nYcDnCFRYJUDZpbryd3oN93rizLJJsUG32v0Jgicyw3FtCbxu6LrMU3n3Dt29RN AUYJnIR9aEwqEph9uR0QoUhTzX0yYnYLq12To= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=H9OBYaamBsg6qLO3cDzW8jhlKQZgqAY1aqgLhmD/U49IZEuYGAjz15WVEsqpHbR3hD JXgOll1TLv3D+zzymzRrmIUylu8ulCMHgFJI/rQpQTOyHjRQ5SPpv9bM2zk2Sru4lb2Q N02eyLbWYySkNhxnk3Tqp15RtNYUYDS9FfNQQ= MIME-Version: 1.0 Received: by 10.52.92.161 with SMTP id cn1mr4365487vdb.253.1301608676366; Thu, 31 Mar 2011 14:57:56 -0700 (PDT) Received: by 10.52.167.6 with HTTP; Thu, 31 Mar 2011 14:57:56 -0700 (PDT) In-Reply-To: References: Date: Thu, 31 Mar 2011 14:57:56 -0700 Message-ID: From: Jack Vogel To: Arnaud Lacombe Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-net@freebsd.org Subject: Re: em(4) hang [Was: Re: igb(4) won't start with "igb0: Could not setup receive 
structures"] X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Mar 2011 21:57:57 -0000

So, what is the evidence that the driver is stuck here?

I see that next_to_check != next_to_refresh, which is why the local timer won't schedule anything. OH, and I also realized there is a problem with local_timer anyway: it will run rxeof, but that won't help if you can't enter the loop, so I need to add some code at the top to call em_refresh_mbufs() when in this state.

On this interrupt cause that you are focused upon: although it's there in the design, I had talked with some of our most seasoned developers on both the Windows and Linux side of the house, and NO one has ever used this 'feature', because (and I'm quoting here) "there's no good use case for it". Meaning, there's always some simpler way of handling the issue.

When you use MSIX you can't read causes, btw. If you configured it, it would mean you'd just get into the regular RX handler, same as always, so why some special bother with this cause?

On non-MSIX hardware there is just no particular reason to worry about the cause either; we can just handle the RX situation in the interrupt handler.

Jack

From owner-freebsd-net@FreeBSD.ORG Thu Mar 31 22:07:34 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 26653106566B for ; Thu, 31 Mar 2011 22:07:34 +0000 (UTC) (envelope-from joesuf4@gmail.com) Received: from mail-vw0-f54.google.com (mail-vw0-f54.google.com [209.85.212.54]) by mx1.freebsd.org (Postfix) with ESMTP id CF4178FC12 for ; Thu, 31 Mar 2011 22:07:33 +0000 (UTC) Received: by vws18 with SMTP id 18so2789148vws.13 for ; Thu, 31 Mar 2011 15:07:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:date:message-id:subject:from:to :content-type; bh=X3NZg870jVIK00CQIVryZAvdvT5pp8+r9Iwl7Bam+uo=; b=dnJet5R31Z1EAwGfGJLtJ8W09XjT3mVyFEn7pmcKx9xRf2Ss0iGehxukOfTeXe59RB 1ZU6hd8jQTKCK5jZljBqjptkNtR5sun76+xRKo9L5xsXQVzPjH1aD2IYU0KWe+/wjxGu bHqKyA5mjhXGBFC8F87Ami1gt6miHpqG9Nh+s= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=x9h+YFNsYX7rthY5qQn7BV797PE0fPr15QczPc9H4D3RQ9OxyzC/zyoURcnXVXRp4c 3uGvX4q+woRnSAagHy+7i7jw8av2bgAkyZ37P5bRbThkWt9A1ynixUyNwnlRCABzymzV sgD+tyTMI0PeusrjpA+GkMhaNI+su6Hkv19ps= MIME-Version: 1.0 Received: by 10.52.0.205 with SMTP id 13mr4150116vdg.58.1301607950566; Thu, 31 Mar 2011 14:45:50 -0700 (PDT) Received: by 10.52.164.132 with HTTP; Thu, 31 Mar 2011 14:45:50 -0700 (PDT) Date: Thu, 31 Mar 2011 17:45:50 -0400 Message-ID: From: Joe Schaefer To: freebsd-net@freebsd.org Content-Type: text/plain; charset=UTF-8 Subject: any restrictions on nmbclusters
vs nmbjumbop X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Mar 2011 22:07:34 -0000

I have the following config in boot/loader.conf

kern.ipc.nmbclusters="65536"
kern.ipc.nmbjumbop="65536"

and having just tried running a host with that config the host stopped responding to commands (not even login worked) and I had to power cycle it.

My situation is that I have a need for a large nmbjumbop setting but the nmbclusters size (according to netstat -m) can remain small. Is this possible or do I need to bump the nmbclusters to 128K in order to get nmbjumbop where I want (64K)?

From owner-freebsd-net@FreeBSD.ORG Thu Mar 31 22:15:34 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8E352106566B for ; Thu, 31 Mar 2011 22:15:34 +0000 (UTC) (envelope-from lacombar@gmail.com) Received: from mail-iy0-f182.google.com (mail-iy0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id 507188FC14 for ; Thu, 31 Mar 2011 22:15:34 +0000 (UTC) Received: by iyj12 with SMTP id 12so3626994iyj.13 for ; Thu, 31 Mar 2011 15:15:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=lAihy+FND423WbXNeCSc3rTmXYoaGl6ulSoCTzlQEdI=; b=bf0W3xeXV6Ez23iqBrbRX64DIG28F9GZQM0JY6ygguNVBf8lJElum11InE7eZthzGD Gmpx7OLZ4p0boREbnvpmF67xbfLKrwmGGQ+hRRXysfjuy1FtC4z9dQGSeKGM1p2JvGhX ghPYBgsuCBsK8CG9tIb7kJ8NbOGNFV+71NX4c= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding;
b=jwchjP68tkbVx78ytOezHyBhv8nUCL6wuV6rSDv5sm1uKeJ0qj0NWgoa0INM48nEzT f4ys/TEc100fv1ELvg8x3b+aUkDyX8+VfjM353FhPLPzj0ol0MKPcthVyXhJ4uir9GBC CCwDMEGWMCDd3ZHjszI0gcTu9FHme5S/c3Z1I= MIME-Version: 1.0 Received: by 10.43.63.72 with SMTP id xd8mr3720059icb.215.1301609733002; Thu, 31 Mar 2011 15:15:33 -0700 (PDT) Received: by 10.42.146.72 with HTTP; Thu, 31 Mar 2011 15:15:32 -0700 (PDT) In-Reply-To: References: Date: Thu, 31 Mar 2011 18:15:32 -0400 Message-ID: From: Arnaud Lacombe To: Jack Vogel Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable Cc: freebsd-net@freebsd.org Subject: Re: em(4) hang [Was: Re: igb(4) won't start with "igb0: Could not setup receive structures"] X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Mar 2011 22:15:34 -0000

Hi,

On Thu, Mar 31, 2011 at 5:57 PM, Jack Vogel wrote:
> So, what is the evidence that the driver is stuck here?
>
About 800 pps (mostly SYN) present on the wire but never seen on em0, plus a couple of ARP replies, which also never hit em0, plus the `missed_packets' count increasing by the same 800 pps over the last hour. Is that enough?

 - Arnaud

PS: I forgot to add that the MAC addresses on the wire are fine.

From owner-freebsd-net@FreeBSD.ORG Thu Mar 31 22:28:48 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C01BF1065674 for ; Thu, 31 Mar 2011 22:28:48 +0000 (UTC) (envelope-from jfvogel@gmail.com) Received: from mail-vw0-f54.google.com (mail-vw0-f54.google.com [209.85.212.54]) by mx1.freebsd.org (Postfix) with ESMTP id 6EC3D8FC0A for ; Thu, 31 Mar 2011 22:28:48 +0000 (UTC) Received: by vws18 with SMTP id 18so2803018vws.13 for ; Thu, 31 Mar 2011 15:28:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=5GGNiMgfgA5eo7oR2J4NNSPchSMeo/dC7DcpsTPH+qI=; b=j5LumbBzoxQUleUFwZoa5SqqQA2n+VSza3sqzRQPUgY6/TLqcLsc1LmMIsMzdS12dO Fqj4sHl/MrWUoIr++sy1rPBhNe449/CHBxSzb1vpsd2vMeVwyhJQPp55zDrLvhJoCEYe m9PW26Tv3+CNDVgrlajyNgZvQKFzoQAWXIycE= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=k9VTAKu99G4pbs3C2ltuRMIToh9DwvQqn/wS4LjM6Nt66aJZ4EXP1ifqlQQn3sDRTu bGr6VkENGo502Eds+qjNzGHzolvSHsUoSKvEW7D1DdgHfKQV3sUw+oAUDYoK15h2qdax ZEee/tuiC279baktX3lxqqnqWbyg5nsut8T2o= MIME-Version: 1.0 Received: by 10.52.93.177 with SMTP id cv17mr4449560vdb.133.1301610526837; Thu, 31 Mar 2011 15:28:46 -0700 (PDT) Received: by 10.52.167.6 with HTTP; Thu, 31 Mar 2011 15:28:46 -0700 (PDT) In-Reply-To: References: Date: Thu, 31 Mar 2011 15:28:46 -0700 Message-ID: From: Jack Vogel To: Arnaud Lacombe Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-net@freebsd.org Subject: Re: em(4) hang [Was: Re: igb(4) won't start with "igb0: Could not
setup receive structures"] X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Mar 2011 22:28:48 -0000

OK, but those are not present in this data; that was what I was asking.

So, you have a hang for which we do not have a certain cause. What does netstat -m show?

Jack

From owner-freebsd-net@FreeBSD.ORG Thu Mar 31 23:06:45 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9AC981065672 for ; Thu, 31 Mar 2011 23:06:45 +0000 (UTC) (envelope-from lacombar@gmail.com) Received: from mail-iy0-f182.google.com (mail-iy0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id 5B7F48FC17 for ; Thu, 31 Mar 2011 23:06:45 +0000 (UTC) Received: by iyj12 with SMTP id 12so3676797iyj.13 for ; Thu, 31 Mar 2011 16:06:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=s8/rb+4nPT7KCKnIeFHMMoXHQnR75ezwEKmzBJKd3Vk=; b=q3OZXVlgN9jTfU2cGaN1ptmc2q/InSh+IxqoBLI1QvYRyJDdwjs7XsVp4zfinMQB1/ lRD7K7HXbyRLoPl0q9eqBDM/qbInK4N6j2Uf7IOaqPs+aqMTxjA2UJv5VKAZq5Pb3sJz BavE0xusdlgtHHW6ATp74+8NjJStREN+APToI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=F2wvMwkFOnTj30tJd4ZT2+lkkBoOy5qpIsm7gEDiR0bmAtB030xf5iDhvE9lCNDhlD UxLOsMtJFo3yz/5dfWhEsjGsk56avpPJQiRO82SEgMoKODWIaij1yLEKZk6RJvuCnzqN 9y3czLP0mcqbQPyEdW0Ya40opRwZTo5OXffWI= MIME-Version: 1.0 Received: by 10.43.63.72 with SMTP id xd8mr3778009icb.215.1301612804526; Thu, 31 Mar 2011 16:06:44 -0700 (PDT) Received: by 10.42.146.72 with HTTP; Thu, 31 Mar 2011 16:06:44 -0700 (PDT) In-Reply-To: References: Date: Thu, 31 Mar 2011 19:06:44 -0400 Message-ID: From: Arnaud Lacombe To: Jack Vogel Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable Cc: freebsd-net@freebsd.org Subject: Re: em(4) hang [Was: Re: igb(4) won't
start with "igb0: Could not setup receive structures"] X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Mar 2011 23:06:45 -0000 Hi, On Thu, Mar 31, 2011 at 6:28 PM, Jack Vogel wrote: > OK, but those are not something present in this data, that was what I'm > asking. > > So, you have a hang for which we do not have a certain cause.=A0 What doe= s > netstat -m show? > # netstat -m 3073/74927/78000 mbufs in use (current/cache/total) 3070/29698/32768/32768 mbuf clusters in use (current/cache/total/max) 0/383 mbuf+clusters out of packet secondary zone in use (current/cache) 0/12800/12800/12800 4k (page size) jumbo clusters in use (current/cache/total/max) 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max) 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max) 6908K/129327K/136236K bytes allocated to network (current/cache/total) 0/1080/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters) 0/0/0 requests for jumbo clusters denied (4k/9k/16k) 0/7/6656 sfbufs in use (current/peak/max) 0 requests for sfbufs denied 0 requests for sfbufs delayed 0 requests for I/O initiated by sendfile 0 calls to protocol drain routines Note that the mbuf allocation denial did not appended at once. It has been progressively increasing by block of ~200 over the 5h of uptime of the machine, until the current condition occurred. I have previously been trying to simulate the depletion and the hang, but the driver recovered. I assume the condition is met in em_local_timer() to refresh the ring, I'd still need to check that. - Arnaud > Jack > > > On Thu, Mar 31, 2011 at 3:15 PM, Arnaud Lacombe wrot= e: >> >> Hi, >> >> On Thu, Mar 31, 2011 at 5:57 PM, Jack Vogel wrote: >> > So, what is the evidence that the driver is stuck here? 
>> > >> About 800 pps (mostly SYN) present wire but never ever seen on em0, >> plus a couple of ARP reply, which still never hit em0, plus the >> `missed_packets' count increasing by the same 800 pps in the last >> hour. Is that enough ? >> >> =A0- Arnaud >> >> ps: I forgot to add that MAC address on the wire are fine. >> >> > I see that next_to_check !=3D next_to_refresh, which is why the >> > local timer won't schedule anything. OH, and I also realized there >> > is a problem with local_timer anyway, it will run rxeof, but that won'= t >> > help >> > if you can't enter the loop, so I need to add some code at the top to >> > call em_refresh_mbufs() when in this state. >> > >> > On this interrupt cause that you are focused upon, although its there = in >> > the >> > design, I had talked with some of our most seasoned developers on both >> > the Windows and Linux side of the house, and NO one has ever used this >> > 'feature', because (and I'm quoting here) "there's no good use case fo= r >> > it". >> > Meaning, there's always some simpler way of handling the issue. >> > >> > When you use MSIX you can't read causes btw, if you configured it, it >> > would >> > mean you'd just get into the regular RX handler, same as always, so wh= y >> > some special bother with this cause? >> > >> > On non-MSIX hardware there is just no particular reason to worry about >> > the >> > cause either, we can just handle the RX situation in the interrupt >> > handler. >> > >> > Jack >> > >> > >> > On Thu, Mar 31, 2011 at 2:09 PM, Arnaud Lacombe >> > wrote: >> >> >> >> Hi Jack, >> >> >> >> On Thu, Mar 31, 2011 at 9:51 AM, Arnaud Lacombe >> >> wrote: >> >> > [...] >> >> > I'll remove part of the changes I made to keep only >> >> > `rx_forced_refill' >> >> > and the associated sysctl, re-run the tests and come back with >> >> > correct >> >> > value, hopefully in a few hours. 
>> >> > >> >> Here it is: >> >> >> >> # sysctl dev.em.0.%desc >> >> dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.2 >> >> >> >> # sysctl dev.em.0.mac_stats.missed_packets >> >> dev.em.0.mac_stats.missed_packets: 917428 >> >> >> >> # sysctl dev.em.0.debug=3D1 >> >> dev.em.0.debug: I-1nterface is RUNNING and INACTIVE >> >> em0: hw tdh =3D 975, hw tdt =3D 975 >> >> em0: hw rdh =3D 884, hw rdt =3D 885 >> >> em0: Tx Queue Status =3D 0 >> >> em0: TX descriptors avail =3D 1024 >> >> em0: Tx Descriptors avail failure =3D 0 >> >> em0: RX discarded packets =3D 0 >> >> em0: RX Next to Check =3D 884 >> >> em0: RX Next to Refresh =3D 885 >> >> =A0-> -1 >> >> >> >> So the taskqueue cannot be scheduled to run and the driver is stuck. >> >> >> >> > On Wed, Mar 30, 2011 at 2:22 PM, Jack Vogel >> >> > wrote: >> >> >> Read the code in HEAD, em_local_timer() has a test of ALL the rx >> >> >> queues >> >> >> and >> >> >> will schedule a task that refreshes mbufs if they are empty. This >> >> >> has >> >> >> exactly the >> >> >> same effect as checking for some interrupt cause, a cause that is >> >> >> not >> >> >> available >> >> >> when using MSIX on 82574, but this approach works for everything. >> >> >> >> >> Can you please point me to a reference datasheet (or errata), provide= d >> >> by Intel, about the RX Overrun interrupt not being available with >> >> MSI-X on the 82574 ? >> >> >> >> Currently, I only have access to [0], which precises the following: >> >> >> >> 7.4 Interrupts >> >> 7.4.2 MSI-X Mode >> >> [...] >> >> The following configuration and parameters are involved: >> >> =95 The IVAR.INT_Alloc[4:0] entries map two Tx queues, two Rx queues = and >> >> other >> >> events to 5 interrupt vectors >> >> =95 The ICR[24:20] bits reflect specific interrupt causes >> >> =95 Five MSI-X interrupt vectors are provided (calculated based on fo= ur >> >> vectors for >> >> queues and one vector for other causes). 
>> >> The requested number of vectors is
>> >> loaded from the MSI_X_N fields in the EEPROM into the PCIe MSI-X capability
>> >> structure of the function.
>> >>
>> >> 10.2.4.1 Interrupt Cause Read Register - ICR (0x000C0; RC/WC)
>> >> [...]
>> >>
>> >> about bit 24:
>> >>
>> >> Other Interrupt. Indicates one of the following interrupts was set:
>> >> • Link Status Change.
>> >> • Receiver Overrun.
>> >> • MDIO Access Complete.
>> >> • Small Receive Packet Detected.
>> >> • Receive ACK Frame Detected.
>> >> • Manageability Event Detected.
>> >>
>> >> Thanks in advance,
>> >>  - Arnaud
>> >>
>> >> [0]: ftp://download.intel.com/design/network/datashts/82574.pdf

From owner-freebsd-net@FreeBSD.ORG Thu Mar 31 23:38:08 2011
Date: Thu, 31 Mar 2011 16:38:07 -0700
From: Jack Vogel
To: Arnaud Lacombe
Cc: freebsd-net@freebsd.org
Subject: Re: em(4) hang [Was: Re: igb(4) won't start with "igb0: Could not setup receive structures"]

My validation group has some kind of hang... it happens when they use a
certain number of clients, each running a stress test against the SUT. It's
like this: no real handle on what's wrong. If I knew what was wrong, it would
be halfway or more to fixing it :)

The evidence shows you hit the max clusters at one point but have freed most
of them back up again; there is no shortage right at this point. Your
previous data showed a normal idle head/tail relationship....

Just as a data point, will you please disable MSI-X, recompile, and run in
MSI mode? I just want to see if that makes a difference. Search in the driver
for em_enable_msix and set it to FALSE.

Jack

On Thu, Mar 31, 2011 at 4:06 PM, Arnaud Lacombe wrote:
> Hi,
>
> On Thu, Mar 31, 2011 at 6:28 PM, Jack Vogel wrote:
> > OK, but those are not something present in this data, that was what I'm
> > asking.
> >
> > So, you have a hang for which we do not have a certain cause. What does
> > netstat -m show?
> >
> # netstat -m
> 3073/74927/78000 mbufs in use (current/cache/total)
> 3070/29698/32768/32768 mbuf clusters in use (current/cache/total/max)
> 0/383 mbuf+clusters out of packet secondary zone in use (current/cache)
> 0/12800/12800/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
> 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
> 6908K/129327K/136236K bytes allocated to network (current/cache/total)
> 0/1080/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> 0/7/6656 sfbufs in use (current/peak/max)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
> 0 calls to protocol drain routines
>
> Note that the mbuf allocation denials did not happen all at once. The count
> has been progressively increasing in blocks of ~200 over the 5h of uptime
> of the machine, until the current condition occurred.
>
> I have previously been trying to simulate the depletion and the hang,
> but the driver recovered. I assume the condition is met in
> em_local_timer() to refresh the ring, I'd still need to check that.
>
> - Arnaud
>
> [...]

From owner-freebsd-net@FreeBSD.ORG Thu Mar 31 23:40:18 2011
Date: Fri, 1 Apr 2011 01:40:17 +0200
From: Stefan `Sec` Zehl
To: John Baldwin
Cc: freebsd-net@freebsd.org
Message-ID: <20110331234017.GC3308@ice.42.org>
References: <4D8B99B4.4070404@FreeBSD.org> <201103281423.52202.jhb@freebsd.org> <20110328183810.GF23803@ice.42.org> <201103300838.09608.jhb@freebsd.org>
In-Reply-To: <201103300838.09608.jhb@freebsd.org>
Subject: Re: The tale of a TCP bug

On Wed, Mar 30, 2011 at 08:38 -0400, John Baldwin wrote:
> There is at least one case I know of related to a bug I reported earlier
> where a window probe from a remote connection can cause rcv_nxt to advance
> past rcv_adv by one.
> However, I think we want to know about those cases,
> and we should probably be treating rcv_adv - rcv_nxt as if it is zero in
> that case, not -1 (my patch in my original e-mail does just that in a
> different place in tcp_output() when we calculate the window "for real").

I've been running for about a day now with the committed patch and
adv_neg is still zero:

| ice:~>uptime; sysctl net.inet.tcp.adv_neg
| 1:36AM up 1 day, 4:52, 1 user, load averages: 0.12, 0.06, 0.05
| net.inet.tcp.adv_neg: 0

I'll of course monitor this value and report back if I ever see it
increase :-)

CU,
    Sec
--
Diplomacy is the ability to tell a person to go to hell in such a nice
way that he or she looks forward to the trip.

From owner-freebsd-net@FreeBSD.ORG Fri Apr 1 00:28:04 2011
Date: Thu, 31 Mar 2011 17:28:02 -0700
From: Jack Vogel
To: Arnaud Lacombe
Cc: freebsd-net@freebsd.org
Subject: Re: em(4) hang [Was: Re: igb(4) won't start with "igb0: Could not setup receive structures"]

You know what, Arnaud, I've looked at the numbers again, and I suddenly saw
that next_to_check and next_to_refresh are NOT in a good state. Exactly the
opposite: check is BEHIND refresh, which means the whole ring is empty. The
HEAD (next_to_check) is pointing at 929, but next_to_refresh is at 930, RIGHT
IN FRONT of it, so the whole ring is depleted!!

What this means is that just a test of check == refresh is not going to be
good enough to protect against all cases, so let me think about how to handle
this...

Jack

On Thu, Mar 31, 2011 at 4:38 PM, Jack Vogel wrote:

> My validation group has some kind of hang... happens when they use a certain number
> of clients each running a stress test to the SUT, its like this, no real handle on what's
> wrong, if I knew what was wrong it would be half way or more to fixing it :)
>
> The evidence shows you have hit the max clusters at one point, but have freed most
> of them back up again, there is no shortage right at this point. Your previous data
> showed a normal idle head/tail relationship....
> [...]

From owner-freebsd-net@FreeBSD.ORG Fri Apr 1 01:16:09 2011
Date: Thu, 31 Mar 2011 18:16:08 -0700
From: Jack Vogel
To: Arnaud Lacombe
Cc: freebsd-net@freebsd.org
Subject: Re: em(4) hang [Was: Re: igb(4) won't start with "igb0: Could not setup receive structures"]

I know how I'm going to handle this and am formulating code for it. I should
have something that can be tested tomorrow; time to head out for the night.

Essentially, rather than just looking for equality, I will calculate the
number of unrefreshed mbufs given the check/refresh values, and then call
refresh when anything is unrefreshed. This will happen in rxeof, but I will
also put back the rx interrupt trigger into the local timer. I'm pretty sure
this will be bulletproof, at least for this kind of hang.

Jack

On Thu, Mar 31, 2011 at 5:28 PM, Jack Vogel wrote:

> You know what Arnaud, I've looked at the numbers again, and I suddenly saw
> that next_to_check and next_to_refresh are NOT in a good state, exactly the
> opposite, check is BEHIND refresh, which means the whole ring is empty, the
> HEAD (next_to_check) is pointing at 929, but next_to_refresh is at 930, RIGHT
> IN FRONT of it, so the whole ring is depleted!!
>
> What this means is that just a test of check == refresh is not going to be good
> enough to protect against all cases, so let me think about how to handle
> this...
>
> Jack
>
>
> On Thu, Mar 31, 2011 at 4:38 PM, Jack Vogel wrote:
>
>> My validation group has some kind of hang... happens when they use a certain number
>> of clients each running a stress test to the SUT, its like this, no real handle on what's
>> wrong, if I knew what was wrong it would be half way or more to fixing it :)
>>
>> The evidence shows you have hit the max clusters at one point, but have freed most
>> of them back up again, there is no shortage right at this point.
Your >> previous data >> showed a normal idle head/tail relationship.... >> >> Just as a data point, will you please disable msix, recompile and run in >> MSI mode, >> I just want to see if that makes a difference. Search in the driver for >> em_enable_msix >> and set it FALSE. >> >> Jack >> >> >> >> On Thu, Mar 31, 2011 at 4:06 PM, Arnaud Lacombe wrot= e: >> >>> Hi, >>> >>> On Thu, Mar 31, 2011 at 6:28 PM, Jack Vogel wrote: >>> > OK, but those are not something present in this data, that was what I= 'm >>> > asking. >>> > >>> > So, you have a hang for which we do not have a certain cause. What >>> does >>> > netstat -m show? >>> > >>> # netstat -m >>> 3073/74927/78000 mbufs in use (current/cache/total) >>> 3070/29698/32768/32768 mbuf clusters in use (current/cache/total/max) >>> 0/383 mbuf+clusters out of packet secondary zone in use (current/cache) >>> 0/12800/12800/12800 4k (page size) jumbo clusters in use >>> (current/cache/total/max) >>> 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max) >>> 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max) >>> 6908K/129327K/136236K bytes allocated to network (current/cache/total) >>> 0/1080/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters) >>> 0/0/0 requests for jumbo clusters denied (4k/9k/16k) >>> 0/7/6656 sfbufs in use (current/peak/max) >>> 0 requests for sfbufs denied >>> 0 requests for sfbufs delayed >>> 0 requests for I/O initiated by sendfile >>> 0 calls to protocol drain routines >>> >>> Note that the mbuf allocation denial did not appended at once. It has >>> been progressively increasing by block of ~200 over the 5h of uptime >>> of the machine, until the current condition occurred. >>> >>> I have previously been trying to simulate the depletion and the hang, >>> but the driver recovered. I assume the condition is met in >>> em_local_timer() to refresh the ring, I'd still need to check that. 
>>> >>> - Arnaud >>> >>> > Jack >>> > >>> > >>> > On Thu, Mar 31, 2011 at 3:15 PM, Arnaud Lacombe >>> wrote: >>> >> >>> >> Hi, >>> >> >>> >> On Thu, Mar 31, 2011 at 5:57 PM, Jack Vogel >>> wrote: >>> >> > So, what is the evidence that the driver is stuck here? >>> >> > >>> >> About 800 pps (mostly SYN) present wire but never ever seen on em0, >>> >> plus a couple of ARP reply, which still never hit em0, plus the >>> >> `missed_packets' count increasing by the same 800 pps in the last >>> >> hour. Is that enough ? >>> >> >>> >> - Arnaud >>> >> >>> >> ps: I forgot to add that MAC address on the wire are fine. >>> >> >>> >> > I see that next_to_check !=3D next_to_refresh, which is why the >>> >> > local timer won't schedule anything. OH, and I also realized there >>> >> > is a problem with local_timer anyway, it will run rxeof, but that >>> won't >>> >> > help >>> >> > if you can't enter the loop, so I need to add some code at the top >>> to >>> >> > call em_refresh_mbufs() when in this state. >>> >> > >>> >> > On this interrupt cause that you are focused upon, although its >>> there in >>> >> > the >>> >> > design, I had talked with some of our most seasoned developers on >>> both >>> >> > the Windows and Linux side of the house, and NO one has ever used >>> this >>> >> > 'feature', because (and I'm quoting here) "there's no good use cas= e >>> for >>> >> > it". >>> >> > Meaning, there's always some simpler way of handling the issue. >>> >> > >>> >> > When you use MSIX you can't read causes btw, if you configured it, >>> it >>> >> > would >>> >> > mean you'd just get into the regular RX handler, same as always, s= o >>> why >>> >> > some special bother with this cause? >>> >> > >>> >> > On non-MSIX hardware there is just no particular reason to worry >>> about >>> >> > the >>> >> > cause either, we can just handle the RX situation in the interrupt >>> >> > handler. 
>>> >> > >>> >> > Jack >>> >> > >>> >> > >>> >> > On Thu, Mar 31, 2011 at 2:09 PM, Arnaud Lacombe >> > >>> >> > wrote: >>> >> >> >>> >> >> Hi Jack, >>> >> >> >>> >> >> On Thu, Mar 31, 2011 at 9:51 AM, Arnaud Lacombe < >>> lacombar@gmail.com> >>> >> >> wrote: >>> >> >> > [...] >>> >> >> > I'll remove part of the changes I made to keep only >>> >> >> > `rx_forced_refill' >>> >> >> > and the associated sysctl, re-run the tests and come back with >>> >> >> > correct >>> >> >> > value, hopefully in a few hours. >>> >> >> > >>> >> >> Here it is: >>> >> >> >>> >> >> # sysctl dev.em.0.%desc >>> >> >> dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.2 >>> >> >> >>> >> >> # sysctl dev.em.0.mac_stats.missed_packets >>> >> >> dev.em.0.mac_stats.missed_packets: 917428 >>> >> >> >>> >> >> # sysctl dev.em.0.debug=3D1 >>> >> >> dev.em.0.debug: I-1nterface is RUNNING and INACTIVE >>> >> >> em0: hw tdh =3D 975, hw tdt =3D 975 >>> >> >> em0: hw rdh =3D 884, hw rdt =3D 885 >>> >> >> em0: Tx Queue Status =3D 0 >>> >> >> em0: TX descriptors avail =3D 1024 >>> >> >> em0: Tx Descriptors avail failure =3D 0 >>> >> >> em0: RX discarded packets =3D 0 >>> >> >> em0: RX Next to Check =3D 884 >>> >> >> em0: RX Next to Refresh =3D 885 >>> >> >> -> -1 >>> >> >> >>> >> >> So the taskqueue cannot be scheduled to run and the driver is >>> stuck. >>> >> >> >>> >> >> > On Wed, Mar 30, 2011 at 2:22 PM, Jack Vogel >>> >> >> > wrote: >>> >> >> >> Read the code in HEAD, em_local_timer() has a test of ALL the = rx >>> >> >> >> queues >>> >> >> >> and >>> >> >> >> will schedule a task that refreshes mbufs if they are empty. >>> This >>> >> >> >> has >>> >> >> >> exactly the >>> >> >> >> same effect as checking for some interrupt cause, a cause that >>> is >>> >> >> >> not >>> >> >> >> available >>> >> >> >> when using MSIX on 82574, but this approach works for >>> everything. 
>>> >> >> >>
>>> >> >> Can you please point me to a reference datasheet (or errata), provided
>>> >> >> by Intel, about the RX Overrun interrupt not being available with
>>> >> >> MSI-X on the 82574 ?
>>> >> >>
>>> >> >> Currently, I only have access to [0], which precises the following:
>>> >> >>
>>> >> >> 7.4 Interrupts
>>> >> >> 7.4.2 MSI-X Mode
>>> >> >> [...]
>>> >> >> The following configuration and parameters are involved:
>>> >> >> • The IVAR.INT_Alloc[4:0] entries map two Tx queues, two Rx queues and other events to 5 interrupt vectors
>>> >> >> • The ICR[24:20] bits reflect specific interrupt causes
>>> >> >> • Five MSI-X interrupt vectors are provided (calculated based on four vectors for queues and one vector for other causes). The requested number of vectors is loaded from the MSI_X_N fields in the EEPROM into the PCIe MSI-X capability structure of the function.
>>> >> >>
>>> >> >> 10.2.4.1 Interrupt Cause Read Register - ICR (0x000C0; RC/WC)
>>> >> >> [...]
>>> >> >>
>>> >> >> about bit 24:
>>> >> >>
>>> >> >> Other Interrupt. Indicates one of the following interrupts was set:
>>> >> >> • Link Status Change.
>>> >> >> • Receiver Overrun.
>>> >> >> • MDIO Access Complete.
>>> >> >> • Small Receive Packet Detected.
>>> >> >> • Receive ACK Frame Detected.
>>> >> >> • Manageability Event Detected.
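As a rough illustration of the register layout quoted above, the following C sketch pulls the MSI-X cause bits out of a raw ICR value. The macro and function names are made up for this example and are not taken from the em(4) driver or from Intel's headers; only the bit positions (24:20, with bit 24 as "Other Interrupt") come from the datasheet excerpt. What bits 23:20 individually mean is not stated in the excerpt, so the sketch does not interpret them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names only; bit positions follow the quoted excerpt. */
#define ICR_MSIX_SHIFT	20u	/* lowest bit of the ICR[24:20] field */
#define ICR_MSIX_MASK	0x1Fu	/* five bits wide */
#define ICR_OTHER_BIT	24u	/* "Other Interrupt" per the excerpt */

/* Return the 5-bit ICR[24:20] field, right-aligned. */
static uint32_t
icr_msix_causes(uint32_t icr)
{
	return (icr >> ICR_MSIX_SHIFT) & ICR_MSIX_MASK;
}

/* True when the "Other Interrupt" bit (bit 24) is set. */
static bool
icr_other_cause(uint32_t icr)
{
	return ((icr >> ICR_OTHER_BIT) & 1u) != 0;
}
```

Note that, per the excerpt, bit 24 only says that one of the listed "other" events fired; telling Receiver Overrun apart from, say, Link Status Change would need further register bits that are not quoted here.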
>>> >> >> >>> >> >> Thanks in advance, >>> >> >> - Arnaud >>> >> >> >>> >> >> [0]: ftp://download.intel.com/design/network/datashts/82574.pdf >>> >> > >>> >> > >>> > >>> > >>> >> >> > From owner-freebsd-net@FreeBSD.ORG Fri Apr 1 12:43:48 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EEC8A1065703 for ; Fri, 1 Apr 2011 12:43:48 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id C75928FC14 for ; Fri, 1 Apr 2011 12:43:48 +0000 (UTC) Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id 79E0346BA4; Fri, 1 Apr 2011 08:43:48 -0400 (EDT) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 104988A01B; Fri, 1 Apr 2011 08:43:48 -0400 (EDT) From: John Baldwin To: "Stefan `Sec` Zehl" Date: Fri, 1 Apr 2011 08:32:49 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110325; KDE/4.5.5; amd64; ; ) References: <4D8B99B4.4070404@FreeBSD.org> <201103300838.09608.jhb@freebsd.org> <20110331234017.GC3308@ice.42.org> In-Reply-To: <20110331234017.GC3308@ice.42.org> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201104010832.49214.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.6 (bigwig.baldwin.cx); Fri, 01 Apr 2011 08:43:48 -0400 (EDT) Cc: freebsd-net@freebsd.org Subject: Re: The tale of a TCP bug X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Apr 2011 12:43:49 -0000 On Thursday, March 31, 2011 7:40:17 pm Stefan `Sec` Zehl wrote: > On Wed, Mar 30, 2011 at 
08:38 -0400, John Baldwin wrote: > > There is at least one case I know of related to a bug I reported earlier > > where a window probe from a remote connection can cause rcv_nxt to advance > > past rcv_adv by one. However, I think we want to know about those cases, > > and we should probably be treating rcv_adv - rcv_nxt as if it is zero in > > that case, not -1 (my patch in my original e-mail does just that in a > > different place in tcp_output() when we calculate the window "for real"). > > I've been running for about a day now with the committed patch and > adv_neg is still zero: Well, after thinking some more, rcv_nxt == rcv_adv + 1 will not make adv negative. > | ice:~>uptime; sysctl net.inet.tcp.adv_neg > | 1:36AM up 1 day, 4:52, 1 user, load averages: 0.12, 0.06, 0.05 > | net.inet.tcp.adv_neg: 0 > > I'll of course monitor this value and report back if I ever see it > increase :-) Great, thanks! -- John Baldwin From owner-freebsd-net@FreeBSD.ORG Fri Apr 1 14:03:43 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 075941065670 for ; Fri, 1 Apr 2011 14:03:43 +0000 (UTC) (envelope-from ulsanrub@gmail.com) Received: from mail-iy0-f182.google.com (mail-iy0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id BFC988FC0A for ; Fri, 1 Apr 2011 14:03:42 +0000 (UTC) Received: by iyj12 with SMTP id 12so4470890iyj.13 for ; Fri, 01 Apr 2011 07:03:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=M8bKFWig5KPy5W8kP7DL3LLXVTYDGMgHgJmjk1zO/cs=; b=WFDjqCwA+boCOWk54BElkAJE91HwtTlWImudzZmvKQ6U3dFlBUyjFdftuOiVBfpcGS w7IfnCmDFm+g6zryH0JZUTn9AlMrMOncWvazuQ4c0LH1u9U+gmngymzIG2Xz9kvbfa02 7QJjNvBu4108WVFS//IJrAQ6rViLJ1XFhQrxA= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; 
h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=kirP8b6dM+IYJa5EeIyI3GX9wGLE7IAkP3IEe/NftUOytaG177lDcoYlu+FmSRna/v qCpJeiA+sou8onR04jRA9TMp93WQwVe8lp2foc8+oTDs0gkD/HQCy/XysIL2N1Z+iQIG s9/Gi1VB+evOfhKmxUd014EIHqxZuTpz1h3tI= MIME-Version: 1.0 Received: by 10.43.65.132 with SMTP id xm4mr5451001icb.424.1301666621723; Fri, 01 Apr 2011 07:03:41 -0700 (PDT) Received: by 10.42.240.71 with HTTP; Fri, 1 Apr 2011 07:03:41 -0700 (PDT) In-Reply-To: <4D945B55.6080600@freebsd.org> References: <4D945B55.6080600@freebsd.org> Date: Fri, 1 Apr 2011 10:03:41 -0400 Message-ID: From: Kyungsoo Lee To: Julian Elischer Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Michael Proto , freebsd-net Subject: Re: UDP on FreeBSD X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Apr 2011 14:03:43 -0000 Thank you for your responses. :) On Thu, Mar 31, 2011 at 6:45 AM, Julian Elischer wrote: > On 3/30/11 2:32 PM, Michael Proto wrote: > >> On Wed, Mar 30, 2011 at 3:43 PM, Kyungsoo Lee wrote: >> >>> Hi All, >>> >>> I want to check UDP on FreeBSD. >>> >>> I am using IPERF on FreeBSD for wireless testing with Proxim 8470 FC >>> PCMCIA >>> card on IBM T42 and T61. >>> >>> When I'm transmitting data from FreeBSD to FreeBSD or CentOS using Iperf >>> with -u -b 100M on iperf, they had lost lots of packets. Sniffer near the >>> two nodes shows the sender could not send all packets. Iperf sender said >>> that they try to send 85469 packets but they lost 68824 packets. I think >>> that the UDP buffer on the sender could not handle all packets. 
>>> >>> But if I'm trying to send data from CentOS to FreeBSD using Iperf with -u >>> -b >>> 100M option on iperf, the sender tries 18636 packets so they lost few >>> packets like 1 or 2 packets.As a result, they have similar bandwidth >>> result >>> on the report. I think that it happens from different implement between >>> FreeBSD and Linux. >>> >>> But I want to double check that this is normal for FreeBSD or not. If I >>> have >>> some missing points, let me know please. >>> >>> Thank you! >>> _______________________________________________ >>> freebsd-net@freebsd.org mailing list >>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >>> >>> Just a guess, but have you tried adjusting the net.inet.udp.maxdgram >> sysctl? I believe the default is somewhat low for UDP transmit. I >> don't know what size packets iperf is using but increasing the >> maxdgram value might help your testing. >> > > this is many years out of date but a decade or so ago freebsd would return > ENOBUFS > and linux would block when the outgoing queues filled up. > the answer then was that teh programs are all written for Linux and didn't > check for ENOBUFS > but that may be out of date now in many different ways. 
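To make the ENOBUFS point concrete, here is a minimal C sketch of a UDP sender that treats a full outgoing queue as a transient condition and retries after a short pause, instead of counting the datagram as sent. The helper names, the retry limit, and the 1 ms backoff are illustrative assumptions; this is not how iperf is implemented, and current FreeBSD or Linux behaviour may differ from what is described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <time.h>

/* Hypothetical policy: which sendto() errors are worth retrying. */
static bool
send_error_is_transient(int err)
{
	return err == ENOBUFS || err == EINTR;
}

/*
 * Send one datagram, retrying up to max_retries times on transient
 * errors. Returns the number of bytes sent, or -1 with errno set by
 * the last sendto() call.
 */
static ssize_t
sendto_retry(int fd, const void *buf, size_t len,
    const struct sockaddr *dst, socklen_t dstlen, int max_retries)
{
	const struct timespec pause = { 0, 1000000L };	/* 1 ms, arbitrary */

	assert(max_retries >= 0);
	for (int attempt = 0; ; attempt++) {
		ssize_t n = sendto(fd, buf, len, 0, dst, dstlen);
		if (n >= 0)
			return n;
		if (!send_error_is_transient(errno) || attempt >= max_retries)
			return -1;
		nanosleep(&pause, NULL);	/* give the queue time to drain */
	}
}
```

A sender that ignores the return value of `sendto()` would report every datagram as transmitted even when the stack refused some of them, which may be one reason the sender-side counts in such tests look much higher than what a sniffer sees on the wire.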
> >> >> -Proto >> >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> >> > From owner-freebsd-net@FreeBSD.ORG Fri Apr 1 14:36:19 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 147691065670 for ; Fri, 1 Apr 2011 14:36:19 +0000 (UTC) (envelope-from free@isafeelin.org) Received: from progress.isafeelin.org (progress.isafeelin.org [80.69.81.6]) by mx1.freebsd.org (Postfix) with ESMTP id CCE5A8FC13 for ; Fri, 1 Apr 2011 14:36:18 +0000 (UTC) Received: from progress.isafeelin.org (localhost [127.0.0.1]) by progress.isafeelin.org (Postfix) with ESMTP id E7953131182 for ; Fri, 1 Apr 2011 16:16:55 +0200 (CEST) Received: from s5375723c.adsl.wanadoo.nl (s5375723c.adsl.wanadoo.nl [83.117.114.60]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by progress.isafeelin.org (Postfix) with ESMTPS id BDFD813117A for ; Fri, 1 Apr 2011 16:16:55 +0200 (CEST) Received: by s5375723c.adsl.wanadoo.nl (Postfix, from userid 1002) id 603D828428; Fri, 1 Apr 2011 16:16:55 +0200 (CEST) Date: Fri, 1 Apr 2011 16:16:55 +0200 From: Frederique Rijsdijk To: freebsd-net@freebsd.org Message-ID: <20110401141655.GA5350@deta.isafeelin.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: ClamAV using ClamSMTP Subject: Network stack unstable after arp flapping X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Apr 2011 14:36:19 -0000 Hi, We (hosting provider) are in the process of implementing ipv6 in our network 
(yay). Yesterday one of the final steps in configuring and updating our core routers were taken, which did not go entirely as planned. As a result, the default gateway mac addresses for all our machines changed about 800 times in a time span of about 4 minutes. Here's a small piece of the logging: Mar 31 18:36:12 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:9f:f0:3d to 00:00:0c:07:ac:3d on bge0 Mar 31 18:36:12 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:07:ac:3d to 00:00:0c:9f:f0:3d on bge0 Mar 31 18:36:13 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:9f:f0:3d to 00:00:0c:07:ac:3d on bge0 Mar 31 18:36:14 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:07:ac:3d to 00:00:0c:9f:f0:3d on bge0 Mar 31 18:36:14 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:9f:f0:3d to 00:00:0c:07:ac:3d on bge0 Mar 31 18:36:14 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:07:ac:3d to 00:00:0c:9f:f0:3d on bge0 Mar 31 18:36:15 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:9f:f0:3d to 00:00:0c:07:ac:3d on bge0 The x.x.x.1 is always the same IP, the gateway of the machine. The result of that, is that loads of FreeBSD machines (6.x, 7.x and 8.x) developed serious network issues, mainly being no or slow traffic between other (FreeBSD) machine accross different VLAN's in our own network. First thing that comes to mind is the network itself, but all Linux machines (Ubuntu, Red Hat and CentOS) had no issues at all. Only BSD. An arp -ad on both machines where problems occured, didn't solve anything. What worked better was /etc/rc.d/netif restart and a /etc/rc.d/routing restart. Some machines even had to be rebooted in order to get networking back to normal. This almost sounds like a bug in the network stack in BSD, but I can not imagine that I'm right. The BSD networking stack is considered to be one of the best.. Any ideas anyone? 
-- Frederique From owner-freebsd-net@FreeBSD.ORG Fri Apr 1 14:50:30 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 286DD1065674 for ; Fri, 1 Apr 2011 14:50:30 +0000 (UTC) (envelope-from korvus@comcast.net) Received: from qmta09.westchester.pa.mail.comcast.net (qmta09.westchester.pa.mail.comcast.net [76.96.62.96]) by mx1.freebsd.org (Postfix) with ESMTP id C8E7D8FC0A for ; Fri, 1 Apr 2011 14:50:29 +0000 (UTC) Received: from omta10.westchester.pa.mail.comcast.net ([76.96.62.28]) by qmta09.westchester.pa.mail.comcast.net with comcast id SEmZ1g0020cZkys59EqW02; Fri, 01 Apr 2011 14:50:30 +0000 Received: from [192.168.2.164] ([206.210.89.202]) by omta10.westchester.pa.mail.comcast.net with comcast id SEqK1g00q4Mx3R23WEqMnN; Fri, 01 Apr 2011 14:50:28 +0000 Message-ID: <4D95E62A.5000109@comcast.net> Date: Fri, 01 Apr 2011 10:50:18 -0400 From: Steve Polyack User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.15) Gecko/20110316 Lightning/1.0b2 Thunderbird/3.1.9 MIME-Version: 1.0 To: Frederique Rijsdijk References: <20110401141655.GA5350@deta.isafeelin.org> In-Reply-To: <20110401141655.GA5350@deta.isafeelin.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org Subject: Re: Network stack unstable after arp flapping X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Apr 2011 14:50:30 -0000 On 04/01/11 10:16, Frederique Rijsdijk wrote: > Hi, > > We (hosting provider) are in the process of implementing ipv6 in our network (yay). Yesterday one of the final steps in configuring and updating our core routers were taken, which did not go entirely as planned. 
As a result, the default gateway mac addresses for all our machines changed about 800 times in a time span of about 4 minutes. > > Here's a small piece of the logging: > > Mar 31 18:36:12 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:9f:f0:3d to 00:00:0c:07:ac:3d on bge0 > Mar 31 18:36:12 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:07:ac:3d to 00:00:0c:9f:f0:3d on bge0 > Mar 31 18:36:13 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:9f:f0:3d to 00:00:0c:07:ac:3d on bge0 > Mar 31 18:36:14 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:07:ac:3d to 00:00:0c:9f:f0:3d on bge0 > Mar 31 18:36:14 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:9f:f0:3d to 00:00:0c:07:ac:3d on bge0 > Mar 31 18:36:14 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:07:ac:3d to 00:00:0c:9f:f0:3d on bge0 > Mar 31 18:36:15 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:9f:f0:3d to 00:00:0c:07:ac:3d on bge0 > > The x.x.x.1 is always the same IP, the gateway of the machine. > > The result of that, is that loads of FreeBSD machines (6.x, 7.x and 8.x) developed serious network issues, mainly being no or slow traffic between other (FreeBSD) machine accross different VLAN's in our own network. > > First thing that comes to mind is the network itself, but all Linux machines (Ubuntu, Red Hat and CentOS) had no issues at all. Only BSD. > > An arp -ad on both machines where problems occured, didn't solve anything. What worked better was /etc/rc.d/netif restart and a /etc/rc.d/routing restart. Some machines even had to be rebooted in order to get networking back to normal. > > This almost sounds like a bug in the network stack in BSD, but I can not imagine that I'm right. The BSD networking stack is considered to be one of the best.. > > Any ideas anyone? We experienced a similar issue here, but IIRC only on our 8.x systems (we don't have any 7.x). Disabling flowtable cleared everything up immediately. You can try that and see if it helps. 
It seems like the flowtable caches and associates the next-hop router MAC address with each flow, and unfortunately this doesn't get purged when the kernel senses and logs an ARP change. The only other solution I've seen was to stop all network traffic on the machine until the flows/cache entries expired. http://www.freebsd.org/cgi/query-pr.cgi?pr=155604 has more details of my run-in with this. The title should be corrected though, as I found shortly after that all traffic is affected. - Steve From owner-freebsd-net@FreeBSD.ORG Fri Apr 1 16:00:37 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 458D0106566B for ; Fri, 1 Apr 2011 16:00:37 +0000 (UTC) (envelope-from jamesbrandongooch@gmail.com) Received: from mail-wy0-f182.google.com (mail-wy0-f182.google.com [74.125.82.182]) by mx1.freebsd.org (Postfix) with ESMTP id CCF228FC08 for ; Fri, 1 Apr 2011 16:00:36 +0000 (UTC) Received: by mail-wy0-f182.google.com with SMTP id 23so3656151wyf.13 for ; Fri, 01 Apr 2011 09:00:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=knllz0id8Rtd2dedthDUgIWpy9OkL3PYNA0aZLYhffs=; b=UdE2WTlRIfhr5u/DxjeXKzmjCKO+j0e//sdBFSJ6X+qd9nKSX00GOhyeSDV8NwQiRx TyoP79mCoKGGZKJUTWtbChtdDT0i4knoy+2hJ7zB3Vz2gfmvTCUyxZpKKLw6drzjYE8s GGiYoGmcn5kP3Ly+oauB7qIQShM0cFEBiE8zE= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=eTcrMoEwjWNp4Od/2gpsmqCXYiYSiLXCJBleNUsRJXYiHK/BZmBAPJz0nAOlJqRKB+ KDxkKRQNSXMAyGfdY18FwV4VbNVnKaaHDPtSolobDs+ZhjmSaAydooBnVPzCafHOnqlQ 256JVm9ybn3gJIKcgBZlMSxpqk56DKGNAkewQ= MIME-Version: 1.0 Received: by 10.216.144.223 with SMTP id 
n73mr4125630wej.37.1301673636421; Fri, 01 Apr 2011 09:00:36 -0700 (PDT)
Received: by 10.216.0.205 with HTTP; Fri, 1 Apr 2011 09:00:36 -0700 (PDT)
In-Reply-To: <4D95E62A.5000109@comcast.net>
References: <20110401141655.GA5350@deta.isafeelin.org> <4D95E62A.5000109@comcast.net>
Date: Fri, 1 Apr 2011 11:00:36 -0500
Message-ID:
From: Brandon Gooch
To: Steve Polyack
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-net@freebsd.org, Frederique Rijsdijk
Subject: Re: Network stack unstable after arp flapping
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 01 Apr 2011 16:00:37 -0000

On Fri, Apr 1, 2011 at 9:50 AM, Steve Polyack wrote:
> On 04/01/11 10:16, Frederique Rijsdijk wrote:
>>
>> Hi,
>>
>> We (hosting provider) are in the process of implementing ipv6 in our
>> network (yay). Yesterday one of the final steps in configuring and updating
>> our core routers were taken, which did not go entirely as planned. As a
>> result, the default gateway mac addresses for all our machines changed about
>> 800 times in a time span of about 4 minutes.
>>
>> Here's a small piece of the logging:
>>
>> Mar 31 18:36:12 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:9f:f0:3d to 00:00:0c:07:ac:3d on bge0
>> Mar 31 18:36:12 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:07:ac:3d to 00:00:0c:9f:f0:3d on bge0
>> Mar 31 18:36:13 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:9f:f0:3d to 00:00:0c:07:ac:3d on bge0
>> Mar 31 18:36:14 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:07:ac:3d to 00:00:0c:9f:f0:3d on bge0
>> Mar 31 18:36:14 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:9f:f0:3d to 00:00:0c:07:ac:3d on bge0
>> Mar 31 18:36:14 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:07:ac:3d to 00:00:0c:9f:f0:3d on bge0
>> Mar 31 18:36:15 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:9f:f0:3d to 00:00:0c:07:ac:3d on bge0
>>
>> The x.x.x.1 is always the same IP, the gateway of the machine.
>>
>> The result of that, is that loads of FreeBSD machines (6.x, 7.x and 8.x)
>> developed serious network issues, mainly being no or slow traffic between
>> other (FreeBSD) machine accross different VLAN's in our own network.
>>
>> First thing that comes to mind is the network itself, but all Linux
>> machines (Ubuntu, Red Hat and CentOS) had no issues at all. Only BSD.
>>
>> An arp -ad on both machines where problems occured, didn't solve anything.
>> What worked better was /etc/rc.d/netif restart and a /etc/rc.d/routing
>> restart. Some machines even had to be rebooted in order to get networking
>> back to normal.
>>
>> This almost sounds like a bug in the network stack in BSD, but I can not
>> imagine that I'm right. The BSD networking stack is considered to be one of
>> the best..
>>
>> Any ideas anyone?
>
> We experienced a similar issue here, but IIRC only on our 8.x systems (we
> don't have any 7.x). Disabling flowtable cleared everything up immediately.
> You can try that and see if it helps.
> It seems like the flowtable caches
> and associates the next-hop router MAC address with each flow, and
> unfortunately this doesn't get purged when the kernel senses and logs an ARP
> change. The only other solution I've seen was to stop all network traffic
> on the machine until the flows/cache entries expired.
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=155604 has more details of my
> run-in with this. The title should be corrected though, as I found shortly
> after that all traffic is affected.
>
> - Steve

FYI, the FLOWTABLE option has been removed from the DEFAULT kernel
config on HEAD, a change which will be MFC'd in a couple of days to
8-STABLE...

-Brandon

From owner-freebsd-net@FreeBSD.ORG Fri Apr 1 17:38:18 2011
Return-Path:
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6CDA1106566C for ; Fri, 1 Apr 2011 17:38:18 +0000 (UTC) (envelope-from kickbsd@yandex.ru)
Received: from forward10.mail.yandex.net (forward10.mail.yandex.net [77.88.61.49]) by mx1.freebsd.org (Postfix) with ESMTP id 8B7A48FC14 for ; Fri, 1 Apr 2011 17:38:17 +0000 (UTC)
Received: from web100.yandex.ru (web100.yandex.ru [77.88.61.1]) by forward10.mail.yandex.net (Yandex) with ESMTP id B96A51021529 for ; Fri, 1 Apr 2011 21:27:11 +0400 (MSD)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1301678831; bh=VQgOG8ul4UzH0RW43nVzMJ2PVwaFU5pCb69C2t9sZq8=; h=From:To:Subject:MIME-Version:Message-Id:Date: Content-Transfer-Encoding:Content-Type; b=o61gJGlwaFyal+tW/eQwJCkZeJCKAZvEuTsKglnEecm9hJBaCWK6B6EL4n9fm85xe ZKyrwhHlkU2TYo19jCP+Y92+ADA/q+AfDWYTFTJaRF1VGAINLPv2nS0Jf8JaeGrjC7 jZ+r9aV/k3e8GH2mvt6s8tlxHoyM3Z3B1GFSfDts=
Received: from localhost (localhost.localdomain [127.0.0.1]) by web100.yandex.ru (Yandex) with ESMTP id AE7A6FC8039 for ; Fri, 1 Apr 2011 21:27:11 +0400 (MSD)
Received: from leo.de.teleglobe.net (leo.de.teleglobe.net
[64.86.53.146]) by mail.yandex.ru with HTTP; Fri, 01 Apr 2011 21:27:10 +0400 From: Baginski Darren To: freebsd-net@freebsd.org MIME-Version: 1.0 Message-Id: <1128701301678831@web100.yandex.ru> Date: Fri, 01 Apr 2011 21:27:10 +0400 X-Mailer: Yamail [ http://yandex.ru ] 5.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain Subject: Multiple gateways support X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Apr 2011 17:38:18 -0000 Hi! Could please someone tell me about current state of multiple gw capabilities of FreeBSD? I have dual homed FreeBSD box, one interface ISP1 another ISP2 : 1) can I balance outgoing traffic across them ? 2) Is there support of any kind dead gateway detection? 3) Can I install multiple routes to the same network (with same and with different wight)? If yes how it behaves with one link failure, in particular if interface is down? Thank you! 
From owner-freebsd-net@FreeBSD.ORG Fri Apr 1 18:55:13 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CC3E4106566B; Fri, 1 Apr 2011 18:55:13 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id 77F4F8FC1D; Fri, 1 Apr 2011 18:55:13 +0000 (UTC) Received: by iwn33 with SMTP id 33so4743738iwn.13 for ; Fri, 01 Apr 2011 11:55:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:from:date:to:cc:subject:message-id:reply-to :references:mime-version:content-type:content-disposition :in-reply-to:user-agent; bh=828DKWF859IXhZ8CUJEw4JNHM04vw/qs6u4KboclsKQ=; b=iXb1BcZK9rmcvPtl0ocUTuT3K0rFXy7JqozJJn6HE69MyHx0Sw96ekNrQcit18XSVm 1WPlA1Nt8PxHcnGs4hCcmISc9EP7urbfzQourhrWVJP8MP3pAJeOo+S/vT9D6uGnn2pI K8owhCV+hFc6/tzCvOeAQkAQxe+c2aFdibhCk= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:date:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; b=YqAwUKfNxNEvdBozRga2itMytYHIYp333pqcHZUtJhhhPRSEtuRFKpLRMpPyuAuNW+ 8eaqfBrUFMdBEhb5XZwUItpEet60LhCeL7jGW70uZte1IuRn25avBtrXe6bJO0myEpdf ICQMTMla7iIT0hihTk0Y9e1fZPdcTdbJvuAwk= Received: by 10.43.52.193 with SMTP id vn1mr6058171icb.460.1301684112699; Fri, 01 Apr 2011 11:55:12 -0700 (PDT) Received: from pyunyh@gmail.com ([174.35.1.224]) by mx.google.com with ESMTPS id i20sm1625908iby.14.2011.04.01.11.55.08 (version=TLSv1/SSLv3 cipher=OTHER); Fri, 01 Apr 2011 11:55:10 -0700 (PDT) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Fri, 01 Apr 2011 11:53:58 -0700 From: YongHyeon PYUN Date: Fri, 1 Apr 2011 11:53:57 -0700 To: Yamagi Burmeister Message-ID: <20110401185357.GA15910@michelle.cdnetworks.com> References: <20110330173145.GB8601@michelle.cdnetworks.com> 
<20110330202858.GC8601@michelle.cdnetworks.com> <20110331171302.GA11981@michelle.cdnetworks.com> <20110331181651.GB11981@michelle.cdnetworks.com> <20110331183054.GC11981@michelle.cdnetworks.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i Cc: freebsd-net@freebsd.org, yongari@freebsd.org Subject: Re: Kernel memory corruption(?) with age(4) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Apr 2011 18:55:13 -0000 On Thu, Mar 31, 2011 at 09:59:12PM +0200, Yamagi Burmeister wrote: > On Thu, 31 Mar 2011, YongHyeon PYUN wrote: > > >>Thanks a lot! It seems the L1 controller has data corruption issue > >>when 64bit DMA addressing is used. Try this one. > > > >Oops, there was a bug in previous patch. > >Try this instead. > > Okay, that patch seems to do the trick. This was just a short test run > of about one hour with just 50gb copied, but without the patch the > system would have crashed in the first 20 minutes. I'll do a more > comprehensive test over night and report back tomorrow morning. > Fix committed to HEAD(r220249, r220252). Thanks a lot for testing! 
From owner-freebsd-net@FreeBSD.ORG Fri Apr 1 18:55:15 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8E6261065672 for ; Fri, 1 Apr 2011 18:55:15 +0000 (UTC) (envelope-from jfvogel@gmail.com) Received: from mail-vx0-f182.google.com (mail-vx0-f182.google.com [209.85.220.182]) by mx1.freebsd.org (Postfix) with ESMTP id 33CD48FC0C for ; Fri, 1 Apr 2011 18:55:14 +0000 (UTC) Received: by vxc34 with SMTP id 34so3644778vxc.13 for ; Fri, 01 Apr 2011 11:55:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=WSAkZQ1xiwA+BrvcssdJ5R6LdWhS8scrP2z3QttUNxo=; b=BMiAl+ui+aQs2hVsgPbFmgz81M9HFELoUvPbP/a6bDbQ7coOW0lhgD90QNovs8VaZq 25IHxplYsscxWVgXsPNt6hisEPuYCKgfJYWo/BzE4ATU1ISFbd4SjtvX0m1ifxwl9qhe diLSBwI504pH4yaVZTHoTdCmqZvioZM392uJE= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=H1H6q9ssXh4sqZiRqPZbAhiJrHoEItkL9PFdZ2H9AeK8CLh+YpgjpNprGndgCS8mof J2v1JFWn1T+yrmegLlP9xn9z+ZExxVGh05f9kV4ilEdDOi243/g6io/awiFWmbjvZHYy cgrkIHJfVlz4vkkSONFdFOhALJzk5ygv9m5jw= MIME-Version: 1.0 Received: by 10.52.94.48 with SMTP id cz16mr347414vdb.173.1301684114417; Fri, 01 Apr 2011 11:55:14 -0700 (PDT) Received: by 10.52.167.6 with HTTP; Fri, 1 Apr 2011 11:55:14 -0700 (PDT) In-Reply-To: References: Date: Fri, 1 Apr 2011 11:55:14 -0700 Message-ID: From: Jack Vogel To: Arnaud Lacombe Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-net@freebsd.org Subject: Re: em(4) hang [Was: Re: igb(4) won't start with "igb0: Could not setup receive structures"] X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: 
Networking and TCP/IP with FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 01 Apr 2011 18:55:15 -0000

Arnaud,

Please try the code change I just checked into HEAD, it should finally resolve
any hang that is due to mbufs not being refreshed. That's not to say there may
not be other reasons out there but I'm keeping my fingers crossed that this is
behind at least some of the hangs.

Jack


On Thu, Mar 31, 2011 at 6:16 PM, Jack Vogel wrote:

> I know how I'm going to handle this, am formulating code for it, should have a
> something that can be tested tomorrow, time to head out for the night..
>
> Essentially, rather than just looking for equality, I will calculate the number
> of unrefreshed mbufs given the check/refresh values, and then call refresh
> when anything is unrefreshed. This will happen in rxeof, but I will also put
> back the rx interrupt trigger into local timer. I'm pretty sure this will be
> bullet proof, at least for this kind of hang.
>
> Jack
>
>
> On Thu, Mar 31, 2011 at 5:28 PM, Jack Vogel wrote:
>
>> You know what Arnaud, I've looked at the numbers again, and I suddenly saw
>> that next_to_check and next_to_refresh are NOT in a good state, exactly the
>> opposite, check is BEHIND refresh, which means the whole ring is empty, the
>> HEAD (next_to_check) is pointing at 929, but next_to_refresh is at 930, RIGHT
>> IN FRONT of it, so the whole ring is depleted!!
>>
>> What this means is that just a test of check == refresh is not going to be good
>> enough to protect against all cases, so let me think about how to handle
>> this...
>>
>> Jack
>>
>>
>>
>> On Thu, Mar 31, 2011 at 4:38 PM, Jack Vogel wrote:
>>
>>> My validation group has some kind of hang...
>>> happens when they use a certain number
>>> of clients each running a stress test to the SUT, its like this, no real handle on what's
>>> wrong, if I knew what was wrong it would be half way or more to fixing it :)
>>>
>>> The evidence shows you have hit the max clusters at one point, but have freed most
>>> of them back up again, there is no shortage right at this point. Your previous data
>>> showed a normal idle head/tail relationship....
>>>
>>> Just as a data point, will you please disable msix, recompile and run in MSI mode,
>>> I just want to see if that makes a difference. Search in the driver for em_enable_msix
>>> and set it FALSE.
>>>
>>> Jack
>>>
>>>
>>>
>>> On Thu, Mar 31, 2011 at 4:06 PM, Arnaud Lacombe wrote:
>>>
>>>> Hi,
>>>>
>>>> On Thu, Mar 31, 2011 at 6:28 PM, Jack Vogel wrote:
>>>> > OK, but those are not something present in this data, that was what I'm
>>>> > asking.
>>>> >
>>>> > So, you have a hang for which we do not have a certain cause. What does
>>>> > netstat -m show?
>>>> >
>>>> # netstat -m
>>>> 3073/74927/78000 mbufs in use (current/cache/total)
>>>> 3070/29698/32768/32768 mbuf clusters in use (current/cache/total/max)
>>>> 0/383 mbuf+clusters out of packet secondary zone in use (current/cache)
>>>> 0/12800/12800/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
>>>> 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
>>>> 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
>>>> 6908K/129327K/136236K bytes allocated to network (current/cache/total)
>>>> 0/1080/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
>>>> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
>>>> 0/7/6656 sfbufs in use (current/peak/max)
>>>> 0 requests for sfbufs denied
>>>> 0 requests for sfbufs delayed
>>>> 0 requests for I/O initiated by sendfile
>>>> 0 calls to protocol drain routines
>>>>
>>>> Note that the mbuf allocation denial did not appended at once. It has
>>>> been progressively increasing by block of ~200 over the 5h of uptime
>>>> of the machine, until the current condition occurred.
>>>>
>>>> I have previously been trying to simulate the depletion and the hang,
>>>> but the driver recovered. I assume the condition is met in
>>>> em_local_timer() to refresh the ring, I'd still need to check that.
>>>>
>>>> - Arnaud
>>>>
>>>> > Jack
>>>> >
>>>> >
>>>> > On Thu, Mar 31, 2011 at 3:15 PM, Arnaud Lacombe
>>>> wrote:
>>>> >>
>>>> >> Hi,
>>>> >>
>>>> >> On Thu, Mar 31, 2011 at 5:57 PM, Jack Vogel
>>>> wrote:
>>>> >> > So, what is the evidence that the driver is stuck here?
>>>> >> >
>>>> >> About 800 pps (mostly SYN) present wire but never ever seen on em0,
>>>> >> plus a couple of ARP reply, which still never hit em0, plus the
>>>> >> `missed_packets' count increasing by the same 800 pps in the last
>>>> >> hour. Is that enough ?
>>>> >>
>>>> >> - Arnaud
>>>> >>
>>>> >> ps: I forgot to add that MAC address on the wire are fine.
>>>> >>
>>>> >> > I see that next_to_check != next_to_refresh, which is why the
>>>> >> > local timer won't schedule anything. OH, and I also realized there
>>>> >> > is a problem with local_timer anyway, it will run rxeof, but that
>>>> won't
>>>> >> > help
>>>> >> > if you can't enter the loop, so I need to add some code at the top
>>>> to
>>>> >> > call em_refresh_mbufs() when in this state.
>>>> >> >
>>>> >> > On this interrupt cause that you are focused upon, although its
>>>> there in
>>>> >> > the
>>>> >> > design, I had talked with some of our most seasoned developers on
>>>> both
>>>> >> > the Windows and Linux side of the house, and NO one has ever used
>>>> this
>>>> >> > 'feature', because (and I'm quoting here) "there's no good use case
>>>> for
>>>> >> > it".
>>>> >> > Meaning, there's always some simpler way of handling the issue.
>>>> >> >
>>>> >> > When you use MSIX you can't read causes btw, if you configured it,
>>>> it
>>>> >> > would
>>>> >> > mean you'd just get into the regular RX handler, same as always, so
>>>> why
>>>> >> > some special bother with this cause?
>>>> >> >
>>>> >> > On non-MSIX hardware there is just no particular reason to worry
>>>> about
>>>> >> > the
>>>> >> > cause either, we can just handle the RX situation in the interrupt
>>>> >> > handler.
>>>> >> >
>>>> >> > Jack
>>>> >> >
>>>> >> >
>>>> >> > On Thu, Mar 31, 2011 at 2:09 PM, Arnaud Lacombe <
>>>> lacombar@gmail.com>
>>>> >> > wrote:
>>>> >> >>
>>>> >> >> Hi Jack,
>>>> >> >>
>>>> >> >> On Thu, Mar 31, 2011 at 9:51 AM, Arnaud Lacombe <
>>>> lacombar@gmail.com>
>>>> >> >> wrote:
>>>> >> >> > [...]
>>>> >> >> > I'll remove part of the changes I made to keep only
>>>> >> >> > `rx_forced_refill'
>>>> >> >> > and the associated sysctl, re-run the tests and come back with
>>>> >> >> > correct
>>>> >> >> > value, hopefully in a few hours.
>>>> >> >> >
>>>> >> >> Here it is:
>>>> >> >>
>>>> >> >> # sysctl dev.em.0.%desc
>>>> >> >> dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.2
>>>> >> >>
>>>> >> >> # sysctl dev.em.0.mac_stats.missed_packets
>>>> >> >> dev.em.0.mac_stats.missed_packets: 917428
>>>> >> >>
>>>> >> >> # sysctl dev.em.0.debug=1
>>>> >> >> dev.em.0.debug: I-1nterface is RUNNING and INACTIVE
>>>> >> >> em0: hw tdh = 975, hw tdt = 975
>>>> >> >> em0: hw rdh = 884, hw rdt = 885
>>>> >> >> em0: Tx Queue Status = 0
>>>> >> >> em0: TX descriptors avail = 1024
>>>> >> >> em0: Tx Descriptors avail failure = 0
>>>> >> >> em0: RX discarded packets = 0
>>>> >> >> em0: RX Next to Check = 884
>>>> >> >> em0: RX Next to Refresh = 885
>>>> >> >> -> -1
>>>> >> >>
>>>> >> >> So the taskqueue cannot be scheduled to run and the driver is
>>>> stuck.
>>>> >> >>
>>>> >> >> > On Wed, Mar 30, 2011 at 2:22 PM, Jack Vogel
>>>> >> >> > wrote:
>>>> >> >> >> Read the code in HEAD, em_local_timer() has a test of ALL the
>>>> rx
>>>> >> >> >> queues
>>>> >> >> >> and
>>>> >> >> >> will schedule a task that refreshes mbufs if they are empty.
>>>> This
>>>> >> >> >> has
>>>> >> >> >> exactly the
>>>> >> >> >> same effect as checking for some interrupt cause, a cause that
>>>> is
>>>> >> >> >> not
>>>> >> >> >> available
>>>> >> >> >> when using MSIX on 82574, but this approach works for
>>>> everything.
>>>> >> >> >>
>>>> >> >> Can you please point me to a reference datasheet (or errata),
>>>> provided
>>>> >> >> by Intel, about the RX Overrun interrupt not being available with
>>>> >> >> MSI-X on the 82574 ?
>>>> >> >>
>>>> >> >> Currently, I only have access to [0], which precises the
>>>> following:
>>>> >> >>
>>>> >> >> 7.4 Interrupts
>>>> >> >> 7.4.2 MSI-X Mode
>>>> >> >> [...]
>>>> >> >> The following configuration and parameters are involved:
>>>> >> >> • The IVAR.INT_Alloc[4:0] entries map two Tx queues, two Rx queues
>>>> and
>>>> >> >> other
>>>> >> >> events to 5 interrupt vectors
>>>> >> >> • The ICR[24:20] bits reflect specific interrupt causes
>>>> >> >> • Five MSI-X interrupt vectors are provided (calculated based on
>>>> four
>>>> >> >> vectors for
>>>> >> >> queues and one vector for other causes). The requested number of
>>>> >> >> vectors
>>>> >> >> is
>>>> >> >> loaded from the MSI_X_N fields in the EEPROM into the PCIe MSI-X
>>>> >> >> capability
>>>> >> >> structure of the function.
>>>> >> >>
>>>> >> >> 10.2.4.1 Interrupt Cause Read Register - ICR (0x000C0; RC/WC)
>>>> >> >> [...]
>>>> >> >>
>>>> >> >> about bit 24:
>>>> >> >>
>>>> >> >> Other Interrupt. Indicates one of the following interrupts was
>>>> set:
>>>> >> >> • Link Status Change.
>>>> >> >> • Receiver Overrun.
>>>> >> >> • MDIO Access Complete.
>>>> >> >> • Small Receive Packet Detected.
>>>> >> >> • Receive ACK Frame Detected.
>>>> >> >> • Manageability Event Detected.
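Earlier in this thread Jack describes replacing the plain check == refresh test with a count of unrefreshed mbufs derived from the two ring indices. The following is only a rough sketch of that idea, not the code that was committed to HEAD. It assumes next_to_refresh is the first slot still lacking a buffer, next_to_check is the next slot the driver will examine, and the slots from next_to_refresh up to (but not including) next_to_check are the unrefreshed ones. The function names and the ring_size parameter are made up for illustration, and the real driver may use a different off-by-one convention.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (NOT the em(4) code from HEAD): count RX slots that
 * still need a fresh mbuf, treating both indices as positions on a
 * circular ring of ring_size descriptors.
 *
 * Assumption: slots in [next_to_refresh, next_to_check) are unrefreshed.
 */
uint32_t
rx_unrefreshed(uint32_t next_to_check, uint32_t next_to_refresh,
    uint32_t ring_size)
{
	assert(ring_size > 0);
	assert(next_to_check < ring_size && next_to_refresh < ring_size);

	/* Modular distance walking forward from refresh to check. */
	return ((next_to_check + ring_size - next_to_refresh) % ring_size);
}

/* Non-zero when a refresh pass is warranted under the assumption above. */
int
rx_needs_refresh(uint32_t next_to_check, uint32_t next_to_refresh,
    uint32_t ring_size)
{
	return (rx_unrefreshed(next_to_check, next_to_refresh, ring_size) != 0);
}
```

With the values from the debug output quoted above (check = 884, refresh = 885) and a ring size of 1024 assumed purely for illustration, the count comes out as 1023, which matches Jack's reading that nearly the whole ring is depleted even though the two indices are not equal.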
>>>> >> >> >>>> >> >> Thanks in advance, >>>> >> >> - Arnaud >>>> >> >> >>>> >> >> [0]: ftp://download.intel.com/design/network/datashts/82574.pdf >>>> >> > >>>> >> > >>>> > >>>> > >>>> >>> >>> >> > From owner-freebsd-net@FreeBSD.ORG Fri Apr 1 21:55:27 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D876E1065679; Fri, 1 Apr 2011 21:55:27 +0000 (UTC) (envelope-from freebsd@hub.org) Received: from hub.org (hub.org [200.46.204.220]) by mx1.freebsd.org (Postfix) with ESMTP id AA4E98FC19; Fri, 1 Apr 2011 21:55:27 +0000 (UTC) Received: from maia.hub.org (maia-5.hub.org [200.46.204.29]) by hub.org (Postfix) with ESMTP id F41493250A90; Fri, 1 Apr 2011 18:35:31 -0300 (ADT) Received: from hub.org ([200.46.204.220]) by maia.hub.org (mx1.hub.org [200.46.204.29]) (amavisd-maia, port 10024) with ESMTP id 73447-05; Fri, 1 Apr 2011 21:35:32 +0000 (UTC) Received: by hub.org (Postfix, from userid 1002) id BDC453250A8F; Fri, 1 Apr 2011 18:35:31 -0300 (ADT) Received: from localhost (localhost [127.0.0.1]) by hub.org (Postfix) with ESMTP id B6DAD3250A8D; Fri, 1 Apr 2011 18:35:31 -0300 (ADT) Date: Fri, 1 Apr 2011 18:35:31 -0300 (ADT) From: "Marc G. Fournier" X-X-Sender: scrappy@hub.org To: freebsd-net@freebsd.org, freebsd-questions@freebsd.org Message-ID: User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII Cc: Subject: nfs error: No route to host when starting apache ... X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Apr 2011 21:55:27 -0000 I just setup an nfs mount between two servers ... ServerA, nfsd on 192.168.1.8 ServerB, nfs client on 192.168.1.7 I have a jail, ServerC, running on 192.168.1.7 ... 
most operations appear to work, but it looks like 'special files' of a sort aren't working, for when I try and start up Apache, I get:

[Fri Apr 01 19:42:02 2011] [emerg] (65)No route to host: couldn't grab the accept mutex

When I try and do a 'newaliases', I get:

# newaliases
postalias: fatal: lock /etc/aliases.db: No route to host

Yet, for instance, both MySQL and PostgreSQL are running without any issues ...

So, the mount is there, it is readable, it is working ... I can ssh into the jail, I can create files, etc ...

I do have rpc.lockd and rpc.statd running on both client / server sides ...

I'm not seeing anything in either the man page for mount_nfs *or* nfsd that might account / correct for something like this, but since I'm not sure what "this" is exactly, not sure exactly what I should be looking for :(

Note that this behaviour happens at the *physical* server level as well, having tested with using postalias to generate the same 'lock' issue above ...

Now, I do have mountd/nfsd started with the -h to bind them to 192.168.1.8 ... *but*, the servers themselves, although on the same switch, do have different default gateways ... I'm not seeing anything within the man page for, say, rpc.statd/rpc.lockd that allows me to bind it to the 192.168.1.0/24 IP, so is it binding to my public IP instead of my private? So nfsd / mount_nfs can talk fine, as they go through 192.168.1.0/24 as desired, but rpc.statd/rpc.lockd are on the public IPs and not able to talk to each other?

Thx ...
From owner-freebsd-net@FreeBSD.ORG Fri Apr 1 23:25:28 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 363D3106564A; Fri, 1 Apr 2011 23:25:28 +0000 (UTC) (envelope-from scrappy@hub.org) Received: from hub.org (hub.org [200.46.204.220]) by mx1.freebsd.org (Postfix) with ESMTP id DC0BA8FC15; Fri, 1 Apr 2011 23:25:27 +0000 (UTC) Received: from maia.hub.org (maia-3.hub.org [200.46.204.243]) by hub.org (Postfix) with ESMTP id 826683250A90; Fri, 1 Apr 2011 20:07:25 -0300 (ADT) Received: from hub.org ([200.46.204.220]) by maia.hub.org (mx1.hub.org [200.46.204.243]) (amavisd-maia, port 10024) with ESMTP id 06097-10; Fri, 1 Apr 2011 23:07:17 +0000 (UTC) Received: by hub.org (Postfix, from userid 1002) id BC9C93250A8D; Fri, 1 Apr 2011 20:07:17 -0300 (ADT) Received: from localhost (localhost [127.0.0.1]) by hub.org (Postfix) with ESMTP id AFDF63250A8B; Fri, 1 Apr 2011 20:07:17 -0300 (ADT) Date: Fri, 1 Apr 2011 20:07:17 -0300 (ADT) From: "Marc G. Fournier" To: "Marc G. Fournier" In-Reply-To: Message-ID: References: User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-net@freebsd.org, freebsd-questions@freebsd.org Subject: Re: nfs error: No route to host when starting apache ... X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Apr 2011 23:25:28 -0000

I've succeeded in getting a bit further ... by the time I got to the bottom of my original, I started to think in terms of rpc more, and had overlooked looking at the rpcbind man page, which *does* have a -h option ... setting that fixes things perfectly *almost* ...
The last issue I seem to be hitting *might* be a 6.x NFS client against a 7.x server issue ... ? Postfix generates: postfix/showq[65261]: fatal: select lock: Permission denied The only post I found about this was: http://lists.freebsd.org/pipermail/freebsd-questions/2010-April/215284.html But there didn't appear to be any responses ... so either all responses were private to Robert, or ... ? This is my last 6.x box, so it is not overly critical, but would be nice if I could get it to work properly ... On Fri, 1 Apr 2011, Marc G. Fournier wrote: > > I just setup an nfs mount between two servers ... > > ServerA, nfsd on 192.168.1.8 > ServerB, nfs client on 192.168.1.7 > > I have a jail, ServerC, running on 192.168.1.7 ... most operations appear to > work, but it looks like 'special files' of a sort aren't working, for when I > try and startup Apache, I get: > > [Fri Apr 01 19:42:02 2011] [emerg] (65)No route to host: couldn't grab the > accept mutex > > When I try and do a 'newaliases', I get: > > # newaliases > postalias: fatal: lock /etc/aliases.db: No route to host > > Yet, for instance, both MySQL and PostgreSQL are running without any issues > ... > > So, the mount is there, it is readable, it is working ... I can ssh into the > jail, I can create files, etc ... > > I do have rpc.lockd and rpc.statd running on both client / server sides ... > > I'm not seeing anything in eithr the man page for mount_nfs *or* nfsd that > might account / corect for something like this, but since I'm not sure what > "this" is exactly, not sure exactl what I should be looking for :( > > Note that this behaviour happens at the *physical* server level as well, > having tested with using postalias to generate the same 'lock' issue above > ... > > Now, I do have mountd/nfsd started iwth the -h to bind them to 192.168.1.8 > ... *but*, the servers themselves, although on same switch do have different > default gateways ... 
I'm not seeing anything within the man page for, say, > rpc.statd/rpc.lockd that allows me to bind it to the 192.168.1.0/24 IP, so is > it binding to my public IP instead of my private? So nfsd / mount_nfs can > talk find, as they go thorugh 192.168.1.0/24 as desired, but > rpc.statd/rpc.lockd are the public IPs and not able to talk to each other? > > Thx ... > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > ---- Marc G. Fournier Hub.Org Hosting Solutions S.A. scrappy@hub.org http://www.hub.org Yahoo:yscrappy Skype: hub.org ICQ:7615664 MSN:scrappy@hub.org From owner-freebsd-net@FreeBSD.ORG Fri Apr 1 23:33:33 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CA0E1106564A; Fri, 1 Apr 2011 23:33:33 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 7207C8FC19; Fri, 1 Apr 2011 23:33:33 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApwEAHxZlk2DaFvO/2dsb2JhbACESKIRiHmnTJBagSiBaIFkdwSLeYEi X-IronPort-AV: E=Sophos;i="4.63,285,1299474000"; d="scan'208";a="116808781" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 01 Apr 2011 19:04:32 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 0051EB40B2; Fri, 1 Apr 2011 19:04:31 -0400 (EDT) Date: Fri, 1 Apr 2011 19:04:31 -0400 (EDT) From: Rick Macklem To: "Marc G. 
Fournier" Message-ID: <116776764.2605927.1301699071938.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - IE8 (Win)/6.0.10_GA_2692) Cc: freebsd-net@freebsd.org, freebsd-questions@freebsd.org Subject: Re: nfs error: No route to host when starting apache ... X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Apr 2011 23:33:33 -0000 > I just setup an nfs mount between two servers ... > > ServerA, nfsd on 192.168.1.8 > ServerB, nfs client on 192.168.1.7 > > I have a jail, ServerC, running on 192.168.1.7 ... most operations > appear > to work, but it looks like 'special files' of a sort aren't working, > for > when I try and startup Apache, I get: > > [Fri Apr 01 19:42:02 2011] [emerg] (65)No route to host: couldn't grab > the > accept mutex > > When I try and do a 'newaliases', I get: > > # newaliases > postalias: fatal: lock /etc/aliases.db: No route to host > > Yet, for instance, both MySQL and PostgreSQL are running without any > issues ... > > So, the mount is there, it is readable, it is working ... I can ssh > into > the jail, I can create files, etc ... > > I do have rpc.lockd and rpc.statd running on both client / server > sides > ... > Since rpc.lockd and rpc.statd expect to be able to do IP broadcast (same goes for rpcbind), I suspect that might be a problem w.r.t. jails, although I know nothing about how jails work? 
> I'm not seeing anything in eithr the man page for mount_nfs *or* nfsd > that > might account / corect for something like this, but since I'm not sure > what "this" is exactly, not sure exactl what I should be looking for > :( > > Note that this behaviour happens at the *physical* server level as > well, > having tested with using postalias to generate the same 'lock' issue > above > ... > > Now, I do have mountd/nfsd started iwth the -h to bind them to > 192.168.1.8 > ... *but*, the servers themselves, although on same switch do have > different default gateways ... I'm not seeing anything within the man > page > for, say, rpc.statd/rpc.lockd that allows me to bind it to the > 192.168.1.0/24 IP, so is it binding to my public IP instead of my > private? > So nfsd / mount_nfs can talk find, as they go thorugh 192.168.1.0/24 > as > desired, but rpc.statd/rpc.lockd are the public IPs and not able to > talk > to each other? > > Thx ... > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Fri Apr 1 23:42:41 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F02B11065719; Fri, 1 Apr 2011 23:42:41 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 9C9BE8FC13; Fri, 1 Apr 2011 23:42:41 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApwEALphlk2DaFvO/2dsb2JhbACESKIRiHmnOZBagSiBaIFkdwSLeYEi X-IronPort-AV: E=Sophos;i="4.63,285,1299474000"; d="scan'208";a="115858924" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-annu-pri.mail.uoguelph.ca with ESMTP; 01 Apr 2011 
19:42:40 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id A0145B3F29; Fri, 1 Apr 2011 19:42:40 -0400 (EDT) Date: Fri, 1 Apr 2011 19:42:40 -0400 (EDT) From: Rick Macklem To: "Marc G. Fournier" Message-ID: <326244177.2606708.1301701360593.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <116776764.2605927.1301699071938.JavaMail.root@erie.cs.uoguelph.ca> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - IE8 (Win)/6.0.10_GA_2692) Cc: freebsd-net@freebsd.org, freebsd-questions@freebsd.org Subject: Re: nfs error: No route to host when starting apache ... X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Apr 2011 23:42:42 -0000 > > I just setup an nfs mount between two servers ... > > > > ServerA, nfsd on 192.168.1.8 > > ServerB, nfs client on 192.168.1.7 > > > > I have a jail, ServerC, running on 192.168.1.7 ... most operations > > appear > > to work, but it looks like 'special files' of a sort aren't working, > > for > > when I try and startup Apache, I get: > > > > [Fri Apr 01 19:42:02 2011] [emerg] (65)No route to host: couldn't > > grab > > the > > accept mutex > > > > When I try and do a 'newaliases', I get: > > > > # newaliases > > postalias: fatal: lock /etc/aliases.db: No route to host > > > > Yet, for instance, both MySQL and PostgreSQL are running without any > > issues ... > > > > So, the mount is there, it is readable, it is working ... I can ssh > > into > > the jail, I can create files, etc ... > > > > I do have rpc.lockd and rpc.statd running on both client / server > > sides > > ... 
> > > Since rpc.lockd and rpc.statd expect to be able to do IP broadcast > (same goes for rpcbind), I suspect that might be a problem w.r.t. > jails, although I know nothing about how jails work? > Oh, and you can use the "nolock" mount option to avoid use of rpc.lockd and rpc.statd. > > I'm not seeing anything in eithr the man page for mount_nfs *or* > > nfsd > > that > > might account / corect for something like this, but since I'm not > > sure > > what "this" is exactly, not sure exactl what I should be looking for > > :( > > > > Note that this behaviour happens at the *physical* server level as > > well, > > having tested with using postalias to generate the same 'lock' issue > > above > > ... > > > > Now, I do have mountd/nfsd started iwth the -h to bind them to > > 192.168.1.8 > > ... *but*, the servers themselves, although on same switch do have > > different default gateways ... I'm not seeing anything within the > > man > > page > > for, say, rpc.statd/rpc.lockd that allows me to bind it to the > > 192.168.1.0/24 IP, so is it binding to my public IP instead of my > > private? > > So nfsd / mount_nfs can talk find, as they go thorugh 192.168.1.0/24 > > as > > desired, but rpc.statd/rpc.lockd are the public IPs and not able to > > talk > > to each other? > > > > Thx ... 
> > _______________________________________________ > > freebsd-net@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-net > > To unsubscribe, send any mail to > > "freebsd-net-unsubscribe@freebsd.org" > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Sat Apr 2 06:37:42 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 90E85106564A; Sat, 2 Apr 2011 06:37:42 +0000 (UTC) (envelope-from lists@yamagi.org) Received: from mail.yamagi.overkill.yamagi.org (unknown [IPv6:2a01:4f8:121:2102:1::7]) by mx1.freebsd.org (Postfix) with ESMTP id 293AC8FC13; Sat, 2 Apr 2011 06:37:42 +0000 (UTC) Received: from [2001:5c0:150f:8700:223:54ff:fe31:a012] (unknown [IPv6:2001:5c0:150f:8700:223:54ff:fe31:a012]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.yamagi.overkill.yamagi.org (Postfix) with ESMTPSA id F04D016663D1; Sat, 2 Apr 2011 08:37:36 +0200 (CEST) Date: Sat, 2 Apr 2011 08:37:31 +0200 (CEST) From: Yamagi Burmeister X-X-Sender: yamagi@saya.home.yamagi.org To: YongHyeon PYUN In-Reply-To: <20110401185357.GA15910@michelle.cdnetworks.com> Message-ID: References: <20110330173145.GB8601@michelle.cdnetworks.com> <20110330202858.GC8601@michelle.cdnetworks.com> <20110331171302.GA11981@michelle.cdnetworks.com> <20110331181651.GB11981@michelle.cdnetworks.com> <20110331183054.GC11981@michelle.cdnetworks.com> <20110401185357.GA15910@michelle.cdnetworks.com> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-net@freebsd.org, Yamagi Burmeister , yongari@freebsd.org Subject: Re: Kernel memory corruption(?) 
with age(4) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Apr 2011 06:37:42 -0000 On Fri, 1 Apr 2011, YongHyeon PYUN wrote: > On Thu, Mar 31, 2011 at 09:59:12PM +0200, Yamagi Burmeister wrote: >> On Thu, 31 Mar 2011, YongHyeon PYUN wrote: >> >>>> Thanks a lot! It seems the L1 controller has data corruption issue >>>> when 64bit DMA addressing is used. Try this one. >>> >>> Oops, there was a bug in previous patch. >>> Try this instead. >> >> Okay, that patch seems to do the trick. This was just a short test run >> of about one hour with just 50gb copied, but without the patch the >> system would have crashed in the first 20 minutes. I'll do a more >> comprehensive test over night and report back tomorrow morning. >> > > Fix committed to HEAD(r220249, r220252). > Thanks a lot for testing! No problem. -- Homepage: www.yamagi.org Jabber: yamagi@yamagi.org GnuPG/GPG: 0xEFBCCBCB From owner-freebsd-net@FreeBSD.ORG Sat Apr 2 11:58:26 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4B2B71065670; Sat, 2 Apr 2011 11:58:26 +0000 (UTC) (envelope-from sec@42.org) Received: from ice.42.org (v6.42.org [IPv6:2001:608:9::1]) by mx1.freebsd.org (Postfix) with ESMTP id 020438FC0C; Sat, 2 Apr 2011 11:58:26 +0000 (UTC) Received: by ice.42.org (Postfix, from userid 1000) id A44F12841F; Sat, 2 Apr 2011 13:58:23 +0200 (CEST) Date: Sat, 2 Apr 2011 13:58:23 +0200 From: Stefan `Sec` Zehl To: John Baldwin , freebsd-net@freebsd.org Message-ID: <20110402115823.GE37730@ice.42.org> Mail-Followup-To: John Baldwin , freebsd-net@freebsd.org References: <4D8B99B4.4070404@FreeBSD.org> <201103281423.52202.jhb@freebsd.org> <20110328183810.GF23803@ice.42.org> <201103300838.09608.jhb@freebsd.org> 
<20110331234017.GC3308@ice.42.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20110331234017.GC3308@ice.42.org> User-Agent: Mutt/1.4.2.3i I-love-doing-this: really X-Modeline: vim:set ts=8 sw=4 smarttab tw=72 si noic notitle: Accept-Languages: de, en X-URL: http://sec.42.org/ Cc: Subject: Re: The tale of a TCP bug X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Apr 2011 11:58:26 -0000

Hi

I'm back :)

On Fri, Apr 01, 2011 at 01:40 +0200, Stefan `Sec` Zehl wrote:
> I'll of course monitor this value and report back if I ever see it
> increase :-)

It did:

| ice:~>uptime
| 1:45PM up 2 days, 17:01, 0 users, load averages: 1.29, 0.98, 0.60
| ice:~>sysctl net.inet.tcp.adv_neg
| net.inet.tcp.adv_neg: 120
| ice:~>

I currently have no idea why. But I think it would be a good idea to fix that adv calculation on 64bit for the negative case anyway.

As my original attempt with a (long) cast was frowned upon, maybe something like what OpenBSD did in r1.15 / 1998?

http://www.openbsd.org/cgi-bin/cvsweb/src/sys/netinet/tcp_output.c.diff?r1=1.14;r2=1.15

--- tcp_output.c.pre	2011-04-02 13:50:32.000000000 +0200
+++ tcp_output.c	2011-04-02 13:50:35.000000000 +0200
@@ -575,7 +575,7 @@
 		 * taking into account that we are limited by
 		 * TCP_MAXWIN << tp->rcv_scale.
 		 */
-		long adv = min(recwin, (long)TCP_MAXWIN << tp->rcv_scale) -
+		long adv = lmin(recwin, (long)TCP_MAXWIN << tp->rcv_scale) -
 			(tp->rcv_adv - tp->rcv_nxt);

 		if(min(recwin, (long)TCP_MAXWIN << tp->rcv_scale) <

If anyone has an idea what could trigger these cases, I'd be happy to help debug. But without a clear testcase, it's a bit difficult.
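To make the effect of the min() to lmin() change in the diff above concrete, here is a small user-space sketch. It assumes the kernel's min() works on unsigned int while lmin() works on long, and that sequence numbers are 32-bit unsigned (my reading of the thread, worth checking against your own tree). To keep the result independent of the host platform, int64_t stands in for a 64-bit long and uint32_t for u_int; all function names here are made up for the illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Stand-ins for the kernel helpers discussed above (hypothetical names).
 * int64_t models a 64-bit long; uint32_t models u_int and tcp_seq.
 */
uint32_t
min_model(uint32_t a, uint32_t b)
{
	return (a < b ? a : b);
}

int64_t
lmin_model(int64_t a, int64_t b)
{
	return (a < b ? a : b);
}

/*
 * adv computed the min() way: every operand is 32-bit unsigned, so a
 * "negative" difference wraps modulo 2^32 and is then widened to a
 * large positive 64-bit value.
 */
int64_t
adv_with_min(int64_t recwin, int64_t maxwin, uint32_t rcv_adv,
    uint32_t rcv_nxt)
{
	/* Arguments are narrowed to 32 bits, as an unsigned min() would do. */
	return (min_model((uint32_t)recwin, (uint32_t)maxwin) -
	    (rcv_adv - rcv_nxt));
}

/*
 * adv computed the lmin() way: the left operand is signed 64-bit, so the
 * subtraction is done in signed arithmetic and can go negative.
 */
int64_t
adv_with_lmin(int64_t recwin, int64_t maxwin, uint32_t rcv_adv,
    uint32_t rcv_nxt)
{
	return (lmin_model(recwin, maxwin) - (rcv_adv - rcv_nxt));
}
```

For example, with recwin = 1000 and an already advertised window of rcv_adv - rcv_nxt = 2000, the first variant yields a value just under 2^32 while the second yields -1000. On a 32-bit long the wrapped value would be reinterpreted as negative anyway, which may be why the problem only shows up on 64-bit.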
CU, Sec -- "few languages are as bad as PHP for doing serious development work" -- Experiences of Using PHP in Large Websites From owner-freebsd-net@FreeBSD.ORG Sat Apr 2 19:52:33 2011 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B702610656D1; Sat, 2 Apr 2011 19:52:33 +0000 (UTC) (envelope-from kungfujesus06@gmail.com) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id 68A6A8FC12; Sat, 2 Apr 2011 19:52:33 +0000 (UTC) Received: by iwn33 with SMTP id 33so5654553iwn.13 for ; Sat, 02 Apr 2011 12:52:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:date:from:to:cc:subject:message-id:references :mime-version:content-type:content-disposition:in-reply-to :user-agent; bh=3H+D3x3sx+nb6IyosQO7k1UZs5gRBaiShkypQVcT1aA=; b=a1Qr6/zhkYVml3zlTmBOalkQV4lUPga145JwbZIddGaKNL8UVl/1mWkYmCOlXWX5kY nq3bQ+YWhhWaKFVCWCXVIEBHvHIppHb+Ywn9iXhumSAf9BcBGE/FSefLyXDVjKeetyAp WOChp+zOn/trKqG03R6pBJrK6r4K40tq4U908= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; b=ZERxmbCat4SZ2S1c/mS95pSUzhnUw2wH0OGQNvodLSTgkRDjuDtgdr4DQWeBI3LO+x ql0/owWg/lxWbtH+SbALWmGPnqLpO7DI5FvzIZ0CYEY8ctX4hNFggrO1zaWpe45pcliu BmCMdU4EZxLqNoELMjbq9nWzHy+7/XwAoCBsA= Received: by 10.231.93.73 with SMTP id u9mr2478746ibm.155.1301773952636; Sat, 02 Apr 2011 12:52:32 -0700 (PDT) Received: from zephyr.adamsnet ([72.49.234.31]) by mx.google.com with ESMTPS id c1sm2410475ibe.49.2011.04.02.12.52.30 (version=TLSv1/SSLv3 cipher=OTHER); Sat, 02 Apr 2011 12:52:31 -0700 (PDT) Date: Sat, 2 Apr 2011 15:52:04 -0400 From: Adam Stylinski To: Bernhard Schmidt Message-ID: <20110402195204.GA6629@zephyr.adamsnet> References: 
<201103311507.16263.bschmidt@freebsd.org> <20110331151421.GA3263@ossumpossum.geop.uc.edu> <201103311735.40634.bschmidt@freebsd.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="bg08WKrSYDhXBjb5" Content-Disposition: inline In-Reply-To: <201103311735.40634.bschmidt@freebsd.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-net@freebsd.org Subject: Re: net80211 and interface requests X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Apr 2011 19:52:33 -0000

--bg08WKrSYDhXBjb5 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Thu, Mar 31, 2011 at 05:35:40PM +0200, Bernhard Schmidt wrote:
> On Thursday, March 31, 2011 17:14:21 Adam Stylinski wrote:
> > On Thu, Mar 31, 2011 at 03:07:15PM +0200, Bernhard Schmidt wrote:
> > > On Thursday, March 31, 2011 14:20:33 Adam Stylinski wrote:
> > > > On Thu, Mar 31, 2011 at 09:02:45AM +0200, Bernhard Schmidt wrote:
> > > > > On Wednesday, March 30, 2011 23:17:53 Adam Stylinski wrote:
> > > > > > Hello,
> > > > > >
> > > > > > This list has helped me before so I'll email again with the hopes that
> > > > > > somebody has an answer. All is working well with my project, however for
> > > > > > the life of me I cannot get the interface to inject the raw frames faster
> > > > > > than 11mbps. I'm following the example given in
> > > > > > /usr/src/tools/tools/net80211/wlaninject.c, and manually specifying
> > > > > > parameters such as ucastrate, mcastrate, and mgmtrate within ifconfig. I'm
> > > > > > putting the card into pureg mode, and yet I still can't inject any faster.
> > > > > > I've even gone so far as to specify an ieee80211_txparam struct giving
> > > > > > values of 255 for both mcast and ucast rates within the struct (and of course
> > > > > > anding them by 0xff).  I then used the ioctl call to set the flags within
> > > > > > the interface request.  Any help would be greatly appreciated.
> > > > >
> > > > > You've set the ibp_rate0 parameter, right? This one is in half-mbps, so
> > > > > a value of 108 should give you 54m. The only thing I can think of right
> > > > > now is that the device (or channel) is actually configured for 11b not
> > > > > 11g mode. Can we rule that out? Which device are you using?
> > > > >
> > > > > > I am doing nanosleeps in between transmissions, as if I don't, the bpf clone
> > > > > > can't inject due to the buffer being too full.  There's probably a better
> > > > > > way of doing this, but I doubt the nanosleeps are the issue (after all, I get
> > > > > > almost exactly 11mbps).  I should probably note I'm not doing any ACKs, this
> > > > > > is pure transmits.
> > > > > >
> > > > > > If anybody cares enough to look at my unpolished code to get a better idea,
> > > > > > look here:
> > > > > >
> > > > > > http://projhinternet.svn.sourceforge.net/
> > > > > >
> > > > > > The idea is to allow unidirectional traffic so that with an FCC amateur
> > > > > > license (yes I know I'm not currently broadcasting the call sign as of yet)
> > > > > > you can broadcast unencrypted transmissions for miles (with a linear
> > > > > > amplifier spec'd to 2.4ghz).  With the license FCC part15 no longer applies
> > > > > > and you can operate just like in any other amateur band.
> > > > > > _______________________________________________
> > > > > > freebsd-net@freebsd.org mailing list
> > > > > > http://lists.freebsd.org/mailman/listinfo/freebsd-net
> > > > > > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
> > > > > >
> > > > >
> > > >
> > > > I'm using an Atheros AR2413 chipset, running in pure g mode, with the card also put into "mode 11g" and ucast, mcast, and mgmt rates set to 54.  I think the parameter for ibp_rate0 is just for setting it in the header (but I could be wrong).  Regardless, I am doing this; let me give you the exact source files I'm doing this in.
> > >
> > > Well, the ath_rate_* modules afaik do not honor the fixed rate
> > > settings. At least I've heard something about those being broken. The
> > > ibp_rate0 parameter set to 108 seems to be correct though.
> > >
> > > No clue why that doesn't work, you may have to debug ath_tx_findrix().
> > > Adding a printf of the passed over rate and ridx should shed some light
> > > on this I guess.
> > >
> > > > Line 38 in this file:
> > > > http://projhinternet.svn.sourceforge.net/viewvc/projhinternet/src/callbacks.c?revision=69&view=markup
> > > >
> > > > And the setup_if function in this:
> > > > http://projhinternet.svn.sourceforge.net/viewvc/projhinternet/src/libinject.c?revision=69&view=markup
> > > >
> > >
> >
> > It turns out strange coincidences can happen.  I decided to busy loop, thinking maybe it was my nanosleep call.  And what do you know, 52Mb/sec.  Is there some sort of call I can use to probe the fd to see if the buffer has been sent yet?
>
> Honestly, no clue. The bpf transmit path is a bunch of ugly hacks.
> What you can try though is to enable various debug options for
> net80211 and ath to figure out what's going on, especially the bits
> for xmit.
>
> On an unrelated side note, how is the ath/wlan0 interface configured?
> I mean, is it in sta mode or ahdemo? I guess most tests have been done
> in ahdemo mode. Also I'm sure that all frames are simply discarded if
> the device is currently scanning.
>
> --
> Bernhard

After looking through what BPF has to offer, the thought has occurred to me that the example program provided under the src tree does not make proper use of bpf semantics.  Wouldn't using the bpf_tap() and related helper functions actually inject on the interface in a sleeping manner (as in wait for the interface to finish transmitting and wait for the device-specific interrupt)?  Opening a datagram socket on the interface seems a bit hackish, and it doesn't seem to obey synchronous write semantics (BPF constantly returns -1 if the device is busy for whatever reason).

Am I wrong?

--bg08WKrSYDhXBjb5
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (FreeBSD)

iQIcBAEBAgAGBQJNl35jAAoJED6sRHE6TvmnXC0P/2SrczA0I84p8CDQvZ3Z4dqZ
/ivIFUqL8CRSVaLr3fqiLDccpPbkdT3wAxxGtG/gaLPGBTxpsImOU1eFwZKqT4eh
yqazJflg6F3DeQLrk+h55uuMjgyDTq0OHzCJavszw5quN1GwOUbzpCvWW9cvf58U
LdKzUtw89zlJr7gmdjcmgsw38b+o0mlupoTFTGqapYBTeUPH324JE/gb3PY6gxIu
nU/NjVrCbwinw09/wWO3bEm37HNxUvSrpKHBkZedi/y8lElefdiaKe/65t3GiEPy
3kyLmpl/iF7/xbXsqnr2IvmYh8XMV9yZviShP1GMBQOBxdwxeEy2SETzHdMouJHV
MzVrQDtefNdqDzG8hs0ZC3mTNv8WPGqyTiW4Ss/bc14vF1Cn1ks6Yt4IygyOreMK
iQZUtU1jpbFiDMe+MUiX8NW2f/cdbIAeyPC59kzm/ZhTUM3FIYFaOBsKPUfsNY5j
+f3VF6N6LD7A0ZoXWX7R6bjxosmFBX7InCDRjvxKcfyIvroESWtO/Kw7fCSEpaRq
1gh7UxUKVIf2iYZiZ4URCU1B9un9oQ/6/VHY721osz0sJLTkgNa+RPKUhP66lssv
foX3u0l/dys9ykdvnb4+tftmARo1nWOGRo+KbVdAeUXRAIqNqMW9Ea83dziVfbHU
a3OxQWElVni+tPu+FsJ5
=aIzG
-----END PGP SIGNATURE-----

--bg08WKrSYDhXBjb5--
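
The message above reports two workable extremes for pacing injected frames: a fixed nanosleep between writes (about 11 Mbps) and a busy loop (about 52 Mb/sec), with write() returning -1 while the device is busy. The thread does not name a call that reports when the buffer has drained, so the following is only a minimal sketch of a middle ground: retry the write with a short, growing sleep. The helper name inject_with_backoff, the retry limit, and the delay values are hypothetical, and the choice of ENOBUFS, EAGAIN and EINTR as the "busy" errno values is an assumption, since the message only says that -1 is returned.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of wlaninject.c or libinject.c.
 * Writes one frame to an already-open descriptor. When the write fails
 * with an errno that usually means "queue full, try again", it sleeps
 * briefly and retries instead of spinning.
 *
 * Assumption: a busy device fails with ENOBUFS, EAGAIN or EINTR. The
 * thread does not say which errno is set, so verify this on your system.
 */
static ssize_t
inject_with_backoff(int fd, const void *frame, size_t len, int max_tries)
{
	struct timespec delay = { 0, 50 * 1000 };	/* start at 50 us */
	int attempt;

	assert(frame != NULL && max_tries > 0);

	for (attempt = 0; attempt < max_tries; attempt++) {
		ssize_t n = write(fd, frame, len);

		if (n >= 0)
			return (n);	/* frame accepted by the kernel */
		if (errno != ENOBUFS && errno != EAGAIN && errno != EINTR)
			return (-1);	/* hard error: report it unchanged */

		nanosleep(&delay, NULL);
		if (delay.tv_nsec < 5 * 1000 * 1000)
			delay.tv_nsec *= 2;	/* back off, topping out at 6.4 ms */
	}
	errno = ENOBUFS;	/* still busy after max_tries attempts */
	return (-1);
}
```

A caller would use inject_with_backoff(fd, frame, len, 100) where it previously called write(fd, frame, len) followed by a fixed sleep. Whether this gets closer to the busy-loop rate depends on how quickly the driver queue drains, which would need to be measured.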