From owner-freebsd-current@FreeBSD.ORG Sun Nov 7 14:31:30 2010
Return-Path:
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id EF630106566B
	for ; Sun, 7 Nov 2010 14:31:30 +0000 (UTC)
	(envelope-from rmacklem@uoguelph.ca)
Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44])
	by mx1.freebsd.org (Postfix) with ESMTP id 929288FC0A
	for ; Sun, 7 Nov 2010 14:31:30 +0000 (UTC)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: ApwEAKdH1kyDaFvO/2dsb2JhbACDMJ9KqByQGoRVcwSEWIV9
X-IronPort-AV: E=Sophos;i="4.58,310,1286164800"; d="scan'208";a="99890952"
Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206])
	by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 07 Nov 2010 09:31:12 -0500
Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1])
	by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 4D67CB3F55;
	Sun, 7 Nov 2010 09:31:12 -0500 (EST)
Date: Sun, 7 Nov 2010 09:31:12 -0500 (EST)
From: Rick Macklem
To: pyunyh@gmail.com
Message-ID: <794553731.206411.1289140272239.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20101106023345.GC22715@michelle.cdnetworks.com>
MIME-Version: 1.0
Content-Type: multipart/mixed;
	boundary="----=_Part_206410_719282442.1289140272238"
X-Originating-IP: [99.225.56.115]
X-Mailer: Zimbra 6.0.7_GA_2476.RHEL4 (ZimbraWebClient - IE8 (Win)/6.0.7_GA_2473.RHEL4_64)
Cc: freebsd-current@freebsd.org
Subject: Re: re(4) driver dropping packets when reading NFS files
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 07 Nov 2010 14:31:31 -0000

------=_Part_206410_719282442.1289140272238
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

> > I've added a counter of how many times re_rxeof() gets called, but
> > then returns without handling any received packets (I think because
> > RL_RDESC_STAT_OWN is set on the first entry it looks at in the rcv.
> > ring.)
> >
> > This count comes out as almost the same as the # of missed frames
> > (see "rxe did 0:" in the attached stats).
> >
> > So, I think what is happening about 10% of the time is that
> > re_rxeof() is looking at the ring descriptor before it has been
> > "updated" and then returns without handling the packet and then it
> > doesn't get called again because the RL_ISR bit has been cleared.
> >
> > When "options DEVICE_POLLING" is specified, it works ok, because it
> > calls re_rxeof() fairly frequently and it doesn't pay any attention
> > to the RL_ISR bits.
> >
>
> That's one of possible theory. See below for another theory.
>
Well, my above theory is bunk. I hacked it so it would continue to run
re_int_task() until the receive descriptor was valid and it just hummed
away until the next frame was received.

> > Now, I don't know if this is a hardware flaw on this machine or
> > something that can be fixed in software? I know diddly about the
> > current driver
>
> I highly doubt it could be hardware issue.
>
You have much more confidence in the hardware guys than I do;-)
(see summary below)

>
> Another possibility I have in mind is the controller would have
> reported RL_ISR_RX_OVERRUN but re(4) didn't check that condition.
> The old data sheet I have does not clearly mention when it is set
> and what other bits of RL_ISR register is affected from the bit.
> If RL_ISR_RX_OVERRUN bit is set when there are no more free RX
> descriptors available to controller and RL_ISR_RX_ERR is not set
> when RL_ISR_RX_OVERRUN is set, re(4) have ignored that event.
> Because driver ignored that error interrupt, the next error
> interrupt would be RL_ISR_FIFO_OFLOW error and this time it would
> be served in re_rxeof() and will refill RX buffers.
> However driver would have lost lots of frames received between the
> time window RL_ISR_RX_OVERRUN error and RL_ISR_FIFO_OFLOW error.
>
RL_ISR_RX_OVERRUN never gets set during the testing I do.

> If this theory is correct, the attached patch may mitigate the
> issue.
>
Your re.intr.patch4 had no effect. I've tried a bunch of other similar
ones that also had no effect.

>
> > Otherwise, I can live with "options DEVICE_POLLING".
> >
This turned out to be my mistake. "options DEVICE_POLLING" doesn't help.
(I think I screwed up by running a test without it enabled and then
re-running the test after enabling it. It ran fast because I had primed
the cache, oops.)

Summary so far:
- almost all variants I've tried result in a "missed frame" rate of
  8-11%, which gives you about 500Kbytes/sec read rate.
  Here are some of the things I tried that didn't change this:
  - disabling the hardware offload stuff like checksums
  - changing the size of the receive ring (except making it really
    small seemed to degrade perf. a little)
  - enabling use of the I/O map
  ** - as noted above "options DEVICE_POLLING" doesn't actually help
  - enabling RE_TX_MODERATION
  - changing the value of maxpkts in re_rxeof() (making it smaller
    seems to make the miss rate slightly higher)
- when the above is happening, the # of times re_rxeof() finds the
  first descriptor in the receive ring with the frame still owned by
  the device is just slightly lower than the # of "missed frames"
  reported by the chip's stats.
- about half of these interrupts are due to RL_ISR_FIFO_OFLOW and the
  other half RL_ISR_RX_OK. (As noted above, I never see
  RL_ISR_RX_OVERRUN.)

The only combination that I've come up with that reduces the "missed
frame" rate significantly is the little patch I passed along before.
It does 3 things:
- disables msi
- doesn't clear RL_IMR in re_intr()
- doesn't set RL_IMR in re_int_task(), since the bits weren't cleared
(This little patch is attached, in case anyone is interested in trying
it.)
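
For anyone who doesn't want to decode the attachment, the two RL_IMR
changes listed above can be pictured with the userland model below. It
is only a sketch under stated assumptions, not the driver code: struct
model_nic, model_intr(), model_int_task(), model_can_interrupt() and the
value of MODEL_INTRS are all made-up stand-ins for the softc, re_intr(),
re_int_task() and RL_INTRS_CPLUS. The "stock" flag selects the unpatched
behaviour (mask in the filter, unmask at the end of the task); with it
off, the mask is left alone, as in the patch. The msi change is not
modelled here; in the attached diff it is just the default of the
hw.re.msi_disable tunable being flipped to 1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the interrupt bits the driver cares about
 * (RL_INTRS_CPLUS in the driver). The value here is made up. */
#define MODEL_INTRS 0x007fU

/* Two fake 16-bit registers standing in for RL_ISR and RL_IMR. */
struct model_nic {
	uint16_t isr;	/* pending interrupt causes */
	uint16_t imr;	/* interrupt mask; 0 means nothing can interrupt */
};

/* Model of the filter routine (re_intr() in this mail).
 * Returns true if the deferred task should be scheduled. */
static bool
model_intr(struct model_nic *nic, bool stock)
{
	uint16_t status = nic->isr;

	if (status == 0xFFFF || (status & MODEL_INTRS) == 0)
		return (false);		/* stray interrupt, nothing to do */
	if (stock)
		nic->imr = 0;		/* stock: mask everything until the task runs */
	return (true);
}

/* Model of the deferred task (re_int_task() in this mail). */
static void
model_int_task(struct model_nic *nic, bool stock)
{
	nic->isr = 0;			/* pretend all causes were serviced and acked */
	if (stock)
		nic->imr = MODEL_INTRS;	/* stock: unmask again on the way out */
}

/* True if the (fake) chip is currently allowed to raise an interrupt. */
static bool
model_can_interrupt(const struct model_nic *nic)
{
	return ((nic->imr & MODEL_INTRS) != 0);
}
```

In the stock flow model_can_interrupt() is false between model_intr()
and the end of model_int_task(); in the patched flow it stays true the
whole time. The model says nothing about why that would change the
number of missed frames.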
When I run with this variant the "missed frame" rate drops to less than
2% (1.something) and I get about 5Mbytes/sec read rate on the 100Mbps
port.

I'm out of ideas w.r.t. what to try. I think it takes someone who knows
what actually causes the chip to "miss a frame" to solve it? I don't
know why the little patch I describe above reduces, but does not
eliminate, the "missed frame" problem. (Since the chip seems to know it
"missed the frame", it smells hardware related to me?)

I'd think that RL_ISR_FIFO_OFLOW would mean that it ran out of rcv.
buffers, but there's 256 of them (and I tried increasing it to 512 with
no effect). Again, I think it takes someone familiar with the hardware
to say why this would be happening?

Anyhow, I'm out of theories/ideas. I can easily test any patch that
seems like it might help.

Thanks for your assistance, rick

------=_Part_206410_719282442.1289140272238
Content-Type: text/x-patch; name=if_re.patch
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename=if_re.patch

LS0tIGlmX3JlLmMuc2F2CTIwMTAtMTEtMDcgMDk6MjY6MzMuMDAwMDAwMDAwIC0wNTAwCisrKyBp
Zl9yZS5jCTIwMTAtMTEtMDcgMDk6Mjc6MTIuMDAwMDAwMDAwIC0wNTAwCkBAIC0xNTcsNyArMTU3
LDcgQEAKICNpbmNsdWRlICJtaWlidXNfaWYuaCIKIAogLyogVHVuYWJsZXMuICovCi1zdGF0aWMg
aW50IG1zaV9kaXNhYmxlID0gMDsKK3N0YXRpYyBpbnQgbXNpX2Rpc2FibGUgPSAxOwogVFVOQUJM
RV9JTlQoImh3LnJlLm1zaV9kaXNhYmxlIiwgJm1zaV9kaXNhYmxlKTsKIHN0YXRpYyBpbnQgcHJl
ZmVyX2lvbWFwID0gMDsKIFRVTkFCTEVfSU5UKCJody5yZS5wcmVmZXJfaW9tYXAiLCAmcHJlZmVy
X2lvbWFwKTsKQEAgLTIyMjIsNyArMjIyMiw2IEBACiAJc3RhdHVzID0gQ1NSX1JFQURfMihzYywg
UkxfSVNSKTsKIAlpZiAoc3RhdHVzID09IDB4RkZGRiB8fCAoc3RhdHVzICYgUkxfSU5UUlNfQ1BM
VVMpID09IDApCiAgICAgICAgICAgICAgICAgcmV0dXJuIChGSUxURVJfU1RSQVkpOwotCUNTUl9X
UklURV8yKHNjLCBSTF9JTVIsIDApOwogCiAJdGFza3F1ZXVlX2VucXVldWVfZmFzdCh0YXNrcXVl
dWVfZmFzdCwgJnNjLT5ybF9pbnR0YXNrKTsKIApAQCAtMjI5Niw3ICsyMjk1LDYgQEAKIAkJcmV0
dXJuOwogCX0KIAotCUNTUl9XUklURV8yKHNjLCBSTF9JTVIsIFJMX0lOVFJTX0NQTFVTKTsKIH0K
IAogc3RhdGljIGludAo=
------=_Part_206410_719282442.1289140272238--