Date:      Sun, 7 Nov 2010 09:31:12 -0500 (EST)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        pyunyh@gmail.com
Cc:        freebsd-current@freebsd.org
Subject:   Re: re(4) driver dropping packets when reading NFS files
Message-ID:  <794553731.206411.1289140272239.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20101106023345.GC22715@michelle.cdnetworks.com>

------=_Part_206410_719282442.1289140272238
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

> > I've added a counter of how many times re_rxeof() gets called, but then
> > returns without handling any received packets (I think because
> > RL_RDESC_STAT_OWN is set on the first entry it looks at in the rcv.
> > ring.)
> >
> > This count comes out as almost the same as the # of missed frames (see
> > "rxe did 0:" in the attached stats).
> >
> > So, I think what is happening about 10% of the time is that re_rxeof()
> > is looking at the ring descriptor before it has been "updated" and then
> > returns without handling the packet and then it doesn't get called again
> > because the RL_ISR bit has been cleared.
> >
> > When "options DEVICE_POLLING" is specified, it works ok, because it calls
> > re_rxeof() fairly frequently and it doesn't pay any attention to the
> > RL_ISR bits.
> >
> 
> That's one possible theory. See below for another theory.
> 

Well, my theory above is bunk. I hacked it so that it would keep
running re_int_task() until the receive descriptor was valid, and it
just hummed away until the next frame was received.

> > Now, I don't know if this is a hardware flaw on this machine or
> > something that can be fixed in software? I know diddly about the
> > current driver
> 
> I highly doubt it could be hardware issue.
> 
You have much more confidence in the hardware guys than I do;-)
(see summary below)

> 
> Another possibility I have in mind is the controller would have
> reported RL_ISR_RX_OVERRUN but re(4) didn't check that condition.
> The old data sheet I have does not clearly mention when it is set
> and what other bits of the RL_ISR register are affected by the bit.
> If the RL_ISR_RX_OVERRUN bit is set when there are no more free RX
> descriptors available to the controller and RL_ISR_RX_ERR is not set
> when RL_ISR_RX_OVERRUN is set, re(4) would have ignored that event.
> Because driver ignored that error interrupt, the next error
> interrupt would be RL_ISR_FIFO_OFLOW error and this time it would
> be served in re_rxeof() and will refill RX buffers.
> However driver would have lost lots of frames received between the
> time window RL_ISR_RX_OVERRUN error and RL_ISR_FIFO_OFLOW error.
> 
RL_ISR_RX_OVERRUN never gets set during the testing I do.

> If this theory is correct, the attached patch may mitigate the
> issue.
> 
Your re.intr.patch4 had no effect. I've tried a bunch of other
similar ones that also had no effect.

> 
> > Otherwise, I can live with "options DEVICE_POLLING".
> >

This turned out to be my mistake. "options DEVICE_POLLING"
doesn't help. (I think I screwed up by running a test without
it enabled and then re-running the test after enabling it. It
ran fast because I had primed the cache, oops.)

Summary so far:
- almost all variants I've tried result in a "missed frame"
  rate of 8-11%, which gives you about 500Kbytes/sec read rate.
  Here are some of the things I tried that didn't change this:
  - disabling the hardware offload stuff like checksums
  - changing the size of the receive ring (except making it
    really small seemed to degrade perf. a little)
  - enabling use of the I/O map
  - (**) as noted above, "options DEVICE_POLLING" doesn't actually
    help
  - enabling RE_TX_MODERATION
  - reducing the value of maxpkts in re_rxeof() (if anything, this
    seems to make the miss rate slightly higher)

- when the above is happening, the # of times re_rxeof() finds the
  first descriptor in the receive ring with the frame still owned
  by the device is just slightly lower than the # of "missed frames"
  reported by the chip's stats.
  - about half of these interrupts are due to RL_ISR_FIFO_OFLOW and
    the other half RL_ISR_RX_OK. (As noted above, I never see
    RL_ISR_RX_OVERRUN.)
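
For anyone who wants to see what that counter is measuring, here is a
minimal toy model in C. It is only a sketch and is not the if_re.c
code: the names RING_SIZE, DESC_OWN, struct rx_ring, ring_init(),
device_rx() and model_rxeof() are all made up for illustration, and
DESC_OWN merely stands in for RL_RDESC_STAT_OWN. It assumes the usual
convention that a descriptor with the OWN bit set still belongs to the
device. It does not try to explain why the chip misses frames.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8            /* hypothetical; the real ring has 256 entries */
#define DESC_OWN  0x80000000u  /* stand-in for RL_RDESC_STAT_OWN */

struct rx_ring {
	uint32_t stat[RING_SIZE];  /* per-descriptor status word */
	int      prod;             /* next slot the "device" fills */
	int      cons;             /* next slot the "driver" examines */
	unsigned zero_work_calls;  /* calls that handled no frames */
};

/* Hand every descriptor to the device, as after initialization. */
static void
ring_init(struct rx_ring *r)
{
	for (int i = 0; i < RING_SIZE; i++)
		r->stat[i] = DESC_OWN;
	r->prod = r->cons = 0;
	r->zero_work_calls = 0;
}

/*
 * Device side: complete one frame by clearing OWN on the next slot.
 * Returns 0 if the device does not own that slot (no free descriptor).
 */
static int
device_rx(struct rx_ring *r)
{
	if ((r->stat[r->prod] & DESC_OWN) == 0)
		return (0);
	r->stat[r->prod] &= ~DESC_OWN;
	r->prod = (r->prod + 1) % RING_SIZE;
	return (1);
}

/*
 * Driver side: handle completed frames until a device-owned descriptor
 * is found, and count the calls that found nothing to do.
 */
static int
model_rxeof(struct rx_ring *r)
{
	int handled = 0;

	assert(r->cons >= 0 && r->cons < RING_SIZE);
	while ((r->stat[r->cons] & DESC_OWN) == 0) {
		r->stat[r->cons] = DESC_OWN;  /* give the buffer back */
		r->cons = (r->cons + 1) % RING_SIZE;
		handled++;
	}
	if (handled == 0)
		r->zero_work_calls++;         /* the "rxe did 0" case */
	return (handled);
}
```

In this model, calling model_rxeof() right after ring_init() returns 0
and bumps zero_work_calls, which is the situation the real counter
records: the routine ran, but the first descriptor it looked at was
still owned by the device.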

The only combination that I've come up with that reduces the "missed frame"
rate significantly is the little patch I passed along before. It does 3
things:
- disables MSI
- doesn't clear RL_IMR in re_intr()
- doesn't set RL_IMR in re_int_task(), since the bits weren't cleared
(This little patch is attached, in case anyone is interested in trying it.)

When I run with this variant the "missed frame" rate drops to less than
2% (1.something) and I get about 5Mbytes/sec read rate on the 100Mbps port.

I'm out of ideas w.r.t. what to try. I think it takes someone who knows
what actually causes the chip to "miss a frame" to solve it? I don't know
why the little patch I describe above reduces, but does not eliminate, the
"missed frame" problem. (Since the chip seems to know it "missed the frame",
it smells hardware related to me?)

I'd think that RL_ISR_FIFO_OFLOW would mean that it ran out of rcv. buffers,
but there's 256 of them (and I tried increasing it to 512 with no effect).
Again, I think it takes someone familiar with the hardware to say why this
would be happening?

Anyhow, I'm out of theories/ideas. I can easily test any patch that seems
like it might help.

Thanks for your assistance, rick

------=_Part_206410_719282442.1289140272238
Content-Type: text/x-patch; name=if_re.patch
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename=if_re.patch

--- if_re.c.sav	2010-11-07 09:26:33.000000000 -0500
+++ if_re.c	2010-11-07 09:27:12.000000000 -0500
@@ -157,7 +157,7 @@
 #include "miibus_if.h"
 
 /* Tunables. */
-static int msi_disable = 0;
+static int msi_disable = 1;
 TUNABLE_INT("hw.re.msi_disable", &msi_disable);
 static int prefer_iomap = 0;
 TUNABLE_INT("hw.re.prefer_iomap", &prefer_iomap);
@@ -2222,7 +2222,6 @@
 	status = CSR_READ_2(sc, RL_ISR);
 	if (status == 0xFFFF || (status & RL_INTRS_CPLUS) == 0)
                 return (FILTER_STRAY);
-	CSR_WRITE_2(sc, RL_IMR, 0);
 
 	taskqueue_enqueue_fast(taskqueue_fast, &sc->rl_inttask);
 
@@ -2296,7 +2295,6 @@
 		return;
 	}
 
-	CSR_WRITE_2(sc, RL_IMR, RL_INTRS_CPLUS);
 }
 
 static int
------=_Part_206410_719282442.1289140272238--