Date: Tue, 6 Apr 2010 20:00:27 +0200
From: Andre Albsmeier <Andre.Albsmeier@siemens.com>
To: Pyun YongHyeon <pyunyh@gmail.com>
Cc: "svn-src-stable@freebsd.org" <svn-src-stable@freebsd.org>, "svn-src-all@freebsd.org" <svn-src-all@freebsd.org>, "Albsmeier, Andre" <andre.albsmeier@siemens.com>, Pyun YongHyeon <yongari@freebsd.org>
Subject: Re: svn commit: r205614 - stable/7/sys/dev/msk
Message-ID: <20100406180027.GA3724@curry.mchp.siemens.de>
In-Reply-To: <20100406134626.GA1727@curry.mchp.siemens.de>
References: <201003241721.o2OHL5K9063538@svn.freebsd.org> <20100405145937.GA78871@curry.mchp.siemens.de> <20100405180642.GD1225@michelle.cdnetworks.com> <20100406134626.GA1727@curry.mchp.siemens.de>

On Tue, 06-Apr-2010 at 15:46:26 +0200, Andre Albsmeier wrote:
> On Mon, 05-Apr-2010 at 20:06:42 +0200, Pyun YongHyeon wrote:
> > On Mon, Apr 05, 2010 at 04:59:37PM +0200, Andre Albsmeier wrote:
> > > On Wed, 24-Mar-2010 at 18:21:05 +0100, Pyun YongHyeon wrote:
> > > > Author: yongari
> > > > Date: Wed Mar 24 17:21:05 2010
> > > > New Revision: 205614
> > > > URL: http://svn.freebsd.org/changeset/base/205614
> > > >
> > > > Log:
> > > >   MFC r204545:
> > > >     Remove taskqueue based interrupt handling. After r204541 msk(4)
> > > >     does not generate excessive interrupts any more so we don't need
> > > >     to have two copies of interrupt handler.
> > > >     While I'm here remove two STAT_PUT_IDX register accesses in LE
> > > >     status event handler. After r204539 msk(4) always sync status LEs
> > > >     so there is no need to resort to reading STAT_PUT_IDX register to
> > > >     know the end of status LE processing. Just trust status LE's
> > > >     ownership bit.
> > >
> > > This ruined the performance on my system heavily. I noticed it
> > > when unpacking a local tar archive onto an NFS-mounted location
> > > on an em(4)-based box. This archive is about 50MB of size with
> > > a bit over 5600 files so files have an average size of 9 kB.
> > >
> > > I also noticed the slowdown when doing rdist-based updates (again
> > > lots of small files) onto the other box.
> > >
> > > Just pumping bytes over the network shows no problems -- I can
> > > transmit 100-105 MB/s and receive 95-100 MB/s when talking
> > > to this em(4)-based box without problem (and as it was before).
> > >
> > > When copying a few big files (several GBs of size) over NFS
> > > I get something between 70 and 90 MB/s which is the same as
> > > what I had got before.
> > >
> > > I have made some tests to track down when the issues began.
> > > Problems started with rev. 1.18.2.37 of if_msk.c but could
> > > be alleviated by setting dev.mskc.0.int_holdoff to 1 or 0.
> > > Things really got problematic with rev. 1.18.2.38 -- adjusting
> > > dev.mskc.0.int_holdoff helped a lot but we are far from what
> > > we had with 1.18.2.36 or earlier. I did 5 rounds of testing,
> > > each with the same set of if_msk.c revisions and values for
> > > int_holdoff (where appropriate) just to check reproducibility:
> > >
> > > if_msk.c rev.      round1  round2  round3  round4  round5
> > > --------------------------------------------------------
> > > 1.18.2.34          17,115  18,408  17,977  16,412  19,170
> > > 1.18.2.35          18,414  17,863  17,000  18,428  18,093
> > > 1.18.2.36          19,631  18,167  18,105  18,401  17,995
> > > 1.18.2.37          22,707  24,830  24,322  23,613  22,498
> > >   int_holdoff=10   19,259  19,870  19,355  18,725  19,273
> > >   int_holdoff=1    18,464  18,218  17,862  16,701  17,798
> > >   int_holdoff=0    19,423  18,507  19,505  20,714  20,460
> > > 1.18.2.38          57,169  53,394  58,721  not done
> > >   int_holdoff=10   30,266  33,493  33,240  33,247  30,470
> > >   int_holdoff=1    27,013  28,777  28,047  25,858  27,615
> > >   int_holdoff=0    40,284  33,040  33,726  36,834  35,235
> > >
> > > All this is on
> > >
> > > FreeBSD-7.3-STABLE
> > >
> > > CPU: Intel(R) Core(TM)2 Quad CPU Q9650 @ 3.00GHz (3001.18-MHz 686-class CPU)
> > > Origin = "GenuineIntel" Id = 0x1067a Family = 6 Model = 17 Stepping = 10
> > >
> > > dev.mskc.0.%desc: Marvell Yukon 88E8053 Gigabit Ethernet
> > > dev.msk.0.%desc: Marvell Technology Group Ltd. Yukon EC Id 0xb6 Rev 0x02
> > >
> > > hw.msk.msi_disable was set to 1 but didn't change results
> > > when commenting it out.
> > >
> > > Any ideas or things I can try?
> > >
> > Could you narrow down which side(RX or TX) cause the issue you're
> > seeing? From your description it's not clear whether msk(4) is used
> > as sender or receiver.
>
> Well, both. I will try to describe the setup more exactly:
>
> On the msk(4)-box a locally residing tar file (48 MB size
> containing 5600 files) is unpacked onto an NFS volume.
> This NFS volume is mounted from another box which got
> an em(4)-based NIC. I have now measured the amounts of
> data being sent and received simply by using netstat:
>
> About 62 MB are being sent out of the msk(4)-box to the
> em(4)-based NFS box and about 22 MB are received on the
> msk(4)-box from the em(4)-based NFS box.
>
> I have now tried the reverse direction as well: The
> em(4)-based box mounts an NFS volume from the msk(4)-box
> and unpacks the same tar file (now the 62 MB are received
> on the msk(4)-box and 22 MB are transmitted from the
> msk(4)-box). The results are similar:
>
> rev. 1.18.2.38: 48,243 seconds
> rev. 1.18.2.36: 17,536 seconds
>
> But I noticed another thing here at work: If I choose a
> remote machine which uses myk(4) (not msk(4)) instead of
> em(4) there are no performance issues noticeable.
> Unfortunately I can't test msk(4) on the remote side at
> the moment... So the performance issues exist only when the
> new msk driver is talking to an em-based NIC...
>
> > As you know 1.18.2.38 removed taskqueue based interrupt handling so
> > it could be the culprit of the issue. But that revision also removed
> > two register accesses in TX path so I'd like to know which one
> > caused the issue.
>
> I have now tried rev. 1.18.2.38 with this patch (no idea if
> this is right ;-)):
>
> --- if_msk.c.1.18.2.38 2010-04-06 15:09:19.000000000 +0200
> +++ if_msk.c.1.18.2.38.TRY 2010-04-06 15:38:13.000000000 +0200
> @@ -3327,6 +3327,11 @@
>      uint32_t control, status;
>      int cons, len, port, rxprog;
>
> +    int idx;
> +    idx = CSR_READ_2(sc, STAT_PUT_IDX);
> +    if (idx == sc->msk_stat_cons)
> +        return (0);
> +
>      /* Sync status LEs. */
>      bus_dmamap_sync(sc->msk_stat_tag, sc->msk_stat_map,
>          BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE);
> @@ -3407,7 +3412,7 @@
>      if (rxput[MSK_PORT_B] > 0)
>          msk_rxput(sc->msk_if[MSK_PORT_B]);
>
> -    return (rxprog > sc->msk_process_limit ? EAGAIN : 0);
> +    return (sc->msk_stat_cons != CSR_READ_2(sc, STAT_PUT_IDX));
>  }
>
>  static void
>
> Now performance seems to be the same as with the older
> driver (at least here at work) and in both directions!
> Some numbers:
>
> em0 writes to rev. 1.18.2.36:                        20 seconds
> em0 writes to rev. 1.18.2.38:                        50 seconds
> em0 writes to rev. 1.18.2.38 with patch from above:  23 seconds
> same as before but with int_holdoff: 100 -> 1:       20 seconds
>
> rev. 1.18.2.36 writes to em0:                        22 seconds
> rev. 1.18.2.38 writes to em0:                        40 seconds
> rev. 1.18.2.38 with patch from above writes to em0:  21 seconds
> same as before but with int_holdoff: 100 -> 1:       20 seconds
>
> It seems that these two CSR_READ_2s really help ;-).
>
> As I said, this is at work and with slightly different machines.
> I will try things at home later but I am rather confident of
> receiving good results there as well...

OK, tests at home show similar good results with the above
patch. When setting int_holdoff to 3, performance seems equal
to the older versions.

    -Andre
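
For readers following the patch above, here is a minimal, stand-alone sketch of the idea it restores: compare the driver's consumer index with the hardware's put index before doing any work, and again afterwards to report whether entries are still pending. This is not code from if_msk.c. The names `fake_dev`, `dev_post`, `handle_events` and `RING_SIZE` are hypothetical, and the "hardware" is just a struct in memory, so it only illustrates the control flow, not real register access or DMA syncing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u    /* hypothetical ring length, not the driver's value */

/* Hypothetical stand-in for the device and driver state. */
struct fake_dev {
    uint16_t put_idx;   /* plays the role of the STAT_PUT_IDX register */
    uint16_t cons_idx;  /* plays the role of sc->msk_stat_cons */
    unsigned handled;   /* number of entries processed so far */
};

/* The simulated hardware posts n new status entries. */
static void dev_post(struct fake_dev *d, unsigned n)
{
    d->put_idx = (uint16_t)((d->put_idx + n) % RING_SIZE);
}

/*
 * Process at most `limit` pending entries.
 * Returns 0 when the ring is drained, nonzero when entries remain,
 * mirroring the "cons != put" return value in the patch.
 */
static int handle_events(struct fake_dev *d, unsigned limit)
{
    assert(d != NULL);

    /* Early exit: nothing new since the last call. */
    if (d->put_idx == d->cons_idx)
        return 0;

    for (unsigned n = 0; n < limit && d->cons_idx != d->put_idx; n++) {
        d->handled++;   /* a real driver would dispatch RX/TX work here */
        d->cons_idx = (uint16_t)((d->cons_idx + 1) % RING_SIZE);
    }

    /* Tell the caller whether another pass is needed. */
    return d->cons_idx != d->put_idx;
}
```

A caller would typically loop while `handle_events()` returns nonzero, or reschedule itself, so that entries posted during processing are not left waiting for the next interrupt.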