From owner-freebsd-performance@FreeBSD.ORG Wed Apr 9 15:44:04 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 28A9737B401; Wed, 9 Apr 2003 15:44:04 -0700 (PDT) Received: from postal2.lbl.gov (postal2.lbl.gov [131.243.248.26]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3E7DA43F93; Wed, 9 Apr 2003 15:44:01 -0700 (PDT) (envelope-from j_guojun@lbl.gov) Received: from postal2.lbl.gov (localhost [127.0.0.1]) by postal2.lbl.gov (8.12.8/8.12.8) with ESMTP id h39MhwZ6012838; Wed, 9 Apr 2003 15:43:58 -0700 (PDT) Received: from lbl.gov (gracie.lbl.gov [131.243.2.175]) by postal2.lbl.gov (8.12.8/8.12.8) with ESMTP id h39MhvIg012832; Wed, 9 Apr 2003 15:43:57 -0700 (PDT) Sender: jin@lbl.gov Message-ID: <3E94A22D.174321F0@lbl.gov> Date: Wed, 09 Apr 2003 15:43:57 -0700 From: "Jin Guojun [DSD]" X-Mailer: Mozilla 4.76 [en] (X11; U; FreeBSD 4.7-RELEASE i386) X-Accept-Language: zh, zh-CN, en MIME-Version: 1.0 To: freebsd-hackers@freebsd.org, freebsd-performance@freebsd.org Content-Type: multipart/mixed; boundary="------------32793A0542BCFE2DD05E9119" X-Mailman-Approved-At: Wed, 09 Apr 2003 15:53:43 -0700 Subject: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Apr 2003 22:44:04 -0000 This is a multi-part message in MIME format. --------------32793A0542BCFE2DD05E9119 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit When testing a GigE path that has a 67 ms RTT, the maximum TCP throughput is limited to 250 Mb/s. By tracing the problem, I found that tcp_output() is starving while snd_wnd and snd_cwnd are fully open. snd_cc never fills beyond 4.05 MB even though hiwat is 8 MB and mbmax is 10 MB (see the attached trace). That is, sosend() never stopped at sbwait(), so the only place that can slow things down is the mbuf allocation in sosend(). The attached trace file shows that each MGET and MCLGET takes significant time -- around 8 us during slow start, gradually increasing after that to a range of 18 to 648 us. Each packet Tx on GigE takes 12 us. If the average mbuf allocation takes 18 us, performance should drop to 40%; in fact it is down to 25%, which implies an even higher average delay. I have changed NMBCLUSTERS from 2446 to 6566 to 10240, and nothing improved. Can anyone tell me what factors would cause MGET / MCLGET to wait? Is there any way to make MGET/MCLGET not wait? -Jin ----------- system info ------------- kern.ipc.maxsockbuf: 10485760 net.inet.tcp.sendspace: 8388608 kern.ipc.nmbclusters: 10240 kern.ipc.mbuf_wait: 32 kern.ipc.mbtypes: 2606 322 0 0 0 0 0 0 0 0 0 0 0 0 0 0 kern.ipc.nmbufs: 40960 -------------- code trace and explanation ---------- sosend() { ... if (space < resid + clen && (atomic || space < so->so_snd.sb_lowat || space < clen)) { if (so->so_state & SS_NBIO) snderr(EWOULDBLOCK); sbunlock(&so->so_snd); error = sbwait(&so->so_snd); /***** never come down to here ****/ splx(s); if (error) goto out; goto restart; } splx(s); mp = &top; space -= clen; do { if (uio == NULL) { /* * Data is prepackaged in "top". 
*/ resid = 0; if (flags & MSG_EOR) top->m_flags |= M_EOR; } else do { if (top == 0) { microtime(&t1); MGETHDR(m, M_WAIT, MT_DATA); if (m == NULL) { error = ENOBUFS; goto release; } mlen = MHLEN; m->m_pkthdr.len = 0; m->m_pkthdr.rcvif = (struct ifnet *)0; } else { MGET(m, M_WAIT, MT_DATA); if (m == NULL) { error = ENOBUFS; goto release; } mlen = MLEN; } if (resid >= MINCLSIZE) { MCLGET(m, M_WAIT); if ((m->m_flags & M_EXT) == 0) goto nopages; mlen = MCLBYTES; len = min(min(mlen, resid), space); } else { nopages: len = min(min(mlen, resid), space); /* * For datagram protocols, leave room * for protocol headers in first mbuf. */ if (atomic && top == 0 && len < mlen) MH_ALIGN(m, len); } microtime(&t2); td = time_diff(&t2, &t1); if ((td > 5 && (++tcnt & 31) == 0) || td > 50) log( ... "td %d %d\n", td, tcnt); ... } /* end of sosend */ --------------32793A0542BCFE2DD05E9119 Content-Type: text/plain; charset=us-ascii; name="sosend.trace" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="sosend.trace" /kernel: l-timer 1049825316.1469006 1049825316.787656 68135 seq# 1608354634 /kernel: lion init rtt 68135 hiwat 8388608 mbmax 10485760 /kernel: init6 maxbss 8516875 (1460) co_tick 1 nbs 6 NIC 1000000 # delay count /kernel: sosend: td 6 32 /kernel: sosend: td 7 64 /kernel: sosend: td 6 96 /kernel: sosend: td 9 128 /kernel: sosend: td 6 160 /kernel: sosend: td 6 192 /kernel: sosend: td 7 224 /kernel: sosend: td 6 256 /kernel: sosend: td 6 288 /kernel: sosend: td 7 320 /kernel: sosend: td 6 352 /kernel: sosend: td 7 384 /kernel: sosend: td 8 416 /kernel: sosend: td 6 448 /kernel: sosend: td 6 480 /kernel: sosend: td 6 512 /kernel: sosend: td 6 544 /kernel: sosend: td 6 576 /kernel: sosend: td 6 608 /kernel: sosend: td 6 640 /kernel: sosend: td 6 672 /kernel: sosend: td 6 704 /kernel: sosend: td 7 736 /kernel: sosend: td 8 768 /kernel: sosend: td 7 800 /kernel: sosend: td 7 832 /kernel: sosend: td 7 864 /kernel: sosend: td 8 896 /kernel: sosend: td 8 928 /kernel: sosend: td 7 960 /kernel: tcp_lion SO_SNDBUF 8517136 // end of slow start /kernel: #(MBS 25550625) swnd 202496 cwnd 6744744 mbcnt 4133376 sp 4843184 /kernel: sosend: td 58 965 /kernel: sosend: td 370 966 /kernel: sosend: td 57 970 /kernel: sosend: td 77 972 /kernel: sosend: td 52 974 /kernel: sosend: td 55 975 /kernel: sosend: td 58 976 /kernel: sosend: td 53 982 /kernel: sosend: td 25 992 /kernel: sosend: td 55 1017 /kernel: sosend: td 19 1024 /kernel: sosend: td 27 1056 /kernel: sosend: td 26 1088 /kernel: sosend: td 43 1120 /kernel: sosend: td 30 1152 /kernel: sosend: td 17 1184 /kernel: sosend: td 25 1216 /kernel: sosend: td 25 1248 /kernel: sosend: td 29 1280 /kernel: sosend: td 18 1312 /kernel: sosend: td 28 1344 /kernel: sosend: td 26 1376 /kernel: sosend: td 26 1408 /kernel: sosend: td 27 1440 /kernel: sosend: td 32 1472 /kernel: sosend: td 27 1504 /kernel: sosend: td 19 1536 /kernel: sosend: td 56 1538 /kernel: sosend: td 21 1568 /kernel: sosend: td 648 1578 /kernel: sosend: td 27 1600 /kernel: sosend: td 27 1632 /kernel: sosend: td 29 1664 /kernel: sosend: td 25 1696 /kernel: sosend: td 70 1717 /kernel: sosend: td 28 1728 /kernel: sosend: td 53 1746 /kernel: sosend: td 51 1750 /kernel: sosend: td 84 1751 /kernel: sosend: td 63 1760 /kernel: sosend: td 293 1766 /kernel: sosend: td 166 1768 /kernel: sosend: td 127 1770 /kernel: sosend: td 76 1771 /kernel: sosend: td 78 1773 /kernel: sosend: td 79 1774 /kernel: sosend: td 308 1776 /kernel: sosend: td 78 1777 /kernel: sosend: td 80 1778 /kernel: sosend: td 79 1779 
/kernel: sosend: td 150 1781 /kernel: sosend: td 107 1782 /kernel: sosend: td 106 1784 /kernel: sosend: td 102 1785 /kernel: sosend: td 18 1792 /kernel: sosend: td 97 1793 /kernel: sosend: td 113 1794 /kernel: sosend: td 108 1795 /kernel: sosend: td 100 1796 /kernel: sosend: td 188 1799 /kernel: sosend: td 25 1824 /kernel: sosend: td 26 1856 /kernel: sosend: td 26 1888 /kernel: sosend: td 53 1897 /kernel: sosend: td 28 1920 /kernel: FIN len 96 nxt 1691626219 max 1691626219 seq 1691626219 total counted delay = 39596 per packet (1448 bytes) delay is .7132564500 us --------------32793A0542BCFE2DD05E9119-- From owner-freebsd-performance@FreeBSD.ORG Wed Apr 9 16:07:49 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3DCB837B401; Wed, 9 Apr 2003 16:07:49 -0700 (PDT) Received: from perrin.int.nxad.com (internal.ext.nxad.com [69.1.70.251]) by mx1.FreeBSD.org (Postfix) with ESMTP id B13EC43F75; Wed, 9 Apr 2003 16:07:48 -0700 (PDT) (envelope-from sean@perrin.int.nxad.com) Received: by perrin.int.nxad.com (Postfix, from userid 1001) id 7ABF320F00; Wed, 9 Apr 2003 16:07:33 -0700 (PDT) Date: Wed, 9 Apr 2003 16:07:33 -0700 From: Sean Chittenden To: "Jin Guojun [DSD]" Message-ID: <20030409230733.GX79923@perrin.int.nxad.com> References: <3E94A22D.174321F0@lbl.gov> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3E94A22D.174321F0@lbl.gov> User-Agent: Mutt/1.4i X-PGP-Key: finger seanc@FreeBSD.org X-PGP-Fingerprint: 3849 3760 1AFE 7B17 11A0 83A6 DD99 E31F BC84 B341 X-Web-Homepage: http://sean.chittenden.org/ cc: freebsd-hackers@freebsd.org cc: freebsd-performance@freebsd.org Subject: Re: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Apr 2003 23:07:49 -0000 > When testing GigE path that has 67 ms RTT, the maximum TCP throughput is > limited at 250 Mb/s. By tracing the problem, I found that tcp_output() is > starving > where snd_wnd and snd_cwnd are fully open. The snd_cc is never filled beyond > the 4.05MB even though the snd_hiwat is 10MB and snd_sbmax is 8MB. That is, > sosend never stopped at sbwait. So only place can slow down is the mbuf > allocation > in sosend(). The attached trace file shows that each MGET and MCLGET takes > significant time -- around 8 us at slow start time, and gradually increasing > after that > in an range 18 to 648. > Each packet Tx on GigE takes 12 us. It average mbuf allocation takes 18 us, then > > the performance will be reduced to 40%, in fact it is down to 25%, which means > higher average delay. > > I have change NMBCLUSTER from 2446 to 6566 to 10240, and nothing is improved. > > Any one can tell what factors would cause MGET / MCLGET to wait? > Is there any way to make MGET/MCLGET not to wait? Luigi posted a patch about this a while back (last summer sometime, iirc). http://people.freebsd.org/~seanc/patches/#o1_mbuf_lookup I updated his patch but haven't had a chance to test it. If you're feeling brave, see if applying this patch fixes this bottleneck. 
-sc -- Sean Chittenden From owner-freebsd-performance@FreeBSD.ORG Wed Apr 9 16:12:08 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7C5D337B401; Wed, 9 Apr 2003 16:12:08 -0700 (PDT) Received: from postal2.lbl.gov (postal2.lbl.gov [131.243.248.26]) by mx1.FreeBSD.org (Postfix) with ESMTP id B2EA143FAF; Wed, 9 Apr 2003 16:12:07 -0700 (PDT) (envelope-from j_guojun@lbl.gov) Received: from postal2.lbl.gov (localhost [127.0.0.1]) by postal2.lbl.gov (8.12.8/8.12.8) with ESMTP id h39NC5Z6013878; Wed, 9 Apr 2003 16:12:05 -0700 (PDT) Received: from lbl.gov (gracie.lbl.gov [131.243.2.175]) by postal2.lbl.gov (8.12.8/8.12.8) with ESMTP id h39NC4Ig013875; Wed, 9 Apr 2003 16:12:04 -0700 (PDT) Sender: jin@lbl.gov Message-ID: <3E94A8C4.3A196E42@lbl.gov> Date: Wed, 09 Apr 2003 16:12:04 -0700 From: "Jin Guojun [DSD]" X-Mailer: Mozilla 4.76 [en] (X11; U; FreeBSD 4.7-RELEASE i386) X-Accept-Language: zh, zh-CN, en MIME-Version: 1.0 To: freebsd-hackers@freebsd.org, freebsd-performance@freebsd.org References: <3E94A22D.174321F0@lbl.gov> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Subject: Re: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Apr 2003 23:12:08 -0000 Some details were left behind -- the machine is a 2 GHz Intel P4 with 1 GB of memory, so the delay is not from either the CPU or lack of memory. -Jin "Jin Guojun [DSD]" wrote: > When testing GigE path that has 67 ms RTT, the maximum TCP throughput is > limited at 250 Mb/s. By tracing the problem, I found that tcp_output() is > starving > where snd_wnd and snd_cwnd are fully open. The snd_cc is never filled beyond > the 4.05MB even though the snd_hiwat is 10MB and snd_sbmax is 8MB. That is, > sosend never stopped at sbwait. So only place can slow down is the mbuf > allocation > in sosend(). The attached trace file shows that each MGET and MCLGET takes > significant time -- around 8 us at slow start time, and gradually increasing > after that > in an range 18 to 648 us. > Each packet Tx on GigE takes 12 us. It average mbuf allocation takes 18 us, then > > the performance will be reduced to 40%, in fact it is down to 25%, which means > higher average delay. > > I have change NMBCLUSTER from 2446 to 6566 to 10240, and nothing is improved. > > Any one can tell what factors would cause MGET / MCLGET to wait? > Is there any way to make MGET/MCLGET not to wait? > > -Jin > > ----------- system info ------------- > > kern.ipc.maxsockbuf: 10485760 > net.inet.tcp.sendspace: 8388608 > kern.ipc.nmbclusters: 10240 > kern.ipc.mbuf_wait: 32 > kern.ipc.mbtypes: 2606 322 0 0 0 0 0 0 0 0 0 0 0 0 0 0 > kern.ipc.nmbufs: 40960 > > -------------- code trace and explanation ---------- > > sosend() > { > ... > if (space < resid + clen && > (atomic || space < so->so_snd.sb_lowat || space < clen)) { > if (so->so_state & SS_NBIO) > snderr(EWOULDBLOCK); > sbunlock(&so->so_snd); > error = sbwait(&so->so_snd); /***** never come > down to here ****/ > splx(s); > if (error) > goto out; > goto restart; > } > splx(s); > mp = &top; > space -= clen; > do { > if (uio == NULL) { > /* > * Data is prepackaged in "top". 
> */ > resid = 0; > if (flags & MSG_EOR) > top->m_flags |= M_EOR; > } else do { > if (top == 0) { > microtime(&t1); > MGETHDR(m, M_WAIT, MT_DATA); > if (m == NULL) { > error = ENOBUFS; > goto release; > } > mlen = MHLEN; > m->m_pkthdr.len = 0; > m->m_pkthdr.rcvif = (struct ifnet *)0; > } else { > MGET(m, M_WAIT, MT_DATA); > if (m == NULL) { > error = ENOBUFS; > goto release; > } > mlen = MLEN; > } > if (resid >= MINCLSIZE) { > MCLGET(m, M_WAIT); > if ((m->m_flags & M_EXT) == 0) > goto nopages; > mlen = MCLBYTES; > len = min(min(mlen, resid), space); > } else { > nopages: > len = min(min(mlen, resid), space); > /* > * For datagram protocols, leave room > * for protocol headers in first mbuf. > */ > if (atomic && top == 0 && len < mlen) > MH_ALIGN(m, len); > } > microtime(&t2); > td = time_diff(&t2, &t1); > if ((td > 5 && (++tcnt & 31) == 0) || td > 50) > log( ... "td %d %d\n", td, tcnt); > > ... > > } /* end of sosend */ From owner-freebsd-performance@FreeBSD.ORG Wed Apr 9 17:23:49 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9ECC937B401; Wed, 9 Apr 2003 17:23:49 -0700 (PDT) Received: from postal2.lbl.gov (postal2.lbl.gov [131.243.248.26]) by mx1.FreeBSD.org (Postfix) with ESMTP id D8CF343FAF; Wed, 9 Apr 2003 17:23:48 -0700 (PDT) (envelope-from j_guojun@lbl.gov) Received: from postal2.lbl.gov (localhost [127.0.0.1]) by postal2.lbl.gov (8.12.8/8.12.8) with ESMTP id h3A0NkZ8016682; Wed, 9 Apr 2003 17:23:46 -0700 (PDT) Received: from lbl.gov (gracie.lbl.gov [131.243.2.175]) by postal2.lbl.gov (8.12.8/8.12.8) with ESMTP id h3A0NjIg016679; Wed, 9 Apr 2003 17:23:45 -0700 (PDT) Sender: jin@lbl.gov Message-ID: <3E94B993.D282DEB2@lbl.gov> Date: Wed, 09 Apr 2003 17:23:47 -0700 From: "Jin Guojun [DSD]" X-Mailer: Mozilla 4.76 [en] (X11; U; FreeBSD 4.7-RELEASE i386) X-Accept-Language: zh, zh-CN, en MIME-Version: 1.0 To: Sean Chittenden References: <3E94A22D.174321F0@lbl.gov> <20030409230733.GX79923@perrin.int.nxad.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.1 cc: freebsd-hackers@freebsd.org cc: freebsd-performance@freebsd.org Subject: Re: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Apr 2003 00:23:49 -0000 An interesting result -- in the normal FreeBSD TCP stack, the large delays go away, but there are more small delays. Apr 9 17:10:01 tcp_lion /kernel: sosend: td 23 3424 (bumped up from 1920) The performance dropped badly, to 92 Mb/s (no loss). In my new TCP, I did not see any delay above 5 us, which is good, but the overall TCP performance dropped to 120 Mb/s. So maybe there are either a lot of delays below 5 us and around 100 ns, or some other bottleneck is triggered somewhere. I guess there is more work to do to determine what is going on :-( I will post whatever I discover. Thanks for pointing to the patch. -Jin Sean Chittenden wrote: > > When testing GigE path that has 67 ms RTT, the maximum TCP throughput is > > limited at 250 Mb/s. By tracing the problem, I found that tcp_output() is > > starving > > where snd_wnd and snd_cwnd are fully open. The snd_cc is never filled beyond > > the 4.05MB even though the snd_hiwat is 10MB and snd_sbmax is 8MB. That is, > > sosend never stopped at sbwait. 
So only place can slow down is the mbuf > > allocation > > in sosend(). The attached trace file shows that each MGET and MCLGET takes > > significant time -- around 8 us at slow start time, and gradually increasing > > after that > > in an range 18 to 648. > > Each packet Tx on GigE takes 12 us. It average mbuf allocation takes 18 us, then > > > > the performance will be reduced to 40%, in fact it is down to 25%, which means > > higher average delay. > > > > I have change NMBCLUSTER from 2446 to 6566 to 10240, and nothing is improved. > > > > Any one can tell what factors would cause MGET / MCLGET to wait? > > Is there any way to make MGET/MCLGET not to wait? > > Luigi posted a patch about this a while back (last summer sometime, > iirc). > > http://people.freebsd.org/~seanc/patches/#o1_mbuf_lookup > > I updated his patch but haven't had a chance to test it. If you're > feeling brave, see if applying this patch fixes this bottle neck. -sc > > -- > Sean Chittenden -- ------------ Jin Guojun ----------- v --- j_guojun@lbl.gov --- Distributed Systems Department http://www.itg.lbl.gov/~jin M/S 50B-2239 Ph#:(510) 486-7531 Fax: 486-6363 Lawrence Berkeley National Laboratory, Berkeley, CA 94720 From owner-freebsd-performance@FreeBSD.ORG Wed Apr 9 17:28:28 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6599337B401; Wed, 9 Apr 2003 17:28:28 -0700 (PDT) Received: from mta6.snfc21.pbi.net (mta6.snfc21.pbi.net [206.13.28.240]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0DF1843FA3; Wed, 9 Apr 2003 17:28:28 -0700 (PDT) (envelope-from hsu@FreeBSD.org) Received: from FreeBSD.org ([63.193.112.125]) by mta6.snfc21.pbi.net (iPlanet Messaging Server 5.1 HotFix 1.6 (built Oct 18 2002)) with ESMTP id <0HD30045UQMBJD@mta6.snfc21.pbi.net>; Wed, 09 Apr 2003 17:27:47 -0700 (PDT) Date: Wed, 09 Apr 2003 17:29:02 -0700 From: Jeffrey Hsu To: j_guojun@lbl.gov Message-id: <0HD30045VQMBJD@mta6.snfc21.pbi.net> MIME-version: 1.0 X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4 Content-type: text/plain; charset=us-ascii Content-transfer-encoding: 7BIT X-Mailman-Approved-At: Wed, 09 Apr 2003 17:57:44 -0700 cc: freebsd-hackers@freebsd.org cc: freebsd-performance@freebsd.org Subject: Re: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Apr 2003 00:28:28 -0000 > I have change NMBCLUSTER from 2446 to 6566 to 10240, > and nothing is improved. What does netstat -m say? > Is there any way to make MGET/MCLGET not to wait? You could try changing the M_WAIT to M_NOWAIT. Finally, I hope you're running FreeBSD 4.8, because the 5.x series is known to be much slower. 
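For concreteness, the non-blocking variant suggested above would look roughly like this in the sosend() fragment quoted earlier -- a sketch only; with M_NOWAIT the allocator returns NULL immediately when mbufs are scarce instead of sleeping, so the caller sees ENOBUFS rather than an allocation stall:

	/* sketch: sosend() allocation with M_NOWAIT instead of M_WAIT;
	 * a shortage now surfaces as ENOBUFS instead of a sleep */
	MGETHDR(m, M_NOWAIT, MT_DATA);
	if (m == NULL) {
		error = ENOBUFS;
		goto release;
	}
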
Jeffrey From owner-freebsd-performance@FreeBSD.ORG Wed Apr 9 21:15:28 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8D74537B401; Wed, 9 Apr 2003 21:15:28 -0700 (PDT) Received: from sabre.velocet.net (sabre.velocet.net [216.138.209.205]) by mx1.FreeBSD.org (Postfix) with ESMTP id CB17643FCB; Wed, 9 Apr 2003 21:15:27 -0700 (PDT) (envelope-from dgilbert@velocet.ca) Received: from trooper.velocet.ca (trooper.velocet.net [216.138.242.2]) by sabre.velocet.net (Postfix) with ESMTP id 7EFE9138443; Thu, 10 Apr 2003 00:15:26 -0400 (EDT) Received: by trooper.velocet.ca (Postfix, from userid 66) id 2065974D72; Thu, 10 Apr 2003 00:15:26 -0400 (EDT) Received: by canoe.velocet.net (Postfix, from userid 101) id A0C3E56791B; Thu, 10 Apr 2003 00:14:35 -0400 (EDT) From: David Gilbert MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16020.61355.551242.159558@canoe.velocet.net> Date: Thu, 10 Apr 2003 00:14:35 -0400 To: "Jin Guojun [DSD]" In-Reply-To: <3E94A8C4.3A196E42@lbl.gov> References: <3E94A22D.174321F0@lbl.gov> <3E94A8C4.3A196E42@lbl.gov> X-Mailer: VM 7.07 under 21.1 (patch 14) "Cuyahoga Valley" XEmacs Lucid cc: freebsd-hackers@freebsd.org cc: freebsd-performance@freebsd.org Subject: Re: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Apr 2003 04:15:28 -0000 >>>>> "Jin" == Jin Guojun <[DSD]" > writes: Jin> Some details was left behind -- The machine is 2 GHz Intel P4 Jin> with 1 GB memory, so the delay is not from either CPU or lack of Jin> memory. I just want to quickly jump in with the comment that our GigE tests of routing through FreeBSD have exposed several-order-of-magnitude differences by changing ram/motherboard/which-slot-the-card-is-in. Do not assume that a fast CPU is the key. We went through 10 motherboards to commission the current routers ... and sometimes faster cpus would route slower (in various combinations of motherboards and RAM). Dave. -- ============================================================================ |David Gilbert, Velocet Communications. | Two things can only be | |Mail: dgilbert@velocet.net | equal if and only if they | |http://daveg.ca | are precisely opposite. 
| =========================================================GLO================ From owner-freebsd-performance@FreeBSD.ORG Thu Apr 10 05:31:36 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8CC0137B401; Thu, 10 Apr 2003 05:31:36 -0700 (PDT) Received: from otter3.centtech.com (moat3.centtech.com [207.200.51.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id BA74243FBD; Thu, 10 Apr 2003 05:31:35 -0700 (PDT) (envelope-from anderson@centtech.com) Received: from centtech.com (electron.centtech.com [204.177.173.173]) by otter3.centtech.com (8.12.3/8.12.3) with ESMTP id h3ACVY56060110; Thu, 10 Apr 2003 07:31:34 -0500 (CDT) (envelope-from anderson@centtech.com) Message-ID: <3E95641D.1080100@centtech.com> Date: Thu, 10 Apr 2003 07:31:25 -0500 From: Eric Anderson User-Agent: Mozilla/5.0 (X11; U; Linux i386; en-US; rv:1.0.1) Gecko/20020823 Netscape/7.0 X-Accept-Language: en-us, en MIME-Version: 1.0 To: David Gilbert References: <3E94A22D.174321F0@lbl.gov> <3E94A8C4.3A196E42@lbl.gov> <16020.61355.551242.159558@canoe.velocet.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit cc: freebsd-hackers@freebsd.org cc: freebsd-performance@freebsd.org Subject: Re: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Apr 2003 12:31:36 -0000 David Gilbert wrote: >>>>>>"Jin" == Jin Guojun <[DSD]" > writes: >>>>> > > Jin> Some details was left behind -- The machine is 2 GHz Intel P4 > Jin> with 1 GB memory, so the delay is not from either CPU or lack of > Jin> memory. > > I just want to quickly jump in with the comment that our GigE tests of > routing through FreeBSD have exposed several-order-of-magnitude > differences by changing ram/motherboard/which-slot-the-card-is-in. > > Do not assume that a fast CPU is the key. We went through 10 > motherboards to commission the current routers ... and sometimes > faster cpus would route slower (in various combinations of > motherboards and RAM). Wow - this is very interesting. I have two machines with GigE in them, one has 6 gige nics (intel pro/1000T server), in a dual Xeon 1.5Ghz (P4), with 2gb of ram. I haven't put it into production yet, so if there are any tests I can do, let me know. It would be good to know the pitfalls before I bring it up live.. :) Eric -- ------------------------------------------------------------------ Eric Anderson Systems Administrator Centaur Technology Attitudes are contagious, is yours worth catching? 
------------------------------------------------------------------ From owner-freebsd-performance@FreeBSD.ORG Thu Apr 10 09:16:52 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0523937B401; Thu, 10 Apr 2003 09:16:52 -0700 (PDT) Received: from otter3.centtech.com (moat3.centtech.com [207.200.51.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1101643F93; Thu, 10 Apr 2003 09:16:51 -0700 (PDT) (envelope-from anderson@centtech.com) Received: from centtech.com (electron.centtech.com [204.177.173.173]) by otter3.centtech.com (8.12.3/8.12.3) with ESMTP id h3AGGl56087861; Thu, 10 Apr 2003 11:16:47 -0500 (CDT) (envelope-from anderson@centtech.com) Message-ID: <3E9598E4.2000601@centtech.com> Date: Thu, 10 Apr 2003 11:16:36 -0500 From: Eric Anderson User-Agent: Mozilla/5.0 (X11; U; Linux i386; en-US; rv:1.0.1) Gecko/20020823 Netscape/7.0 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Mike Silbersack References: <200304101311.h3ADBgjY022790@samson.dc.luth.se> <20030410114227.A472@odysseus.silby.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit cc: Borje Josefsson cc: freebsd-performance@freebsd.org cc: David Gilbert cc: freebsd-hackers@freebsd.org Subject: Re: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Apr 2003 16:16:52 -0000 Mike Silbersack wrote: >>My hosts are connected directly to core routers in a 10Gbps nationwide >>network, so if anybody is interested in some testing I am more than >>willing to participate. If anybody produces a patch, I have a third system >>that I can use for piloting of that too. >> >>--Börje > > > This brings up something I've been wondering about, which you might want > to investigate: > >>From tcp_output: > > if (error == ENOBUFS) { > if (!callout_active(tp->tt_rexmt) && > !callout_active(tp->tt_persist)) > callout_reset(tp->tt_rexmt, tp->t_rxtcur, > tcp_timer_rexmt, tp); > tcp_quench(tp->t_inpcb, 0); > return (0); > } > > That tcp_quench knocks the window size back to one packet, if I'm not > mistaken. You might want to put a counter there and see if that's > happening frequently to you; if so, it might explain some loss of > performance. > > Have you tried running kernel profiling yet? It would be interesting to > see which functions are using up the largest amount of time. It's interesting - I'm only getting about 320mb/s.. I must be hitting a similar problem. I'm not nearly as adept at hacking code to find bugs though. :( Eric -- ------------------------------------------------------------------ Eric Anderson Systems Administrator Centaur Technology Attitudes are contagious, is yours worth catching? 
------------------------------------------------------------------ From owner-freebsd-performance@FreeBSD.ORG Thu Apr 10 06:12:04 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D375337B401; Thu, 10 Apr 2003 06:12:04 -0700 (PDT) Received: from samson.dc.luth.se (samson.dc.luth.se [130.240.112.30]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3D0A343FB1; Thu, 10 Apr 2003 06:12:03 -0700 (PDT) (envelope-from bj@dc.luth.se) Received: from ra.dc.luth.se (bj@ra.dc.luth.se [130.240.112.180]) by samson.dc.luth.se (8.12.5/8.12.5) with ESMTP id h3ADBgjY022790; Thu, 10 Apr 2003 15:11:42 +0200 (MET DST) Message-Id: <200304101311.h3ADBgjY022790@samson.dc.luth.se> X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4 To: Eric Anderson In-reply-to: Your message of Thu, 10 Apr 2003 07:31:25 CDT. <3E95641D.1080100@centtech.com> Dcc: X-Disposition-notification-to: Borje.Josefsson@dc.luth.se X-uri: http://www.dc.luth.se/~bj/index.html Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable Date: Thu, 10 Apr 2003 15:11:42 +0200 From: Borje Josefsson X-Mailman-Approved-At: Thu, 10 Apr 2003 13:00:28 -0700 cc: freebsd-hackers@freebsd.org cc: freebsd-performance@freebsd.org cc: David Gilbert Subject: Re: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: bj@dc.luth.se List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Apr 2003 13:12:05 -0000 On Thu, 10 Apr 2003 07:31:25 CDT Eric Anderson wrote: > David Gilbert wrote: > >>>>>>"Jin" == Jin Guojun <[DSD]" > writes: > >>>>> > > > > Jin> Some details was left behind -- The machine is 2 GHz Intel P4 > > Jin> with 1 GB memory, so the delay is not from either CPU or lack of > > Jin> memory. > > > > I just want to quickly jump in with the comment that our GigE tests of > > routing through FreeBSD have exposed several-order-of-magnitude > > differences by changing ram/motherboard/which-slot-the-card-is-in. > > > > Do not assume that a fast CPU is the key. We went through 10 > > motherboards to commission the current routers ... and sometimes > > faster cpus would route slower (in various combinations of > > motherboards and RAM). > > Wow - this is very interesting. I have two machines with GigE in them, > one has 6 gige nics (intel pro/1000T server), in a dual Xeon 1.5Ghz > (P4), with 2gb of ram. I haven't put it into production yet, so if > there are any tests I can do, let me know. It would be good to know the > pitfalls before I bring it up live.. :) I am also *very* interested in participating in this. I did apply [some version of] Sean Chittenden's patch, but that didn't help. With that patch applied my system deadlocked(?) when stressing it with ttcp. To me (but bear in mind that I'm not at all a kernel hacker) one of the if-statements in the patch seems to cover too little - i.e. when the fastscan OID is unset, code that has to do with fastscan is still executed, at least in the version I tried. I get Mar 27 11:42:41 stinky kernel: sbappend: bad tail 0x0xc0ef8200 instead of 0x0xc0ef4b00 when "fastscan" is unset. 
I have two test systems with Xeon CPUs that can handle a 1000 km (21 ms) full speed GE-connection quite well with NetBSD (I get approx 970 Mbit/sec with ttcp), but I run out of CPU when using FreeBSD long before I can fill the network. I could stay with NetBSD, but since I am used to FreeBSD I'd rather have that instead. We have tested with Linux too on the same distance, and it seems that they can handle this too. What we did in NetBSD (-current) was to increase IFQ_MAXLEN in (their) sys/net/if.h, apart from that it's only "traditional" TCP tuning. My hosts are connected directly to core routers in a 10Gbps nationwide network, so if anybody is interested in some testing I am more than willing to participate. If anybody produces a patch, I have a third system that I can use for piloting of that too. --Börje From owner-freebsd-performance@FreeBSD.ORG Thu Apr 10 08:48:22 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9BC8637B404 for ; Thu, 10 Apr 2003 08:48:22 -0700 (PDT) Received: from relay.pair.com (relay.pair.com [209.68.1.20]) by mx1.FreeBSD.org (Postfix) with SMTP id 07E0B43FCB for ; Thu, 10 Apr 2003 08:48:21 -0700 (PDT) (envelope-from silby@silby.com) Received: (qmail 21143 invoked from network); 10 Apr 2003 15:48:20 -0000 Received: from niwun.pair.com (HELO localhost) (209.68.2.70) by relay.pair.com with SMTP; 10 Apr 2003 15:48:20 -0000 X-pair-Authenticated: 209.68.2.70 Date: Thu, 10 Apr 2003 11:44:42 -0500 (CDT) From: Mike Silbersack To: Borje Josefsson In-Reply-To: <200304101311.h3ADBgjY022790@samson.dc.luth.se> Message-ID: <20030410114227.A472@odysseus.silby.com> References: <200304101311.h3ADBgjY022790@samson.dc.luth.se> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=iso-8859-1 Content-Transfer-Encoding: QUOTED-PRINTABLE X-Mailman-Approved-At: Thu, 10 Apr 2003 13:00:28 -0700 cc: freebsd-hackers@freebsd.org cc: freebsd-performance@freebsd.org cc: Eric Anderson cc: David Gilbert Subject: Re: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Apr 2003 15:48:23 -0000 On Thu, 10 Apr 2003, Borje Josefsson wrote: > What we did in NetBSD (-current) was to increase IFQ_MAXLEN in (their) > sys/net/if.h, apart from that it's only "traditional" TCP tuning. > > My hosts are connected directly to core routers in a 10Gbps nationwide > network, so if anybody is interested in some testing I am more than > willing to participate. If anybody produces a patch, I have a third system > that I can use for piloting of that too. > > --Börje This brings up something I've been wondering about, which you might want to investigate: From tcp_output: if (error == ENOBUFS) { if (!callout_active(tp->tt_rexmt) && !callout_active(tp->tt_persist)) callout_reset(tp->tt_rexmt, tp->t_rxtcur, tcp_timer_rexmt, tp); tcp_quench(tp->t_inpcb, 0); return (0); } That tcp_quench knocks the window size back to one packet, if I'm not mistaken. You might want to put a counter there and see if that's happening frequently to you; if so, it might explain some loss of performance. Have you tried running kernel profiling yet? It would be interesting to see which functions are using up the largest amount of time. 
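A minimal sketch of the counter suggested here, assuming it is exported read-only through a new sysctl (the OID name net.inet.tcp.enobufs_quench is hypothetical, not an existing knob):

	/* sketch: tally how often the ENOBUFS path collapses the window */
	static u_long tcp_enobufs_quench;
	SYSCTL_ULONG(_net_inet_tcp, OID_AUTO, enobufs_quench, CTLFLAG_RD,
	    &tcp_enobufs_quench, 0, "ENOBUFS-driven tcp_quench calls");

		if (error == ENOBUFS) {
			tcp_enobufs_quench++;	/* added: count collapses */
			if (!callout_active(tp->tt_rexmt) &&
			    !callout_active(tp->tt_persist))
				callout_reset(tp->tt_rexmt, tp->t_rxtcur,
				    tcp_timer_rexmt, tp);
			tcp_quench(tp->t_inpcb, 0);
			return (0);
		}

A steadily climbing value during a ttcp run would confirm that window collapses, rather than raw allocation latency alone, are costing throughput.
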
Mike "Silby" Silbersack From owner-freebsd-performance@FreeBSD.ORG Thu Apr 10 10:16:48 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1928E37B401; Thu, 10 Apr 2003 10:16:48 -0700 (PDT) Received: from porter.dc.luth.se (host-n12-30.homerun.telia.com [212.181.227.30]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4C28843FA3; Thu, 10 Apr 2003 10:16:46 -0700 (PDT) (envelope-from bj@dc.luth.se) Received: from porter.dc.luth.se (localhost.dc.luth.se [127.0.0.1]) by porter.dc.luth.se (Postfix) with ESMTP id C44793B2; Thu, 10 Apr 2003 19:16:40 +0200 (CEST) X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4 To: Eric Anderson In-reply-to: Your message of Thu, 10 Apr 2003 11:16:36 CDT. <3E9598E4.2000601@centtech.com> Dcc: X-Disposition-notification-to: Borje.Josefsson@dc.luth.se X-uri: http://www.dc.luth.se/~bj/index.html Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable Date: Thu, 10 Apr 2003 19:16:40 +0200 From: Borje Josefsson Message-Id: <20030410171640.C44793B2@porter.dc.luth.se> X-Mailman-Approved-At: Thu, 10 Apr 2003 13:00:28 -0700 cc: Mike Silbersack cc: freebsd-hackers@freebsd.org cc: freebsd-performance@freebsd.org cc: David Gilbert Subject: Re: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: bj@dc.luth.se List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Apr 2003 17:16:48 -0000 On Thu, 10 Apr 2003 11:16:36 CDT Eric Anderson wrote: > Mike Silbersack wrote: > >>My hosts are connected directly to core routers in a 10Gbps nationwid= e > >>network, so if anybody is interested in some testing I am more than > >>willing to participate. If anybody produces a patch, I have a third s= ystem > >>that I can use for piloting of that too. > >> > >>--B=F6rje > > = > > = > > This brings up something I've been wondering about, which you might w= ant > > to investigate: > > = > >>From tcp_output: > > = > > if (error =3D=3D ENOBUFS) { > > if (!callout_active(tp->tt_rexmt) && > > !callout_active(tp->tt_persist)) > > callout_reset(tp->tt_rexmt, tp->t_rxtcur, > > tcp_timer_rexmt, tp); > > tcp_quench(tp->t_inpcb, 0); > > return (0); > > } > > = > > That tcp_quench knocks the window size back to one packet, if I'm not= > > mistaken. You might want to put a counter there and see if that's > > happening frequently to you; if so, it might explain some loss of > > performance. > > = > > Have you tried running kernel profiling yet? It would be interesting= to > > see which functions are using up the largest amount of time. Could do that if I knew how... Not before the weekend though, right now = I'm at the longue at the airport... = > It's interesting - I'm only getting about 320mb/s.. I must be hitting a= = > similar problem. I'm not nearly as adept at hacking code to find bugs = > though. :( 320 Mbit/sec seems familiar, this was what I got when I first tried on a = system with "traditional" PCI bus. Changing the OS to NetBSD on that box = bumped that to 525 Mbit/sec. You need wide PCI (or preferrably PCI-X for = this). What happens in that case for me is that I run out of CPU resources. Try = running "top" in one window and "netstat 1" in another while bashing the = net with ttcp. 
If everything is OK (which it apparently isn't), top will show free CPU, and netstat should show a *very* steady packet flow (around 90kpps if You have MTU 1500). Any packet loss is fatal for this speed, so if there is a way (as indicated by Mike above) to avoid restarting with the window size from scratch, that will make recovery much better. My test was done with ttcp and these parameters: ttcp -s -t -f m -l 61440 -n 20345 dest.host (tuned for a 10 sec test at 1Gbps). IMPORTANT NOTE: Several tests here have shown that this is VERY BADLY affected if You have too much LAN equipment (especially VLAN seems to be harmful) at the edges. My speed of 960 Mbit/sec fell to 165 just by adding 10 feet of cable and two switches :-( --Börje From owner-freebsd-performance@FreeBSD.ORG Thu Apr 10 13:12:04 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3572C37B404; Thu, 10 Apr 2003 13:12:04 -0700 (PDT) Received: from perrin.int.nxad.com (internal.ext.nxad.com [69.1.70.251]) by mx1.FreeBSD.org (Postfix) with ESMTP id E2AB743FA3; Thu, 10 Apr 2003 13:12:00 -0700 (PDT) (envelope-from sean@perrin.int.nxad.com) Received: by perrin.int.nxad.com (Postfix, from userid 1001) id 3DFD821065; Thu, 10 Apr 2003 13:11:43 -0700 (PDT) Date: Thu, 10 Apr 2003 13:11:43 -0700 From: Sean Chittenden To: "Jin Guojun [NCS]" Message-ID: <20030410201143.GF79923@perrin.int.nxad.com> References: <3E94A22D.174321F0@lbl.gov> <20030409230733.GX79923@perrin.int.nxad.com> <3E94B993.D282DEB2@lbl.gov> <20030410005846.GA79923@perrin.int.nxad.com> <3E95A37E.36186A9F@lbl.gov> <3E95A653.8F5CE89C@lbl.gov> <3E95BEBB.58F86F4F@lbl.gov> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3E95BEBB.58F86F4F@lbl.gov> User-Agent: Mutt/1.4i X-PGP-Key: finger seanc@FreeBSD.org X-PGP-Fingerprint: 3849 3760 1AFE 7B17 11A0 83A6 DD99 E31F BC84 B341 X-Web-Homepage: http://sean.chittenden.org/ cc: freebsd-hackers@freebsd.org cc: freebsd-performance@freebsd.org Subject: Re: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Apr 2003 20:12:04 -0000 > I guess I overlooked something after applying the patch (attached): > > Apr 10 12:11:52 ncs /kernel: sbappend: bad tail 0x0xc209a200 instead of 0x0xc243 > 6c00 > Apr 10 12:11:52 ncs /kernel: sbappend: bad tail 0x0xc2436c00 instead of 0x0xc238 > bf00 > Apr 10 12:11:52 ncs /kernel: sbappend: bad tail 0x0xc238bf00 instead of 0x0xc243 > f300 > ... > > A large number of such message was added into /var/log/message. This > indicates either bad patch code or something I changed in the patch > to make it work in 4.8 (attached). > > Any thought? That's likely the sign that the patch isn't appending to the tail of the list correctly. Doing a tail append where the tail is known should be an O(1) operation and should make adding an mbuf to a cluster faster. Right now it has to do a linear scan to append data, iirc, which is likely _a_ cause of some performance degradation. I'm not an mbuf expert, but I wonder how free mbufs are identified. Regardless, I'll see if I can't figure out where this problem is with the patch; it should do nothing but make things faster. 
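For readers following the tail-append discussion: the O(1) operation described here amounts to keeping a cached pointer to the last mbuf of the socket-buffer chain. A sketch, assuming a hypothetical sb_mbtail field added by the patch (byte accounting via sballoc() omitted):

	/* sketch: constant-time append using a cached tail pointer */
	void
	sbappend_o1(struct sockbuf *sb, struct mbuf *m)
	{
		if (m == NULL)
			return;
		if (sb->sb_mbtail != NULL)
			sb->sb_mbtail->m_next = m;	/* no walk of sb_mb */
		else
			sb->sb_mb = m;			/* buffer was empty */
		while (m->m_next != NULL)		/* find the new end */
			m = m->m_next;
		sb->sb_mbtail = m;
	}

The "sbappend: bad tail X instead of Y" messages quoted above are what a consistency check prints when the cached tail and the true end of the chain disagree -- i.e. some code path updated the chain without updating the cached pointer.
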
-sc -- Sean Chittenden From owner-freebsd-performance@FreeBSD.ORG Thu Apr 10 13:50:47 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2D73037B47D; Thu, 10 Apr 2003 13:50:46 -0700 (PDT) Received: from rms21.rommon.net (rms21.rommon.net [193.64.42.200]) by mx1.FreeBSD.org (Postfix) with ESMTP id DE5EA43FB1; Thu, 10 Apr 2003 13:50:44 -0700 (PDT) (envelope-from pete@he.iki.fi) Received: from PHE (h93.vuokselantie10.fi [193.64.42.147]) by rms21.rommon.net (8.12.6p2/8.12.6) with SMTP id h3AKodqo068247; Thu, 10 Apr 2003 23:50:40 +0300 (EEST) (envelope-from pete@he.iki.fi) Message-ID: <05b601c2ffa2$ed87a5b0$932a40c1@PHE> From: "Petri Helenius" To: "Jin Guojun [DSD]" , , References: <3E94A22D.174321F0@lbl.gov> <3E94A8C4.3A196E42@lbl.gov> Date: Thu, 10 Apr 2003 23:51:13 +0300 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 Subject: Re: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Apr 2003 20:50:47 -0000 There was a discussion on mballoc performance on freebsd-net about a month ago but it has since died without conclusion. Pete ----- Original Message ----- From: "Jin Guojun [DSD]" To: ; Sent: Thursday, April 10, 2003 2:12 AM Subject: Re: tcp_output starving -- is due to mbuf get delay? > Some details was left behind -- > > The machine is 2 GHz Intel P4 with 1 GB memory, so the delay is not from > either CPU or lack of memory. > > -Jin > > "Jin Guojun [DSD]" wrote: > > > When testing GigE path that has 67 ms RTT, the maximum TCP throughput is > > limited at 250 Mb/s. By tracing the problem, I found that tcp_output() is > > starving > > where snd_wnd and snd_cwnd are fully open. The snd_cc is never filled beyond > > the 4.05MB even though the snd_hiwat is 10MB and snd_sbmax is 8MB. That is, > > sosend never stopped at sbwait. So only place can slow down is the mbuf > > allocation > > in sosend(). The attached trace file shows that each MGET and MCLGET takes > > significant time -- around 8 us at slow start time, and gradually increasing > > after that > > in an range 18 to 648 us. > > Each packet Tx on GigE takes 12 us. It average mbuf allocation takes 18 us, then > > > > the performance will be reduced to 40%, in fact it is down to 25%, which means > > higher average delay. > > > > I have change NMBCLUSTER from 2446 to 6566 to 10240, and nothing is improved. > > > > Any one can tell what factors would cause MGET / MCLGET to wait? > > Is there any way to make MGET/MCLGET not to wait? > > > > -Jin > > > > ----------- system info ------------- > > > > kern.ipc.maxsockbuf: 10485760 > > net.inet.tcp.sendspace: 8388608 > > kern.ipc.nmbclusters: 10240 > > kern.ipc.mbuf_wait: 32 > > kern.ipc.mbtypes: 2606 322 0 0 0 0 0 0 0 0 0 0 0 0 0 0 > > kern.ipc.nmbufs: 40960 > > > > -------------- code trace and explanation ---------- > > > > sosend() > > { > > ... 
> > if (space < resid + clen && > > (atomic || space < so->so_snd.sb_lowat || space < clen)) { > > if (so->so_state & SS_NBIO) > > snderr(EWOULDBLOCK); > > sbunlock(&so->so_snd); > > error = sbwait(&so->so_snd); /***** never come > > down to here ****/ > > splx(s); > > if (error) > > goto out; > > goto restart; > > } > > splx(s); > > mp = &top; > > space -= clen; > > do { > > if (uio == NULL) { > > /* > > * Data is prepackaged in "top". > > */ > > resid = 0; > > if (flags & MSG_EOR) > > top->m_flags |= M_EOR; > > } else do { > > if (top == 0) { > > microtime(&t1); > > MGETHDR(m, M_WAIT, MT_DATA); > > if (m == NULL) { > > error = ENOBUFS; > > goto release; > > } > > mlen = MHLEN; > > m->m_pkthdr.len = 0; > > m->m_pkthdr.rcvif = (struct ifnet *)0; > > } else { > > MGET(m, M_WAIT, MT_DATA); > > if (m == NULL) { > > error = ENOBUFS; > > goto release; > > } > > mlen = MLEN; > > } > > if (resid >= MINCLSIZE) { > > MCLGET(m, M_WAIT); > > if ((m->m_flags & M_EXT) == 0) > > goto nopages; > > mlen = MCLBYTES; > > len = min(min(mlen, resid), space); > > } else { > > nopages: > > len = min(min(mlen, resid), space); > > /* > > * For datagram protocols, leave room > > * for protocol headers in first mbuf. > > */ > > if (atomic && top == 0 && len < mlen) > > MH_ALIGN(m, len); > > } > > microtime(&t2); > > td = time_diff(&t2, &t1); > > if ((td > 5 && (++tcnt & 31) == 0) || td > 50) > > log( ... "td %d %d\n", td, tcnt); > > > > ... > > > > } /* end of sosend */ > > _______________________________________________ > freebsd-performance@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-performance > To unsubscribe, send any mail to "freebsd-performance-unsubscribe@freebsd.org" > From owner-freebsd-performance@FreeBSD.ORG Thu Apr 10 14:40:57 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 564A137B401; Thu, 10 Apr 2003 14:40:57 -0700 (PDT) Received: from stork.mail.pas.earthlink.net (stork.mail.pas.earthlink.net [207.217.120.188]) by mx1.FreeBSD.org (Postfix) with ESMTP id A3FA343F93; Thu, 10 Apr 2003 14:40:56 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from pool0274.cvx40-bradley.dialup.earthlink.net ([216.244.43.19] helo=mindspring.com) by stork.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128) (Exim 3.33 #1) id 193jmo-0004lA-00; Thu, 10 Apr 2003 14:40:48 -0700 Message-ID: <3E95E446.73B7E510@mindspring.com> Date: Thu, 10 Apr 2003 14:38:14 -0700 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: bj@dc.luth.se References: <20030410171640.C44793B2@porter.dc.luth.se> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a477423e8d118ab8c71501f12832479819a7ce0e8f8d31aa3f350badd9bab72f9c350badd9bab72f9c X-Mailman-Approved-At: Thu, 10 Apr 2003 14:56:46 -0700 cc: freebsd-hackers@freebsd.org cc: freebsd-performance@freebsd.org cc: Eric Anderson cc: David Gilbert Subject: Re: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Apr 2003 21:40:57 -0000 Borje Josefsson wrote: > > It's interesting - I'm only getting about 320mb/s.. I must be hitting a 
> > similar problem. I'm not nearly as adept at hacking code to find bugs > > though. :( > > 320 Mbit/sec seems familiar, this was what I got when I first tried on a > system with "traditional" PCI bus. Changing the OS to NetBSD on that box > bumped that to 525 Mbit/sec. You need wide PCI (or preferrably PCI-X for > this). 32bit x 33MHz = 1.03Gbit/S, burst rate 32bit x 66MHz = 2.06Gbit/S, burst rate 64bit x 66MHz = 4.12Gbit/S, burst rate So it's entirely possible to keep up with 2 1Gbit ethernet cards in standard 64bit PCI bus slots, no problem, without running into bus limitations. PCI-X gets you to 8Gbit/S; you don't need PCI-X for Gbit, or even 2Gbit. > What happens in that case for me is that I run out of CPU resources. Try > running "top" in one window and "netstat 1" in another while bashing the > net with ttcp. This is incredibly bizarre. It's very hard to saturate the CPU at only 1Gbit: in all cases, you are I/O bound, not CPU bound, and not memory bandwidth bound. > IMPORTANT NOTE: Several tests here has shown that this is VERY BADLY > affected if You have too much LAN equipment (especially VLAN seems to be > harmful) at the edges. My speed of 960 Mbit/sec fell to 165 just by adding > 10 feet of cable and two switches :-( The products that Jeffrey Hsu and I and Alfred and Jon Mini worked on at a previous company had no problems at all on a 1Gbit/S saturating the link, even through a VLAN trunk through Cisco and one other less intelligent switch (i.e. two switches and a VLAN trunk). Maybe your network cards don't do hardware interrupt coalescing? Or maybe you are sending 1 byte packets instead of MTU-sized packets, or something? We were using Tigon III's, with 1K packets. -- Terry From owner-freebsd-performance@FreeBSD.ORG Thu Apr 10 14:45:17 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id BD29537B405 for ; Thu, 10 Apr 2003 14:45:17 -0700 (PDT) Received: from relay.pair.com (relay.pair.com [209.68.1.20]) by mx1.FreeBSD.org (Postfix) with SMTP id B97CA43FAF for ; Thu, 10 Apr 2003 14:45:15 -0700 (PDT) (envelope-from silby@silby.com) Received: (qmail 69442 invoked from network); 10 Apr 2003 21:45:15 -0000 Received: from niwun.pair.com (HELO localhost) (209.68.2.70) by relay.pair.com with SMTP; 10 Apr 2003 21:45:15 -0000 X-pair-Authenticated: 209.68.2.70 Date: Thu, 10 Apr 2003 04:41:32 -0500 (CDT) From: Mike Silbersack To: Borje Josefsson In-Reply-To: <20030410171640.C44793B2@porter.dc.luth.se> Message-ID: <20030410043827.A936@odysseus.silby.com> References: <20030410171640.C44793B2@porter.dc.luth.se> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Mailman-Approved-At: Thu, 10 Apr 2003 14:56:46 -0700 cc: freebsd-hackers@freebsd.org cc: freebsd-performance@freebsd.org cc: Eric Anderson cc: David Gilbert Subject: Re: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Apr 2003 21:45:18 -0000 On Thu, 10 Apr 2003, Borje Josefsson wrote: > > > Have you tried running kernel profiling yet? It would be interesting to > > > see which functions are using up the largest amount of time. > > Could do that if I knew how... Not before the weekend though, right now > > I'm at the longue at the airport... 
I believe that the manpages regarding how to set it up were pretty useful; it didn't take me long to get it operational last time I tried. However, that was a while ago, so I can't give you any helpful tips. > If everything is OK (which it apparently isn't), top will show free CPU, > and netstat should show a *very* steady packet flow (around 90kpps if You > have MTU 1500). Any packet loss is fatal for this speed, so if there is a > way (as indicated by Mike above) to not restarting with windowsize from > scratch that will make recovery much better. Well, the packet loss I pointed out would be due to the ifqueue overflowing, which could conceivably happen even if the actual network wasn't congested. I don't have the equipment to create such a situation, but it sounds like you might, in which case adding a debug printf or a counter to see if it's happening might be advantageous. Mike "Silby" Silbersack From owner-freebsd-performance@FreeBSD.ORG Thu Apr 10 14:56:49 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 12EF737B407; Thu, 10 Apr 2003 14:56:49 -0700 (PDT) Received: from mrout2.yahoo.com (mrout2.yahoo.com [216.145.54.172]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4DEC443F75; Thu, 10 Apr 2003 14:56:48 -0700 (PDT) (envelope-from jayanth@yahoo-inc.com) Received: from milk.yahoo.com (milk.yahoo.com [216.145.52.137]) h3ALubs03426; Thu, 10 Apr 2003 14:56:37 -0700 (PDT) Received: (from root@localhost) by milk.yahoo.com (8.11.0/8.11.0) id h3ALuZc59514; Thu, 10 Apr 2003 14:56:35 -0700 (PDT) (envelope-from jayanth) Date: Thu, 10 Apr 2003 14:56:35 -0700 From: jayanth To: Mike Silbersack Message-ID: <20030410145635.A59453@yahoo-inc.com> References: <20030410171640.C44793B2@porter.dc.luth.se> <20030410043827.A936@odysseus.silby.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2i In-Reply-To: <20030410043827.A936@odysseus.silby.com>; from silby@silby.com on Thu, Apr 10, 2003 at 04:41:32AM -0500 X-Mailman-Approved-At: Thu, 10 Apr 2003 15:13:22 -0700 cc: Borje Josefsson cc: freebsd-hackers@freebsd.org cc: freebsd-performance@freebsd.org cc: Eric Anderson cc: David Gilbert Subject: Re: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Apr 2003 21:56:49 -0000 You can probably try netstat -s -p ip and look at the "output packets dropped due to no bufs, etc." value. jayanth Mike Silbersack (silby@silby.com) wrote: > > On Thu, 10 Apr 2003, Borje Josefsson wrote: > > > > > Have you tried running kernel profiling yet? It would be interesting to > > > > see which functions are using up the largest amount of time. > > > > Could do that if I knew how... Not before the weekend though, right now > > I'm at the longue at the airport... > > I believe that the manpages regarding how to set it up were pretty useful, > it didn't take me long to get it operational last time I tried. However, > that was a while ago, so I can't give you any helpful tips. > > > If everything is OK (which it apparently isn't), top will show free CPU, > > and netstat should show a *very* steady packet flow (around 90kpps if You 
> > have MTU 1500). Any packet loss is fatal for this speed, so if there is a > > way (as indicated by Mike above) to not restarting with windowsize from > > scratch that will make recovery much better. > > Well, the packet loss I pointed out would be due to the ifqueue > overflowing, which could concieveably happen even if the actual network > wasn't congested. I don't have the equipment to create such a situation, > but it sounds like you might, in which case adding a debug printf or a > counter to see if it's happening might be advantageous. > > Mike "Silby" Silbersack > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" > > From owner-freebsd-performance@FreeBSD.ORG Thu Apr 10 14:58:08 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0F96137B401; Thu, 10 Apr 2003 14:58:08 -0700 (PDT) Received: from mother.ludd.luth.se (mother.ludd.luth.se [130.240.16.3]) by mx1.FreeBSD.org (Postfix) with ESMTP id F26AF43FBD; Thu, 10 Apr 2003 14:58:05 -0700 (PDT) (envelope-from pantzer@ludd.luth.se) Received: from ludd.luth.se (skalman.campus.luth.se [130.240.197.52]) by mother.ludd.luth.se (8.11.6+Sun/8.9.3) with ESMTP id h3ALw1810603; Thu, 10 Apr 2003 23:58:02 +0200 (MEST) Message-ID: <3E95E8E9.3080102@ludd.luth.se> Date: Thu, 10 Apr 2003 23:58:01 +0200 From: Mattias Pantzare User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.2.1) Gecko/20030217 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Terry Lambert References: <20030410171640.C44793B2@porter.dc.luth.se> <3E95E446.73B7E510@mindspring.com> In-Reply-To: <3E95E446.73B7E510@mindspring.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Mailman-Approved-At: Thu, 10 Apr 2003 15:13:22 -0700 cc: bj@dc.luth.se cc: freebsd-hackers@freebsd.org cc: freebsd-performance@freebsd.org cc: Eric Anderson cc: David Gilbert Subject: Re: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Apr 2003 21:58:08 -0000 > >>What happens in that case for me is that I run out of CPU resources. Try >>running "top" in one window and "netstat 1" in another while bashing the >>net with ttcp. > > > This is incredibly bizarre. It's very hard to saturate the CPU > at only 1Gbit: in all cases, you are I/O bound, not CPU bound, > and not memory bandwidth bound > >>IMPORTANT NOTE: Several tests here has shown that this is VERY BADLY >>affected if You have too much LAN equipment (especially VLAN seems to be >>harmful) at the edges. My speed of 960 Mbit/sec fell to 165 just by adding >>10 feet of cable and two switches :-( > > > The products that Jeffrey Hsu and I and Alfred and Jon Mini > worked on at a previous company had no problems at all on a > 1Gbit/S saturating the link, even through a VLAN trunk through > Cisco and one other less intelligent switch (i.e. two switches > and a VLAN trunk). A key factor here is that the tests were on a link with a 20ms round-trip time, and using a single TCP connection. So the switches were in addition to a few routers on a 10Gbit/s network. 
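For scale, the bandwidth-delay product ties these numbers together: filling 1 Gbit/s (about 125 MB/s) over the 67 ms RTT of the original report needs a window of roughly 125 MB/s x 0.067 s = 8.4 MB, so the 8 MB net.inet.tcp.sendspace shown earlier is only just adequate, while the 20 ms path mentioned here needs about 2.5 MB. Conversely, a send buffer that never fills past 4.05 MB can sustain at most about 4.05 MB / 0.067 s = 480 Mbit/s, so the observed 250 Mb/s implies stalls beyond window size alone.
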
From owner-freebsd-performance@FreeBSD.ORG Thu Apr 10 15:25:41 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7FD6437B401; Thu, 10 Apr 2003 15:25:41 -0700 (PDT) Received: from stork.mail.pas.earthlink.net (stork.mail.pas.earthlink.net [207.217.120.188]) by mx1.FreeBSD.org (Postfix) with ESMTP id EAB4943F75; Thu, 10 Apr 2003 15:25:40 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from pool0274.cvx40-bradley.dialup.earthlink.net ([216.244.43.19] helo=mindspring.com) by stork.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128) (Exim 3.33 #1) id 193kU8-00069E-00; Thu, 10 Apr 2003 15:25:32 -0700 Message-ID: <3E95EEAC.AE812757@mindspring.com> Date: Thu, 10 Apr 2003 15:22:36 -0700 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Petri Helenius References: <3E94A22D.174321F0@lbl.gov> <3E94A8C4.3A196E42@lbl.gov> <05b601c2ffa2$ed87a5b0$932a40c1@PHE> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a4c964df480dfbc21cc9b441a64eb8e474667c3043c0873f7e350badd9bab72f9c350badd9bab72f9c cc: freebsd-hackers@freebsd.org cc: freebsd-performance@freebsd.org Subject: Re: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Apr 2003 22:25:41 -0000 Petri Helenius wrote: > There was a discussion on mballoc performance on freebsd-net about a month ago > but it has since died without conclusion. Actually, a researcher at UKY implemented an alternate allocator using an explicit freelist and mp_machdep.c tricks, rather than using the zone/slab/uma stuff, and obtained a significant speedup. Yes, I was involved in suggesting implementation details. The resulting code is not SMP-safe, however, unless you are willing to use a global freelist (obviously). BTW: The zone allocator *still* calculates some values each time it goes back to the system for more pages, that it should precalculate and cache instead.
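[A minimal sketch of the explicit-freelist idea, with invented names -- not the UKY code: the point is that the steady-state alloc and free paths are two pointer moves each, with no zone bookkeeping. As noted above, a single global list like this is not SMP-safe without a lock.

struct mb_free {
        struct mb_free *mf_next;
};
static struct mb_free *mb_freelist;     /* one global list: UP only */

/* thread a batch of fixed-size buffers onto the list once, up front */
static void
mb_prime(char *base, int bufsize, int nbufs)
{
        int i;

        for (i = 0; i < nbufs; i++, base += bufsize) {
                ((struct mb_free *)base)->mf_next = mb_freelist;
                mb_freelist = (struct mb_free *)base;
        }
}

static void *
mb_alloc(void)
{
        struct mb_free *m = mb_freelist;

        if (m != NULL)
                mb_freelist = m->mf_next;
        return (m);     /* NULL: caller must replenish from the VM system */
}

static void
mb_free(void *p)
{
        struct mb_free *m = p;

        m->mf_next = mb_freelist;
        mb_freelist = m;
}]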
-- Terry From owner-freebsd-performance@FreeBSD.ORG Thu Apr 10 15:23:16 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B355537B401; Thu, 10 Apr 2003 15:23:16 -0700 (PDT) Received: from ebb.errno.com (ebb.errno.com [66.127.85.87]) by mx1.FreeBSD.org (Postfix) with ESMTP id E56F943F3F; Thu, 10 Apr 2003 15:23:15 -0700 (PDT) (envelope-from sam@errno.com) Received: from melange (melange.errno.com [66.127.85.82]) (authenticated bits=0) by ebb.errno.com (8.12.9/8.12.9) with ESMTP id h3AMNCpw048669 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=NO); Thu, 10 Apr 2003 15:23:13 -0700 (PDT) (envelope-from sam@errno.com) Message-ID: <02b201c2ffaf$c68acca0$52557f42@errno.com> From: "Sam Leffler" To: "Sean Chittenden" , "Jin Guojun [NCS]" References: <3E94A22D.174321F0@lbl.gov><20030409230733.GX79923@perrin.int.nxad.com> <3E94B993.D282DEB2@lbl.gov><20030410005846.GA79923@perrin.int.nxad.com> <3E95A37E.36186A9F@lbl.gov><3E95A653.8F5CE89C@lbl.gov> <3E95BEBB.58F86F4F@lbl.gov> <20030410201143.GF79923@perrin.int.nxad.com> Date: Thu, 10 Apr 2003 15:22:47 -0700 Organization: Errno Consulting MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.50.4920.2300 X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4920.2300 X-Mailman-Approved-At: Thu, 10 Apr 2003 15:31:38 -0700 cc: freebsd-hackers@freebsd.org cc: freebsd-performance@freebsd.org Subject: Re: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Apr 2003 22:23:17 -0000 > > I guess I overlooked something after applying the patch (attached): > > > > Apr 10 12:11:52 ncs /kernel: sbappend: bad tail 0x0xc209a200 instead of 0x0xc243 > > 6c00 > > Apr 10 12:11:52 ncs /kernel: sbappend: bad tail 0x0xc2436c00 instead of 0x0xc238 > > bf00 > > Apr 10 12:11:52 ncs /kernel: sbappend: bad tail 0x0xc238bf00 instead of 0x0xc243 > > f300 > > ... > > > > A large number of such messages were added into /var/log/messages. This > > indicates either bad patch code or something I changed in the patch > > to make it work in 4.8 (attached). > > > > Any thoughts? > > That's likely the sign that the patch isn't appending to the tail of > the list correctly. Doing a tail append where the tail is known > should be an O(1) operation and should make adding an mbuf to a > cluster faster. Right now it has to do a linear scan to append data, > iirc, which is likely _a_ cause of some performance degradation. I'm > not an mbuf expert, but I wonder how free mbufs are identified. > Regardless, I'll see if I can't figure out where this problem is with > the patch, it should do nothing but make things faster. If this is a repeat of Jason Thorpe's tail pointer optimization for sbappend, the patch may have started from one I did. I never committed it because I could never reproduce the performance gains he saw. I attributed it to a difference between netbsd and freebsd's TCP window setup algorithms. My patch for -stable (now probably very out of date) is in http://www.freebsd.org/~sam/thorpe-stable.patch.
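[A sketch of the tail-pointer idea under discussion, not Sam's or Jason's actual patch: cache the last mbuf of the socket-buffer chain so append is O(1) instead of a walk. The sb_mbtail field is an assumed addition to struct sockbuf, and the sballoc() byte accounting is omitted. The "bad tail" printfs quoted above are exactly what a sanity check reports when some other path changes the chain without updating the cached tail.

void
sbappend_o1(struct sockbuf *sb, struct mbuf *m)
{
        if (m == NULL)
                return;
        if (sb->sb_mb == NULL)
                sb->sb_mb = m;                  /* buffer was empty */
        else
                sb->sb_mbtail->m_next = m;      /* O(1), no chain walk */
        while (m->m_next != NULL)               /* appended chain may be long */
                m = m->m_next;
        sb->sb_mbtail = m;                      /* remember the new tail */
}]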
I haven't been following this thread closely, but FWIW I routinely get ~700 Mb/s running netperf between two -stable machines connected by a cross-over cable. Each machine has an Intel PRO/1000 card (em driver); 32-bit PCI in one machine and 64-bit in the other, but I've gotten similar performance with 32-bit PCI on both sides. As others have noted, you need to watch out for "environmental factors" in understanding performance. Aside from hardware issues (there are many), be especially wary of IRQ entropy harvesting. Sam From owner-freebsd-performance@FreeBSD.ORG Thu Apr 10 15:32:22 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DF09B37B401; Thu, 10 Apr 2003 15:32:22 -0700 (PDT) Received: from stork.mail.pas.earthlink.net (stork.mail.pas.earthlink.net [207.217.120.188]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2732643F93; Thu, 10 Apr 2003 15:32:22 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from pool0274.cvx40-bradley.dialup.earthlink.net ([216.244.43.19] helo=mindspring.com) by stork.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128) (Exim 3.33 #1) id 193kaf-0007Qy-00; Thu, 10 Apr 2003 15:32:18 -0700 Message-ID: <3E95F03C.2A01561D@mindspring.com> Date: Thu, 10 Apr 2003 15:29:16 -0700 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Mattias Pantzare References: <20030410171640.C44793B2@porter.dc.luth.se> <3E95E446.73B7E510@mindspring.com> <3E95E8E9.3080102@ludd.luth.se> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a4c964df480dfbc21ce9d14358040a3b86a8438e0f32a48e08350badd9bab72f9c350badd9bab72f9c cc: bj@dc.luth.se cc: freebsd-hackers@freebsd.org cc: freebsd-performance@freebsd.org cc: Eric Anderson cc: David Gilbert Subject: Re: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Apr 2003 22:32:23 -0000 Mattias Pantzare wrote: > > The products that Jeffrey Hsu and I and Alfred and Jon Mini > > worked on at a previous company had no problems at all on a > > 1Gbit/S saturating the link, even through a VLAN trunk through > > Cisco and one other less intelligent switch (i.e. two switches > > and a VLAN trunk). > > A key factor here is that the tests were on a link with a 20ms > round-trip time, and using a single TCP connection. So the switches > were in addition to a few routers on a 10Gbit/s network. Sorry, but this is not a factor. If you think it is, then you are running with badly tuned send and receive maximum window sizes.
Latency = pool retention time = queue size -- Terry From owner-freebsd-performance@FreeBSD.ORG Thu Apr 10 16:42:16 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 03E7F37B404 for ; Thu, 10 Apr 2003 16:42:15 -0700 (PDT) Received: from win149.staff.flyingcroc.net (win149.staff.flyingcroc.net [207.246.150.58]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4422743F93 for ; Thu, 10 Apr 2003 16:42:15 -0700 (PDT) (envelope-from rdb@blarg.net) Received: from localhost (localhost.localdomain [127.0.0.1]) h3ANgFN14760 for ; Thu, 10 Apr 2003 16:42:15 -0700 Date: Thu, 10 Apr 2003 16:42:15 -0700 (PDT) From: RDB X-X-Sender: russell@win149.staff.flyingcroc.net To: freebsd-performance@freebsd.org Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Subject: Semi-polling mode and net.inet.tcp.inflight_enable X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Apr 2003 23:42:16 -0000 Hi, I'm curious whether anyone on this list has found real-world circumstances in which either semi-polling mode or the net.inet.tcp.inflight_enable setting improved performance, and if so what the circumstances were. Conversely, has anyone experienced unforeseen problems (e.g. stability issues) with either of these new features? http://www.freebsd.org/cgi/man.cgi?query=tcp&sektion=4&manpath=FreeBSD+4.7-RELEASE http://www.freebsd.org/cgi/man.cgi?query=polling&sektion=4&manpath=FreeBSD+4.6-RELEASE Sincerely, Russell Brunelle P.S. The release notes for 4.8-R indicate that support for HyperThreading (via the new HTT kernel option) is "rudimentary." I'm curious: in what sense is it rudimentary? Are there stability issues? What are some of the plans to improve this feature, if it's a feature issue rather than a stability issue?
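[For reference, the setup polling(4) describes -- two kernel options plus a runtime switch; HZ=1000 is the usual companion setting because the polling work is scheduled per clock tick:

options DEVICE_POLLING
options HZ=1000

# after booting the new kernel:
sysctl -w kern.polling.enable=1           # semi-polling on (supported NICs only)
sysctl -w net.inet.tcp.inflight_enable=1  # inflight limiting is independent of polling]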
From owner-freebsd-performance@FreeBSD.ORG Thu Apr 10 17:14:08 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 880E437B401 for ; Thu, 10 Apr 2003 17:14:08 -0700 (PDT) Received: from perrin.int.nxad.com (internal.ext.nxad.com [69.1.70.251]) by mx1.FreeBSD.org (Postfix) with ESMTP id 02EEF43F85 for ; Thu, 10 Apr 2003 17:14:06 -0700 (PDT) (envelope-from sean@perrin.int.nxad.com) Received: by perrin.int.nxad.com (Postfix, from userid 1001) id C452C21065; Thu, 10 Apr 2003 17:13:47 -0700 (PDT) Date: Thu, 10 Apr 2003 17:13:47 -0700 From: Sean Chittenden To: RDB Message-ID: <20030411001347.GI79923@perrin.int.nxad.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4i X-PGP-Key: finger seanc@FreeBSD.org X-PGP-Fingerprint: 3849 3760 1AFE 7B17 11A0 83A6 DD99 E31F BC84 B341 X-Web-Homepage: http://sean.chittenden.org/ cc: freebsd-performance@freebsd.org Subject: Re: Semi-polling mode and net.inet.tcp.inflight_enable X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Apr 2003 00:14:08 -0000 > I'm curious whether anyone on this list has found real-world > circumstances in which either semi-polling mode or the > net.inet.tcp.inflight_enable setting improved performance, and if so > what the circumstances were. Performance in terms of what? -sc -- Sean Chittenden From owner-freebsd-performance@FreeBSD.ORG Thu Apr 10 17:30:30 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5FC9037B401 for ; Thu, 10 Apr 2003 17:30:30 -0700 (PDT) Received: from heron.mail.pas.earthlink.net (heron.mail.pas.earthlink.net [207.217.120.189]) by mx1.FreeBSD.org (Postfix) with ESMTP id BE7F943F3F for ; Thu, 10 Apr 2003 17:30:29 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from pool0011.cvx21-bradley.dialup.earthlink.net ([209.179.192.11] helo=mindspring.com) by heron.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128) (Exim 3.33 #1) id 193mQx-0001oo-00; Thu, 10 Apr 2003 17:30:24 -0700 Message-ID: <3E960B8C.B33C8F57@mindspring.com> Date: Thu, 10 Apr 2003 17:25:48 -0700 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: RDB References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a48246029b7a7fcab44bf1155fdab59f94350badd9bab72f9c350badd9bab72f9c350badd9bab72f9c cc: freebsd-performance@freebsd.org Subject: Re: Semi-polling mode and net.inet.tcp.inflight_enable X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Apr 2003 00:30:30 -0000 RDB wrote: > I'm curious whether anyone on this list has found real-world circumstances > in which either semi-polling mode or the net.inet.tcp.inflight_enable > setting improved performance, and if so what the circumstances were. Receiver livelock. However, it's not as good as full LRP. 
To see this, put a Gigabit card into a 32bit PCI @ 33MHz, and drive it full tilt from client machines. Watch as your data peaks, and then stops coming in at all, because all your mbufs are used up, and all your PCI bandwidth is being consumed by DMA's of incoming packets, and you have no cycles left over to do any work. Then go read the Jeffrey Mogul DECWRL paper from 1991, and be enlightened. -- Terry From owner-freebsd-performance@FreeBSD.ORG Thu Apr 10 17:37:26 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7C91837B401 for ; Thu, 10 Apr 2003 17:37:26 -0700 (PDT) Received: from win149.staff.flyingcroc.net (win149.staff.flyingcroc.net [207.246.150.58]) by mx1.FreeBSD.org (Postfix) with ESMTP id EDC6543FAF for ; Thu, 10 Apr 2003 17:37:25 -0700 (PDT) (envelope-from rdb@blarg.net) Received: from localhost (localhost.localdomain [127.0.0.1]) h3B0bPN15057; Thu, 10 Apr 2003 17:37:25 -0700 Date: Thu, 10 Apr 2003 17:37:25 -0700 (PDT) From: RDB X-X-Sender: russell@win149.staff.flyingcroc.net To: Sean Chittenden In-Reply-To: <20030411001347.GI79923@perrin.int.nxad.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: freebsd-performance@freebsd.org Subject: Re: Semi-polling mode and net.inet.tcp.inflight_enable X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Apr 2003 00:37:26 -0000 > Performance in terms of what? -sc Quite honestly, any sort of positive result. I'm basically just trying to get a handle on what type of real-life situations these settings are intended to benefit. Does semi-polling mode result in less CPU being occupied by "interrupt" state at high bandwidth levels when many IPF rules are being used, for example? Does either of these settings result in lower latency or higher throughput at high bandwidth levels? I don't have a firm handle on what specific real-life issues these two new features were intended to address, so I'm not 100% sure how to phrase my question.
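[The "interrupt state" part can be answered empirically with stock tools: run the same load with kern.polling.enable at 0 and then 1, and compare

vmstat -i           # cumulative per-device interrupt counts and rates
systat -vmstat 1    # live interrupt rates beside the CPU idle/system split
top -S              # the "CPU states" line breaks out time spent in interrupt

If polling helps, the interrupt rate and the interrupt CPU fraction should drop while throughput holds.]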
Russell Brunelle From owner-freebsd-performance@FreeBSD.ORG Fri Apr 11 06:59:04 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6E5F237B401; Fri, 11 Apr 2003 06:59:04 -0700 (PDT) Received: from goliat.adm.luth.se (goliat.adm.luth.se [130.240.127.20]) by mx1.FreeBSD.org (Postfix) with ESMTP id DC3AB43F85; Fri, 11 Apr 2003 06:59:01 -0700 (PDT) (envelope-from pantzer@ludd.luth.se) Received: from ludd.luth.se (pantzer@ra.dc.luth.se [130.240.112.180]) by goliat.adm.luth.se (8.10.1/8.10.1) with ESMTP id h3BDwtv14535; Fri, 11 Apr 2003 15:58:56 +0200 (MET DST) Message-ID: <3E96CA1F.4070000@ludd.luth.se> Date: Fri, 11 Apr 2003 15:58:55 +0200 From: Mattias Pantzare User-Agent: Mozilla/5.0 (X11; U; SunOS sun4u; en-US; rv:1.3) Gecko/20030316 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Terry Lambert References: <20030410171640.C44793B2@porter.dc.luth.se> <3E95E446.73B7E510@mindspring.com> <3E95E8E9.3080102@ludd.luth.se> <3E95F03C.2A01561D@mindspring.com> In-Reply-To: <3E95F03C.2A01561D@mindspring.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit cc: bj@dc.luth.se cc: freebsd-hackers@freebsd.org cc: freebsd-performance@freebsd.org cc: Eric Anderson cc: David Gilbert Subject: Re: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Apr 2003 13:59:04 -0000 Terry Lambert wrote: > Mattias Pantzare wrote: > >>>The products that Jeffrey Hsu and I and Alfred and Jon Mini >>>worked on at a previous company had no problems at all on a >>>1Gbit/S saturating the link, even through a VLAN trunk through >>>Cisco and one other less intelligent switch (i.e. two switches >>>and a VLAN trunk). >> >>A key factor here is that the tests were on a link with a 20ms >>round-trip time, and using a single TCP connection. So the switches >>were in addition to a few routers on a 10Gbit/s network. > > > Sorry, but this is not a factor. If you think it is, then you > are running with badly tuned send and receive maximum window > sizes. > > Latency = pool retention time = queue size Then explain this, FreeBSD to FreeBSD on that link uses all CPU on the sender, the receiver is fine, but performance is not. NetBSD to FreeBSD fills the link (1 Gbit/s). On the same computers. MTU 4470. Send and receive maximum windows were tuned to the same values on NetBSD and FreeBSD. And packet loss will affect the performance differently if you have a large bandwidth-latency product.
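[For scale, plain arithmetic rather than anything thread-specific: at 1 Gbit/s with a 20 ms round-trip time, the bandwidth-delay product is (10^9 / 8) bytes/s x 0.020 s = 2.5 MB, so a single connection needs roughly 2.5 MB of send and receive window to keep the pipe full. A loss that halves a window of that size then needs hundreds of round trips -- tens of seconds at 20 ms -- to grow back in congestion avoidance, which is why loss hurts far more here than on a LAN.]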
From owner-freebsd-performance@FreeBSD.ORG Fri Apr 11 07:07:53 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 75D7B37B418; Fri, 11 Apr 2003 07:07:50 -0700 (PDT) Received: from samson.dc.luth.se (samson.dc.luth.se [130.240.112.30]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0886F43FDF; Fri, 11 Apr 2003 07:07:49 -0700 (PDT) (envelope-from bj@dc.luth.se) Received: from dc.luth.se (root@bompe.dc.luth.se [130.240.60.42]) by samson.dc.luth.se (8.12.5/8.12.5) with ESMTP id h3BE7ijY029873; Fri, 11 Apr 2003 16:07:44 +0200 (MET DST) Received: from bompe.dc.luth.se (bj@localhost.dc.luth.se [127.0.0.1]) by dc.luth.se (8.12.6/8.11.3) with ESMTP id h3BE7hKl086838; Fri, 11 Apr 2003 16:07:43 +0200 (CEST) (envelope-from bj@bompe.dc.luth.se) Message-Id: <200304111407.h3BE7hKl086838@dc.luth.se> X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4 To: Mattias Pantzare In-reply-to: Your message of Fri, 11 Apr 2003 15:58:55 +0200. <3E96CA1F.4070000@ludd.luth.se> Dcc: X-Disposition-notification-to: Borje.Josefsson@dc.luth.se X-uri: http://www.dc.luth.se/~bj/index.html Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable Date: Fri, 11 Apr 2003 16:07:43 +0200 From: Borje Josefsson X-Mailman-Approved-At: Fri, 11 Apr 2003 10:54:12 -0700 cc: Terry Lambert cc: freebsd-hackers@freebsd.org cc: freebsd-performance@freebsd.org cc: Eric Anderson cc: David Gilbert Subject: Re: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: bj@dc.luth.se List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Apr 2003 14:07:53 -0000 On Fri, 11 Apr 2003 15:58:55 +0200 Mattias Pantzare wrote: > Terry Lambert wrote: > > Mattias Pantzare wrote: > > > >>>The products that Jeffrey Hsu and I and Alfred and Jon Mini > >>>worked on at a previous company had no problems at all on a > >>>1Gbit/S saturating the link, even through a VLAN trunk through > >>>Cisco and one other less intelligent switch (i.e. two switches > >>>and a VLAN trunk). > >> > >>A key factor here is that the tests were on a link with a 20ms > >>round-trip time, and using a single TCP connection. So the switches > >>were in addition to a few routers on a 10Gbit/s network. > > > > > > Sorry, but this is not a factor. If you think it is, then you > > are running with badly tuned send and receive maximum window > > sizes. > > > > Latency = pool retention time = queue size > > Then explain this, FreeBSD to FreeBSD on that link uses all CPU on the > sender, the receiver is fine, but performance is not. NetBSD to FreeBSD > fills the link (1 Gbit/s). On the same computers. MTU 4470. Send and > receive maximum windows were tuned to the same values on NetBSD and > FreeBSD. I should add that I have tried with MTU 1500 also. Using NetBSD as sender works fine (just a little bit higher CPU load). When we tried MTU 1500 with FreeBSD as sender, we got even lower performance. Somebody else in this thread said that he had got full GE speed between two FreeBSD boxes connected back-to-back. I don't question that, but that doesn't prove anything. The problem arises when You are trying to do this long-distance and have to handle a large mbuf queue.
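[The knobs that govern "a large mbuf queue" on 4.x, with illustrative values sized from the 2.5 MB figure above -- examples, not recommendations; nmbclusters is a boot-time tunable:

kern.ipc.nmbclusters="32768"      # /boot/loader.conf: the cluster pool is sized at boot
kern.ipc.maxsockbuf=8388608       # /etc/sysctl.conf from here down
net.inet.tcp.rfc1323=1            # window scaling, required for any window over 64 KB
net.inet.tcp.sendspace=4194304    # default per-socket buffers >= bandwidth x delay
net.inet.tcp.recvspace=4194304]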
--Börje From owner-freebsd-performance@FreeBSD.ORG Wed Apr 9 05:38:17 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id BBA7B37B401 for ; Wed, 9 Apr 2003 05:38:17 -0700 (PDT) Received: from mail.dph.no (c3p0.nith.no [194.19.35.81]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7637E43F3F for ; Wed, 9 Apr 2003 05:38:16 -0700 (PDT) (envelope-from nyogtha@flipp.net) Received: from studraad ([10.36.5.145]) by mail.dph.no (Netscape Messaging Server 3.6) with SMTP id AAA62D8 for ; Wed, 9 Apr 2003 14:38:13 +0200 Message-ID: <001801c2fe94$e14a6800$9105240a@stavanger.nith.no> From: "Aslak Evang" To: Date: Wed, 9 Apr 2003 14:38:10 +0200 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 Subject: just a test X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Apr 2003 12:38:18 -0000 X-List-Received-Date: Wed, 09 Apr 2003 12:38:18 -0000 is the list active yet? :) - Aslak --- .______________ ___ _________. \__ ___/ | \/ _____/ | It's a simple question! If |THE*********| | | / ~ \_____ \ |you were a hot-dog, would you|**HAPPY***| | | \ Y / \ |eat yourself? I know I would!|********SUMO| |____| \___|_ /_______ / | mail - nyogthaflipp.net |************| \/ \/ From owner-freebsd-performance@FreeBSD.ORG Wed Apr 9 05:46:52 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9367737B401 for ; Wed, 9 Apr 2003 05:46:52 -0700 (PDT) Received: from tesla.signet.nl (tesla.signet.nl [193.172.225.246]) by mx1.FreeBSD.org (Postfix) with ESMTP id A0FC443F85 for ; Wed, 9 Apr 2003 05:46:51 -0700 (PDT) (envelope-from remco@signet.nl) Received: from 127.0.0.1 (localhost.signet.nl [127.0.0.1]) by secure-mail.signet.nl (Postfix) with SMTP id A917615E5A8; Wed, 9 Apr 2003 14:46:42 +0200 (CEST) Received: from signet.nl (morse.signet.nl [193.172.225.230]) by tesla.signet.nl (Postfix) with ESMTP id 2767F15E5A6; Wed, 9 Apr 2003 14:46:42 +0200 (CEST) Message-ID: <3E941639.2030007@signet.nl> Date: Wed, 09 Apr 2003 14:46:49 +0200 From: Remco Bressers User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3b) Gecko/20030213 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Aslak Evang References: <001801c2fe94$e14a6800$9105240a@stavanger.nith.no> In-Reply-To: <001801c2fe94$e14a6800$9105240a@stavanger.nith.no> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit cc: performance@freebsd.org Subject: Re: just a test X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Apr 2003 12:46:53 -0000 X-List-Received-Date: Wed, 09 Apr 2003 12:46:53 -0000 guess it is :-) Aslak Evang wrote: >is the list active yet? :) > >- Aslak >--- > >.______________ ___ _________. > \__ ___/ | \/ _____/ | It's a simple question! If |THE*********| > | | / ~ \_____ \ |you were a hot-dog, would you|**HAPPY***| > | | \ Y / \ |eat yourself?
I know I would!|********SUMO| > |____| \___|_ /_______ / | mail - nyogthaflipp.net |************| > \/ \/ >_______________________________________________ >freebsd-performance@freebsd.org mailing list >http://lists.freebsd.org/mailman/listinfo/freebsd-performance >To unsubscribe, send any mail to "freebsd-performance-unsubscribe@freebsd.org" > > > From owner-freebsd-performance@FreeBSD.ORG Wed Apr 9 05:49:06 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 638C237B401 for ; Wed, 9 Apr 2003 05:49:06 -0700 (PDT) Received: from snow.fingers.co.za (snow.fingers.co.za [196.7.148.5]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4F8AE43F93 for ; Wed, 9 Apr 2003 05:49:05 -0700 (PDT) (envelope-from fingers@fingers.co.za) Received: by snow.fingers.co.za (Postfix, from userid 1001) id F291F17523; Wed, 9 Apr 2003 14:49:02 +0200 (SAST) Received: from localhost (localhost [127.0.0.1]) by snow.fingers.co.za (Postfix) with ESMTP id EE7C417413 for ; Wed, 9 Apr 2003 14:49:02 +0200 (SAST) Date: Wed, 9 Apr 2003 14:49:02 +0200 (SAST) From: fingers To: performance@freebsd.org In-Reply-To: <3E941639.2030007@signet.nl> Message-ID: <20030409144758.Q21378@snow.fingers.co.za> References: <001801c2fe94$e14a6800$9105240a@stavanger.nith.no> <3E941639.2030007@signet.nl> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Subject: Re: just a test X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Apr 2003 12:49:06 -0000 X-List-Received-Date: Wed, 09 Apr 2003 12:49:06 -0000 you could always just wait for someone to send a message that corresponds with the list's charter, or send 1 yourself, instead of starting a thread of test messages, to which others inevitably end up responding From owner-freebsd-performance@FreeBSD.ORG Wed Apr 9 06:07:07 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3D13237B401 for ; Wed, 9 Apr 2003 06:07:07 -0700 (PDT) Received: from wabakimi.chat.carleton.ca (wabakimi.chat.carleton.ca [134.117.1.98]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4EF6243FAF for ; Wed, 9 Apr 2003 06:07:06 -0700 (PDT) (envelope-from creyenga@connectmail.carleton.ca) Received: from fireball (terry.cavern.carleton.ca [134.117.93.187] (may be forged))h39D75OR016857 for ; Wed, 9 Apr 2003 09:07:05 -0400 (EDT) Message-ID: <000701c2fe98$f0cc4c40$0200000a@fireball> From: "Craig Reyenga" To: Date: Wed, 9 Apr 2003 09:07:04 -0400 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Subject: Users and setpriority() X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Apr 2003 13:07:07 -0000 X-List-Received-Date: Wed, 09 Apr 2003 13:07:07 -0000 First on topic post! Currently, setpriority() doesn't allow non- uid 0 users to use a nice value above 0. 
If you set "priority" in /etc/login.conf to a higher value, all you are doing is making every stinking process on the system run at that value initially, which is a disaster. My question is: Is there, or will there be a facility to allow certain non-root users to set higher/raise nice values? This would be a dream for desktop machines where there is essentially one user, because that user could have a non-zero uid, and control of process scheduling. -Craig From owner-freebsd-performance@FreeBSD.ORG Wed Apr 9 06:11:27 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3E54637B404 for ; Wed, 9 Apr 2003 06:11:27 -0700 (PDT) Received: from wabakimi.chat.carleton.ca (wabakimi.chat.carleton.ca [134.117.1.98]) by mx1.FreeBSD.org (Postfix) with ESMTP id 43DEA43F93 for ; Wed, 9 Apr 2003 06:11:26 -0700 (PDT) (envelope-from creyenga@connectmail.carleton.ca) Received: from fireball (resnet-93-187.cavern.carleton.ca [134.117.93.187]) h39DBPOR017378 for ; Wed, 9 Apr 2003 09:11:25 -0400 (EDT) Message-ID: <001301c2fe99$8c248450$0200000a@fireball> From: "Craig Reyenga" To: References: <000701c2fe98$f0cc4c40$0200000a@fireball> Date: Wed, 9 Apr 2003 09:11:25 -0400 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Subject: Re: Users and setpriority() X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Apr 2003 13:11:27 -0000 X-List-Received-Date: Wed, 09 Apr 2003 13:11:27 -0000 I am such a bonehead! I meant _BELOW_ zero!! From: "Craig Reyenga" > First on topic post! > > Currently, setpriority() doesn't allow non- uid 0 users to use a nice value > above 0. If you set "priority" in /etc/login.conf to a higher value, all you > are doing is making every stinking process on the system run at that value > initially, which is a disaster. My question is: Is there, or will there be a > facility to allow certain non-root users to set higher/raise nice values? > This would be a dream for desktop machines where there is essentially one > user, because that user could have a non-zero uid, and control of process > scheduling. 
> > -Craig -Craig From owner-freebsd-performance@FreeBSD.ORG Wed Apr 9 06:12:07 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C39AC37B401 for ; Wed, 9 Apr 2003 06:12:06 -0700 (PDT) Received: from relay.kiev.sovam.com (relay.kiev.sovam.com [212.109.32.5]) by mx1.FreeBSD.org (Postfix) with ESMTP id E205943F3F for ; Wed, 9 Apr 2003 06:12:05 -0700 (PDT) (envelope-from dimitry@al.org.ua) Received: from [212.109.32.116] (helo=dimitry.kiev.sovam.com) by relay.kiev.sovam.com with esmtp (Exim 3.36 #5) id 193FMx-000EEV-00; Wed, 09 Apr 2003 16:12:03 +0300 From: Dmitry Alyabyev To: "Craig Reyenga" Date: Wed, 9 Apr 2003 16:12:03 +0300 User-Agent: KMail/1.5 References: <000701c2fe98$f0cc4c40$0200000a@fireball> In-Reply-To: <000701c2fe98$f0cc4c40$0200000a@fireball> X-NCC-RegID: ua.svitonline MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200304091612.03211.dimitry@al.org.ua> cc: freebsd-performance@freebsd.org Subject: Re: Users and setpriority() X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: dimitry@al.org.ua List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Apr 2003 13:12:07 -0000 X-List-Received-Date: Wed, 09 Apr 2003 13:12:07 -0000 On Wednesday 09 April 2003 16:07, Craig Reyenga wrote: > First on topic post! > > Currently, setpriority() doesn't allow non- uid 0 users to use a nice value > above 0. If you set "priority" in /etc/login.conf to a higher value, all > you are doing is making every stinking process on the system run at that > value initially, which is a disaster. My question is: Is there, or will > there be a facility to allow certain non-root users to set higher/raise > nice values? This would be a dream for desktop machines where there is > essentially one user, because that user could have a non-zero uid, and > control of process scheduling. 
'sudo /usr/bin/renice' will help -- Dimitry From owner-freebsd-performance@FreeBSD.ORG Wed Apr 9 06:20:38 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8AA2937B401 for ; Wed, 9 Apr 2003 06:20:38 -0700 (PDT) Received: from perrin.int.nxad.com (internal.ext.nxad.com [69.1.70.251]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2A65443FB1 for ; Wed, 9 Apr 2003 06:20:38 -0700 (PDT) (envelope-from sean@perrin.int.nxad.com) Received: by perrin.int.nxad.com (Postfix, from userid 1001) id D806D21062; Wed, 9 Apr 2003 06:20:23 -0700 (PDT) Date: Wed, 9 Apr 2003 06:20:23 -0700 From: Sean Chittenden To: Craig Reyenga Message-ID: <20030409132023.GQ79923@perrin.int.nxad.com> References: <000701c2fe98$f0cc4c40$0200000a@fireball> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <000701c2fe98$f0cc4c40$0200000a@fireball> User-Agent: Mutt/1.4i X-PGP-Key: finger seanc@FreeBSD.org X-PGP-Fingerprint: 3849 3760 1AFE 7B17 11A0 83A6 DD99 E31F BC84 B341 X-Web-Homepage: http://sean.chittenden.org/ cc: freebsd-performance@freebsd.org Subject: Re: Users and setpriority() X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Apr 2003 13:20:38 -0000 X-List-Received-Date: Wed, 09 Apr 2003 13:20:38 -0000 > First on topic post! Kind of... :) This list is more geared toward server performance, but that's not to say that desktop computing isn't performance sensitive or off topic... > Currently, setpriority() doesn't allow non- uid 0 users to use a > nice value *below* 0. If you set "priority" in /etc/login.conf to a > higher value, all you are doing is making every stinking process on > the system run at that value initially, which is a disaster. Unless I'm misunderstanding what you mean by disaster, this isn't a problem unless a system's CPU resources are in contention. If it isn't, then the scheduler won't need to rely on the priority value of a process to make scheduling decisions on what processes get how much of the CPUs time. > My question is: Is there, or will there be a facility to allow > certain non-root users to set higher/raise nice values? This would > be a dream for desktop machines where there is essentially one user, > because that user could have a non-zero uid, and control of process > scheduling. There isn't a mechanism other than sudo renice (as already suggested). 
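[A sketch of what the sudo approach looks like in sudoers(5) -- the group and pid here are examples only:

# visudo: let wheel members renice anything without a password
%wheel ALL = (root) NOPASSWD: /usr/bin/renice

# then, as the ordinary user:
sudo renice -10 12345             # boost (hypothetical) pid 12345]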
-sc -- Sean Chittenden From owner-freebsd-performance@FreeBSD.ORG Wed Apr 9 07:03:14 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6E97837B401 for ; Wed, 9 Apr 2003 07:03:14 -0700 (PDT) Received: from otter3.centtech.com (moat3.centtech.com [207.200.51.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9147043F75 for ; Wed, 9 Apr 2003 07:03:13 -0700 (PDT) (envelope-from anderson@centtech.com) Received: from centtech.com (electron.centtech.com [204.177.173.173]) by otter3.centtech.com (8.12.3/8.12.3) with ESMTP id h39E3C56048930 for ; Wed, 9 Apr 2003 09:03:12 -0500 (CDT) (envelope-from anderson@centtech.com) Message-ID: <3E942819.1080909@centtech.com> Date: Wed, 09 Apr 2003 09:03:05 -0500 From: Eric Anderson User-Agent: Mozilla/5.0 (X11; U; Linux i386; en-US; rv:1.0.1) Gecko/20020823 Netscape/7.0 X-Accept-Language: en-us, en MIME-Version: 1.0 To: freebsd-performance@freebsd.org Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Subject: New list eh? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Apr 2003 14:03:14 -0000 X-List-Received-Date: Wed, 09 Apr 2003 14:03:14 -0000 This is great - I've been asking for this for well over a year! I've also been thinking about writing all this stuff into a "FreeBSD Performance Guide" for the handbook. What does everyone think of that? Eric -- ------------------------------------------------------------------ Eric Anderson Systems Administrator Centaur Technology Attitudes are contagious, is yours worth catching? ------------------------------------------------------------------ From owner-freebsd-performance@FreeBSD.ORG Wed Apr 9 07:06:36 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DD55737B401 for ; Wed, 9 Apr 2003 07:06:36 -0700 (PDT) Received: from snark.ratmir.ru (snark.ratmir.ru [213.24.248.177]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1DF9C43FAF for ; Wed, 9 Apr 2003 07:06:35 -0700 (PDT) (envelope-from freebsd@snark.ratmir.ru) Received: from snark.ratmir.ru (freebsd@localhost [127.0.0.1]) by snark.ratmir.ru (8.12.9/8.12.9) with ESMTP id h39E6QDR034301; Wed, 9 Apr 2003 18:06:27 +0400 (MSD) (envelope-from freebsd@snark.ratmir.ru) Received: (from freebsd@localhost) by snark.ratmir.ru (8.12.9/8.12.9/Submit) id h39E6QXI034300; Wed, 9 Apr 2003 18:06:26 +0400 (MSD) Date: Wed, 9 Apr 2003 18:06:26 +0400 From: Alex Semenyaka To: Eric Anderson Message-ID: <20030409140626.GD33718@snark.ratmir.ru> References: <3E942819.1080909@centtech.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3E942819.1080909@centtech.com> User-Agent: Mutt/1.5.4i X-Mailman-Approved-At: Wed, 09 Apr 2003 07:30:38 -0700 cc: freebsd-performance@freebsd.org Subject: Re: New list eh? 
X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Apr 2003 14:06:37 -0000 X-List-Received-Date: Wed, 09 Apr 2003 14:06:37 -0000 On Wed, Apr 09, 2003 at 09:03:05AM -0500, Eric Anderson wrote: > I've also been thinking about writing all this stuff into a "FreeBSD > Performance Guide" for the handbook. What does everyone think of that? That is a really good idea. But it might be a lot of work to gather information, compile it, sort into topics and so on. Probably the tuning(7) manual can be taken as the starting point? SY, Alex From owner-freebsd-performance@FreeBSD.ORG Wed Apr 9 07:09:58 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C2D9E37B401 for ; Wed, 9 Apr 2003 07:09:58 -0700 (PDT) Received: from otter3.centtech.com (moat3.centtech.com [207.200.51.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id 333C543F75 for ; Wed, 9 Apr 2003 07:09:58 -0700 (PDT) (envelope-from anderson@centtech.com) Received: from centtech.com (electron.centtech.com [204.177.173.173]) by otter3.centtech.com (8.12.3/8.12.3) with ESMTP id h39E9v56050500; Wed, 9 Apr 2003 09:09:57 -0500 (CDT) (envelope-from anderson@centtech.com) Message-ID: <3E9429AF.1030605@centtech.com> Date: Wed, 09 Apr 2003 09:09:51 -0500 From: Eric Anderson User-Agent: Mozilla/5.0 (X11; U; Linux i386; en-US; rv:1.0.1) Gecko/20020823 Netscape/7.0 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Alex Semenyaka References: <3E942819.1080909@centtech.com> <20030409140626.GD33718@snark.ratmir.ru> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit cc: freebsd-performance@freebsd.org Subject: Re: New list eh? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Apr 2003 14:09:59 -0000 X-List-Received-Date: Wed, 09 Apr 2003 14:09:59 -0000 Alex Semenyaka wrote: > On Wed, Apr 09, 2003 at 09:03:05AM -0500, Eric Anderson wrote: > >>I've also been thinking about writing all this stuff into a "FreeBSD >>Performance Guide" for the handbook. What does everyone think of that? > > > That is a really good idea. But it might be a lot of work to gather information, > compile it, sort into topics and so on. Probably the tuning(7) manual can be > taken as the starting point? Yes, the tuning man page is ok, but it doesn't address the real heavy duty questions. For instance, I run a mega-heavily used NFS server (well, several of them actually), and I want to know what knobs to turn to make it perform the absolute best. I also occasionally get errors, and I'd like to know how to fix those. The tuning man page says nothing about that stuff. Yes, it will be lots of work, but I already have a lot of time put into the doc project so it's no big deal for me. Eric -- ------------------------------------------------------------------ Eric Anderson Systems Administrator Centaur Technology Attitudes are contagious, is yours worth catching?
------------------------------------------------------------------ From owner-freebsd-performance@FreeBSD.ORG Wed Apr 9 07:10:30 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 99B1D37B401 for ; Wed, 9 Apr 2003 07:10:30 -0700 (PDT) Received: from wabakimi.chat.carleton.ca (wabakimi.chat.carleton.ca [134.117.1.98]) by mx1.FreeBSD.org (Postfix) with ESMTP id C4DED43FA3 for ; Wed, 9 Apr 2003 07:10:29 -0700 (PDT) (envelope-from creyenga@connectmail.carleton.ca) Received: from fireball (terry.cavern.carleton.ca [134.117.93.187] (may be forged))h39EACOR025740; Wed, 9 Apr 2003 10:10:12 -0400 (EDT) Message-ID: <000701c2fea1$c27bacb0$0200000a@fireball> From: "Craig Reyenga" To: "Eric Anderson" References: <3E942819.1080909@centtech.com> Date: Wed, 9 Apr 2003 10:10:12 -0400 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 cc: freebsd-performance@freebsd.org Subject: Re: New list eh? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Apr 2003 14:10:30 -0000 X-List-Received-Date: Wed, 09 Apr 2003 14:10:30 -0000 From: "Eric Anderson" > This is great - I've been asking for this for well over a year! > > I've also been thinking about writing all this stuff into a "FreeBSD > Performance Guide" for the handbook. What does everyone think of that? > That sounds pretty cool. -Craig From owner-freebsd-performance@FreeBSD.ORG Wed Apr 9 09:32:12 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E6A4437B401 for ; Wed, 9 Apr 2003 09:32:12 -0700 (PDT) Received: from mail.broadpark.no (mail.broadpark.no [217.13.4.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2F63843FAF for ; Wed, 9 Apr 2003 09:32:12 -0700 (PDT) (envelope-from nyogtha@flipp.net) Received: from LAPDANCE (deathpolka.nyogtha.org [217.13.20.12]) by mail.broadpark.no (Postfix) with SMTP id 86B9F791FE for ; Wed, 9 Apr 2003 18:32:10 +0200 (MEST) Message-ID: <002a01c2feb5$8d7e8550$3800000a@LAPDANCE> From: "Aslak Evang" To: References: <001801c2fe94$e14a6800$9105240a@stavanger.nith.no> <3E941639.2030007@signet.nl> <20030409144758.Q21378@snow.fingers.co.za> Date: Wed, 9 Apr 2003 18:32:03 +0200 Organization: THS MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 Subject: Re: just a test X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Apr 2003 16:32:13 -0000 X-List-Received-Date: Wed, 09 Apr 2003 16:32:13 -0000 > you could always just wait for someone to send a message that > corresponds with the list's charter, or send 1 yourself, instead of > starting a thread > of test messages, to which others inevitably end up responding And you could have ignored the thread instead of taking part and making it a 
discussion... I wasn't expecting lots of answers but since I subscribed and a number of hours passed without any posts I felt the need to check. - Aslak From owner-freebsd-performance@FreeBSD.ORG Wed Apr 9 09:33:00 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6D7E737B401 for ; Wed, 9 Apr 2003 09:33:00 -0700 (PDT) Received: from mail.broadpark.no (mail.broadpark.no [217.13.4.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id E0E2D43F85 for ; Wed, 9 Apr 2003 09:32:59 -0700 (PDT) (envelope-from nyogtha@flipp.net) Received: from LAPDANCE (deathpolka.nyogtha.org [217.13.20.12]) by mail.broadpark.no (Postfix) with SMTP id EC46F791F9 for ; Wed, 9 Apr 2003 18:32:58 +0200 (MEST) Message-ID: <003201c2feb5$aaad3950$3800000a@LAPDANCE> From: "Aslak Evang" To: References: <3E942819.1080909@centtech.com> Date: Wed, 9 Apr 2003 18:32:52 +0200 Organization: THS MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 Subject: Re: New list eh? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Apr 2003 16:33:00 -0000 X-List-Received-Date: Wed, 09 Apr 2003 16:33:00 -0000 > I've also been thinking about writing all this stuff into a "FreeBSD > Performance Guide" for the handbook. What does everyone think of > that? Great idea. This is a subject that I really want to learn much more about. - Aslak From owner-freebsd-performance@FreeBSD.ORG Wed Apr 9 09:34:46 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5509437B401 for ; Wed, 9 Apr 2003 09:34:46 -0700 (PDT) Received: from mail.trident-uk.co.uk (mail.trident-uk.co.uk [81.3.89.57]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9E80843FA3 for ; Wed, 9 Apr 2003 09:34:43 -0700 (PDT) (envelope-from jamie@tridentmicrosystems.co.uk) Received: from localhost (localhost.pe.trident-uk.co.uk [127.0.0.1]) by mail.trident-uk.co.uk (Postfix) with ESMTP id 70B07638; Wed, 9 Apr 2003 17:33:48 +0100 (BST) Received: from jamieheckford (wrkstn-74.pe.trident-uk.co.uk [192.168.100.74]) by mail.trident-uk.co.uk (Postfix) with ESMTP id D8C2E64F; Wed, 9 Apr 2003 17:33:46 +0100 (BST) From: "Jamie Heckford" To: "'Aslak Evang'" Date: Wed, 9 Apr 2003 17:31:10 +0100 Organization: Trident Microsystems Ltd. Message-ID: <000601c2feb5$6dd05260$4a64a8c0@jamieheckford> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook, Build 10.0.3416 Importance: Normal X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 In-Reply-To: <003201c2feb5$aaad3950$3800000a@LAPDANCE> X-Virus-Scanned: by AMaViS perl-11 cc: freebsd-performance@freebsd.org Subject: RE: New list eh? 
X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: jamie@tridentmicrosystems.co.uk List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Apr 2003 16:34:46 -0000 X-List-Received-Date: Wed, 09 Apr 2003 16:34:46 -0000 > > > I've also been thinking about writing all this stuff into a > "FreeBSD > > Performance Guide" for the handbook. What does everyone think of > > that? > That would be a brilliant idea. I would be happy to help with writing such a section for the handbook. A lot of people I have helped have found tuning(7) to be of great help, maybe expand on and keep that updated. Jamie From owner-freebsd-performance@FreeBSD.ORG Wed Apr 9 11:26:39 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id A8F5A37B401 for ; Wed, 9 Apr 2003 11:26:39 -0700 (PDT) Received: from otter3.centtech.com (moat3.centtech.com [207.200.51.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id D474443FAF for ; Wed, 9 Apr 2003 11:26:38 -0700 (PDT) (envelope-from anderson@centtech.com) Received: from centtech.com (electron.centtech.com [204.177.173.173]) by otter3.centtech.com (8.12.3/8.12.3) with ESMTP id h39IQO56082951; Wed, 9 Apr 2003 13:26:24 -0500 (CDT) (envelope-from anderson@centtech.com) Message-ID: <3E9465CA.5020609@centtech.com> Date: Wed, 09 Apr 2003 13:26:18 -0500 From: Eric Anderson User-Agent: Mozilla/5.0 (X11; U; Linux i386; en-US; rv:1.0.1) Gecko/20020823 Netscape/7.0 X-Accept-Language: en-us, en MIME-Version: 1.0 To: jamie@tridentmicrosystems.co.uk References: <000601c2feb5$6dd05260$4a64a8c0@jamieheckford> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit cc: freebsd-performance@freebsd.org cc: 'Aslak Evang' Subject: Re: New list eh? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Apr 2003 18:26:39 -0000 X-List-Received-Date: Wed, 09 Apr 2003 18:26:39 -0000 Jamie Heckford wrote: >>>I've also been thinking about writing all this stuff into a >> >>"FreeBSD >> >>>Performance Guide" for the handbook. What does everyone think of >>>that? >> > > > That would be a brilliant idea. I would be happy to help with writing > such a section for the handbook. > > A lot of people I have helped have found tuning(7) to be of great help, > maybe expand on and keep that updated. Great! I'll keep you in mind as I build it. Actually, to get started, it would be great if I could get a list of solutions anyone has found for any type of performance tweak, so I can start compiling them and merging them in with the current information in tuning. Eric -- ------------------------------------------------------------------ Eric Anderson Systems Administrator Centaur Technology Attitudes are contagious, is yours worth catching?
------------------------------------------------------------------ From owner-freebsd-performance@FreeBSD.ORG Fri Apr 11 09:09:55 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C9F2E37B401; Fri, 11 Apr 2003 09:09:55 -0700 (PDT) Received: from bluejay.mail.pas.earthlink.net (bluejay.mail.pas.earthlink.net [207.217.120.218]) by mx1.FreeBSD.org (Postfix) with ESMTP id 587A043FBD; Fri, 11 Apr 2003 09:09:50 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from pool0023.cvx40-bradley.dialup.earthlink.net ([216.244.42.23] helo=mindspring.com) by bluejay.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128) (Exim 3.33 #1) id 19415z-0007VY-00; Fri, 11 Apr 2003 09:09:44 -0700 Message-ID: <3E96E873.9CC19544@mindspring.com> Date: Fri, 11 Apr 2003 09:08:19 -0700 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Mattias Pantzare References: <20030410171640.C44793B2@porter.dc.luth.se> <3E95E446.73B7E510@mindspring.com> <3E95E8E9.3080102@ludd.luth.se> <3E95F03C.2A01561D@mindspring.com> <3E96CA1F.4070000@ludd.luth.se> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a45886caa052154494972ef95f061a06a7a2d4e88014a4647c350badd9bab72f9c350badd9bab72f9c cc: bj@dc.luth.se cc: freebsd-hackers@freebsd.org cc: freebsd-performance@freebsd.org cc: Eric Anderson cc: David Gilbert Subject: Re: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Apr 2003 16:09:56 -0000 Mattias Pantzare wrote: > Terry Lambert wrote: > > Latency = pool retention time = queue size > > Then explain this, FreeBSD to FreeBSD on that link uses all CPU on the > sender, the reciver is fine, but performance is not. NetBSD to FreeBSD > fills the link (1 Gbit/s). On the same computers. MTU 4470. Send and > receive maximum windows where tuned to the same values on NetBSD and > FreeBSD. I rather expect that the number of jumbogram buffers on FreeBSD is tiny and/or your MTU is not being properly negotiated between the endpoints, and you are fragging the bejesus out of your packets. A good thing to look at at this point would be: o Clean boot of FreeBSD target o Run NetBSD against it o Save statistics o Clean boot of FreeBSD target o Run FreeBSD against it o Save statistics o Compare saved statistics of NetBSD vs. FreeBSD against the target machine > And packet loss will affect the performance diffrently if you have a > large bandwith-latency product. You mean "bandwidth delay product". Yes, assuming you have packet loss. From your description of your setup, packet loss should not be possible, so we can discount it as a factor. You may want to disable fast restart on the FreeBSD sender. 
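[Concretely, the comparison could be captured with stock netstat(1); the filenames are examples:

netstat -s > netbsd.before        # on the freshly booted FreeBSD target
(drive the transfer from the NetBSD sender)
netstat -s > netbsd.after; netstat -m >> netbsd.after
(clean boot, repeat with the FreeBSD sender into freebsd.before/.after)
diff netbsd.after freebsd.after   # look for drops, frags, delayed-ack skew]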
-- Terry From owner-freebsd-performance@FreeBSD.ORG Fri Apr 11 09:24:16 2003 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3741937B401; Fri, 11 Apr 2003 09:24:16 -0700 (PDT) Received: from samson.dc.luth.se (samson.dc.luth.se [130.240.112.30]) by mx1.FreeBSD.org (Postfix) with ESMTP id 97D6443FB1; Fri, 11 Apr 2003 09:24:14 -0700 (PDT) (envelope-from bj@dc.luth.se) Received: from dc.luth.se (root@bompe.dc.luth.se [130.240.60.42]) by samson.dc.luth.se (8.12.5/8.12.5) with ESMTP id h3BGOAjY021255; Fri, 11 Apr 2003 18:24:10 +0200 (MET DST) Received: from bompe.dc.luth.se (bj@localhost.dc.luth.se [127.0.0.1]) by dc.luth.se (8.12.6/8.11.3) with ESMTP id h3BGO9Kl087165; Fri, 11 Apr 2003 18:24:09 +0200 (CEST) (envelope-from bj@bompe.dc.luth.se) Message-Id: <200304111624.h3BGO9Kl087165@dc.luth.se> X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4 To: Terry Lambert In-reply-to: Your message of Fri, 11 Apr 2003 09:08:19 PDT. <3E96E873.9CC19544@mindspring.com> Dcc: X-Disposition-notification-to: Borje.Josefsson@dc.luth.se X-uri: http://www.dc.luth.se/~bj/index.html Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable Date: Fri, 11 Apr 2003 18:24:09 +0200 From: Borje Josefsson cc: Mattias Pantzare cc: freebsd-hackers@freebsd.org cc: freebsd-performance@freebsd.org cc: Eric Anderson cc: David Gilbert Subject: Re: tcp_output starving -- is due to mbuf get delay? X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: bj@dc.luth.se List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Apr 2003 16:24:16 -0000 On Fri, 11 Apr 2003 09:08:19 PDT Terry Lambert wrote: > Mattias Pantzare wrote: > > Terry Lambert wrote: > > > Latency = pool retention time = queue size > > > > Then explain this, FreeBSD to FreeBSD on that link uses all CPU on the > > sender, the receiver is fine, but performance is not. NetBSD to FreeBSD > > fills the link (1 Gbit/s). On the same computers. MTU 4470. Send and > > receive maximum windows were tuned to the same values on NetBSD and > > FreeBSD. > > I rather expect that the number of jumbogram buffers on FreeBSD is > tiny and/or your MTU is not being properly negotiated between the > endpoints, and you are fragging the bejesus out of your packets. Both endpoints have MTU set to 4470, as have all the routers in between. "traceroute -n -Q 1 -q 1 -w 1 -f remotehost 4470" and netstat both report 4470. > A good thing to look at at this point would be: > > o Clean boot of FreeBSD target > o Run NetBSD against it > o Save statistics What type of statistics do You mean? > o Clean boot of FreeBSD target > o Run FreeBSD against it > o Save statistics > o Compare saved statistics of NetBSD vs. FreeBSD > against the target machine > > > And packet loss will affect the performance differently if you have a > > large bandwidth-latency product. > > You mean "bandwidth delay product". Yes, assuming you have packet > loss. From your description of your setup, packet loss should not > be possible, so we can discount it as a factor. Of course packet loss is possible on a nationwide network. If I lose a packet on the (expected) 10 second test (with NetBSD), recovering from that drops performance from 900+ to ~550 Mbps. This shows very clearly if I run "netstat 1".
> You may want to
> disable fast restart on the FreeBSD sender.

Which OID is that?

As a side note, I tried to set tcp.inflight_enable, but that made things
much worse.

--Börje

From owner-freebsd-performance@FreeBSD.ORG Fri Apr 11 09:24:19 2003
Message-ID: <3E96EBD7.5CA4C171@mindspring.com>
Date: Fri, 11 Apr 2003 09:22:47 -0700
From: Terry Lambert
To: bj@dc.luth.se
References: <200304111407.h3BE7hKl086838@dc.luth.se>
cc: Mattias Pantzare, freebsd-hackers@freebsd.org, freebsd-performance@freebsd.org, Eric Anderson, David Gilbert
Subject: Re: tcp_output starving -- is due to mbuf get delay?

Borje Josefsson wrote:
> I should add that I have tried with MTU 1500 also. Using NetBSD as sender
> works fine (just a little bit higher CPU load). When we tried MTU 1500 with
> FreeBSD as sender, we got even lower performance.
>
> Somebody else in this thread said that he had got full GE speed between
> two FreeBSD boxes connected back-to-back. I don't question that, but that
> doesn't prove anything. The problem arises when You are trying to do this
> long-distance and have to handle a large mbuf queue.

The boxes were not connected "back to back", they were connected
through three Gigabit switches and a VLAN trunk. But they were in
a lab, yes.

I'd be happy to try long distance for you, and even go so far as
to fix the problem for you, if you are willing to drop 10GBit fiber
to my house. 8-) 8-).

As far as a large mbuf queue, one thing that's an obvious difference
is SACK support; however, this cannot be the problem, since the
NetBSD->FreeBSD speed is unaffected (supposedly).

What is the FreeBSD->NetBSD speed?

Some knobs to try on FreeBSD:

net.inet.ip.intr_queue_maxlen -> 300
net.inet.ip.check_interface -> 0
net.inet.tcp.rfc1323 -> 0
net.inet.tcp.inflight_enable -> 1
net.inet.tcp.inflight_debug -> 0
net.inet.tcp.delayed_ack -> 0
net.inet.tcp.newreno -> 0
net.inet.tcp.slowstart_flightsize -> 4
net.inet.tcp.msl -> 1000
net.inet.tcp.always_keepalive -> 0
net.inet.tcp.sendspace -> 65536 (on sender)

Don't try them all at once and expect magic; you will probably need
some combination.

Also, try recompiling your kernel *without* IPSEC support.

-- Terry
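[These knobs are all plain sysctl OIDs, normally flipped with
"sysctl -w" between runs; a test harness can also toggle them
in-process through the documented sysctlbyname(3) interface. A minimal
sketch (editorial, not from the thread), assuming only that the OID is
an integer, as net.inet.tcp.delayed_ack is:]

    /*
     * Sketch: read and set one of the integer TCP knobs from a test
     * harness instead of sysctl(8).  Setting it requires root.
     */
    #include <sys/types.h>
    #include <sys/sysctl.h>
    #include <err.h>
    #include <stdio.h>

    int
    main(void)
    {
            int old, new = 0;               /* 0 = disable delayed ACKs */
            size_t oldlen = sizeof(old);

            if (sysctlbyname("net.inet.tcp.delayed_ack",
                &old, &oldlen, &new, sizeof(new)) == -1)
                    err(1, "sysctlbyname");
            printf("net.inet.tcp.delayed_ack: %d -> %d\n", old, new);
            return (0);
    }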
From owner-freebsd-performance@FreeBSD.ORG Fri Apr 11 09:34:27 2003
Message-ID: <3E96EE33.FAF4FABB@mindspring.com>
Date: Fri, 11 Apr 2003 09:32:51 -0700
From: Terry Lambert
To: bj@dc.luth.se
References: <200304111624.h3BGO9Kl087165@dc.luth.se>
cc: Mattias Pantzare, freebsd-hackers@freebsd.org, freebsd-performance@freebsd.org, Eric Anderson, David Gilbert
Subject: Re: tcp_output starving -- is due to mbuf get delay?

Borje Josefsson wrote:
> > A good thing to look at at this point would be:
> >
> > o Clean boot of FreeBSD target
> > o Run NetBSD against it
> > o Save statistics
>
> What type of statistics do You mean?

Dropped packets; frags; delayed acks. The stuff you get from
"netstat -s" and "netstat -m".

> > You mean "bandwidth delay product". Yes, assuming you have packet
> > loss. From your description of your setup, packet loss should not
> > be possible, so we can discount it as a factor.
>
> Of course packet loss is possible on a nationwide network. If I lose a
> packet on the (expected) 10 second test (with NetBSD), recovering from
> that drops performance from 900+ to ~550 Mbps. This shows very clearly if
> I run "netstat 1".

You are running these tests over .se's nationwide network?

> > You may want to
> > disable fast restart on the FreeBSD sender.
>
> Which OID is that?

See other posting; I listed a bunch of OIDs to play with.

One other, if you are running -CURRENT, would be:

net.isr.netisr_enable -> 1

This basically implements part 1 of 3 of LRP, which should reduce
your per-packet latency by about 50ms +/- 50ms. Note: the logic
here is inverted; you'd expect "0=No NETISR", but it's just the
opposite.

> As a side note, I tried to set tcp.inflight_enable, but that made things
> much worse.

It's less useful on GigE than elsewhere (IMO). Use netisr_enable
instead.

-- Terry
From owner-freebsd-performance@FreeBSD.ORG Fri Apr 11 09:35:06 2003
Message-Id: <200304111635.h3BGZ0Kl087202@dc.luth.se>
Date: Fri, 11 Apr 2003 18:35:00 +0200
From: Borje Josefsson
To: Terry Lambert
In-reply-to: Your message of Fri, 11 Apr 2003 09:22:47 PDT. <3E96EBD7.5CA4C171@mindspring.com>
cc: Mattias Pantzare, freebsd-hackers@freebsd.org, freebsd-performance@freebsd.org, Eric Anderson, David Gilbert
Subject: Re: tcp_output starving -- is due to mbuf get delay?

On Fri, 11 Apr 2003 09:22:47 PDT Terry Lambert wrote:

> Borje Josefsson wrote:
> > I should add that I have tried with MTU 1500 also. Using NetBSD as sender
> > works fine (just a little bit higher CPU load). When we tried MTU 1500 with
> > FreeBSD as sender, we got even lower performance.
> >
> > Somebody else in this thread said that he had got full GE speed between
> > two FreeBSD boxes connected back-to-back. I don't question that, but that
> > doesn't prove anything. The problem arises when You are trying to do this
> > long-distance and have to handle a large mbuf queue.
>
> The boxes were not connected "back to back", they were connected
> through three Gigabit switches and a VLAN trunk. But they were in
> a lab, yes.
>
> I'd be happy to try long distance for you, and even go so far as
> to fix the problem for you, if you are willing to drop 10GBit fiber
> to my house. 8-) 8-).
>
> As far as a large mbuf queue, one thing that's an obvious difference
> is SACK support; however, this cannot be the problem, since the
> NetBSD->FreeBSD speed is unaffected (supposedly).
>
> What is the FreeBSD->NetBSD speed?
> Some knobs to try on FreeBSD:

ttcp-t: buflen=61440, nbuf=20345, align=16384/0, port=5001
ttcp-t: socket
ttcp-t: connect
ttcp-t: 1249996800 bytes in 16.82 real seconds = 567.09 Mbit/sec +++
ttcp-t: 20345 I/O calls, msec/call = 0.85, calls/sec = 1209.79
ttcp-t: 0.0user 15.5sys 0:16real 92% 16i+380d 326maxrss 0+15pf 16+232csw

During that time "top" shows (on the sender):

CPU states:  0.4% user,  0.0% nice, 93.0% system,  6.6% interrupt,  0.0% idle

Just for comparison, running in the other direction (with NetBSD as sender):

CPU states:  0.0% user,  0.0% nice, 19.9% system, 13.9% interrupt, 66.2% idle

Netstat of that test:

            bge0 in        bge0 out       total in       total out
   packets  errs  packets  errs  colls  packets  errs  packets  errs  colls
     14709     0    22742     0      0    14709     0    22742     0      0
     18602     0    28002     0      0    18602     0    28002     0      0
     18603     0    28006     0      0    18603     0    28006     0      0
     18599     0    28009     0      0    18600     0    28009     0      0
     18605     0    28006     0      0    18605     0    28006     0      0
     18607     0    28006     0      0    18608     0    28006     0      0
     18608     0    28006     0      0    18608     0    28006     0      0
            bge0 in        bge0 out       total in       total out
   packets  errs  packets  errs  colls  packets  errs  packets  errs  colls
  14389511   908 14772089     0      0 18404167   908 18231823     0      0
     18607     0    28003     0      0    18607     0    28003     0      0
     18598     0    28006     0      0    18599     0    28006     0      0
      5823     0     8161     0      0     5823     0     8161     0      0

Note how stable it is!

> net.inet.ip.intr_queue_maxlen -> 300
> net.inet.ip.check_interface -> 0
> net.inet.tcp.rfc1323 -> 0
> net.inet.tcp.inflight_enable -> 1
> net.inet.tcp.inflight_debug -> 0
> net.inet.tcp.delayed_ack -> 0
> net.inet.tcp.newreno -> 0
> net.inet.tcp.slowstart_flightsize -> 4
> net.inet.tcp.msl -> 1000
> net.inet.tcp.always_keepalive -> 0
> net.inet.tcp.sendspace -> 65536 (on sender)

Hmm. Isn't 65536 an order of magnitude too low? I used
tcp.sendspace=3223281 for the test above. RTTmax = 20.63 ms. Buffer size
needed = 2578625 bytes, and then add a few %.

> Don't try them all at once and expect magic; you will probably need
> some combination.
>
> Also, try recompiling your kernel *without* IPSEC support.

I'll do that.

--Börje
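[Borje's buffer sizing here is exactly the bandwidth-delay product Terry
named earlier. As an editorial cross-check of the arithmetic (not from
the thread):]

    /*
     * Bandwidth-delay product for a 1 Gbit/s path with a 20.63 ms RTT:
     * the number of bytes that must be in flight to keep the pipe full.
     * Adding "a few %" of headroom lands in the region of Borje's
     * tcp.sendspace=3223281.
     */
    #include <stdio.h>

    int
    main(void)
    {
            double bits_per_sec = 1e9;      /* GigE line rate */
            double rtt = 0.02063;           /* RTTmax, in seconds */
            double bdp = bits_per_sec * rtt / 8.0;

            printf("BDP = %.0f bytes\n", bdp);      /* ~2.58 MB */
            return (0);
    }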
From owner-freebsd-performance@FreeBSD.ORG Fri Apr 11 09:43:03 2003
Message-Id: <200304111642.h3BGgwKl087226@dc.luth.se>
Date: Fri, 11 Apr 2003 18:42:57 +0200
From: Borje Josefsson
To: Terry Lambert
In-reply-to: Your message of Fri, 11 Apr 2003 09:32:51 PDT. <3E96EE33.FAF4FABB@mindspring.com>
cc: Mattias Pantzare, freebsd-hackers@freebsd.org, freebsd-performance@freebsd.org, Eric Anderson, David Gilbert
Subject: Re: tcp_output starving -- is due to mbuf get delay?

On Fri, 11 Apr 2003 09:32:51 PDT Terry Lambert wrote:

> Borje Josefsson wrote:
> > > A good thing to look at at this point would be:
> > >
> > > o Clean boot of FreeBSD target
> > > o Run NetBSD against it
> > > o Save statistics
> >
> > What type of statistics do You mean?
>
> Dropped packets; frags; delayed acks. The stuff you get from
> "netstat -s" and "netstat -m".
>
> > > You mean "bandwidth delay product". Yes, assuming you have packet
> > > loss. From your description of your setup, packet loss should not
> > > be possible, so we can discount it as a factor.
> >
> > Of course packet loss is possible on a nationwide network. If I lose a
> > packet on the (expected) 10 second test (with NetBSD), recovering from
> > that drops performance from 900+ to ~550 Mbps. This shows very clearly if
> > I run "netstat 1".
>
> You are running these tests over .se's nationwide network?

One of them. I'm using GigaSunet, the nationwide network for the
universities in Sweden: 10 Gbps to 22 cities, connecting 35 universities
with 2.5G (redundant). This is not a "research network" per se, it's the
"production" network for the universities. If I succeed with this, the
next challenge (no kidding) is to try the same thing over the commodity
Internet from here to California (or something similar), showing that You
don't need a "private" network for high-speed communication if You do
things right. After that I've thought of testing against Japan or New
Zealand, which is about as far You can get from here :-)

--Börje
From owner-freebsd-performance@FreeBSD.ORG Fri Apr 11 10:09:29 2003
Message-Id: <200304111709.h3BH9LKl087299@dc.luth.se>
Date: Fri, 11 Apr 2003 19:09:21 +0200
From: Borje Josefsson
To: Terry Lambert
In-reply-to: Your message of Fri, 11 Apr 2003 09:32:51 PDT. <3E96EE33.FAF4FABB@mindspring.com>
cc: Mattias Pantzare, freebsd-hackers@freebsd.org, freebsd-performance@freebsd.org, Eric Anderson, David Gilbert
Subject: Re: tcp_output starving -- is due to mbuf get delay?

On Fri, 11 Apr 2003 09:32:51 PDT Terry Lambert wrote:

> See other posting; I listed a bunch of OIDs to play with.
>
> One other, if you are running -CURRENT, would be:
>
> net.isr.netisr_enable -> 1

I'm running 4.8RC on the sender right now.

I did a quick test with some combination of the OIDs You sent, except I
didn't reboot between each test:

> Some knobs to try on FreeBSD:
>
> net.inet.ip.intr_queue_maxlen -> 300
> net.inet.ip.check_interface -> 0
> net.inet.tcp.rfc1323 -> 0
> net.inet.tcp.inflight_enable -> 1
> net.inet.tcp.inflight_debug -> 0
> net.inet.tcp.delayed_ack -> 0
> net.inet.tcp.newreno -> 0
> net.inet.tcp.slowstart_flightsize -> 4
> net.inet.tcp.msl -> 1000
> net.inet.tcp.always_keepalive -> 0
> net.inet.tcp.sendspace -> 65536 (on sender)

I didn't test all combinations, but all of them give the same result
(~570 Mbit/sec, and I run out of CPU), except inflight_enable, which
*considerably* lowers the performance.

Netstat -s (tcp and ip portion) when I started and after the trials:

=== before ===
tcp:
        12103350 packets sent
                2692690 data packets (1962127658 bytes)
                10203 data packets (14829100 bytes) retransmitted
                0 resends initiated by MTU discovery
                6446084 ack-only packets (199 delayed)
                0 URG only packets
                0 window probe packets
                2955524 window update packets
                216 control packets
        16306846 packets received
                847040 acks (for 1962119767 bytes)
                14399 duplicate acks
                0 acks for unsent data
                15281399 packets (2057886406 bytes) received in-sequence
                4425 completely duplicate packets (4551322 bytes)
                0 old duplicate packets
                65 packets with some dup. data (14957 bytes duped)
                38818 out-of-order packets (59514384 bytes)
                0 packets (0 bytes) of data after window
                0 window probes
                120161 window update packets
                3 packets received after close
                4 discarded for bad checksums
                0 discarded for bad header offset fields
                0 discarded because packet too short
        160 connection requests
        49 connection accepts
        0 bad connection attempts
        0 listen queue overflows
        63 connections established (including accepts)
        383 connections closed (including 5 drops)
                10 connections updated cached RTT on close
                10 connections updated cached RTT variance on close
                2 connections updated cached ssthresh on close
        146 embryonic connections dropped
        846952 segments updated rtt (of 521492 attempts)
        36 retransmit timeouts
                2 connections dropped by rexmit timeout
        0 persist timeouts
                0 connections dropped by persist timeout
        438 keepalive timeouts
                438 keepalive probes sent
                0 connections dropped by keepalive
        26449 correct ACK header predictions
        15280149 correct data packet header predictions
        49 syncache entries added
                0 retransmitted
                0 dupsyn
                0 dropped
                49 completed
                0 bucket overflow
                0 cache overflow
                0 reset
                0 stale
                0 aborted
                0 badack
                0 unreach
                0 zone failures
        0 cookies sent
        0 cookies received
ip:
        16393480 total packets received
        0 bad header checksums
        0 with size smaller than minimum
        0 with data size < data length
        0 with ip length > max ip packet size
        0 with header length < data size
        0 with data length < header length
        0 with bad options
        0 with incorrect version number
        1 fragment received
        0 fragments dropped (dup or out of space)
        1 fragment dropped after timeout
        0 packets reassembled ok
        16393029 packets for this host
        450 packets for unknown/unsupported protocol
        0 packets forwarded (0 packets fast forwarded)
        0 packets not forwardable
        0 packets received for unknown multicast group
        0 redirects sent
        12201936 packets sent from this host
        115 packets sent with fabricated ip header
        1367 output packets dropped due to no bufs, etc.
        0 output packets discarded due to no route
        0 output datagrams fragmented
        0 fragments created
        0 datagrams that can't be fragmented
        0 tunneling packets that can't find gif
        0 datagrams with bad address in header

=== after ===
tcp:
        13331442 packets sent
                3920701 data packets (2694929130 bytes)
                10203 data packets (14829100 bytes) retransmitted
                0 resends initiated by MTU discovery
                6446155 ack-only packets (207 delayed)
                0 URG only packets
                0 window probe packets
                2955524 window update packets
                226 control packets
        17131926 packets received
                1526734 acks (for 2694921244 bytes)
                14404 duplicate acks
                0 acks for unsent data
                15281806 packets (2057905910 bytes) received in-sequence
                4425 completely duplicate packets (4551322 bytes)
                0 old duplicate packets
                65 packets with some dup. data (14957 bytes duped)
                38818 out-of-order packets (59514384 bytes)
                0 packets (0 bytes) of data after window
                0 window probes
                265251 window update packets
                3 packets received after close
                4 discarded for bad checksums
                0 discarded for bad header offset fields
                0 discarded because packet too short
        165 connection requests
        49 connection accepts
        0 bad connection attempts
        0 listen queue overflows
        68 connections established (including accepts)
        388 connections closed (including 5 drops)
                15 connections updated cached RTT on close
                15 connections updated cached RTT variance on close
                2 connections updated cached ssthresh on close
        146 embryonic connections dropped
        1526646 segments updated rtt (of 793534 attempts)
        36 retransmit timeouts
                2 connections dropped by rexmit timeout
        0 persist timeouts
                0 connections dropped by persist timeout
        438 keepalive timeouts
                438 keepalive probes sent
                0 connections dropped by keepalive
        79056 correct ACK header predictions
        15280431 correct data packet header predictions
        49 syncache entries added
                0 retransmitted
                0 dupsyn
                0 dropped
                49 completed
                0 bucket overflow
                0 cache overflow
                0 reset
                0 stale
                0 aborted
                0 badack
                0 unreach
                0 zone failures
        0 cookies sent
        0 cookies received
ip:
        17218567 total packets received
        0 bad header checksums
        0 with size smaller than minimum
        0 with data size < data length
        0 with ip length > max ip packet size
        0 with header length < data size
        0 with data length < header length
        0 with bad options
        0 with incorrect version number
        1 fragment received
        0 fragments dropped (dup or out of space)
        1 fragment dropped after timeout
        0 packets reassembled ok
        17218116 packets for this host
        450 packets for unknown/unsupported protocol
        0 packets forwarded (0 packets fast forwarded)
        0 packets not forwardable
        0 packets received for unknown multicast group
        0 redirects sent
        13430039 packets sent from this host
        115 packets sent with fabricated ip header
        1367 output packets dropped due to no bufs, etc.
        0 output packets discarded due to no route
        0 output datagrams fragmented
        0 fragments created
        0 datagrams that can't be fragmented
        0 tunneling packets that can't find gif
        0 datagrams with bad address in header

From owner-freebsd-performance@FreeBSD.ORG Fri Apr 11 15:09:26 2003
Message-ID: <3E973CBF.FB552960@mindspring.com>
Date: Fri, 11 Apr 2003 15:07:59 -0700
From: Terry Lambert
To: bj@dc.luth.se
References: <200304111709.h3BH9LKl087299@dc.luth.se>
cc: Mattias Pantzare, freebsd-hackers@freebsd.org, freebsd-performance@freebsd.org, Eric Anderson, David Gilbert
Subject: Re: tcp_output starving -- is due to mbuf get delay?

Borje Josefsson wrote:
> On Fri, 11 Apr 2003 09:32:51 PDT Terry Lambert wrote:
> > See other posting; I listed a bunch of OIDs to play with.
> >
> > One other, if you are running -CURRENT, would be:
> >
> > net.isr.netisr_enable -> 1
>
> I'm running 4.8RC on the sender right now.

You might want to try 4.3 or 4.4, as well.

The netisr_enable=1 is a -CURRENT only feature. It'd be worthwhile
to test, since it deals with protocol processing latency; if you
get receiver livelocked, then it will fix *some* instances of that.

> I did a quick test with some combination of the OIDs You sent, except I
> didn't reboot between each test:

The reboots were intended to keep the statistics counters relatively
accurate between the FreeBSD and NetBSD sender runs. By doing that,
you can tell if what's happening on the receiver is the same for
both sender machines. If you don't reboot, then the statistics are
polluted with other traffic, and can't be compared.

You should also start clean on each sender, and get the same stats
on the sender. I would add "vmstat -i", to look at interrupt overhead.

Note that FreeBSD jumbograms are in external mbufs allocated to the
cards on receive. On transmit, they are scatter/gathered. NetBSD
might not have this overhead. The copy overhead there could account
for a lot of CPU time.

> Netstat -s (tcp and ip portion) when I started and after the trials:

Side-by-side/interleaved is more useful.
I will do it manually for
the ones that change; if we continue this discussion, you get to do
the work in the future (b=before, A=after) -- notice how much more
useful this is, and how much more useful it would be, if the "before"
values were all "0" because you had rebooted...:

b> tcp:
b>      12103350 packets sent
A>      13331442 packets sent
        1228092
b>              2692690 data packets (1962127658 bytes)
A>              3920701 data packets (2694929130 bytes)
                1228011 732801472
b>              10203 data packets (14829100 bytes) retransmitted
A>              10203 data packets (14829100 bytes) retransmitted

No retransmits; this is good. Be nicer to see on the transmitters,
though...

b>              0 resends initiated by MTU discovery
b>              6446084 ack-only packets (199 delayed)
A>              6446155 ack-only packets (207 delayed)
                71 8

This is odd. You must be sending data in both directions. Thus
the lower bandwidth could be the result of negotiated options; you
may want to try turning _on_ rfc1644.

The delayed ACKs are bad. Can you either set "PUSH" on the socket,
or turn off delayed ACK entirely?

b>              2955524 window update packets
A>              2955524 window update packets

This is strange. I would expect at least 1 update packet.

b>              216 control packets
A>              226 control packets
                10

Be nice to know what these are, and whether NetBSD and FreeBSD end
up with the same number.

b>      16306846 packets received
A>      17131926 packets received
        825080

...403012 more sent than received.

b>              847040 acks (for 1962119767 bytes)
A>              1526734 acks (for 2694921244 bytes)
                679694 732801477

...5 more bytes sent than received. Seems odd, as well.

b>              14399 duplicate acks
A>              14404 duplicate acks
                5

...Until we see this. The duplicate ACKs indicate either a timeout,
or an unexpected retransmission. In either case, this is a potential
cause of a pipeline stall.

b>              0 acks for unsent data
b>              15281399 packets (2057886406 bytes) received in-sequence
A>              15281806 packets (2057905910 bytes) received in-sequence
                407 20504

...ie: most of the data was received out of sequence. This may
indicate that most of the time was being spent in stream reassembly.

b>              4425 completely duplicate packets (4551322 bytes)
A>              4425 completely duplicate packets (4551322 bytes)

...gotta wonder how this is, with 5 duplicate ACKs...

b>              0 old duplicate packets
b>              65 packets with some dup. data (14957 bytes duped)
A>              65 packets with some dup. data (14957 bytes duped)
b>              38818 out-of-order packets (59514384 bytes)
A>              38818 out-of-order packets (59514384 bytes)

This doesn't jibe with the in-sequence numbers, above.

b>              0 packets (0 bytes) of data after window
b>              0 window probes
b>              120161 window update packets
A>              265251 window update packets
                145090

...That's a lot of window updates. Notice that there are never any
transmit window updates, only receive window updates. This is odd.

b>              3 packets received after close
A>              3 packets received after close
b>              4 discarded for bad checksums
A>              4 discarded for bad checksums
b>              0 discarded for bad header offset fields
b>              0 discarded because packet too short
b>      160 connection requests
A>      165 connection requests
        5

You should account for these.
b>      49 connection accepts
A>      49 connection accepts
b>      0 bad connection attempts
b>      0 listen queue overflows
b>      63 connections established (including accepts)
A>      68 connections established (including accepts)
        5

""

b>      383 connections closed (including 5 drops)
A>      388 connections closed (including 5 drops)
        5

""

b>              10 connections updated cached RTT on close
A>              15 connections updated cached RTT on close
                5
b>              10 connections updated cached RTT variance on close
A>              15 connections updated cached RTT variance on close
                5
b>              2 connections updated cached ssthresh on close
A>              2 connections updated cached ssthresh on close
b>      146 embryonic connections dropped
A>      146 embryonic connections dropped
b>      846952 segments updated rtt (of 521492 attempts)
A>      1526646 segments updated rtt (of 793534 attempts)
        679694 272042
b>      36 retransmit timeouts
A>      36 retransmit timeouts
b>              2 connections dropped by rexmit timeout
A>              2 connections dropped by rexmit timeout
b>      0 persist timeouts
b>              0 connections dropped by persist timeout
b>      438 keepalive timeouts
A>      438 keepalive timeouts
b>              438 keepalive probes sent
A>              438 keepalive probes sent
b>              0 connections dropped by keepalive
b>      26449 correct ACK header predictions
A>      79056 correct ACK header predictions
        52607

Be nice to know why so many were incorrect, but it's not important
for what you are seeing...

b>      15280149 correct data packet header predictions
A>      15280431 correct data packet header predictions
        282

""

b>      49 syncache entries added
A>      49 syncache entries added
b>              0 retransmitted
b>              0 dupsyn
b>              0 dropped
b>              49 completed
A>              49 completed
b>              0 bucket overflow
b>              0 cache overflow
b>              0 reset
b>              0 stale
b>              0 aborted
b>              0 badack
b>              0 unreach
b>              0 zone failures
b>      0 cookies sent
b>      0 cookies received
b>
b> ip:
b>      16393480 total packets received
A>      17218567 total packets received
        825087

Good cross-check on the TCP numbers. Notice this number: 7 larger.

b>      0 bad header checksums
b>      0 with size smaller than minimum
b>      0 with data size < data length
b>      0 with ip length > max ip packet size
b>      0 with header length < data size
b>      0 with data length < header length
b>      0 with bad options
b>      0 with incorrect version number
b>      1 fragment received
A>      1 fragment received
b>      0 fragments dropped (dup or out of space)
b>      1 fragment dropped after timeout
A>      1 fragment dropped after timeout
b>      0 packets reassembled ok
b>      16393029 packets for this host
A>      17218116 packets for this host
        825087

""

b>      450 packets for unknown/unsupported protocol
A>      450 packets for unknown/unsupported protocol
b>      0 packets forwarded (0 packets fast forwarded)
b>      0 packets not forwardable
b>      0 packets received for unknown multicast group
b>      0 redirects sent
b>      12201936 packets sent from this host
A>      13430039 packets sent from this host
        1228103

92 larger, this time.

b>      115 packets sent with fabricated ip header
A>      115 packets sent with fabricated ip header
b>      1367 output packets dropped due to no bufs, etc.
A>      1367 output packets dropped due to no bufs, etc.
b>      0 output packets discarded due to no route
b>      0 output datagrams fragmented
b>      0 fragments created
b>      0 datagrams that can't be fragmented
b>      0 tunneling packets that can't find gif
b>      0 datagrams with bad address in header

All in all, there's not a lot of weird stuff going on; now you need
to look at the NetBSD vs. the FreeBSD transmitters, in a similar
way, get the deltas for both, and then compare them to each other.

A really important thing to look at is the "vmstat -i" I asked for
earlier, in order to get interrupt counts on the transmitter.
Most likely, there is a driver difference causing the "problem"; you
should be able to see this in a differential for the transmit
interrupt overhead being higher on the FreeBSD box.

It would also be very interesting to compare the netstat numbers
between the transmitters, as suggested above; the numbers should
tell you about differences in implementation on the driver side.

-- Terry

From owner-freebsd-performance@FreeBSD.ORG Sat Apr 12 03:37:30 2003
Message-Id: <20030412103725.8A6963B3@porter.dc.luth.se>
Date: Sat, 12 Apr 2003 12:37:25 +0200
From: Borje Josefsson
To: Terry Lambert
In-reply-to: Your message of Fri, 11 Apr 2003 15:07:59 PDT. <3E973CBF.FB552960@mindspring.com>
cc: Mattias Pantzare, freebsd-hackers@freebsd.org, freebsd-performance@freebsd.org, Eric Anderson, David Gilbert, Anders Ragge Magnusson
Subject: Re: tcp_output starving -- is due to mbuf get delay?

On Fri, 11 Apr 2003 15:07:59 PDT Terry Lambert wrote:

> Borje Josefsson wrote:
> > I did a quick test with some combination of the OIDs You sent, except I
> > didn't reboot between each test:
>
> The reboots were intended to keep the statistics counters relatively
> accurate between the FreeBSD and NetBSD sender runs. By doing that,
> you can tell if what's happening on the receiver is the same for
> both sender machines. If you don't reboot, then the statistics are
> polluted with other traffic, and can't be compared.

OK. I found out the -z flag to netstat, that clears the counters.
Unfortunately NetBSD lacks this flag, so I rebooted that host several
times :-( Just to be sure I didn't have anything "old" lying around, I
rebooted the FreeBSD host before I started.

> You should also start clean on each sender, and get the same stats
> on the sender. I would add "vmstat -i", to look at interrupt overhead.
>
> Note that FreeBSD jumbograms are in external mbufs allocated to the
> cards on receive. On transmit, they are scatter/gathered. NetBSD
> might not have this overhead. The copy overhead there could account
> for a lot of CPU time.

I think NetBSD-current claims to do zero-copy transfers. I added Anders
Magnusson to the CC: of this mail, he knows very much about NetBSD
networking internals. He surely can fill in some more details on this.

> > Netstat -s (tcp and ip portion) when I started and after the trials:
>
> Side-by-side/interleaved is more useful.
> I will do it manually for
> the ones that change; if we continue this discussion, you get to do
> the work in the future (b=before, A=after)

> b>              0 resends initiated by MTU discovery
> b>              6446084 ack-only packets (199 delayed)
> A>              6446155 ack-only packets (207 delayed)
>                 71 8
>
> This is odd. You must be sending data in both directions. Thus
> the lower bandwidth could be the result of negotiated options; you
> may want to try turning _on_ rfc1644.

Did that. No difference in performance.

> The delayed ACKs are bad. Can you either set "PUSH" on the socket,
> or turn off delayed ACK entirely?

Did that (tcp.delayed_ack=0). No apparent difference.

> All in all, there's not a lot of weird stuff going on; now you need
> to look at the NetBSD vs. the FreeBSD transmitters, in a similar
> way, get the deltas for both, and then compare them to each other.
>
> A really important thing to look at is the "vmstat -i" I asked for
> earlier, in order to get interrupt counts on the transmitter. Most
> likely, there is a driver difference causing the "problem"; you
> should be able to see this in a differential for the transmit
> interrupt overhead being higher on the FreeBSD box.
>
> It would also be very interesting to compare the netstat numbers
> between the transmitters, as suggested above; the numbers should
> tell you about differences in implementation on the driver side.

OK, here goes, as a first attempt to match sender and receiver data.
Apologies for the long lines - I have tried to "match" appropriate
sender and receiver lines below. *Note* that there are no "before" and
"after" in the netstat figures; these are net values accumulated during
the test. In some cases there might be some odd packets that don't have
to do with my ttcp test (since I access the hosts remotely), but I ran
everything from a shell script to file, so the difference should be
minor.

I'll await comments on the data below before doing something more.

--Börje

sender=FreeBSD                             receiver=NetBSD

***tcp:
305178 packets sent                        305179 received
  305175 data packets (1249996800 bytes)     305176 packets (1249996800 bytes) in seq.
  0 data packets (0 bytes) retransmitted
  0 resends initiated by MTU discovery
  1 ack-only packet (0 delayed)
  0 URG only packets
  0 window probe packets
  0 window update packets                    0 window update packets received
  2 control packets
205911 packets received                    206052 sent
  168215 acks (for 1249148976 bytes)         136850 ack-only packets (168328 delayed) sent
  0 duplicate acks
  0 acks for unsent data
  0 packets (0 bytes) received in-seq
  0 completely duplicate packets
  0 old duplicate packets
  0 packets with some dup. data
  0 out-of-order packets (0 bytes)
  0 packets of data after window
  0 window probes
  37696 window update packets                69201 window update packets sent
  0 packets received after close
  0 discarded for bad checksums
  0 discarded for bad header offset f.
  0 discarded because packet too short
168215 segments updated rtt (of 59609)     1 segments updated rtt (of 1 attempts)
9795 correct ACK header predictions        1 correct ACK header predictions
0 correct data packet header predict.      305175 correct data packet header predict.
***ip:
205915 total packets received              206052 packets sent from this host
305179 packets sent from this host         305185 packets for this host

vmstat -i on *sender*
                   === before ===          === after ===
interrupt           total    rate           total    rate
ata0 irq14              4       0               4       0
bge1 irq7              48       0              48       0
mux irq11          372597     325          459967     396
mux irq10              15       0              15       0
fdc0 irq6               2       0               2       0
atkbd0 irq1             1       0               1       0
clk irq0           114364      99          115893      99
rtc irq8           146388     127          148346     127
Total              633419     553          724276     624

vmstat -i on *receiver*
                   === before ===          === after ===
interrupt           total    rate           total    rate
cpu0 softclock      16738      99           18687      99
cpu0 softnet          170       1           89848     480
cpu0 softserial         1       0               1       0
pic0 pin 11           264       1           90106     481
pic0 pin 14          1528       9            1564       8
pic0 pin 3              1       0               1       0
pic0 pin 0          16910     100           18831     100
Total               35612     211          219038    1171

==================================================================

sender=NetBSD                              receiver=FreeBSD

*** tcp:
282935 packets sent                        282936 packets received
  282933 data packets (1249996800 bytes)     282933 packets (1249996800 bytes) received in-sequence
  0 data packets (0 bytes) retransmitted
  1 ack-only packets (32 delayed)            2 acks (for 49 bytes) received
  0 window probe packets
  0 window update packet
  1 control packet
  0 send attempts resulted in self-quench
187507 packets received                    187947 packets sent
  187131 acks (for 1247077744 bytes)         95364 ack-only packets (0 delayed) sent
  0 duplicate acks
  0 acks for unsent data
  0 packets (0 bytes) received in-sequence
  0 completely duplicate packets (0 bytes)
  0 old duplicate packets
  0 packets with some dup. data
  0 out-of-order packets (0 bytes)
  0 packets (0 bytes) of data after window
  0 window probes
  374 window update packets                  92582 window update packets sent
  0 packets received after close
  0 discarded for bad checksums
  0 discarded for bad header offset fields
  0 discarded because packet too short
1 connection request
0 connection accept
1 connections established (incl. accepts)  1 connection established (including accepts)
0 connection closed (including 0 drops)
0 embryonic connections dropped
182455 segments updated rtt (of 78677)     2 segments updated rtt (of 1 attempt)
0 retransmit timeouts
  0 connections dropped by rexmit timeout
0 persist timeouts
0 keepalive timeouts
  0 keepalive probes sent
  0 connections dropped by keepalive
14 correct ACK header predictions          1 correct ACK header prediction
0 correct data packet header pred.         282931 correct data packet header predictions
0 PCB hash misses
0 dropped due to no socket
0 connections drained due to memory shortage
0 PMTUD blackholes detected
0 bad connection attempts
0 SYN cache entries added
  0 hash collisions
  0 completed
  0 aborted (no space to build PCB)
  0 timed out
  0 dropped due to overflow
  0 dropped due to bucket overflow
  0 dropped due to RST
  0 dropped due to ICMP unreachable
0 SYN,ACKs retransmitted
0 duplicate SYNs received for entries already in the cache
0 SYNs dropped (no route or no space)

*** ip:
187503 total packets received
0 bad header checksums                     0 bad header checksums
0 with size smaller than minimum           0 with size smaller than minimum
0 with data size < data length             0 with data size < data length
0 with length > max ip packet size         0 with ip length > max ip packet size
0 with header length < data size           0 with header length < data size
0 with data length < header length         0 with data length < header length
0 with bad options                         0 with bad options
0 with incorrect version number            0 with incorrect version number
0 fragments received                       0 fragments received
0 fragments dropped (dup or out of space)  0 fragments dropped (dup or out of space)
0 malformed fragments dropped
0 fragments dropped after timeout          0 fragments dropped after timeout
0 packets reassembled ok                   0 packets reassembled ok
187503 packets for this host               187947 packets sent from this host
0 packets for unknown/unsupported protocol
0 packets forwarded
0 packets not forwardable
0 redirects sent
282936 packets sent from this host         282936 total packets received
0 packets sent with fabricated ip header
0 output packets dropped due to no bufs, etc.
0 output packets discarded due to no route
0 output datagrams fragmented              0 output datagrams fragmented
0 fragments created                        0 fragments created
0 datagrams that can't be fragmented       0 datagrams that can't be fragmented
0 datagrams with bad address in header     0 datagrams with bad address in header

vmstat -i on *sender*
                   === before ===          === after ===
interrupt           total    rate           total    rate
cpu0 softclock       4737      98            5777      99
cpu0 softnet           79       1           41426     714
cpu0 softserial         1       0               1       0
pic0 pin 11           146       3           41777     720
pic0 pin 14          1516      31            1537      26
pic0 pin 3              1       0               1       0
pic0 pin 0           4905     102            5928     102
Total               11385     237           96447    1662

vmstat -i on *receiver*
                   === before ===          === after ===
interrupt           total    rate           total    rate
ata0 irq14              4       0               4       0
bge1 irq7              48       0              48       0
mux irq11          744037     564         1027879     771
mux irq10              15       0              15       0
fdc0 irq6               2       0               2       0
atkbd0 irq1             1       0               1       0
clk irq0           131831     100          133175      99
rtc irq8           168746     128          170467     127
Total             1044684     792         1331591     999
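[One way to read these tables against Terry's driver hypothesis is
packets per transmit interrupt. A rough editorial back-of-the-envelope,
not from the thread; it assumes "mux irq11" on the FreeBSD box and
"pic0 pin 11" on the NetBSD box are the GigE NICs, which is only a
guess from the rate deltas:]

    /*
     * Packets handled per NIC interrupt during the two sender runs,
     * from the netstat totals and vmstat -i deltas above.  The choice
     * of interrupt lines is an assumption, not stated in the thread.
     */
    #include <stdio.h>

    int
    main(void)
    {
            long fbsd_pkts = 305178 + 205911;       /* sent + received */
            long fbsd_intr = 459967 - 372597;       /* mux irq11, after - before */
            long nbsd_pkts = 282935 + 187507;
            long nbsd_intr = 41777 - 146;           /* pic0 pin 11 */

            printf("FreeBSD sender: %.1f packets/interrupt\n",
                (double)fbsd_pkts / fbsd_intr);     /* ~5.9 */
            printf("NetBSD sender:  %.1f packets/interrupt\n",
                (double)nbsd_pkts / nbsd_intr);     /* ~11.3 */
            return (0);
    }

[If the guess about the interrupt lines is right, the FreeBSD sender is
taking roughly twice as many interrupts per packet, which fits the
CPU-bound ttcp numbers earlier in the thread.]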
From owner-freebsd-performance@FreeBSD.ORG Sat Apr 12 12:58:16 2003
Message-ID: <20030412195711.GA30459@unixdaemons.com>
Date: Sat, 12 Apr 2003 15:57:11 -0400
From: Hiten Pandya
To: Mike Silbersack
References: <200304101311.h3ADBgjY022790@samson.dc.luth.se> <20030410114227.A472@odysseus.silby.com>
In-Reply-To: <20030410114227.A472@odysseus.silby.com>
cc: Borje Josefsson, freebsd-performance@freebsd.org, Eric Anderson, David Gilbert, freebsd-hackers@freebsd.org
Subject: Re: tcp_output starving -- is due to mbuf get delay?

[ cross post to hackers removed, thanks ]

Mike Silbersack (Thu, Apr 10, 2003 at 11:44:42AM -0500) wrote:
>
> On Thu, 10 Apr 2003, Borje Josefsson wrote:
>
> > What we did in NetBSD (-current) was to increase IFQ_MAXLEN in (their)
> > sys/net/if.h, apart from that it's only "traditional" TCP tuning.
> >
> > My hosts are connected directly to core routers in a 10Gbps nationwide
> > network, so if anybody is interested in some testing I am more than
> > willing to participate. If anybody produces a patch, I have a third system
> > that I can use for piloting of that too.
> >
> > --Börje
>
> This brings up something I've been wondering about, which you might want
> to investigate:
>
> From tcp_output:
>
>         if (error == ENOBUFS) {
>                 if (!callout_active(tp->tt_rexmt) &&
>                     !callout_active(tp->tt_persist))
>                         callout_reset(tp->tt_rexmt, tp->t_rxtcur,
>                             tcp_timer_rexmt, tp);
>                 tcp_quench(tp->t_inpcb, 0);
>                 return (0);
>         }
>
> That tcp_quench knocks the window size back to one packet, if I'm not
> mistaken. You might want to put a counter there and see if that's
> happening frequently to you; if so, it might explain some loss of
> performance.
Maybe something like this:

%%%
Index: sys/netinet/tcp_output.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/tcp_output.c,v
retrieving revision 1.78
diff -u -r1.78 tcp_output.c
--- sys/netinet/tcp_output.c    19 Feb 2003 22:18:05 -0000      1.78
+++ sys/netinet/tcp_output.c    12 Apr 2003 19:52:31 -0000
@@ -930,6 +930,7 @@
                            !callout_active(tp->tt_persist))
                                callout_reset(tp->tt_rexmt, tp->t_rxtcur,
                                    tcp_timer_rexmt, tp);
+                       tcpstat.tcps_selfquench++;
                        tcp_quench(tp->t_inpcb, 0);
                        return (0);
                }
Index: sys/netinet/tcp_var.h
===================================================================
RCS file: /home/ncvs/src/sys/netinet/tcp_var.h,v
retrieving revision 1.88
diff -u -r1.88 tcp_var.h
--- sys/netinet/tcp_var.h       1 Apr 2003 21:16:46 -0000       1.88
+++ sys/netinet/tcp_var.h       12 Apr 2003 19:52:31 -0000
@@ -394,6 +394,8 @@
        u_long  tcps_sc_zonefail;       /* zalloc() failed */
        u_long  tcps_sc_sendcookie;     /* SYN cookie sent */
        u_long  tcps_sc_recvcookie;     /* SYN cookie received */
+
+       u_long  tcps_selfquench;        /* self-quench count */
 };

 /*
Index: usr.bin/netstat/inet.c
===================================================================
RCS file: /home/ncvs/src/usr.bin/netstat/inet.c,v
retrieving revision 1.58
diff -u -r1.58 inet.c
--- usr.bin/netstat/inet.c      2 Apr 2003 20:14:44 -0000       1.58
+++ usr.bin/netstat/inet.c      12 Apr 2003 19:52:32 -0000
@@ -389,6 +389,7 @@
        p(tcps_sndprobe, "\t\t%lu window probe packet%s\n");
        p(tcps_sndwinup, "\t\t%lu window update packet%s\n");
        p(tcps_sndctrl, "\t\t%lu control packet%s\n");
+       p(tcps_selfquench, "\t\t%lu send%s resulting in self-quench\n");
        p(tcps_rcvtotal, "\t%lu packet%s received\n");
        p2(tcps_rcvackpack, tcps_rcvackbyte, "\t\t%lu ack%s (for %lu byte%s)\n");
        p(tcps_rcvdupack, "\t\t%lu duplicate ack%s\n");
%%%

Cheers.

-- Hiten
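[With a kernel and headers carrying the patch above, the new counter
shows up in "netstat -s"; a test could also poll it directly. A minimal
editorial sketch, which only builds against a patched tree, since
tcps_selfquench does not exist in stock FreeBSD:]

    /*
     * Poll the self-quench counter added by the patch above, once per
     * second.  tcps_selfquench is the field the diff introduces; this
     * will not compile against unpatched 4.x headers.
     */
    #include <sys/param.h>
    #include <sys/queue.h>
    #include <sys/socket.h>
    #include <sys/sysctl.h>
    #include <netinet/in.h>
    #include <netinet/in_systm.h>
    #include <netinet/ip.h>
    #include <netinet/tcp.h>
    #include <netinet/tcp_var.h>
    #include <err.h>
    #include <stdio.h>
    #include <unistd.h>

    int
    main(void)
    {
            struct tcpstat st;
            size_t len;

            for (;;) {
                    len = sizeof(st);
                    if (sysctlbyname("net.inet.tcp.stats",
                        &st, &len, NULL, 0) == -1)
                            err(1, "sysctlbyname");
                    printf("sends resulting in self-quench: %lu\n",
                        st.tcps_selfquench);
                    sleep(1);
            }
    }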