From owner-freebsd-hackers Fri Jul 13 13:27:49 2001
Delivered-To: freebsd-hackers@freebsd.org
Received: from ussenterprise.ufp.org (ussenterprise.ufp.org [208.185.30.210])
	by hub.freebsd.org (Postfix) with ESMTP id 4B14137B403;
	Fri, 13 Jul 2001 13:27:44 -0700 (PDT)
	(envelope-from bicknell@ussenterprise.ufp.org)
Received: (from bicknell@localhost)
	by ussenterprise.ufp.org (8.11.1/8.11.1) id f6DKRga32864;
	Fri, 13 Jul 2001 16:27:42 -0400 (EDT)
	(envelope-from bicknell)
Date: Fri, 13 Jul 2001 16:27:42 -0400
From: Leo Bicknell
To: Terry Lambert
Cc: Leo Bicknell, freebsd-hackers@FreeBSD.ORG
Subject: Re: Network performance roadmap.
Message-ID: <20010713162742.A31883@ussenterprise.ufp.org>
Mail-Followup-To: Leo Bicknell, Terry Lambert, freebsd-hackers@FreeBSD.ORG
References: <20010713101107.B9559@ussenterprise.ufp.org> <3B4F4534.37D8FC3E@mindspring.com> <20010713151257.A27664@ussenterprise.ufp.org> <3B4F542F.D0D0E0BA@mindspring.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <3B4F542F.D0D0E0BA@mindspring.com>; from tlambert2@mindspring.com on Fri, Jul 13, 2001 at 01:03:59PM -0700
Organization: United Federation of Planets
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Fri, Jul 13, 2001 at 01:03:59PM -0700, Terry Lambert wrote:
> When I run out of mbufs, I do not have "bad things happen".
>
> The "bad things" are an artifact of memory overcommit; if you

One thing that is clear is that we're talking about two different
levels of problems.  On the systems I run (admittedly with < 1,000,000
connections) there is no memory overcommit, and the bad things that
come from running out of mbufs aren't a matter of running out of
memory.  The problem I would like to solve is the overbuffering of
individual sockets.
That's not to say there aren't other issues involving allocating more
mbufs, or doing other things.  I think there is enough evidence from
papers like the ones at psc.edu to show that solving the overbuffering
problem one way or another provides a huge increase in performance
across a wide range of conditions.

> By changing them.  I have servers that can support 1,000,000
> concurrent connections.  They are based on FreeBSD running on
> 4GB memory systems with 2 1Gbit NICs.
>
> This is why all the hand-waving and suggestions for substantial
> (and unnecessary, from empirical practice) changes in the
> FreeBSD stack is making me so leery.

Well, you can consider it hand waving.  That said, on the low end you
can easily prove mathematically, and demonstrate empirically, that end
users (of the high speed DSL and cable modem variety) are being limited
on a day-to-day basis.  While I won't prevent anyone from fixing larger
and more looming issues, I will be satisfied when that is no longer the
case.  I think fixing that does not require rewriting the kernel memory
allocator, as you seem to want to suggest.

> That would be nice; first of all, you will need to get
> over your aversion to working on kernel memory allocators
> (;-)), since the only way to set things up for variable
> loads is to take away the fixed nature of the allocations
> which are needed to tune for those loads.  You can't apply

The PSC autotune (which has some other issues) seems to address a
large segment of the variable load problem without taking the steps
you suggest.  It gets far enough to solve the problems I am interested
in solving.  It also seems to me that even if someone takes on the
project you seem interested in, something along the lines of the
autotuning code will be necessary to take advantage of it.

> parameters.  This is no good.
> You need the empirical data,
> but it should not be applied to tuning parameters globally,
> it should be applied to them on a case by case basis on
> server installations.

Today I believe (in very round figures) the defaults are "right" for
25% of the users, "right enough" for 50% of the users, and "wrong" for
25% of the installations.  It's the "right enough" category that
worries me the most.  I can provide lots of evidence, in the form of
my employer's customers' experiences, where they were getting "200k/s"
across the net and figured "that's all the Internet can provide".
After a few tweaks they were getting 1M/sec and their eyes lit up with
"I never knew the network could support that".  It's sad the users
don't have higher expectations, but at the same time the fact that the
operating system, and not the network, is limiting their performance
is completely hidden.

So, I'd be happy if we can move to 75% "right", 25% "wrong" (note,
that's no improvement in the number of wrong cases).

> The only way around this is to bite the bullet, and do the
> right thing.  Failure to do that means that you are subject
> to denial of service attacks based on your tuning parameters,
> so while you may run OK in the case of needing a lot of HTTP
> connections with small windows, someone can panic your system
> by advertising very large windows and then giving you many
> 2MB HTTP requests.  Normal HTTP requests are not that large,

The PSC fair share (note, I do not recommend that we use it; I cite it
only as a reference) would seem to mitigate the DoS potential with
appropriate settings.  The whole point of this is that the OS should
not be buffering 2M per request because the _LIMIT_ is 2M; it should
be buffering 2M because the window and actual transfer rate suggest
that it might be necessary.

> > Ah, I see, so to prevent MBUF exhaustion I should not let
> > my socket buffers get large.
> > Sort of like to prevent serious
> > injury in a car crash I should drive at 10MPH on the freeway.
>
> Or 55MPH.  Or 65MPH.  Whatever your local limit is, is also
> administrative, and quite arbitrary.  Many cars are safe at
> much, much faster speeds, as long as someone doesn't decide
> to drive at 50MPH in the fast lane, so your rate of closure
> is 70MPH+.

Yes, and most cars have speed limiters built in these days; you'll
find they don't kick in until 120-150mph.  FreeBSD seems to have its
speed limiter set at 25mph because in the past most users lived in the
city where they couldn't drive fast, and as a result we didn't bother
to make the throttle variable: it was 0mph or 25mph.

It's time for the variable throttle, if nothing else.

-- 
Leo Bicknell - bicknell@ufp.org
Systems Engineer - Internetworking Engineer - CCIE 3440
Read TMBG List - tmbg-list-request@tmbg.org, www.tmbg.org

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
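
For readers who want the arithmetic behind the "200k/s" versus
"1M/sec" anecdote and the "buffer by demand, not by limit" argument
above, here is a rough sketch.  It is not the PSC autotuning code and
not anything in the FreeBSD stack.  The 80 ms round-trip time, the
16 KB and 80 KB windows, the 2x headroom factor, and the function
names are all assumptions chosen only to make the numbers easy to
follow.

```python
# Back-of-the-envelope sketch; all figures below are illustrative assumptions.

def max_throughput(window_bytes, rtt_ms):
    """Upper bound on one TCP connection's throughput in bytes/sec.

    A sender can have at most one window of data in flight per round
    trip, so throughput cannot exceed window / RTT.
    """
    return window_bytes * 1000 / rtt_ms


def demand_buffer(rate_bytes_per_sec, rtt_ms, limit_bytes, headroom=2):
    """Socket buffer sized from observed demand, capped by the limit.

    Hypothetical policy: allow `headroom` times the bandwidth-delay
    product of the rate the connection is actually achieving, and never
    exceed the administrative limit.
    """
    wanted = headroom * rate_bytes_per_sec * rtt_ms // 1000
    return min(wanted, limit_bytes)


if __name__ == "__main__":
    rtt_ms = 80                    # assumed round-trip time
    limit = 2 * 1024 * 1024        # the 2M limit discussed in the thread

    # A small fixed window caps the user no matter how fast the link is.
    print(max_throughput(16 * 1024, rtt_ms))   # 16 KB window
    print(max_throughput(80 * 1024, rtt_ms))   # 80 KB window

    # Demand-based sizing: a slow client gets a small buffer, and only a
    # genuinely fast one grows to the limit.
    print(demand_buffer(50_000, rtt_ms, limit))
    print(demand_buffer(20_000_000, rtt_ms, limit))
```

With these assumed figures the 16 KB window tops out around 200 KB/s
and the 80 KB window around 1 MB/s on the same path, which is the shape
of the customer experience described above.  The second function shows
why a high limit need not mean high memory use: a client moving
50 KB/s is given about 8 KB of buffer rather than 2 MB.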