Date: Mon, 17 Feb 2003 22:24:39 -0700 (MST)
From: Alex Rousskov
To: Terry Lambert
Cc: Pawel Jakub Dawidek, Scott Long, Sam Leffler, Brad Knowles, freebsd-current@freebsd.org
Subject: Re: Polygraph Considered Evil 8^) (was: Re: 5-STABLE Roadmap)
In-Reply-To: <3E511E7A.8225ABA9@mindspring.com>
References: <20030216184257.GZ10767@garage.freebsd.pl> <3E4FFDD3.9050802@btc.adaptec.com> <20030216214322.GB10767@garage.freebsd.pl> <3E511E7A.8225ABA9@mindspring.com>

On Mon, 17 Feb 2003, Terry Lambert wrote:

> First, I just have a slight editorial comment, about cheating on
> Polygraph.

Terry,

This is not the place to start a long discussion about our Polygraph
testing methodology, but I have to say, with all due respect, that
many of your statements are either misleading or based on
misinformation about Web Polygraph and the way standard tests are
executed.
I have to respond because I both love and understand cache
benchmarking. I apologize to the majority of the audience for what may
be considered an out-of-scope thread.

> One issue I have with Polygraph is that it intentionally works for a
> very long time to get worst case performance out of caches;
> basically, it cache-busts on purpose. Then the test runs.

This is plain wrong. I assume that you are referring to PolyMix
workloads, which have a cache-filling phase followed by measurement
phases. The fill phase does not bust the cache. Its primary purpose
is to bring the cache's storage to a steady state (hopefully). If you
have tested many caches, including Squid, then you know that cache
performance "on an empty stomach" often differs from sustained
performance by 50%. Since we must start from scratch, we must pump
enough data to approach steady state.

You might have been misinformed that all the fill objects are used
during the measurement phases; this is not true. Polygraph keeps the
size of the working set constant. That size is usually much smaller
than the amount of traffic during the fill phase. Again, the fill
phase is there to reach a steady state after you start with an empty
disk.

> This seems to be an editorial comment on end-to-end guarantees, much
> more than it seems a valid measurement of actual cache performance.

Not sure what end-to-end guarantees you are referring to here.

> If you change squid to force a random page replacement, then you
> end up with a bounded worst case which is a better number than you
> would be able to get with your best (in terms of the real-world
> performance) algorithm (e.g. LRU or whatever), because you make it
> arbitrarily hard to characterize what that would be.

Random page replacement should not yield better performance. Polygraph
simulates hot subsets (aka flash crowds), which you would not be able
to take advantage of if you replace randomly.
Also, random replacement will lose the partial advantages of temporal
locality that Polygraph also simulates (e.g., the same HTML containers
have the same images).

> NetApp has a tunable in their cache product which might as well be
> labelled "get a good Polygraph score"; all it does is turn on random
> page replacement, so that the Polygraph code is unable to
> characterize "what would constitute worst case performance on this
> cache?", and then intentionally exercise that code path, which is
> what it would do, otherwise (i.e. pick a working set slightly larger
> than the cache size so everything's a miss, etc.).

I am unaware of any tunables of that kind. Moreover, I suspect they
simply would not work (see above). Are you rich? If not, you may want
to sell a proof of the above to a NetApp competitor. I, myself, would
be very interested to hear it as well. Keep in mind that NetApp and
most other vendors use Polygraph for day-to-day regression tests, so
they are interested in making the tests realistic.

Also, the traffic that Polygraph offers does not depend on cache
performance. Polygraph code does not "characterize" anything at run
time, at least not during PolyMix tests.

> Basically, most of the case numbers are 99.xx% miss rates. With
> this modification, that number drops down to closer to 80%.

Actually, the measured miss ratio is usually about 50% (hit rate of
50+%), which is quite realistic. The offered hit ratio is about 55%.
The byte hit ratio is lower. Not sure where you got the 99% or 80%
numbers. See the cache-off results for true values.

> That's kind of evil; but at least it's a level playing field, and
> we can make a FreeBSD-specific patch for SQUID to get better numbers
> for FreeBSD. 8-) 8-).

I would not encourage you to cheat, even if there is a way. I would
recommend that you suggest ways to improve the benchmark instead.
Chances are, Polygraph can already do what you want.
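
To make the argument about random replacement and hot subsets concrete,
here is a toy sketch. It is not Polygraph code and does not model a
PolyMix workload: the trace shape (a small hot set interleaved with
never-repeated cold objects), the cache size, and all function names
are assumptions picked purely for illustration.

```python
import random
from collections import OrderedDict


def make_trace(n_requests, hot_objects=10):
    """Alternate requests to a small hot set with never-repeated cold objects."""
    trace = []
    for i in range(n_requests):
        if i % 2 == 0:
            trace.append(("hot", (i // 2) % hot_objects))
        else:
            trace.append(("cold", i))
    return trace


def lru_hits(trace, capacity):
    """Count hits for a cache that evicts the least recently used object."""
    cache = OrderedDict()
    hits = 0
    for key in trace:
        if key in cache:
            cache.move_to_end(key)  # mark as most recently used
            hits += 1
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # drop the least recently used
            cache[key] = True
    return hits


def random_hits(trace, capacity, seed=0):
    """Count hits for a cache that evicts a uniformly random object."""
    rng = random.Random(seed)
    keys, present = [], set()
    hits = 0
    for key in trace:
        if key in present:
            hits += 1
            continue
        if len(keys) >= capacity:
            i = rng.randrange(len(keys))
            victim = keys[i]
            keys[i] = keys[-1]  # swap-remove keeps eviction O(1)
            keys.pop()
            present.remove(victim)
        keys.append(key)
        present.add(key)
    return hits


if __name__ == "__main__":
    trace = make_trace(20000)
    for name, hits in (("LRU", lru_hits(trace, 100)),
                       ("random", random_hits(trace, 100))):
        print(f"{name}: hit ratio {hits / len(trace):.4f}")
```

On this particular trace LRU never evicts a hot object, so it hits
every repeat request to the hot set. Random eviction cannot do better
here, because every hot object it happens to throw out costs an extra
miss later.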
> > > options MAXFILES=16384
> > > options NMBCLUSTERS=32678
>
> These I understand, though I think they are on the low end.

We have never run out of the related resources with these settings
during a valid test. Keep in mind that we have to keep the number of
concurrently open HTTP connections below 5-8K to get robust
performance given PolyMix burstiness and other factors.

> > > options HZ=1000
>
> This one, I don't understand at all. The web page says it's for faster
> dummynet processing. But maybe this is an artifact of using NETISR.

This setting is a must-have if you use dummynet. We did not invent it;
it was suggested by the dummynet author himself, and it did solve
performance problems we experienced with the standard setting of HZ.
I do not know what NETISR controls, so if you know of a better
dummynet tuning approach, please let us know!

> > > kern.ipc.somaxconn=1024
> This one, either: it's really very small.

I do not think we overflow the queue during valid tests. If it gets
1000 requests long, the device under test is already in deep trouble
and will fail the test.

> > > net.inet.ip.portrange.last=40000
>
> This one is OK, but small. It only affects outbound connections; got
> to wonder why it isn't 65536, though.

This is actually for "dummy user" safety. A correctly configured
Polygraph does not use ephemeral ports. There is no reason to have it
at 65536 because Polygraph (under a PolyMix workload) should not use
that many sockets.

> > > net.inet.tcp.delayed_ack=0
>
> This seems designed to get a good connection rate.

I am not sure this is needed. It may be there for historical reasons.

> > > net.inet.tcp.msl=3000
>
> And this seems designed to get a bad one. You are aware that, by
> default, NT systems cheat on the MSL, right? For gigabit, this is
> a larger number than you want, I think.

MSL must be small, or the kernel will choke on TIME_WAIT connections.
Please note that these are settings for Polygraph clients and servers,
and _not_ the device under test.
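
A back-of-the-envelope sketch of why a small MSL matters on the
load-generating hosts. It assumes that a closed connection lingers in
TIME_WAIT for 2 x MSL on the side that closes first, that
net.inet.tcp.msl is expressed in milliseconds, and a made-up rate of
1000 new connections per second. None of these figures come from an
actual Polygraph run, and the function name is hypothetical.

```python
def time_wait_sockets(conn_per_sec, msl_ms):
    """Estimate the steady-state number of sockets parked in TIME_WAIT.

    Assumes every connection is closed locally and then lingers for
    2 * MSL, so the population is (arrival rate) * (linger time).
    """
    linger_sec = 2 * msl_ms / 1000
    return conn_per_sec * linger_sec


if __name__ == "__main__":
    rate = 1000  # hypothetical connections per second
    for msl_ms in (3000, 30000):
        estimate = time_wait_sockets(rate, msl_ms)
        print(f"msl={msl_ms} ms -> ~{estimate:.0f} sockets in TIME_WAIT")
```

Under these assumptions, the setting quoted above (3000) gives roughly
6000 lingering sockets, while a 30000 ms MSL would give ten times that
at the same connection rate.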
During official tests, we use a program (available in the Polygraph
distro) to verify that _proxies_ use an MSL of at least 59 seconds.

I cannot comment on Gigabit-related optimizations because we have not
played with Gigabit cards much. I would not be surprised if some
settings had to be different.

> I haven't looked at the client code, but you are aware that adding
> IP aliases doesn't really do anything, unless you managed your port
> space your self, manually, with a couple of clever tricks? In other
> words, you are going to be limited to your total number of outbound
> connections as your ports space (e.g. ~40K), because the port
> autoallocation takes place in the same space as the INADDR_ANY
> space? I guess this doesn't matter, if your maxopenfiles is only
> 16K, since that's going to end up bounding you well before you run
> out of ports...

Polygraph does manage port space, but I am not sure what you mean by
"doesn't really do anything". We do not want aliases to do anything
other than participate in routing and give the proxy the impression
that there are thousands of source IPs and hundreds of server IPs.
That seems to work as desired, but I may be missing your point. The
number of connections that is safe during a test is usually below 10K,
though.

> IMO, Polygraph is probably not something you want to include in a
> standard suite, if the intent is to get numbers that are good for
> FreeBSD PR (Sorry, Alex, but it's true: you have to do significant
> and clever and sometimes obtuse and counterintuitive things in order
> to get good Polygraph numbers for comparison).
>
> I don't think that anything you do in this regard is going to be
> able to give you iMimic or NetApp level numbers, which are created
> by professional benchmark-wranglers, so any comparison values you
> get will likely be poor, compared to commercial offerings.
Not sure how to respond to that -- I can discuss specific limitations
of the benchmark (there are many) or vendor cheats (there are some
known ones), but I cannot defend Polygraph against general "all
vendors cheat" accusations.

If you want numbers that are good for FreeBSD PR, you can simply use
iMimic numbers, for example. They are good, and they use FreeBSD.

IMO, Polygraph is the best proxy benchmark available as far as realism
is concerned, and we are very open to any specific suggestions on how
to make it better. If you can think of better workloads or better test
rules, please share your ideas with my team.

Thank you,

Alex.

--
                            | HTTP performance - Web Polygraph benchmark
www.measurement-factory.com | HTTP compliance+ - Co-Advisor test suite
                            | all of the above - PolyBox appliance