Date: Tue, 20 Jun 2000 13:43:42 -0400 (EDT)
From: Bosko Milekic
To: freebsd-hackers@freebsd.org
Subject: mbuf re-write(s), v 0.1

  In an attempt to eliminate, or at least significantly reduce, the
hogging of physical memory by unused mbufs, I have begun re-writing
parts of the mbuf subsystem. I've re-written the allocator, designed an
actual free routine, and considerably re-written the MGET, MGETHDR, and
MFREE macros. I still have some work to do, notably optimisation, but I
have not been able to do any profiling whatsoever because profiling, I
repeat, presently seems broken on -CURRENT.

  The re-write is particularly useful for machines which see "peak"
mbuf usage periods, where many mbufs are allocated, only to be freed a
little while later. Today those mbufs unfortunately remain on the free
list, holding on to physical memory (for a graphical example, see the
THIRD graph at http://www.technokratis.com/stats/mbuf.html).

  Previously, we used the kernel malloc() to do mbuf allocations,
coupled with the free() routine to do the freeing. The new allocator,
however, does not have to worry about choosing the right algorithm or,
notably, about variable-sized objects.
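
  To make the idea of a fixed-size allocator with a real free routine
more concrete, here is a minimal userland sketch in C. It is not the
code from the diffs. The names struct pool, pool_alloc() and
pool_free(), the object and page sizes, and the min_free floor are all
hypothetical and chosen only for illustration, and malloc()/free()
merely stand in for obtaining and releasing a page of memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define OBJ_SIZE      256   /* fixed object size (hypothetical) */
#define OBJS_PER_PAGE 16    /* objects carved from one "page" */

struct page;

struct slot {
    struct page *owner;             /* page this slot was carved from */
    union {
        struct slot *next_free;     /* free-list link while unused */
        unsigned char data[OBJ_SIZE];
    } u;
};

struct page {
    struct page *prev, *next;       /* list of all pages in the pool */
    struct slot *free;              /* free slots on this page */
    int nfree;
    struct slot slots[OBJS_PER_PAGE];
};

struct pool {
    struct page *pages;
    int npages;
    int total_free;                 /* free slots across all pages */
    int min_free;                   /* floor kept cached, never released */
};

static struct page *page_new(struct pool *pl)
{
    struct page *pg = malloc(sizeof(*pg));  /* stands in for a page grab */
    if (pg == NULL)
        return NULL;
    pg->free = NULL;
    for (int i = 0; i < OBJS_PER_PAGE; i++) {
        pg->slots[i].owner = pg;
        pg->slots[i].u.next_free = pg->free;
        pg->free = &pg->slots[i];
    }
    pg->nfree = OBJS_PER_PAGE;
    pg->prev = NULL;
    pg->next = pl->pages;
    if (pl->pages != NULL)
        pl->pages->prev = pg;
    pl->pages = pg;
    pl->npages++;
    pl->total_free += OBJS_PER_PAGE;
    return pg;
}

void *pool_alloc(struct pool *pl)
{
    struct page *pg = pl->pages;
    while (pg != NULL && pg->nfree == 0)    /* first page with a free slot */
        pg = pg->next;
    if (pg == NULL && (pg = page_new(pl)) == NULL)
        return NULL;
    struct slot *s = pg->free;
    pg->free = s->u.next_free;
    pg->nfree--;
    pl->total_free--;
    return s->u.data;
}

void pool_free(struct pool *pl, void *obj)
{
    struct slot *s = (struct slot *)((char *)obj - offsetof(struct slot, u));
    struct page *pg = s->owner;
    assert(pg->nfree < OBJS_PER_PAGE);  /* a fully free page has nothing to free */
    s->u.next_free = pg->free;
    pg->free = s;
    pg->nfree++;
    pl->total_free++;
    /* Release a fully free page only if min_free slots remain afterwards. */
    if (pg->nfree == OBJS_PER_PAGE &&
        pl->total_free - OBJS_PER_PAGE >= pl->min_free) {
        if (pg->prev != NULL)
            pg->prev->next = pg->next;
        else
            pl->pages = pg->next;
        if (pg->next != NULL)
            pg->next->prev = pg->prev;
        pl->npages--;
        pl->total_free -= OBJS_PER_PAGE;
        free(pg);
    }
}
```

  A pool set up as "struct pool pl = { NULL, 0, 0, OBJS_PER_PAGE };"
keeps one page's worth of objects cached: allocate three pages' worth,
free them all, and two pages go back to the system while one stays.
Because every object has the same size, there is no fit policy to
choose; the only decision is when a completely free page is worth
giving back.
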

  Of course, I still have some performance tuning to do, but I need
profiling to work for that. There is also a min_on_avail variable added
to the code, which is yet to be made sysctl-tunable, and which
represents the minimum number of mbufs that must reside on the free
lists, so that the system will not explicitly free pages at every
opportunity it gets.

  The reason I named this "v 0.1" has to do with the work that is left
to be done here. For the moment, I've removed the m_reclaim() and wait
code for mbufs, but this will all have to be put back appropriately
(not much voodoo involved here). However, I've moved the mclusters to
their own map, mcl_map, which is the correct thing to do here in order
to avoid having to worry about fragmentation in the allocation routines
(we want the most efficiency possible). I'll go ahead and change the
mcluster stuff soon, too, and hopefully fix up some of the mclrefcnt
usage for clusters. I'll discuss more of this in time to come, and post
the URL here.

  Also, I'm planning to write an optional "mbuf daemon" that can
periodically walk the mbuf system's AVAIL_LST and EMPTY_LST and
re-organize the order of elements on, particularly, the AVAIL_LST, in
order to minimize fragmentation during allocations and increase
% utilization for the allocator(s). It should also optionally do some
other neat tasks, but I haven't decided exactly which ones, although
I'd like to avoid having it raise to splimp() for too long.

  Unlike what some of you may be thinking right now, this is not
theoretical work; I have some diffs right here:

  http://www.technokratis.com/code/mbuf/

  (you'll have to excuse my big tabs)

  The diffs provided for now are context diffs, and they do several
things, among which (not to go too much into details):

  1* Implement new mbuf allocator, implement free routine, re-write
     mbuf allocation and free macros. Add necessary lists / structures
     for the new system.
  2* Change to OID_AUTO for all sysctls in uipc_mbuf.c
  3* Make /sys/sys/mbuf.h look nicer, more consistent comments, etc.
  4* Have mbuf clusters remain the same for now, but move them over to
     mcl_map
  5* Remove (temporarily) mbuf wait/reclaim stuff.

  The diffs are in working condition on -CURRENT (as of a couple of
days ago, at least), and I'm running them with no apparent problems as
we speak. % utilization is great, for now, and I hope that the
daemon-to-come will bring it up even higher. I can also tune it with
the min_on_avail variable.

  Of course, from the 5 points above, you'll quickly note that I still
have to go around and rebuild userland stuff, but that will wait until
the end of all mbuf system modifications.

  Comments welcome. Special thanks to Mike Silbersack for already
discussing such issues with me.

  Regards,
  Bosko

--
 Bosko Milekic * Voice/Mobile: 514.865.7738 * Pager: 514.921.0237
    bmilekic@technokratis.com * http://www.technokratis.com/