From owner-freebsd-arch  Tue Nov  7 17:08:37 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from smtp03.primenet.com (smtp03.primenet.com [206.165.6.133])
	by hub.freebsd.org (Postfix) with ESMTP id 4DEBB37B479
	for ; Tue, 7 Nov 2000 17:08:33 -0800 (PST)
Received: (from daemon@localhost)
	by smtp03.primenet.com (8.9.3/8.9.3) id SAA17607;
	Tue, 7 Nov 2000 18:06:35 -0700 (MST)
Received: from usr08.primenet.com(206.165.6.208)
	via SMTP by smtp03.primenet.com, id smtpdAAAm_aySD;
	Tue Nov  7 18:01:23 2000
Received: (from tlambert@localhost)
	by usr08.primenet.com (8.8.5/8.8.5) id SAA01102;
	Tue, 7 Nov 2000 18:02:56 -0700 (MST)
From: Terry Lambert
Message-Id: <200011080102.SAA01102@usr08.primenet.com>
Subject: Re: softdep panic due to blocked malloc (with traceback)
To: julian@elischer.org (Julian Elischer)
Date: Wed, 8 Nov 2000 01:02:56 +0000 (GMT)
Cc: rjesup@wgate.com (Randell Jesup), gibbs@scsiguy.com (Justin T. Gibbs),
	dillon@earth.backplane.com (Matt Dillon),
	phk@critter.freebsd.dk (Poul-Henning Kamp),
	bde@zeta.org.au (Bruce Evans),
	mckusick@mckusick.com (Kirk McKusick), arch@FreeBSD.ORG
In-Reply-To: <3A087257.DBA40791@elischer.org> from "Julian Elischer"
	at Nov 07, 2000 01:21:27 PM
X-Mailer: ELM [version 2.5 PL2]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> > I think both Matt's changes and what Poul-Henning suggests can be
> > useful.  (Actually, it sounds like Matt's are required, and
> > Poul-Henning's might be nice if and when someone does them.)
>
> I think that they are talking at cross purposes..
>
> Matt is right that nothing that magically comes up with a few
> hundred KB of RAM can be guaranteed to stop a deadlock, because
> after the few hundred KB have been used up, if the big memory
> hog keeps eating memory, you are right back where you started,
> and you no longer have a few hundred KB up your sleeve.
Dijkstra's Banker's Algorithm solves this by stalling the big memory
hog, if it can't have pages stolen from it (i.e. have its working set
size reduced).  No CPU cycles granted to the hog equals no new
allocations by the hog.

I think that the problem being examined is the one that Poul and Matt
and Alfred discussed a little while ago, where you have a hog whose
pages are marked as anonymous, yet are dirty.  This is really an evil
thing to do, since the pages are not permitted to be swapped, nor are
they permitted to be cleaned.  I think the right thing to do would be
to treat the flag as a hint, not a granted right, and start swapping
the pages anyway.

A more ideal solution would not let you get into trouble in the first
place, by refusing to dirty buffers faster than they can be written --
in other words, if the system is overloaded, it slows down: degrading
gracefully.

> On the other hand, PHK is correct in that it would be a useful
> facility to have and that it might buy some breathing space.
> To be useful, however, I think it would need to be combined with
> some other measures to ensure that we don't get straight back
> into debt.  For example, triggering that queue might change the
> strategies in the kernel so that the biggest memory users are
> forced to start losing pages (e.g. it's swapped out), or some
> similar work..

This will, without a doubt, be required, in order to ensure that an
artificial scarcity isn't created.  Consider the case where you have
per-CPU resource pools, or memory tied up in other places, which could
reasonably be recovered.  Windows NT/2000 and Windows 95/98 both have
the capability for the VM system to demand that resources be returned
to it under low memory conditions.

This may not help unless there is a reserve which is kept separately
for I/O buffers vs. other buffers, so the unified VM and buffer cache
works against easily resolving this (this may be why Sun punted in
Solaris 2.8, and de-unified their cache again).
As to another comment in this thread: SIGDANGER was intended to make
processes free up resources, not the system, so I don't think the AIX
approach will work here.

Arguably, if one is going to do what Yahoo has been doing in order to
work around other problems, the memory that gets allocated to this
purpose should probably be (1) wired down, and (2) size-restricted to,
say, 75% of available memory, or some other hard limit.  Effectively,
memory used in such a way as to cause this problem should probably not
be overcommitted.

There's also the historical problem with page reclamation after the
disassociation of an in-core inode from the vnode off which clean
buffers are hung.  This is, indeed, wasted memory, and those buffers
should probably be released sooner, when the inode/vnode relationship
is severed, rather than later, when the reclaimer gets around to it.

I'd actually be really interested in the statistics with regard to
"reclaimable clean vs. dirty memory" for a system once it has hit the
"extreme low memory conditions" leading to deadlock.  I don't think
you could say that "low memory conditions always result from X, and
never from Y"; different usage patterns would mean different root
causes -- and thus different "optimal recovery strategies".

Anyway, real statistics gathered at or near the failure point for
these systems would be truly useful; if all the memory is fragmented,
then kicking subsystems to make them release what they can won't help
(I personally doubt this will be the case).


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message