From: David Schultz
To: John Baldwin
Cc: freebsd-arch@FreeBSD.ORG
Date: Thu, 3 Mar 2005 10:35:05 -0500
Subject: Re: Removing kernel thread stack swapping
Message-ID: <20050303153505.GA16964@VARK.MIT.EDU>
In-Reply-To: <200503030954.08271.jhb@FreeBSD.org>
References: <20050303074242.GA14699@VARK.MIT.EDU> <200503030954.08271.jhb@FreeBSD.org>

On Thu, Mar 03, 2005, John Baldwin wrote:
> On Thursday 03 March 2005 02:42 am, David Schultz wrote:
> > Any objections to the idea of removing the feature of swapping out
> > kernel stacks? Unlike disabling UAREA swapping, this has the
> > small downside that it wastes 16K (give or take a power of 2) of
> > wired memory per kernel thread that we would otherwise have
> > swapped out.
> > However, this disadvantage is probably negligible by
> > today's standards, and there are several advantages:
> >
> > 1. David Xu found that some kernel code stores externally-accessible
> >    data structures on the stack, then goes to sleep and allows the
> >    stack to become swappable. This can result in a kernel panic.
>
> He found one instance.
>
> > 2. We don't know how many instances of the above problem there are.
> >    Selectively disabling swapping for the right threads at the
> >    right times would decrease maintainability.
>
> Probably 1. Note that since at least FreeBSD 1.0 programmers have had to
> realize that the stack can be swapped out. The signal code in pre-5.x stores
> part of the signal state in struct proc directly in order to support swapped
> out stacks. In 5.x we just malloc the whole signal state directly since we
> killed the u-area. sigwait() has a bug that should be fixed, let's not
> engage in overkill and throw the baby out with the bath water. In general we
> need to discourage use of stack variables anyway because when people use
> stack space rather than malloc() space the failure case for running out is
> much worse, i.e. kernel panic when you overflow your stack (even though KVM
> may be available) vs. waiting until some memory is available or returning
> NULL.
>
> Hence, don't kill this whole feature just because someone is too lazy to fix a
> bug.

Fair enough. I'll defer to you on the extent of the problem.
David seemed to think that it was more widespread. (BTW, does
*anyone* know what the PHOLD() in kern_physio is for? Is it a
holdover from when the PCB was in struct user?)

That still leaves my third point, which is that kernel stack
swapping is no longer as effective as it was in 4.X. Resource
hogs, particularly multithreaded ones, tend to get passed over by
the swapper, so only the well-behaved processes (e.g. interactive
ones) tend to get swapped out under high load.
I have a WIP that I mentioned to you briefly before that fixes this
by doing two things:

a) The swapper sets a flag on threads that are unswappable. When
   I finish this, those threads will swap themselves out the next
   time they try to enter or exit the kernel.

b) Individual threads within a process can be swapped out; they
   don't all have to be swapped out at the same time.

I'm not actually sure if (b) is a good thing to do. For many
applications, swapping out one thread will just cause all the
others to quickly stall while waiting for it. Thoughts?

In any case, I have no time to finish it right now, but assuming I
don't get to axe it all, I'll get to finishing it eventually...
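
For readers following the thread, the scheme in (a) and (b) might look
roughly like the userland C sketch below. It is not the WIP patch and
uses no FreeBSD kernel API: every identifier (sketch_thread,
SKETCH_SWAP_REQUESTED, and so on) is hypothetical, and the "swappable"
field is only an assumed stand-in for whatever conditions the real
swapper checks. The point is the division of labor: the swapper takes
the threads it can and flags the rest, and a flagged thread gives up
its own stack the next time it crosses the kernel boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-thread flag bits; not FreeBSD's real thread flags. */
enum {
	SKETCH_SWAP_REQUESTED = 0x1,	/* swapper wants this stack, thread must yield it */
	SKETCH_SWAPPED_OUT    = 0x2	/* stack is no longer resident */
};

struct sketch_thread {
	unsigned flags;
	bool swappable;		/* assumption: models "the swapper may take it now" */
};

struct sketch_proc {
	struct sketch_thread *threads;
	size_t nthreads;
};

/*
 * Swapper side. Threads are handled one at a time, as in (b), so part of
 * a process can be swapped out while the rest stays resident. Threads
 * that cannot be taken right now are only flagged, as in (a).
 * Returns the number of stacks swapped out immediately.
 */
size_t
sketch_swapper_scan(struct sketch_proc *p)
{
	size_t swapped = 0;

	for (size_t i = 0; i < p->nthreads; i++) {
		struct sketch_thread *td = &p->threads[i];

		if (td->flags & SKETCH_SWAPPED_OUT)
			continue;
		if (td->swappable) {
			td->flags |= SKETCH_SWAPPED_OUT;
			swapped++;
		} else {
			td->flags |= SKETCH_SWAP_REQUESTED;
		}
	}
	return swapped;
}

/*
 * Thread side. Meant to be called by the thread itself whenever it
 * enters or exits the kernel. Returns true if the thread swapped
 * itself out in response to a pending request.
 */
bool
sketch_kernel_boundary(struct sketch_thread *td)
{
	assert(td != NULL);

	if ((td->flags & SKETCH_SWAP_REQUESTED) == 0)
		return false;
	td->flags &= ~SKETCH_SWAP_REQUESTED;
	td->flags |= SKETCH_SWAPPED_OUT;
	return true;
}
```

The sketch leaves out locking, the actual paging of the stack, and any
policy for choosing victims. It also does nothing about the concern
raised with (b): nothing here stops the remaining resident threads from
stalling on the one that was swapped out.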