Date:      Thu, 03 Mar 2005 11:51:20 -0500
From:      Stephan Uphoff <ups@tree.com>
To:        David Schultz <das@FreeBSD.ORG>
Cc:        "freebsd-arch@freebsd.org" <freebsd-arch@FreeBSD.ORG>
Subject:   Re: Removing kernel thread stack swapping
Message-ID:  <1109868680.56784.17236.camel@palm>
In-Reply-To: <20050303153505.GA16964@VARK.MIT.EDU>
References:  <20050303074242.GA14699@VARK.MIT.EDU> <20050303153505.GA16964@VARK.MIT.EDU>

On Thu, 2005-03-03 at 10:35, David Schultz wrote:
> On Thu, Mar 03, 2005, John Baldwin wrote:
> > On Thursday 03 March 2005 02:42 am, David Schultz wrote:
> > > Any objections to the idea of removing the feature of swapping out
> > > kernel stacks?  Unlike disabling UAREA swapping, this has the
> > > small downside that it wastes 16K (give or take a power of 2) of
> > > wired memory per kernel thread that we would otherwise have
> > > swapped out.  However, this disadvantage is probably negligible by
> > > today's standards, and there are several advantages:
> > >
> > > 1. David Xu found that some kernel code stores externally-accessible
> > >    data structures on the stack, then goes to sleep and allows the
> > >    stack to become swappable.  This can result in a kernel panic.
> > 
> > He found one instance.
> > 
> > > 2. We don't know how many instances of the above problem there are.
> > >    Selectively disabling swapping for the right threads at the
> > >    right times would decrease maintainability.
> > 
> > Probably 1.  Note that since at least FreeBSD 1.0 programmers have had to 
> > realize that the stack can be swapped out.  The signal code in pre-5.x stores 
> > part of the signal state in struct proc directly in order to support swapped 
> > out stacks.  In 5.x we just malloc the whole signal state directly since we 
> > killed the u-area.  sigwait() has a bug that should be fixed, let's not 
> > engage in overkill and throw the baby out with the bath water.  In general we 
> > need to discourage use of stack variables anyway because when people use 
> > stack space rather than malloc() space the failure case for running out is 
> > much worse, i.e. kernel panic when you overflow your stack (even though KVM 
> > may be available) vs. waiting until some memory is available or returning 
> > NULL.
> > 
> > Hence, don't kill this whole feature just because someone is too lazy to fix a 
> > bug.
> 
> Fair enough.  I'll defer to you on the extent of the problem.
> David seemed to think that it was more widespread.  (BTW, does
> *anyone* know what the PHOLD() in kern_physio is for?  Is it a
> holdover from when the PCB was in struct user?)

I guess the intent is to avoid swapping out the thread while it holds
the pages needed for the I/O.
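
To make that guess concrete, here is a rough userland sketch of a hold
count that blocks swapout while I/O is in flight. It is only a toy model:
struct toy_proc, toy_hold(), toy_release() and toy_try_swapout() are
hypothetical names invented for illustration, not the kernel's struct proc
or the real PHOLD()/PRELE() macros, and no locking is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy userland model (hypothetical, not kernel code): a process that
 * cannot be swapped out while its hold count is non-zero.
 */
struct toy_proc {
    int  hold_count;   /* > 0 means "do not swap me out" */
    bool swapped_out;  /* set once the swapper has evicted us */
};

/* Taken before starting the I/O, in the spirit of PHOLD(). */
static void toy_hold(struct toy_proc *p)
{
    p->hold_count++;
}

/* Dropped once the I/O has completed, in the spirit of PRELE(). */
static void toy_release(struct toy_proc *p)
{
    assert(p->hold_count > 0);
    p->hold_count--;
}

/* The swapper's view: only evict a process that nobody holds. */
static bool toy_try_swapout(struct toy_proc *p)
{
    if (p->hold_count > 0 || p->swapped_out)
        return false;
    p->swapped_out = true;
    return true;
}
```

In this model the swapper simply skips a held process; whether that is
really what kern_physio relies on is exactly the open question above.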

> 
> That still leaves my third point, which is that kernel stack
> swapping is no longer as effective as it was in 4.X.  Resource
> hogs, particularly multithreaded ones, tend to get passed over by
> the swapper, so only the well-behaved processes (e.g. interactive
> ones) tend to get swapped out under high load.  I have a WIP that
> I mentioned to you briefly before that fixes this by doing two
> things:
> 
>   a) The swapper sets a flag on threads that are unswappable.
>      When I finish this, those threads will swap themselves out
>      the next time they try to enter or exit the kernel.
> 
>   b) Individual threads within a process can be swapped out; they
>      don't all have to be swapped out at the same time.
> 
> I'm not actually sure if (b) is a good thing to do.  For many
> applications, swapping out one thread will just cause all the
> others to quickly stall while waiting for it.  Thoughts?

I think (b) is a good idea, especially for threads in a long-term sleep
in the kernel.
I am not sure about your stalling scenario.
I guess the thing to watch out for is that multi-threaded applications
get the same chance to run as single-threaded applications.
Stalling may even be the right thing to do ;-)
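
As a rough illustration of what I mean by (b) for long-term sleepers, here
is a toy userland sketch that swaps out only the threads that have been
asleep in the kernel for a while and leaves the rest of the process alone.
Everything in it is an assumption made up for the example: struct
toy_thread, toy_swapout_idle_threads() and the 10 second threshold are
hypothetical and do not correspond to existing kernel code or to the WIP
mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-thread state; not the kernel's struct thread. */
struct toy_thread {
    unsigned sleep_secs;      /* how long the thread has been asleep */
    bool     in_kernel_sleep; /* sleeping inside the kernel */
    bool     stack_swapped;   /* kernel stack already swapped out */
};

/* Arbitrary threshold for "long term", chosen only for this sketch. */
#define TOY_LONG_SLEEP_SECS 10u

/*
 * Swap out the stacks of long-term sleepers one thread at a time,
 * instead of requiring the whole process to be swappable.
 * Returns the number of threads swapped out by this pass.
 */
static size_t toy_swapout_idle_threads(struct toy_thread *td, size_t n)
{
    size_t swapped = 0;

    for (size_t i = 0; i < n; i++) {
        if (td[i].in_kernel_sleep && !td[i].stack_swapped &&
            td[i].sleep_secs >= TOY_LONG_SLEEP_SECS) {
            td[i].stack_swapped = true;
            swapped++;
        }
    }
    return swapped;
}
```

A policy like this never touches a running thread, so the stalling
scenario would only show up once a swapped-out sleeper is woken.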

> In any case, I have no time to finish it right now, but assuming I
> don't get to axe it all, I'll get to finishing it eventually...