Date:      Sat, 26 Aug 2017 12:18:14 -0600
From:      Ian Lepore <ian@freebsd.org>
To:        "Rodney W. Grimes" <freebsd-rwg@pdx.rh.CN85.dnsmgr.net>, Bruce Evans <brde@optusnet.com.au>
Cc:        Don Lewis <truckman@freebsd.org>, avg@freebsd.org, freebsd-arch@freebsd.org
Subject:   Re: ULE steal_idle questions
Message-ID:  <1503771494.56799.49.camel@freebsd.org>
In-Reply-To: <201708261812.v7QIC2eJ074443@pdx.rh.CN85.dnsmgr.net>
References:  <201708261812.v7QIC2eJ074443@pdx.rh.CN85.dnsmgr.net>

On Sat, 2017-08-26 at 11:12 -0700, Rodney W. Grimes wrote:
> > 
> > On Fri, 25 Aug 2017, Don Lewis wrote:
> > 
> > > 
> > > ...
> > > Something else that I did not expect is how frequently threads are
> > > stolen from the other SMT thread on the same core, even though I
> > > increased steal_thresh from 2 to 3 to account for the off-by-one
> > > problem.  This is true even right after the system has booted and no
> > > significant load has been applied.  My best guess is that because of
> > > affinity, both the parent and child processes run on the same CPU
> > > after fork(), and if a number of processes are forked() in quick
> > > succession, the run queue of that CPU can get really long.  Forcing
> > > a thread migration in exec() might be a good solution.
> > Since you are trying a lot of combinations, maybe you can tell us
> > which ones work best.  SCHED_4BSD works better for me on an old
> > 2-core system.  SCHED_ULE works better on a not-so-old 4x2 core
> > (Haswell) system, but I don't like it due to its complexity.  It
> > makes differences of at most +-2% except when mistuned it can give
> > -5% for real time (but better for CPU and presumably power).
> > 
> > For SCHED_4BSD, I wrote fancy tuning for fork/exec and sometimes get
> > everything to line up for a 3% improvement (803 seconds instead of
> > 823 on the old system, with -current much slower at 840+ and old
> > versions of ULE before steal_idle taking 890+).  This is very
> > resource (mainly cache associativity?) dependent and my tuning makes
> > little difference on the newer system.  SCHED_ULE still has
> > bugfeatures which tend to help large builds by reducing context
> > switching, e.g., by bogusly clamping all CPU-bound threads to nearly
> > maximal priority.
> That last bugfeature is probably what makes current systems'
> interactive performance tank rather badly when under heavy
> loads.  Would it be hard to fix?
> 

I would second that sentiment... as time goes on, heavily loaded
systems seem to become less and less interactive-friendly.  Also,
running heavy-load jobs such as builds with nice, even -n 20,
doesn't seem to make any noticeable difference in making un-nice'd
processes more responsive (not sure there's any relationship in
the underlying causes of that, though).

-- Ian



