From: John Baldwin
To: freebsd-current@freebsd.org
Cc: Ryan Stone
Date: Wed, 6 Apr 2011 14:29:49 -0400
Subject: Re: sched_4bsd startup crash trying to run a bound thread on an AP that hasn't started
Message-Id: <201104061429.50185.jhb@freebsd.org>
References: <201104060836.56542.jhb@freebsd.org>

On Wednesday, April 06, 2011 1:08:20 pm Ryan Stone wrote:
> On Wed, Apr 6, 2011 at 8:36 AM, John Baldwin wrote:
> > Hummm.  Patching 4BSD to use the same route as ULE may be the best solution
> > for now if that is easiest.  Alternatively, you could change 4BSD's
> > sched_add() to not try to kick other CPUs until smp_started is true.
>
> At first I thought that it was a consequence of the way it does CPU
> affinity, but now I see that it shortcuts if smp_started is not true.
> How about something like the following for 4BSD?
>
> --- sched_4bsd.c        (revision 220222)
> +++ sched_4bsd.c        (working copy)
> @@ -1242,14 +1242,14 @@
>          }
>          TD_SET_RUNQ(td);
>
> -        if (td->td_pinned != 0) {
> +        if (smp_started && td->td_pinned != 0) {
>                  cpu = td->td_lastcpu;
>                  ts->ts_runq = &runq_pcpu[cpu];
>                  single_cpu = 1;
>                  CTR3(KTR_RUNQ,
>                      "sched_add: Put td_sched:%p(td:%p) on cpu%d runq", ts, td,
>                      cpu);
> -        } else if (td->td_flags & TDF_BOUND) {
> +        } else if (smp_started && (td->td_flags & TDF_BOUND)) {
>                  /* Find CPU from bound runq. */
>                  KASSERT(SKE_RUNQ_PCPU(ts),
>                      ("sched_add: bound td_sched not on cpu runq"));
> @@ -1258,7 +1258,7 @@
>                  CTR3(KTR_RUNQ,
>                      "sched_add: Put td_sched:%p(td:%p) on cpu%d runq", ts, td,
>                      cpu);
> -        } else if (ts->ts_flags & TSF_AFFINITY) {
> +        } else if (smp_started && (ts->ts_flags & TSF_AFFINITY)) {
>                  /* Find a valid CPU for our cpuset */
>                  cpu = sched_pickcpu(td);
>                  ts->ts_runq = &runq_pcpu[cpu];
>
> The flow control is a bit awkward because of the multiple
> affinity/bound cpu cases.  If somebody prefers the code to be
> structured differently I'd be open to suggestions.

Maybe it could do this:

        if (!smp_started) {
                cpu = NOCPU;
                ts->ts_runq = &runq;
        } else if (td->td_pinned) {
                ...

That would be a smaller patch and I think more obvious to the reader even
though it duplicates the global runq selection.  I would even be ok with a
goto for this case, so that if !smp_started it just jumps to the global runq
bit in the last else.

I guess one other option would be something like this:

        if (smp_started && (td->td_pinned != 0 || td->td_flags & TDF_BOUND ||
            ts->ts_flags & TSF_AFFINITY)) {
                if (td->td_pinned != 0)
                        cpu = td->td_lastcpu;
                else if (td->td_flags & TDF_BOUND) {
                        /* Find CPU from bound runq. */
                        KASSERT(...);
                        cpu = ts->ts_runq - &runq_pcpu[0];
                } else
                        /* Find a valid CPU for our cpuset. */
                        cpu = sched_pickcpu(td);
                ts->ts_runq = &runq_pcpu[cpu];
                single_cpu = 1;
                CTR3(KTR_RUNQ, ...);
        } else {
                /* Global runq case. */
        }

This also avoids duplicating some common code to all the single_cpu cases.

-- 
John Baldwin
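
For readers who want to try the control flow of the last variant outside the
kernel, here is a minimal userland sketch.  It is not the sched_4bsd.c code:
the struct layouts, the flag values, MAXCPU, the NOCPU sentinel value, the
td_first_allowed field, sketch_pickcpu() and pick_runq() are all stand-ins
invented for illustration, and smp_started is passed as a parameter rather
than read from a global.  KASSERT is approximated with assert(), and the CTR3
tracing and single_cpu bookkeeping are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All values below are arbitrary stand-ins, not the kernel's definitions. */
#define NOCPU        (-1)   /* "no particular CPU" sentinel for this sketch */
#define MAXCPU       4      /* hypothetical CPU count */
#define TDF_BOUND    0x1    /* thread was bound to a CPU */
#define TSF_AFFINITY 0x1    /* thread has a restricted cpuset */

struct runq { int unused; };

static struct runq runq;               /* global run queue */
static struct runq runq_pcpu[MAXCPU];  /* per-CPU run queues */

struct td_sched {
    struct runq *ts_runq;   /* run queue this thread is assigned to */
    int ts_flags;
};

struct thread {
    int td_pinned;          /* non-zero while the thread is pinned */
    int td_lastcpu;         /* CPU the thread last ran on */
    int td_flags;
    struct td_sched *td_sched;
    int td_first_allowed;   /* hypothetical: first CPU in the cpuset */
};

/* Hypothetical replacement for sched_pickcpu(): first allowed CPU. */
static int
sketch_pickcpu(const struct thread *td)
{
    return (td->td_first_allowed);
}

/*
 * Choose a run queue for td and return the CPU, or NOCPU when the thread
 * goes on the global queue.  Before the APs are started, every thread is
 * routed to the global queue regardless of pinning, binding or affinity.
 */
int
pick_runq(bool smp_started, struct thread *td)
{
    struct td_sched *ts = td->td_sched;
    int cpu;

    if (smp_started && (td->td_pinned != 0 ||
        (td->td_flags & TDF_BOUND) != 0 ||
        (ts->ts_flags & TSF_AFFINITY) != 0)) {
        if (td->td_pinned != 0)
            cpu = td->td_lastcpu;
        else if ((td->td_flags & TDF_BOUND) != 0) {
            /* A bound thread must already sit on a per-CPU queue. */
            assert(ts->ts_runq != NULL && ts->ts_runq != &runq);
            cpu = (int)(ts->ts_runq - &runq_pcpu[0]);
        } else
            cpu = sketch_pickcpu(td);
        /* Common tail shared by all single-CPU cases. */
        ts->ts_runq = &runq_pcpu[cpu];
    } else {
        /* Global runq case. */
        cpu = NOCPU;
        ts->ts_runq = &runq;
    }
    return (cpu);
}
```

Calling pick_runq(false, &td) for a pinned thread returns NOCPU and leaves it
on the global queue; the same thread with smp_started true lands on
runq_pcpu[td_lastcpu].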