Date: Sat, 26 Aug 2017 10:50:16 -0700 (PDT)
From: Don Lewis <truckman@FreeBSD.org>
To: avg@FreeBSD.org
Cc: freebsd-arch@FreeBSD.org
Subject: Re: ULE steal_idle questions
Message-ID: <201708261750.v7QHoG2c053745@gw.catspoiler.org>
In-Reply-To: <201708251824.v7PIOA6q048321@gw.catspoiler.org>
On 25 Aug, To: avg@FreeBSD.org wrote:
> On 24 Aug, To: avg@FreeBSD.org wrote:
>> Aside from the Ryzen problem, I think the steal_idle code should be
>> re-written so that it doesn't block interrupts for so long.  In its
>> current state, interrupt latency increases with the number of cores
>> and the complexity of the topology.
>>
>> What I'm thinking is that we should set a flag at the start of the
>> search for a thread to steal.  If we are preempted by another, higher
>> priority thread, that thread will clear the flag.  Next we start the
>> loop to search up the hierarchy.  Once we find a candidate CPU:
>>
>>         steal = TDQ_CPU(cpu);
>>         CPU_CLR(cpu, &mask);
>>         tdq_lock_pair(tdq, steal);
>>         if (tdq->tdq_load != 0) {
>>                 goto out;  /* to exit loop and switch to the new thread */
>>         }
>>         if (flag was cleared) {
>>                 tdq_unlock_pair(tdq, steal);
>>                 goto restart;  /* restart the search */
>>         }
>>         if (steal->tdq_load < thresh || steal->tdq_transferable == 0 ||
>>             tdq_move(steal, tdq) == 0) {
>>                 tdq_unlock_pair(tdq, steal);
>>                 continue;
>>         }
>> out:
>>         TDQ_UNLOCK(steal);
>>         clear flag;
>>         mi_switch(SW_VOL | SWT_IDLE, NULL);
>>         thread_unlock(curthread);
>>         return (0);
>>
>> And we also have to clear the flag if we did not find a thread to
>> steal.
>
> I've implemented something like this and added a bunch of counters to
> it to get a better understanding of its behavior.  Instead of adding a
> flag to detect preemption, I used the same switchcnt test as is used
> by sched_idletd().  These are the results of a ~9 hour poudriere run:
>
> kern.sched.steal.none:    9971668   # no threads were stolen
> kern.sched.steal.fail:    23709     # unable to steal from cpu=sched_highest()
> kern.sched.steal.level2:  191839    # somewhere on this chip
> kern.sched.steal.level1:  557659    # a core on this CCX
> kern.sched.steal.level0:  4555426   # the other SMT thread on this core
> kern.sched.steal.restart: 404       # preemption detected so restart the search
> kern.sched.steal.call:    15276638  # of times tdq_idled() called
>
> There are a few surprises here.
>
> One is the number of failed moves.  I don't know if the load on the
> source CPU fell below thresh, tdq_transferable went to zero, or if
> tdq_move() failed.  I also wonder if the failures are evenly
> distributed across CPUs.  It is possible that these failures are
> concentrated on CPU 0, which handles most interrupts.  If interrupts
> don't affect switchcnt, then the data collected by sched_highest()
> could be a bit stale and we would not know it.

Most of the above failed moves were due to either tdq_load dropping
below the threshold or tdq_transferable going to zero.  These are
evenly distributed across the CPUs that we want to steal from.  I did
not bin the results by which CPU this code was running on.  Actual
failures of tdq_move() are bursty and not evenly distributed across
CPUs.

I've created this review for my changes:

https://reviews.freebsd.org/D12130
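[Editor's note: for readers following along, here is a minimal sketch of
the candidate-CPU step with the switchcnt test substituted for the flag
described in the quoted pseudocode.  It is illustrative only and is not
the change under review in D12130.  It assumes the context of
tdq_idled() in sys/kern/sched_ule.c: "tdq" is the local run queue,
"cpu" was chosen by sched_highest(), and "thresh", "mask", "steal", and
"switchcnt" are locals of that function; the topology walk around it is
elided.]

        /*
         * Illustrative sketch only (not the D12130 patch).  The idea:
         * record our own switch count before the unlocked topology
         * search; any thread that preempts us bumps tdq_switchcnt, so a
         * changed count after we take the locks means the data gathered
         * by sched_highest() may be stale and the search should restart.
         */
restart:
        /* Same counter pair that sched_idletd() checks. */
        switchcnt = tdq->tdq_switchcnt + tdq->tdq_oldswitchcnt;
        for (;;) {
                /* ... walk up the topology, pick cpu via sched_highest() ... */
                steal = TDQ_CPU(cpu);
                CPU_CLR(cpu, &mask);
                tdq_lock_pair(tdq, steal);
                if (tdq->tdq_load != 0)
                        goto out;       /* work arrived locally; run it */
                if (switchcnt != tdq->tdq_switchcnt + tdq->tdq_oldswitchcnt) {
                        /* We were preempted during the search; retry. */
                        tdq_unlock_pair(tdq, steal);
                        goto restart;
                }
                if (steal->tdq_load < thresh || steal->tdq_transferable == 0 ||
                    tdq_move(steal, tdq) == 0) {
                        /* Candidate no longer worth stealing from. */
                        tdq_unlock_pair(tdq, steal);
                        continue;       /* try the next candidate */
                }
                break;
        }
out:
        /* Our own tdq lock doubles as curthread's lock for mi_switch(). */
        TDQ_UNLOCK(steal);
        mi_switch(SW_VOL | SWT_IDLE, NULL);
        thread_unlock(curthread);
        return (0);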