Date: Mon, 2 Jul 2001 11:12:51 -0500
From: Alfred Perlstein
To: "Michael C. Wu"
Cc: smp@freebsd.org
Subject: Re: per cpu runqueues, cpu affinity and cpu binding.

* Michael C. Wu [010702 09:36] wrote:
>
> I don't think doing per-thread affinity is a good idea, because
> we want to keep threads lightweight.

That's true, and one can also optimize certain operations such as TLB
shootdown when all threads of a process are on a single cpu.

> You may want to take a look at this url about processor affinity:

:)

> http://www.isi.edu/lsam/tools/autosearch/load_balancing/19970804.html

Sorry, there's a boatload of URLs on that page.  I've chosen a couple
of them and the info isn't very useful (old archived emails); perhaps
you can give a more direct link to this information?

> | The current way it is implemented is that for unbound processes
> | there is a double linkage: basically, an unbound process will be on
> | both the queue of the cpu it last ran on and the global queue.  A
> | certain weight is assigned to tip the scales in favor of running a
> | process that last ran on a particular cpu, basically 4 * RQ_PPQ (see
> | the mod to
>
> Is there a special reason for choosing 4 * RQ_PPQ?

Yes, I thought it was a good value. :)

> | runq_choose()).  This could be adjusted in order to give either
> | higher priority processes a boost, or a process that last ran on
> | the cpu pulling it off the runqueue a boost.
> |
> | Bound processes only exist on the per-cpu queue that they are bound
> | to.
> |
> | What I'd actually prefer is no global queue: when schedcpu() is
> | called it would balance out the processes amongst the per-cpu
> | queues, or if a particular cpu realized it was stuck with a lot of
> | high or low priority processes while another cpu is occupied with
> | the opposite, it would attempt to migrate or steal depending on the
> | type of imbalance going on.  Suggestions on how to do this would
> | also be appreciated. :)
>
> An actual empirical measurement is required in this case.
> When can we justify the cache performance loss of switching to another
> CPU?  In addition, once this process is switched to another CPU,
> we want to keep it there.

That's the intention of the redistribution code: if it only happens
every N times schedcpu() is called, then for the time between calls we
will have hard affinity, and we possibly lose it once, twice, HZ times
a second?

Anyhow, I really would like to hear about stuff that maps to code for
this one.
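[Editor's note: a minimal userland sketch of the "no global queue,
rebalance the per-cpu queues at schedcpu() time" idea described above.
Apart from the runqcpu name, everything below is invented for
illustration; the length-only steal policy is only a stand-in for the
priority-aware migrate/steal policy Alfred is asking for suggestions
on, and this is not code from the patch.]

/*
 * Sketch: per-cpu run queues with a periodic rebalance pass.
 * Names and policy are illustrative only.
 */
#include <sys/queue.h>
#include <stdio.h>

#define NCPU	4

struct proc {
	int			p_pri;		/* unused here; a real policy would look at it */
	TAILQ_ENTRY(proc)	p_procq;
};

TAILQ_HEAD(rq, proc);

static struct rq	runqcpu[NCPU];		/* one queue per cpu, no global queue */
static int		rq_len[NCPU];

/*
 * Called (in spirit) from schedcpu(): steal from the longest queue
 * for the shortest one until the imbalance is small enough that
 * migrating would not be worth losing cache affinity.
 */
static void
rebalance(void)
{
	struct proc *p;
	int busiest, idlest, i;

	for (;;) {
		busiest = idlest = 0;
		for (i = 1; i < NCPU; i++) {
			if (rq_len[i] > rq_len[busiest])
				busiest = i;
			if (rq_len[i] < rq_len[idlest])
				idlest = i;
		}
		if (rq_len[busiest] - rq_len[idlest] <= 1)
			break;
		p = TAILQ_LAST(&runqcpu[busiest], rq);
		TAILQ_REMOVE(&runqcpu[busiest], p, p_procq);
		rq_len[busiest]--;
		TAILQ_INSERT_TAIL(&runqcpu[idlest], p, p_procq);
		rq_len[idlest]++;
	}
}

int
main(void)
{
	static struct proc procs[8];
	int i;

	for (i = 0; i < NCPU; i++)
		TAILQ_INIT(&runqcpu[i]);

	/* Pile everything onto cpu 0 to create an obvious imbalance. */
	for (i = 0; i < 8; i++) {
		procs[i].p_pri = i;
		TAILQ_INSERT_TAIL(&runqcpu[0], &procs[i], p_procq);
		rq_len[0]++;
	}

	rebalance();

	for (i = 0; i < NCPU; i++)
		printf("cpu%d: %d procs\n", i, rq_len[i]);
	return (0);
}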
> | The attached bindcpu.c program will need sys/pioctl.h installed to
> | compile.  Once it is compiled and the kernel is rebuilt (don't
> | forget modules, as the size of struct proc has changed), you can
> | use it to bind processes like so:
> |
> | ./bindcpu 1    # bind curproc/pid to cpu 1
> | ./bindcpu -1   # unbind
>
> This interface may not be the best way to do it.  We can figure this
> out later.

I think it's sorta cute; Mike Smith suggested something along this
line.

> | Index: fs/procfs/procfs_vnops.c
> | ===================================================================
> | RCS file: /home/ncvs/src/sys/fs/procfs/procfs_vnops.c,v
> | retrieving revision 1.98
> | diff -u -r1.98 procfs_vnops.c
> | --- fs/procfs/procfs_vnops.c	2001/05/25 16:59:04	1.98
> | +++ fs/procfs/procfs_vnops.c	2001/07/01 16:48:51
>
> | +
> | +	if ((p->p_sflag & PS_BOUND) == 0) {
> | +		cpu = p->p_lastcpu;
> | +		if (cpu < 0 || cpu >= mp_ncpus)
> | +			cpu = PCPU_GET(cpuid);
> | +		p->p_rqcpu = cpu;
> | +		runq_setbit(rq, pri);
> | +		rqh = &rq->rq_queues[pri];
> | +		CTR4(KTR_RUNQ, "runq_add: p=%p pri=%d %d rqh=%p",
> | +		    p, p->p_pri.pri_level, pri, rqh);
> | +		TAILQ_INSERT_TAIL(rqh, p, p_procq);
> | +	} else {
> | +		CTR2(KTR_RUNQ, "runq_add: proc %p bound to cpu %d",
> | +		    p, (int)p->p_rqcpu);
> | +		cpu = p->p_rqcpu;
> | +	}
>
> I recall a better algorithm in the almighty TAOCP.  Will look
> it up when I get back.
>
> | +	cpu = PCPU_GET(cpuid);
> | +	pricpu = runq_findbit(&runqcpu[cpu]);
> | +	pri = runq_findbit(rq);
> | +	CTR2(KTR_RUNQ, "runq_choose: pri=%d cpupri=%d", pri, pricpu);
> | +	if (pricpu != -1 && (pricpu <= pri + 4 * RQ_PPQ || pri == -1)) {
> | +		pri = pricpu;
> | +		rqh = &runqcpu[cpu].rq_queues[pri];
> | +	} else if (pri != -1) {
> | +		rqh = &rq->rq_queues[pri];
> | +	} else {
> | +		CTR1(KTR_RUNQ, "runq_choose: idleproc pri=%d", pri);
> | +		return (PCPU_GET(idleproc));
> | +	}
>
> Do you intend the algorithm to be this simple?  Or are you going to
> change it in the future?

Ah!  This is important.  I would like it to be more complex in the
future if it buys performance, but you should know that compiling in
the KTR stuff causes a significant performance degradation, even
though KTR is (or should be) pretty fast.  Basically, we don't have
much room here for a very complex algorithm; this is why I wanted to
move it to schedcpu().

-- 
-Alfred Perlstein [alfred@freebsd.org]
Ok, who wrote this damn function called '??'?
And why do my programs keep crashing in it?
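[Editor's note: the bindcpu.c attachment is not reproduced in this
archive.  The following is a minimal sketch of what such a tool might
look like, assuming the patch exposes binding through a procfs ioctl.
The ioctl name PIOCBINDCPU and the use of the /proc/curproc/mem node
are guesses made for illustration only; they are not taken from the
actual attachment or patch.]

/*
 * Sketch of a bindcpu-style tool: bind the calling process to the cpu
 * named on the command line, or -1 to unbind.
 */
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/pioctl.h>

#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
	int cpu, fd;

	if (argc != 2) {
		fprintf(stderr, "usage: bindcpu <cpu | -1>\n");
		exit(1);
	}
	cpu = atoi(argv[1]);		/* -1 means "unbind" */

	/* procfs ioctls are issued against the process's "mem" node. */
	fd = open("/proc/curproc/mem", O_RDWR);
	if (fd == -1)
		err(1, "open /proc/curproc/mem");

	/* PIOCBINDCPU is a made-up name standing in for the new ioctl. */
	if (ioctl(fd, PIOCBINDCPU, &cpu) == -1)
		err(1, "ioctl");

	close(fd);
	return (0);
}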