Date:      Fri, 13 May 2011 21:52:16 +0300
From:      Andriy Gapon <avg@FreeBSD.org>
To:        Max Laier <max@love2party.net>
Cc:        freebsd-current@FreeBSD.org
Subject:   Re: proposed smp_rendezvous change
Message-ID:  <4DCD7DE0.7070400@FreeBSD.org>
In-Reply-To: <201105131150.57548.max@love2party.net>
References:  <4DCD357D.6000109@FreeBSD.org> <201105131041.59981.max@love2party.net> <4DCD4E21.7020800@FreeBSD.org> <201105131150.57548.max@love2party.net>

This is a multi-part message in MIME format.
--------------080904030706060507060701
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

on 13/05/2011 18:50 Max Laier said the following:
> On Friday 13 May 2011 11:28:33 Andriy Gapon wrote:
>> on 13/05/2011 17:41 Max Laier said the following:
>>> this ncpus isn't the one you are looking for.
>>
>> Thank you!
>>
>> Here's an updated patch:
> 
> Can you attach the patch, so I can apply it locally.  This code is really hard 
> to read without context.  Some more comments inline ...

Attached.

>>
>> Index: sys/kern/subr_smp.c
>> ===================================================================
>> --- sys/kern/subr_smp.c	(revision 221835)
>> +++ sys/kern/subr_smp.c	(working copy)
>> @@ -316,19 +316,14 @@
>>  	void (*local_action_func)(void*)   = smp_rv_action_func;
>>  	void (*local_teardown_func)(void*) = smp_rv_teardown_func;
>>
>> -	/* Ensure we have up-to-date values. */
>> -	atomic_add_acq_int(&smp_rv_waiters[0], 1);
>> -	while (smp_rv_waiters[0] < smp_rv_ncpus)
>> -		cpu_spinwait();
>> -
> 
> You really need this for architectures that need the memory barrier to ensure 
> consistency.  We also need to move the reads of smp_rv_* below this point to 
> provide a consistent view.

I thought that this would be handled automatically by the fact that the master
CPU sets smp_rv_waiters[0] using an atomic operation with release semantics.
But I am not very proficient in these matters...
I also fail to see why we need to require that all CPUs gather at this
point/condition.

That is, my point is that we don't start a new rendezvous until a previous one
is completely finished.  Then we set up the new rendezvous, finish the setup
with an operation with release semantics and only then notify the target CPUs.
I can't see how the slave CPUs would see stale values in the rendezvous
pseudo-object, but, OTOH, I am not very familiar with architectures that have
weaker memory consistency rules than x86.
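
To make the ordering argument above concrete, here is a minimal userland
sketch of the publish/consume pattern I have in mind.  It is not the kernel
code: it uses C11 atomics and pthreads in place of atomic(9) and IPIs, and
the names rv_action, rv_arg, rv_ready, rv_done and rendezvous_demo() are
hypothetical stand-ins for the smp_rv_* variables.  Note also that the
targets here do an explicit acquire load before reading the shared fields;
whether the kernel path gets equivalent ordering without the removed block
is exactly the open question.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the smp_rv_* globals; not the kernel code. */
static void (*rv_action)(void *);	/* plain field, written by master */
static void *rv_arg;			/* plain field, written by master */
static atomic_int rv_ready;		/* published with release semantics */
static atomic_int rv_done;		/* number of targets that finished */

static void
count_action(void *arg)
{
	atomic_fetch_add_explicit((atomic_int *)arg, 1, memory_order_relaxed);
}

static void *
target_cpu(void *unused)
{
	(void)unused;
	/* The acquire load pairs with the master's release store. */
	while (atomic_load_explicit(&rv_ready, memory_order_acquire) == 0)
		sched_yield();	/* the kernel would use cpu_spinwait() */

	/* These plain reads are ordered after the acquire load above. */
	void (*local_action)(void *) = rv_action;
	void *local_arg = rv_arg;

	assert(local_action != NULL);
	local_action(local_arg);
	atomic_fetch_add_explicit(&rv_done, 1, memory_order_release);
	return (NULL);
}

/* Runs one rendezvous on nthreads targets; returns how many ran the action. */
static int
rendezvous_demo(int nthreads)
{
	enum { MAX_THREADS = 16 };
	pthread_t tids[MAX_THREADS];
	atomic_int hits;
	int created = 0, i;

	if (nthreads < 1 || nthreads > MAX_THREADS)
		return (-1);
	atomic_init(&hits, 0);
	atomic_store(&rv_ready, 0);
	atomic_store(&rv_done, 0);

	for (i = 0; i < nthreads; i++)
		if (pthread_create(&tids[created], NULL, target_cpu, NULL) == 0)
			created++;

	/* Set up the rendezvous with plain stores ... */
	rv_action = count_action;
	rv_arg = &hits;
	/* ... and only then publish it with a release store. */
	atomic_store_explicit(&rv_ready, 1, memory_order_release);

	/* Do not reuse the shared fields until every target is done. */
	while (atomic_load_explicit(&rv_done, memory_order_acquire) < created)
		sched_yield();

	for (i = 0; i < created; i++)
		pthread_join(tids[i], NULL);
	return (atomic_load(&hits));
}
```

Calling rendezvous_demo(4) should return 4 when all four threads are
created, since each target runs the action exactly once.  The final wait on
rv_done plays the role of "don't start a new rendezvous until the previous
one is completely finished".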

-- 
Andriy Gapon

--------------080904030706060507060701
Content-Type: text/plain;
 name="smp_rv.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="smp_rv.diff"

Index: sys/kern/subr_smp.c
===================================================================
--- sys/kern/subr_smp.c	(revision 221835)
+++ sys/kern/subr_smp.c	(working copy)
@@ -316,19 +316,14 @@
 	void (*local_action_func)(void*)   = smp_rv_action_func;
 	void (*local_teardown_func)(void*) = smp_rv_teardown_func;
 
-	/* Ensure we have up-to-date values. */
-	atomic_add_acq_int(&smp_rv_waiters[0], 1);
-	while (smp_rv_waiters[0] < smp_rv_ncpus)
-		cpu_spinwait();
-
 	/* setup function */
 	if (local_setup_func != smp_no_rendevous_barrier) {
 		if (smp_rv_setup_func != NULL)
 			smp_rv_setup_func(smp_rv_func_arg);
 
 		/* spin on entry rendezvous */
-		atomic_add_int(&smp_rv_waiters[1], 1);
-		while (smp_rv_waiters[1] < smp_rv_ncpus)
+		atomic_add_int(&smp_rv_waiters[0], 1);
+		while (smp_rv_waiters[0] < smp_rv_ncpus)
                 	cpu_spinwait();
 	}
 
@@ -337,12 +332,16 @@
 		local_action_func(local_func_arg);
 
 	/* spin on exit rendezvous */
-	atomic_add_int(&smp_rv_waiters[2], 1);
-	if (local_teardown_func == smp_no_rendevous_barrier)
+	atomic_add_int(&smp_rv_waiters[1], 1);
+	if (local_teardown_func == smp_no_rendevous_barrier) {
+		atomic_add_int(&smp_rv_waiters[2], 1);
                 return;
-	while (smp_rv_waiters[2] < smp_rv_ncpus)
+	}
+	while (smp_rv_waiters[1] < smp_rv_ncpus)
 		cpu_spinwait();
 
+	atomic_add_int(&smp_rv_waiters[2], 1);
+
 	/* teardown function */
 	if (local_teardown_func != NULL)
 		local_teardown_func(local_func_arg);
@@ -377,6 +376,10 @@
 	/* obtain rendezvous lock */
 	mtx_lock_spin(&smp_ipi_mtx);
 
+	/* Wait for any previous unwaited rendezvous to finish. */
+	while (atomic_load_acq_int(&smp_rv_waiters[2]) < smp_rv_ncpus)
+		cpu_spinwait();
+
 	/* set static function pointers */
 	smp_rv_ncpus = ncpus;
 	smp_rv_setup_func = setup_func;
@@ -395,7 +398,7 @@
 		smp_rendezvous_action();
 
 	if (teardown_func == smp_no_rendevous_barrier)
-		while (atomic_load_acq_int(&smp_rv_waiters[2]) < ncpus)
+		while (atomic_load_acq_int(&smp_rv_waiters[1]) < ncpus)
 			cpu_spinwait();
 
 	/* release lock */

--------------080904030706060507060701--


