Date:      Tue, 15 Sep 2009 16:56:18 +0000 (UTC)
From:      Attilio Rao <attilio@FreeBSD.org>
To:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   svn commit: r197223 - head/sys/kern
Message-ID:  <200909151656.n8FGuIW3095139@svn.freebsd.org>

Author: attilio
Date: Tue Sep 15 16:56:17 2009
New Revision: 197223
URL: http://svn.freebsd.org/changeset/base/197223

Log:
  Fix sched_switch_migrate():
  - In 8.x and above the run-queue locks are no longer shared even in the
    HTT case, so remove the special case.
  - The deadlock explained in the removed comment here is still possible
    even with different locks, with the contribution of tdq_lock_pair().
    An explanation is here:
    (hypothesis: a thread needs to migrate to another CPU; thread1 is doing
    sched_switch_migrate() and thread2 is the one handling the sched_switch()
    request. In other words, thread1 is the thread that needs to migrate
    and thread2 is a thread that is going to be preempted, most likely an
    idle thread. Also, 'old' refers to the context (in terms of
    run-queue and CPU) thread1 is leaving and 'new' refers to the
    context thread1 is going into.  Finally, thread3 is doing tdq_idletd()
    or sched_balance() and is definitely doing tdq_lock_pair())
  
    * thread1 blocks its td_lock. Now td_lock is 'blocked'
    * thread1 drops its old runqueue lock
    * thread1 acquires the new runqueue lock
    * thread1 adds itself to the new runqueue and sends an IPI_PREEMPT
      through tdq_notify() to the new CPU
    * thread1 drops the new lock
    * thread3, scanning the runqueues, locks the old lock
    * thread2 receives the IPI_PREEMPT and does thread_lock() with td_lock
      pointing to the new runqueue
    * thread3 wants to acquire the new runqueue lock, but it can't because
      it is held by thread2 so it spins
    * thread1 wants to acquire old lock, but as long as it is held by
      thread3 it can't
    * thread2, going further, at some point wants to switch in thread1,
      but it will wait forever because thread1->td_lock is in blocked state
  
  This deadlock has manifested mostly on 7.x and has been reported several
  times on mailing lists under the heading 'spinlock held too long'.
  Many thanks to des@ for having worked hard on producing suitable textdumps
  and to Jeff for help with the comment wording.
  
  Reviewed by:	jeff
  Reported by:	des, others
  Tested by:	des, Giovanni Trematerra
  		<giovanni dot trematerra at gmail dot com>
  		(STABLE_7 based version)

Modified:
  head/sys/kern/sched_ule.c

Modified: head/sys/kern/sched_ule.c
==============================================================================
--- head/sys/kern/sched_ule.c	Tue Sep 15 12:51:22 2009	(r197222)
+++ head/sys/kern/sched_ule.c	Tue Sep 15 16:56:17 2009	(r197223)
@@ -1749,19 +1749,19 @@ sched_switch_migrate(struct tdq *tdq, st
 	 */
 	spinlock_enter();
 	thread_block_switch(td);	/* This releases the lock on tdq. */
-	TDQ_LOCK(tdn);
-	tdq_add(tdn, td, flags);
-	tdq_notify(tdn, td);
+
 	/*
-	 * After we unlock tdn the new cpu still can't switch into this
-	 * thread until we've unblocked it in cpu_switch().  The lock
-	 * pointers may match in the case of HTT cores.  Don't unlock here
-	 * or we can deadlock when the other CPU runs the IPI handler.
+	 * Acquire both run-queue locks before placing the thread on the new
+	 * run-queue to avoid deadlocks created by placing a thread with a
+	 * blocked lock on the run-queue of a remote processor.  The deadlock
+	 * occurs when a third processor attempts to lock the two queues in
+	 * question while the target processor is spinning with its own
+	 * run-queue lock held while waiting for the blocked lock to clear.
 	 */
-	if (TDQ_LOCKPTR(tdn) != TDQ_LOCKPTR(tdq)) {
-		TDQ_UNLOCK(tdn);
-		TDQ_LOCK(tdq);
-	}
+	tdq_lock_pair(tdn, tdq);
+	tdq_add(tdn, td, flags);
+	tdq_notify(tdn, td);
+	TDQ_UNLOCK(tdn);
 	spinlock_exit();
 #endif
 	return (TDQ_LOCKPTR(tdn));
