Date:      Fri, 3 Aug 2018 14:27:29 +0000 (UTC)
From:      Andriy Gapon <avg@FreeBSD.org>
To:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   svn commit: r337255 - in head/sys: kern sys
Message-ID:  <201808031427.w73ERT7I076591@repo.freebsd.org>

Author: avg
Date: Fri Aug  3 14:27:28 2018
New Revision: 337255
URL: https://svnweb.freebsd.org/changeset/base/337255

Log:
  safer wait-free iteration of shared interrupt handlers
  
  The code that iterates a list of interrupt handlers for a (shared)
  interrupt, whether in the ISR context or in the context of an interrupt
  thread, does so in a lock-free fashion.  Thus, the routines that modify
  the list need to take special steps to ensure that the iterating code
  has a consistent view of the list.  Previously, those routines tried to
  play nice only with the code running in the ithread context.  The
  iteration in the ISR context was left to chance.
  
  Since commit r336635, atomic operations and memory fences have been used
  to ensure that the ie_handlers list is always safe to navigate with
  respect to insertion and removal of list elements.
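  
  (As background on why the lock-free iteration is safe at all: an element
  is fully initialized before a single release store publishes it, so a
  concurrent reader sees either the complete element or no element.  Below
  is a minimal user-space sketch of that publication pattern using C11
  atomics in place of the kernel's atomic(9) primitives; the names are
  illustrative, not from the tree.)
  
	#include <stdatomic.h>

	struct node {
		int data;
		struct node *_Atomic next;
	};

	/*
	 * Writers are serialized by a mutex (as in the commit), so a
	 * plain release store suffices; no CAS loop is needed.
	 */
	static void
	insert_head(struct node *_Atomic *head, struct node *n)
	{
		/* Fully initialize the element, including its link... */
		atomic_store_explicit(&n->next,
		    atomic_load_explicit(head, memory_order_relaxed),
		    memory_order_relaxed);
		/* ...and only then publish it; a reader loading the head
		 * with acquire semantics sees a complete node or none. */
		atomic_store_explicit(head, n, memory_order_release);
	}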
  
  There is still a question of when it is safe to actually free a removed
  element.
  
  The idea of this change is somewhat similar to that of epoch-based
  reclamation, with some simplifications relative to the general scheme.
  All writers are serialized using a mutex, so we do not need to worry
  about concurrent modifications.  Also, all read accesses from the open
  context are serialized too.
  
  So, we can get away with just two epochs / phases.  When a thread
  removes an element, it switches the global phase from the current phase
  to the other and then drains the previous phase.  Only after the
  draining is the removed element actually freed.  The code that iterates
  the list in the ISR context takes a snapshot of the global phase and
  increments the use count of that phase before iterating the list.  The
  use count (in the same phase) is decremented after the iteration.  This
  ensures that there is no iteration over the removed element by the time
  it gets freed.
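  
  To make the two-phase scheme concrete, here is a minimal user-space
  sketch using C11 atomics instead of the kernel's atomic(9) API.  The
  structure and function names are invented for illustration; the actual
  implementation is in the diff below.
  
	#include <stdatomic.h>

	struct event {
		atomic_int phase;	/* current epoch: 0 or 1 */
		atomic_int active[2];	/* in-flight readers per epoch */
	};

	/* Reader side, analogous to the ISR path in intr_event_handle(). */
	static void
	reader(struct event *ev)
	{
		int phase;

		phase = atomic_load_explicit(&ev->phase, memory_order_relaxed);
		atomic_fetch_add_explicit(&ev->active[phase], 1,
		    memory_order_relaxed);
		/* Keep the list loads from moving before the increment. */
		atomic_thread_fence(memory_order_seq_cst);

		/* ... iterate the lock-free list here ... */

		/* Release: list loads complete before the decrement. */
		atomic_fetch_sub_explicit(&ev->active[phase], 1,
		    memory_order_release);
	}

	/* Writer side, analogous to intr_event_barrier(); callers are
	 * serialized by a mutex. */
	static void
	phase_barrier(struct event *ev)
	{
		int phase;

		phase = atomic_load_explicit(&ev->phase, memory_order_relaxed);
		/* Direct new readers to the other epoch. */
		atomic_store_explicit(&ev->phase, !phase, memory_order_release);
		/* Store->Load ordering requires a full fence; acquire and
		 * release fences alone cannot provide it. */
		atomic_thread_fence(memory_order_seq_cst);
		/* Drain readers still in the old epoch. */
		while (atomic_load_explicit(&ev->active[phase],
		    memory_order_relaxed) > 0)
			/* spin; the kernel uses cpu_spinwait() */;
		atomic_thread_fence(memory_order_acquire);
	}
  
  A writer unlinks the element from the list (under the mutex), calls
  phase_barrier(), and only then frees the element.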
  
  This commit also simplifies the coordination with the interrupt thread
  context.  Now we always schedule the interrupt thread when removing one
  of the handlers for its interrupt.  This makes the code both simpler and
  safer, as the interrupt thread masks the interrupt, thus ensuring that
  there is no interaction with the ISR context.
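  
  For reference, a condensed kernel-style sketch of the ithread side that
  completes such a removal (the real logic lives in
  intr_event_execute_handlers(), which this diff does not touch; locking
  and list bookkeeping are simplified, and this fragment is not meant to
  compile standalone):
  
	static void
	execute_handlers_sketch(struct intr_event *ie)
	{
		struct intr_handler *ih, *next;

		/* The interrupt is masked while the ithread runs, so
		 * there is no concurrent ISR iteration to worry about. */
		CK_SLIST_FOREACH_SAFE(ih, &ie->ie_handlers, ih_next, next) {
			if (ih->ih_flags & IH_DEAD) {
				mtx_lock(&ie->ie_lock);
				CK_SLIST_REMOVE(&ie->ie_handlers, ih,
				    intr_handler, ih_next);
				ih->ih_flags &= ~IH_DEAD;
				/* Release the remover sleeping in
				 * intr_event_remove_handler(). */
				wakeup(ih);
				mtx_unlock(&ie->ie_lock);
				continue;
			}
			ih->ih_handler(ih->ih_argument);
		}
	}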
  
  P.S.  This change matters only for shared interrupts, and I realize that
  those are becoming a thing of the past (and quickly).  I also understand
  that the problem that I am trying to solve is extremely rare.
  
  PR:		229106
  Reviewed by:	cem
  Discussed with:	Samy Al Bahra
  MFC after:	5 weeks
  Differential Revision: https://reviews.freebsd.org/D15905

Modified:
  head/sys/kern/kern_intr.c
  head/sys/sys/interrupt.h

Modified: head/sys/kern/kern_intr.c
==============================================================================
--- head/sys/kern/kern_intr.c	Fri Aug  3 14:25:15 2018	(r337254)
+++ head/sys/kern/kern_intr.c	Fri Aug  3 14:27:28 2018	(r337255)
@@ -683,6 +683,45 @@ intr_handler_source(void *cookie)
 }
 
 /*
+ * If intr_event_handle() is running in the ISR context at the time of the call,
+ * then wait for it to complete.
+ */
+static void
+intr_event_barrier(struct intr_event *ie)
+{
+	int phase;
+
+	mtx_assert(&ie->ie_lock, MA_OWNED);
+	phase = ie->ie_phase;
+
+	/*
+	 * Switch phase to direct future interrupts to the other active counter.
+	 * Make sure that any preceding stores are visible before the switch.
+	 */
+	KASSERT(ie->ie_active[!phase] == 0, ("idle phase has activity"));
+	atomic_store_rel_int(&ie->ie_phase, !phase);
+
+	/*
+	 * This code cooperates with wait-free iteration of ie_handlers
+	 * in intr_event_handle.
+	 * Make sure that the removal and the phase update are not reordered
+	 * with the active count check.
+	 * Note that no combination of acquire and release fences can provide
+	 * that guarantee as Store->Load sequences can always be reordered.
+	 */
+	atomic_thread_fence_seq_cst();
+
+	/*
+	 * Now wait on the inactive phase.
+	 * The acquire fence is needed so that all post-barrier accesses
+	 * are after the check.
+	 */
+	while (ie->ie_active[phase] > 0)
+		cpu_spinwait();
+	atomic_thread_fence_acq();
+}
+
+/*
  * Sleep until an ithread finishes executing an interrupt handler.
  *
  * XXX Doesn't currently handle interrupt filters or fast interrupt
@@ -752,44 +791,30 @@ intr_event_remove_handler(void *cookie)
 	}
 
 	/*
-	 * If there is no ithread, then just remove the handler and return.
-	 * XXX: Note that an INTR_FAST handler might be running on another
-	 * CPU!
+	 * If there is no ithread, then directly remove the handler.  Note that
+	 * intr_event_handle() iterates ie_handlers in a lock-less fashion, so
+	 * care needs to be taken to keep ie_handlers consistent and to free
+	 * the removed handler only when ie_handlers is quiescent.
 	 */
 	if (ie->ie_thread == NULL) {
 		CK_SLIST_REMOVE_PREVPTR(prevptr, ih, ih_next);
+		intr_event_barrier(ie);
+		intr_event_update(ie);
 		mtx_unlock(&ie->ie_lock);
 		free(handler, M_ITHREAD);
 		return (0);
 	}
 
 	/*
-	 * If the interrupt thread is already running, then just mark this
-	 * handler as being dead and let the ithread do the actual removal.
-	 *
-	 * During a cold boot while cold is set, msleep() does not sleep,
-	 * so we have to remove the handler here rather than letting the
-	 * thread do it.
+	 * Let the interrupt thread do the job.
+	 * The interrupt source is disabled when the interrupt thread is
+	 * running, so it does not have to worry about interaction with
+	 * intr_event_handle().
 	 */
-	thread_lock(ie->ie_thread->it_thread);
-	if (!TD_AWAITING_INTR(ie->ie_thread->it_thread) && !cold) {
-		handler->ih_flags |= IH_DEAD;
-
-		/*
-		 * Ensure that the thread will process the handler list
-		 * again and remove this handler if it has already passed
-		 * it on the list.
-		 *
-		 * The release part of the following store ensures
-		 * that the update of ih_flags is ordered before the
-		 * it_need setting.  See the comment before
-		 * atomic_cmpset_acq(&ithd->it_need, ...) operation in
-		 * the ithread_execute_handlers().
-		 */
-		atomic_store_rel_int(&ie->ie_thread->it_need, 1);
-	} else
-		CK_SLIST_REMOVE_PREVPTR(prevptr, ih, ih_next);
-	thread_unlock(ie->ie_thread->it_thread);
+	KASSERT((handler->ih_flags & IH_DEAD) == 0,
+	    ("duplicate handle remove"));
+	handler->ih_flags |= IH_DEAD;
+	intr_event_schedule_thread(ie);
 	while (handler->ih_flags & IH_DEAD)
 		msleep(handler, &ie->ie_lock, 0, "iev_rmh", 0);
 	intr_event_update(ie);
@@ -1154,6 +1179,7 @@ intr_event_handle(struct intr_event *ie, struct trapfr
 	struct trapframe *oldframe;
 	struct thread *td;
 	int ret, thread;
+	int phase;
 
 	td = curthread;
 
@@ -1178,6 +1204,15 @@ intr_event_handle(struct intr_event *ie, struct trapfr
 	oldframe = td->td_intr_frame;
 	td->td_intr_frame = frame;
 
+	phase = ie->ie_phase;
+	atomic_add_int(&ie->ie_active[phase], 1);
+
+	/*
+	 * This fence is required to ensure that no later loads are
+	 * re-ordered before the ie_active store.
+	 */
+	atomic_thread_fence_seq_cst();
+
 	CK_SLIST_FOREACH(ih, &ie->ie_handlers, ih_next) {
 		if (ih->ih_filter == NULL) {
 			thread = 1;
@@ -1215,6 +1250,8 @@ intr_event_handle(struct intr_event *ie, struct trapfr
 				thread = 1;
 		}
 	}
+	atomic_add_rel_int(&ie->ie_active[phase], -1);
+
 	td->td_intr_frame = oldframe;
 
 	if (thread) {

Modified: head/sys/sys/interrupt.h
==============================================================================
--- head/sys/sys/interrupt.h	Fri Aug  3 14:25:15 2018	(r337254)
+++ head/sys/sys/interrupt.h	Fri Aug  3 14:27:28 2018	(r337255)
@@ -122,6 +122,8 @@ struct intr_event {
 	struct timeval	ie_warntm;
 	int		ie_irq;		/* Physical irq number if !SOFT. */
 	int		ie_cpu;		/* CPU this event is bound to. */
+	volatile int	ie_phase;	/* Switched to establish a barrier. */
+	volatile int	ie_active[2];	/* Filters in ISR context. */
 };
 
 /* Interrupt event flags kept in ie_flags. */


