Date:      Fri, 20 Jan 2017 17:40:31 +0000 (UTC)
From:      Hans Petter Selasky <hselasky@FreeBSD.org>
To:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   svn commit: r312551 - head/sys/kern
Message-ID:  <201701201740.v0KHeVNT047968@repo.freebsd.org>

Author: hselasky
Date: Fri Jan 20 17:40:31 2017
New Revision: 312551
URL: https://svnweb.freebsd.org/changeset/base/312551

Log:
  Fix for race leading to endless timer interrupts related to
  configtimer().
  
  During normal operation "state->nextcallopt" will always be less than
  or equal to "state->nextcall" and checking only "state->nextcallopt"
  before calling "callout_process()" is sufficient. However when
  "configtimer()" is called a race might happen requiring both of these
  binary times to be checked.
  
  Short description of race:
  
  1) A configtimer() call will reset both "state->nextcall" and
  "state->nextcallopt" to the same binary time.
  
  2) If a "callout_reset()" call happens between "configtimer()" and the
  next "callout_process()" call, "state->nextcallopt" will get updated
  and "state->nextcall" will remain at the current time. Refer to logic
  inside cpu_new_callout().
  
  3) getnextcpuevent() only respects "state->nextcall" and returns this
  value over and over again, even if it is in the past, until "now >=
  state->nextcallopt" becomes true. Then these two time variables are
  corrected by a "callout_process()" call and the situation goes back to
  normal.
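  
  The three steps above can be sketched as a small user-space model.
  This is only an illustration under stated assumptions, not the kernel
  code: every "model_" name is hypothetical, time is a plain integer
  tick counter, there is a single pending callout, and a timer
  programmed for a time in the past is assumed to fire one tick later.
  The "patched" flag selects between the old check ("nextcallopt" only)
  and the check that also looks at "nextcall".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins; none of these names exist in the kernel. */
typedef int64_t model_sbt_t;		/* plays the role of sbintime_t */
#define	MODEL_SBT_MAX	INT64_MAX	/* plays the role of SBT_MAX */

struct model_state {
	model_sbt_t nextcall;
	model_sbt_t nextcallopt;
};

/* Step 1: both times are reset to the same value. */
static void
model_configtimer(struct model_state *s, model_sbt_t now)
{
	s->nextcall = s->nextcallopt = now;
}

/*
 * Step 2 (assumed shape of the cpu_new_callout() logic): the optimistic
 * time is always updated, the hard time only if the new one is earlier.
 */
static void
model_new_callout(struct model_state *s, model_sbt_t bt, model_sbt_t bt_opt)
{
	s->nextcallopt = bt_opt;
	if (bt >= s->nextcall)
		return;
	s->nextcall = bt;
}

/*
 * Simplified callout processing: returns 1 once the single pending
 * callout has expired, otherwise re-arms both times to its deadline.
 */
static int
model_callout_process(struct model_state *s, model_sbt_t now,
    model_sbt_t deadline)
{
	if (now >= deadline)
		return (1);
	s->nextcall = s->nextcallopt = deadline;
	return (0);
}

/* Count timer interrupts until the pending callout finally runs. */
long
model_count_interrupts(int patched)
{
	struct model_state s;
	model_sbt_t now = 1000, deadline = now + 500;
	long interrupts = 0;
	int fired = 0;

	model_configtimer(&s, now);
	model_new_callout(&s, deadline, deadline);
	while (!fired) {
		/* Step 3: the next event is taken from nextcall only. */
		now = (s.nextcall > now) ? s.nextcall : now + 1;
		interrupts++;
		assert(interrupts <= 1000000);	/* guard against a model bug */
		if (now >= s.nextcallopt || (patched && now >= s.nextcall)) {
			s.nextcall = s.nextcallopt = MODEL_SBT_MAX;
			fired = model_callout_process(&s, now, deadline);
		}
	}
	return (interrupts);
}
```

  Under these assumptions model_count_interrupts(0) returns 500, one
  interrupt per tick until "state->nextcallopt" is reached, while
  model_count_interrupts(1) returns 2.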
  
  The problem manifests itself in different ways. The common factor is
  that the timer process(es) consume all CPU time on one or more CPU
  cores for a long time, blocking other kernel processes from getting
  execution time. This can be seen from the very high interrupt counts
  displayed by "vmstat -i | grep timer" right after boot.
  
  When EARLY_AP_STARTUP was enabled in r310177 the likelihood of hitting
  this bug apparently increased.
  
  Example output from "vmstat -i" before patch:
  cpu0:timer                          7591         69
  cpu9:timer                      39031773     358089
  cpu4:timer                          9359         85
  cpu3:timer                          9100         83
  cpu2:timer                          9620         88
  
  Example output from "vmstat -i" after patch:
  cpu0:timer                          4242         34
  cpu6:timer                          5531         44
  cpu3:timer                          6450         52
  cpu1:timer                          4545         36
  cpu9:timer                          7153         58
  
  Before the patch, cpu9 in the example above was spinning in a loop
  and reached 39 million interrupts just a few seconds after
  bootup. After the patch the timer interrupt counts are more or less
  consistent across CPUs.
  
  Discussed with:		mav @
  Reported by:		several people
  MFC after:		1 week
  Sponsored by:		Mellanox Technologies

Modified:
  head/sys/kern/kern_clocksource.c

Modified: head/sys/kern/kern_clocksource.c
==============================================================================
--- head/sys/kern/kern_clocksource.c	Fri Jan 20 17:39:38 2017	(r312550)
+++ head/sys/kern/kern_clocksource.c	Fri Jan 20 17:40:31 2017	(r312551)
@@ -207,7 +207,7 @@ handleevents(sbintime_t now, int fake)
 		}
 	} else
 		state->nextprof = state->nextstat;
-	if (now >= state->nextcallopt) {
+	if (now >= state->nextcallopt || now >= state->nextcall) {
 		state->nextcall = state->nextcallopt = SBT_MAX;
 		callout_process(now);
 	}


