Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 11 Jul 2017 03:45:48 +0000 (UTC)
From:      Mateusz Guzik <mjg@FreeBSD.org>
To:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-10@freebsd.org
Subject:   svn commit: r320885 - stable/10/sys/sys
Message-ID:  <201707110345.v6B3jmkt054654@repo.freebsd.org>

next in thread | raw e-mail | index | archive | help
Author: mjg
Date: Tue Jul 11 03:45:47 2017
New Revision: 320885
URL: https://svnweb.freebsd.org/changeset/base/320885

Log:
  Remove waiters check from the inline rw wunlock routine.
  
  This is a direct commit to stable/10.
  
  r310979 is a merge of depessimisation of locking primitives.
  The important part was getting rid of an attempt to grab the lock in the
  slow path immediately after the fast path failed. In addition to that
  temporary checks were added before all atomic ops. They have no impact on
  semantic nor correctness, they only avoid an atomic operation which is
  likely to fail.
  
  After the addition of atomic_fcmpset and further changes said checks
  became pessimal and got removed. This may get merged to stable/10.
  
  Reports started showing up about a crash in all branches having extra
  checks. The codepath is:
  .. -> vm_map_delete -> __rw_wunlock_hard -> turnstile_broadcast
  
  The kernel crashed trying to wake up nonexistent waiters. The lock value
  as found in the vmcore matches the panicking thread, so in particular
  there was no waiters bit set. The bit can only be cleared by the current
  owner.
  
  A debug patch was provided, which reportedly had a side effect of getting
  rid of the issue.
  
  Also one of the reporters said that reverting the patch which adds the
  extra checks makes the crash go away.
  
  It was also reported that head with the fcmpset changes (explicit checks
  removed) also stops crashing.
  
  Finally, one user tested crashing stable/10 variant with just the rw
  wunlock check removed.
  
  The common case in all but one reports was an Intel Atom cpu. Claiming
  a cpu bug at this point is bold and I'm going to refrain from it, but
  right now apart from cpu-specific optimisation made by the compiler on
  custom kernel compiles I don't see how this can be a software bug.
  
  This will have to be investigated more.
  
  Meanwhile, restore rw wunlock to pre-r310979 state.
  
  PR:		213903

Modified:
  stable/10/sys/sys/rwlock.h

Modified: stable/10/sys/sys/rwlock.h
==============================================================================
--- stable/10/sys/sys/rwlock.h	Tue Jul 11 00:32:48 2017	(r320884)
+++ stable/10/sys/sys/rwlock.h	Tue Jul 11 03:45:47 2017	(r320885)
@@ -109,7 +109,7 @@
 									\
 	if ((rw)->rw_recurse)						\
 		(rw)->rw_recurse--;					\
-	else if ((rw)->rw_lock != _tid || !_rw_write_unlock((rw), _tid))\
+	else if (!_rw_write_unlock((rw), _tid))				\
 		_rw_wunlock_hard((rw), _tid, (file), (line));		\
 } while (0)
 



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201707110345.v6B3jmkt054654>