From owner-freebsd-hackers@freebsd.org Mon Apr 17 17:15:52 2017
Subject: Re: Understanding the FreeBSD locking mechanism
To: Chris Torek, imp@bsdimp.com
References: <201704131218.v3DCIBJg093207@elf.torek.net>
Cc: ed@nuxi.nl, freebsd-hackers@freebsd.org, kostikbel@gmail.com, rysto32@gmail.com
From: Yubin Ruan
Date: Tue, 18 Apr 2017 01:15:41 +0800
In-Reply-To: <201704131218.v3DCIBJg093207@elf.torek.net>
List-Id: Technical Discussions relating to FreeBSD

On 2017/4/13 20:18, Chris Torek wrote:
>> I discover that in the current implementation in FreeBSD, spinlock
>> does not disable interrupt entirely:
> [extra-snipped here]
>> 610         /* Give interrupts a chance while we spin. */
>> 611         spinlock_exit();
>> 612         while (m->mtx_lock != MTX_UNOWNED) {
> [more snip]
>
>> This is `_mtx_lock_spin_cookie(...)` in kern/kern_mutex.c, which
>> implements the core logic of spinning. However, as you can see, while
>> spinning, it would enable interrupt "occasionally" and disable it
>> again... What is the rationale for that?
>
> This code snippet is slightly misleading.
> The full code path runs
> from mtx_lock_spin() through __mtx_lock_spin(), which first
> invokes spinlock_enter() and then, in the *contested* case (only),
> calls _mtx_lock_spin_cookie().
>
> spinlock_enter() is:
>
>     td = curthread;
>     if (td->td_md.md_spinlock_count == 0) {
>             flags = intr_disable();
>             td->td_md.md_spinlock_count = 1;
>             td->td_md.md_saved_flags = flags;
>     } else
>             td->td_md.md_spinlock_count++;
>     critical_enter();
>
> so it actually disables interrupts *only* on the transition from
> td->td_md.md_spinlock_count = 0 to td->td_md.md_spinlock_count = 1,
> i.e., the first time we take a spin lock in this thread, whether
> this is a borrowed thread or not. It's possible that interrupts
> are already disabled at this point. If so, td->td_md.md_saved_flags
> has interrupts disabled as well. This is all just an optimization
> that uses a thread-local counter so as to avoid touching hardware
> on nested acquisitions. The details vary widely, but typically,
> touching the actual hardware controls requires flushing the CPU's
> instruction pipeline.
>
> If the compare-and-swap fails, we enter _mtx_lock_spin_cookie()
> and loop, waiting to see if we can obtain the spin lock in time.
> In that case, we don't actually *hold* this particular spin lock
> yet, so we can call spinlock_exit() to undo the effect
> of the outermost spinlock_enter() (in __mtx_lock_spin). That
> decrements the counter. *If* it goes to zero, that also calls
> intr_restore(td->td_md.md_saved_flags).
>
> Hence, if we have failed to obtain our first spin lock, we restore
> the interrupt setting to whatever we saved. If interrupts were
> already locked out (as in a filter-type interrupt handler) this is
> a potentially somewhat expensive no-op. If interrupts were
> enabled previously, this is a somewhat expensive re-enable of
> interrupts -- but that's OK, and maybe good, because we have no
> spin locks of our own yet.
> That means we can take hardware
> interrupts now, and let them borrow our current thread if they are
> that kind of interrupt, or schedule another thread to run if
> appropriate. That might even preempt us, since we do not yet hold
> any spin locks. (But it won't preempt us if we have done a
> critical_enter() before this point.)
>
> (In fact, the spinlock exit/enter calls that you see inside
> _mtx_lock_spin_cookie() wrap a loop that does not use
> compare-and-swap operations at all, but rather ordinary memory
> reads. These are cheaper than CAS operations on a lot of CPUs,
> but they may produce wrong answers when two CPUs are racing to
> write the same location; only a CAS produces a guaranteed answer,
> which might

Hmm... I agree; I mis-read your words. The loop inside
_mtx_lock_spin_cookie() might produce a wrong result, but
_mtx_lock_spin_cookie() as a whole won't, because it finally uses a
compare-and-swap instruction as the authoritative test (i.e., the
loop might produce a "false positive", so we have to check again
with the CAS).

--
yubinr

> still be "you lost the race". The inner loop you are looking at
> occurs after losing a CAS race. Once we think we might *win* a
> future CAS race, _mtx_lock_spin_cookie() calls spinlock_enter()
> again and tries the actual CAS operation, _mtx_obtain_lock_fetch(),
> with interrupts disabled. Note also the calls to cpu_spinwait()
> -- the Linux equivalent macro is cpu_relax() -- which translates
> to a "pause" instruction on amd64.)
>
> Chris