From owner-freebsd-hackers@freebsd.org Mon Apr 17 17:15:52 2017
Subject: Re: Understanding the FreeBSD locking mechanism
To: Chris Torek, imp@bsdimp.com
References: <201704131218.v3DCIBJg093207@elf.torek.net>
Cc: ed@nuxi.nl, freebsd-hackers@freebsd.org, kostikbel@gmail.com, rysto32@gmail.com
From: Yubin Ruan
Date: Tue, 18 Apr 2017 01:15:41 +0800
In-Reply-To: <201704131218.v3DCIBJg093207@elf.torek.net>
List-Id: Technical Discussions relating to FreeBSD

On 2017/4/13 20:18, Chris Torek wrote:
>> I discover that in the current implementation in FreeBSD, spinlock
>> does not disable interrupt entirely:
> [extra-snipped here]
>> 610         /* Give interrupts a chance while we spin. */
>> 611         spinlock_exit();
>> 612         while (m->mtx_lock != MTX_UNOWNED) {
> [more snip]
>
>> This is `_mtx_lock_spin_cookie(...)` in kern/kern_mutex.c, which
>> implements the core logic of spinning. However, as you can see, while
>> spinning, it would enable interrupt "occasionally" and disable it
>> again... What is the rationale for that?
>
> This code snippet is slightly misleading.
> The full code path runs
> from mtx_lock_spin() through __mtx_lock_spin(), which first
> invokes spinlock_enter() and then, in the *contested* case (only),
> calls _mtx_lock_spin_cookie().
>
> spinlock_enter() is:
>
>     td = curthread;
>     if (td->td_md.md_spinlock_count == 0) {
>             flags = intr_disable();
>             td->td_md.md_spinlock_count = 1;
>             td->td_md.md_saved_flags = flags;
>     } else
>             td->td_md.md_spinlock_count++;
>     critical_enter();
>
> so it actually disables interrupts *only* on the transition from
> td->td_md.md_spinlock_count = 0 to td->td_md.md_spinlock_count = 1,
> i.e., the first time we take a spin lock in this thread, whether
> this is a borrowed thread or not. It's possible that interrupts
> are already disabled at this point. If so, td->td_md.md_saved_flags
> has interrupts disabled as well. This is all just an optimization
> that uses a thread-local counter so as to avoid touching hardware
> on nested acquisitions. The details vary widely, but typically,
> touching the actual hardware controls requires flushing the CPU's
> instruction pipeline.
>
> If the compare-and-swap fails, we enter _mtx_lock_spin_cookie()
> and loop, waiting to see if we can obtain the spin lock in time.
> In that case, we don't actually *hold* this particular spin lock
> yet, so we can call spinlock_exit() to undo the effect
> of the outermost spinlock_enter() (in __mtx_lock_spin). That
> decrements the counter. *If* it goes to zero, that also calls
> intr_restore(td->td_md.md_saved_flags).
>
> Hence, if we have failed to obtain our first spin lock, we restore
> the interrupt setting to whatever we saved. If interrupts were
> already locked out (as in a filter-type interrupt handler) this is
> a potentially somewhat expensive no-op. If interrupts were
> enabled previously, this is a somewhat expensive re-enable of
> interrupts -- but that's OK, and maybe good, because we have no
> spin locks of our own yet.
> That means we can take hardware
> interrupts now, and let them borrow our current thread if they are
> that kind of interrupt, or schedule another thread to run if
> appropriate. That might even preempt us, since we do not yet hold
> any spin locks. (But it won't preempt us if we have done a
> critical_enter() before this point.)
>
> (In fact, the spinlock exit/enter calls that you see inside
> _mtx_lock_spin_cookie() wrap a loop that does not use
> compare-and-swap operations at all, but rather ordinary memory
> reads. These are cheaper than CAS operations on a lot of CPUs,
> but they may produce wrong answers when two CPUs are racing to
> write the same location; only a CAS produces a guaranteed answer,
> which might

Hmm... I agree; I mis-read your words. The loop inside
_mtx_lock_spin_cookie() might produce a wrong result, but
_mtx_lock_spin_cookie() as a whole won't, because it finally uses a
compare-and-swap instruction as the authoritative test (i.e., the
loop might produce a "false positive", so we have to check again
with the CAS).

--
yubinr

> still be "you lost the race". The inner loop you are looking at
> occurs after losing a CAS race. Once we think we might *win* a
> future CAS race, _mtx_lock_spin_cookie() calls spinlock_enter()
> again and tries the actual CAS operation, _mtx_obtain_lock_fetch(),
> with interrupts disabled. Note also the calls to cpu_spinwait()
> -- the Linux equivalent macro is cpu_relax() -- which translates
> to a "pause" instruction on amd64.)
>
> Chris