From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 198014] Signals can lead to an inconsistency in PI mutex ownership
Date: Tue, 24 Feb 2015 21:20:00 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=198014

            Bug ID: 198014
           Summary: Signals can lead to an inconsistency in PI mutex ownership
           Product: Base System
           Version: 11.0-CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: eric@vangyzen.net

Signals can lead to an inconsistency in PI mutex ownership.

I have two test cases to reproduce this. I hope to provide them soon.
For now, here is the description.

Consider three threads--Trun, Tsleep, and Tsig--all contending for one
pthread mutex.

Trun owns the mutex and is running in userspace.

Tsleep wants the mutex and calls pthread_mutex_lock...do_lock_pi. Near
the top of do_lock_pi, Tsleep allocates a umtx_pi object. This object
will exist as long as at least one thread is in do_lock_pi for this
mutex. Since Trun owns the mutex, Tsleep sets UMUTEX_CONTESTED and calls
umtxq_sleep_pi. Therein, Tsleep adds itself to the queue of waiters
(umtxq_insert) and assigns ownership of the umtx_pi to Trun
(umtx_pi_setowner). It then sleeps in umtxq_sleep.

Tsig wants the mutex and does the same as Tsleep, with a few
differences. Tsig does not allocate a new umtx_pi; instead, it finds the
existing umtx_pi and increments its reference count. Tsig becomes the
second thread in the queue of waiters. Tsig does not set ownership of
the umtx_pi, since that's already done. Tsig then sleeps in umtxq_sleep.

Trun calls pthread_mutex_unlock...do_unlock_pi. Therein, umtxq_count_pi
indicates that Tsleep is the first thread on the queue of waiters. Trun
disowns the umtx_pi, removes Tsleep from the queue of waiters, and makes
it runnable. However, Tsleep does not run immediately, for whatever
reason. Perhaps all CPUs are busy. Perhaps CPU sets, priorities, and
scheduling policy allow Trun to keep running while Tsleep sits on the
run queue.

Trun calls pthread_mutex_lock...do_lock_pi again.
It acquires the mutex, claims ownership of the umtx_pi (umtx_pi_claim),
and returns to userland.

A thread sends a signal to Tsig. Tsig returns from umtxq_sleep, removes
itself from the queue of waiters, and ultimately returns from
do_lock_pi. The queue of waiters is now empty.

Trun calls pthread_mutex_unlock...do_unlock_pi. Unlike last time,
umtxq_count_pi says the queue is empty, so Trun does not disown the
umtx_pi. (Recall that the umtx_pi remains in existence due to the
reference by Tsleep.) Trun sets the mutex to UMUTEX_UNOWNED and returns.

Now, the mutex and umtx_pi disagree on the ownership of the mutex. From
here, there are several possible paths to failure. For completeness,
let's follow through with one.

Any thread--Tany--locks the mutex. Any other thread--Tother--tries to
lock it, sets the contested bit, adds itself to the queue, and sleeps.
Tany unlocks the mutex; since it's contested, Tany calls do_unlock_pi.
Since Tother is in the queue, uq_first is non-NULL. Recall that Trun
still owns the umtx_pi, so pi->pi_owner != curthread, so do_unlock_pi
returns EPERM and leaves the umutex owned by Tany. Before calling
do_unlock_pi, Tany had already disowned the pthread_mutex. The error
from _thr_umutex_unlock2 has no effect. So, nobody owns the
pthread_mutex, Tany owns the umutex, and Trun owns the umtx_pi.

Prior to r277970, this broken ownership could have caused a panic. Now,
it just causes operations on this mutex to fail, or possibly causes a
deadlock among the contending user threads.

To solve this problem, do_unlock_pi should disown the umtx_pi even if
the queue of waiters is empty.
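
The sequence above can be replayed with a small state model. This is
only a sketch under stated assumptions: it is a Python toy, not the
kernel code, and the names PiMutexModel, replay, and disown_when_empty
are hypothetical. It tracks just the umutex owner, the contested bit,
the umtx_pi owner and reference count, and the queue of waiters, and it
treats scheduling as the fixed ordering given in the description.

```python
class PiMutexModel:
    """Toy model of the state described in the report (not kernel code)."""

    def __init__(self, disown_when_empty):
        # Assumption: True models the proposed fix, False the reported behavior.
        self.disown_when_empty = disown_when_empty
        self.umutex_owner = None   # owner recorded in the userland umutex
        self.contested = False     # stands in for UMUTEX_CONTESTED
        self.pi_owner = None       # owner recorded in the umtx_pi
        self.pi_refs = 0           # threads currently inside do_lock_pi
        self.waiters = []          # queue of sleeping waiters

    def lock(self, thread):
        """Acquire an unowned mutex."""
        if self.umutex_owner is not None:
            raise RuntimeError("model only locks an unowned mutex")
        self.umutex_owner = thread
        # Contested locks go through do_lock_pi; if a umtx_pi still
        # exists, the new owner claims it (umtx_pi_claim).
        if self.contested and self.pi_refs > 0:
            self.pi_owner = thread

    def wait(self, thread):
        """Block on an owned mutex (the umtxq_sleep_pi path)."""
        self.pi_refs += 1
        self.contested = True
        self.waiters.append(thread)
        if self.pi_owner is None:
            self.pi_owner = self.umutex_owner

    def interrupt(self, thread):
        """A signal wakes a waiter, which then leaves do_lock_pi."""
        self.waiters.remove(thread)
        self.pi_refs -= 1

    def unlock(self, thread):
        """Model of do_unlock_pi; returns "ok" or "EPERM"."""
        if self.umutex_owner != thread:
            return "EPERM"
        if self.waiters:
            if self.pi_owner != thread:
                return "EPERM"     # umutex stays owned by the caller
            self.pi_owner = None
            # First waiter becomes runnable but is still in do_lock_pi,
            # so pi_refs is unchanged.
            self.waiters.pop(0)
        elif self.disown_when_empty and self.pi_owner == thread:
            self.pi_owner = None   # the proposed fix
        self.umutex_owner = None
        self.contested = bool(self.waiters)
        return "ok"


def replay(disown_when_empty):
    """Replay the scenario; return (last unlock result, umutex owner, pi owner)."""
    m = PiMutexModel(disown_when_empty)
    m.lock("Trun")
    m.wait("Tsleep")
    m.wait("Tsig")
    m.unlock("Trun")       # wakes Tsleep, which does not run yet
    m.lock("Trun")         # Trun reacquires and claims the umtx_pi
    m.interrupt("Tsig")    # queue of waiters is now empty
    m.unlock("Trun")       # reported behavior: umtx_pi stays owned by Trun
    m.lock("Tany")
    m.wait("Tother")
    result = m.unlock("Tany")
    return result, m.umutex_owner, m.pi_owner
```

In this model, replay(False) ends with ("EPERM", "Tany", "Trun"): the
final unlock fails, Tany still owns the umutex, and Trun still owns the
umtx_pi. replay(True) ends with ("ok", None, None), which is consistent
with the suggestion that do_unlock_pi disown the umtx_pi even when the
queue of waiters is empty.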