From owner-svn-src-user@FreeBSD.ORG Mon Nov 15 00:04:48 2010
Message-ID: <4CE0790B.3040706@freebsd.org>
Date: Mon, 15 Nov 2010 08:04:27 +0800
From: David Xu <davidxu@freebsd.org>
To: Jilles Tjoelker
Cc: src-committers@freebsd.org, svn-src-user@freebsd.org
In-Reply-To: <20101114181631.GA1831@stack.nl>
Subject: Re: svn commit: r214915 - user/davidxu/libthr/lib/libthr/thread

Jilles Tjoelker wrote:
> On Sun, Nov 14, 2010 at 02:18:32PM +0800, David Xu wrote:
>> Jilles Tjoelker wrote:
>>> On Sun, Nov 07, 2010 at 01:49:08PM +0000, David Xu wrote:
>>>> Author: davidxu
>>>> Date: Sun Nov 7 13:49:08 2010
>>>> New Revision: 214915
>>>> URL: http://svn.freebsd.org/changeset/base/214915
>>>>
>>>> Log:
>>>>   Implement robust mutexes. The pthread_mutex locking and
>>>>   unlocking code is reworked to support robust mutexes and
>>>>   other mutexes that must be locked and unlocked by the kernel.
>>>
>>> The glibc+linux implementation avoids the system call for each robust
>>> mutex lock/unlock by maintaining the list in userland and providing a
>>> pointer to the kernel. Although this is somewhat less reliable in case
>>> a process scribbles over this list, it has better performance.
>>>
>>> There are various ways this list could be maintained, but the glibc
>>> way uses an "in progress" field per thread and a linked list using a
>>> field in the pthread_mutex_t, so if we want that we should make sure
>>> we have the space in the pthread_mutex_t. Alternatively, a simple
>>> array could be used if the number of owned robust mutexes can be
>>> limited to a fairly low value.
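To make the scheme concrete, the userland list described above works
roughly like the sketch below. This is only an illustration of the
idea; the names and layout are invented here and do not match glibc's
real structures.

#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/*
 * Rough sketch of a glibc-style userland robust list.  Invented
 * names; not glibc's actual layout.
 */
struct rmutex {
        atomic_uint owner;          /* owner thread id, 0 when unlocked */
        struct rmutex *next;        /* link in the owner's robust list */
};

struct thread {
        unsigned int tid;
        struct rmutex *robust_head; /* registered with the kernel once */
        struct rmutex *in_progress; /* the per-thread "in progress" field */
};

static int
rmutex_lock(struct thread *t, struct rmutex *m)
{
        unsigned int unowned = 0;

        /* Tell the kernel which lock is in flight, in case we die now. */
        t->in_progress = m;
        if (!atomic_compare_exchange_strong(&m->owner, &unowned, t->tid)) {
                t->in_progress = NULL;
                return (EBUSY); /* simplified; really sleep in the kernel */
        }
        /* Publish ownership on the list the kernel walks at thread exit. */
        m->next = t->robust_head;
        t->robust_head = m;
        t->in_progress = NULL;
        return (0);
}

Unlock does the reverse: unlink the mutex from the list, then clear the
owner word in userland. When a thread dies, the kernel walks
robust_head (and checks in_progress) and marks each mutex it finds
EOWNERDEAD; that walk is exactly the after-the-fact access to mutex
memory being objected to below.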
>>>
>>> Solaris robust mutexes used to work by entering the kernel for every
>>> lock/unlock, but no longer, see
>>> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6296770
>>> Someone complained about that implementation being too slow.
>>
>> I don't like glibc's idea of reading and writing the mutex memory
>> after having unlocked it. Why assume the memory is still valid and
>> still in use as a mutex after you have unlocked it?
>>
>> There is a use-case where glibc will mysteriously fail: thread A
>> unlocks the mutex in userland, then thread B locks it in userland
>> and reuses the memory area for another purpose. Before thread A
>> enters the kernel, thread B uses the area as a memory-mapped file
>> buffer and writes some data into it, data which the kernel will save
>> to the disk file. If, before A runs again, the memory happens to
>> contain thread A's thread id, then thread A enters the kernel,
>> thinks userland still has not unlocked the mutex, and writes some
>> data into the "mutex". It believes it has unlocked it, but the
>> memory is no longer a mutex, and it has simply corrupted the data
>> thread B saved. This implementation is racy and dangerous.
>
> That seems rare but very dangerous if it triggers.
>
>> I also know that they have a link entry embedded in the mutex. If
>> the mutex is shared by multiple processes, I can write a specific
>> value into the link entry and corrupt the owner's linked list; even
>> worse, I can write a specific address into the link entry, and when
>> the owner unlocks the mutex and unlinks it from its list, it will be
>> happy to write to any address I specified. This may be a security
>> problem, or cause a very hard-to-debug problem if the mutex memory
>> is corrupted.
>
> Indeed. Admittedly, shared memory is inherently not very safe, but
> one would expect the unsafety to be restricted to the shared memory
> segment.
>
>> I think if you want a robust mutex, then for whatever you do there
>> is a price: the robust mutex is expensive but reliable. Until you
>> can prove that glibc's write-mutex-memory-after-unlock is not a
>> problem, I would not follow their idea. Based on this research, I
>> think Solaris must have a reason not to do it that way; I would not
>> laugh at them for their robust mutex being slow.
>
> As I said, they changed it so it does not require a syscall for
> uncontended lock/unlock. Their implementation seems safer than what
> glibc+linux does:
>
> * The kernel knows all robust mutexes that are mapped and potentially
>   locked in a process (even if there is no thread blocked on them).
>   Mutexes are added to this list upon pthread_mutex_init() and
>   pthread_mutex_lock() and removed when the memory region containing
>   them is unmapped. When the process execs or exits, the list is
>   walked and all mutexes owned by a thread in this process do the
>   EOWNERDEAD thing. A copy of the list is maintained in userland to
>   avoid excessive system calls. When a mutex is unmapped, the kernel
>   removes the entry from the userland list as well.
>
> * The list of owned robust mutexes is a variable length array instead
>   of a linked list, so there are no pointers to thread-private memory
>   in the mutex. Furthermore, this list is only used when a thread
>   exits, in userland.
>
> This approach appears to solve your objections, except if a program
> destroys a robust mutex and starts using the memory for something
> else without unmapping it. Perhaps doing some sort of syscall in
> pthread_mutex_destroy() to remove the mutex entry for all processes
> that have it mapped can solve this.

I know Solaris is implemented this way, but I also know that, because
they need to look up a hash table in userland every time
pthread_mutex_lock() is called, it is still not O(1) and has extra
lock contention. Even worse, does the userland hash table's lock
protect against priority inversion? Is it safe for real-time threads?
The cost is sketched below.
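As a rough sketch of that fast path with a per-process registry (all
names are invented here; this is not Solaris code), note the table
lock that is taken on every single lock operation:

#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

struct rmutex {
        atomic_uint owner;      /* owner thread id, 0 when unlocked */
};

/* Userland copy of the registered-mutex table (here a flat array). */
static struct rmutex *registry[128];
static int registry_len;
static pthread_mutex_t registry_lock = PTHREAD_MUTEX_INITIALIZER;

static bool
registry_add_once(struct rmutex *m)
{
        int i;

        /* Taken on every rmutex_lock() call below: the extra lookup
           and contention in question.  Table-full handling omitted. */
        pthread_mutex_lock(&registry_lock);
        for (i = 0; i < registry_len; i++)
                if (registry[i] == m) {
                        pthread_mutex_unlock(&registry_lock);
                        return (false); /* already known to the kernel */
                }
        if (registry_len < (int)(sizeof(registry) / sizeof(registry[0])))
                registry[registry_len++] = m;
        pthread_mutex_unlock(&registry_lock);
        return (true);
}

static int
rmutex_lock(struct rmutex *m, unsigned int tid)
{
        unsigned int unowned = 0;

        if (registry_add_once(m)) {
                /* Hypothetical syscall: ask the kernel to track this
                   mutex so it can do the EOWNERDEAD walk at exit. */
        }
        if (atomic_compare_exchange_strong(&m->owner, &unowned, tid))
                return (0);     /* uncontended: no system call at all */
        return (EBUSY);         /* simplified; really sleep in the kernel */
}

The syscall is off the fast path, but the registry lookup and its lock
stay on it, which is the contention and priority-inversion question
raised above.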
I think they have implemented priority-inherit and priority-protect
mutexes, and even their condition variables support real-time thread
scheduling. Their implementation also makes it impossible to use a
robust mutex without the thread library; I saw a complaint about
exactly that problem.

> David Xu wrote:
>> I also remember that Solaris's condition variables are automatically
>> robust if your mutex is of the robust type. glibc cannot provide
>> that feature: when a process crashes, the condition variable's
>> internal lock may be held by the dead thread, and the condition
>> variable is then in an unusable state. Solaris's condition variable
>> will still be in a healthy state; if a thread finds that
>> pthread_mutex_lock returned EOWNERDEAD, it can still use the
>> condition variable without any problem. Without robust condition
>> variables, a robust mutex is less useful.
>
> Indeed, but it seems more-or-less orthogonal to a mostly-userland
> robust mutex. I think the current kernel condition variable is
> suitable here, but I have not thought this through completely.

Yes, pthreads is really a very complex beast. All the complexity is
hidden under a simple group of APIs, but when you try to implement it,
you will find that it is so difficult that sometimes you want to give
up.
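For reference, this is what the API under discussion looks like from
the application side, using the POSIX.1-2008 names; a minimal sketch,
independent of any particular libthr implementation:

#include <errno.h>
#include <pthread.h>

int
make_robust_mutex(pthread_mutex_t *m)
{
        pthread_mutexattr_t attr;

        pthread_mutexattr_init(&attr);
        /* Shared between processes, so robustness actually matters. */
        pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
        return (pthread_mutex_init(m, &attr));
}

int
locked_update(pthread_mutex_t *m)
{
        int error;

        error = pthread_mutex_lock(m);
        if (error == EOWNERDEAD) {
                /* The previous owner died holding the lock; the data
                   it protects may be inconsistent.  Repair it, then
                   mark the mutex usable again. */
                /* ... repair shared data here ... */
                pthread_mutex_consistent(m);
        } else if (error != 0)
                return (error); /* e.g. ENOTRECOVERABLE */

        /* ... normal critical section ... */

        return (pthread_mutex_unlock(m));
}

The point of the EOWNERDEAD protocol is that the application, not the
library, knows how to repair the shared data; everything debated above
is only about how the kernel and the thread library learn that the
owner died, without making every pthread_mutex_lock() slow.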