Date:      Mon, 15 Nov 2010 08:04:27 +0800
From:      David Xu <davidxu@freebsd.org>
To:        Jilles Tjoelker <jilles@stack.nl>
Cc:        src-committers@freebsd.org, svn-src-user@freebsd.org
Subject:   Re: svn commit: r214915 - user/davidxu/libthr/lib/libthr/thread
Message-ID:  <4CE0790B.3040706@freebsd.org>
In-Reply-To: <20101114181631.GA1831@stack.nl>
References:  <201011071349.oA7Dn8Po048543@svn.freebsd.org> <20101113151035.GB79975@stack.nl> <4CDF7F38.5010000@freebsd.org> <20101114181631.GA1831@stack.nl>

Jilles Tjoelker wrote:
> On Sun, Nov 14, 2010 at 02:18:32PM +0800, David Xu wrote:
>> Jilles Tjoelker wrote:
>>> On Sun, Nov 07, 2010 at 01:49:08PM +0000, David Xu wrote:
>>>> Author: davidxu
>>>> Date: Sun Nov  7 13:49:08 2010
>>>> New Revision: 214915
>>>> URL: http://svn.freebsd.org/changeset/base/214915
>>>>
>>>> Log:
>>>>   Implement robust mutexes. The pthread_mutex locking and
>>>>   unlocking code is reworked to support robust mutexes and
>>>>   other mutex types that must be locked and unlocked by the kernel.
>>> The glibc+linux implementation avoids the system call for each robust
>>> mutex lock/unlock by maintaining the list in userland and providing a
>>> pointer to the kernel. Although this is somewhat less reliable in case a
>>> process scribbles over this list, it has better performance.
>>>
>>> There are various ways this list could be maintained, but the glibc way
>>> uses an "in progress" field per thread and a linked list using a field
>>> in the pthread_mutex_t, so if we want that we should make sure we have
>>> the space in the pthread_mutex_t. Alternatively, a simple array could be
>>> used if the number of owned robust mutexes can be limited to a fairly
>>> low value.
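
For reference, the Linux-side registration that the glibc scheme relies
on is roughly the following. This is a minimal, Linux-specific sketch:
the structures are the ones from linux/futex.h, but the futex_offset
value here is only a placeholder; glibc uses the real offset of the
lock word inside its pthread_mutex_t and keeps one list head per
thread:

    #include <linux/futex.h>     /* struct robust_list, robust_list_head */
    #include <sys/syscall.h>
    #include <unistd.h>

    /*
     * The kernel only stores the pointer passed in here; the list of
     * held robust mutexes is linked through a node embedded in each
     * mutex and is walked by the kernel when the thread exits.
     */
    static struct robust_list_head rlh = {
        .list            = { &rlh.list },  /* empty circular list */
        .futex_offset    = 0,              /* placeholder: offset from the list
                                              node to the futex (lock) word */
        .list_op_pending = NULL,           /* the per-thread "in progress" field */
    };

    static void register_robust_list(void)
    {
        /* No libc wrapper exists for this system call. */
        syscall(SYS_set_robust_list, &rlh, sizeof(rlh));
    }
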
>>> Solaris robust mutexes used to work by entering the kernel for every
>>> lock/unlock, but no longer, see
>>> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6296770
>>> Someone complained about that implementation being too slow.
>> I don't like glibc's idea of reading or writing the mutex after
>> unlocking it. Why would you assume the memory is still valid, and
>> still in use as a mutex, after you have unlocked it?
>>
>> There is a use case where glibc will mysteriously fail: thread A
>> unlocks the mutex in userland, then another thread B locks it in
>> userland and reuses the memory area for another purpose. Before
>> thread A enters the kernel, thread B turns the memory into a
>> memory-mapped file buffer and writes data into it that the kernel
>> will save to the file on disk. If, before A runs, that memory
>> happens to contain thread A's thread id, then when thread A enters
>> the kernel, the kernel thinks userland still has not unlocked the
>> mutex and writes to the mutex memory to unlock it. But the memory
>> no longer belongs to the mutex, so it simply corrupts the data
>> thread B saved. This implementation is racy and dangerous.
> That seems rare but very dangerous if it triggers.
>
>> I also know that they embed a link entry in the mutex. If the mutex
>> is shared by multiple processes, I can write a specific value into
>> the link entry and corrupt the owner's linked list; even worse, I
>> can write a specific address into the link entry, and when the owner
>> unlocks the mutex and unlinks it from its list, it will happily
>> write to whatever address I specified. This may be a security
>> problem, or cause very hard-to-debug failures if the mutex memory
>> is corrupted.
> Indeed. Admittedly, shared memory is inherently not very safe, but one
> would expect the unsafety to be restricted to the shared memory segment.
>
>> I think if you want a robust mutex, there is a price for whatever
>> you do: the robust mutex is expensive but reliable. Until you can
>> prove that glibc's write-to-the-mutex-after-unlock is not a problem,
>> I would not follow their idea. Based on this research, I think
>> Solaris must have a reason not to do it that way; I would not laugh
>> at them for their robust mutex being slow.
> As I said, they changed it so it does not require a syscall for
> uncontended lock/unlock. Their implementation seems safer than what
> glibc+linux does:
>
> * The kernel knows all robust mutexes that are mapped and potentially
>   locked in a process (even if there is no thread blocked on them).
>   Mutexes are added to this list upon pthread_mutex_init() and
>   pthread_mutex_lock() and removed when the memory region containing
>   them is unmapped. When the process execs or exits, the list is walked
>   and all mutexes owned by a thread in this process do the EOWNERDEAD
>   thing. A copy of the list is maintained in userland to avoid excessive
>   system calls. When a mutex is unmapped, the kernel removes the entry
>   from the userland list as well.
>
> * The list of owned robust mutexes is a variable length array instead of
>   a linked list, so there are no pointers to thread-private memory in
>   the mutex. Furthermore, this list is only used when a thread exits, in
>   userland.
>
> This approach appears to solve your objections, except if a program
> destroys a robust mutex and starts using the memory for something else
> without unmapping it. Perhaps doing some sort of syscall in
> pthread_mutex_destroy() to remove the mutex entry for all processes that
> have it mapped can solve this.

I know Solaris implemented it this way, but I also know that because
they need to look up a userland hash table every time
pthread_mutex_lock() is called, it is still not O(1) and it adds extra
lock contention. Even worse, is the userland hash-table lock protected
against priority inversion? Is it safe for real-time threads? I think
they have implemented priority-inheritance and priority-protection
mutexes, and even their condition variables support real-time thread
scheduling. Their implementation also makes it impossible to use a
robust mutex without the thread library; I have seen a complaint about
that problem.
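
To make the objection concrete, the kind of per-lock bookkeeping such a
userland-cache scheme implies looks roughly like this (all names here
are hypothetical; this is not the Solaris or libthr interface):

    #include <pthread.h>
    #include <stdbool.h>
    #include <stddef.h>

    /*
     * Hypothetical sketch: a per-process cache remembers which robust
     * mutexes the kernel has already been told about, so every lock
     * operation pays for a lookup under an extra process-wide lock.
     */
    #define RC_SIZE 64
    static pthread_mutex_t *rc_slots[RC_SIZE];
    static pthread_mutex_t  rc_lock = PTHREAD_MUTEX_INITIALIZER;

    static void kernel_register_robust(pthread_mutex_t *m)
    {
        (void)m;    /* hypothetical syscall: record that this mapping
                       holds a robust mutex */
    }

    int robust_lock_sketch(pthread_mutex_t *m)
    {
        size_t i, free_slot = RC_SIZE;
        bool   found = false;

        pthread_mutex_lock(&rc_lock);        /* the extra contention point */
        for (i = 0; i < RC_SIZE; i++) {      /* stand-in for the hash lookup */
            if (rc_slots[i] == m)
                found = true;
            else if (rc_slots[i] == NULL && free_slot == RC_SIZE)
                free_slot = i;
        }
        if (!found && free_slot < RC_SIZE) {
            kernel_register_robust(m);
            rc_slots[free_slot] = m;
        }
        pthread_mutex_unlock(&rc_lock);

        return pthread_mutex_lock(m);        /* then the normal lock path */
    }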

> David Xu wrote:
>> I also remember that Solaris's condition variable is automatically
>> robust if your mutex is of the robust type; glibc cannot provide
>> that feature. When a process crashes, the condition variable's
>> internal lock may be held by the dead thread, leaving the condition
>> variable in an unusable state. Solaris's condition variable will
>> still be in a healthy state: if a thread finds that
>> pthread_mutex_lock returned EOWNERDEAD, it can still use the
>> condition variable without any problem. Without a robust condition
>> variable, a robust mutex is less useful.
> Indeed, but it seems more-or-less orthogonal to mostly-userland robust
> mutex. I think the current kernel condition variable is suitable here,
> but I have not thought this through completely.
Yes, pthreads really are a very complex beast. All the complexity is
hidden under a simple group of APIs, but when you try to implement it
you find it is so difficult that sometimes you want to give up.
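
For completeness, the consumer-side pattern both implementations have
to make work, and the point where a robust condition variable matters
(a waiter that gets EOWNERDEAD may still need to wait on or signal the
associated condvar), looks like this using the POSIX.1-2008 names.
This is only a sketch, not libthr's internals; older systems spell
some of these functions with an _np suffix:

    #include <errno.h>
    #include <pthread.h>

    static void robust_mutex_init(pthread_mutex_t *m)
    {
        pthread_mutexattr_t a;

        pthread_mutexattr_init(&a);
        pthread_mutexattr_setpshared(&a, PTHREAD_PROCESS_SHARED);
        pthread_mutexattr_setrobust(&a, PTHREAD_MUTEX_ROBUST);
        pthread_mutex_init(m, &a);
        pthread_mutexattr_destroy(&a);
    }

    static void robust_mutex_lock(pthread_mutex_t *m)
    {
        if (pthread_mutex_lock(m) == EOWNERDEAD) {
            /*
             * The previous owner died while holding the lock: repair
             * the protected state here, then mark the mutex usable
             * again for future lockers.
             */
            pthread_mutex_consistent(m);
        }
    }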
