Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 15 Sep 2010 08:46:00 -0700
From:      Matthew Fleming <mdf356@gmail.com>
To:        Andrey Simonenko <simon@comsys.ntu-kpi.kiev.ua>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: Questions about mutex implementation in kern/kern_mutex.c
Message-ID:  <AANLkTimJV-oB_uTSbUTtbSrR5fXgWGk00dEV7L-Gobrf@mail.gmail.com>
In-Reply-To: <20100915134415.GA23727@pm513-1.comsys.ntu-kpi.kiev.ua>
References:  <20100915134415.GA23727@pm513-1.comsys.ntu-kpi.kiev.ua>

next in thread | previous in thread | raw e-mail | index | archive | help
I'll take a stab at answering these...

On Wed, Sep 15, 2010 at 6:44 AM, Andrey Simonenko
<simon@comsys.ntu-kpi.kiev.ua> wrote:
> Hello,
>
> I have questions about mutex implementation in kern/kern_mutex.c
> and sys/mutex.h files (current versions of these files):
>
> 1. Is the following statement correct for a volatile pointer or integer
> =A0 variable: if a volatile variable is updated by the compare-and-set
> =A0 instruction (e.g. atomic_cmpset_ptr(&val, ...)), then the current
> =A0 value of such variable can be read without any special instruction
> =A0 (e.g. v =3D val)?
>
> =A0 I checked Assembler code for a function with "v =3D val" and "val =3D=
 v"
> =A0 like statements generated for volatile variable and simple variable
> =A0 and found differences: on ia64 "v =3D val" was implemented by ld.acq =
and
> =A0 "val =3D v" was implemented by st.rel; on mips and sparc64 Assembler =
code
> =A0 can have different order of lines for volatile and simple variable
> =A0 (depends on the code of a function).

I think this depends somewhat on the hardware and what you mean by
"current" value.

If you want a value that is not in-flux, then something like
atomic_cmpset_ptr() setting to the current value is needed, so that
you force any other atomic_cmpset to fail.  However, since there is no
explicit lock involved, there is no strong meaning for "current" value
and a read that does not rely on a value cached in a register is
likely sufficient.  While the "volatile" keyword in C has no explicit
hardware meaning, it often means that a load from memory (or,
presumably, L1-L3 cache) is required.


> 2. Let there is a default (sleep) mutex and adaptive mutexes is enabled.
> =A0 A thread tries to obtain lock quickly and fails, _mtx_lock_sleep()
> =A0 is called, it gets the address of the current mutex's owner thread
> =A0 and checks whether that owner thread is running (on another CPU).
> =A0 How does _mtx_lock_sleep() know that that thread still exists
> =A0 (lines 311-337 in kern_mutex.c)?
>
> =A0 When adaptive mutexes was implemented there was explicit locking
> =A0 around adaptive mutexes code. =A0When turnstile in mutex code was
> =A0 implemented that's locking logic was changed.

It appears that it's possible for the thread pointer to be recycled
between fetching the value of owner and looking at TD_IS_RUNNING.  On
actual hardware, this race is unlikely to occur due to the time it
takes for a thread to release a lock and perform all of thread exit
code before the struct thread is returned to the uma zone.  However,
even once returned to the uma zone on many FreeBSD implementations the
access is safe as the address of the thread is still dereferenceable,
due to the implementation of uma zones.

On e.g. AIX this issue was different because the address range for
threads was determined at compile time (one giant table) and the array
only grew, never shrank, so the thread pointer was always valid and
would be recycled at first opportunity.

It appears to me, from a strict correctness standpoint, that the use
of uma_zalloc/uma_zfree for thread objects is not safe.  But from a
practical implementation POV, the unsafe access in kern_mutex.c will
not cause trouble in the absence of a hypervisor controlling when
virtual CPUs get runtime.

> 3. Why there is no any memory barrier in mtx_init()? =A0If another thread
> =A0 (on another CPU) finds that mutex is initialized using mtx_initialize=
d()
> =A0 then it can mtx_lock() it and mtx_lock() it second time, as a result
> =A0 mtx_recurse field will be increased, but its value still can be
> =A0 uninitialized on architecture with relaxed memory ordering model.

It seems to me that it's generally a programming error to rely on the
return of mtx_initialized(), as there is no serialization with e.g. a
thread calling mtx_destroy().  A fully correct serialization model
would require that a single thread initialize the mtx and then create
any worker threads that will use the mtx.

Cheers,
matthew



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?AANLkTimJV-oB_uTSbUTtbSrR5fXgWGk00dEV7L-Gobrf>