Date:      Tue, 24 Apr 2012 17:03:48 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Martin Simmons <martin@lispworks.com>
Cc:        freebsd-threads@freebsd.org, jack.ren@intel.com
Subject:   Re: About the memory barrier in BSD libc
Message-ID:  <20120424140348.GY2358@deviant.kiev.zoral.com.ua>
In-Reply-To: <201204241343.q3ODhe2C032683@higson.cam.lispworks.com>
References:  <20120423084120.GD76983@zxy.spb.ru> <CAPHpMu=kCwhf1RV_sYBDWDPL8368YTMLXge4L_g_F4AkTX1H5g@mail.gmail.com> <20120423094043.GS32749@zxy.spb.ru> <CAPHpMukLUeetSKpH2oiKJQ3ML_PFHEi6a0hK3_Ery=LX1YEd3g@mail.gmail.com> <20120423113838.GT32749@zxy.spb.ru> <CAPHpMumWu_aaZ4Sj5Athro6441Y%2B3_phbD2jxkKE-CdBf-Fd8g@mail.gmail.com> <20120423120720.GS2358@deviant.kiev.zoral.com.ua> <CAPHpMumh3YpB3RDD-7g5tU6thiuNA6HTuVxmt-9_OzUiEdEXzA@mail.gmail.com> <20120423130343.GT2358@deviant.kiev.zoral.com.ua> <201204241343.q3ODhe2C032683@higson.cam.lispworks.com>

On Tue, Apr 24, 2012 at 02:43:40PM +0100, Martin Simmons wrote:
> >>>>> On Mon, 23 Apr 2012 16:03:43 +0300, Konstantin Belousov said:
> >
> > On Mon, Apr 23, 2012 at 08:33:05PM +0800, Fengwei yin wrote:
> > > On Mon, Apr 23, 2012 at 8:07 PM, Konstantin Belousov
> > > <kostikbel@gmail.com> wrote:
> > > > On Mon, Apr 23, 2012 at 07:44:34PM +0800, Fengwei yin wrote:
> > > >> On Mon, Apr 23, 2012 at 7:38 PM, Slawa Olhovchenkov <slw@zxy.spb.ru> wrote:
> > > >> > On Mon, Apr 23, 2012 at 07:26:54PM +0800, Fengwei yin wrote:
> > > >> >
> > > >> >> On Mon, Apr 23, 2012 at 5:40 PM, Slawa Olhovchenkov <slw@zxy.spb.ru> wrote:
> > > >> >> > On Mon, Apr 23, 2012 at 05:32:24PM +0800, Fengwei yin wrote:
> > > >> >> >
> > > >> >> >> On Mon, Apr 23, 2012 at 4:41 PM, Slawa Olhovchenkov <slw@zxy.spb.ru> wrote:
> > > >> >> >> > On Mon, Apr 23, 2012 at 02:56:03PM +0800, Fengwei yin wrote:
> > > >> >> >> >
> > > >> >> >> >> Hi list,
> > > >> >> >> >> If this is not the right question for this list, please let me know, and
> > > >> >> >> >> sorry for the noise.
> > > >> >> >> >>
> > > >> >> >> >> I have a question regarding the BSD libc on SMP architectures. I didn't see
> > > >> >> >> >> any memory barriers used in libc.
> > > >> >> >> >> How can we make sure it is safe on SMP architectures?
> > > >> >> >> >
> > > >> >> >> > /usr/include/machine/atomic.h:
> > > >> >> >> >
> > > >> >> >> > #define mb()    __asm __volatile("lock; addl $0,(%%esp)" : : : "memory")
> > > >> >> >> > #define wmb()   __asm __volatile("lock; addl $0,(%%esp)" : : : "memory")
> > > >> >> >> > #define rmb()   __asm __volatile("lock; addl $0,(%%esp)" : : : "memory")
> > > >> >> >> >
> > > >> >> >>
> > > >> >> >> Thanks for the information. But it looks like nobody uses it in libc.
> > > >> >> >
> > > >> >> > I think nothing in libc needs a memory barrier: libc doesn't work with
> > > >> >> > peripherals, and atomic operations use different macros.
> > > >> >>
> > > >> >> If we check the usage of __sinit(), it is a typical singleton pattern, which
> > > >> >> needs a memory barrier to make sure there is no potential SMP issue.
> > > >> >>
> > > >> >> Or did I miss something here?
> > > >> >
> > > >> > Which architecture with cache incoherency does FreeBSD support?
> > > >>
> > > >> I suppose it's not related to cache incoherency (I could be wrong).
> > > >> It's related to the reordering of instructions by the CPU.
> > > >>
> > > >> Here is a link explaining why a singleton needs a memory barrier:
> > > >> http://www.oaklib.org/docs/oak/singleton.html
> > > >>
> > > >> x86 has a strict memory model and may not suffer from this kind of issue. But
> > > >> ARM needs to take care of it, IMHO.
> > > >
> > > > Please note that __sinit is idempotent, so double-initialization is not
> > > > an issue there. The only possible problematic case would be another thread
> > > > executing exit and not noticing the non-NULL value of __cleanup while the
> > > > current thread has just set it.
> > > >
> > > > I am not sure how real this race is. Each call to __sinit() is immediately
> > > > followed by a lock acquire, typically FLOCKFILE(), which enforces full barrier
> > > > semantics due to the pthread_mutex_lock call. exit() performs the
> > > > __cxa_finalize() call before checking the __cleanup value, and
> > > > __cxa_finalize() itself locks atexit_mutex. So the race is tiny and probably
> > > > possible only for somewhat buggy applications which perform exit() while
> > > > there are stdio operations in progress.
> > > >
> > > > Also note that some functions assign to __cleanup unconditionally.
> > > >
> > > > Do you see any real issue due to non-synchronized access to __cleanup ?
> > >
> > > No. I didn't see a real issue. I am just reviewing the code.
> > >
> > > If you don't think __sinit has an issue, let's check other code:
> > >      line 68 in libc/stdio/fclose.c
> > >      line 133 in libc/stdio/findfp.c (function __sfp())
> > >
> > > That code frees a fp slot by assigning 0 to fp->_flags. But if the
> > > instructions could be reordered, another CPU could see fp->_flags set to 0
> > > before the cleanup in lines 57 to 67.
> > >
> > > Say another CPU is at line 133 of __sfp(): it could see fp->_flags become
> > > 0 before it is aware that the cleanup (lines 57 to 67 in
> > > libc/stdio/fclose.c) has happened.
> > >
> > > Note: the mutex in FUNLOCKFILE(fp) at line 69 of libc/stdio/fclose.c
> > > can only make sure that line 70 happens after line 68. It has no effect on
> > > the CPU reordering lines 57 to 68.
> >
> > Yes, FUNLOCKFILE() there would have no effect on the potential CPU reordering
> > of the writes.  But does the order of these writes matter at all ?
> >
> > Please note that __sfp() reinitializes all fields written by fclose().
> > Only if the CPU executing fclose() is allowed to reorder operations so that
> > the external effect of the _flags = 0 assignment can be observed before that
> > CPU executes the other operations from fclose() could there be a problem.
> >
> > This is definitely impossible on Intel, and I do not know enough about
> > other architectures to reject such a possibility. The _flags member
> > is a short, so atomics cannot be used there. The easier solution, if this
> > is indeed an issue, is to lock thread_lock around the _flags = 0 assignment
> > in fclose().
>
> This can be a problem, even on Intel, because the compiler can reorder the
> stores.  E.g. if I compile the following with gcc -O4 on amd64:
>
> struct foo { int x, y; };
>
> int foo(struct foo *p)
> {
>   int x = bar();
>   p->y = baz();
>   p->x = x;
> }
>
> then I get the following assembly language, which sets p->x before p->y:
>
> 	movq	%rdi, %rbx
> 	call	bar
> 	movl	%eax, %ebp
> 	xorl	%eax, %eax
> 	call	baz
> 	movl	%ebp, (%rbx)
> 	movl	%eax, 4(%rbx)
>
> __Martin
Ok, as I already said, I think that the reordering is safe there.

Anyway, the change below should remove all concerns.

diff --git a/lib/libc/stdio/fclose.c b/lib/libc/stdio/fclose.c
index f0629e8..383040e 100644
--- a/lib/libc/stdio/fclose.c
+++ b/lib/libc/stdio/fclose.c
@@ -41,9 +41,12 @@ __FBSDID("$FreeBSD$");
 #include <stdio.h>
 #include <stdlib.h>
 #include "un-namespace.h"
+#include <spinlock.h>
 #include "libc_private.h"
 #include "local.h"
 
+extern spinlock_t __stdio_thread_lock;
+
 int
 fclose(FILE *fp)
 {
@@ -65,7 +68,11 @@ fclose(FILE *fp)
 		FREELB(fp);
 	fp->_file = -1;
 	fp->_r = fp->_w = 0;	/* Mess up if reaccessed. */
+	if (__isthreaded)
+		_SPINLOCK(&__stdio_thread_lock);
 	fp->_flags = 0;		/* Release this FILE for reuse. */
+	if (__isthreaded)
+		_SPINUNLOCK(&__stdio_thread_lock);
 	FUNLOCKFILE(fp);
 	return (r);
 }
diff --git a/lib/libc/stdio/findfp.c b/lib/libc/stdio/findfp.c
index 89c0536..bcd6f62 100644
--- a/lib/libc/stdio/findfp.c
+++ b/lib/libc/stdio/findfp.c
@@ -82,9 +82,9 @@ static struct glue *lastglue = &uglue;
 
 static struct glue *	moreglue(int);
 
-static spinlock_t thread_lock = _SPINLOCK_INITIALIZER;
-#define THREAD_LOCK()	if (__isthreaded) _SPINLOCK(&thread_lock)
-#define THREAD_UNLOCK()	if (__isthreaded) _SPINUNLOCK(&thread_lock)
+spinlock_t __stdio_thread_lock = _SPINLOCK_INITIALIZER;
+#define THREAD_LOCK()	if (__isthreaded) _SPINLOCK(&__stdio_thread_lock)
+#define THREAD_UNLOCK()	if (__isthreaded) _SPINUNLOCK(&__stdio_thread_lock)
 
 #if NOT_YET
 #define	SET_GLUE_PTR(ptr, val)	atomic_set_rel_ptr(&(ptr), (uintptr_t)(val))
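
For comparison only: the patch above relies on the spinlock so that a thread
which finds _flags == 0 under the lock also sees the earlier cleanup stores.
The same ordering could perhaps be expressed with a release store and an
acquire read. The following is a minimal sketch under stated assumptions, not
what the patch does. It assumes a C11 compiler with <stdatomic.h> (it does not
use FreeBSD's machine/atomic.h), and struct slot, slot_release() and
slot_try_claim() are hypothetical names standing in for FILE, the tail of
fclose() and the free-slot search in __sfp().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the FILE fields that fclose() resets. */
struct slot {
	int		file;	/* plays the role of fp->_file */
	int		r, w;	/* play the role of fp->_r and fp->_w */
	_Atomic short	flags;	/* 0 means "free for reuse" */
};

/*
 * Release a slot.  The plain cleanup stores come first; the release
 * store to flags keeps both the compiler and the CPU from making
 * flags == 0 visible before them.
 */
static void
slot_release(struct slot *s)
{
	assert(s != NULL);
	s->file = -1;
	s->r = s->w = 0;
	atomic_store_explicit(&s->flags, 0, memory_order_release);
}

/*
 * Try to claim a free slot.  The compare-and-swap succeeds only if
 * flags was 0; its acquire ordering pairs with the release store in
 * slot_release().  Returns true if the slot was claimed.
 */
static bool
slot_try_claim(struct slot *s)
{
	short expected = 0;

	assert(s != NULL);
	return (atomic_compare_exchange_strong_explicit(&s->flags, &expected,
	    (short)1, memory_order_acquire, memory_order_relaxed));
}
```

In this sketch, a thread whose slot_try_claim() succeeds after another
thread's slot_release() also observes file == -1 and r == w == 0 from that
release. It also addresses the compiler reordering shown in Martin's example,
since a release store may not be moved ahead of the stores that precede it.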



