From owner-freebsd-hackers@FreeBSD.ORG Fri Mar 14 18:31:28 2014 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 5C6EB96E; Fri, 14 Mar 2014 18:31:28 +0000 (UTC) Received: from mailuogwdur.emc.com (mailuogwdur.emc.com [128.221.224.79]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 07F2BE28; Fri, 14 Mar 2014 18:31:27 +0000 (UTC) Received: from maildlpprd55.lss.emc.com (maildlpprd55.lss.emc.com [10.106.48.159]) by mailuogwprd53.lss.emc.com (Sentrion-MTA-4.3.0/Sentrion-MTA-4.3.0) with ESMTP id s2EIVGSh002358 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 14 Mar 2014 14:31:19 -0400 X-DKIM: OpenDKIM Filter v2.4.3 mailuogwprd53.lss.emc.com s2EIVGSh002358 DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=isilon.com; s=jan2013; t=1394821879; bh=ZTNHAldhyy4iZr4n9k/7b7R15m0=; h=From:To:CC:Subject:Date:Message-ID:Content-Type: Content-Transfer-Encoding:MIME-Version; b=dFlYr7eXGeIg6mbZ6RttVdU/biRIHJ2ImxJvO/1rCH6AGB8eXEYJ4+cidQiqnA6bk my5lRtSR14v68d2QRYWN7xF22XNkbxV52n5acae47O7lY2nNlAaTUNhbbYqvsnMpmz so1CHAIYTOmuy0AO4P/54fvM84oK/WLTBH47N0j8= X-DKIM: OpenDKIM Filter v2.4.3 mailuogwprd53.lss.emc.com s2EIVGSh002358 Received: from mailusrhubprd01.lss.emc.com (mailusrhubprd01.lss.emc.com [10.253.24.19]) by maildlpprd55.lss.emc.com (RSA Interceptor); Fri, 14 Mar 2014 14:31:09 -0400 Received: from mxhub25.corp.emc.com (mxhub25.corp.emc.com [10.254.110.181]) by mailusrhubprd01.lss.emc.com (Sentrion-MTA-4.3.0/Sentrion-MTA-4.3.0) with ESMTP id s2EIV9NT024383 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=FAIL); Fri, 14 Mar 2014 14:31:09 -0400 Received: from MXHUB104.corp.emc.com (10.253.58.16) by mxhub25.corp.emc.com (10.254.110.181) with Microsoft SMTP Server (TLS) id 8.3.327.1; Fri, 14 Mar 2014 14:31:08 -0400 Received: from MX103CL02.corp.emc.com ([169.254.6.182]) by MXHUB104.corp.emc.com ([::1]) with mapi id 14.03.0158.001; Fri, 14 Mar 2014 14:31:08 -0400 From: "Meyer, Conrad" To: Bryan Drewery Subject: [PATCH] amd64/pcpu.h: Use Clang builtins for clarity when referencing thread's pcpu Thread-Topic: [PATCH] amd64/pcpu.h: Use Clang builtins for clarity when referencing thread's pcpu Thread-Index: AQHPP7ORFQWDnOxBbUqa6FTdfgehbA== Date: Fri, 14 Mar 2014 18:31:08 +0000 Message-ID: <1394821826-19412-1-git-send-email-conrad.meyer@isilon.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [137.69.152.119] Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Sentrion-Hostname: mailusrhubprd01.lss.emc.com X-RSA-Classifications: public Cc: "freebsd-hackers@freebsd.org" , "Meyer, Conrad" X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Mar 2014 18:31:28 -0000 We can efficiently reference thread-local pcpu members via the %gs register with Clang-annotated C code, in place of inline GNU assembly. Motivations: - Use C in leiu of inline assembly for clarity - Clang's static analyser may be better able to understand PCPU_* macros using the C constructs rather than inline assembly (unverified) Sponsored by: EMC/Isilon storage division Signed-off-by: Conrad Meyer Reviewed-by: Max Laier --- This is more of a "what do you think?" than a pull request. It seems like u= sing annotated C instead of asm is nice (in particular, Clang detects casts from pointers typed with one segment to another, or unsegmented type). On the ot= her hand, this is code that doesn't change frequently, and we may still need to support GCC for some time. So adding a second, parallel implementation just doubles room for bugs. Open questions: - How long is GCC intended to be supported as a compiler? - How atomic does PCPU_INC() need to be? It looks like it updates cpu-loc= al counters; as long as it's a single asm instruction, should it be fine w.r.t. interrupts? The existing implementation does NOT use the 'lock; = ' prefix. See the following simple example: $ cat gstest.c #include #include #define GS_RELATIVE __attribute__((address_space(256))) struct pcpu { void *curthread; struct pcpu *self; }; struct pcpu * __curpcpu(void) { volatile struct pcpu * GS_RELATIVE *res =3D (volatile struct pcpu * GS_RELATIVE *) __offsetof(struct pcpu, self); return (*res); } $ clang -Wall -Wextra -O1 -c gstest.c $ objdump -d gstest.o gstest.o: file format elf64-x86-64 Disassembly of section .text: 0000000000000000 <__curpcpu>: 0: 65 48 8b 04 25 08 00 mov %gs:0x8,%rax 7: 00 00 9: c3 retq Support has been present since at least April 9, 2009 (when documentation of the feature was first added to LanguageExtensions.html). So all Clang versions in BSD core (BSD9, BSD10) should support the feature. --- sys/amd64/include/pcpu.h | 98 +++++++++++++++++++++++++++++++++++++++++---= ---- 1 file changed, 84 insertions(+), 14 deletions(-) diff --git a/sys/amd64/include/pcpu.h b/sys/amd64/include/pcpu.h index fe898e9..68892fc 100644 --- a/sys/amd64/include/pcpu.h +++ b/sys/amd64/include/pcpu.h @@ -81,7 +81,7 @@ extern struct pcpu *pcpup; #define PCPU_PTR(member) (&pcpup->pc_ ## member) #define PCPU_SET(member, val) (pcpup->pc_ ## member =3D (val)) =20 -#elif defined(__GNUCLIKE_ASM) && defined(__GNUCLIKE___TYPEOF) +#elif defined(__clang__) || (defined(__GNUCLIKE_ASM) && defined(__GNUCLIKE= ___TYPEOF)) =20 /* * Evaluates to the byte offset of the per-cpu variable name. @@ -95,6 +95,80 @@ extern struct pcpu *pcpup; #define __pcpu_type(name) \ __typeof(((struct pcpu *)0)->name) =20 +#if defined(__clang__) + +#define __GS_RELATIVE __attribute__((address_space(256))) + +/* + * Evaluates to the address of the per-cpu variable name. + */ +#define __PCPU_PTR(name) __extension__ ({ \ + volatile __pcpu_type(name) __GS_RELATIVE *__p; \ + \ + __p =3D (volatile __pcpu_type(name) __GS_RELATIVE *)__pcpu_offset(name); = \ + __p; \ +}) + +#define __PCPU_PTRX(name) __extension__ ({ \ + volatile __pcpu_type(pc_prvspace) __GS_RELATIVE *__p; \ + __pcpu_type(name) *__mp; \ + \ + __p =3D __PCPU_PTR(pc_prvspace); \ + __mp =3D &(*__p)->name; \ + __mp; \ +}) +#define PCPU_PTR(member) __PCPU_PTRX(pc_ ## member) + +/* + * Evaluates to the value of the per-cpu variable name. + */ +#define __PCPU_GET(name) __extension__ ({ \ + *__PCPU_PTR(name); \ +}) + +/* + * Adds the value to the per-cpu counter name. The implementation + * must be atomic with respect to interrupts. + */ +#define __PCPU_ADD(name, val) do { \ + __pcpu_type(name) __val; \ + volatile __pcpu_type(name) __GS_RELATIVE *__ptr; \ + \ + __val =3D (val); \ + __ptr =3D __PCPU_PTR(name); \ + *__ptr +=3D __val; \ +} while (0) + +/* + * Increments the value of the per-cpu counter name. The implementation + * must be atomic with respect to interrupts. + */ +#define __PCPU_INC(name) __PCPU_ADD(name, 1) + +/* + * Sets the value of the per-cpu variable name to value val. + */ +#define __PCPU_SET(name, val) do { \ + __pcpu_type(name) __val; \ + volatile __pcpu_type(name) __GS_RELATIVE *__ptr; \ + \ + __val =3D (val); \ + __ptr =3D __PCPU_PTR(name); \ + *__ptr =3D __val; \ +} while (0) + +#define curthread __extension__ ({ \ + *((volatile __pcpu_type(pc_curthread) __GS_RELATIVE *) \ + __pcpu_offset(pc_curthread)); \ +}) + +#define curpcb __extension__ ({ \ + *((volatile __pcpu_type(pc_curpcb) __GS_RELATIVE *) \ + __pcpu_offset(pc_curpcb)); \ +}) + +#else /* !__clang__ */ + /* * Evaluates to the address of the per-cpu variable name. */ @@ -200,17 +274,7 @@ extern struct pcpu *pcpup; } \ } =20 -#define PCPU_GET(member) __PCPU_GET(pc_ ## member) -#define PCPU_ADD(member, val) __PCPU_ADD(pc_ ## member, val) -#define PCPU_INC(member) __PCPU_INC(pc_ ## member) -#define PCPU_PTR(member) __PCPU_PTR(pc_ ## member) -#define PCPU_SET(member, val) __PCPU_SET(pc_ ## member, val) - #define OFFSETOF_CURTHREAD 0 -#ifdef __clang__ -#pragma clang diagnostic push -#pragma clang diagnostic ignored "-Wnull-dereference" -#endif static __inline __pure2 struct thread * __curthread(void) { @@ -220,9 +284,6 @@ __curthread(void) : "m" (*(char *)OFFSETOF_CURTHREAD)); return (td); } -#ifdef __clang__ -#pragma clang diagnostic pop -#endif #define curthread (__curthread()) =20 #define OFFSETOF_CURPCB 32 @@ -236,6 +297,15 @@ __curpcb(void) } #define curpcb (__curpcb()) =20 +#define PCPU_PTR(member) __PCPU_PTR(pc_ ## member) + +#endif /* __clang__ */ + +#define PCPU_GET(member) __PCPU_GET(pc_ ## member) +#define PCPU_ADD(member, val) __PCPU_ADD(pc_ ## member, val) +#define PCPU_INC(member) __PCPU_INC(pc_ ## member) +#define PCPU_SET(member, val) __PCPU_SET(pc_ ## member, val) + #define IS_BSP() (PCPU_GET(cpuid) =3D=3D 0) =20 #else /* !lint || defined(__GNUCLIKE_ASM) && defined(__GNUCLIKE___TYPEOF) = */ --=20 1.8.5.3