Date:      Sun, 11 Jul 2010 14:54:14 +0300
From:      Andriy Gapon <avg@freebsd.org>
To:        "Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net>, freebsd-hackers@freebsd.org
Cc:        Jeff Roberson <jeff@freebsd.org>, Konstantin Belousov <kib@freebsd.org>, Peter Wemm <peter@freebsd.org>
Subject:   Re: elf obj load: skip zero-sized sections early
Message-ID:  <4C39B0E6.3090400@freebsd.org>
In-Reply-To: <4C36FB32.30901@freebsd.org>
References:  <4C246CD0.3020606@freebsd.org> <20100702082754.S14969@maildrop.int.zabbadoz.net> <4C320E6E.4040007@freebsd.org> <20100705171155.K14969@maildrop.int.zabbadoz.net> <4C321409.2070500@freebsd.org> <4C343C68.8010302@freebsd.org> <4C36FB32.30901@freebsd.org>

on 09/07/2010 13:34 Andriy Gapon said the following:
> Having thought and experimented more, I don't see why we need inline assembly at
> all and why DPCPU_DEFINE can not simply be defined as follows:
> 
> #define DPCPU_DEFINE(t, n)	\
> 	t DPCPU_NAME(n) __section("set_pcpu") \
> 	__aligned(CACHE_LINE_SIZE) __used
> 

Here is more explanation of this proposal, with additional technical details,
demonstrations and conclusions.

First, this is the patch that I had in mind (note the file name):
http://people.freebsd.org/~avg/dpcpu/pcpu.bad.patch

Some test code that exercises the DPCPU_DEFINE macro in various (redundant) ways:
http://people.freebsd.org/~avg/dpcpu/dpcpu.c

GCC-generated assembly with the current version of pcpu.h:
http://people.freebsd.org/~avg/dpcpu/dpcpu.orig.s

Note the #APP block; this is what gets produced from the inline assembly.
Also note that the GCC-generated section directive has exactly the same parameters
as the ones we use in the inline assembly - "aw",@progbits.

GCC-generated assembly with the patched pcpu.h, and its diff from the previous version:
http://people.freebsd.org/~avg/dpcpu/dpcpu.bad.s
http://people.freebsd.org/~avg/dpcpu/dpcpu.bad.diff

Note that the section definition is exactly the same as before: it has the same
flags, and its alignment is the same too (.align 128 vs .p2align 7).
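
To make the ".align 128 vs .p2align 7" equivalence concrete, here is a minimal
sketch in C.  It assumes a target where GNU as treats the .align argument as a
byte count (as on x86 ELF); on some other targets .align takes a power of two
instead.  The helper name p2align_bytes is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Byte boundary requested by ".p2align n": 2 to the power n. */
static size_t p2align_bytes(unsigned n)
{
	return (size_t)1 << n;
}

/*
 * Assumption: ".align 128" is a byte count on this target, so it names
 * the same boundary as ".p2align 7".
 */
static_assert(((size_t)1 << 7) == 128, "2^7 must equal 128");
```

Calling p2align_bytes(7) gives the byte alignment that a byte-count .align
directive would have to spell out as 128.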

It's also obvious where I got confused with this patch (bz, thanks!) and why the
patch is named "bad".  Instead of defining alignment for the section, the patch
sets CACHE_LINE_SIZE alignment for each variable defined in that section.
That is a waste of space: every variable gets padded out to a cache line boundary.
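
A rough way to see the cost of per-variable alignment is to compute how many
bytes a run of variables spans under each scheme.  This is only a sketch: the
helper names are made up, and the numbers assume 4-byte variables and a
128-byte cache line, which is not necessarily what dpcpu.c defines.

```c
#include <assert.h>
#include <stddef.h>

/* Round off up to the next multiple of align (align must be a power of two). */
static size_t round_up(size_t off, size_t align)
{
	return (off + align - 1) & ~(align - 1);
}

/*
 * Bytes spanned by n objects of the given size when each object is
 * placed on an align-byte boundary, starting from offset 0.
 */
static size_t span_bytes(size_t n, size_t size, size_t align)
{
	size_t off = 0, end = 0;

	for (size_t i = 0; i < n; i++) {
		off = round_up(end, align);
		end = off + size;
	}
	return end;
}
```

Under these assumptions, span_bytes(3, 4, 4) is 12 bytes with natural
alignment, while span_bytes(3, 4, 128) is 260 bytes when every variable is
aligned to the cache line.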

So, while this patch is bad, it still demonstrates that the real reason for the
inline assembly is defining the section's alignment.  The assembly has nothing to
do with "aw" vs "a", variable initialization, etc., which was my main point.

For completeness, here is a patch that simply drops the inline assembly and the
comment about it, along with the GCC-generated assembly and its diff:
http://people.freebsd.org/~avg/dpcpu/pcpu.new.patch
http://people.freebsd.org/~avg/dpcpu/dpcpu.new.s
http://people.freebsd.org/~avg/dpcpu/dpcpu.new.diff

As was speculated above, the only thing that really changed is the section alignment
(from 128 to 4).
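
The drop from 128 to 4 fits the usual rule that a section's alignment in the
object file is the largest alignment requested by anything placed in it.  The
sketch below models that rule only; section_align is a made-up helper, and the
4-byte requests are an assumption about the variables in the test code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Model of how a section's alignment is derived: the largest alignment
 * requested by any object or directive that contributes to the section.
 */
static size_t section_align(const size_t *requests, size_t n)
{
	size_t max = 1;

	for (size_t i = 0; i < n; i++) {
		if (requests[i] > max)
			max = requests[i];
	}
	return max;
}
```

With only 4-byte-aligned variables the result is 4; adding a single 128-byte
request, as a section-level directive would, raises it to 128 without padding
the individual variables.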

And to be continued...

-- 
Andriy Gapon


