Date:      Fri, 5 May 2017 19:13:04 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Konstantin Belousov <kib@freebsd.org>
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org,  svn-src-head@freebsd.org
Subject:   Re: svn commit: r317809 - head/share/man/man7
Message-ID:  <20170505174957.B875@besplex.bde.org>
In-Reply-To: <201705042131.v44LVokb076951@repo.freebsd.org>
References:  <201705042131.v44LVokb076951@repo.freebsd.org>

On Thu, 4 May 2017, Konstantin Belousov wrote:

> Log:
>  Provide introduction for the arch(7) manpage.
>
>  Start with some words about linear address space and its layout, then
>  explain pointers models and ABIs, providing explanation to the
>  structure of the tables.
>
>  Reviewed by:	emaste, imp
>  'Future-proof' cheri wording by:	brooks
> 
> Modified: head/share/man/man7/arch.7
> ==============================================================================
> --- head/share/man/man7/arch.7	Thu May  4 21:30:26 2017	(r317808)
> +++ head/share/man/man7/arch.7	Thu May  4 21:31:50 2017	(r317809)
> ...
> @@ -35,9 +35,92 @@
> .Sh DESCRIPTION
> Differences between CPU architectures and platforms supported by
> .Fx .
> -.Pp
> +.Ss Introduction
> If not explicitly mentioned, sizes are in bytes.
> +.Pp
> +FreeBSD uses flat address space for program execution, where
> +pointers have the same binary representation as

Minor grammar problems.

"binary" is redundant.

> +.Vt unsigned long
> +variables, and
> +.Vt uintptr_t
> +and
> +.Vt size_t
> +types are synonyms for
> +.Vt unsigned long .

uintptr_t and size_t are not synonyms for unsigned long on all arches.
They only have the same representation on 32-bit arches.  On 32-bit arches,
they are synonyms for unsigned int, and thus have a lower rank than
unsigned long.  This mainly causes problems printing them, but might cause
sign extension/overflow problems.  For example, (size_t)0 + (long)-1 is
unsigned and large positive on 64-bit arches, but signed and small negative
on 32-bit arches.
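
A minimal sketch of the two hazards mentioned above, printing and mixed
arithmetic.  The function names are hypothetical.  The sketch reports
what the usual arithmetic conversions produce on the ABI it is compiled
for rather than assuming a result, and it uses %zu so that printing does
not depend on the rank of size_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Report whether size_t + long yields a positive value on this ABI.
 * The answer follows from the usual arithmetic conversions, which
 * depend on the relative ranks and widths of size_t and long.
 */
static int
sum_is_positive(void)
{
	return ((size_t)0 + (long)-1 > 0);
}

/* Format a size_t portably: %zu matches size_t whatever its rank is. */
static const char *
size_text(size_t value)
{
	static char buf[32];

	snprintf(buf, sizeof(buf), "%zu", value);
	return (buf);
}
```

Printing with %lu and a cast to unsigned long also works, but %zu avoids
the cast and stays correct if size_t is not unsigned long.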

> +.Pp
> +In order to maximize compatibility with future pointer integrity mechanisms,

"pointer integrity mechanisms" sounds like management/marketingspeak.
"integrity" isn't a relevant property of integer types.  "mechanism" might
mean the details of the representation (more than the size), but I think
you just mean the size.   Most manipulations of pointers as integers
assume the same representation.  You stated that the representation is
the same [in future] above, and didn't use the usual caveat "on all
supported arches".  I don't like this, but lots of code depends on it.

Translation of the above: "... compatibility with changes in the size of
pointers in future implementations".

> +manipulations of pointers as integers should be performed via
> +.Vt uintptr_t
> +or
> +.Vt intptr_t
> +and no other types.

Except that in the kernel, vm_offset_t should normally be used.  In fact, it
is wrong to use [u]intptr_t for anything except what is guaranteed by
the C standard.  The only guarantee is that you get back the same
value (not necessarily the same bits) if you start with a pointer of
type void * (possibly also qualified void *) and convert it to [u]intptr_t
and back.  You can also look at the bits in the integer representation,
but don't expect these to be useful.  Errors generally start in the cast.
To convert a struct pointer to an integer and back (with the same value), it
is necessary to first convert to void *, then to [u]intptr_t, then back
to void *, and finally back to the struct pointer.
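
The conversion chain described above can be sketched as follows.  The
struct and function names are hypothetical; only the order of the
conversions (struct pointer, void *, uintptr_t, void *, struct pointer)
is taken from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical struct used only for illustration. */
struct item {
	int id;
};

/* struct pointer -> void * -> uintptr_t. */
static uintptr_t
item_to_integer(struct item *p)
{
	void *vp = p;

	return ((uintptr_t)vp);
}

/* uintptr_t -> void * -> struct pointer. */
static struct item *
integer_to_item(uintptr_t u)
{
	void *vp = (void *)u;

	return ((struct item *)vp);
}

/* Nonzero if the pointer compares equal after the round trip. */
static int
round_trip_ok(void)
{
	struct item it = { 7 };
	struct item *back = integer_to_item(item_to_integer(&it));

	return (back == &it && back->id == 7);
}
```

Only equality of the pointer value after the round trip is relied on
here; the sketch makes no use of the bits of the intermediate integer.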

Use vm_offset_t for unportable uses.  For flat address spaces, it is
assumed that addition of offsets in the integer corresponds to addition
of byte offsets in the pointer, as if the pointer is a pointer to
unsigned char.  Most other properties follow from that.
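
A userland sketch of the flat-address-space assumption.  sketch_offset_t
is a hypothetical stand-in for vm_offset_t (here simply uintptr_t), and
the property checked is the one stated above, which the C standard does
not promise.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel's vm_offset_t; an assumption for userland. */
typedef uintptr_t sketch_offset_t;

/*
 * Nonzero if adding 3 to the integer form of a pointer gives the
 * integer form of the pointer advanced by 3 bytes.
 */
static int
offset_add_matches_byte_add(void)
{
	unsigned char buf[16];
	sketch_offset_t base = (sketch_offset_t)(void *)buf;
	sketch_offset_t third = (sketch_offset_t)(void *)(buf + 3);

	/* Holds on flat address spaces; not guaranteed by the C standard. */
	return (base + 3 == third);
}
```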

There is a problem converting to vm_offset_t.  We should guarantee that
vm_offset_t has all the properties of uintptr_t and much more -- that
it is not restricted to conversions between void * and back.  The
second guarantee requires compiler support in general, but we assume
a flat address space so it just requires the compiler to not be perverse.
Obviously, if [u]intptr_t exists, then the compiler can add the intermediate
casts to and from void * to handle other pointer types.

> +In particular,
> +.Vt long
> +and
> +.Vt ptrdiff_t
> +should be avoided.

ptrdiff_t should never be used in portable code.  Neither should pointer
subtraction.  Only pointer differences within PTRDIFF_MIN/MAX are
representable; otherwise, pointer subtraction is undefined.  PTRDIFF_MIN/MAX
can be as low as +-65535.  Perverse and portability-testing implementations
implement the handy type int17_t to use it perversely for ptrdiff_t, with
size_t perhaps also perversely small (it can be uint15_t), but usually much
larger than this ptrdiff_t.  Pointer subtraction is thus undefined in
general even within the same array if the array has 65536 elements.

There is a minor practical problem with non-perverse ptrdiff_t and a
corresponding problem for vm_offset_t.  32-bit vm_offset_t has a range
of 4G, but can't handle negative offsets, so you have to be careful
not to subtract a larger pointer from a smaller one, or handle the
wrap from this.  32-bit ptrdiff_t has a range of +-2G, so it can't
handle pointers differing by half of the address space.
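
The range problem can be sketched with plain 32-bit integers.  off32_t
is a hypothetical model of a 32-bit vm_offset_t, and the helper names
are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 32-bit unsigned offset type. */
typedef uint32_t off32_t;

/* Unchecked subtraction: wraps modulo 2^32 when b > a. */
static uint32_t
wrapped_sub(off32_t a, off32_t b)
{
	return (a - b);
}

/* Careful version: always subtract the smaller from the larger. */
static uint32_t
distance(off32_t a, off32_t b)
{
	return (b >= a ? b - a : a - b);
}

/* Nonzero if a distance is representable in a signed 32-bit type. */
static int
fits_in_int32(uint32_t d)
{
	return (d <= (uint32_t)INT32_MAX);
}
```

With these definitions, wrapped_sub(0x1000, 0x2000) gives 0xFFFFF000,
while a distance of 0x80000000 (half of a 32-bit address space) does not
fit in a signed 32-bit difference.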

> +Compilers define
> +.Dv _LP64
> +symbol when compiling for an
> +.Dv LP64
> +ABI.

Further minor grammar problems here and elsewhere:
- missing "the" before _LP64
- "an" is confusing.  First, "a" might be correct depending on how you
   pronounce LP64.  I pronounce it as "el ...", so "an" is better than
   "a".  But there is only 1 LP64, so "the" is more correct.  "the LP64
   ABI" is confusing too.  LP64 isn't an ABI or a collection of ABIs.
   The collection is of arches, many using a single LP64 sub-ABI with
   variations in other parts of their ABI.
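
For reference, a small sketch of how code might test for the macro.  The
function names are hypothetical, and the second function only checks
widths, which is an approximation of what the macro is meant to convey.

```c
#include <assert.h>
#include <limits.h>

/* Nonzero if the compiler defined _LP64 for this compilation. */
static int
compiled_for_lp64(void)
{
#ifdef _LP64
	return (1);
#else
	return (0);
#endif
}

/* Nonzero if both long and pointers are 64 bits wide. */
static int
long_and_pointer_are_64bit(void)
{
	return (sizeof(long) * CHAR_BIT == 64 &&
	    sizeof(void *) * CHAR_BIT == 64);
}
```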

> ...
> +Examples are:
> +.Bl -column -offset indent "powerpc64" "Sy ILP32 counterpart"
> +.It Sy LP64        Ta Sy ILP32 counterpart

This has the "Sy" sizing bug in only 1 field in the header.

> @@ -48,6 +131,9 @@ On all supported architectures:
> .It float Ta 4
> .It double Ta 8
> .El
> +Integers are represented as two-complement.
> +Alignment of integer and pointer types is natural, that is,
> +the address of the variable must be congruent to zero modulo type size.

Missing "the" after "modulo".

Is it natural for arm?  arm has unnatural struct padding, at least at
the end of structs.
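
A sketch of how both properties could be checked on a given arch.  The
names are hypothetical, and struct tail exists only to expose padding at
the end of a struct; the amount reported is whatever the ABI in use
chooses, and nothing here is specific to arm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Nonzero if the address is congruent to zero modulo the given size. */
static int
is_naturally_aligned(const void *p, size_t size)
{
	return ((uintptr_t)p % size == 0);
}

/* Check one integer variable and one pointer variable. */
static int
int_and_pointer_aligned(void)
{
	static int i;
	static void *vp;

	return (is_naturally_aligned(&i, sizeof(i)) &&
	    is_naturally_aligned(&vp, sizeof(vp)));
}

/* Hypothetical struct, used only to expose padding at the end. */
struct tail {
	long l;
	char c;
};

/* Bytes of padding after the last member of struct tail. */
static size_t
tail_padding(void)
{
	return (sizeof(struct tail) - offsetof(struct tail, c) -
	    sizeof(char));
}
```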

Bruce


