Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 25 Feb 2017 13:05:14 +1100 (EST)
From:      Bruce Evans <>
To:        Konstantin Belousov <>
Cc:        Eric van Gyzen <>,,,
Subject:   Re: svn commit: r314179 - in head: contrib/netbsd-tests/lib/librt include lib/libc/gen lib/libc/include share/man/man3 sys/kern
Message-ID:  <>
In-Reply-To: <>
References:  <> <>

Next in thread | Previous in thread | Raw E-Mail | Index | Archive | Help
On Fri, 24 Feb 2017, Konstantin Belousov wrote:

> On Thu, Feb 23, 2017 at 07:36:39PM +0000, Eric van Gyzen wrote:
>> Modified: head/include/semaphore.h
>> ==============================================================================
>> --- head/include/semaphore.h	Thu Feb 23 19:32:25 2017	(r314178)
>> +++ head/include/semaphore.h	Thu Feb 23 19:36:38 2017	(r314179)
>> @@ -59,6 +59,8 @@ int	 sem_init(sem_t *, int, unsigned int
>>  sem_t	*sem_open(const char *, int, ...);
>>  int	 sem_post(sem_t *);
>>  int	 sem_timedwait(sem_t * __restrict, const struct timespec * __restrict);
>> +int	 sem_clockwait_np(sem_t * __restrict, __clockid_t, int,
>> +	    const struct timespec *, struct timespec *);
> I argue that semaphore.h is POSIX include file and the declaration of
> sem_clockwait_np(), despite being in implementation (non-portable)
> namespace, still should be braced with #if __BSD_VISIBLE.

This declaration also unsorts the list.  Similarly in the man page, except
at least 4 shorter lists are unsorted and sorting them would hide the
primary API.

This declaration has rather too much pollution avoidance.  It is almost
unusable without including another header that declares clockid_t, and
of course the man page gives not even a hint about this.

The implementation and POSIX has or had the same bug for sem_timewait().
That uses struct timespec, but in the 2001 version <semaphore.h> wasn't
required to declare struct timespec.  The 2007 draft version is even
worse.  It still doesn't require <semaphore.h> to declare struct
timespec, but allows <semaphore.h> to be polluted with all the symbols
in <sys/time.h>.  I haven't checked later versions.  POSIX generally
"fixes" such bugs by requiring declarations of all relevant types but
allowing low quality implemenations to do this by including massive
pollution.  At least it documents the pollution.

I don't like multiple functions per man page.  <semaphore.h> mostly
has separate ones.  It has 11 functions and at least 8 separate man
pages.  Details like extra headers being needed are easier to organize
in separate man pages (but both clockid_t and struct timespec need
the same man page <sys/time.h>).

>>  int	 sem_trywait(sem_t *);
>>  int	 sem_unlink(const char *);
>>  int	 sem_wait(sem_t *);

FreeBSD is still missing clock_nanosleep(), and has a broken nanosleep()
that sleeps on a wrong clock id.  I would have thought that this standard
function was needed more than a nonstandard semaphore function.  Eric
even mentioned clock_nanosleep() in the commit log.

Interestingly, sem_timedwait() is not broken like nanosleep().  POSIX
specifies that both sleep on CLOCK_REALTIME.  It isn't clear what this
means if someone steps the clock to adjust it, but it clearly requires
jumping by a second or 2 for leap seconds, and the implementation doesn't
do that very well (it should wake up a second or 2 early in case there is
a leap second, or for long sleeps a few percent early in case there is
clock drift, then micro-sleep to the final time).

kern_sem_wait() uses an old-style wait loop with the right clock id
but a low precision (getnanotime() uses CLOCK_REALTIME_FAST, not
CLOCK_REALTIME, so its id is not quite right either).

nanosleep() originally used the wrong clock id CLOCK_MONOTONIC_FAST
via getnanotime() in an old-style wait loop.  Now it uses sbintimes
via tsleep_sbt().  Sbintime APIs only support monotonic clocks, so
changing to the correct clock is not so easy.

POSIX timers seem to be little used, and the POSIX timer code that
supports clock ids and absolute times hasn't been converted to use
sbintimes.  It uses kernel timeouts, which are fundamentally based
on monotonic clocks, and isn't careful to wake up early to handle
leap seconds or drift.

The benefit from using sbintimes for nanosleep() seems to be just that
you can avoid reqesting a high precision, and with a little more
precision control you could reqest an early wakeup.  IIRC, the current
precision control is biased towards waking up late, and in a keyboard
function I had to fudge it to wake up early or late with no bias, but
in nanosleep() early wakeups would be preferred and for long timeouts
and timeouts across clock steps early wakeups should be mandatory.

This shows a good way to handle all clock steps: wake up immediately
to let the wait loop check the time and usually restart.  Only higher
levels can know the correct handling.  They should already be prepared
for early wakeups because these are caused by interrupts.

kern_sem_wait() does check after an interrupt, but not after a timeout
(it assumes that timeouts in hz ticks with the monotonic hz clock are
accurate for the CLOCK_REALTIME[_FAST] timecounter clock).
kern_nanosleep() used to check after a timeout, but it now depends on
timeouts in sbintimes with the monotonic sbintime clock being
accurate...  The sbintime clock is essentially CLOCK_MONOTONIC_[FAST],
so it is accurate enough, but with a wrong clock id.


Want to link to this message? Use this URL: <>