Date:      Sun, 18 Aug 2013 14:00:07 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        arch@FreeBSD.org
Subject:   Re: suspend/resume time-gap and expiration timers in network stack
Message-ID:  <20130818123906.A942@besplex.bde.org>
In-Reply-To: <20130818004948.L4326@besplex.bde.org>
References:  <20130817.173019.1478850854128616078.hrs@allbsd.org> <20130818004948.L4326@besplex.bde.org>

On Sun, 18 Aug 2013, I wrote:

> ...
> just 3 types of jump should change the boot time.
>
> Types of jumps:
> 1. when the system is initialized.  Times are set by eventually calling
>    ...
> 2. Many systems have the RTC on wall clock time.  Both of the times set by
>    ...
> 3. Many systems set the real time more precisely using ntp.  This is best
>    ...
> ...
> The acpica code that fixes up the real time on resume is hard to find.
> I can only find acpi_resync_clock().  This is under an __amd64__ ifdef
> and a boolean sysctl that defaults to on.  It is very broken:
>
> % static void
> % acpi_resync_clock(struct acpi_softc *sc)
> % {
> % #ifdef __amd64__
> %     if (!acpi_reset_clock)
> % 	return;
> %
> %     /*
> %      * Warm up timecounter again and reset system clock.
> %      */
> %     (void)timecounter->tc_get_timecount(timecounter);
> %     (void)timecounter->tc_get_timecount(timecounter);
> %     inittodr(time_second + sc->acpi_sleep_delay);
> % #endif
> % }
>
> The time_second + ... arg for inittodr() is probably out of date, but
> it is only used if reading the RTC hardware fails.  This function is
> very broken since it doesn't use the RTC delta.  It is basically missing
> all of the initialization steps (1)-(3), plus any later tracking of the
> real time done by ntp.  The largest error is when the RTC is on wall
> clock time.  Then the above gives an error of 10 hours or so.

Oops, inittodr() adds the current gmtoffset, so it has steps (1) and (2)
built in and doesn't have a very large error.
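
To illustrate the size of the error that step (2) avoids, here is a minimal
userland sketch. It is not the inittodr() code; the function name and the
sign convention (offset_sec is whatever must be added to the RTC reading to
get UTC) are assumptions made for the example.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert an RTC reading kept on wall clock time
 * to UTC by adding an offset in seconds.  For a zone 10 hours east of
 * UTC the offset is -36000 under this (assumed) sign convention.
 */
int64_t
rtc_wall_to_utc(int64_t rtc_sec, int64_t offset_sec)
{
	return (rtc_sec + offset_sec);
}
```

With the RTC 10 hours ahead of UTC, dropping the offset leaves the result
wrong by 36000 seconds, which is the size of the error mentioned in the
quoted text.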

Starting in about FreeBSD-9, there is also periodic update of the RTC
with a default period of 1800 seconds.  This gives some of the
preciseness of (3), but reading and writing of the RTC have errors of
up to about 1 second each in at least the x86 implementation, so the
preciseness is not nearly as much as ntp's, and these periodic updates
are just bugs.  They only help keep the error in the RTC small after
a panic or power failure.  Other cases are better handled by only reading
the RTC at critical points like suspend and resume and keeping track of
the difference between it and the current (timecounter) time then, except
for writing it on reboot.  Suspend/resume is unlike reboot on at least
x86 since it retains enough state in memory variables that can keep track
of the difference better than the RTC hardware.
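
A minimal sketch of that bookkeeping, in userland C with hypothetical names
(this is not the acpica or kernel code, and it assumes whole-second
readings): record the RTC and the system time together at suspend, then on
resume advance the saved system time by the change in the RTC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state kept in memory across suspend. */
struct suspend_snapshot {
	int64_t	rtc_sec;	/* RTC reading at suspend */
	int64_t	sys_sec;	/* system (timecounter) time at suspend */
	int	valid;		/* nonzero once a snapshot was taken */
};

/* Record both clocks at the same critical point. */
void
snapshot_at_suspend(struct suspend_snapshot *s, int64_t rtc_now,
    int64_t sys_now)
{
	s->rtc_sec = rtc_now;
	s->sys_sec = sys_now;
	s->valid = 1;
}

/* System time to restore on resume: saved time plus the RTC delta. */
int64_t
time_at_resume(const struct suspend_snapshot *s, int64_t rtc_now)
{
	assert(s->valid);
	return (s->sys_sec + (rtc_now - s->rtc_sec));
}
```

Only the difference of two RTC readings is used, so any constant offset
between the RTC and the system time (wall clock time in the RTC, or a
correction already made by ntp) cancels out.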

See Linux for the complications needed to read and write AT RTCs without
losing a second for each.  My version only has these complications for
reading.  For writing, it just skips the write if the change would be
-1, 0, or 1 seconds.  Writing differences of 0 (which turn into differences
of up to 1 because of no synchronization) is obviously wrong.  Forgetting
about differences of -1 and 1 is no worse than getting errors of -1 or 1
from the missing synchronization. For larger differences, the fix is larger
than the possible error from no synchronization, so it may as well be made
provided nothing requires synchronization with the old readings of the RTC.
Old i386 versions of inittodr() achieved essentially the same result as my
resettodr() gives, by refusing to update the current time when the time
difference is -1, 0 or 1 second.  This has been lost.
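
The skip rule can be sketched as follows. This is only an illustration of
the rule as stated above, not the actual resettodr() change; the function
name is hypothetical and both times are assumed to be in whole seconds.

```c
#include <stdint.h>

/*
 * Hypothetical predicate: return 1 if the RTC should be rewritten,
 * 0 if the write should be skipped because the difference is no
 * larger than the error an unsynchronized write could introduce.
 */
int
rtc_write_needed(int64_t rtc_sec, int64_t new_sec)
{
	int64_t diff = new_sec - rtc_sec;

	return (diff < -1 || diff > 1);
}
```

For example, rtc_write_needed(100, 101) is 0 and rtc_write_needed(100, 102)
is 1.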

Writing the AT RTC is a heavyweight operation that also stops its interrupts,
so may mess up synchronization of anything that uses it as an event timer.
Its locking has always been completely broken.  The non-hardware part of
it used to be protected by splclock().  Now it is protected by Giant at best,
so context switches may occur during the hardware part of the update during
which its interrupts are disabled.  Event timers using it are then broken
for as long as the context switch...

Reading of the AT RTC is lighter weight, and the locking in it is less
broken.  It used to be almost all protected by splhigh().  Now it is
not almost all protected by critical_enter().  The main critical_enter()
should be in the same place as the splhigh() was, but it is now a little later
so there is a race between seeing RTC_TUP and acting on this.

Reading and writing of the AT RTC should always have used a hard interrupt
disable, to prevent any interference.

Bruce


