Date:      Wed, 7 Mar 2007 15:57:19 +1100 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        Nate Lawson <nate@root.org>
Cc:        freebsd-acpi@freebsd.org, Stefan Ehmann <shoesoft@gmx.net>
Subject:   Re: notebook freezes
Message-ID:  <20070307155444.G28283@delplex.bde.org>
In-Reply-To: <45EC8969.8060405@root.org>
References:  <200703011612.07110.shoesoft@gmx.net> <20070305004000.B17935@delplex.bde.org> <45EB28A1.5010803@root.org> <200703042242.58748.shoesoft@gmx.net> <20070305142926.O2780@besplex.bde.org> <45EC8969.8060405@root.org>

On Mon, 5 Mar 2007, Nate Lawson wrote:

> Bruce Evans wrote:
>> Could you add some RTC accesses to determine exactly what state is
>> inconsistent?  Something like the following:
>>
>>     cur_rtc_reg = inb(IO_RTC);    /* Sloppy locking. */
>>     printf("cur_rtc_reg = %02x, rtc_reg = %02x\n", cur_rtc_reg, rtc_reg);
>>     rtc_reg = -1;
>>     cur_rtc_statusa = rtcin(RTC_STATUSA);
>>     printf(...);
>>     cur_rtc_statusb = rtcin(RTC_STATUSB);
>>     printf(...);
>>
>> Where should such checks be put in acpi code if a hook like pmtimer's
>> is not available or not understood?
>
> I don't understand.  Every driver implements a DEVICE_RESUME() method
> and that is responsible for figuring out the device-specific issues for
> properly restoring the hw from any state, likely all state lost.

I didn't know that there was a generic device resume hook.

clock.c already has one, but it is bogus.  It is just bus_generic_resume,
but the "clock" driver (which names itself inconsistently as attimer[01])
has no children, so bus_generic_resume is a no-op.  It partially knows
that it is wrong and annotates its suspend/resume methods with a comment
saying what they should do.  Also, the attimer part of clock.c is only
compiled if isa is configured, and seems to be attached only if the
system supports PnP.  pmtimer seems to have been reduced to just a
hack to work around these bugs.  It doesn't depend on isa or PnP, and
does very little except use what should be clock.c's private suspend/resume
methods.
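
To illustrate why a generic resume is a no-op for a leaf device, here is
a toy userland model.  It is not the real newbus API: struct toy_dev,
toy_generic_resume() and toy_leaf_resume() are hypothetical names, and
the only assumption taken from the above is that the generic method does
nothing except forward the call to children.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userland model, NOT the real newbus API; all names are hypothetical. */
struct toy_dev {
    struct toy_dev **children;          /* child devices, may be NULL */
    size_t nchildren;                   /* number of entries in children */
    int (*resume)(struct toy_dev *);    /* resume method for this device */
    int hw_restored;                    /* set once own state is reprogrammed */
};

/* Generic bus-style resume: only forwards the call to each child. */
int toy_generic_resume(struct toy_dev *dev)
{
    for (size_t i = 0; i < dev->nchildren; i++) {
        struct toy_dev *child = dev->children[i];
        assert(child != NULL && child->resume != NULL);
        int error = child->resume(child);
        if (error != 0)
            return error;
    }
    return 0;   /* nothing here touches dev's own hardware state */
}

/* Leaf-style resume: restores the device's own state. */
int toy_leaf_resume(struct toy_dev *dev)
{
    dev->hw_restored = 1;   /* stands in for reprogramming the RTC/i8254 */
    return 0;
}
```

With nchildren == 0 the loop body never runs, so a leaf that uses the
generic method returns success without restoring anything.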

Grepping for bus_generic_resume seems to show a couple of other leaf
devices using it.

>> Where do timer updates on suspend/resume happen for acpi?
>
> pmtimer handles both (see NOTES) since DEVICE_RESUME() is called from
> both apm and acpi.

So part of a complete fix is to merge the "i386" pmtimer.c into the
i386 and amd64 clock.c's, make it unconditional or conditional on acpi
| apm (messy with modules), and remove it?  I'm not sure how to remove
the conditionals on isa and PnP.

>> Someday I
>> need to figure out why my laptop (HP nx6325) clobbers the time when
>> its lid is closed.  Suspend stuff mostly doesn't work.  In
>> particular, closing the lid doesn't even turn off the screen, but it
>> does clobber the time given by the acpi timecounter by almost exactly
>> 1 second.  The TSC timecounter doesn't lose like this but it loses in
>> other ways.  Opening the lid doesn't change the time.  I don't have
>> pmtimer configured, but pmtimer would mess up the time even more because
>> the RTC drifts relative to the correct time and inittodr() doesn't
>> sync with the RTC so it is always off by an average of -0.5 seconds.
>
> No idea -- is something running in SMM for a long time?  I seem to
> remember you had access to an oscilloscope.  Check out the cpu pin
> SMACT# when you close the lid.

No, I don't have an oscilloscope.  The time difference seems to be too
precise to be caused by anything outside of FreeBSD.  According to
ntpdate -q every second:

% server 192.168.2.7, stratum 6, offset -19.000175, delay 0.02568
%  7 Mar 15:17:47 ntpdate[7122]: step time server 192.168.2.7 offset -19.000175 sec

The clock was stepped 19 seconds by 19 previous lid closings.

% server 192.168.2.7, stratum 6, offset -19.000174, delay 0.02568
%  7 Mar 15:17:48 ntpdate[7124]: step time server 192.168.2.7 offset -19.000174 sec
% server 192.168.2.7, stratum 6, offset -19.000174, delay 0.02568
%  7 Mar 15:17:49 ntpdate[7126]: step time server 192.168.2.7 offset -19.000174 sec
% server 192.168.2.7, stratum 6, offset -19.000174, delay 0.02568
%  7 Mar 15:17:50 ntpdate[7128]: step time server 192.168.2.7 offset -19.000174 sec
% server 192.168.2.7, stratum 6, offset -19.000173, delay 0.02568
%  7 Mar 15:17:51 ntpdate[7130]: step time server 192.168.2.7 offset -19.000173 sec
% server 192.168.2.7, stratum 6, offset -19.000174, delay 0.02568
%  7 Mar 15:17:52 ntpdate[7132]: step time server 192.168.2.7 offset -19.000174 sec
% server 192.168.2.7, stratum 6, offset -19.000175, delay 0.02568
%  7 Mar 15:17:53 ntpdate[7134]: step time server 192.168.2.7 offset -19.000175 sec

The relative clocks are drifting at < 1 ppm.
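
A rough way to get such a bound from the offsets above, as a sketch only:
1 uS of offset change per second of elapsed time is 1 ppm.  drift_ppm()
is a hypothetical helper, and over such a short run the result is
dominated by the 1-2 uS of measurement jitter, so it is an upper bound
at best.

```c
#include <assert.h>

/*
 * Offsets are in microseconds, elapsed time in seconds.
 * 1 us of offset change per second of elapsed time equals 1 ppm.
 */
double drift_ppm(long first_offset_us, long last_offset_us, double elapsed_s)
{
    assert(elapsed_s > 0.0);
    return (double)(last_offset_us - first_offset_us) / elapsed_s;
}
```

For example, the samples at 15:17:47 (-19.000175) and 15:17:51
(-19.000173) differ by 2 uS over 4 seconds, which gives 0.5 ppm.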

The server timecounter is ACPI-fast and the client timecounter is
TSC.  Both are synced to a local ntp server.

Contrary to what I said before, the TSC timecounter on the server also
jumps by 1 second (but with jitter +- 200 msec due to SMP etc.) when the
lid is closed.

% server 192.168.2.7, stratum 6, offset -20.000179, delay 0.02568
%  7 Mar 15:17:56 ntpdate[7136]: step time server 192.168.2.7 offset -20.000179 sec

1 more lid closing stepped the relative clocks by -1 second - 4 uS.

% server 192.168.2.7, stratum 6, offset -20.000177, delay 0.02568
%  7 Mar 15:17:57 ntpdate[7138]: step time server 192.168.2.7 offset -20.000177 sec

2 of the 4 uS was apparently jitter.  ntp is very unlikely to have started
fixing up the 1-second jump or even the previous 19 1-second jumps yet.

% server 192.168.2.7, stratum 6, offset -20.000177, delay 0.02568
%  7 Mar 15:17:58 ntpdate[7140]: step time server 192.168.2.7 offset -20.000177 sec
% server 192.168.2.7, stratum 6, offset -20.000178, delay 0.02568
%  7 Mar 15:17:59 ntpdate[7142]: step time server 192.168.2.7 offset -20.000178 sec
% server 192.168.2.7, stratum 6, offset -20.000179, delay 0.02568
%  7 Mar 15:18:00 ntpdate[7144]: step time server 192.168.2.7 offset -20.000179 sec
% server 192.168.2.7, stratum 6, offset -20.000180, delay 0.02568
%  7 Mar 15:18:01 ntpdate[7146]: step time server 192.168.2.7 offset -20.000180 sec
% server 192.168.2.7, stratum 6, offset -20.000178, delay 0.02568
%  7 Mar 15:18:02 ntpdate[7148]: step time server 192.168.2.7 offset -20.000178 sec
% server 192.168.2.7, stratum 6, offset -20.000179, delay 0.02568
%  7 Mar 15:18:03 ntpdate[7150]: step time server 192.168.2.7 offset -20.000179 sec
% server 192.168.2.7, stratum 6, offset -21.000182, delay 0.02568
%  7 Mar 15:18:05 ntpdate[7152]: step time server 192.168.2.7 offset -21.000182 sec

Another lid closing: -1 second - 3 uS.

% server 192.168.2.7, stratum 6, offset -21.000180, delay 0.02568
%  7 Mar 15:18:06 ntpdate[7154]: step time server 192.168.2.7 offset -21.000180 sec

2 uS jitter for the second lid closing too.
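
The arithmetic used above can be sketched as follows.  The helper names
are hypothetical; offsets are kept as integer microseconds to avoid
floating-point rounding.  Each change between consecutive samples is
split into a whole-second step and a residual in uS.

```c
#include <assert.h>
#include <stddef.h>

/* Round a delta (in us) to the nearest whole number of seconds. */
long whole_seconds(long delta_us)
{
    const long half = 500000L;

    if (delta_us >= 0)
        return (delta_us + half) / 1000000L;
    return -((-delta_us + half) / 1000000L);
}

/* Residual in us after removing the whole-second part of a delta. */
long residual_us(long delta_us)
{
    return delta_us - whole_seconds(delta_us) * 1000000L;
}

/*
 * Count consecutive sample pairs whose offset changed by at least half
 * a second, i.e. whose delta rounds to a nonzero number of seconds.
 */
size_t count_steps(const long *offset_us, size_t n)
{
    size_t steps = 0;

    assert(offset_us != NULL);
    for (size_t i = 1; i < n; i++)
        if (whole_seconds(offset_us[i] - offset_us[i - 1]) != 0)
            steps++;
    return steps;
}
```

Going from -19.000175 to -20.000179 is a delta of -1000004 uS, which
splits into a step of -1 second and a residual of -4 uS; going from
-20.000179 to -21.000182 gives -1 second and -3 uS.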

% server 192.168.2.7, stratum 6, offset -21.000179, delay 0.02568
%  7 Mar 15:18:07 ntpdate[7156]: step time server 192.168.2.7 offset -21.000179 sec
% server 192.168.2.7, stratum 6, offset -21.000180, delay 0.02568
%  7 Mar 15:18:08 ntpdate[7158]: step time server 192.168.2.7 offset -21.000180 sec
% server 192.168.2.7, stratum 6, offset -21.000180, delay 0.02568
%  7 Mar 15:18:09 ntpdate[7160]: step time server 192.168.2.7 offset -21.000180 sec

Closing the lid is not completely broken.  It (or pressing the screen switch)
turns the screen off, and opening the lid turns the screen back on.  The
above output covers 2 lid openings and shows <= 2 uS of glitches for
openings.  pmtimer is not configured, and rtc_restore() is not called.

Bruce


