Date:      Fri, 3 Jun 2016 13:29:43 +0800
From:      RayCherng Yu <raycherng@gmail.com>
To:        Kevin Oberman <rkoberman@gmail.com>
Cc:        "O. Hartmann" <ohartman@zedat.fu-berlin.de>, Hans Petter Selasky <hps@selasky.org>,  FreeBSD Current <freebsd-current@freebsd.org>
Subject:   Re: Suddenly poweroff in 11-Current r300097
Message-ID:  <CAG2Wu4PQxaVrpc4DR-MjRK2V0JM8=9QW=AAe2-=_2zemTN=LVQ@mail.gmail.com>
In-Reply-To: <CAN6yY1supvEJN=e3_dwHRuMq_OdFGWDZNE-o2oYjWFDx9S9Z9w@mail.gmail.com>
References:  <CAG2Wu4OtrDxRWMtYhOt2RNCEjryYuZzhYE=-WKH3t-153M9jJA@mail.gmail.com> <0448c751-8608-51ce-f47e-76280ebf14f2@selasky.org> <CAN6yY1uxTdv0Gbh=A3vPRgSkboOb5dgjNRc0V9VrJvCqsSprZg@mail.gmail.com> <20160602224654.18927083.ohartman@zedat.fu-berlin.de> <CAN6yY1supvEJN=e3_dwHRuMq_OdFGWDZNE-o2oYjWFDx9S9Z9w@mail.gmail.com>

OK, thanks.
I will load coretemp to monitor the CPU temperature.
I have enabled powerd so that the CPU frequency is adjusted automatically.
Also, the poweroff does not happen when the AC power supply is connected.
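
For anyone following along, here is a minimal sketch of how the per-core
temperatures could be watched once coretemp is loaded. It assumes that
`sysctl dev.cpu` prints lines of the form "dev.cpu.0.temperature: 52.0C",
which is what coretemp usually reports; the function names are only
illustrative, and the pattern may need adjusting on other systems.

```python
import re
import subprocess

# Matches lines such as "dev.cpu.0.temperature: 52.0C". This format is an
# assumption based on typical coretemp output, not something from the thread.
_TEMP_LINE = re.compile(r"^dev\.cpu\.(\d+)\.temperature:\s*(-?\d+(?:\.\d+)?)C\s*$")


def parse_cpu_temperatures(sysctl_output):
    """Return {cpu_index: degrees_celsius} parsed from `sysctl dev.cpu` text."""
    temps = {}
    for line in sysctl_output.splitlines():
        match = _TEMP_LINE.match(line.strip())
        if match:
            temps[int(match.group(1))] = float(match.group(2))
    return temps


def hottest_core(temps):
    """Return (cpu_index, temperature) of the hottest core, or None if empty."""
    if not temps:
        return None
    return max(temps.items(), key=lambda item: item[1])


def read_cpu_temperatures():
    """Run `sysctl dev.cpu` and parse it (needs FreeBSD with coretemp loaded)."""
    result = subprocess.run(
        ["sysctl", "dev.cpu"], capture_output=True, text=True, check=True
    )
    return parse_cpu_temperatures(result.stdout)
```

Calling hottest_core(read_cpu_temperatures()) in a loop during a heavy build
should show how hot the cores get just before the fan speeds up.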


2016-06-03 8:08 GMT+08:00 Kevin Oberman <rkoberman@gmail.com>:

> On Thu, Jun 2, 2016 at 1:46 PM, O. Hartmann <ohartman@zedat.fu-berlin.de>
> wrote:
>
>> On Thu, 2 Jun 2016 10:26:22 -0700
>> Kevin Oberman <rkoberman@gmail.com> wrote:
>>
>> > On Thu, Jun 2, 2016 at 7:41 AM, Hans Petter Selasky <hps@selasky.org> wrote:
>> >
>> > > On 06/02/16 03:07, RayCherng Yu wrote:
>> > >
>> > >> I got a sudden poweroff in r300097 (and in earlier revisions from
>> > >> April and May) when I built textproc/docproj.
>> > >> My machine is an early-2011 13-inch MacBook Pro. I have checked the
>> > >> Apple website, and my BIOS is the latest version.
>> > >> It also happened in 10.3-STABLE.
>> > >> It happened when the machine load was heavy. Before the shutdown, the
>> > >> fan started to run very loudly. After 20 or 30 seconds, my laptop
>> > >> suddenly powered off. It does not seem to happen with the AC power
>> > >> supply connected.
>> > >>
>> > >> I installed both Mac OSX and FreeBSD (dual boot). It never happened
>> > >> in Mac OSX.
>> > >>
>> > >> My dmesg:
>> > >> http://pastebin.com/QjZmbGCB
>> > >>
>> > >> My sysctl hw.acpi:
>> > >>
>> > >> hw.acpi.acline: 0
>> > >> hw.acpi.battery.info_expire: 5
>> > >> hw.acpi.battery.units: 1
>> > >> hw.acpi.battery.state: 1
>> > >> hw.acpi.battery.time: 87
>> > >> hw.acpi.battery.life: 59
>> > >> hw.acpi.cpu.cx_lowest: C8
>> > >> hw.acpi.reset_video: 0
>> > >> hw.acpi.handle_reboot: 1
>> > >> hw.acpi.disable_on_reboot: 0
>> > >> hw.acpi.verbose: 0
>> > >> hw.acpi.s4bios: 0
>> > >> hw.acpi.sleep_delay: 1
>> > >> hw.acpi.suspend_state: S3
>> > >> hw.acpi.standby_state: NONE
>> > >> hw.acpi.lid_switch_state: NONE
>> > >> hw.acpi.sleep_button_state: S3
>> > >> hw.acpi.power_button_state: S5
>> > >> hw.acpi.supported_sleep_state: S3 S4 S5
>> > >>
>> > >>
>> > > Hi,
>> > >
>> > > Do you have a temperature sysctl? Usually FreeBSD will shut down the
>> > > system if the ACPI temperature exceeds some value. Maybe it would be
>> > > better to reduce the CPU load when the temperature goes up instead of
>> > > facing a shutdown?
>> > >
>> > > --HPS
>> >
>> >
>> > The relevant information is probably found in dev.cpu. That is where all
>> > temperature information is located, as it is per-CPU, not per-system. Of
>> > particular interest are dev.cpu.0.cx_lowest, dev.cpu.0.cx_supported, and
>> > dev.cpu.0.freq_levels. A snapshot of dev.cpu.0 when the fan has cranked
>> > up, but before shutdown, would be nice, too.
>> >
>> > I see no hw.acpi.thermal information. This is very odd. These values
>> > indicate what the system will do and is doing if it starts getting too
>> > hot.
>> >
>> > Is coretemp loaded? It is required to see the core temperatures and
>> > those are almost certainly significant. It may account for the lack of
>> > thermal information. Finally, a dmesg might be useful as it will tell us
>> > more about just what thermal control techniques are enabled.
>> >
>> > Just to explain a bit on how this should work: when the temperature
>> > exceeds some BIOS-defined point, the system should "throttle" by pausing
>> > one of every 8 clock cycles. If that does not fix the problem, then it
>> > rests for two of every 8 and so on until the temperature is reduced. If
>> > it continues to rise and reaches another BIOS set point, it will
>> > initiate an emergency shutdown. If it reaches a CPU-defined temperature,
>> > the power will shut off immediately. Note that this last step is
>> > entirely a hardware function with no BIOS or OS involvement. It should
>> > NEVER happen in normal operation as it is triggered by a significant
>> > overtemp that threatens to destroy the CPU. I've only seen it once, when
>> > the CPU heat sink came loose on an old P4 system several years ago.
>> >
>> > I should mention that I have zero experience with Apple hardware and it
>> > is possible that they do some things differently than I have seen on
>> > other hardware.
>> > --
>> > Kevin Oberman, Part time kid herder and retired Network Engineer
>> > E-mail: rkoberman@gmail.com
>> > PGP Fingerprint: D03FB98AFA78E3B78C1694B318AB39EF1B055683
>>
>> I have had such problems many times with older hardware. In most cases,
>> a dried-out thermal conductive pad or grease was the reason the CPU
>> overheated, due to ineffective thermal conduction from the CPU's
>> surface to the heat spreader/cooler. I recently had two laptops with
>> this problem; using high-quality thermal grease solved it for me. In
>> both cases, the former high-viscosity thermal grease had become like dry
>> mud. Same with pads.
>>
>
> Valid suggestion. If you have not worked with it, keep the layer of grease
> as thin as possible. Use quality grease, not pads or tape. They just don't
> work as well. Good silicone thermal grease should remain effective for a
> minimum of 10 years.
>
> Also, clean your heat sinks! I clean the ones on my laptop about once a
> year (I have to remove the keyboard to blow them out) and I see the
> quiescent temperature drop by 10-15C and the temp under load can drop by
> 20C. As active cooling works on my laptop, it does not overheat, but it
> does slow down on "buildworld -j6" and building ports like chromium and
> libreoffice. Very significant.
> --
> Kevin Oberman, Part time kid herder and retired Network Engineer
> E-mail: rkoberman@gmail.com
> PGP Fingerprint: D03FB98AFA78E3B78C1694B318AB39EF1B055683
>
>
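
For reference, the throttling steps Kevin describes above (pausing one of
every 8 clock cycles, then two of every 8, and so on) come down to simple
arithmetic. The sketch below only illustrates that arithmetic; the function
names and the 2400 MHz nominal clock in the usage note are hypothetical, and
real hardware may not step exactly this way.

```python
def throttled_frequency_mhz(nominal_mhz, paused_cycles, window=8):
    """Effective clock rate when `paused_cycles` of every `window` cycles are skipped."""
    if not 0 <= paused_cycles <= window:
        raise ValueError("paused_cycles must be between 0 and window")
    return nominal_mhz * (window - paused_cycles) / window


def throttle_steps(nominal_mhz, window=8):
    """List (paused_cycles, effective_mhz) for each throttle step short of a full stop."""
    return [
        (paused, throttled_frequency_mhz(nominal_mhz, paused, window))
        for paused in range(window)
    ]
```

With a hypothetical 2400 MHz clock, pausing one cycle in 8 gives an
effective 2100 MHz, and pausing two in 8 gives 1800 MHz.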


-- 
"Life is like a snowball. The important thing is finding wet snow and a
really long hill."

"Price is what you pay. Value is what you get."

"The first rule of Investing is don't lose money; the second rule is don't
forget rule #1..."

"Wall Street is the only place that people ride to work in a Rolls-Royce to
get advice from those who take the subway..."


- Warren Buffett.


