Date:      Sun, 04 Aug 2013 20:01:36 -0600
From:      Gary Aitken <vagabond@blackfoot.net>
To:        Frank Leonhardt <frank2@fjl.co.uk>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: AMD Phenom II X4 temperature issues  (was Re: hardware monitor)
Message-ID:  <51FF0780.1010908@blackfoot.net>
In-Reply-To: <51FEF20B.2090503@fjl.co.uk>
References:  <51FEBE38.2000202@blackfoot.net> <20130804231548.dbb1fd2e.freebsd@edvax.de> <51FEE23D.3020402@blackfoot.net> <51FEE3E0.5080709@blackfoot.net> <51FEF20B.2090503@fjl.co.uk>

On 08/04/13 18:30, Frank Leonhardt wrote:
> On 05/08/2013 00:29, Gary Aitken wrote:
>> On 08/04/13 17:22, Gary Aitken wrote:
>>> Ok, so now I see that my cpu temperature shoots up pretty dang
>>> fast when a build is going on.
>>> 
>>> I'm running an AMD Phenom II X4 with the AMD-supplied fan in an 
>>> ASUS M4A89TD PRO / USB3 motherboard.
>>> 
>>> The system "works fine" unless I start a cpu-intensive build. If
>>> I leave it unattended, after some time the system shuts down
>>> abruptly. I'm guessing it's because of excessive cpu
>>> temperatures.
>>> 
>>> When doing port builds, or any cpu-intensive job, the temperature
>>> of the CPU goes from 45 to 50 in about 30 seconds. I pretty much
>>> have to manually suspend and resume the build process to keep it
>>> down.  If I do that, I avoid the abrupt shutdown.
>>> 
>>> Needless to say, this makes unattended operation a
>>> non-starter...
>>> 
>>> Does anyone else have a similar setup they can provide me some
>>> related experience on?
>> BTW, the mobo temp stays down around 32.
>> 
> 
> Did you get that from the ACPI?

I think so; via amdtemp and xmbmon

> Obvious answers are a bigger fan, but a lot of home-build machines
> don't match the airflow through the case properly - if the CPU fan is
> blowing pre-warmed air on to the CPU it's not as good as blowing
> outside air.
> 
> 50C isn't crazy. Some would say that was barely warm, in fact. Cooler
> is always better, but you possibly don't need to worry about this.
> Some CPUs use what they call passive temperature management, and
> power management, which means they increase or reduce the clock rate
> depending on the workload and whether it's getting too hot. Faster
> switching means more heat. So getting hotter when doing a lot of work
> makes sense and could be expected. (Winchesters really heat up like
> you wouldn't believe when you move the heads a lot).

Actually, the 50C figure is just where it shoots to for starters.
Mfg specs say 62C max, so I stall the process when it gets around 59
and still climbing steeply.
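
The manual stall-and-resume routine could probably be scripted. What
follows is only a rough sketch, not something tested on this machine. It
assumes amdtemp is loaded and that dev.cpu.0.temperature reports a value
like "59.0C". The function names are made up for illustration; the 59
threshold is the one mentioned above, and the 50 resume point is an
assumption.

```shell
#!/bin/sh
# Sketch only: pause a build when the CPU gets hot, resume when it cools.
# Assumes amdtemp(4) is loaded so dev.cpu.0.temperature reads like "59.0C".

STOP_AT=59     # stall threshold from the message (mfg max is 62C)
RESUME_AT=50   # assumed resume point, not from the original message

# Strip the trailing "C" and the fractional part: "59.0C" -> "59".
temp_to_int() {
    t=${1%C}
    echo "${t%%.*}"
}

# Print "stop", "cont", or "none" for one temperature reading.
decide() {
    t=$(temp_to_int "$1")
    if [ -z "$t" ]; then
        echo none            # no reading available, leave the build alone
    elif [ "$t" -ge "$STOP_AT" ]; then
        echo stop
    elif [ "$t" -le "$RESUME_AT" ]; then
        echo cont
    else
        echo none
    fi
}

# Poll every 5 seconds while the build process ($1 = PID) exists.
watch_build() {
    while kill -0 "$1" 2>/dev/null; do
        case $(decide "$(sysctl -n dev.cpu.0.temperature 2>/dev/null)") in
            stop) kill -s STOP "$1" ;;
            cont) kill -s CONT "$1" ;;
        esac
        sleep 5
    done
}
```

Usage would be something like "watch_build 1234" with the PID of the
top-level make. Stopping only that process lets compiler children that
are already running finish, so the temperature may keep rising briefly.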

> Did you get anywhere with the ACPI suggestion (you emailed me
> privately, whether you meant to or not, but didn't mention the
> outcome). There's a lot there in the ACPI you might want to look in
> to, including fan control. If I understand it correctly, "passive
> cooling" will be engaged by acpi_thermal if the cpufreq drivers are
> in use, which may not be what you want. Try
> hw.acpi.thermal.tz0.active=1 to make the fan come on and stay on (tz0
> or as appropriate).

The fan is on and stays on all the time at the moment...

> Here's the fun part. Is your system doing a thermal overload
> shutdown? It will say so on the console, or in the message log. You
> didn't say, you just said it "shut down". If it's deciding to shut
> down through over-temperature it does not necessarily mean it's
> overheating; it could be that it has incorrectly set the shutdown
> temperature for your CPU to be far too low - possibly because it
> doesn't recognise it and is being over-cautious.

There is no indication in messages; the last thing before it shut down
the last time was some su's and root logins.
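
For reference, the log check can be done with a simple grep. A small
sketch follows; the keyword pattern is a guess at what a thermal message
might contain, not an official list, and thermal_lines is a made-up
helper name.

```shell
#!/bin/sh
# Sketch only: list log lines that look like ACPI thermal events.
# The pattern is an assumption about likely keywords in such messages.
# $1 = log file to search (defaults to /var/log/messages).
thermal_lines() {
    grep -iE 'acpi_tz|thermal|temperature' "${1:-/var/log/messages}"
}
```

Running "thermal_lines" with no argument searches /var/log/messages; no
output means no matching lines were logged.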

> It might help if you posted the results of "sysctl hw.acpi.thermal",
> but in the meantime look at:
> 
> hw.acpi.thermal.tz0._HOT hw.acpi.thermal.tz0._CRT
> 
> (replace tz0 with whatever tz you're worried about).

I don't see any of those; here's what shows up in sysctl -a :

hw.acpi.supported_sleep_state: S1 S3 S4 S5
hw.acpi.power_button_state: S5
hw.acpi.sleep_button_state: S1
hw.acpi.lid_switch_state: NONE
hw.acpi.standby_state: S1
hw.acpi.suspend_state: S3
hw.acpi.sleep_delay: 1
hw.acpi.s4bios: 0
hw.acpi.verbose: 0
hw.acpi.disable_on_reboot: 0
hw.acpi.handle_reboot: 0
hw.acpi.reset_video: 0
hw.acpi.cpu.cx_lowest: C1

> The first is the temperature when the system is supposed to stop what
> it's doing and suspend to disk (if it can). When it reaches the value
> on _CRT it'll write a message to the log file and shut down
> immediately to prevent damage. You can set these to whatever you
> want, but you have to set hw.acpi.thermal.user_override to 1 first
> before it will let you. Final trick - make sure you specify the
> temperatures like
> 
> sysctl hw.acpi.thermal.tz0._CRT=80C

# sysctl hw.acpi.thermal.user_override
sysctl: unknown oid 'hw.acpi.thermal.user_override'

Obviously, something is missing...

I tried loading coretemp, but no additional hw.acpi variables appeared,
and the man page says it is for Intel, not AMD.
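
For anyone following along, here is a small sketch of the check-then-set
sequence Frank describes. It assumes the thermal zone is named tz0 and
uses made-up helper names (have_oid, set_crt). On this machine the check
fails, so nothing would be set.

```shell
#!/bin/sh
# Sketch only: raise the critical shutdown temperature if, and only if,
# the ACPI thermal sysctls exist on this machine.

# Succeeds if the named sysctl OID exists (have_oid is a made-up helper).
have_oid() {
    sysctl -n "$1" >/dev/null 2>&1
}

# $1 = thermal zone (e.g. tz0), $2 = new limit with a trailing C (e.g. 80C).
set_crt() {
    if have_oid "hw.acpi.thermal.$1._CRT"; then
        # The override must be enabled before the limit can be changed.
        sysctl hw.acpi.thermal.user_override=1 &&
        sysctl "hw.acpi.thermal.$1._CRT=$2"
    else
        echo "no hw.acpi.thermal.$1._CRT on this system" >&2
        return 1
    fi
}
```

Called as root with "set_crt tz0 80C", it either applies both settings
or reports that the OID is absent, which is the case here.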

> Don't specify it as 80.0C (as it will display) and don't forget the C
> or it will assume degrees Kelvin!
> 
> Regards, Frank.


