Date:      Wed, 17 Jan 2018 15:10:30 -0800 (PST)
From:      Don Lewis <truckman@FreeBSD.org>
To:        Mike Tancsa <mike@sentex.net>
Cc:        Pete French <petefrench@ingresso.co.uk>, freebsd-stable@freebsd.org
Subject:   Re: Ryzen issues on FreeBSD ?
Message-ID:  <tkrat.9891976d94158d1c@FreeBSD.org>
In-Reply-To: <795dbb79-3c18-d967-98b9-5d09a740dbfe@sentex.net>
References:  <8e842dec-ade7-37d1-6bd8-856ea1a827ca@sentex.net> <3b625072-dfb3-6b4f-494d-7fe1b2fa554c@ingresso.co.uk> <2c6ce4dd-f43c-7c40-abc2-732d6f8996ec@sentex.net> <tkrat.6d8f44d87e74fa14@FreeBSD.org> <795dbb79-3c18-d967-98b9-5d09a740dbfe@sentex.net>

On 17 Jan, Mike Tancsa wrote:
> On 1/17/2018 3:39 PM, Don Lewis wrote:
>> On 17 Jan, Mike Tancsa wrote:
>>> On 1/17/2018 8:43 AM, Pete French wrote:
>>>>
>>>> Are you running the latest STABLE? There were some patches for Ryzen
>>>> which went in, I believe, and might affect the stability. Specifically the
>>>> changes to stop it locking up when executing code in the top page?
>>>
>>> Hi,
>>> 	I was testing with RELENG_11 as of 2 days ago.  The fix seems to be there
>>>
>>> # sysctl -A hw.lower_amd64_sharedpage
>>> hw.lower_amd64_sharedpage: 1
>>>
>>> Would love to find a class of motherboard that pushes its "You don't need
>>> to dork around with any BIOS settings. It just works.  Oh, and we have a
>>> hardware watchdog too".... ipmi would be stellar.
>> 
>> The shared page change fixed the random lockup and silent reboot problem
>> for me.  I've got a 1700X eight core CPU and a Gigabyte X370 Gaming 5. I
>> did have to RMA my CPU (it was an early one) because it had the problem
>> with random segfaults that seemed to be triggered by process migration
>> between CPU cores.  I still haven't switched over to using it for
>> package builds because I see more random fallout than on my older
>> package builder.  I'm not blaming the hardware for that at this point
>> because I see a lot of the same issues on my older machine, but less
>> frequently.
>> 
>> One thing to watch (though it should be less critical with a six core
>> CPU) is VRM cooling.  I removed the stupid plastic shroud over the VRM
>> sink on my motherboard so that it gets some more airflow.
> 
> Thanks! I will confirm the cooling.  I tried just now looking at the CPU
> FAN control in the BIOS and upped it to "turbo" from the default.  Does
> amdtmp.ko work with your chipset? Nothing on mine unfortunately, so I
> can't tell from the OS if it's running hot.
> 
> Is there a way to see if your CPU is old and has that bug? I haven't
> seen any segfaults on the few dozen buildworlds I have done. So far it's
> always been a total lockup and not a crash with RELENG11.
> 
> x86info v1.31pre
> Found 12 identical CPUs
> Extended Family: 8 Extended Model: 0 Family: 15 Model: 1 Stepping: 1
> CPU Model (x86info's best guess): AMD Zen Series Processor (ZP-B1)
> Processor name string (BIOS programmed): AMD Ryzen 5 1600 Six-Core Processor

My original CPU had a date code of 1708SUT (the 8th week of 2017, I think),
and the replacement has a date code of 1733SUS.  There's a humungous
discussion thread at <https://community.amd.com/thread/215773> where
date codes are discussed.  As I recall, the first replacement parts
shipped had date codes with week numbers somewhere in the mid-20s, but I
think AMD was still hand screening parts at that point.  My replacement
came in a sealed box, so it wasn't hand screened, and by then AMD was
probably able to screen for this problem in their production test.
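
If you want to compare date codes from several CPUs, here is a rough
sketch of how I read them.  It assumes the first two digits are the year
and the next two are the week, followed by a letter suffix; that layout
is my guess from the codes above, not something I have seen documented.
The function names are made up for illustration.

```python
import re

# Assumed layout: two-digit year, two-digit week, then a letter suffix.
# This is a guess based on codes like 1708SUT, not a documented format.
_DATE_CODE = re.compile(r"(\d{2})(\d{2})([A-Z]+)")


def parse_date_code(code):
    """Split a code such as '1708SUT' into (year, week, suffix)."""
    m = _DATE_CODE.fullmatch(code.strip().upper())
    if m is None:
        raise ValueError("unrecognized date code: %r" % (code,))
    year = 2000 + int(m.group(1))
    week = int(m.group(2))
    if not 1 <= week <= 53:
        raise ValueError("week out of range in date code: %r" % (code,))
    return year, week, m.group(3)


def is_newer(a, b):
    """True if code a has a later (year, week) than code b."""
    return parse_date_code(a)[:2] > parse_date_code(b)[:2]


if __name__ == "__main__":
    for code in ("1708SUT", "1733SUS"):
        year, week, suffix = parse_date_code(code)
        print("%s: year %d, week %d, suffix %s" % (code, year, week, suffix))
```

This only orders parts by manufacturing week.  It can't tell you whether
a particular CPU has the segfault problem.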



