Date:      Tue, 25 Jul 2017 13:30:58 +0000
From:      bugzilla-noreply@freebsd.org
To:        freebsd-bugs@FreeBSD.org
Subject:   [Bug 219399] System panics after several hours of 14-threads-compilation orgies using poudriere on AMD Ryzen...
Message-ID:  <bug-219399-8-MKxqAlrzQS@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-219399-8@https.bugs.freebsd.org/bugzilla/>
References:  <bug-219399-8@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219399

--- Comment #121 from Nils Beyer <nbe@renzel.net> ---
(In reply to Don Lewis from comment #119)

> This gives mfence() some memory loads to wait for, which allows the data to
> be migrated from the core A cache.  With this change, I no longer get any
> segfaults.

Confirmed - with that change, I haven't gotten any segfaults in 500 passes.
There is, however, a discrepancy in how many passes each core has completed:
---------------------------------------------------------------------------
[...]
412: Tue Jul 25 15:19:00 CEST 2017: OK
405: Tue Jul 25 15:19:01 CEST 2017: OK
402: Tue Jul 25 15:19:01 CEST 2017: OK
420: Tue Jul 25 15:19:01 CEST 2017: OK
410: Tue Jul 25 15:19:01 CEST 2017: OK
406: Tue Jul 25 15:19:01 CEST 2017: OK
410: Tue Jul 25 15:19:01 CEST 2017: OK
414: Tue Jul 25 15:19:01 CEST 2017: OK
410: Tue Jul 25 15:19:01 CEST 2017: OK
409: Tue Jul 25 15:19:02 CEST 2017: OK
413: Tue Jul 25 15:19:02 CEST 2017: OK
423: Tue Jul 25 15:19:02 CEST 2017: OK
397: Tue Jul 25 15:19:02 CEST 2017: OK
411: Tue Jul 25 15:19:02 CEST 2017: OK
401: Tue Jul 25 15:19:02 CEST 2017: OK
421: Tue Jul 25 15:19:02 CEST 2017: OK
438: Tue Jul 25 15:19:02 CEST 2017: OK
427: Tue Jul 25 15:19:02 CEST 2017: OK
406: Tue Jul 25 15:19:02 CEST 2017: OK
---------------------------------------------------------------------------

In my eyes, each core is performing the same workload and should therefore be
at the same pass number. Maybe I'm completely wrong, but have you observed
this, too?
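
To put a number on that discrepancy, here is a minimal sketch in Python that
reads the pass counters from the excerpt above. It is not part of the test
from comment #119; the name pass_numbers is made up for illustration, and the
script assumes the log format "N: <date>: OK" shown above. Note that the
excerpt is only the tail of the log, so its lines cannot be attributed to
individual cores.

```python
import re

# Tail of the log as pasted in this comment (the "[...]" part is not available).
LOG = """\
412: Tue Jul 25 15:19:00 CEST 2017: OK
405: Tue Jul 25 15:19:01 CEST 2017: OK
402: Tue Jul 25 15:19:01 CEST 2017: OK
420: Tue Jul 25 15:19:01 CEST 2017: OK
410: Tue Jul 25 15:19:01 CEST 2017: OK
406: Tue Jul 25 15:19:01 CEST 2017: OK
410: Tue Jul 25 15:19:01 CEST 2017: OK
414: Tue Jul 25 15:19:01 CEST 2017: OK
410: Tue Jul 25 15:19:01 CEST 2017: OK
409: Tue Jul 25 15:19:02 CEST 2017: OK
413: Tue Jul 25 15:19:02 CEST 2017: OK
423: Tue Jul 25 15:19:02 CEST 2017: OK
397: Tue Jul 25 15:19:02 CEST 2017: OK
411: Tue Jul 25 15:19:02 CEST 2017: OK
401: Tue Jul 25 15:19:02 CEST 2017: OK
421: Tue Jul 25 15:19:02 CEST 2017: OK
438: Tue Jul 25 15:19:02 CEST 2017: OK
427: Tue Jul 25 15:19:02 CEST 2017: OK
406: Tue Jul 25 15:19:02 CEST 2017: OK
"""


def pass_numbers(log):
    """Return the leading pass counter of every 'N: <date>: OK' line."""
    pattern = re.compile(r"^(\d+): .*: OK$", re.MULTILINE)
    return [int(match.group(1)) for match in pattern.finditer(log)]


passes = pass_numbers(LOG)
lowest, highest = min(passes), max(passes)
print(f"{len(passes)} lines, min {lowest}, max {highest}, "
      f"spread {highest - lowest}")
```

For the excerpt above, this reports 19 lines with pass numbers between 397 and
438, i.e. a spread of 41 passes within about two seconds of wall-clock time.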


> Ryzen bug?  Just more aggressive prefetching?  I don't know ...

It's a rather difficult question: if CPU A executes something without
segfaults and CPU B throws segfaults using the same executable, does that
automatically mean that CPU B is doing it wrong? Or does it rather mean that
CPU B is not 100% compatible with CPU A and therefore needs an appropriate
executable?

I ask because I wonder if that's something that should be reported to AMD tech
support - particularly because I have an open ticket there...

-- 
You are receiving this mail because:
You are the assignee for the bug.


