Date:      Tue, 25 Jul 2017 17:05:21 +0000
From:      bugzilla-noreply@freebsd.org
To:        freebsd-bugs@FreeBSD.org
Subject:   [Bug 219399] System panics after several hours of 14-threads-compilation orgies using poudriere on AMD Ryzen...
Message-ID:  <bug-219399-8-JQhjw5NMsx@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-219399-8@https.bugs.freebsd.org/bugzilla/>
References:  <bug-219399-8@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219399

--- Comment #124 from Don Lewis <truckman@FreeBSD.org> ---
(In reply to Nils Beyer from comment #121)
I let it run overnight here and got 16000+ passes w/o error.

I also see the same variation:

16680: Tue Jul 25 08:04:36 PDT 2017: OK
16640: Tue Jul 25 08:04:36 PDT 2017: OK
16678: Tue Jul 25 08:04:36 PDT 2017: OK
16699: Tue Jul 25 08:04:37 PDT 2017: OK
16719: Tue Jul 25 08:04:37 PDT 2017: OK
16813: Tue Jul 25 08:04:37 PDT 2017: OK
16684: Tue Jul 25 08:04:37 PDT 2017: OK
16687: Tue Jul 25 08:04:37 PDT 2017: OK
16737: Tue Jul 25 08:04:37 PDT 2017: OK
16758: Tue Jul 25 08:04:37 PDT 2017: OK

This isn't too surprising since there are more threads than cores and the
scheduler won't be totally fair about keeping the load on each core balanced,
so the wall clock time for each process will vary a bit.  Over time there will
be some dispersion in the number of processes executed by each run1 instance.
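
To illustrate the dispersion described above, here is a toy simulation.  It is
only a sketch under stated assumptions: the function name simulate_passes, the
worker and core counts, and the use of a random choice to stand in for an
imperfectly balanced scheduler are all made up for illustration and do not
model the real FreeBSD scheduler or the run1 script.

```python
import random


def simulate_passes(workers=16, cores=8, ticks=20_000, work_per_pass=5, seed=0):
    """Toy model: each tick, a random subset of `cores` workers gets CPU time.

    All parameters are hypothetical; this is not the real scheduler.
    """
    rng = random.Random(seed)
    progress = [0] * workers  # CPU ticks each worker has received so far
    for _ in range(ticks):
        # A perfectly fair scheduler would rotate evenly.  A random pick
        # stands in for the imperfect load balancing described above.
        for w in rng.sample(range(workers), cores):
            progress[w] += 1
    # A worker completes one pass per `work_per_pass` ticks of CPU time.
    return [p // work_per_pass for p in progress]


if __name__ == "__main__":
    passes = simulate_passes()
    print("min", min(passes), "max", max(passes),
          "spread", max(passes) - min(passes))
```

Every worker does the same amount of work per pass, yet the per-worker pass
counts drift apart simply because CPU time is not handed out evenly.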

I don't know whether the segfaults in this example count as a bug or not.  The
architecture spec should say that for this sort of thing you should do A, B,
and C.  It may be the case that if you don't strictly follow the spec, your
code will run on CPU A but not on CPU B.

I forgot to mention the uop cache.  I'm wondering if it automatically gets
invalidated when writes are detected to the instruction locations that it has
cached decoded instructions for.  Note this statement about self-modifying
code:
  The micro-op cache is filled by the conventional
  instruction-fetch-and-decode pipeline, but it’s neither inclusive nor
  exclusive of the L1 instruction cache. As a result, self-modifying code is
  more difficult, as it must check and potentially invalidate both caches.
  Since the TLBs are earlier in the pipeline, the micro-op cache may be
  physically addressed, unlike Intel’s virtually addressed micro-op cache.
that I found here:
  http://www.neogaf.com/forum/showthread.php?t=1342455&page=1
I have read that AMD has been suggesting that people having stability problems
try disabling the uop cache.  The BIOS on my board does not have an option for
that.

I think this code is trying to test for the ASLR problem that a lot of Linux
users have run into.  It's a poor match for that, though.  ASLR doesn't use
self-modifying code; it always starts with a fresh process each time and just
maps stuff into randomly chosen locations each time.  If you run the same
program several times, the memory contents might look like
   A   B              C
      B     A                C
etc.  To the CPU this shouldn't be any different than running cat, make, and
sh.
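
A rough way to see the "fresh process, different layout" behaviour described
above is to start the same program several times and print one address from
each run.  This is a sketch only: sample_addresses and CHILD are hypothetical
names, and whether the addresses actually differ depends on whether the OS has
ASLR enabled for the interpreter binary.

```python
import subprocess
import sys

# One-liner run in each child: print the address of a small ctypes buffer.
CHILD = "import ctypes; print(ctypes.addressof(ctypes.c_int(0)))"


def sample_addresses(runs=3):
    """Start a fresh interpreter `runs` times; collect one address from each."""
    addrs = []
    for _ in range(runs):
        result = subprocess.run(
            [sys.executable, "-c", CHILD],
            capture_output=True, text=True, check=True,
        )
        addrs.append(int(result.stdout.strip()))
    return addrs


if __name__ == "__main__":
    for addr in sample_addresses():
        print(hex(addr))
```

Each child is an ordinary new process running unmodified code.  Nothing here
writes to instruction memory, which is why this is a different case from the
self-modifying test.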

-- 
You are receiving this mail because:
You are the assignee for the bug.


