Date:      Sun, 8 Oct 2017 22:47:27 -0700 (PDT)
From:      Don Lewis <truckman@FreeBSD.org>
To:        kostikbel@gmail.com
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r324313 - head/sys/amd64/amd64
Message-ID:  <201710090547.v995lRso069094@gw.catspoiler.org>
In-Reply-To: <20171008083307.GG95911@kib.kiev.ua>

On  8 Oct, Konstantin Belousov wrote:
> On Sat, Oct 07, 2017 at 01:54:05PM -0700, Don Lewis wrote:
>> On  7 Oct, Konstantin Belousov wrote:
>> > On Sat, Oct 07, 2017 at 01:04:09PM -0700, Don Lewis wrote:
>> >> On  5 Oct, Konstantin Belousov wrote:
>> >> > Author: kib
>> >> > Date: Thu Oct  5 12:50:03 2017
>> >> > New Revision: 324313
>> >> > URL: https://svnweb.freebsd.org/changeset/base/324313
>> >> > 
>> >> > Log:
>> >> >   Avoid a race between freeing LDT and context switches.
>> >> >   
>> >> >   cpu_switch.S uses curproc->p_md.md_ldt value as the flag indicating
>> >> >   presence of the process LDT.  The flag is checked and then ldt segment
>> >> >   descriptor is copied into the CPU's GDT slot.
>> >> >   
>> >> >   Disallow context switches around clearing of the curproc LDT state by
>> >> >   performing the cleanup in critical section.  Ensure that the md_ldt
>> >> >   flag is cleared before md_ldt_sd descriptor content is destroyed by
>> >> >   inserting fence between the operations.
>> >> >   
>> >> >   We depend on the x86 memory model strong ordering guarantees, in
>> >> >   particular, that cpu_switch.S observes the writes to md_ldt and
>> >> >   md_ldt_sd in the expected order.
>> >> 
>> >> I don't know which of this series of commits is responsible, but I think
>> >> that it fixed the build of lang/ghc on Ryzen.
>> >> 
>> >> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221029#c102
>> > 
>> > Does ghc use LDT on amd64 ?  This sounds unbelievable.
>> 
>> I have no idea, but ghc would reliably fail to build on my Ryzen machine
>> up through r323398, and at r324367 it seems to reliably build.
> Could you try to bisect ?
> 
> I reviewed all kernel changes and do not see anything which could be
> marked as a possible fix for whatever kernel issue is causing the
> usermode fault.

I need to bisect at least userland for another reason.  I've recently
been seeing sporadic build runaways when using poudriere to build ports.
This has happened both on my replacement Ryzen CPU and also on my AMD
FX-8320E, more frequently on the Ryzen.  My reason for suspecting a
userland problem is that I've only seen this when building ports in a
12.0-CURRENT jail, and not 10.4 or 11.1.  I first started seeing this a
couple weeks ago.

This will be slow going because even on the Ryzen, my full package set
builds take about nine hours and I don't see the failure every time.

I'll also see if ghc is affected by the userland version.
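
For illustration only, here is a minimal sketch of how such a bisection
could be driven. It is not the procedure actually used: the function
name bisect_first_good, the builds_ok callback, and the trials parameter
are all hypothetical. It assumes the older revision is known to fail,
the newer one is known to work, and the fix stays in place once it
lands. Because the failure is sporadic, a revision is treated as good
only when every repeated trial succeeds, while a single failure is
enough to call it bad.

```python
from typing import Callable


def bisect_first_good(bad: int, good: int,
                      builds_ok: Callable[[int], bool],
                      trials: int = 3) -> int:
    """Return the first revision in (bad, good] at which the build works.

    Assumptions (not from the original message): `bad` is a revision
    known to fail, `good` is a revision known to work, and every
    revision from the fix onward also works.
    """
    if bad >= good:
        raise ValueError("bad revision must be older than good revision")

    lo, hi = bad, good  # invariant: lo is known bad, hi is known good
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # all() stops at the first failing trial, so a bad revision
        # costs at most one failed build; a good one costs `trials`.
        if all(builds_ok(mid) for _ in range(trials)):
            hi = mid
        else:
            lo = mid
    return hi
```

Here builds_ok(rev) would stand for "check out rev, rebuild, and run
the build under test". Between r323398 and r324367 this needs roughly
ten probe points, each repeated up to trials times.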



