Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 6 Mar 2019 18:35:42 -0800
From:      Mark Millard <marklmi@yahoo.com>
To:        Justin Hibbits <chmeeedalf@gmail.com>
Cc:        freebsd-ppc <freebsd-ppc@freebsd.org>
Subject:   Re: head -r344018 powerpc64 variant on Powermac G5 (2 sockets, 2 cores each): [*buffer arena] shows up more . . .?
Message-ID:  <99AD89F8-0F90-48BE-A060-DA12FD7129E6@yahoo.com>
In-Reply-To: <8668AAF7-9E6A-4278-9D1B-2ECDBD3804AA@yahoo.com>
References:  <D9B56EE2-35C7-44A2-9229-D9E4AECAD3E1@yahoo.com> <20190306151914.44ea831c@titan.knownspace> <8668AAF7-9E6A-4278-9D1B-2ECDBD3804AA@yahoo.com>

next in thread | previous in thread | raw e-mail | index | archive | help
[The patch is definitely wrong via a 3rd use of platform_smp_timebase_sync
that I'd not noted before. Details at the end.]

On 2019-Mar-6, at 16:39, Mark Millard <marklmi at yahoo.com> wrote:



> On 2019-Mar-6, at 13:19, Justin Hibbits <chmeeedalf at gmail.com> wrote:
> 
>> On Mon, 4 Mar 2019 19:43:09 -0800
>> Mark Millard via freebsd-ppc <freebsd-ppc@freebsd.org> wrote:
>> 
>>> [It is possible that the following is tied to my hack to
>>> avoid threads ending up stuck-sleeping. But I ask about
>>> an alternative that I see in the code.]
>>> 
>>> Context: using the modern powerpc64 VM_MAX_KERNEL_ADDRESS
>>> and using usefdt=1 on an old Powermac G5 (2 sockets, 2 cores
>>> each). Hacks are in use to provide fairly reliable booting
>>> and to avoid threads getting stuck sleeping.
>>> 
>>> Before the modern VM_MAX_KERNEL_ADDRESS figure there were only
>>> 2 or 3 bufspacedaemon-* threads as I remember. Now there are 8
>>> (plus bufdaemon and its worker), for example:
>>> 
>>> root         23   0.0  0.0     0   288  -  DL   15:48     0:00.39
>>> [bufdaemon/bufdaemon] root         23   0.0  0.0     0   288  -  DL
>>> 15:48     0:00.05 [bufdaemon/bufspaced] root         23   0.0
>>> 0.0     0   288  -  DL   15:48     0:00.05 [bufdaemon/bufspaced]
>>> root         23   0.0  0.0     0   288  -  DL   15:48     0:00.05
>>> [bufdaemon/bufspaced] root         23   0.0  0.0     0   288  -  DL
>>> 15:48     0:00.05 [bufdaemon/bufspaced] root         23   0.0
>>> 0.0     0   288  -  DL   15:48     0:00.05 [bufdaemon/bufspaced]
>>> root         23   0.0  0.0     0   288  -  DL   15:48     0:00.07
>>> [bufdaemon/bufspaced] root         23   0.0  0.0     0   288  -  DL
>>> 15:48     0:00.05 [bufdaemon/bufspaced] root         23   0.0
>>> 0.0     0   288  -  DL   15:48     0:00.56 [bufdaemon// worker]
>>> 
>>> I'm sometimes seeing processes showing [*buffer arena] that
>>> seemed to wait for a fairly long time with that status, not
>>> something I'd seen historically for those same types of
>>> processes for a similar overall load (not much). During such
>>> times trying to create processes to look around at what is
>>> going on seems to also wait. (Probably with the same status?)
>>> 
>> 
>> Hi Mark,
>> 
>> Can you try the attached patch?  It might be overkill in the
>> synchronization, and I might be using the wrong barriers to be
>> considered correct, but I think this should narrow the race down, and
>> synchronize the timebases to within a very small margin.  The real
>> correct fix would be to suspend the timebase on all cores, which is
>> feasible (there's a GPIO for the G4s, and i2c for G5s), but that's a
>> non-trivial extra work.
>> 
>> Be warned, I haven't tested it, I've only compiled it (I don't have a
>> G5 to test with anymore).
>> 
> 
> Sure, I'll try it when the G5 is again available: it is doing
> a time consuming build.
> 
> I do see one possible oddity: tracing another
> platform_smp_timebase_sync use in the code . . .
> 
> DEVMETHOD(cpufreq_drv_set,      pmufreq_set)
> 
> static int
> pmufreq_set(device_t dev, const struct cf_setting *set)
> {
> . . .        
>        error = pmu_set_speed(speed_sel);
> . . .
> }
> 
> int
> pmu_set_speed(int low_speed)
> {
> . . .
>        platform_sleep();
> . . .
> }
> 
> PLATFORMMETHOD(platform_sleep,          powermac_sleep),
> 
> void
> powermac_sleep(platform_t platform)
> {
> 
>        *(unsigned long *)0x80 = 0x100;
>        cpu_sleep();
> }
> 
> void
> cpu_sleep()
> {
> . . .
>        platform_smp_timebase_sync(timebase, 0);
> . . .
> }
> 
> PLATFORMMETHOD(platform_smp_timebase_sync, powermac_smp_timebase_sync),
> 
> The issue:
> 
> I do not see any matching platform_smp_timebase_sync(timebase, 1)
> or other CPUs doing a powermac_smp_timebase_sync in this sequence.
> 
> (If this makes testing the patch inappropriate, let me know.)
> 

More important: There is also a use of:

        /* The following is needed for restoring from sleep. */
        platform_smp_timebase_sync(0, 1);

in cpudep_ap_setup . That in turn happens during cpu_reset_handler
before machdep_ap_bootstrap is called (which does
platform_smp_timebase_sync as well) :

cpu_reset_handler:
        GET_TOCBASE(%r2)
 
        ld      %r1,TOC_REF(tmpstk)(%r2)        /* get new SP */
        addi    %r1,%r1,(TMPSTKSZ-48)
 
        bl      CNAME(cpudep_ap_early_bootstrap) /* Set PCPU */
        nop
        lis     %r3,1@l
        bl      CNAME(pmap_cpu_bootstrap)       /* Turn on virtual memory */
        nop
        bl      CNAME(cpudep_ap_bootstrap)      /* Set up PCPU and stack */
        nop
        mr      %r1,%r3                         /* Use new stack */
        bl      CNAME(cpudep_ap_setup)
        nop
        GET_CPUINFO(%r5)
        ld      %r3,(PC_RESTORE)(%r5)
        cmpldi  %cr0,%r3,0
        beq     %cr0,2f
        nop
        li      %r4,1
        bl      CNAME(longjmp)
        nop
2:
#ifdef SMP
        bl      CNAME(machdep_ap_bootstrap)     /* And away! */
        nop
#endif

Thus overall for ap's there is the sequence:

        platform_smp_timebase_sync(0, 1);
. . .
        while (ap_letgo == 0)
                __asm __volatile("or 31,31,31");
        __asm __volatile("or 6,6,6");

        /*
         * Set timebase as soon as possible to meet an implicit rendezvous
         * from cpu_mp_unleash(), which sets ap_letgo and then immediately
         * sets timebase.
         *
         * Note that this is instrinsically racy and is only relevant on
         * platforms that do not support better mechanisms.
         */
        platform_smp_timebase_sync(ap_timebase, 1);

for each ap . So the (ap) case in powermac_smp_timebase_sync
will wait with tb==0 (from cpudep_ap_setup) and the later calls
from machdep_ap_bootstrap will not wait and will be after the
unleash but not just local to powermac_smp_timebase_sync:

static void
powermac_smp_timebase_sync(platform_t plat, u_long tb, int ap)
{
        static int cpus;
        static int unleash;

        if (ap) {
                atomic_add_int(&cpus, 1);
                while (!atomic_load_acq_int(&unleash))
                        ;
        } else {
                atomic_add_int(&cpus, 1);
                while (atomic_load_int(&cpus) != mp_ncpus)
                        ;
                atomic_store_rel_int(&unleash, 1);
        }
        
        mttb(tb);
}

In the end cpus will have double counts of the ap cpus instead
of matching mp_ncpus.

cpufreq_drv_set activity is a seperate, additional issue from this.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?99AD89F8-0F90-48BE-A060-DA12FD7129E6>