Date:      Fri, 26 Apr 2019 15:05:34 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject:   Re: I have submitted bugzilla 237590 for old powerpc64 FreeBSD on G5 PowerMac's crashing for "ofwdump -ap" and the like: timeout trying to sleep the CPUs
Message-ID:  <EEFD7C61-25FC-4F42-903B-1744FFFC5E9E@yahoo.com>
In-Reply-To: <0224E7FC-52CC-4148-B795-453894BBAC65@yahoo.com>
References:  <0224E7FC-52CC-4148-B795-453894BBAC65@yahoo.com>

[I added a comment noting another possibility about the failure
sequence and why a timeout for cpu sleep might be what is reported.]

On 2019-Apr-26, at 14:12, Mark Millard <marklmi at yahoo.com> wrote:

> The biggest issue here may be the inability to sufficiently sleep
> CPUs on powerpc64 in general, with ofwdump on old PowerMac G5's
> just being a good way to test that.
> 
> But my test context is limited to old PowerMacs.
> 
> This was originally observed on head -r345758.
> 
> "bisecting" based on:
> 
> https://artifact.ci.freebsd.org/snapshot/head/r*
> 
> I found that for the likes of "ofwdump -ap > /dev/null" :
> 
> -r330572: does not crash the system (or program).
> -r330614: crashes the system: timeout trying to sleep cpus.
> 
> There are no other  https://artifact.ci.freebsd.org/snapshot/head/r*
> between -r330572 and -r330614 with powerpc64 present. So I stopped
> at this range.
> 
> Turns out that between those two versions is:
> 
> Revision 330610 . . .
> Modified Wed Mar 7 17:08:07 2018 UTC . . . by nwhitehorn 
> . . .
> Move the powerpc64 direct map base address from zero to high memory. This
> accomplishes a few things:
> - Makes NULL an invalid address in the kernel, which is useful for catching
> bugs.
> . . .
> 
> (It may be that -r330610 exposed another problem that was
> accidentally avoided before that.)
> . . .
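
[Illustration only: the "bisecting" over available snapshots quoted
above can be sketched as below. The function name first_bad, the
crashes() predicate, and the revisions 330100 and 331200 are
hypothetical. Only 330572, 330614 and 345758 come from the report.
In practice crashes() would mean booting that snapshot and running
"ofwdump -ap > /dev/null" by hand.]

```python
def first_bad(revisions, crashes):
    """Return the adjacent (last_good, first_bad) pair from a sorted list.

    Assumes revisions[0] does not crash, revisions[-1] does crash, and
    that once the crash appears it stays in all later revisions.
    """
    lo, hi = 0, len(revisions) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if crashes(revisions[mid]):
            hi = mid   # crash seen: first bad revision is at mid or earlier
        else:
            lo = mid   # no crash: first bad revision is after mid
    return revisions[lo], revisions[hi]


# Snapshot revisions that have a powerpc64 build. 330100 and 331200 are
# made-up placeholders; the others appear in the report.
snapshots = [330100, 330572, 330614, 331200, 345758]

# Stand-in for the manual test, under the assumption that the crash
# first appears with -r330610.
def crashes(rev):
    return rev >= 330610

print(first_bad(snapshots, crashes))
```

The search can only narrow things to the nearest pair of snapshots
that actually exist, which is why the result is a range rather than
a single revision.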

I suppose one possibility is a combination of the following:

A) The ddb related code cannot sleep some CPU(s)
   (because some are already sleeping?).

B) Open Firmware tried to use an address that is
   invalid in the kernel as of -r330610 .

(A) may prevent the display of a notice that would point
to (B) as a possibility, thus hiding the true cause.
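
[Illustration only: a toy model of why (B) could start happening at
-r330610. The new base value and the map size below are assumptions
for the sketch; the commit message only says "high memory". The
helper names are hypothetical and this is not kernel code.]

```python
OLD_DMAP_BASE = 0x0                 # direct map base before -r330610
NEW_DMAP_BASE = 0xC000000000000000  # assumed "high memory" base afterwards
DMAP_SIZE = 1 << 40                 # hypothetical direct map size


def phys_to_dmap(pa, base):
    """Direct-map virtual address for physical address pa."""
    return base + pa


def in_dmap(va, base, size=DMAP_SIZE):
    """True if va falls inside the direct map that starts at base."""
    return base <= va < base + size


pa = 0x1000  # hypothetical physical address held by firmware

# With a zero base, a physical address used directly as a pointer
# happens to land inside the direct map.
print(in_dmap(pa, OLD_DMAP_BASE))

# With the base moved high, the same raw value is no longer covered
# by the direct map.
print(in_dmap(pa, NEW_DMAP_BASE))

# Only the translated address is covered.
print(in_dmap(phys_to_dmap(pa, NEW_DMAP_BASE), NEW_DMAP_BASE))
```

If some path hands an untranslated address to code running with the
kernel's mappings, it would have worked by accident before the move
and could fault afterwards, which would fit the "accidentally
avoided" wording quoted above.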


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



