Date:      Mon, 8 Sep 2014 22:10:18 -0700
From:      Mark Millard <markmi@dsl-only.net>
To:        Nathan Whitehorn <nwhitehorn@freebsd.org>, Justin Hibbits <chmeeedalf@gmail.com>
Cc:        freebsd-ppc@freebsd.org
Subject:   Re: powerpc64/GENERIC64 for NVIDIA now crashes Xorg (backtrace included) for Option-Fn; Radeon gets a display oddity
Message-ID:  <87D45E18-4312-4B74-8309-D35E66F86F99@dsl-only.net>
In-Reply-To: <B950CC79-19D7-4C1B-B15E-8228DFCE48E8@dsl-only.net>
References:  <4D86DDCB-FF04-4EA2-9703-8B74BBF31C7E@dsl-only.net> <EDE36402-30CE-4747-8BDD-EDD82D8C308F@dsl-only.net> <D42F3E26-8D35-4C8B-A695-AA380ED888E1@dsl-only.net> <EF019CAD-6BAB-431D-A239-0644C0634C24@dsl-only.net> <540386C6.4060004@freebsd.org> <7AFF7E0F-6BB0-4972-A629-61910CE001C2@dsl-only.net> <540393F3.5060508@freebsd.org> <D53F6E61-13E3-4473-ABAD-F72BD86E1083@dsl-only.net> <2B74B670-7463-47D1-B0AF-BDBFEE8823A4@dsl-only.net> <1B729E38-6495-4240-B9E2-A48238E4E830@dsl-only.net> <D960F3AF-7498-4222-96E9-654E9B672EF7@dsl-only.net> <38A1300F-E5A4-4A71-A9CF-A7BED66E0BDF@dsl-only.net> <5408976A.5080106@freebsd.org> <6DE6C98D-F553-4F59-A72A-AEA881DC1C65@dsl-only.net> <3D7C705D-5792-43FA-835C-9FD88AEAE07E@dsl-only.net> <DC6BA46B-C123-41A3-AD07-1212FC084B88@dsl-only.net> <35DA591A-127B-4F46-B779-D76A0F71DA39@dsl-only.net> <20140906172136.59a531d0@zhabar.attlocal.net> <3B34604D-2EDB-4315-97E9-4C97652E9AE7@dsl-only.net> <20140906180532.35faf018@zhabar.attlocal.net> <E0A8E3FA-C68 9-4EC1-9275-BF517B50CC95@dsl-only.net> <B950CC79-19D7-4C1B-B15E-8228DFCE48E8@dsl-only.net>

More notes for the NVIDIA and Radeon contexts, for:

FreeBSD FBSDG5S0 10.1-PRERELEASE FreeBSD 10.1-PRERELEASE #0 r271278: Mon Sep  8 12:40:56 PDT 2014     root@FBSDG5S0:/usr/obj/usr/src/sys/GENERIC64  powerpc

The NVIDIA GeForce 7800 GT is also getting stuck at a black screen when
logging out of xfce4.

And that does mean that Xorg/xfce4 are quitting, but not with the same
backtrace as for the NVIDIA Option-Fn crashes:

Core was generated by `Xorg'.
Program terminated with signal SIGABRT, Aborted.
#0  0x0000000050bc49d8 in .__sys_thr_kill () at thr_kill.S:3
3	thr_kill.S: No such file or directory.
(gdb) info threads
  Id   Target Id         Frame
* 2    Thread 51406400 (LWP 100071) 0x0000000050bc49d8 in .__sys_thr_kill () at thr_kill.S:3
* 1    Thread 51406400 (LWP 100071) 0x0000000050bc49d8 in .__sys_thr_kill () at thr_kill.S:3
(gdb) info stack
#0  0x0000000050bc49d8 in .__sys_thr_kill () at thr_kill.S:3
#1  0x00000000506ae274 in _thr_send_sig (thread=<optimized out>, sig=6) at /usr/src/lib/libthr/thread/thr_sig.c:113
#2  0x0000000050ca6068 in abort () at /usr/src/lib/libc/stdlib/abort.c:65
#3  0x00000000102a0370 in OsAbort () at utils.c:1198
#4  0x00000000100c04a8 in ddxGiveUp (error=EXIT_ERR_ABORT) at xf86Init.c:1009
#5  0x00000000100c064c in AbortDDX (error=EXIT_ERR_ABORT) at xf86Init.c:1053
#6  0x00000000102aa3cc in AbortServer () at log.c:476
#7  0x00000000102aa974 in FatalError (f=0x102d68c0 "Caught signal %d (%s). Server aborting\n") at log.c:611
#8  0x000000001029c744 in OsSigHandler (signo=11, sip=0xffffffffffffd740, unused=0xffffffffffffd280) at osinit.c:146
#9  0x00000000506ae478 in handle_signal (actp=0xffffffffffffd1a0, sig=11, info=0xffffffffffffd740, ucp=0xffffffffffffd280)
    at /usr/src/lib/libthr/thread/thr_sig.c:238
#10 0x00000000506ae790 in thr_sighandler (sig=11, info=0xffffffffffffd740, _ucp=0xffffffffffffd280) at /usr/src/lib/libthr/thread/thr_sig.c:183
#11 0xffffffffffffe188 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

The logout is producing an Xorg core file. It would also appear that
Xorg is not doing the proper cleanup/context-restore in some manner.



The ATI Radeon 9800PRO NH (AGP) gets a worse problem when I log out of
xfce4 (compared to what I reported for Radeon's Option-Fn handling):
When it returns to scons (say) VT1 the input from the keyboard is messed
up.

Option-Fn to a different VTn and input starts working. Option-F1 (back
to the original) and input is again working there.

(The top and bottom pixel bands from the Xorg/xfce4 session can still be
there after Xorg/xfce4 has quit [via SIGABRT for the logout]. But
otherwise the display is working: it is not just stuck at black.)

The Radeon xfce4 logout is also producing an Xorg core file, but its
backtrace appears to look normal for SIGABRT from what I can tell.
(Although it is not clear to me why SIGABRT is classified as appropriate
for a core dump.)




===
Mark Millard
markmi at dsl-only.net

On Sep 8, 2014, at 9:17 PM, Mark Millard <markmi@dsl-only.net> wrote:

Well, I built powerpc64/GENERIC64 with a fresh set of ports (all via svn
for a previously unused SSD):

FreeBSD FBSDG5S0 10.1-PRERELEASE FreeBSD 10.1-PRERELEASE #0 r271278: Mon Sep  8 12:40:56 PDT 2014     root@FBSDG5S0:/usr/obj/usr/src/sys/GENERIC64  powerpc

I built this with /etc/make.conf having:

WITH_DEBUG_FILES=
WITHOUT_CLANG= # To avoid the problem with the above.
WITH_DEBUG=

I tried startxfce4 first on a quad-core PowerMac G5 with an NVIDIA
GeForce 7800 GT.

While startxfce4 works, a later Option-Fn (to get to VTn) does not: I
end up stuck at a black screen.

The tail of the Xorg.0.log shows that Xorg suddenly quit when I did the
Option-Fn:

[    48.476] (**) Option "XkbLayout" "us"
[    48.476] (**) Option "config_info" "hal:/org/freedesktop/Hal/devices/usb_device_5ac_24f_noserial_if0"
[    48.476] (II) XINPUT: Adding extended input device "Apple Keyboard" (type: KEYBOARD, id 8)
[   416.255] [mi] Increasing EQ size to 512 to prevent dropped events.
[   417.012] (II) UnloadModule: "kbd"
[   417.037] (II) UnloadModule: "mouse"
[   417.037] (II) UnloadModule: "kbd"
[   418.103] Server terminated successfully (0). Closing log file.

[It is also interesting that the "EQ size" notice seems to be the first
thing from the time that I hit Option-Fn: Until then it did not have
that problem.]

/var/log/messages also reports that Xorg coredumped:

Sep  8 20:26:34 FBSDG5S0 dbus[909]: [system] Activating service name='org.freedesktop.UPower' (using servicehelper)
Sep  8 20:26:34 FBSDG5S0 dbus[909]: [system] Successfully activated service 'org.freedesktop.UPower'
Sep  8 20:32:36 FBSDG5S0 kernel: pid 1109 (Xorg), uid 0: exited on signal 11 (core dumped)
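
For anyone wanting to pull such kernel lines out of /var/log/messages
mechanically, here is a minimal sketch. The regular expression and the
helper name parse_exit_line are my own illustration, written against the
single kernel line quoted above; they are not taken from any FreeBSD
tool. The signal name lookup uses Python's standard signal module.

```python
import re
import signal

# Pattern written against the kernel line quoted above; an assumption,
# not an official log format specification.
EXIT_RE = re.compile(
    r"pid (?P<pid>\d+) \((?P<name>[^)]+)\), uid (?P<uid>\d+): "
    r"exited on signal (?P<signo>\d+)(?P<core> \(core dumped\))?"
)

def parse_exit_line(line):
    """Return a dict describing the exit, or None if the line does not match."""
    m = EXIT_RE.search(line)
    if m is None:
        return None
    signo = int(m.group("signo"))
    return {
        "pid": int(m.group("pid")),
        "name": m.group("name"),
        "uid": int(m.group("uid")),
        "signo": signo,
        "signame": signal.Signals(signo).name,  # symbolic name for the number
        "core_dumped": m.group("core") is not None,
    }

line = ("Sep  8 20:32:36 FBSDG5S0 kernel: pid 1109 (Xorg), uid 0: "
        "exited on signal 11 (core dumped)")
info = parse_exit_line(line)
print(info)
```

Signal 11 is the same signo=11 that shows up in the OsSigHandler frame
of the backtraces.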

/usr/local/bin/gdb reports based on the core produced:

Core was generated by `Xorg'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __jemalloc_bitmap_unset (bit=0, binfo=<optimized out>, bitmap=0xc3ba0f75) at /usr/src/lib/libc/../../contrib/jemalloc/include/jemalloc/internal/bitmap.h:156
156		g = *gp;
(gdb) info threads
  Id   Target Id         Frame
* 2    Thread 51406400 (LWP 100097) __jemalloc_bitmap_unset (bit=0, binfo=<optimized out>, bitmap=0xc3ba0f75)
    at /usr/src/lib/libc/../../contrib/jemalloc/include/jemalloc/internal/bitmap.h:156
* 1    Thread 51406400 (LWP 100097) __jemalloc_bitmap_unset (bit=0, binfo=<optimized out>, bitmap=0xc3ba0f75)
    at /usr/src/lib/libc/../../contrib/jemalloc/include/jemalloc/internal/bitmap.h:156
(gdb) info stack
#0  __jemalloc_bitmap_unset (bit=0, binfo=<optimized out>, bitmap=0xc3ba0f75) at /usr/src/lib/libc/../../contrib/jemalloc/include/jemalloc/internal/bitmap.h:156
#1  arena_run_reg_dalloc (ptr=<optimized out>, run=<optimized out>) at jemalloc_arena.c:357
#2  __jemalloc_arena_dalloc_bin_locked (arena=0x510000c0, chunk=0x51400000, ptr=0x5144d140, mapelm=<optimized out>) at jemalloc_arena.c:1709
#3  0x0000000050c07a04 in __jemalloc_arena_dalloc_bin (arena=0x510000c0, chunk=0x51400000, ptr=0x5144d140, pageind=<optimized out>, mapelm=0x514006d8) at jemalloc_arena.c:1733
#4  0x0000000050c07a94 in __jemalloc_arena_dalloc_small (arena=<optimized out>, chunk=<optimized out>, ptr=<optimized out>, pageind=<optimized out>) at jemalloc_arena.c:1749
#5  0x0000000050c1226c in __jemalloc_arena_dalloc (try_tcache=<optimized out>, ptr=<optimized out>, chunk=<optimized out>, arena=0x510000c0)
    at /usr/src/lib/libc/../../contrib/jemalloc/include/jemalloc/internal/arena.h:1005
#6  __jemalloc_idallocx (try_tcache=<optimized out>, ptr=<optimized out>) at /usr/src/lib/libc/../../contrib/jemalloc/include/jemalloc/internal/jemalloc_internal.h:913
#7  __jemalloc_iqallocx (try_tcache=<optimized out>, ptr=<optimized out>) at /usr/src/lib/libc/../../contrib/jemalloc/include/jemalloc/internal/jemalloc_internal.h:932
#8  __jemalloc_iqalloc (ptr=<optimized out>) at /usr/src/lib/libc/../../contrib/jemalloc/include/jemalloc/internal/jemalloc_internal.h:939
#9  __free (ptr=0x5144d140) at jemalloc_jemalloc.c:1277
#10 0x00000000102a57c4 in _XSERVTransFreeConnInfo (ciptr=0x5144d140) at /usr/local/include/X11/Xtrans/Xtrans.c:146
#11 0x00000000102a72d8 in _XSERVTransClose (ciptr=0x5144d140) at /usr/local/include/X11/Xtrans/Xtrans.c:962
#12 0x00000000102960d0 in CloseWellKnownConnections () at connection.c:486
#13 0x00000000102aa3ac in AbortServer () at log.c:473
#14 0x00000000102aa974 in FatalError (f=0x102d68c0 "Caught signal %d (%s). Server aborting\n") at log.c:611
#15 0x000000001029c744 in OsSigHandler (signo=11, sip=0xffffffffffffd740, unused=0xffffffffffffd280) at osinit.c:146
#16 0x00000000506ae478 in handle_signal (actp=0xffffffffffffd1a0, sig=11, info=0xffffffffffffd740, ucp=0xffffffffffffd280) at /usr/src/lib/libthr/thread/thr_sig.c:238
#17 0x00000000506ae790 in thr_sighandler (sig=11, info=0xffffffffffffd740, _ucp=0xffffffffffffd280) at /usr/src/lib/libthr/thread/thr_sig.c:183
#18 0xffffffffffffe188 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)



Attempting use of an ATI Radeon 9800PRO NH (AGP) based Dual Processor
PowerMac G5 goes better: Option-Fn works for getting to VTn, including
getting back to the Xorg/xfce4 session.

But there is a display oddity when going away from the Xorg/xfce4
session to other scons VTn's: a band a few pixels wide at the top and
another at the bottom of the screen still show the pixels from the
Xorg/xfce4 session.




===
Mark Millard
markmi at dsl-only.net

On Sep 8, 2014, at 12:29 AM, Mark Millard <markmi@dsl-only.net> wrote:

I built a 10.1-PRERELEASE r271215 that contains Justin Hibbits' recent
changes for powerpc and powerpc64. (I also used portsnap and then
rebuilt my ports from scratch.) r271215 includes the change that Justin
expected would fix Xorg UI hangs (really Xorg silently quitting) when
xfce4 starts up. Based on

FreeBSD FBSDG4S0 10.1-PRERELEASE FreeBSD 10.1-PRERELEASE #0 r271215: Sat Sep  6 23:56:15 PDT 2014     root@FBSDG4S0:/usr/obj/usr/src/sys/GENERIC  powerpc

Xorg with xfce4 no longer hangs/quits on my Dual Processor PowerMac
G4's. I can still use the G5's. Thanks Justin! (I still have more
PowerMac/Card combinations to check.)


Side note for NVIDIA GeForce4 Ti 4600's:

The experiment has also shown that there is a separate problem for at
least the NVIDIA GeForce4 Ti 4600 when used at 1680x1050 (at least on an
ADC display from a G4 PowerMac): What should be the last 8 pixels or so
on the right are instead near the left hand side of the display, about 8
pixels from the left side in fact. (This had been observed before the
fix after a ports update but with the hang/quit status I did not want to
assume much about the incomplete display updates at the time.)

In total all the pixels are probably there. There is just a band of
pixels that is way out of place. (So all the pixels after them are then
shifted from where they should be by the width of that band.)

Also, the roughly 8 pixel wide band appears to be shifted down one pixel
from what would be expected. (Easily visible at the edge of the menu bar
across the top.)

At 1400x1050 there are no such problem bands. Nor probably at 1280x1024.
(But 1280x1024 is not a nice display in many other ways. I did not look
carefully at 1280x1024.)

The Xorg.0.log's do show an alternate Modeline for 1680x1050:

[    73.585] (**) NV(0): *Driver mode "1680x1050": 117.1 MHz, 63.7 kHz, 59.9 Hz
[    73.585] (II) NV(0): Modeline "1680x1050"x59.9  117.13  1680 1744 1776 1840  1050 1053 1056 1062 +hsync +vsync (63.7 kHz ezP)
[    73.585] (**) NV(0): *Driver mode "1680x1050": 119.0 MHz, 64.7 kHz, 59.9 Hz
[    73.585] (II) NV(0): Modeline "1680x1050"x59.9  119.00  1680 1728 1760 1840  1050 1053 1059 1080 +hsync -vsync (64.7 kHz ez)
[    73.585] (**) NV(0): *Default mode "1400x1050": 122.0 MHz, 64.9 kHz, 60.0 Hz
[    73.585] (II) NV(0): Modeline "1400x1050"x60.0  122.00  1400 1488 1640 1880  1050 1052 1064 1082 +hsync +vsync (64.9 kHz zd)
[    73.585] (**) NV(0): *Default mode "1280x1024": 108.0 MHz, 64.0 kHz, 60.0 Hz
[    73.585] (II) NV(0): Modeline "1280x1024"x60.0  108.00  1280 1328 1440 1688  1024 1025 1028 1066 +hsync +vsync (64.0 kHz zd)
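
As a cross-check of the numbers in those Modelines, the kHz and Hz
figures follow from the pixel clock and the last horizontal and vertical
values (the totals). The sketch below assumes plain progressive-scan
timing (no interlace or doublescan); the function name modeline_rates is
just for illustration.

```python
def modeline_rates(pixel_clock_mhz, htotal, vtotal):
    """Return (horizontal rate in kHz, vertical refresh in Hz).

    Assumes a progressive-scan mode: one line takes htotal pixel clocks
    and one frame takes vtotal lines.
    """
    hsync_khz = pixel_clock_mhz * 1000.0 / htotal
    vrefresh_hz = hsync_khz * 1000.0 / vtotal
    return hsync_khz, vrefresh_hz

# (label, pixel clock in MHz, htotal, vtotal) copied from the log above
modes = [
    ("1680x1050 +vsync", 117.13, 1840, 1062),
    ("1680x1050 -vsync", 119.00, 1840, 1080),
    ("1400x1050", 122.00, 1880, 1082),
    ("1280x1024", 108.00, 1688, 1066),
]

for label, clock, htotal, vtotal in modes:
    hsync_khz, vrefresh_hz = modeline_rates(clock, htotal, vtotal)
    print(f"{label}: {hsync_khz:.1f} kHz, {vrefresh_hz:.1f} Hz")
```

Both 1680x1050 entries share the same horizontal total (1840) and differ
in pixel clock, sync positions, vertical total and vsync polarity.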


The Radeon's do not show this display problem at all. (The alternate
pixel counts are not allowed for the Radeon's: just 1680x1050.)



===
Mark Millard
markmi at dsl-only.net

On Sep 6, 2014, at 6:05 PM, Justin Hibbits <chmeeedalf@gmail.com> wrote:

r261095 is only part of it.  The part that fixes powerpc32 is the MFC
of r263464, "Mask out SRR1 bits that aren't exported in the MSR".  What
happens is that sometimes an extra exception bit is stashed away in the
mcontext, I believe it's the "external interrupt" bit getting set if
memory serves me right.  I'm not sure why it gets set and stashed in
there, but r263464 masks that out, so X will work again (at least it
does for me).

- Justin
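
A toy model of the masking idea described above, for readers following
along. Everything concrete in it is an assumption for illustration: the
bit position of the stray bit, the EXPORTED_MASK value and the check in
sigreturn_check are hypothetical placeholders, not the real PowerPC
SRR1/MSR definitions and not the kernel's actual code. It only shows how
one extra saved bit could turn into the EINVAL from sigreturn()
mentioned in the quoted text below, and how masking would avoid that.

```python
import errno

# Hypothetical bit layout, for illustration only.
STRAY_EXCEPTION_BIT = 1 << 19   # stands in for the extra bit stashed in SRR1
EXPORTED_MASK = 0x0000FFFF      # stands in for "bits exported in the MSR"

def save_context(srr1, mask_stray_bits):
    """Value stored in the simulated mcontext when a signal is delivered."""
    return srr1 & EXPORTED_MASK if mask_stray_bits else srr1

def sigreturn_check(saved):
    """Assumed validation: reject a context that carries non-exported bits."""
    return errno.EINVAL if saved & ~EXPORTED_MASK else 0

srr1 = 0x0000D032 | STRAY_EXCEPTION_BIT

without_mask = sigreturn_check(save_context(srr1, mask_stray_bits=False))
with_mask = sigreturn_check(save_context(srr1, mask_stray_bits=True))
print(without_mask, with_mask)
```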

On Sat, 6 Sep 2014 18:00:56 -0700
Mark Millard <markmi@dsl-only.net> wrote:

> If I grab sources from svn it will be the first time that I've done
> so. I'd probably make a separate SSD and leave my "as distributed"
> powerpc/GENERIC SSD alone (it has other uses). Overall it will be
> some time before I've a rebuilt context if I try it. (It is probably
> a good thing for me to do at this stage.)
>
>
> The svn-src-stable-10/2014-September signal handling commit comments
> that I noticed from you this month are for
>
> "Fix 32-bit signal handling on ppc64." (r261095)
>
> and for
>
> "Set the si_code appropriately for exception-caused
> signals" (powerpc/aim/trap.c) (MFC r269701)
>
> ppc64 (G5) is working and ppc32 (G4) is not working as far as the
> processor context goes for what I've been reporting on. Although it
> is the powerpc/GENERIC build used for both contexts, not a
> powerpc64/GENERIC64 build for the G5's. (All 10.1-PRERELEASE r270981
> now.)
>
> So I'm guessing that you are expecting the si_code update to be what
> fixes things. If so then more than just LLDB would care about the
> ucode=??? assignments that were added.
>
>
>
> ===
> Mark Millard
> markmi at dsl-only.net
>
> On Sep 6, 2014, at 5:21 PM, Justin Hibbits <chmeeedalf@gmail.com>
> wrote:
>
> On Sat, 6 Sep 2014 17:05:45 -0700
> Mark Millard <markmi@dsl-only.net> wrote:
>
>> I finally asked myself "how many gdb's does FreeBSD have?". This
>> led me to building and using devel/gdb (/usr/local/bin/gdb).
>> Experiments indicate /usr/local/bin/gdb works on Xorg on the G4's.
>> (Xorg's build installs gcc47 and apparently needs a newer gdb to go
>> with it.)
>>
>> Thus I managed to get a little more information: /usr/local/bin/gdb
>> reports for Xorg the likes of:
>>
>> [Inferior 1 (process 1934) exited with code 026]
>
> This is the smoking gun.  Exit code 026 == 22 (decimal), which just so
> happens to be EINVAL, returned by sigreturn().  I finished committing
> all my merges to stable/10, so you could try building an updated
> kernel, and see that it fixes it.
>
> - Justin
>
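
The octal-to-errno step in Justin's note can be checked in a couple of
lines. This is only a sketch using Python's standard errno module; the
name lookup reflects the errno table of whatever system runs it.

```python
import errno

# gdb printed the exit code in octal ("026"); convert it to decimal.
exit_code = int("026", 8)

# Look up the symbolic errno name for that value on this system.
name = errno.errorcode.get(exit_code)
print(exit_code, name)
```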






