Subject: Re: X problems & 5.0... -RELEASE?
From: Eric Anholt
To: Eric Anholt
Cc: Kris Kennaway, Wesley Morgan, current@FreeBSD.ORG, Maxim Sobolev
Date: 19 Oct 2002 01:57:02 -0700
Message-Id: <1035017831.882.25.camel@anholt.dyndns.org>
In-Reply-To: <1034575226.3020.75.camel@anholt.dyndns.org>

On Sun, 2002-10-13 at 23:00, Eric Anholt wrote:
> On Sun, 2002-10-13 at 21:14, Kris Kennaway wrote:
> > On Sun, Oct 13, 2002 at 11:28:51PM -0400, Wesley Morgan wrote:
> >
> > > I know there is some work being done on the recent signal changes to fix
> > > some things, but are we sure this is the problem? I would hate to see
> > > release schedules pushed back because these problems are lost in the
> > > noise, and I can't see a release being made that has a known unstable X.
> >
> > I thought this was believed to be a bug in X that was exposed by
> > kernel changes.
> >
> > Kris
> Could anyone who is having stability issues with X please email me
> privately if they are using either -current before September or
> -stable? If not, without some sort of hints of where an issue really
> is, I'm going to chalk this up to kernel bugs.

Just to let people know what's going on with this on my end: I've got
my laptop up to a fresh kernel, world, and X as of 10/17 or so. I've got
a reproducible X server crash with XFree86 + glxgears alone (DRI
disabled). I'm working on getting backtraces to see if anything useful
can be produced. However, gdb521 is crashing if I start XFree86 from it
(gdbing that gdb produced only silliness -- gdb exiting semi-cleanly or
senseless backtraces). gdb521 can apparently attach to a running XFree86
fine, but then it doesn't get the module info. On my -stable box, gdb521
appears to start XFree86 fine, but on -stable (and -current, iirc)
pressing ^C in gdb does nothing, and I have to kill gdb or XFree86
because they go unresponsive. If I can get gdb52 to be useful, I'll add
a patch to XFree86-4-Server (and dri-devel maybe?) to compile debuggable
X server/modules and install them properly.

From the reports:

Both -stable and -current users get the "self-healing" hang, where the X
server responds to nothing, though the mouse still moves, and some
minutes later it resumes and responds to those actions. One person said
they'd had this since at least -current in July.

Folks with kernels newer than a couple of weeks ago get X crashes all
the time. Updated world wasn't necessary to get it (in my case), updated
world+kernel didn't help (others), and updated world+kernel+X didn't
help (my case, too).

Note that just about any X crash will result in a signal 6 (SIGABRT)
reported by the kernel, because one X crash causes another while the
server tries to recover (reset to the console, etc.), and when that
second crash gets caught it aborts. To see what started the mess, look
at the console output from startx if you used startx.
You're looking for the first "Fatal error" -- later stuff is trying to
recover from that crash that got caught. If you use xdm, it's in your
/var/log/xdm-errors iirc.

If you blame this on type1/bezier, make sure you actually have an error
message about bezier or something else in your log before the abort.
All of the type1 module's aborts have a reason printed before the abort.

-- 
Eric Anholt
http://people.freebsd.org/~anholt/dri/
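
For anyone who would rather script the log check described above than
scroll through the output by hand, here is a minimal sketch in Python.
It is only an illustration: the function name first_fatal_error and its
parameters are made up for this example, and it assumes the log is plain
text in which the original failure is marked by the literal string
"Fatal error".

```python
def first_fatal_error(log_text, marker="Fatal error", context=3):
    """Return the first line containing `marker` plus up to `context`
    following lines, or an empty list if the marker never appears.

    Only the first match matters: later errors are usually fallout from
    the server trying to recover from the original crash.
    """
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        if marker in line:
            # Slice from the first match; slicing past the end is safe.
            return lines[i:i + 1 + context]
    return []


def first_fatal_error_in_file(path, marker="Fatal error", context=3):
    """Convenience wrapper that reads the log from `path` (hypothetical helper)."""
    # errors="replace" keeps odd bytes in the log from raising an exception.
    with open(path, errors="replace") as fh:
        return first_fatal_error(fh.read(), marker, context)
```

With xdm this could be called as
first_fatal_error_in_file("/var/log/xdm-errors"); with startx, redirect
its console output to a file first and pass that file's path instead.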