From owner-freebsd-bugs@FreeBSD.ORG  Thu Apr 17 14:20:03 2008
Return-Path: <owner-freebsd-bugs@FreeBSD.ORG>
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A8A101065671
	for <freebsd-bugs@hub.freebsd.org>;
	Thu, 17 Apr 2008 14:20:03 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 9FA678FC1B
	for <freebsd-bugs@hub.freebsd.org>;
	Thu, 17 Apr 2008 14:20:03 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.2/8.14.2) with ESMTP id m3HEK3co020413
	for <freebsd-bugs@freefall.freebsd.org>; Thu, 17 Apr 2008 14:20:03 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.2/8.14.1/Submit) id m3HEK3GA020412;
	Thu, 17 Apr 2008 14:20:03 GMT (envelope-from gnats)
Date: Thu, 17 Apr 2008 14:20:03 GMT
Message-Id: <200804171420.m3HEK3GA020412@freefall.freebsd.org>
To: freebsd-bugs@FreeBSD.org
From: Bob Frazier <bobf@mrp3.com>
Cc: 
Subject: Re: kern/122615: occasional crash/boot while running Xorg
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Bob Frazier <bobf@mrp3.com>
List-Id: Bug reports <freebsd-bugs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-bugs>,
	<mailto:freebsd-bugs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-bugs>
List-Post: <mailto:freebsd-bugs@freebsd.org>
List-Help: <mailto:freebsd-bugs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-bugs>,
	<mailto:freebsd-bugs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 17 Apr 2008 14:20:03 -0000

The following reply was made to PR kern/122615; it has been noted by GNATS.

From: Bob Frazier <bobf@mrp3.com>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/122615: occasional crash/boot while running Xorg
Date: Thu, 17 Apr 2008 08:18:20 -0700

 It appears that this problem may be related specifically to the SATA 
 controller.
 
 I had several crashes happen to me this morning, most of them without 
 Xorg running.  Prior to this, Xorg had been running for several days 
 without incident.
 
 I should point out that I have 2 jails running from directories on the 
 SATA drive, which is the 2nd drive in my system.  So I can expect file 
 activity on this drive from time to time due to cron, etc. running in 
 the jails.  The SATA drive has a single NFS partition and is 160GB.
 
 Crash 1:  Copying a ~180MB file from an NFS share on a Linux machine to 
 a location on the SATA drive.  The system froze, then rebooted.  No core 
 dump.
 
 Crash 2:  From the console (no X running), I copied the same file again 
 while background checks were being done.  I then copied the file to a 
 USB ramdisk and, in a different vconsole, started another process to 
 compare a number of existing files against what should be identical 
 files on the same NFS share as before.  When I issued the 'umount' 
 command, the system rebooted.  No core dump.
 
 Crash 3:  Started the file comparison (again), after manually fsck'ing 
 the partitions on the IDE drive (/, /tmp, /var, /usr) in single-user and 
 pressing CTRL+D to resume startup.  System rebooted with a crash dump 
 (#4 in /var/crash).
 
 Crash 4:  Started the system, booted to single user, fsck'd the 4 
 mountpoints on the IDE drive again, pressed CTRL+D to go multi-user, 
 and then started typing a command.  System froze up and rebooted with 
 a crash dump (#5 in /var/crash).
 
 In each case the crash symptoms are similar to the one I reported here.  
 I'm short on time at the moment, but I will follow up with backtraces 
 for the 2 crash dump files on request.
 
 At the moment I'm running an fsck on the SATA drive with the drive 
 unmounted in multi-user mode (jails not running).  Hopefully this won't 
 crash and I can validate and offload files from this drive.  I am 
 starting to suspect that the SATA controller or the drive itself is at 
 the root of the problem.  The typical symptoms are, in order: a message 
 reporting some kind of error on the 'ad4' (SATA) drive; a message 
 suggesting the drive is being removed or is not responding, or 
 something similar; several reported errors reading/writing LBA 
 locations that seem unusually large for a drive that size; and then 
 the crash/reboot.  Unfortunately this information is lost every time, 
 if I'm even lucky enough to see the messages on the terminal before the 
 system reboots.  The only relevant piece of information that seems to 
 end up in the info.# file is "vinvalbuf: dirtybufs" as the cause of the 
 'panic'.
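 
 As a rough way to check whether a reported LBA is out of range for a 
 160GB drive, something like the sketch below could be used.  It assumes 
 512-byte sectors and a decimal (marketing) gigabyte, neither of which 
 I have confirmed for this drive, and the function names are only 
 illustrative.  The real sector count should come from the drive itself.

```python
# Rough plausibility check for LBA values reported in disk error messages.
# Assumptions (not confirmed for this drive): 512-byte logical sectors,
# and "160GB" meaning 160 * 10**9 bytes.

SECTOR_SIZE = 512
CAPACITY_160GB = 160 * 10**9


def sector_count(capacity_bytes, sector_size=SECTOR_SIZE):
    """Number of whole sectors that fit in the given capacity."""
    return capacity_bytes // sector_size


def lba_is_plausible(lba, capacity_bytes, sector_size=SECTOR_SIZE):
    """True if lba addresses a sector inside the drive (0-based)."""
    return 0 <= lba < sector_count(capacity_bytes, sector_size)


if __name__ == "__main__":
    # Prints the approximate number of addressable sectors.
    print(sector_count(CAPACITY_160GB))
```

 Under those assumptions the drive has about 312.5 million sectors, so 
 any LBA at or above that figure in an error message would be outside 
 the drive.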