Date: Sat, 18 Mar 2006 18:26:15 -0500
From: Kris Kennaway
To: Jason Harmening
Cc: freebsd-bugs@freebsd.org, freebsd-stable@freebsd.org, Kris Kennaway
Subject: Re: [6.1-PRERELEASE/amd64] Kernel panic during heavy UFS traffic
Message-ID: <20060318232615.GA63516@xor.obsecurity.org>
In-Reply-To: <200603181712.13329.jason.harmening@gmail.com>
References: <2d1264630603161045t31774a33h9cec88c4b7d6d13d@mail.gmail.com> <20060316195406.GA30669@xor.obsecurity.org> <200603181712.13329.jason.harmening@gmail.com>
List-Id: Production branch of FreeBSD source code

On Sat, Mar 18, 2006 at 05:12:12PM -0600, Jason Harmening wrote:
> I managed to get a
> backtrace from a panic that occurred during bgfsck:
>
> dev = ar0s1f, block = 22958088, fs = /usr
> panic: ffs_blkfree: freeing free frag
> cpuid = 0
> KDB: stack backtrace:
> kdb_backtrace() at kdb_backtrace+0x37
> panic() at panic+0x1d1
> ffs_blkfree() at ffs_blkfree+0x43f
> sysctl_ffs_fsck() at sysctl_ffs_fsck+0x316
> sysctl_root() at sysctl_root+0x13f
> userland_sysctl() at userland_sysctl+0x131
> __sysctl() at __sysctl+0xd8
> syscall() at syscall+0x404
> Xfast_syscall() at Xfast_syscall+0xa8
> --- syscall (202, FreeBSD ELF64, __sysctl), rip = 0x8006e91fc, rsp = 0x7ffffffee858, rbp = 0x3 ---
> Uptime: 2m59s
> Dumping 1023 MB (2 chunks)

Thanks, I saw this panic a few times too but not after subsequent
updates.  Since you seem to be able to reproduce this easily, can you
try to update to BETA4 or later and see if it's still a problem?

> However, I've been unable to get a backtrace for the panics that occur when I
> try to mount my DVD-RAM.  For some reason, I've so far only been able to
> reproduce this panic when trying to mount the disk while the X server is
> running (by clicking the KDE disk icon rather than mounting from the command
> line).  Because of this, I'd like to be able to have the kernel generate a
> crash dump and automatically reboot.  Without debug symbols or dumping
> enabled, the system will automatically reboot.  However, every time I've
> managed to reproduce the panic WITH a debug kernel and WITH a dumpdev entry
> in /etc/rc.conf, the system simply freezes.  After I reset, there is no dump
> in /var/crash.  Currently I have the following in my kernel configuration:

Yeah, panics while in X often don't fare very well.  Unfortunately
there's no real solution other than trying to reproduce from the
console (note that you can still run X applications from a vty by
setting DISPLAY, although I don't know if you can trigger the KDE
mounting from the command line).
Perhaps you'll need to figure out what KDE is doing (e.g. is it
mounting with special flags?) and repeat it by hand.

> I've tried rebooting in single-user mode and manually retrieving the dump from
> swap using savecore, but it reports "no dump found".  I have 1GB of physical
> RAM, 2GB of swap, and ~7GB free on /var, so space shouldn't be an issue.
> Explicitly specifying my swap partition "/dev/ar0s1b" instead of "AUTO" in
> rc.conf does not fix the problem, nor does removing everything except the
> "makeoptions" line from the kernel config.  I've also tried various
> combinations of KDB, KDB_UNATTENDED, KDB_TRACE, and DDB in the kernel config.
> Also note that while the above backtrace indicates dumping was done, no dump
> was actually generated.  Is there something I'm missing?

savecore runs after bg fsck, so are you sure it's not there?

Kris

P.S. Don't top-post, it confuses the logical flow of the email and
loses context.

> On Thursday 16 March 2006 13:54, Kris Kennaway wrote:
> > On Thu, Mar 16, 2006 at 12:45:07PM -0600, Jason Harmening wrote:
> > > Last night I ran into a series of kernel panics that seemed to be related
> > > to heavy UFS traffic.  I ran into two consecutive panics when trying to
> > > mount a UFS-formatted DVD-RAM as a regular user (though not when I
> > > mounted it as root).  The system seemed to actually succeed in mounting
> > > the disk, as it was marked dirty after the ensuing panic.  Upon rebooting
> > > after the second panic, I saw another two consecutive panics which
> > > happened whenever I tried to do something fairly disk-intensive (e.g.
> > > starting the X server + KDE) while the bgfsck was still running from the
> > > last panic.  Ultimately I rebooted in single-user mode, ran fsck
> > > manually, and have experienced no further panics.
> > > I suspect these panics
> > > may be related to UFS deadlocks, as in all cases the application that was
> > > attempting disk access hung for several seconds before the panic,
> > > followed by a few seconds of total system hang, followed by the automatic
> > > reboot.
> > >
> > > I'm running 6.1-PRERELEASE/amd64 from 12 March on an Athlon 64 x2 (SMP)
> > > with SCHED_ULE+PREEMPTION--dangerous combination I know, but it's been
> > > rock solid for months until now.  If anyone is interested, I'll try to
> > > reproduce this panic with a dump/backtrace.  It may be one of the UFS
> > > deadlock issues that's already under investigation for 6.1-RELEASE.
> >
> > Yeah, we need a trace.
> >
> > Kris
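
On the question of whether a dump has already been saved to /var/crash: the following is a minimal sketch, assuming savecore's usual naming of saved dumps as vmcore.N in the dump directory. The helper name find_crash_dumps and its crash_dir argument are hypothetical and not part of FreeBSD; the sketch only lists files and does not read the dump device.

```python
import re
from pathlib import Path

# Hypothetical helper, not part of FreeBSD. It lists crash dumps that
# savecore(8) has already written to a dump directory.
# Assumption: dumps are named vmcore.N, as savecore normally names them.
DUMP_NAME = re.compile(r"^vmcore\.(\d+)$")


def find_crash_dumps(crash_dir="/var/crash"):
    """Return (dump number, path, size in bytes) for each vmcore.N found."""
    directory = Path(crash_dir)
    if not directory.is_dir():
        # No dump directory at all, so there is nothing to report.
        return []
    dumps = []
    for entry in directory.iterdir():
        match = DUMP_NAME.match(entry.name)
        if match and entry.is_file():
            dumps.append((int(match.group(1)), entry, entry.stat().st_size))
    # Sort numerically so that vmcore.10 comes after vmcore.2.
    return sorted(dumps, key=lambda dump: dump[0])
```

Calling find_crash_dumps() once the background fsck has finished and getting an empty list back would suggest that savecore really found nothing to save, rather than that it had not run yet.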