Date: Sun, 21 Oct 2012 09:33:22 -0700
From: David Wolfskill
To: Konstantin Belousov
Subject: Re: stable/9 @r241776 panic: REDZONE: Buffer underflow detected...
Cc: stable@freebsd.org

On Sun, Oct 21, 2012 at 03:13:56PM +0300, Konstantin Belousov wrote:
> On Sat, Oct 20, 2012 at 07:10:19AM -0700, David Wolfskill wrote:
> > This seems ... fairly weird to me.
> >
> > Yesterday, I built & booted:
> >
> > FreeBSD g1-227.catwhisker.org 9.1-PRERELEASE FreeBSD 9.1-PRERELEASE #274 241726M: Fri Oct 19 05:40:05 PDT 2012 root@g1-227.catwhisker.org:/usr/obj/usr/src/sys/CANARY i386
> >
> > and used the machine all day; nothing unusual (including various
> > reboots (e.g. when I disembarked the train for the final leg of my
> > commute home, so I powered the laptop off).
> >
> > This morning, I built:
> >
> > FreeBSD g1-227.catwhisker.org 9.1-PRERELEASE FreeBSD 9.1-PRERELEASE #275 241776M: Sat Oct 20 04:34:45 PDT 2012 root@g1-227.catwhisker.org:/usr/obj/usr/src/sys/CANARY i386
> >
> > and on first reboot, I got a panic.
> >
> > After a bit of experimentation, it appears that I get a panic @r241776
> > if I attempt a normal boot into multi-user mode, but if I first boot to
> > single-user mode, then exit single-user mode, it comes up without a
> > problem.
> >
> > I don't have a serial console, so I started to write down some of the
> > panic information, but my patience ran a bit short. Here's what I
> > recorded (warning: hand-transcripted -- twice!):
> >
> > ...
> > Starting devd.
> > REDZONE: Buffer underflow detected. 1 byte corrupted before 0xced40080 (4294966796 bytes allocated).
> > Allocation backtrace:
> > #0 0xc0ceac8f at redzone_setup+0xcf
> > #1 0xc0a5d5c9 at malloc+0x1d9
> > ...[about 20 more such lines I didn't record]...
> >
> > > bt
> > Tracing pid 901 tid 100106 td 0xd2b99000
> > kdb_enter(...)
> > panic(...)
> > free(...)
> > devread(ce8c2d00,f7274c0c,0,c0b1e4f0,d279e380,...) at devread+0x1a6
> > giant_read(...) at giant_read+0x87
> > devfs_read(...) at devfs_read+0xc6
> > dofileread(...) at dofileread+0x99
> > sys_read(...) at sys_read+0x98
> > syscall(f7274d08) at syscall+0x387
> >
> > Within the bounds described above, this appears to be quite reproducible
> > -- on my laptop. My build machine (updated in parallel, at the same
> > GRNs) does not exhibit the panic.
> >
> > I was unable to get a crash dump; I have
> >
> > dumpdev="AUTO"
> >
> > in /etc/rc.conf, and the panic was occurring well after swap was
> > enabled. (Yes, I know I have swap over-allocated. I plan to do
> > something about it at some point.)
> >
> > I've attached a copy of dmesg.boot.
> >
> > Anyone else seeing this? Any ideas how to diagnose it?
>
> devread is the method of devctl(4) which passes devd notifications from
> the kernel to userland (to devd, specifically). There were no changes to
> devctl(4) for quite a time.
>
> The corruption is, most likely, in some unrelated piece of code. Could
> you try to bisect the stable to catch the offender ? The bisect is not
> guaranteed to work, obviously, since the random corruption effects are
> unpredictable.
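
The bisect suggested in the quoted text above can be sketched as a plain binary search over the revision range between the last known-good build (241726) and the first known-bad one (241776). This is only an illustration under stated assumptions: `first_bad` and `is_bad` are hypothetical names, and in practice `is_bad` would mean "update to that revision, rebuild the kernel, boot multi-user, and see whether it panics". The search also assumes the failure is monotonic (every revision after the offender panics), which, as noted above, random corruption does not guarantee.

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[int], is_bad: Callable[[int], bool]) -> int:
    """Return the earliest revision for which is_bad() is true.

    Assumptions (not verified here): revisions is sorted ascending,
    revisions[0] is known good, revisions[-1] is known bad, and once a
    revision is bad every later revision is bad too.
    """
    lo, hi = 0, len(revisions) - 1  # lo indexes a good revision, hi a bad one
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid  # offender is at mid or earlier
        else:
            lo = mid  # offender is after mid
    return revisions[hi]


if __name__ == "__main__":
    candidates = list(range(241726, 241777))  # good build .. bad build
    # Stand-in predicate for demonstration only; the threshold is arbitrary
    # and does not claim anything about the real offender.
    print(first_bad(candidates, lambda rev: rev >= 241750))
```

With about 50 revisions in the range this needs roughly six build-and-boot cycles, and fewer if only the revisions that actually touch stable/9 are used as candidates.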

[Lack of trimming is deliberate, in this case, as I found a reversion
that appears to address the issue, and I wanted folks looking at this to
have the bulk of the symptoms readily at hand. -- dhw]

The range of GRNs in question is 241726 - 241776, only 5 of which apply
to stable/9. Here's a list, with the affected files listed:

241742  sys/dev/sound/pci/hda/hdaa_patches.c
241749  sys/cam/cam_queue.c
241762  sys/dev/tws/tws.c
        sys/dev/tws/tws.h
        sys/dev/tws/tws_cam.c
        sys/dev/tws/tws_hdm.h
        sys/dev/tws/tws_user.c
241767  usr.bin/make/var.c
241769  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c

I had actually tried reverting 241742 yesterday, to no effect.

I don't use ZFS, and I have a pretty hard time understanding how 241767
would break one machine and leave 4 others unscathed. (Yes, I completed
my weekly updates, as well, by now.)

I don't have tws(4) devices -- certainly not on the laptop.

So I tried reverting 241749 ... and I failed to reproduce the problem.
Well, one boot out of one, at least.

I'll try a few more reality checks, and report back if a correction is
in order. But (for now, at least), it looks to me as if 241749 is
presenting a problem on this laptop.

For folks investigating, I attached a dmesg.boot to the initial post in
the thread; I'll be happy to provide more information, should it be
requested (& specified).

Peace,
david
-- 
David H. Wolfskill				david@catwhisker.org
Taliban: Evil men with guns afraid of truth from a 14-year old girl.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.
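
A side note on the allocation size in the panic message quoted earlier: 4294966796 is exactly 2^32 - 500, so if the size was printed as an unsigned 32-bit quantity (an assumption; i386 has 32-bit longs), the same bit pattern read as signed would be -500. The figure was hand-transcribed twice, so it may simply be a transcription error, and this sketch says nothing about where such a value would come from. The helper name `as_signed32` is made up for illustration; the block only shows the arithmetic.

```python
def as_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as two's-complement signed."""
    value &= 0xFFFFFFFF  # keep only the low 32 bits
    # If the top bit is set, the signed reading is value - 2**32.
    return value - (1 << 32) if value & 0x80000000 else value


reported = 4294966796  # size as hand-transcribed from the panic message
print(as_signed32(reported))  # prints -500
```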