From owner-freebsd-questions@FreeBSD.ORG Fri Mar 28 16:57:04 2003
Date: Sat, 29 Mar 2003 11:26:59 +1030
From: Greg 'groggy' Lehey
To: james
Message-ID: <20030329005659.GC72294@wantadilla.lemis.com>
References: <20030328010810.GE72254@wantadilla.lemis.com>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
        protocol="application/pgp-signature"; boundary="8X7/QrJGcKSMr1RN"
Content-Disposition: inline
User-Agent: Mutt/1.4i
Organization: The FreeBSD Project
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-418-838-708
WWW-Home-Page: http://www.FreeBSD.org/
X-PGP-Fingerprint: 9A1B 8202 BCCE B846 F92F 09AC 22E6 F290 507A 4223
cc: freebsd-questions@freebsd.org
Subject: Re: PANIC: vinum / atacontrol (5.0-STABLE / 4.8-RC2)

--8X7/QrJGcKSMr1RN
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Friday, 28 March 2003 at 11:32:48 +0000, james wrote:
> On Fri, 28 Mar 2003, Greg 'groggy' Lehey wrote:
>
>> [Format recovered--see http://www.lemis.com/email/email-format.html]
>>
>> Computer output wrapped.
>>
>> On Thursday, 27 March 2003 at 14:18:43 +0000, james wrote:
>>> Hi
>>>
>>> I am trying to configure hotswap-raid and vinum on my machine, and have found I
>>> can cause the kernel to panic at will.
>>>
>>> Ideally I would like to be able to stop a plex, use atacontrol attach/detach to
>>> replace the disk, and rebuild the plex.  Would this work in theory?
>>
>> Apparently.  There was a time when people claimed that ATA drives
>> couldn't be hot swapped, but that seems to be incorrect nowadays.
>>
>>> Now I stop and unload vinum, and try to run atacontrol:
>>>
>>> eddie# vinum stop
>>> vinum unloaded
>>> eddie# kldstat | grep vinum
>>> eddie#
>>> eddie# atacontrol detach 3
>>>
>>>
>>> I have built a debug kernel, and have a core.  The backtrace is below.
>>>
>>> If you need any more info please let me know!
>>>
>>> James
>>>
>>> Now follows the gdb-output:
>>>
>>> (kgdb) bt
>>> #9  0xc01a9223 in panic () at /usr/src/sys/kern/kern_shutdown.c:517
>>> #10 0xc02e311e in trap_fatal (frame=0xc0b94e00, eva=0x0) at /usr/src/sys/i386/i386/trap.c:844
>>> #11 0xc02e2e32 in trap_pfault (frame=0xc873fa74, usermode=0x0, eva=0x24) at /usr/src/sys/i386/i386/trap.c:758
>>> #12 0xc02e2a1d in trap (frame=
>>>     {tf_fs = 0xc0380018, tf_es = 0xc0b90010, tf_ds = 0x10, tf_edi = 0x0,
>>>      tf_esi = 0xc1857530, tf_ebp = 0xc873fab4, tf_isp = 0xc873faa0,
>>>      tf_ebx = 0xc19a6c00, tf_edx = 0xe7, tf_ecx = 0xc032a340, tf_eax = 0x0,
>>>      tf_trapno = 0xc, tf_err = 0x0, tf_eip = 0xc01c6de6, tf_cs = 0x8,
>>>      tf_eflags = 0x10292, tf_esp = 0xc873faf0, tf_ss = 0xc01296ae})
>>>     at /usr/src/sys/i386/i386/trap.c:445
>>> #13 0xc02d44f8 in calltrap () at {standard input}:98
>>> #14 0xc01296ae in ata_command (atadev=0xc1857530, command=0xe7, lba=0x0, count=0x0, feature=0x0, flags=0x4)
>>>     at bus_at386.h:526
>>> #15 0xc01396df in adclose (dev=0x0, flags=0x3, fmt=0x0, td=0x0) at /usr/src/sys/dev/ata/ata-disk.c:292
>>
>> (etc)
>>
>> The trap occurred between frames 12 and 13 at address 0xc873faa0, in the ATA
>> code.  Depending on your prowess with kernel code, you may be
>> able to find out what has gone wrong.  I'd be inclined to look at
>> frame 13:
>>
>>   (gdb) f 13                    select frame
>>   (gdb) l                       list the code
>>   (gdb) i loc                   show local variables
>>
>> My guess is that something has not been initialized.  It's probably
>> worth submitting a bug report.
>
> I will be able to analyse the 5.0-stable panic a little more when I get home.
> In the meantime I've been doing similar tests with 4.8-RC2.  I get slightly more
> progress, but still a panic at the end.
>
> Sequence of events:
>
> 1, create volume, 2 plexes on 2 disks
> 2, vinum stop volume.p1
> 3, atacontrol detach 1 (drive b)
> 4, atacontrol attach 1 - this WORKS, doesn't panic like 5.0-STABLE
> 5, vinum start volume.p1
>

OK, that's good background information, but first we need to look at
the dump.

> As before, I have a debug kernel and core dump.  I can't seem to
> configure my mailer to not wrap lines, so I've posted all relevant
> information to http://web.hisser.org/vinum/4.8-crash/ .  They are
> plain text files.

You should consider getting a different MUA.

#6  0xc0220332 in dsioctl (dev=0xc1383200, cmd=0x8004646d, data=0xcb307d58 "\001", flags=0x2, sspp=0xc1328568) at ../../kern/subr_diskslice.c:356
#7  0xc021fd5b in diskioctl (dev=0xc1383200, cmd=0x8004646d, data=0xcb307d58 "\001", fflag=0x2, p=0xca10fee0) at ../../kern/subr_disk.c:267
#8  0xc140b5af in ?? ()
#9  0xc140969b in ?? ()
#10 0xc140988e in ?? ()
#11 0xc140c461 in ?? ()
#12 0xc024efa2 in spec_ioctl (ap=0xcb307de4) at ../../miscfs/specfs/spec_vnops.c:306

This is different from the other crash.  It looks like it happens in
Vinum.  Take a look at vinum(4) or
http://www.vinumvm.org/vinum/how-to-debug.html for details of how to
bring life into the ?? frames (#8 to #11).

> The kernel is panicking in dsioctl(), kern/subr_diskslice.c:356.  I've
> had a look in there, but I really have no idea what it's trying to
> do - I can't even work out what the ioctl is.
> I'm no kernel guy :(

The ioctl is the cmd parameter passed to diskioctl, 0x8004646d.
That's DIOCWLABEL.  Finding them isn't easy, but basically:

0x8   -> _IOW macro.  We're writing.
004   length to write
64    ioctl type ('d').  You'd go looking for a regular expression
      _IOW.*'d'.
6d    ioctl number (109).

This one is in /sys/sys/disklabel.h:

#define DIOCWLABEL      _IOW('d', 109, int)     /* write en/disable label */

It's not clear what's trying to write the label, but looking at the
locals of the dsioctl frame would help.

> I appreciate this may not be a bug in Vinum, but it certainly seems
> like it's being triggered by vinum.

Yes, that's reasonable.

Greg
--
When replying to this message, please copy the original recipients.
If you don't, I may ignore the reply or reply to the original recipients.
For more information, see http://www.lemis.com/questions.html
See complete headers for address and phone numbers

--8X7/QrJGcKSMr1RN
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.0 (FreeBSD)

iD8DBQE+hO9bIubykFB6QiMRAgAJAJ4+7DKRSz4RDLOTIxmEZ3NlnBtJDwCdEDpc
M7tr3hHMK3qcQPbHHxbXgBM=
=GXg2
-----END PGP SIGNATURE-----

--8X7/QrJGcKSMr1RN--
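
The manual decoding of the ioctl value 0x8004646d in the message above
can be mechanized.  What follows is only a rough sketch in Python, not
anything taken from the kernel sources: the function name decode_ioctl
is made up for illustration, and the bit masks are assumed to follow the
BSD layout in sys/ioccom.h (direction in the top three bits, a 13-bit
parameter length, then the group character and the command number).
Check the masks against the header on the system you are debugging.

```python
# Assumed BSD ioctl layout (sys/ioccom.h); verify on your own system.
IOC_VOID = 0x20000000      # no parameters            -> _IO
IOC_OUT = 0x40000000       # kernel copies data out   -> _IOR
IOC_IN = 0x80000000        # kernel copies data in    -> _IOW
IOC_DIRMASK = 0xE0000000   # the three direction bits
IOCPARM_MASK = 0x1FFF      # parameter length, 13 bits (assumption)

MACROS = {
    IOC_VOID: "_IO",
    IOC_OUT: "_IOR",
    IOC_IN: "_IOW",
    IOC_IN | IOC_OUT: "_IOWR",
}


def decode_ioctl(cmd):
    """Split an ioctl command word into (macro, length, group, number).

    decode_ioctl is a hypothetical helper name, not a kernel function.
    """
    macro = MACROS.get(cmd & IOC_DIRMASK, "?")
    length = (cmd >> 16) & IOCPARM_MASK   # size of the argument in bytes
    group = chr((cmd >> 8) & 0xFF)        # the 'd' in _IOW('d', 109, int)
    number = cmd & 0xFF                   # the 109
    return macro, length, group, number


# The cmd value from frames #6 and #7 of the 4.8-RC2 backtrace.
print(decode_ioctl(0x8004646D))   # ('_IOW', 4, 'd', 109)
```

With the macro, group and number in hand, a search of the headers under
/sys for a pattern such as _IOW.*'d'.*109 should lead to the DIOCWLABEL
definition quoted above.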