From owner-freebsd-stable@FreeBSD.ORG Sun Nov 16 08:22:49 2008
Message-Id: <9C64A87F-1359-4694-8238-6C4D4B025BE3@TrueStep.com>
From: Rory Arms
To: Ken Smith
Cc: FreeBSD-stable@FreeBSD.org
In-Reply-To: <1226078239.37011.37.camel@bauer.cse.buffalo.edu>
References: <9592E887-75F3-473F-9581-F9C22A9936A6@TrueStep.com> <1226078239.37011.37.camel@bauer.cse.buffalo.edu>
Date: Sun, 16 Nov 2008 03:22:39 -0500
Subject: Re: 6.4-RC2 crashes after a few minutes of uptime

On 2008-11-07, at 12:17 , Ken Smith wrote:

> On Fri, 2008-11-07 at 00:00 -0500, Rory Arms wrote:
>> Well, if I can assist with further debugging, let me know.
>
> The person who followed up with a list of things that *may* have made
> the problem go away mentioned one of the things was disabling powerd.
> Do you have that enabled, and if yes would you mind disabling it to see
> if that's the culprit?

Ken,

Ok, guess something is amiss with the CD-ROM drive on this notebook,
as in GNOME, it flashes an icon of a CD on the desktop from time to
time, as if it has detected a disc in the drive. But of course there
is no disc in the drive. I believe it did the same with 6.3 though,
but as said before it never panicked due to this issue.

So, some anecdotal info, after running RC2 for a few days now. The
pattern seems to be that it always panics a few minutes after a first
cold boot, but then remains stable after the second boot. Odd, as with
6.3 this didn't happen. So, I happened to catch a panic while working
in the syscons console after one of these cold boots. As far as I can
tell, the panic does have something to do with the CD-ROM drive, as
right after I saw this message on the console, it immediately panicked:

acd0: WARNING - PREVENT_ALLOW read data overrun 18>0

and then the panic is as follows:

kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x78
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc06d39b9
stack pointer = 0x28:0xca865c10
frame pointer = 0x28:0xca865c14
code segment = base 0x0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, def32 1, gran 1
processor eflags = resume, IOPL = 0
current process = 19 (swi6: task queue)
trap number = 12
panic: page fault
Uptime: 1h9m7s
Physical memory: 179MB
Dumping 43MB: 28 12
Dump complete

This is also the computer, as you may recall, on which I can never get
kgdb(1) to open the core dump file.

Note the uptime on that particular boot was 1h because I pretty much
let it sit idle after booting. So, gdm loaded and then I switched to
syscons, logged in, and then pretty much let it idle, till it panicked.

Hope that helps,

- rory

From owner-freebsd-stable@FreeBSD.ORG Sun Nov 16 10:15:10 2008
Date: Sun, 16 Nov 2008 10:10:36 +0000 (UTC)
From: "Bjoern A. Zeeb"
To: Lorenzo Perone
Cc: freebsd-jail@freebsd.org, freebsd-stable@freebsd.org
Reply-To: freebsd-stable@freebsd.org
In-Reply-To: <2192B50F-16AE-4BC8-ACEC-6C5B99804DA0@yellowspace.net>
Message-ID: <20081116100529.Y61259@maildrop.int.zabbadoz.net>
References: <2192B50F-16AE-4BC8-ACEC-6C5B99804DA0@yellowspace.net>
Subject: Re: hangs for 7.1-PRE [was: problem possibly related to multi-ip jail patch?]

On Sun, 16 Nov 2008, Lorenzo Perone wrote:

Hi,

> I've been experiencing problems with one of the machines running FreeBSD
> 7.1-PRERELEASE #2: Thu Oct 16 20:23:09 CEST 2008 with the multi-ip patch
> bz_jail7-20080920-01-at150161.diff, and I'm wondering if it is possibly
> related to the patch - in any case, any advice would be very welcome.

The bottom line is that most of this looks unlikely to be a jail problem.

> It happens that mysql (tried both 4.0 and 5.1, in 2 separate jails), at some
> time stops responding to connections, and mysql gets stuck in sbwait state.
> It is only killable with kill -9

Yeah, I had been seeing mysql hang or go to 99% CPU for years once in
a while; it's been more rare the last months. I have seen it in- and
outside of jails, with or without patches. You could try to see if you
can get backtraces of those processes.

> each of the two mysqlds is running in a jail on one private IP, serving
> connections to a webserver nearby - the latter having one public and one
> private IP, communicating with the other jail via the private network.
>
> I also experienced two complete system hangs (which need not necessarily
> be related to the mysql problem) both during a shutdown -r now. one was a
> panic, in another case the machine was still pingable but did not shut
> down completely. I could only reset it over the DRAC. here's a screenshot
> I made over the Dell RAC: http://lorenzo.yellowspace.net/stuck.png

Looking at your image I see more problems before the shutdown so this
as well is most likely not a jail problem.

> Since I'm also using zfs there and the kernel has been built with the DTRACE
> options.
>
> any advice (also about which more details that I should/could provide) would
> be very welcome...

I am Cc:ing the answer to stable@ and setting reply-to: to move the
discussion there.

/bz

-- 
Bjoern A. Zeeb
Stop bit received. Insert coin for new game.

From owner-freebsd-stable@FreeBSD.ORG Sun Nov 16 12:25:04 2008
Message-ID: <13170840.199821226838282753.JavaMail.defaultUser@defaultHost>
Date: Sun, 16 Nov 2008 13:24:42 +0100 (CET)
From: Barbara
To: Ken Smith
Cc: FreeBSD-stable@FreeBSD.org
Reply-To: Barbara
Subject: 6.4-RC2 crashes after a few minutes of uptime

>Ok, guess something is amiss with the CD-ROM drive on this notebook,
>as in GNOME, it flashes an icon of a CD on the desktop from time to
>time, as if it has detected a disc in the drive. But of course there
>is no disc in the drive. I believe it did the same with 6.3 though,
>but as said before didn't ever panic due to this issue.
>
>So, some anecdotal info, after running RC2 for a few days now. It
>seems the pattern is that it seems to always panic a few minutes after
>a first cold boot, but then seems to remain stable after the second
>boot. Odd, as with 6.3 this didn't happen. So, I happened to catch a
>panic while working in the syscons console after one of these cold
>boots. As far as I can tell, the panic does have something to do with
>the CD-ROM drive, as right after I saw this message on the
>console, it immediately panicked:
>
>acd0: WARNING - PREVENT_ALLOW read data overrun 18>0
>
>and then the panic is as follows:
>
>kernel trap 12 with interrupts disabled
>
>Fatal trap 12: page fault while in kernel mode
>fault virtual address = 0x78
>fault code = supervisor read, page not present
>instruction pointer = 0x20:0xc06d39b9
>stack pointer = 0x28:0xca865c10
>frame pointer = 0x28:0xca865c14
>code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, def32 1, gran 1
>processor eflags = resume, IOPL = 0
>current process = 19 (swi6: task queue)
>trap number = 12
>panic: page fault
>Uptime: 1h9m7s
>Physical memory: 179MB
>Dumping 43MB: 28 12
>Dump complete

Hi Rory,

did you see my replies or are you missing them for any reason?

Your panics, and some aspects of how they happen, look like mine to
me. Look here:
http://lists.freebsd.org/pipermail/freebsd-stable/2008-October/045865.html

Unfortunately I got no answer about that and I've had no comment in
the PR I've filed:
http://www.freebsd.org/cgi/query-pr.cgi?pr=128076
I wonder if someone had the time to look at it.
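
[Editorial note on the trap output quoted above.] When two reports are compared, as Barbara does here, it can help to pull the `key = value` lines of the trap message into a structure and compare fields such as the fault address and the current process. The sketch below is only an illustration: `parse_trap` and `looks_like_null_deref` are hypothetical helper names, not FreeBSD tools, and the idea that a fault address below one page (0x78 here) suggests a structure member read through a NULL pointer is a common rule of thumb assumed for this example, not something confirmed in this thread or in the PR.

```python
import re

# Trap text copied from the panic report quoted in this thread.
PANIC_TEXT = """\
Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x78
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc06d39b9
stack pointer = 0x28:0xca865c10
frame pointer = 0x28:0xca865c14
current process = 19 (swi6: task queue)
trap number = 12
"""


def parse_trap(text):
    """Collect the 'key = value' lines of a trap message into a dict."""
    fields = {}
    for line in text.splitlines():
        # Keys are lowercase words; the banner line has no '=' and is skipped.
        match = re.match(r"^([a-z ]+?)\s*=\s*(.+)$", line)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


def looks_like_null_deref(fields, page_size=4096):
    """Heuristic (assumption): a fault address inside the first page
    often means a struct member was read through a NULL pointer."""
    address = int(fields["fault virtual address"], 16)
    return address < page_size


fields = parse_trap(PANIC_TEXT)
print(fields["current process"])
print(looks_like_null_deref(fields))
```

The heuristic only narrows where to look; the instruction pointer would still have to be resolved against a kernel with symbols, for example in kgdb(1), to name the faulting function.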
From owner-freebsd-stable@FreeBSD.ORG Sun Nov 16 21:21:49 2008
Message-Id: <369CC50A-9CF4-4F9B-8D22-153294B93532@TrueStep.com>
From: Rory Arms
To: Barbara
Cc: Ken Smith, FreeBSD-stable@FreeBSD.org
In-Reply-To: <13170840.199821226838282753.JavaMail.defaultUser@defaultHost>
References: <13170840.199821226838282753.JavaMail.defaultUser@defaultHost>
Date: Sun, 16 Nov 2008 16:21:38 -0500
Subject: Re: 6.4-RC2 crashes after a few minutes of uptime

On 2008-11-16, at 7:24 , Barbara wrote:

>> Ok, guess something is amiss with the CD-ROM drive on this notebook,
>> as in GNOME, it flashes an icon of a CD on the desktop from time to
>> time, as if it has detected a disc in the drive. But of course there
>> is no disc in the drive. I believe it did the same with 6.3 though,
>> but as said before didn't ever panic due to this issue.
>>
>> So, some anecdotal info, after running RC2 for a few days now. It
>> seems the pattern is that it seems to always panic a few minutes after
>> a first cold boot, but then seems to remain stable after the second
>> boot. Odd, as with 6.3 this didn't happen. So, I happened to catch a
>> panic while working in the syscons console after one of these cold
>> boots. As far as I can tell, the panic does have something to do with
>> the CD-ROM drive, as right after I saw this message on the
>> console, it immediately panicked:
>>
>> acd0: WARNING - PREVENT_ALLOW read data overrun 18>0
>>
>> and then the panic is as follows:
>>
>> kernel trap 12 with interrupts disabled
>>
>> Fatal trap 12: page fault while in kernel mode
>> fault virtual address = 0x78
>> fault code = supervisor read, page not present
>> instruction pointer = 0x20:0xc06d39b9
>> stack pointer = 0x28:0xca865c10
>> frame pointer = 0x28:0xca865c14
>> code segment = base 0x0, limit 0xfffff, type 0x1b
>> = DPL 0, pres 1, def32 1, gran 1
>> processor eflags = resume, IOPL = 0
>> current process = 19 (swi6: task queue)
>> trap number = 12
>> panic: page fault
>> Uptime: 1h9m7s
>> Physical memory: 179MB
>> Dumping 43MB: 28 12
>> Dump complete
>
> Hi Rory,
>
> did you see my replies or are you missing them for any reason?

Yes, I have seen your replies. I must have missed the PR you mentioned
last time, sorry.

> Your panics and some aspects about how they happens look like mine
> to me, look here:
> http://lists.freebsd.org/pipermail/freebsd-stable/2008-October/045865.html

Yes, indeed. That looks very similar to the issue I'm running into
with 6.4-RC2 as well. Sounds like it might be a regression in ata(4).
At least you were able to open the core dump. Are you still able to
open core dumps with RC2?

> Unfortunately I got no answer about that and I've had no comment in
> the pr I've filed http://www.freebsd.org/cgi/query-pr.cgi?pr=128076
> I wonder if someone had the time to look at it.

From owner-freebsd-stable@FreeBSD.ORG Sun Nov 16 22:29:03 2008
Date: Sun, 16 Nov 2008 23:28:27 +0100
From: "barbara"
To: "rorya+freebsd\.org"
Cc: kensmith, FreeBSD-stable
Subject: Re: 6.4-RC2 crashes after a few minutes of uptime

> > Hi Rory,
> >
> > did you see my replies or are you missing them for any reason?
>
> Yes, I have seen your replies. I must have missed the PR you mentioned
> last time, sorry.

No problem!

> > Your panics and
> > some aspects about how they happens look like mine to me, look here:
> > http://lists.freebsd.org/pipermail/freebsd-stable/2008-October/045865.html
>
> Yes, indeed. That looks very similar to the issue I'm running into
> with 6.4-RC2 as well. Sounds like it might be a regression in ata(4).
> At least you were able to open the core dump. Are you still able to
> open core dumps with RC2?
>

I'm not sure. I'm running STABLE and I have had no panics since the
branch changed to RC2. It seems that my panics are not as frequent as
yours.
Anyway, my box froze a couple of times after the last newvers.sh change
and the symptoms looked the same, with messages about acd0. I was able
to ping it but it wouldn't let me ssh in, as if it were using all the
CPUs.

About kgdb...
I have never used freebsd-update, so sorry if I'm saying something
stupid, but could it be that the kernel has been built without
debugging symbols or something like that? Does freebsd-update provide
a kernel.debug?
I've seen that you are not using a shiny quad-core, but could you try
building a kernel yourself? I think you could do it on a different,
more powerful FreeBSD box if you have one, or even in qemu. I could
help if you wish.
From owner-freebsd-stable@FreeBSD.ORG Mon Nov 17 00:01:51 2008
Date: Mon, 17 Nov 2008 02:01:47 +0200
From: Panagiotis Christias
To: Claus Guttesen
Cc: FreeBSD Stable
Message-ID: <20081117000147.GA52109@noc.ntua.gr>
Subject: Re: qlogic qle2462 hba and freebsd stable on a dl360 g5

On Thu, Nov 13, 2008 at 12:22:11PM +0100, Claus Guttesen wrote:
> Hi.
>
> I'm looking at a qlogic qle2462 hba for my dl360 g5. The thread
> http://www.mail-archive.com/freebsd-stable@freebsd.org/msg99497.html
> mentions a deadlock when system is loaded. Has this issue been
> resolved? Are there other PCI Express hba's which are known to work
> with freebsd stable and dl360 g5?

Hello,

no, the issue has not been resolved. The system still deadlocks
regardless of the value of tag openings (even when set to the minimum
value of 2) and the filesystem gets corrupted or totally destroyed.

I am still looking for a solution and willing to do any tests.

Regards,
Panagiotis

-- 
Panagiotis J. Christias    Network Management Center
P.Christias@noc.ntua.gr    National Technical Univ. of Athens, GREECE

From owner-freebsd-stable@FreeBSD.ORG Mon Nov 17 00:02:29 2008
From: Rory Arms
To: barbara
Cc: kensmith, FreeBSD-stable
Date: Sun, 16 Nov 2008 19:02:19 -0500
Subject: Re: 6.4-RC2 crashes after a few minutes of uptime

On 2008-11-16, at 17:28 , barbara wrote:

>>> Hi Rory,
>>>
>>> did you see my replies or are you missing them for any reason?
>>
>> Yes, I have seen your replies. I must have missed the PR you
>> mentioned last time, sorry.
>
> No problem!
>
>>> Your panics and
>>> some aspects about how they happens look like mine to me, look here:
>>> http://lists.freebsd.org/pipermail/freebsd-stable/2008-October/045865.html
>>
>> Yes, indeed. That looks very similar to the issue I'm running into
>> with 6.4-RC2 as well. Sounds like it might be a regression in ata(4).
>> At least you were able to open the core dump. Are you still able to
>> open core dumps with RC2?
>>
>
> I'm not sure. I'm running STABLE and I had no panics after the
> branch has changed to RC2. It seems that my panics are not frequent
> as yours.
> Anyway my box freezed a couple of times after last newvers.sh and
> the symptoms looked like the same, with messages about acd0. I was
> able to ping it but it won't let me ssh in, like it was using all
> the cpus.
>
> About kgdb...
> I never used freebsd-update, so sorry if I'm saying something
> stupid, but could it be the case that the kernel has been built
> without debugging symbols or something like that? Does
> freebsd-update provide a kernel.debug?

I haven't had to use the kernel.debug file in the obj dir in a long
time. As far as I know, these days the GENERIC kernel includes debug
symbols. And even when there aren't any debug symbols, that shouldn't
prevent kgdb from loading, I wouldn't think.

- rory

From owner-freebsd-stable@FreeBSD.ORG Mon Nov 17 00:45:21 2008
From: Jo Rhett
To: Oliver Lehmann
Cc: freebsd-stable@freebsd.org, Philip Murray
In-Reply-To: <20081112204351.ccc51c2f.lehmann@ans-netz.de>
References: <20081029170728.be7cc7ab.lehmann@ans-netz.de> <13394481-8FDC-4934-BB12-FA5BCB2D35CD@nevada.net.nz> <20081112204351.ccc51c2f.lehmann@ans-netz.de>
Date: Sun, 16 Nov 2008 16:45:15 -0800
Subject: Re: 3Ware 9000 series hangs under load

Philip Murray wrote:
> Anyway, I stopped running 3dmd (or 3dm2 I think it's called now) to
> monitor it, and the crashes went away. It's had hundreds of days
> uptime since.

We have never used 3dm2, and the 9500 units have been rock solid for us.

> I've never been game enough to try newer versions of 3dm, but a
> cronjob of tw_cli allows me to monitor it now without the lockups.
> Might not be your problem, but it's worth a shot if all else fails.

The driver logs all the useful events, and the SEC logfile surfer does
a good job of notifying you quickly. I can send you an SEC
configuration for that if you want.

-- 
Jo Rhett
Net Consonance : consonant endings by net philanthropy, open source and other randomness

From owner-freebsd-stable@FreeBSD.ORG Mon Nov 17 00:46:23 2008
From: Jo Rhett
To: Philip Murray
Cc: freebsd-stable@freebsd.org, Oliver Lehmann
In-Reply-To: <95E9EA2C-C288-4F11-AD35-FE6AF6633A09@nevada.net.nz>
References: <20081029170728.be7cc7ab.lehmann@ans-netz.de> <13394481-8FDC-4934-BB12-FA5BCB2D35CD@nevada.net.nz> <20081112204351.ccc51c2f.lehmann@ans-netz.de> <95E9EA2C-C288-4F11-AD35-FE6AF6633A09@nevada.net.nz>
Date: Sun, 16 Nov 2008 16:46:17 -0800
Subject: Re: 3Ware 9000 series hangs under load

On Nov 12, 2008, at 12:37 PM, Philip Murray wrote:
> I just installed sysutils/tw_cli from ports, and it sets up some
> 'periodic' scripts for you. To be precise it puts
> 407.status-3ware-raid in /usr/local/etc/periodic/daily

Don't use that. It's a very old version of the code. Use the binary
version of tw_cli that matches the firmware on your controller.
-- Jo Rhett Net Consonance : consonant endings by net philanthropy, open source and other randomness From owner-freebsd-stable@FreeBSD.ORG Mon Nov 17 01:13:23 2008 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8EDC11065670; Mon, 17 Nov 2008 01:13:23 +0000 (UTC) (envelope-from p.christias@noc.ntua.gr) Received: from achilles.noc.ntua.gr (achilles.noc.ntua.gr [IPv6:2001:648:2000:de::210]) by mx1.freebsd.org (Postfix) with ESMTP id F109D8FC17; Mon, 17 Nov 2008 01:13:22 +0000 (UTC) (envelope-from p.christias@noc.ntua.gr) Received: from ajax.noc.ntua.gr (ajax6.noc.ntua.gr [IPv6:2001:648:2000:dc::1]) by achilles.noc.ntua.gr (8.14.3/8.14.3) with ESMTP id mAH1DIRJ013941 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Mon, 17 Nov 2008 03:13:18 +0200 (EET) (envelope-from p.christias@noc.ntua.gr) Received: from ajax.noc.ntua.gr (localhost.noc.ntua.gr [127.0.0.1]) by ajax.noc.ntua.gr (8.13.8/8.13.8) with ESMTP id mAH1DHqN056431 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Mon, 17 Nov 2008 03:13:17 +0200 (EET) (envelope-from p.christias@noc.ntua.gr) Received: (from christia@localhost) by ajax.noc.ntua.gr (8.13.8/8.13.8/Submit) id mAH1DHVK056430; Mon, 17 Nov 2008 03:13:17 +0200 (EET) (envelope-from p.christias@noc.ntua.gr) X-Authentication-Warning: ajax.noc.ntua.gr: christia set sender to p.christias@noc.ntua.gr using -f Date: Mon, 17 Nov 2008 03:13:17 +0200 From: Panagiotis Christias To: Oleg Sharoiko Message-ID: <20081117011317.GB52109@noc.ntua.gr> References: <20081014222343.GA8706@noc.ntua.gr> <1224049455.1277.44.camel@brain.cc.rsu.ru> <20081015175453.GA3260@noc.ntua.gr> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20081015175453.GA3260@noc.ntua.gr> User-Agent: Mutt/1.5.16 (2007-06-09) X-Virus-Scanned: ClamAV version 0.94, clamav-milter version 0.94 
on achilles.noc.ntua.gr
X-Virus-Status: Clean
Cc: freebsd-scsi@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: FreeBSD 7-STABLE, isp(4), QLE2462: panic & deadlocks
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 17 Nov 2008 01:13:23 -0000

On Wed, Oct 15, 2008 at 08:54:53PM +0300, Panagiotis Christias wrote:
> On Wed, Oct 15, 2008 at 09:44:15AM +0400, Oleg Sharoiko wrote:
> > Hi!
> >
> > On Wed, 2008-10-15 at 01:23 +0300, Panagiotis Christias wrote:
> >
> > > However, when we connect them to the CX3-40, create and mount a new
> > > partition and then do something as simple as "tar -C /san -xf ports.tgz"
> > > the system panics and deadlocks. We have tried several FreeBSD versions
> > > (6.3 i386/amd64, 7.0 i386/amd64, 7.1 i386/amd64 and lastly 7-STABLE i386
> > > - we also tried the latest 8-CURRENT snapshot but it panicked too soon).
> > > The result is always the same; panic and deadlock.
> >
> > Try reducing the number of "tagged openings" with 'camcontrol tags' down
> > to 46. If it doesn't work try reducing it further to 2. Also be advised
> > that I've seen panics with geom_multipath in FreeBSD-7, unfortunately I
> > had no time to test it in -current.
> >
>
> Hm.. that would probably explain the fact that I was unable to panic the
> system when I had set the hint.isp.0.debug="0x1F" in /boot/device.hints.
>
> Currently I am stress testing the server with the tagged openings set to
> 44 (first value tested). Until now there is no panic or deadlock. I am
> trying concurrent tar extractions and rsync copies. The filesystem looks
> ok till now according to fsck. I will let it write/copy/delete overnight
> and tomorrow I will try different tagged opening values.
>
> Thank you for the hint! I am wondering what is the performance penalty
> with decreased tagged openings. Also, is there anything else I could try
> in order to get more useful debug output? I have at least three servers
> that I could use for any kind of tests and I am willing to spend as much
> time I can get to help solving the problem.
>
> Finally, the only output in the logs is:
>
> Expensive timeout(9) function: 0xc06f4210(0xc67e1200) 0.059422635 s
> Expensive timeout(9) function: 0xc08d4fd0(0) 0.060676147 s
>
> I suppose that is related to the CAMDEBUG kernel config options.

For the record, I have run many tests using several stress tools in
parallel, different FreeBSD versions (up to 7.1beta2), various filesystem
configurations (plain ufs2 with softupdates, ufs2 with gjournal, zfs) and
various tagged openings values (down to 2). Regardless of the
configuration, the system deadlocks or panics, or the filesystem gets
badly corrupted, within seconds, minutes or a few hours.

The only configuration that seems to work without problems(?), but with
an unacceptable *severe* performance penalty, is with tagged openings set
to the minimum value of 2 (which is more or less the same as disabling
tagged command queueing altogether).

All tests ran on a 500 GB RAID5 LUN on an EMC Clariion CX3-40:

da0 at isp0 bus 0 target 0 lun 0
da0: Fixed Direct Access SCSI-4 device
da0: Serial Number CK200083100148
da0: 400.000MB/s transfers
da0: Command Queueing Enabled
da0: 512000MB (1048576000 512 byte sectors: 255H 63S/T 65270C)

Previously, a Sun StorEdge T3 was tested and worked flawlessly, but it
has a 1 Gbps fibre channel interface instead of the 4 Gbps one on the
Clariion, was recognized as a SCSI-3 device, and had 2 tagged openings
by default (no surprise):

da1 at isp1 bus 0 target 0 lun 0
da1: Fixed Direct Access SCSI-3 device
da1: 100.000MB/s transfers
da1: 241724MB (495050752 512 byte sectors: 255H 63S/T 30815C)

As I mentioned before, I am willing to spend time and/or provide access
to the system for testing and debugging.

Regards,
Panagiotis

--
Panagiotis J.
Christias Network Management Center P.Christias@noc.ntua.gr National Technical Univ. of Athens, GREECE From owner-freebsd-stable@FreeBSD.ORG Mon Nov 17 02:29:53 2008 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2F3A11065673 for ; Mon, 17 Nov 2008 02:29:53 +0000 (UTC) (envelope-from hartzell@alerce.com) Received: from merlin.alerce.com (merlin.alerce.com [64.62.142.94]) by mx1.freebsd.org (Postfix) with ESMTP id 1BDD78FC18 for ; Mon, 17 Nov 2008 02:29:53 +0000 (UTC) (envelope-from hartzell@alerce.com) Received: from merlin.alerce.com (localhost [127.0.0.1]) by merlin.alerce.com (Postfix) with ESMTP id BDADD33C62; Sun, 16 Nov 2008 18:29:52 -0800 (PST) Received: from merlin.alerce.com (localhost [127.0.0.1]) by merlin.alerce.com (Postfix) with ESMTP id 3EB5633C5B; Sun, 16 Nov 2008 18:29:51 -0800 (PST) From: George Hartzell MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <18720.55070.363778.698000@almost.alerce.com> Date: Sun, 16 Nov 2008 18:29:50 -0800 To: hartzell@alerce.com In-Reply-To: <18716.48723.452606.66518@almost.alerce.com> References: <18716.48723.452606.66518@almost.alerce.com> X-Mailer: VM 7.19 under Emacs 22.1.50.1 X-Virus-Scanned: ClamAV using ClamSMTP Cc: freebsd-stable@freebsd.org Subject: Re: problem moving gmirror between two machines. X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: hartzell@alerce.com List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 17 Nov 2008 02:29:53 -0000 George Hartzell writes: > > I have an HP DL360 with a pair of 1TB seagate disks that's been > running -STABLE with a ZFS root partition set up using the tools > available here: > > http://yds.coolrat.org/zfsboot.shtml > > It's been working great. 
> As part of trying to understand what's going
> on, I csup'ed to -RELENG earlier today and rebuilt/installed the
> kernel and world whilst running on the DL360, so everything should be
> current.
>
> I tried to move the disks into an HP DL320 G4 and it fails to boot
> because it can't find /dev/mirror/boot (which it wants to mount onto
> /strap and then parts get nullfs'ed onto /boot and /rescue). It gives
> me the opportunity to start a shell, and from that shell I can do a
> zfs mount -a and get all of the zfs filesystems mounted, but there's
> nothing in /dev/mirror. No gmirror status and list are silent.
>
> I can move the disks back into the older machine and they work fine.
>
> I've run fdisk -s ad4 and bsdlabel -A /dev/ad4s1a and diffed the
> output from the two machines and they're identical.
>
> I've booted with kern.geom.mirror.debug=2 and the DL320G4 tastes
> /dev/ad4s1a (along with everything else) but doesn't do anything with
> it.
>
> Any ideas?
>

[for the archives]

Solved. gmirror had been set up with -h specifying the device, and
although the newer server used the same device names for its disks
(ad[46]) it assigned them to different hot swap bays. Once I switched
the disks everything came up fine.

g.

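
The failure mode described in the message above can be sketched as a toy
model. This is only an illustration, not the GEOM code: it assumes that
each mirror component stores the provider name it was labelled under
(which is what gmirror's -h option is documented to do), and that a
component is ignored at taste time when that stored name differs from the
name the disk shows up under. The class, the function names and the
ad4s1a/ad6s1a assignment per disk are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MirrorMetadata:
    """Hypothetical stand-in for the metadata stored on one component."""
    mirror_name: str
    hardcoded_provider: str  # empty string models "name not hardcoded"


def taste(provider_name: str, md: MirrorMetadata) -> bool:
    """Accept the component unless a hardcoded name disagrees with it."""
    if md.hardcoded_provider and md.hardcoded_provider != provider_name:
        return False
    return True


def attached(bays: Dict[str, MirrorMetadata]) -> List[str]:
    """Return the provider names whose components would be accepted."""
    return sorted(name for name, md in bays.items() if taste(name, md))


# Assumption: each disk was labelled with its own device name hardcoded.
disk_a = MirrorMetadata("boot", "ad4s1a")
disk_b = MirrorMetadata("boot", "ad6s1a")

old_machine = {"ad4s1a": disk_a, "ad6s1a": disk_b}
new_machine = {"ad4s1a": disk_b, "ad6s1a": disk_a}  # bays map the other way

print(attached(old_machine))  # both components accepted
print(attached(new_machine))  # neither accepted, so no /dev/mirror/boot
```

In this model, swapping the two disks in the new machine restores the
original mapping, which matches the fix reported above. With an empty
hardcoded_provider, both components would be accepted in either order.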
From owner-freebsd-stable@FreeBSD.ORG Mon Nov 17 03:46:48 2008 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 20CF21065689 for ; Mon, 17 Nov 2008 03:46:48 +0000 (UTC) (envelope-from cmdlnkid@gmail.com) Received: from rn-out-0910.google.com (rn-out-0910.google.com [64.233.170.191]) by mx1.freebsd.org (Postfix) with ESMTP id C00168FC1E for ; Mon, 17 Nov 2008 03:46:47 +0000 (UTC) (envelope-from cmdlnkid@gmail.com) Received: by rn-out-0910.google.com with SMTP id j71so2331158rne.12 for ; Sun, 16 Nov 2008 19:46:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:date:from:reply-to:to:cc :subject:in-reply-to:message-id:references:user-agent:x-openpgp-key :mime-version:content-type; bh=/eT25LAT92xXVrLOvA9Io5a7LNiTfBFqCgwfIM6snkk=; b=gpC2fp+qcJZL3D1grPfxTqe62hzk6mb3jyxnVIBaow2SMAiHTbZW4bzbXJIP6P4T0H DdBhAvwpA10pojE5sJx2SoroE6hXSurbZNVMoAelvx6XGRxQUhHMKdo+Jo+h5/VEQj5j dz7kbRFfdOucAVefay3F0Ak1kHrxIfpYm88nE= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=date:from:reply-to:to:cc:subject:in-reply-to:message-id:references :user-agent:x-openpgp-key:mime-version:content-type; b=S/N3QNMf30liUhMXPTcZD0q/CLnwl0/WKeckKYScI2ZtKBHT8VKsd93GcD0laUc/oZ KWRCCNg6jo4N4ZOTjtEHsbuCiZap7V8L/wFYApz3tnof9lKmI8fpztSr/w0ZG9LZcImS pa0at88RF7V++NN+xd6tRDDkV4rd2sMaan7PY= Received: by 10.90.98.13 with SMTP id v13mr2434406agb.28.1226891678587; Sun, 16 Nov 2008 19:14:38 -0800 (PST) Received: from ?192.168.1.50? 
(c-71-205-56-117.hsd1.mi.comcast.net [71.205.56.117]) by mx.google.com with ESMTPS id 20sm2576183agb.38.2008.11.16.19.14.37 (version=TLSv1/SSLv3 cipher=RC4-MD5); Sun, 16 Nov 2008 19:14:37 -0800 (PST) Date: Sun, 16 Nov 2008 22:14:35 -0500 From: CmdLnKid To: George Hartzell In-Reply-To: <18720.55070.363778.698000@almost.alerce.com> Message-ID: References: <18716.48723.452606.66518@almost.alerce.com> <18720.55070.363778.698000@almost.alerce.com> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) X-OpenPGP-Key: 0xDFFDD218 MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-stable@freebsd.org Subject: Re: problem moving gmirror between two machines. X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: CmdLnKid List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 17 Nov 2008 03:46:48 -0000 On Sun, 16 Nov 2008 21:29 -0000, hartzell wrote: > George Hartzell writes: > > > > I have an HP DL360 with a pair of 1TB seagate disks that's been > > running -STABLE with a ZFS root partition set up using the tools > > available here: > > > > http://yds.coolrat.org/zfsboot.shtml > > > > It's been working great. As part of trying to understand what's going > > on, I csup'ed to -RELENG earlier today and rebuilt/installed the > > kernel and world whilst running on the DL360, so everything should be > > current. > > > > I tried to move the disks into an HP DL320 G4 and it fails to boot > > because it can't find /dev/mirror/boot (which it wants to mount onto > > /strap and then parts get nullfs'ed onto /boot and /rescue). It gives > > me the opportunity to start a shell, and from that shell I can do a > > zfs mount -a and get all of the zfs filesystems mounted, but there's > > nothing in /dev/mirror. No gmirror status and list are silent. > > > > I can move the disks back into the older machine and they work fine. 
> > > > I've run fdisk -s ad4 and bsdlabel -A /dev/ad4s1a and diffed the > > output from the two machines and they're identical. > > > > I've booted with kern.geom.mirror.debug=2 and the DL320G4 tastes > > /dev/ad4s1a (along with everything else) but doesn't do anything with > > it. > > > > Any ideas? > > > > [for the archives] > > Solved. gmirror had been set up with -h specifying the device, and > although the newer server used the same device names for its disks > (ad[46]) it assigned them to different hot swap bays. Once I switched > the disks everything came up fine. > > g. Wouldn't it be more feasible in this situation to just glabel the disks and mount them from /dev//