From: Johan Ström
Date: Fri, 18 Nov 2005 19:23:25 +0100
To: delphij@delphij.net
Cc: pjd@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: Page fault, GEOM problem??

Hi!

On 18 nov 2005, at 18.43, Xin LI wrote:

> Hi, Johan,
>
> On 11/18/05, Johan Ström wrote:
>> On 18 nov 2005, at 10.17, Xin LI wrote:
> [snip]
>> Doesnt look like I got any "usable" dump devices..
>> When booting i get
> [...]
>> Loading configuration files.
>> No suitable dump device was found.
>> Entropy harvesting:
>> interrupts
>> ethernet
>> point_to_point
>> kickstart
>> .
>> swapon: adding /dev/mirror/gm0s1b as swap device
>
> I see, so your both SATA disks are in the same mirror group...
>
>> Then naturally:
>> /etc/rc: WARNING: Dump device does not exist.  Savecore not run.
>>
>> Looked around in the rc-scripts and tried to figure out what it did,
>> the dumpon script
>> tries to autolookup a good dump device but finds none..
>
> Unfortunately, kernel dumps currently does not support every device,
> for some technical reasons (probably to simplify the crash code so
> they do not make more mistakes^Wdamages)
>
>> According to the page you linked to, the dumpon command has to be
>> executed AFTER swapon.. Why is the rc scripts trying to run it before
>> swapon then?
>
> I guess this is because that dumpon now can detect dump device
> automatically, but I'm not quite sure about this.  Will look for the
> reason.  I think either Handbook should be updated, or the code should
> be corrected.
>
> What I am very curious is that why dumpon is "BEFORE" savecore.  Maybe
> I have some misunderstanding...

Sorry, that was partly my mistake. I think I misunderstood how savecore
works below (when I tried it manually in my last mail). But the messages
above are taken directly from boot; it seems it tries dumpon before
savecore?

Relevant bootlog from last boot:

ad0: 2441MB at ata0-master UDMA33
acd0: CDROM at ata1-master PIO4
ad6: 286188MB at ata3-master SATA150
ad10: 286188MB at ata5-master SATA150
GEOM_MIRROR: Device gm0s1 created (id=4118114647).
GEOM_MIRROR: Device gm0s1: provider ad6s1 detected.
GEOM_MIRROR: Device gm0s1: provider ad10s1 detected.
GEOM_MIRROR: Device gm0s1: provider ad10s1 activated.
GEOM_MIRROR: Device gm0s1: provider ad6s1 activated.
GEOM_MIRROR: Device gm0s1: provider mirror/gm0s1 launched.
Trying to mount root from ufs:/dev/mirror/gm0s1a
Loading configuration files.
dumpon: ioctl(DIOCSKERNELDUMP): Operation not supported
(This DIOCSKERNELDUMP message is probably because I specified dumpdev in
rc.conf, which forced the use of gm0s1b instead of letting the scripts
autodetect.)
Entropy harvesting: interrupts ethernet point_to_point kickstart .
swapon: adding /dev/mirror/gm0s1b as swap device
Starting file system checks:
/dev/mirror/gm0s1a: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/mirror/gm0s1a: clean, 213811 free (771 frags, 26630 blocks, 0.3% fragmentation)
/dev/mirror/gm0s1e: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/mirror/gm0s1e: clean, 1012917 free (85 frags, 126604 blocks, 0.0% fragmentation)
/dev/mirror/gm0s1f: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/mirror/gm0s1f: clean, 115955787 free (40747 frags, 14489380 blocks, 0.0% fragmentation)
/dev/mirror/gm0s1d: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/mirror/gm0s1d: clean, 1983354 free (4834 frags, 247315 blocks, 0.2% fragmentation)
Starting devd.
Mounting NFS file systems: .
Creating and/or trimming log files: .
Starting syslogd.
Checking for core dump on /dev/mirror/gm0s1b...
savecore: no dumps found
Starting named.

So it seems it does run savecore after running dumpon, mounting the
disks, etc. Is that wrong?

>
>> Anyway, tried to do dumpon manually on my swap drive:
>>
>> $ dumpon -v /dev/mirror/gm0s1b
>> dumpon: ioctl(DIOCSKERNELDUMP): Operation not supported
>>
>> Didn't work too good..
>> Also tried savecore manually:
>>
>> $ savecore /var/crash/ /dev/mirror/gm0s1b
>> savecore: no dumps found

(This was my mistake; of course there are no dumps when I didn't have a
dump when it crashed.)

>>
>> Didnt work very good either (but probably expected since there was no
>> working dumps..)
>> Google showed me some other thread in this list about gmirror swap
>> dump, just a question (if it was supported) w/o any answers tho. Same
>> error as I got.
>
> It seems that this could not be workaround'ed easily.
> If possible, my suggestion is that you attach a third disk and create
> a swap partition on it for the crash dump.  If this is not feasible,
> then adding DDB and KDB may give us a chance to catch the panic and
> you can use "trace" command at the ddb> prompt to obtain a simplified
> backtrace, and there is good chance that it would reveal what is
> happening.
>
> I have cc'ed to Pawel who is very knowledgeable in this area, and
> let's see whether he has some better suggestions :-)

Okay, I just added an old but working 2 GB disk to the system, set it up
as swap, ran swapon on it, and:

root@elfi:~$ dumpon -v /dev/ad0s1b
kernel dumps on /dev/ad0s1b

Great! :) So, let's see when/if it dies next time... Before I took it
down to add the dump disk, it had been running fine for 1d 1h (since the
boot after the crash), though probably not as loaded as on the day it
crashed. I'll try to load it some now and see if it crashes.

Thanks
Johan

>
> Cheers,
> --
> Xin LI http://www.delphij.net
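
A minimal sketch of keeping this dump device across reboots, assuming the
dumpdev and dumpdir variables described in rc.conf(5) and the /dev/ad0s1b
partition from the dumpon output above. The dump_rc_lines helper is
hypothetical, made up for illustration; it only prints the lines so they
can be reviewed before being added to /etc/rc.conf by hand.

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): print rc.conf(5) lines that
# point kernel dumps at a plain swap partition outside the gmirror volume.
dump_rc_lines() {
    dev="$1"                  # e.g. /dev/ad0s1b, the extra disk added above
    dir="${2:-/var/crash}"    # directory savecore should write dumps to
    printf 'dumpdev="%s"\n' "$dev"
    printf 'dumpdir="%s"\n' "$dir"
}

# Print the lines for review; nothing is written to /etc/rc.conf here.
dump_rc_lines /dev/ad0s1b
```

With lines like these in place, the rc scripts should pick up the device
at boot instead of autodetecting one, and savecore should look in that
directory's configured dump device on the boot after a panic. I have not
verified this on the system discussed here.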