Date:      Fri, 18 Nov 2005 19:23:25 +0100
From:      Johan Ström <johan@stromnet.org>
To:        delphij@delphij.net
Cc:        pjd@freebsd.org, freebsd-stable@freebsd.org
Subject:   Re: Page fault, GEOM problem??
Message-ID:  <A6F22EE2-B1E6-44B5-B4C2-E77E1A24FEBB@stromnet.org>
In-Reply-To: <a78074950511180943r57fd9d03r64efcc705001bc35@mail.gmail.com>
References:  <991F35AA-151B-4AEA-82BD-5F4AEDF28424@stromnet.org> <a78074950511180117r6d64db25o4ae37c0c5998e002@mail.gmail.com> <74994962-5050-47BD-897B-DE3880B9EBD5@stromnet.org> <a78074950511180943r57fd9d03r64efcc705001bc35@mail.gmail.com>

Hi!

On 18 nov 2005, at 18.43, Xin LI wrote:

> Hi, Johan,
>
> On 11/18/05, Johan Ström <johan@stromnet.org> wrote:
>> On 18 nov 2005, at 10.17, Xin LI wrote:
> [snip]
>> Doesn't look like I got any "usable" dump devices..
>> When booting I get
> [...]
>> Loading configuration files.
>> No suitable dump device was found.
>> Entropy harvesting:
>> interrupts
>> ethernet
>> point_to_point
>> kickstart
>> .
>> swapon: adding /dev/mirror/gm0s1b as swap device
>
> I see, so both of your SATA disks are in the same mirror group...
>
>> Then naturally:
>> /etc/rc: WARNING: Dump device does not exist.  Savecore not run.
>>
>> Looked around in the rc scripts and tried to figure out what they did;
>> the dumpon script tries to automatically look up a good dump device but
>> finds none..
>
> Unfortunately, kernel dumps currently do not support every device,
> for some technical reasons (probably to simplify the crash code so
> it does not make more mistakes^Wdamage)
>
>> According to the page you linked to, the dumpon command has to be
>> executed AFTER swapon.. Why are the rc scripts trying to run it before
>> swapon then?
>
> I guess this is because dumpon can now detect the dump device
> automatically, but I'm not quite sure about this.  Will look for the
> reason.  I think either the Handbook should be updated, or the code
> should be corrected.
>
> What I am very curious about is why dumpon is "BEFORE" savecore.  Maybe
> I have some misunderstanding...

Sorry, partly my mistake.. I think I misunderstood how savecore
works below (when I tried it manually in the last mail)..
But the messages above are directly from boot; it seems it tries
dumpon before savecore? Relevant boot log from the last boot:


ad0: 2441MB <WDC AC22500L 32.41N35> at ata0-master UDMA33
acd0: CDROM <CD-ROM CDU701-F/1.0q> at ata1-master PIO4
ad6: 286188MB <Maxtor 7L300S0 BANC1G10> at ata3-master SATA150
ad10: 286188MB <Maxtor 7L300S0 BANC1G10> at ata5-master SATA150
GEOM_MIRROR: Device gm0s1 created (id=4118114647).
GEOM_MIRROR: Device gm0s1: provider ad6s1 detected.
GEOM_MIRROR: Device gm0s1: provider ad10s1 detected.
GEOM_MIRROR: Device gm0s1: provider ad10s1 activated.
GEOM_MIRROR: Device gm0s1: provider ad6s1 activated.
GEOM_MIRROR: Device gm0s1: provider mirror/gm0s1 launched.
Trying to mount root from ufs:/dev/mirror/gm0s1a
Loading configuration files.
dumpon: (this DIOCSKERNELDUMP message is probably because I specified
dumpdev in rc.conf, which forced usage of gm0s1b instead of letting
the scripts autodetect..)
ioctl(DIOCSKERNELDUMP)
:
Operation not supported
Entropy harvesting:
interrupts
ethernet
point_to_point
kickstart
.
swapon: adding /dev/mirror/gm0s1b as swap device
Starting file system checks:
/dev/mirror/gm0s1a: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/mirror/gm0s1a: clean, 213811 free (771 frags, 26630 blocks, 0.3% fragmentation)
/dev/mirror/gm0s1e: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/mirror/gm0s1e: clean, 1012917 free (85 frags, 126604 blocks, 0.0% fragmentation)
/dev/mirror/gm0s1f: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/mirror/gm0s1f: clean, 115955787 free (40747 frags, 14489380 blocks, 0.0% fragmentation)
/dev/mirror/gm0s1d: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/mirror/gm0s1d: clean, 1983354 free (4834 frags, 247315 blocks, 0.2% fragmentation)
<ifconfig stuff>
Starting devd.
Mounting NFS file systems:
.
Creating and/or trimming log files:
.
Starting syslogd.
Checking for core dump on /dev/mirror/gm0s1b...
savecore: no dumps found
Starting named.
<rest of boot>

So, it seems it does run savecore after running dumpon and mounting
disks etc... Is that wrong?

>
>> Anyway, tried to do dumpon manually on my swap drive:
>>
>> $ dumpon -v /dev/mirror/gm0s1b
>> dumpon: ioctl(DIOCSKERNELDUMP): Operation not supported
>>
>> Didn't work too good..
>> Also tried savecore manually:
>>
>> $ savecore /var/crash/ /dev/mirror/gm0s1b
>> savecore: no dumps found

(This was my mistake; of course there are no dumps when nothing was
dumped at the time of the crash..)

>>
>> Didn't work very well either (but probably expected since there were no
>> working dumps..)
>> Google showed me another thread on this list about gmirror swap
>> dumps, just a question (whether it was supported) without any answers
>> though. Same error as I got.
>
> It seems that this cannot be worked around easily.  If possible, my
> suggestion is that you attach a third disk and create a swap partition
> on it for the crash dump.  If this is not feasible, then adding DDB
> and KDB may give us a chance to catch the panic, and you can use the
> "trace" command at the ddb> prompt to obtain a simplified backtrace;
> there is a good chance that it would reveal what is happening.
>
> I have cc'ed Pawel, who is very knowledgeable in this area, and
> let's see whether he has some better suggestions :-)
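
For reference, the DDB/KDB alternative suggested above comes down to two
kernel configuration options. This is only a sketch: the option names are
the standard FreeBSD ones, but which kernel config file they go into and
how the kernel gets rebuilt are not discussed in this thread.

```
# Assumed additions to the custom kernel configuration file
options 	KDB		# kernel debugger framework
options 	DDB		# interactive backend that gives the ddb> prompt and "trace"
```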

Okay, just added an old but working 2 gig disk to the system, set up
a swap partition on it, swapon'ed it, and:

root@elfi:~$ dumpon -v /dev/ad0s1b
kernel dumps on /dev/ad0s1b

Great! :) So, let's see when/if it dies next time... Before I took it
down to add the dump disk, it had been running fine
for 1d 1h (since the boot after the crash), though probably not as loaded
as the day it crashed.. I'll try to load it some now and see if it
crashes.
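
To keep this setting across reboots, the dumpdev variable in rc.conf
mentioned earlier would presumably point at the new partition instead of
the gmirror one. A minimal sketch, assuming the disk stays at ad0s1b and
dumps should land in /var/crash (the directory used with savecore above):

```
# /etc/rc.conf (sketch; device name taken from the dumpon command above)
dumpdev="/dev/ad0s1b"	# plain swap partition, not a gmirror provider
dumpdir="/var/crash"	# directory savecore writes to at the next boot
```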

Thanks

Johan

>
> Cheers,
> --
> Xin LI <delphij@delphij.net> http://www.delphij.net



