Date:      Mon, 20 Dec 2004 09:23:18 +1030
From:      Greg 'groggy' Lehey <grog@FreeBSD.org>
To:        Daniel Johansson <donnex@gmail.com>
Cc:        questions@freebsd.org
Subject:   Re: My server gets kernel panic every 7th day
Message-ID:  <20041219225318.GH84787@wantadilla.lemis.com>
In-Reply-To: <2a37e1ef04121914421fe84902@mail.gmail.com>
References:  <2a37e1ef04121802575db1ba26@mail.gmail.com> <20041218195002.GC78603@xor.obsecurity.org> <20041219222919.GE84787@wantadilla.lemis.com> <2a37e1ef04121914352677c442@mail.gmail.com> <20041219223801.GG84787@wantadilla.lemis.com> <2a37e1ef04121914421fe84902@mail.gmail.com>

On Sunday, 19 December 2004 at 23:42:20 +0100, Daniel Johansson wrote:
> On Mon, 20 Dec 2004 09:08:01 +1030, Greg 'groggy' Lehey
> <grog@freebsd.org> wrote:
>> On Sunday, 19 December 2004 at 23:35:18 +0100, Daniel Johansson wrote:
>>> On Mon, 20 Dec 2004 08:59:19 +1030, Greg 'groggy' Lehey
>>> <grog@freebsd.org> wrote:
>>>> On Saturday, 18 December 2004 at 11:50:02 -0800, Kris Kennaway wrote:
>>>>> On Sat, Dec 18, 2004 at 11:57:35AM +0100, Daniel Johansson wrote:
>>>>>> Hi, I've had my server up for over a year now and it's been rock solid,
>>>>>> but for the last few weeks the server has rebooted every Saturday at
>>>>>> exactly 04:19:57 because of a find command. I have no idea why. I've
>>>>>> checked the cron log and, as far as I can see from it, no crontab is
>>>>>> run at that time. Anyway, find makes the server get a kernel panic and
>>>>>> it reboots. This is the fourth week in a row that this has happened,
>>>>>> and I've checked the hardware: no problems at all.
>>>>>
>>>>> How did you "check the hardware"?  Hardware failure is by far the
>>>>> most common cause of "strange panics under abnormal load [such as
>>>>> when the weekly cron job runs]".
>>>>
>>>> If this panic occurs repeatedly under certain circumstances, it's
>>>> probably not hardware.  Anyway, there's not much point standing
>>>> outside and scratching our heads.  We have a facility for analysing
>>>> this kind of problem: the processor dump and kernel debugger.
>>>
>>> Yeah, I want to say thank you for your help. I think I've been able to
>>> reproduce the kernel panic now, finally!
>>>
>>> On my server I run 3 jails, and at 04:15, when periodic weekly runs,
>>> it runs in all 3 jails plus the host environment. This
>>> seems to cause the kernel panic; I don't really know why yet. I can
>>> run periodic weekly separately in each jail and on the host without a
>>> kernel panic, but when I run it everywhere at the same time, the kernel
>>> panics.
>>
>> What does the dump backtrace show?
>>
>>> It can still be the PSU, don't have any other atm to try with. I'll
>>> do some more testing and see if I can get any more info.
>>
>> There's no point looking at the hardware until you've looked at the
>> dump.

I'd appreciate it if you didn't require me to move the text of your
messages to where it fits.

> Okay, is this hard to do? I have no idea how to look at the dump or
> how to interpret it. You don't have to be a kernel hacker to
> understand it, do you?

It's described in the handbook.  Basically:

- Build a kernel with debug symbols (you should be doing this anyway).
  You need the following line in your configuration file:

    makeoptions	DEBUG=-g		# Build kernel with gdb(1) debug symbols

- Make sure that dumps are enabled.  You should have something like
  this in your /etc/rc.conf:

    dumpdev=/dev/ad0s2b

  The device name should be the name of your swap partition, and it
  must be at least slightly larger than your main memory.

- Ensure you have a directory /var/crash, and that the file system in
  which it resides has enough space for the dump (a little larger than
  main memory).

- When you get a dump, it will be copied to /var/crash automatically
  on reboot.  Go there and get a backtrace.  You don't say which
  version of FreeBSD you're using, but in general this will do it:

  # cd /var/crash
  # gdb -k /usr/obj/src/sys/GENERIC/kernel.debug vmcore.0
  (gdb) bt

The name of the kernel file (kernel.debug) depends on how you built your
kernel.  If your kernel configuration isn't called GENERIC, the name of
the directory will change accordingly.
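
The checks in the list above could be scripted roughly as follows.  This
is only a sketch: the function names (rc_dumpdev, dump_fits,
check_crash_setup) are invented for the example, it assumes that dumpdev
is set on a single line in /etc/rc.conf, and it assumes that df -k prints
the available space in the fourth column of its second line.  It does not
check the size of the swap partition itself.

```shell
#!/bin/sh
# Sketch only: pre-flight checks for the crash dump setup described above.
# All function names here are hypothetical, not part of FreeBSD.

# Print the dumpdev value from an rc.conf-style file (last assignment wins).
# Surrounding quotes and any trailing comment are dropped.
rc_dumpdev() {
    sed -n 's/^[[:space:]]*dumpdev=\([^[:space:]#]*\).*/\1/p' "$1" |
        tail -n 1 | tr -d "\"'"
}

# Succeed if a dump of $1 kilobytes fits into $2 kilobytes of space.
# The comparison is strict because the dump needs a little more room
# than main memory.
dump_fits() {
    [ "$2" -gt "$1" ]
}

# Run the checks on a live FreeBSD system (not exercised anywhere else).
check_crash_setup() {
    dev=$(rc_dumpdev /etc/rc.conf)
    [ -n "$dev" ] || { echo "no dumpdev in /etc/rc.conf"; return 1; }
    [ -d /var/crash ] || { echo "/var/crash is missing"; return 1; }
    mem_kb=$(( $(sysctl -n hw.physmem) / 1024 ))
    crash_kb=$(df -k /var/crash | awk 'NR == 2 { print $4 }')
    dump_fits "$mem_kb" "$crash_kb" ||
        { echo "/var/crash is too small for a dump"; return 1; }
    echo "dumps go to $dev, and /var/crash has room for one"
}
```

To use it, source the file and run check_crash_setup as root before the
next panic rather than after it.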

That's it in a nutshell.  There's much more detail in chapter 6 of my
debug tutorial, which you can find at
http://www.lemis.com/grog/Papers/Debug-tutorial/tutorial.pdf .

Greg
--
When replying to this message, please copy the original recipients.
If you don't, I may ignore the reply or reply to the original recipients.
For more information, see http://www.lemis.com/questions.html
See complete headers for address and phone numbers.
