Date:      Sat, 5 Jan 2019 11:01:28 +0100
From:      Jurij Kovačič <jurij.kovacic@ocpea.com>
To:        freebsd-stable@freebsd.org
Subject:   Re: Kernel panic on 11.2-RELEASE-p7
Message-ID:  <CADEDvuCB8%2BRkHvrhN8czyGha8A=DKwODYR-XA_fjip91qL0SBQ@mail.gmail.com>
In-Reply-To: <CADEDvuBa=oEO_4wF-Su3q=tuBNZnn6k3429xwG3eiOUZQyWmsw@mail.gmail.com>
References:  <CADEDvuBa=oEO_4wF-Su3q=tuBNZnn6k3429xwG3eiOUZQyWmsw@mail.gmail.com>

Dear list,

About a week ago, we had a kernel panic on FreeBSD 11.2-RELEASE-p7 with the
GENERIC kernel and ZFS root. As the kernel was not compiled with debug support
enabled, the resulting "vmcore" files were of little use. Consequently, I
recompiled the kernel with debug support:

--- GENERIC     2018-12-29 08:03:04.786846000 +0100
+++ DEBUG       2018-12-29 08:23:36.522966000 +0100
@@ -19,11 +19,16 @@
 # $FreeBSD: releng/11.2/sys/amd64/conf/GENERIC 333417 2018-05-09 16:14:12Z sbruno $

 cpu            HAMMER
-ident          GENERIC
+ident          DEBUG

 makeoptions    DEBUG=-g                # Build kernel with gdb(1) debug symbols
 makeoptions    WITH_CTF=1              # Run ctfconvert(1) for DTrace support

+# kernel debugging
+options                KDB
+options                KDB_UNATTENDED
+options                KDB_TRACE
+
 options        SCHED_ULE               # ULE scheduler
 options        PREEMPTION              # Enable kernel thread preemption
 options        INET                    # InterNETworking

and installed it.

After running for about a week, the server crashed again last night.
Unfortunately, there are no "vmcore" files in "/var/crash" this time.

The server has 12GB of RAM installed:
# sysctl hw.physmem
hw.physmem: 12843053056

and uses 2 swap partitions (2G each):
# swapinfo -h
Device          1K-blocks     Used    Avail Capacity
/dev/ada0p2       2097152     642M     1.4G    31%
/dev/ada1p2       2097152     638M     1.4G    31%
Total             4194304     1.3G     2.7G    31%

Dump device is set in /etc/rc.conf:
# grep dump /etc/rc.conf
# Set dumpdev to "AUTO" to enable crash dumps, "NO" to disable
dumpdev="AUTO"
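
The rc.conf entry only shows what is requested at boot. A minimal sketch
of checking what the running system actually uses follows. It assumes
dumpon(8), sysctl(8) and savecore(8) are available and run as root;
run_if_present is a hypothetical helper introduced here only so the
script does not fail where a tool is missing.

```shell
#!/bin/sh
# Hypothetical helper: run a command only if it exists on this system.
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "skipped: $1 not found"
    fi
}

# List the dump device the kernel is currently configured to use.
run_if_present dumpon -l

# Show whether minidumps (kernel pages only) are enabled.
run_if_present sysctl -n debug.minidump

# Check whether an unsaved dump is still present on the dump device.
run_if_present savecore -C
```

If the first command reports no device, the panic had nowhere to write
a dump, regardless of the rc.conf setting.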

There seems to be enough space left in "/var/crash":
# zfs list | grep crash
zroot/var/crash      857M  17.2G   857M  /var/crash

and, as I said earlier, the system DID create "vmcore" files when crashing
with the GENERIC kernel. Is it possible that the swap partition(s) are too
small for the memory dump now that the kernel is compiled with debug
support? Or is some additional configuration needed to make the system save
vmcore files?
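
To make the size question concrete, here is a rough comparison that uses
only the numbers shown above. It assumes the dump is written to a single
swap partition and that a full (non-mini) dump is roughly the size of
physical memory. Both are assumptions on my part; with minidumps enabled
only kernel pages are written, so the real dump may be much smaller.

```shell
#!/bin/sh
# Values copied from the sysctl and swapinfo output above.
physmem_bytes=12843053056      # hw.physmem, in bytes
swap_part_kib=2097152          # one swap partition, in 1K blocks

# Convert both to MiB so they can be compared directly.
physmem_mib=$((physmem_bytes / 1024 / 1024))
swap_part_mib=$((swap_part_kib / 1024))

echo "physical memory:    ${physmem_mib} MiB"
echo "one swap partition: ${swap_part_mib} MiB"

if [ "$physmem_mib" -gt "$swap_part_mib" ]; then
    echo "a full memory dump would not fit on one swap partition"
else
    echo "a full memory dump would fit on one swap partition"
fi
```

This compares about 12248 MiB of RAM with 2048 MiB per partition, so
only a minidump could fit, and only if the kernel memory in use at the
time of the panic stays below that size.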

Please advise.

Kind regards,
Jurij

On Tue, Dec 25, 2018 at 7:57 AM Jurij Kovačič <jurij.kovacic@ocpea.com> wrote:

> Dear list,
>
> I hope I am posting this to the correct list - if not, I apologize (and
> please advise where to post this instead).
>
> Today I experienced a kernel panic on a (physical) server running FreeBSD
> 11.2-RELEASE-p7 with the GENERIC kernel and ZFS root:
>
> Fatal trap 9: general protection fault while in kernel mode
> cpuid = 0; apic id = 00
> instruction pointer    = 0x20:0xffffffff82299013
> stack pointer            = 0x28:0xfffffe0352893ad0
> frame pointer            = 0x28:0xfffffe0352893b10
> code segment        = base 0x0, limit 0xfffff, type 0x1b
>             = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags    = interrupt enabled, resume, IOPL = 0
> current process        = 9 (dbuf_evict_thread)
> trap number        = 9
> panic: general protection fault
> cpuid = 0
> KDB: stack backtrace:
> #0 0xffffffff80b3d577 at kdb_backtrace+0x67
> #1 0xffffffff80af6b17 at vpanic+0x177
> #2 0xffffffff80af6993 at panic+0x43
> #3 0xffffffff80f77fdf at trap_fatal+0x35f
> #4 0xffffffff80f7759e at trap+0x5e
> #5 0xffffffff80f5808c at calltrap+0x8
> #6 0xffffffff8229c049 at dbuf_evict_one+0xe9
> #7 0xffffffff82297a15 at dbuf_evict_thread+0x1a5
> #8 0xffffffff80aba093 at fork_exit+0x83
> #9 0xffffffff80f58fae at fork_trampoline+0xe
>
> I used the "crashinfo" utility to generate a text file, which is
> available at this URL: http://www.ocpea.com/dump/core.txt
>
> At the time of the crash, the server was probably under heavier I/O load
> than usual (scheduled backup with rsync).
>
> This is a production server, so naturally, all advice is deeply
> appreciated. :)
>
> Kind regards,
> Jurij
>


