Date:      Mon, 29 May 2017 11:20:43 +0200
From:      Raimo Niskanen <raimo+freebsd@erix.ericsson.se>
To:        <freebsd-questions@freebsd.org>
Subject:   Advice on kernel panics
Message-ID:  <20170529092043.GA89682@erix.ericsson.se>

Hello list.

I have a server that panics about every 3 days and need some advice on how
to handle that.

It currently has 7 dumps in /var/crash/. The head of the latest one,
core.txt.4, looks like this:


=======
sasquatch.otp.ericsson.se dumped core - see /var/crash/vmcore.4

Mon May 29 03:15:32 CEST 2017

FreeBSD sasquatch.otp.ericsson.se 10.3-RELEASE-p18 FreeBSD 10.3-RELEASE-p18
#0: Tue Apr 11 10:31:00 UTC 2017
root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64

panic: page fault

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff809fb017
stack pointer           = 0x28:0xfffffe04673a18c0
frame pointer           = 0x28:0xfffffe04673a1900
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 18 (syncer)
trap number             = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff8098e7e0 at kdb_backtrace+0x60
#1 0xffffffff809514b6 at vpanic+0x126
#2 0xffffffff80951383 at panic+0x43
#3 0xffffffff80d5646b at trap_fatal+0x36b
#4 0xffffffff80d5676d at trap_pfault+0x2ed
#5 0xffffffff80d55dea at trap+0x47a
#6 0xffffffff80d3bdb2 at calltrap+0x8
#7 0xffffffff809f9b23 at vfs_msync+0x203
#8 0xffffffff809fb858 at sync_fsync+0x108
#9 0xffffffff80e81ed7 at VOP_FSYNC_APV+0xa7
#10 0xffffffff809fc27b at sched_sync+0x3ab
#11 0xffffffff8091a93a at fork_exit+0x9a
#12 0xffffffff80d3c2ee at fork_trampoline+0xe
Uptime: 2d19h53m15s
=======


What sticks out later in core.txt.4 is the fstat section, which contains a
lot of errors, but I cannot tell whether that is just a secondary symptom...

Looks like this:
=======
fstat

fstat: can't read file 1 at 0x200007fffffffff
fstat: can't read file 2 at 0x4000000001fffff
fstat: can't read znode_phys at 0x1
fstat: can't read znode_phys at 0x1
fstat: can't read znode_phys at 0x1
:
USER     CMD          PID   FD MOUNT      INUM MODE         SZ|DV R/W
root     sed        78401 root -         -       error    -
root     sed        78401   wd -         -       error    -
root     sed        78401 text -         -       error    -
root     sed        78401    0* pipe fffff8001800f000 <-> fffff8001800f160      0 rw
root     grep       78400 root -         -       error    -
root     grep       78400   wd -         -       error    -
root     grep       78400 text -         -       error    -
:
=======

To me, the other core.txt.? files do not look exactly the same.  All of them
have an fstat section with many errors, though.

Does anyone have some advice on how to proceed?
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB
