Date:      Thu, 1 Jun 2017 15:53:36 +0200
From:      Raimo Niskanen <raimo+freebsd@erix.ericsson.se>
To:        <freebsd-questions@freebsd.org>
Subject:   Re: Advice on kernel panics
Message-ID:  <20170601135336.GD2256@erix.ericsson.se>
In-Reply-To: <20170529092043.GA89682@erix.ericsson.se>
References:  <20170529092043.GA89682@erix.ericsson.se>

Hello again.

I gave too little detail in my original post; this concerns a Dell PowerEdge
R320 server with the motherboard disk controller and a ZFS-only install.

The dmesg output is at the end of this mail.


On Mon, May 29, 2017 at 11:20:43AM +0200, Raimo Niskanen wrote:
> Hello list.
> 
> I have a server that panics about every 3 days and need some advice on how
> to handle that.
> 
> It currently has 7 dumps in /var/crash/, head of the latest core.txt.4
> looks like this:
> 
> 
> =======
> sasquatch.otp.ericsson.se dumped core - see /var/crash/vmcore.4
> 
> Mon May 29 03:15:32 CEST 2017
> 
> FreeBSD sasquatch.otp.ericsson.se 10.3-RELEASE-p18 FreeBSD 10.3-RELEASE-p18
> #0: Tue Apr 11 10:31:00 UTC 2017
> root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64
> 
> panic: page fault
> 
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you
> are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd"...
> 
> Unread portion of the kernel message buffer:
> 
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address   = 0x0
> fault code              = supervisor write data, page not present
> instruction pointer     = 0x20:0xffffffff809fb017
> stack pointer           = 0x28:0xfffffe04673a18c0
> frame pointer           = 0x28:0xfffffe04673a1900
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 18 (syncer)
> trap number             = 12
> panic: page fault
> cpuid = 0
> KDB: stack backtrace:
> #0 0xffffffff8098e7e0 at kdb_backtrace+0x60
> #1 0xffffffff809514b6 at vpanic+0x126
> #2 0xffffffff80951383 at panic+0x43
> #3 0xffffffff80d5646b at trap_fatal+0x36b
> #4 0xffffffff80d5676d at trap_pfault+0x2ed
> #5 0xffffffff80d55dea at trap+0x47a
> #6 0xffffffff80d3bdb2 at calltrap+0x8
> #7 0xffffffff809f9b23 at vfs_msync+0x203
> #8 0xffffffff809fb858 at sync_fsync+0x108
> #9 0xffffffff80e81ed7 at VOP_FSYNC_APV+0xa7
> #10 0xffffffff809fc27b at sched_sync+0x3ab
> #11 0xffffffff8091a93a at fork_exit+0x9a
> #12 0xffffffff80d3c2ee at fork_trampoline+0xe
> Uptime: 2d19h53m15s
> =======
> 
> 
> What sticks out later in core.txt.4 is the fstat section that contains a
> lot of errors, but I can not tell if that is just a secondary symptom...
> 
> Looks like this:
> =======
> fstat
> 
> fstat: can't read file 1 at 0x200007fffffffff
> fstat: can't read file 2 at 0x4000000001fffff
> fstat: can't read znode_phys at 0x1
> fstat: can't read znode_phys at 0x1
> fstat: can't read znode_phys at 0x1
> :
> USER     CMD          PID   FD MOUNT      INUM MODE         SZ|DV R/W
> root     sed        78401 root -         -       error    -
> root     sed        78401   wd -         -       error    -
> root     sed        78401 text -         -       error    -
> root     sed        78401    0* pipe fffff8001800f000 <-> fffff8001800f160
>    0 rw
> root     grep       78400 root -         -       error    -
> root     grep       78400   wd -         -       error    -
> root     grep       78400 text -         -       error    -
> :
> =======
> 
> To me the other core.txt.? files do not look exactly the same.  All have
> an fstat section with many errors, though.
> 
> Does anyone have some advice on how to proceed?
> -- 

Copyright (c) 1992-2016 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 10.3-RELEASE-p18 #0: Tue Apr 11 10:31:00 UTC 2017
    root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64
FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
CPU: Intel(R) Xeon(R) CPU E5-2407 v2 @ 2.40GHz (2400.06-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x306e4  Family=0x6  Model=0x3e  Stepping=4

Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>

Features2=0x7fbee3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  Structured Extended Features=0x281<FSGSBASE,SMEP,ERMS>
  XSAVE Features=0x1<XSAVEOPT>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
  TSC: P-state invariant, performance statistics
real memory  = 12884901888 (12288 MB)
avail memory = 12380942336 (11807 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <DELL   PE_SC3  >
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  2
 cpu2 (AP): APIC ID:  4
 cpu3 (AP): APIC ID:  6
random: <Software, Yarrow> initialized
ioapic1: Changing APIC ID to 1
ioapic0 <Version 2.0> irqs 0-23 on motherboard
ioapic1 <Version 2.0> irqs 32-55 on motherboard
kbd1 at kbdmux0
acpi0: <DELL PE_SC3> on motherboard
acpi0: Power Button (fixed)
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
atrtc0: <AT realtime clock> port 0x70-0x7f irq 8 on acpi0
Event timer "RTC" frequency 32768 Hz quality 0
attimer0: <AT timer> port 0x40-0x5f irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 550
Event timer "HPET1" frequency 14318180 Hz quality 440
Event timer "HPET2" frequency 14318180 Hz quality 440
Event timer "HPET3" frequency 14318180 Hz quality 440
Event timer "HPET4" frequency 14318180 Hz quality 440
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> irq 53 at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> irq 53 at device 3.0 on pci0
pci8: <ACPI PCI bus> on pcib2
bge0: <Broadcom NetXtreme Gigabit Ethernet, ASIC rev. 0x5720000> mem
0xd90a0000-0xd90affff,0xd90b0000-0xd90bffff,0xd90c0000-0xd90cffff irq 48 at
device 0.0 on pci8
bge0: APE FW version: NCSI v1.2.33.0
bge0: CHIP ID 0x05720000; ASIC REV 0x5720; CHIP REV 0x57200; PCI-E
miibus0: <MII bus> on bge0
brgphy0: <BCM5720C 1000BASE-T media interface> PHY 1 on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge0: Using defaults for TSO: 65518/35/2048
bge0: Ethernet address: 00:0a:f7:52:b1:1a
bge1: <Broadcom NetXtreme Gigabit Ethernet, ASIC rev. 0x5720000> mem
0xd90d0000-0xd90dffff,0xd90e0000-0xd90effff,0xd90f0000-0xd90fffff irq 52 at
device 0.1 on pci8
bge1: APE FW version: NCSI v1.2.33.0
bge1: CHIP ID 0x05720000; ASIC REV 0x5720; CHIP REV 0x57200; PCI-E
miibus1: <MII bus> on bge1
brgphy1: <BCM5720C 1000BASE-T media interface> PHY 2 on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge1: Using defaults for TSO: 65518/35/2048
bge1: Ethernet address: 00:0a:f7:52:b1:1b
pcib3: <PCI-PCI bridge> irq 16 at device 17.0 on pci0
pci9: <PCI bus> on pcib3
pci0: <simple comms> at device 22.0 (no driver attached)
pci0: <simple comms> at device 22.1 (no driver attached)
ehci0: <Intel Patsburg USB 2.0 controller> mem 0xde8fd000-0xde8fd3ff irq 23
at device 26.0 on pci0
usbus0: EHCI version 1.0
usbus0 on ehci0
pcib4: <ACPI PCI-PCI bridge> at device 28.0 on pci0
pci10: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> irq 16 at device 28.4 on pci0
pci2: <ACPI PCI bus> on pcib5
bge2: <Broadcom NetXtreme Gigabit Ethernet, ASIC rev. 0x5720000> mem
0xd91a0000-0xd91affff,0xd91b0000-0xd91bffff,0xd91c0000-0xd91cffff irq 16 at
device 0.0 on pci2
bge2: APE FW version: NCSI v1.2.33.0
bge2: CHIP ID 0x05720000; ASIC REV 0x5720; CHIP REV 0x57200; PCI-E
miibus2: <MII bus> on bge2
brgphy2: <BCM5720C 1000BASE-T media interface> PHY 1 on miibus2
brgphy2:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge2: Using defaults for TSO: 65518/35/2048
bge2: Ethernet address: c8:1f:66:bc:10:cd
bge3: <Broadcom NetXtreme Gigabit Ethernet, ASIC rev. 0x5720000> mem
0xd91d0000-0xd91dffff,0xd91e0000-0xd91effff,0xd91f0000-0xd91fffff irq 17 at
device 0.1 on pci2
bge3: APE FW version: NCSI v1.2.33.0
bge3: CHIP ID 0x05720000; ASIC REV 0x5720; CHIP REV 0x57200; PCI-E
miibus3: <MII bus> on bge3
brgphy3: <BCM5720C 1000BASE-T media interface> PHY 2 on miibus3
brgphy3:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge3: Using defaults for TSO: 65518/35/2048
bge3: Ethernet address: c8:1f:66:bc:10:ce
pcib6: <ACPI PCI-PCI bridge> irq 19 at device 28.7 on pci0
pci3: <ACPI PCI bus> on pcib6
pcib7: <PCI-PCI bridge> at device 0.0 on pci3
pci4: <PCI bus> on pcib7
pcib8: <PCI-PCI bridge> at device 0.0 on pci4
pci5: <PCI bus> on pcib8
pcib9: <PCI-PCI bridge> at device 0.0 on pci5
pci6: <PCI bus> on pcib9
vgapci0: <VGA-compatible display> mem
0xd8000000-0xd8ffffff,0xddffc000-0xddffffff,0xdd000000-0xdd7fffff irq 19 at
device 0.0 on pci6
vgapci0: Boot video device
pcib10: <PCI-PCI bridge> at device 1.0 on pci4
pci7: <PCI bus> on pcib10
ehci1: <Intel Patsburg USB 2.0 controller> mem 0xde8fe000-0xde8fe3ff irq 22
at device 29.0 on pci0
usbus1: EHCI version 1.0
usbus1 on ehci1
pcib11: <PCI-PCI bridge> at device 30.0 on pci0
pci11: <PCI bus> on pcib11
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
ahci0: <Intel Patsburg AHCI SATA controller> port
0xfce8-0xfcef,0xfcf8-0xfcfb,0xfcf0-0xfcf7,0xfcfc-0xfcff,0xfcc0-0xfcdf mem
0xde8ff000-0xde8ff7ff irq 20 at device 31.2 on pci0
ahci0: AHCI v1.30 with 6 3Gbps ports, Port Multiplier not supported
ahcich0: <AHCI channel> at channel 0 on ahci0
ahcich1: <AHCI channel> at channel 1 on ahci0
ahcich2: <AHCI channel> at channel 2 on ahci0
ahcich3: <AHCI channel> at channel 3 on ahci0
ahcich4: <AHCI channel> at channel 4 on ahci0
ahciem0: <AHCI enclosure management bridge> on ahci0
pcib12: <ACPI Host-PCI bridge> on acpi0
pci63: <ACPI PCI bus> on pcib12
pcib13: <ACPI Host-PCI bridge> on acpi0
pci127: <ACPI PCI bus> on pcib13
uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xec000-0xeffff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
ppc0: cannot reserve I/O port range
est0: <Enhanced SpeedStep Frequency Control> on cpu0
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 1d4d00001800
device_attach: est0 attach returned 6
est1: <Enhanced SpeedStep Frequency Control> on cpu1
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 1d4d00001800
device_attach: est1 attach returned 6
est2: <Enhanced SpeedStep Frequency Control> on cpu2
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 1d4d00001800
device_attach: est2 attach returned 6
est3: <Enhanced SpeedStep Frequency Control> on cpu3
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 1d4d00001800
device_attach: est3 attach returned 6
ZFS filesystem version: 5
ZFS storage pool version: features support (5000)
Timecounters tick every 1.000 msec
random: unblocking device.
usbus0: 480Mbps High Speed USB v2.0
usbus1: 480Mbps High Speed USB v2.0
ugen0.1: <Intel> at usbus0
uhub0: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus0
ugen1.1: <Intel> at usbus1
uhub1: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
ses0 at ahciem0 bus 0 scbus5 target 0 lun 0
ses0: <AHCI SGPIO Enclosure 1.00 0001> SEMB S-E-S 2.00 device
ses0: SEMB SES Device
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: <WDC WD5003ABYX-18WERA0 01.01S04> ATA8-ACS SATA 2.x device
ada0: Serial Number WD-WMAYP8034312
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 476940MB (976773168 512 byte sectors)
ada0: Previously was known as ad4
ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
ada1: <WDC WD10EFRX-68PJCN0 82.00A82> ACS-2 ATA SATA 3.x device
ada1: Serial Number WD-WCC4JDU1EVHN
ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 953869MB (1953525168 512 byte sectors)
ada1: quirks=0x1<4K>
ada1: Previously was known as ad6
cd0 at ahcich4 bus 0 scbus4 target 0 lun 0
cd0: <TSSTcorp DVD-ROM SN-108FB D150> Removable CD-ROM SCSI device
cd0: Serial Number S1596YBF3001M9
cd0: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO 8192bytes)
cd0: Attempt to query device size failed: NOT READY, Medium not present -
tray closed
SMP: AP CPU #2 Launched!
SMP: AP CPU #1 Launched!
SMP: AP CPU #3 Launched!
Timecounter "TSC-low" frequency 1200028244 Hz quality 1000
GEOM_MIRROR: Device mirror/swap launched (2/2).
Root mount waiting for: usbus1 usbus0
uhub1: 2 ports with 2 removable, self powered
uhub0: 2 ports with 2 removable, self powered
Root mount waiting for: usbus1 usbus0
ugen1.2: <vendor 0x8087> at usbus1
uhub2: <vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2> on
usbus1
ugen0.2: <vendor 0x8087> at usbus0
uhub3: <vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2> on
usbus0
Root mount waiting for: usbus1 usbus0
uhub3: 6 ports with 6 removable, self powered
uhub2: 8 ports with 8 removable, self powered
Root mount waiting for: usbus1 usbus0
ugen0.3: <no manufacturer> at usbus0
uhub4: <no manufacturer Gadget USB HUB, class 9/0, rev 2.00/0.00, addr 3>
on usbus0
ugen1.3: <vendor 0x0557> at usbus1
uhub5: <vendor 0x0557 product 0x8021, class 9/0, rev 1.10/1.00, addr 3> on
usbus1
uhub5: 4 ports with 4 removable, self powered
uhub4: 6 ports with 6 removable, self powered
Root mount waiting for: usbus1 usbus0
ugen0.4: <Avocent> at usbus0
ukbd0: <Keyboard> on usbus0
kbd0 at ukbd0
ugen1.4: <ATEN> at usbus1
ukbd1: <Kb> on usbus1
kbd2 at ukbd1
Trying to mount root from zfs:zroot/ROOT/default []...

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB


