Date:      Thu, 17 Aug 2017 13:03:40 +0100
From:      Kaya Saman <kayasaman@gmail.com>
To:        freebsd-questions <freebsd-questions@freebsd.org>
Subject:   Upgrade to 11.1 from 10.3 causing complete system freeze / hang
Message-ID:  <c6bd6cc3-8549-521e-f891-47a396a61475@gmail.com>

Hi,


I've just been upgrading my systems from 10.3 to 11.1-RELEASE and so far 
all seems to have gone well, bar one system in particular.


I will try to explain as clearly as possible to avoid confusion.


This issue seems to be restricted to two systems only. Both run 
identical SuperMicro Celeron Mini-ITX system boards, so the hardware is 
the same, but their configurations are quite different.


System 1:


ZFS on root, running 2 jails, with 2 (non-root) hard drives configured as 
iSCSI targets. This machine uses a lagg interface and 2 VLANs in an 
802.1Q trunk.


<--- After the upgrade this system hung initially, but after a hard 
power-off/on it has now been up for a few days without any issues.


System 2:


UFS on root, with 3 zpools configured over 4 hard drives and exported 
through NFS. This system also uses a lagg interface, but no VLANs.


<--- This system hangs quite often: every few hours, or less with 
tuning in /boot/loader.conf and /etc/sysctl.conf!


An SSH session running 'top' shows this before the hang:


last pid: 40393;  load averages:  0.31,  0.22,  0.22    up 0+03:25:51  03:50:51
53 processes:  1 running, 52 sleeping
CPU:  0.0% user,  0.0% nice,  1.1% system,  0.1% interrupt, 98.8% idle
Mem: 2028K Active, 291M Inact, 11M Laundry, 7321M Wired, 249M Buf, 159M Free
ARC: 5654M Total, 664M MFU, 4568M MRU, 48K Anon, 99M Header, 322M Other
      4980M Compressed, 5237M Uncompressed, 1.05:1 Ratio
Swap: 2327M Total, 33M Used, 2293M Free, 1% Inuse


From the above, the system looks absolutely fine: no abnormal load, 
processor or memory usage.
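
To put the memory figures from the 'top' snapshot into proportion, here is a small sketch that parses the Mem: and ARC: lines and reports how much of the wired memory is ZFS ARC. The function name parse_sizes and the regular expression are made up for illustration and are not part of any FreeBSD tool; the sketch assumes top's K/M/G suffixes are powers of 1024.

```python
import re

# Conversion factors to MiB, assuming top(1) uses 1024-based suffixes.
UNITS = {"K": 1 / 1024, "M": 1.0, "G": 1024.0}


def parse_sizes(line):
    """Return {label: size in MiB} for a top(1) 'Mem:' or 'ARC:' line."""
    # Drop the 'Mem:' / 'ARC:' prefix, keep the comma-separated fields.
    _, _, body = line.partition(":")
    sizes = {}
    for value, unit, label in re.findall(r"(\d+)([KMG])\s+([A-Za-z]+)", body):
        sizes[label] = int(value) * UNITS[unit]
    return sizes


# The two lines are copied from the snapshot above.
mem = parse_sizes(
    "Mem: 2028K Active, 291M Inact, 11M Laundry, 7321M Wired, 249M Buf, 159M Free"
)
arc = parse_sizes(
    "ARC: 5654M Total, 664M MFU, 4568M MRU, 48K Anon, 99M Header, 322M Other"
)

print(f"Wired: {mem['Wired']:.0f} MiB, Free: {mem['Free']:.0f} MiB")
print(f"ARC share of wired memory: {arc['Total'] / mem['Wired']:.0%}")
```

For this snapshot the ARC accounts for roughly three quarters of the wired memory, which by itself says nothing about whether it is related to the hang.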


The dmesg output of the system is as follows:


Copyright (c) 1992-2017 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
     The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.1-RELEASE-p1 #0: Wed Aug  9 11:55:48 UTC 2017
root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64
FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on LLVM 4.0.0)
VT(vga): resolution 640x480
CPU: Intel(R) Celeron(R) CPU  J1900  @ 1.99GHz (2000.05-MHz K8-class CPU)
   Origin="GenuineIntel"  Id=0x30678  Family=0x6  Model=0x37 Stepping=8
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
Features2=0x41d8e3bf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,TSCDLT,RDRAND>
   AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
   AMD Features2=0x101<LAHF,Prefetch>
   Structured Extended Features=0x2282<TSCADJ,SMEP,ERMS,NFPUSG>
   VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
   TSC: P-state invariant, performance statistics
real memory  = 8589934592 (8192 MB)
avail memory = 8106274816 (7730 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <SUPERM SMCI--MB>
WARNING: L1 data cache covers less APIC IDs than a core
0 < 1
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
random: unblocking device.
ACPI BIOS Warning (bug): 32/64X length mismatch in FADT/Gpe0Block: 128/32 (20170303/tbfadt-748)
ioapic0 <Version 2.0> irqs 0-86 on motherboard
SMP: AP CPU #2 Launched!
SMP: AP CPU #3 Launched!
SMP: AP CPU #1 Launched!
Timecounter "TSC" frequency 2000049240 Hz quality 1000
random: entropy device external interface
kbd1 at kbdmux0
netmap: loaded module
module_register_init: MOD_LOAD (vesa, 0xffffffff80f5b220, 0) error 19
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
nexus0
vtvga0: <VT VGA driver> on motherboard
cryptosoft0: <software crypto> on motherboard
acpi0: <SUPERM SMCI--MB> on motherboard
acpi0: Power Button (fixed)
unknown: I/O range not supported
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
atrtc0: <AT realtime clock> port 0x70-0x77 on acpi0
atrtc0: Warning: Couldn't map I/O.
Event timer "RTC" frequency 32768 Hz quality 0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff irq 8 on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 450
Event timer "HPET1" frequency 14318180 Hz quality 440
Event timer "HPET2" frequency 14318180 Hz quality 440
attimer0: <AT timer> port 0x40-0x43,0x50-0x53 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pcib0: _OSC returned error 0x10
pci0: <ACPI PCI bus> on pcib0
vgapci0: <VGA-compatible display> port 0xe080-0xe087 mem 0x90000000-0x903fffff,0x80000000-0x8fffffff irq 16 at device 2.0 on pci0
vgapci0: Boot video device
ahci0: <AHCI SATA controller> port 0xe070-0xe077,0xe060-0xe063,0xe050-0xe057,0xe040-0xe043,0xe020-0xe03f mem 0x90a06000-0x90a067ff irq 19 at device 19.0 on pci0
ahci0: AHCI v1.30 with 2 3Gbps ports, Port Multiplier not supported
ahcich1: <AHCI channel> at channel 1 on ahci0
pci0: <encrypt/decrypt> at device 26.0 (no driver attached)
hdac0: <Intel BayTrail HDA Controller> mem 0x90a00000-0x90a03fff irq 22 at device 27.0 on pci0
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
pcib1: [GIANT-LOCKED]
pcib2: <ACPI PCI-PCI bridge> irq 18 at device 28.2 on pci0
pcib2: [GIANT-LOCKED]
pci1: <ACPI PCI bus> on pcib2
igb0: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xd000-0xd01f mem 0x90900000-0x9097ffff,0x90980000-0x90983fff irq 18 at device 0.0 on pci1
igb0: Using MSIX interrupts with 5 vectors
igb0: Ethernet address: 0c:c4:7a:b0:60:92
igb0: Bound queue 0 to cpu 0
igb0: Bound queue 1 to cpu 1
igb0: Bound queue 2 to cpu 2
igb0: Bound queue 3 to cpu 3
igb0: netmap queues/slots: TX 4/1024, RX 4/1024
pcib3: <ACPI PCI-PCI bridge> irq 19 at device 28.3 on pci0
pcib3: [GIANT-LOCKED]
pci2: <ACPI PCI bus> on pcib3
pcib4: <ACPI PCI-PCI bridge> mem 0x90800000-0x90803fff irq 19 at device 0.0 on pci2
pci3: <ACPI PCI bus> on pcib4
pcib5: <PCI-PCI bridge> irq 16 at device 1.0 on pci3
pci4: <PCI bus> on pcib5
igb1: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xc000-0xc01f mem 0x90700000-0x9077ffff,0x90780000-0x90783fff irq 16 at device 0.0 on pci4
igb1: Using MSIX interrupts with 5 vectors
igb1: Ethernet address: 0c:c4:7a:b0:60:93
igb1: Bound queue 0 to cpu 0
igb1: Bound queue 1 to cpu 1
igb1: Bound queue 2 to cpu 2
igb1: Bound queue 3 to cpu 3
igb1: netmap queues/slots: TX 4/1024, RX 4/1024
pcib6: <PCI-PCI bridge> irq 17 at device 2.0 on pci3
pci5: <PCI bus> on pcib6
pcib7: <PCI-PCI bridge> irq 18 at device 3.0 on pci3
pci6: <PCI bus> on pcib7
ahci1: <Marvell 88SE9230 AHCI SATA controller> port 0xb050-0xb057,0xb040-0xb043,0xb030-0xb037,0xb020-0xb023,0xb000-0xb01f mem 0x90610000-0x906107ff irq 18 at device 0.0 on pci6
ahci1: AHCI v1.20 with 8 6Gbps ports, Port Multiplier not supported
ahci1: quirks=0x900<NOBSYRES,ALTSIG>
ahcich2: <AHCI channel> at channel 0 on ahci1
ahcich3: <AHCI channel> at channel 1 on ahci1
ahcich4: <AHCI channel> at channel 2 on ahci1
ahcich5: <AHCI channel> at channel 3 on ahci1
ahcich6: <AHCI channel> at channel 4 on ahci1
ahcich7: <AHCI channel> at channel 5 on ahci1
ahcich8: <AHCI channel> at channel 6 on ahci1
ahcich9: <AHCI channel> at channel 7 on ahci1
ehci0: <Intel BayTrail USB 2.0 controller> mem 0x90a05000-0x90a053ff irq 23 at device 29.0 on pci0
usbus0: EHCI version 1.0
usbus0 on ehci0
usbus0: 480Mbps High Speed USB v2.0
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
acpi_button0: <Power Button> on acpi0
acpi_button1: <Sleep Button> on acpi0
acpi_tz0: <Thermal Zone> on acpi0
uart0: <16950 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart2: <16950 or compatible> port 0x3e0-0x3e7 irq 3 on acpi0
uart3: <16950 or compatible> port 0x3e8-0x3ef irq 4 on acpi0
uart4: <16950 or compatible> port 0x2e0-0x2e7 irq 3 on acpi0
orm0: <ISA Option ROM> at iomem 0xd2000-0xd2fff on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
fdc0: <Enhanced floppy controller> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
ppc0: cannot reserve I/O port range
est0: <Enhanced SpeedStep Frequency Control> on cpu0
est1: <Enhanced SpeedStep Frequency Control> on cpu1
est2: <Enhanced SpeedStep Frequency Control> on cpu2
est3: <Enhanced SpeedStep Frequency Control> on cpu3
Timecounters tick every 1.000 msec
nvme cam probe device init
hdacc0: <Realtek ALC888 HDA CODEC> at cad 0 on hdac0
hdaa0: <Realtek ALC888 Audio Function Group> at nid 1 on hdacc0
pcm0: <Realtek ALC888 (Front Analog)> at nid 27 and 25 on hdaa0
pcm1: <Realtek ALC888 (Internal Digital)> at nid 17 on hdaa0
hdacc1: <Intel (0x2882) HDA CODEC> at cad 2 on hdac0
hdaa1: <Intel (0x2882) Audio Function Group> at nid 1 on hdacc1
hdaa1: hdaa_audio_as_parse: Duplicate pin 0 (5) in association 1! Disabling association.
pcm2: <Intel (0x2882) (HDMI/DP 8ch)> at nid 6 on hdaa1
ugen0.1: <Intel EHCI root HUB> at usbus0
uhub0: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus0
ada0 at ahcich1 bus 0 scbus0 target 0 lun 0
ada0: <INTEL SSDSA2M040G2GC 2CV102HB> ATA-7 SATA 2.x device
ada0: Serial Number CVGB007000G3040GGN
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 38166MB (78165360 512 byte sectors)
ada0: quirks=0x1<4K>
ada1 at ahcich2 bus 0 scbus1 target 0 lun 0
ada1: <WDC WD60EZRX-00MVLB1 80.00A80> ACS-2 ATA SATA 3.x device
ada1: Serial Number WD-WX21D947NY6S
ada1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 5723166MB (11721045168 512 byte sectors)
ada1: quirks=0x1<4K>
ada2 at ahcich3 bus 0 scbus2 target 0 lun 0
ada2: <WDC WD2001FASS-00U0B0 01.00101> ATA8-ACS SATA 2.x device
ada2: Serial Number WD-WMAUR0169440
ada2: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada2: Command Queueing enabled
ada2: 1907729MB (3907029168 512 byte sectors)
ada3 at ahcich4 bus 0 scbus3 target 0 lun 0
ada3: <HGST HDS724040ALE640 MJAOA580> ATA8-ACS SATA 3.x device
ada3: Serial Number PK2334PBG4GTDT
ada3: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada3: Command Queueing enabled
ada3: 3815447MB (7814037168 512 byte sectors)
ada4 at ahcich5 bus 0 scbus4 target 0 lun 0
ada4: <WDC WD2001FASS-00U0B0 01.00101> ATA8-ACS SATA 2.x device
ada4: Serial Number WD-WMAUR0169605
ada4: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada4: Command Queueing enabled
ada4: 1907729MB (3907029168 512 byte sectors)
pass5 at ahcich9 bus 0 scbus8 target 0 lun 0
pass5: <Marvell Console 1.01> Removable Processor SCSI device
pass5: Serial Number HKDP221516WL
pass5: 150.000MB/s transfers (SATA 1.x, UDMA4, ATAPI 12bytes, PIO 8192bytes)
Trying to mount root from ufs:/dev/ada0s1a [rw]...
ZFS filesystem version: 5
ZFS storage pool version: features support (5000)
uhub0: 8 ports with 8 removable, self powered
ugen0.2: <vendor 0x8087 product 0x07e6> at usbus0
uhub1 on uhub0
uhub1: <vendor 0x8087 product 0x07e6, class 9/0, rev 2.00/0.14, addr 2> on usbus0
uhub1: 4 ports with 4 removable, self powered
ugen0.3: <vendor 0x0409 product 0x005a> at usbus0
uhub2 on uhub1
uhub2: <vendor 0x0409 product 0x005a, class 9/0, rev 2.00/1.00, addr 3> on usbus0
uhub2: 4 ports with 4 removable, self powered
ugen0.4: <vendor 0x046a product 0x002f> at usbus0
ukbd0 on uhub2
ukbd0: <vendor 0x046a product 0x002f, class 0/0, rev 2.00/1.00, addr 4> on usbus0
kbd2 at ukbd0
lagg0: link state changed to DOWN
ums0 on uhub2
ums0: <vendor 0x046a product 0x002f, class 0/0, rev 2.00/1.00, addr 4> on usbus0
ums0: 3 buttons and [XY] coordinates ID=0
igb0: link state changed to UP
lagg0: link state changed to UP
igb1: link state changed to UP


uname -a output:

11.1-RELEASE-p1 FreeBSD 11.1-RELEASE-p1 #0: Wed Aug  9 11:55:48 UTC 2017 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64


I'm not sure what more information I could provide, but this issue only 
started after the upgrade, so I'm wondering if it's a bug, perhaps 
one that has already been reported?


Unfortunately, after the system becomes totally unresponsive I don't see 
any error messages on the local terminal, and after reboot there is no 
core dump, so I have no idea what's going on.


Would anyone be able to offer any advice?


Thanks.


Kaya



