Date:      Wed, 11 Mar 2015 16:12:23 +0000
From:      bugzilla-noreply@freebsd.org
To:        freebsd-bugs@FreeBSD.org
Subject:   [Bug 195458] Hang on shutdown/root unmount after FreeBSD 10.1R upgrade
Message-ID:  <bug-195458-8-EsTwlPntDW@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-195458-8@https.bugs.freebsd.org/bugzilla/>
References:  <bug-195458-8@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=195458

--- Comment #54 from elofu17@hotmail.com ---
Install a VirtualBox VM via 10.1-RELEASE CD.
Reboot after installer: OK

Boot the installed system.
Reboot after doing nothing at all but logging in: OK

Boot again.
Run 'freebsd-update fetch install' (to 10.1-RELEASE-p6, 705 patches, 462 files)
Now I take a snapshot in VirtualBox - "Updates installed, not rebooted yet".
I now run 'reboot': Fail
  Syncing disks, vnodes remaining...3 1 0 done
  All buffers synced.

Poweroff in VirtualBox.
Boot the machine again.
Reboot: OK.



I now Restore Snapshot "Updates installed, not rebooted yet" and start the VM
again.
'reboot' again fails:
  Syncing disks, vnodes remaining...3 1 0 0 done
  All buffers synced.

Restore again and run 'sync && reboot': Fail
  Syncing disks, vnodes remaining...3 1 0 0 done
  All buffers synced.

Restore again.
sync
sync
sleep 30
reboot: Fail
  Syncing disks, vnodes remaining...3 1 0 0 done
  All buffers synced.

Restore again and run 'sync ; sleep 5 ; sync ; sleep 5 ; shutdown -r now': Fail
  Syncing disks, vnodes remaining...2 0 0 done
  All buffers synced.


So... It fails with both 'reboot' and 'shutdown -r now'.


Restore again.
shutdown now
  stopping cron
  stopping sshd
  stopping devd
  Writing entropy file:.
  .
  syslogd exiting on signal 15
  Enter full pathname of shell or RETURN for /bin/sh:
  <return>
  sync; sleep 5; reboot   
  Syncing disks, vnodes remaining...0 done
  All buffers synced.

Now it hangs for 20 seconds, so it looks like it once again failed, BUT...
Suddenly the machine reboots!!!

(Normally the machine reboots less than 1 second after the "All buffers synced"
message when I've run a 'reboot' command, so there must be a 20-second timeout
somewhere.)

Also, I see no root (/) fs warnings upon booting. Yay!




I went back and re-ran the 'sync ; sleep 5 ; sync ; sleep 5 ; shutdown -r now'
command and waited several minutes. No reboot. Fail.


Restore again, ran 'shutdown now', entered the single-user shell and ran 'reboot':
  Syncing disks, vnodes remaining...1 0 done
  All buffers synced.
After 20 seconds, the machine reboots.



Restore again, stopped devd and killed cron, syslogd, adjkerntz, dhclient,
sendmail and ran 'reboot'
  Syncing disks, vnodes remaining...1 0 done
  All buffers synced.
Nope, it fails. Waited several minutes.


Restore again, ran 'shutdown -ro now' (execute 'reboot' instead of signalling
init).
  Syncing disks, vnodes remaining...2 2 0 0 done
  All buffers synced.
Fail.


Restore again, ran 'shutdown -ron now' (prevent filesystem cache from being
flushed)
  Syncing disks, vnodes remaining...2 2 0 0 done
  All buffers synced.
Now the machine instantly reboots! Yay!
  / was not properly dismounted
  /: mount pending error: blocks 8512 files 5
  ...Rebuilding fs from journal...





Findings:
'reboot' and 'shutdown -r' give the same result.
A manual 'sync' beforehand makes no difference.
Running 'shutdown now' and hence entering single-user mode apparently does
something good: the reboot then succeeds, but only after a 20-second delay.
Some buffers seem to be connected to that 20-second timeout.
Not flushing the buffers at all on shutdown ('-n') avoids the hang entirely (but
leaves / not properly dismounted, so the fs has to be rebuilt from the journal).

You can easily reproduce this yourselves in VirtualBox to debug further. See
above.
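
To make the restore-and-retry loop less tedious, here is a minimal host-side
sketch of the "Poweroff, Restore Snapshot, start the VM again" cycle using
VBoxManage. The VM name "fbsd101" is a hypothetical placeholder, and the
run/reset_vm helper names are my own. By default it only prints the commands;
set DRY_RUN=0 to actually execute them.

```shell
#!/bin/sh
# Sketch only: VM is a hypothetical VM name; SNAP is the snapshot taken above.
VM="${VM:-fbsd101}"
SNAP="${SNAP:-Updates installed, not rebooted yet}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Power off the (possibly hung) VM, roll back to the snapshot, boot again.
reset_vm() {
    run VBoxManage controlvm "$VM" poweroff
    run VBoxManage snapshot "$VM" restore "$SNAP"
    run VBoxManage startvm "$VM"
}

# Usage: call reset_vm between attempts, then try the next
# reboot/shutdown variant inside the guest.
```

The guest-side commands ('reboot', 'shutdown -r now', and so on) still have to
be typed in the VM console; this only automates getting back to the same
starting state.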

The main problem is still a CRITICAL one, since even if you use the 'shutdown
now + single-user + 20 sec timeout' approach to get the machine to finally
_reboot_ OK, you still need console (KVM) access for single-user mode.
And if you use the 'shutdown -ron now' approach, you do get the much needed
reboot, but you also get a root fs that was not properly dismounted... :-(
So remote FreeBSD machines without any iLO/IPMI still suffer badly from this. I
hope someone will find a fix soon.

/Elof

-- 
You are receiving this mail because:
You are the assignee for the bug.


