Date: Wed, 29 Oct 2014 10:36:38 +0100
From: Bengt Ahlgren <bengta@sics.se>
To: Kevin Oberman <rkoberman@gmail.com>
Cc: FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>, Walter Hop <freebsd@spam.lifeforms.nl>
Subject: Re: System hang on shutdown when running freebsd-update
Message-ID: <uh7k33jgqu1.fsf@P142s.sics.se>
In-Reply-To: <CAN6yY1vyOO53dpZSVX103b510ArzoAdahPeM81tR=QsNDPv=uA@mail.gmail.com> (Kevin Oberman's message of "Tue, 28 Oct 2014 20:21:08 -0700")
References: <2B4EEDA7-C3D9-465A-B0C9-B5728D438077@spam.lifeforms.nl> <CAN6yY1vyOO53dpZSVX103b510ArzoAdahPeM81tR=QsNDPv=uA@mail.gmail.com>

Kevin Oberman <rkoberman@gmail.com> writes:

> On Tue, Oct 28, 2014 at 3:09 PM, Walter Hop <freebsd@spam.lifeforms.nl>
> wrote:
>
>> [Apologies for not replying directly to the thread; I found it at
>> https://lists.freebsd.org/pipermail/freebsd-stable/2014-October/080595.html
>> ]
>>
>> I noticed this same hang after upgrading from 10.0-RELEASE to 10.1-RC3 in
>> a VM running under VMware Fusion, so the problem appears still present.
>>
>> I could only make it happen in the single uptime just after the system was
>> freebsd-updated from FreeBSD 10.0 to 10.1-RC3.
>>
>> Here is a screenshot: http://lf.ms/wait-for-reboot.png
>>
>> It did not make any progress after 2 hours of waiting. When restarting the
>> VM, the disk was dirty.
>>
>> Some interesting facts:
>> - Note "swapoff: /dev/da0p2: Cannot allocate memory" in the screenshot
>> which might pose a clue. I haven’t seen this normally.
>> - FreeBSD does respond to ping while it is busy, so it is not a complete
>> "freeze".
>> - The VM is at 100% CPU while this is going on.
>>
>> I have created a snapshot of the VM in the failed state, so maybe some
>> useful information could be retrieved from it, although I don’t have any
>> experience with kernel debugging over VMware.
>>
>> Cheers,
>> WH
>>
>> --
>> Walter Hop | PGP key: https://lifeforms.nl/pgp
>>
> I am starting to suspect that some code that is needed to flush a resource
> that is blocking the complete shutdown is no longer available so waiting is
> not going to work. I tried a simple "shutdown now" and waited in single
> user mode for a minute before "reboot". It worked fine.
>
> This is based on guesswork, but seems to fit the symptoms.

Some more guesswork that fits Walter's symptoms better than Kevin's...

I have noticed that our server, which has a large amount of disk (three
ZFS pools with 22x4TB disks) and 128GB RAM, often takes quite some time
to shut down after syncing the disks.
The last time it was on the order of 10 minutes, but it has always
completed.

It seems to be related to swap. Swap is on dedicated GPT partitions on
two system disks, and during the 10 minutes, it first accesses the first
of these disks, then the other. I know for sure that the accesses to the
second disk must be to swap, because the swap partition is the only
partition currently in use on that disk. I believe that it had on the
order of 6GB pushed out to swap the last time. It is running 9.3-REL
without Denninger's ZFS patches, so it tends to push some stuff to swap.

Is there some swap GC going on before shutdown that can take this time?

Bengt
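As a rough sanity check on the figures above (about 6GB in swap and about 10 minutes of disk activity, both only approximate), the average read rate they imply can be worked out as below. This is a back-of-the-envelope sketch, assuming the delay really is swapped-out data being read back in before shutdown, which is still only a guess; the function name is made up for illustration.

```python
def implied_swap_rate_mib_s(swapped_gib: float, minutes: float) -> float:
    """Average rate in MiB/s needed to read `swapped_gib` GiB within `minutes`."""
    return swapped_gib * 1024 / (minutes * 60)


# Approximate figures from the message: ~6 GB swapped out, ~10 minutes.
rate = implied_swap_rate_mib_s(6, 10)
print(f"{rate:.1f} MiB/s")
```

A rate of roughly 10 MiB/s is far below the sequential throughput of a typical disk, so if the guess holds, the reads are probably small and scattered rather than one long sequential pass.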