FreeBSD Mail Archives

Date:      Wed, 29 Oct 2014 10:36:38 +0100
From:      Bengt Ahlgren <bengta@sics.se>
To:        Kevin Oberman <rkoberman@gmail.com>
Cc:        FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>, Walter Hop <freebsd@spam.lifeforms.nl>
Subject:   Re: System hang on shutdown when running freebsd-update
Message-ID:  <uh7k33jgqu1.fsf@P142s.sics.se>
In-Reply-To: <CAN6yY1vyOO53dpZSVX103b510ArzoAdahPeM81tR=QsNDPv=uA@mail.gmail.com> (Kevin Oberman's message of "Tue, 28 Oct 2014 20:21:08 -0700")
References:  <2B4EEDA7-C3D9-465A-B0C9-B5728D438077@spam.lifeforms.nl> <CAN6yY1vyOO53dpZSVX103b510ArzoAdahPeM81tR=QsNDPv=uA@mail.gmail.com>

Kevin Oberman <rkoberman@gmail.com> writes:

> On Tue, Oct 28, 2014 at 3:09 PM, Walter Hop <freebsd@spam.lifeforms.nl>
> wrote:
>
>> [Apologies for not replying directly to the thread; I found it at
>> https://lists.freebsd.org/pipermail/freebsd-stable/2014-October/080595.html
>> ]
>>
>> I noticed this same hang after upgrading from 10.0-RELEASE to 10.1-RC3 in
>> a VM running under VMware Fusion, so the problem appears still present.
>>
>> I could only make it happen in the single uptime just after the system
>> was freebsd-updated from FreeBSD 10.0 to 10.1-RC3.
>>
>> Here is a screenshot: http://lf.ms/wait-for-reboot.png
>>
>> It did not make any progress after 2 hours of waiting. When restarting
>> the VM, the disk was dirty.
>>
>> Some interesting facts:
>> - Note "swapoff: /dev/da0p2: Cannot allocate memory" in the screenshot
>> which might pose a clue. I haven’t seen this normally.
>> - FreeBSD does respond to ping while it is busy, so it is not a complete
>> "freeze".
>> - The VM is at 100% CPU while this is going on.
>>
>> I have created a snapshot of the VM in the failed state, so maybe some
>> useful information could be retrieved from it, although I don’t have any
>> experience with kernel debugging over VMware.
>>
>> Cheers,
>> WH
>>
>> --
>> Walter Hop | PGP key: https://lifeforms.nl/pgp
>>
>
> I am starting to suspect that some code that is needed to flush a resource
> that is blocking the complete shutdown is no longer available so waiting is
> not going to work. I tried a simple "shutdown now" and waited in single
> user mode for a minute before "reboot". It worked fine.
>
> This is based on guesswork, but seems to fit the symptoms.

Some more guesswork that fits Walter's symptoms better than Kevin's...

I have noticed that our server with large amounts of disk (three ZFS
pools with 22x4TB disks) and 128GB RAM often takes quite some time to
shut down after syncing the disks.  The last time it was in the order of
10 minutes, but it has always completed.

It seems to be related to swap.  Swap is on dedicated GPT partitions on
two system disks, and during the 10 minutes, it first accesses the first
of these disks, then the other.  I know for sure that the accesses to the
second disk must be to swap, because swap is the only partition currently
in use on that disk.

I believe that it had in the order of 6GB pushed out to swap the last
time.  It is running 9.3-REL without Denninger's ZFS patches, so it
tends to push some stuff to swap.

Is there some swap GC going on before shutdown that can take this time?

Bengt

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?uh7k33jgqu1.fsf>
