From: Colin Perkins
Date: Thu, 16 Oct 2014 11:26:47 +0100
Subject: Re: System hang on shutdown when running freebsd-update
To: Kevin Oberman
Cc: FreeBSD-STABLE Mailing List
Message-Id: <1951041E-D133-4301-B07C-5D4B5A17C521@csperkins.org>
References: <7479DC25-4451-4940-AFE7-7C81D08206D4@csperkins.org>

On 15 Oct 2014, at 18:23, Kevin Oberman wrote:
> On Wed, Oct 15, 2014 at 2:40 AM, Colin Perkins wrote:
> On 14 Oct 2014, at 18:09, Kevin Oberman wrote:
> > I thought that this was just a fluke, but it has now
> > happened three times,
> > so I guess it's now out of the "fluke" class.
> >
> > I have upgraded several times recently to each 10.1 BETA and RC. After the
> > first install pass to install the kernel and modules, the system shutdown
> > freezes at the very end. I see the buffers synced to the disks and get the
> > "All buffers synced" message. Then it just hangs. The disks are not marked
> > as clean and are fscked after a reset and boot.
> >
> > There is not much between the "All buffers synced" message and the call to
> > vfs_unmountall(), so I suspect it is hanging in that call. I admit that I
> > am pretty much lost whenever I look at the VFS code and I have not put a
> > lot of effort into going further. Just hoping that someone familiar with it
> > might have an idea.
> >
> > I have tried several reboots and all run normally. The problem only seems
> > to appear when upgrading the OS. It happened repeatedly when I tried to
> > reboot before doing the second "install" pass of freebsd-update, but not
> > after, so the kernel and world are not in sync. I am baffled as to what
> > could be going on, but it means I need to be at the system (a baby server)
> > when I upgrade, but not every time I upgrade. I know it happened on the
> > 10.0-RELEASE to 10.1-BETA1 and 10.1-RC1 to 10.1-RC2 upgrades.
> >
> > Has anyone else seen this?
>
> I’m seeing the same behaviour, most recently when moving to 10.1-RC1
> (haven’t gone to -RC2 yet).
> The system is:
>
> FreeBSD 10.1-RC1 #0 r272463: Fri Oct 3 01:47:10 UTC 2014
>     root@releng1.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64
> FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
> CPU: AMD Opteron(TM) Processor 6274 (2200.05-MHz K8-class CPU)
>   Origin = "AuthenticAMD" Id = 0x600f12 Family = 0x15 Model = 0x1 Stepping = 2
>   Features=0x178bfbff
>   Features2=0x1e98220b
>   AMD Features=0x2e500800
>   AMD Features2=0x1c9bfff
>   TSC: P-state invariant, performance statistics
> real memory = 549755813888 (524288 MB)
> avail memory = 534559084544 (509795 MB)
> Event timer "LAPIC" quality 400
> ACPI APIC Table: <041112 APIC1739>
> FreeBSD/SMP: Multiprocessor System Detected: 64 CPUs
> FreeBSD/SMP: 4 package(s) x 16 core(s)
> …
>
> I do have IPMI loaded, unlike the other reports.
>
> --
> Colin Perkins
> https://csperkins.org/
>
> Paul Koch replied privately with a pointer to a seemingly unrelated
> message he sent to stable last month. Take a look at the several
> paragraphs at the end starting with "On a side note". I'm suspicious
> that the generation of the large upgrade on /var during the "upgrade"
> pass is causing the delay. It fits pretty well and, in normal operation,
> my server would never see this issue at all.
>
> https://docs.freebsd.org/cgi/getmsg.cgi?fetch=326083+0+/usr/local/www/db/text/2014/freebsd-stable/20140907.freebsd-stable
>
> Aside from fsyncing the files, I suspect that just running "upgrade" and
> then waiting a long time before rebooting might prevent it from
> happening. It is likely relevant that the single partition on the system
> is a 500GB SU+J UFS.

The update to -RC2 worked without problem on my system. I did, however,
wait until vmstat showed the disks being idle after running
freebsd-update (this only took a few minutes).
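
A minimal sketch of the "wait until the disks are idle, then reboot" step described above, in case it helps anyone script the workaround. Everything here is an assumption for illustration: `wait_until_idle`, its parameters, and the idea of a `sample` callable that returns a single disk-activity figure are hypothetical, and how that figure is obtained (for example, by reading vmstat output) is deliberately left to the caller.

```python
import time
from typing import Callable


def wait_until_idle(
    sample: Callable[[], float],
    threshold: float = 1.0,
    quiet_samples: int = 5,
    interval: float = 2.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `sample()` until disk activity stays low.

    Returns True once `sample()` has been at or below `threshold` for
    `quiet_samples` consecutive readings. Returns False if roughly
    `timeout` seconds of polling pass without that happening.

    `sample` is a hypothetical caller-supplied function that returns
    the current disk activity (e.g. transfers per second).
    """
    quiet = 0      # consecutive readings at or below the threshold
    waited = 0.0   # total time spent sleeping between readings
    while True:
        if sample() <= threshold:
            quiet += 1
            if quiet >= quiet_samples:
                return True
        else:
            quiet = 0  # any burst of activity restarts the count
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
```

Requiring several quiet readings in a row, rather than a single one, avoids rebooting in a brief lull between bursts of writes. The timeout matters because, as noted below, a hung shutdown can sit for an hour without progress, so an unbounded wait would not distinguish "still syncing" from "stuck".
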
That said, when updating to -RC1 previously I left the box to sit for
maybe an hour at the “All buffers synced” stage before giving up and
resetting it, so if it was just waiting for the disks to sync, it was
taking a very long time to do so.

> I need to research a bit on how FreeBSD does things as well as the
> possible interaction with the large SU+J partition. I was already
> uncomfortable about the SU+J but went with it due to the time it would
> otherwise take to fsck the 500GB disk.
> --
> R. Kevin Oberman, Network Engineer, Retired
> rkoberman@gmail.com

-- 
Colin Perkins
https://csperkins.org/