From: "Steven Hartland"
To: "Steven Hartland", "K. Macy"
Cc: "freebsd-fs@FreeBSD.org", FreeBSD Stable, mark
Subject: Re: zpool import hangs when out of space - Was: zfs pool import hangs on [tx->tx_sync_done_cv]
Date: Tue, 14 Oct 2014 12:19:58 +0100
Message-ID: <138CF459AA0B41EB8CB4E11B3DE932CF@multiplay.co.uk>

----- Original Message -----
From: "Steven Hartland"
To: "K. Macy"
Cc: "freebsd-fs@FreeBSD.org"; "mark"; "FreeBSD Stable"
Sent: Tuesday, October 14, 2014 9:14 AM
Subject: Re: zpool import hangs when out of space - Was: zfs pool import hangs on [tx->tx_sync_done_cv]

> ----- Original Message -----
> From: "K. Macy"
>
>>>> Thank you both for analysis and effort!
>>>>
>>>> I can't rule out the possibility that my main system pool
>>>> on a SSD was low on space at some point in time, but the
>>>> three 4 GiB cloned pools (sys1boot and its brothers) were all
>>>> created as zfs send / receive copies of the main / (root)
>>>> file system, and I haven't noticed anything unusual during
>>>> syncing. This syncing was done manually (using zxfer) and
>>>> independently from the upgrade on the system - on a steady/quiet
>>>> system, when the source file system definitely had sufficient
>>>> free space.
>>>>
>>>> The source file system now shows 1.2 GiB of usage according
>>>> to df:
>>>>   shiny/ROOT  61758388  1271620  60486768  2%  /
>>>> It seems unlikely that the 1.2 GiB has grown to 4 GiB of space
>>>> on a cloned filesystem.
>>>>
>>>> Will try to import the main two pools after re-creating
>>>> a sane boot pool...
>>>
>>> Yeah, zfs list only shows around 2-3GB used too, but zpool list
>>> shows the pool is out of space. Can't rule out an accounting
>>> issue though.
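
(Aside: one quick way to see where the space is going - the dataset names
below are the ones from this thread, and this assumes only the stock
zfs(8)/zpool(8) tools - is to compare the dataset-level and pool-level
views of usage:

  zfs list -o space -r sys1boot     # USED broken down into
                                    # USEDSNAP/USEDDS/USEDREFRESERV/USEDCHILD
  zfs list -t snapshot -r sys1boot  # snapshots holding space without mounts
  zpool list -v sys1boot            # per-vdev ALLOC/FREE as the pool sees it

If the zfs totals sum to far less than zpool's ALLOC, the gap is pool-level
overhead such as metadata and padding - or the kind of accounting problem
being suspected here.)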
>>
>> What is using the extra space in the pool? Is there an unmounted
>> dataset or snapshot? Do you know how to easily tell? Unlike txg and
>> zio processing, I don't have the luxury of having just read that part
>> of the codebase.
>
> It's not clear, but I believe it could just be fragmentation even
> though it's ashift=9.
>
> I sent the last snapshot to another pool of the same size and it
> resulted in:
>
> NAME       SIZE  ALLOC  FREE  FRAG  EXPANDSZ  CAP  DEDUP  HEALTH  ALTROOT
> sys1boot  3.97G  3.97G  190K    0%         -  99%  1.00x  ONLINE  -
> sys1copy  3.97G  3.47G  512M   72%         -  87%  1.00x  ONLINE  -
>
> I believe FRAG is 0% because the feature wasn't enabled for the
> lifetime of the pool, hence it's simply not showing a valid value.
>
> zfs list -t all -r sys1boot
> NAME                                  USED  AVAIL  REFER  MOUNTPOINT
> sys1boot                             1.76G  2.08G    11K  /sys1boot
> sys1boot/ROOT                        1.72G  2.08G  1.20G  /sys1boot/ROOT
> sys1boot/ROOT@auto-2014-08-16_04.00     1K      -  1.19G  -
> sys1boot/ROOT@auto-2014-08-17_04.00     1K      -  1.19G  -
> ..

Well, interesting issue: I left this pool alone this morning, literally
doing nothing, and it's now out of space.

zpool list
NAME       SIZE  ALLOC  FREE  FRAG  EXPANDSZ  CAP  DEDUP  HEALTH  ALTROOT
sys1boot  3.97G  3.97G  190K    0%         -  99%  1.00x  ONLINE  -
sys1copy  3.97G  3.97G    8K    0%         -  99%  1.00x  ONLINE  -

There's something very wrong here, as nothing has been accessing the pool.

  pool: zfs
 state: ONLINE
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run
        'zpool clear'.
   see: http://illumos.org/msg/ZFS-8000-HC
  scan: none requested
config:

        NAME   STATE   READ WRITE CKSUM
        zfs    ONLINE     0     2     0
          md1  ONLINE     0     0     0

I tried destroying the pool and even that failed, presumably because
the pool has suspended IO.

Regards
Steve
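
(A possible way out of the suspended-IO state, sketched from the status
output above rather than verified against this exact hang: the default
pool property failmode=wait suspends all IO on write failure, which would
explain the destroy hanging. Since the backing store here is the md1
memory disk:

  zpool clear zfs     # ask ZFS to retry/resume IO, as the status suggests
  zpool destroy zfs   # retry the destroy once IO is no longer suspended
  mdconfig -d -u 1    # failing that, tear down the backing md device
                      # (with -o force if it refuses while the pool is open)

Setting failmode=continue on a throwaway test pool beforehand would make
commands return EIO instead of hanging.)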