Date:      Tue, 31 Jan 2017 20:31:33 -0600
From:      Larry Rosenman <ler@FreeBSD.org>
To:        Steven Hartland <killing@multiplay.co.uk>
Cc:        Freebsd fs <freebsd-fs@freebsd.org>
Subject:   Re: 16.0E ExpandSize? -- New Server
Message-ID:  <a98b3a3da1665c8eac6160633a0bc778@FreeBSD.org>
In-Reply-To: <96534515-4fcb-774e-a599-8d48aec930cd@multiplay.co.uk>
References:  <00db0ab7243ce6368c246ae20f9c075a@FreeBSD.org> <1a69057c-dc59-9b78-9762-4f98a071105e@multiplay.co.uk> <ce5a1d39612d694077accda33266a3ab@FreeBSD.org> <ad07e84e-f297-362a-1398-c5503bb56a8d@multiplay.co.uk> <35a9034f91542bb1329ac5104bf3b773@FreeBSD.org> <76fc9505-f681-0de0-fe0c-5624b29de321@multiplay.co.uk> <22e1bfc5840d972cf93643733682cda1@FreeBSD.org> <f2600a53-0dc1-9f41-1405-ed22d96d30cf@multiplay.co.uk> <8a710dc75c129f58b0372eeaeca575b5@FreeBSD.org> <aef02eb0-0888-6fea-a4b8-4033ca56f4a3@multiplay.co.uk> <d3181bd00c827fb99fbcebe6fe097ef8@FreeBSD.org> <a3d78923-5046-11c8-daea-713eacf47bd2@multiplay.co.uk> <ffc24b7bfacd265d637b633566bbaa51@FreeBSD.org> <96534515-4fcb-774e-a599-8d48aec930cd@multiplay.co.uk>

no grief that I can see: 

borg-new /home/ler $ sudo zdb
Password:
zroot:
    version: 5000
    name: 'zroot'
    state: 0
    txg: 96143
    pool_guid: 11945658884309024932
    hostid: 3619181042
    hostname: ''
    com.delphix:has_per_vdev_zaps
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 11945658884309024932
        create_txg: 4
        children[0]:
            type: 'raidz'
            id: 0
            guid: 7596925654112466913
            nparity: 1
            metaslab_array: 42
            metaslab_shift: 36
            ashift: 12
            asize: 11947471798272
            is_log: 0
            create_txg: 4
            com.delphix:vdev_zap_top: 35
            children[0]:
                type: 'disk'
                id: 0
                guid: 1443238581175429852
                path: '/dev/mfid4p4'
                whole_disk: 1
                DTL: 137
                create_txg: 4
                com.delphix:vdev_zap_leaf: 131
            children[1]:
                type: 'disk'
                id: 1
                guid: 1865792721003775978
                path: '/dev/mfid0p4'
                whole_disk: 1
                DTL: 133
                create_txg: 4
                com.delphix:vdev_zap_leaf: 37
            children[2]:
                type: 'disk'
                id: 2
                guid: 12541720522827927342
                path: '/dev/mfid1p4'
                whole_disk: 1
                DTL: 132
                create_txg: 4
                com.delphix:vdev_zap_leaf: 38
            children[3]:
                type: 'disk'
                id: 3
                guid: 13053934791777776444
                path: '/dev/mfid3p4'
                whole_disk: 1
                DTL: 136
                create_txg: 4
                com.delphix:vdev_zap_leaf: 135
            children[4]:
                type: 'disk'
                id: 4
                guid: 4432707573898874857
                path: '/dev/mfid2p4'
                whole_disk: 1
                DTL: 130
                create_txg: 4
                com.delphix:vdev_zap_leaf: 40
            children[5]:
                type: 'disk'
                id: 5
                guid: 5106293125005422556
                path: '/dev/mfid5p4'
                whole_disk: 1
                DTL: 129
                create_txg: 4
                com.delphix:vdev_zap_leaf: 41
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
borg-new /home/ler $ sudo zpool list -v
NAME          SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
zroot        10.8T  94.3G  10.7T         -     0%     0%  1.00x  ONLINE  -
  raidz1     10.8T  94.3G  10.7T         -     0%     0%
    mfid4p4      -      -      -         -      -      -
    mfid0p4      -      -      -         -      -      -
    mfid1p4      -      -      -         -      -      -
    mfid3p4      -      -      -         -      -      -
    mfid2p4      -      -      -         -      -      -
    mfid5p4      -      -      -         -      -      -
borg-new /home/ler $ sudo zpool get all
NAME PROPERTY VALUE SOURCE
zroot size 10.8T -
zroot capacity 0% -
zroot altroot - default
zroot health ONLINE -
zroot guid 11945658884309024932 default
zroot version - default
zroot bootfs zroot/ROOT/default local
zroot delegation on default
zroot autoreplace off default
zroot cachefile - default
zroot failmode wait default
zroot listsnapshots off default
zroot autoexpand off default
zroot dedupditto 0 default
zroot dedupratio 1.00x -
zroot free 10.7T -
zroot allocated 94.3G -
zroot readonly off -
zroot comment - default
zroot expandsize - -
zroot freeing 0 default
zroot fragmentation 0% -
zroot leaked 0 default
zroot feature@async_destroy enabled local
zroot feature@empty_bpobj active local
zroot feature@lz4_compress active local
zroot feature@multi_vdev_crash_dump enabled local
zroot feature@spacemap_histogram active local
zroot feature@enabled_txg active local
zroot feature@hole_birth active local
zroot feature@extensible_dataset enabled local
zroot feature@embedded_data active local
zroot feature@bookmarks enabled local
zroot feature@filesystem_limits enabled local
zroot feature@large_blocks enabled local
zroot feature@sha512 enabled local
zroot feature@skein enabled local
borg-new /home/ler $ 

On 01/31/2017 5:22 pm, Steven Hartland wrote:

> Yep
> 
> On 31/01/2017 21:49, Larry Rosenman wrote: 
> 
> revert the other patch and apply this one? 
> 
> On 01/31/2017 3:47 pm, Steven Hartland wrote:
> 
> Hmm, it looks like there's also a bug in the way vdev_min_asize is calculated for raidz: it can result, and here has resulted, in a child min_asize which won't provide enough space for the parent, due to the use of unrounded integer division.
> 
> 1981411579221 * 6 = 11888469475326 < 11888469475328
> 
> You should have vdev_min_asize: 1981411579222 for your children.
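
The shortfall can be checked with plain integer arithmetic. The sketch below is illustrative only: it takes the parent vdev_min_asize and the child count from the vdev-stats.d output quoted in this thread, and the two helper names are made up for the example rather than taken from the ZFS source. It compares truncating division with division rounded up.

```python
# Values from the vdev-stats.d output quoted in this thread.
PARENT_MIN_ASIZE = 11888469475328  # vdev_min_asize of the raidz vdev
CHILDREN = 6                       # disks in the raidz1 vdev


def child_min_truncated(parent_min: int, children: int) -> int:
    """Per-child minimum using truncating integer division."""
    return parent_min // children


def child_min_rounded_up(parent_min: int, children: int) -> int:
    """Per-child minimum using ceiling division (hypothetical helper)."""
    return (parent_min + children - 1) // children


truncated = child_min_truncated(PARENT_MIN_ASIZE, CHILDREN)
rounded_up = child_min_rounded_up(PARENT_MIN_ASIZE, CHILDREN)

# Truncation drops the remainder, so the children together fall short.
print(truncated, truncated * CHILDREN >= PARENT_MIN_ASIZE)
# Rounding up means the children always cover the parent's minimum.
print(rounded_up, rounded_up * CHILDREN >= PARENT_MIN_ASIZE)
```

With these inputs the truncated value is the 1981411579221 reported for each child, and the rounded-up value is the 1981411579222 suggested above.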
> 
> Updated patch attached. The calculation still isn't 100% reversible, so it may need more work, but it does now ensure that the children provide enough capacity for min_asize even if all of them are shrunk to their individual min_asize, which I believe may not have been the case previously.
> 
> This isn't related to the incorrect EXPANDSZ output, but it would be good if you could confirm it doesn't cause any issues for your pool given its state.
> 
> On 31/01/2017 21:00, Larry Rosenman wrote: 
> 
> borg-new /home/ler $ sudo ./vdev-stats.d
> Password:
> vdev_path: n/a, vdev_max_asize: 0, vdev_asize: 0, vdev_min_asize: 0
> vdev_path: n/a, vdev_max_asize: 11947471798272, vdev_asize: 11947478089728, vdev_min_asize: 11888469475328
> vdev_path: /dev/mfid4p4, vdev_max_asize: 1991245299712, vdev_asize: 1991245299712, vdev_min_asize: 1981411579221
> vdev_path: /dev/mfid0p4, vdev_max_asize: 1991246348288, vdev_asize: 1991246348288, vdev_min_asize: 1981411579221
> vdev_path: /dev/mfid1p4, vdev_max_asize: 1991246348288, vdev_asize: 1991246348288, vdev_min_asize: 1981411579221
> vdev_path: /dev/mfid3p4, vdev_max_asize: 1991247921152, vdev_asize: 1991247921152, vdev_min_asize: 1981411579221
> vdev_path: /dev/mfid2p4, vdev_max_asize: 1991246348288, vdev_asize: 1991246348288, vdev_min_asize: 1981411579221
> vdev_path: /dev/mfid5p4, vdev_max_asize: 1991246348288, vdev_asize: 1991246348288, vdev_min_asize: 1981411579221
> ^C 
> 
> borg-new /home/ler $ 
> 
> borg-new /home/ler $ sudo zpool list -v
> Password:
> NAME          SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
> zroot        10.8T  94.3G  10.7T     16.0E     0%     0%  1.00x  ONLINE  -
>   raidz1     10.8T  94.3G  10.7T     16.0E     0%     0%
>     mfid4p4      -      -      -         -      -      -
>     mfid0p4      -      -      -         -      -      -
>     mfid1p4      -      -      -         -      -      -
>     mfid3p4      -      -      -         -      -      -
>     mfid2p4      -      -      -         -      -      -
>     mfid5p4      -      -      -         -      -      -
> borg-new /home/ler $ 
> 
> On 01/31/2017 2:37 pm, Steven Hartland wrote:
> 
> In that case, based on your zpool history, I suspect that the original mfid4p4 was the same size as mfid0p4 (1991246348288) but it has been replaced with a drive which is slightly smaller (1991245299712).
> 
> This smaller size results in a max_asize of 1991245299712 * 6 instead of the original 1991246348288 * 6.
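
To make the two products concrete, here is a small sketch using the child vdev_max_asize values from the vdev-stats.d output quoted in this thread. The function name is made up for the example, and the "before the replace" figures rest on the suspicion stated above that mfid4p4 originally matched mfid0p4.

```python
# Child vdev_max_asize values from the vdev-stats.d output in this thread.
child_max_asize = {
    "mfid4p4": 1991245299712,
    "mfid0p4": 1991246348288,
    "mfid1p4": 1991246348288,
    "mfid3p4": 1991247921152,
    "mfid2p4": 1991246348288,
    "mfid5p4": 1991246348288,
}


def raidz_size(child_sizes):
    """Smallest child size times the number of children, as described in the thread."""
    sizes = list(child_sizes)
    return min(sizes) * len(sizes)


# Size with the current, slightly smaller mfid4p4.
current_max = raidz_size(child_max_asize.values())

# Assumption: before the replace, mfid4p4 was the same size as mfid0p4.
before_replace = dict(child_max_asize, mfid4p4=child_max_asize["mfid0p4"])
original_max = raidz_size(before_replace.values())

print(current_max)                 # matches the reported vdev_max_asize
print(original_max)                # matches the reported vdev_asize
print(original_max - current_max)  # the gap between the two
```

Under that assumption the original product equals the vdev_asize reported for the raidz vdev (11947478089728), which is consistent with asize being left over from before the replace.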
> 
> Now, given the way min_asize (the value used to check whether the device size is acceptable) is rounded to the nearest metaslab, I believe that the replace would be allowed.
> https://github.com/freebsd/freebsd/blob/master/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c#L4947
> 
> Now the problem is that on open the calculated asize is only updated if it's expanding:
> https://github.com/freebsd/freebsd/blob/master/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c#L1424
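
As a rough illustration of that behaviour, the toy model below keeps a stored asize unless the newly computed value is larger. It is not the code in vdev.c, and the function name is invented for the example; the two numbers are the raidz vdev_asize and vdev_max_asize from the vdev-stats.d output in this thread.

```python
def asize_after_open(stored_asize: int, computed_asize: int) -> int:
    """Toy model: the stored asize only ever grows, it is never reduced."""
    if computed_asize > stored_asize:
        return computed_asize
    return stored_asize


stored = 11947478089728    # vdev_asize still recorded for the raidz vdev
computed = 11947471798272  # 6 * smallest child after the replace

after_open = asize_after_open(stored, computed)

# The smaller computed value is ignored, so asize stays above max_asize.
print(after_open, after_open > computed)
```

In this model a replacement with a slightly smaller device leaves asize unchanged, which is how asize can end up larger than max_asize.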
> 
> The updated dtrace file outputs vdev_min_asize which should confirm my suspicion about why the replace was allowed.
> 
> On 31/01/2017 19:05, Larry Rosenman wrote: 
> 
> I've replaced some disks due to failure, and some of the partition sizes are different.  
> 
> autoexpand is off: 
> 
> borg-new /home/ler $ zpool get all zroot
> NAME PROPERTY VALUE SOURCE
> zroot size 10.8T -
> zroot capacity 0% -
> zroot altroot - default
> zroot health ONLINE -
> zroot guid 11945658884309024932 default
> zroot version - default
> zroot bootfs zroot/ROOT/default local
> zroot delegation on default
> zroot autoreplace off default
> zroot cachefile - default
> zroot failmode wait default
> zroot listsnapshots off default
> zroot autoexpand off default
> zroot dedupditto 0 default
> zroot dedupratio 1.00x -
> zroot free 10.7T -
> zroot allocated 94.3G -
> zroot readonly off -
> zroot comment - default
> zroot expandsize 16.0E -
> zroot freeing 0 default
> zroot fragmentation 0% -
> zroot leaked 0 default
> zroot feature@async_destroy enabled local
> zroot feature@empty_bpobj active local
> zroot feature@lz4_compress active local
> zroot feature@multi_vdev_crash_dump enabled local
> zroot feature@spacemap_histogram active local
> zroot feature@enabled_txg active local
> zroot feature@hole_birth active local
> zroot feature@extensible_dataset enabled local
> zroot feature@embedded_data active local
> zroot feature@bookmarks enabled local
> zroot feature@filesystem_limits enabled local
> zroot feature@large_blocks enabled local
> zroot feature@sha512 enabled local
> zroot feature@skein enabled local
> borg-new /home/ler $ 
> 
> borg-new /home/ler $ gpart show
> => 40 3905945520 mfid0 GPT (1.8T)
> 40 1600 1 efi (800K)
> 1640 1024 2 freebsd-boot (512K)
> 2664 1432 - free - (716K)
> 4096 16777216 3 freebsd-swap (8.0G)
> 16781312 3889162240 4 freebsd-zfs (1.8T)
> 3905943552 2008 - free - (1.0M) 
> 
> => 40 3905945520 mfid1 GPT (1.8T)
> 40 1600 1 efi (800K)
> 1640 1024 2 freebsd-boot (512K)
> 2664 1432 - free - (716K)
> 4096 16777216 3 freebsd-swap (8.0G)
> 16781312 3889162240 4 freebsd-zfs (1.8T)
> 3905943552 2008 - free - (1.0M) 
> 
> => 40 3905945520 mfid2 GPT (1.8T)
> 40 1600 1 efi (800K)
> 1640 1024 2 freebsd-boot (512K)
> 2664 1432 - free - (716K)
> 4096 16777216 3 freebsd-swap (8.0G)
> 16781312 3889162240 4 freebsd-zfs (1.8T)
> 3905943552 2008 - free - (1.0M) 
> 
> => 40 3905945520 mfid3 GPT (1.8T)
> 40 1600 1 efi (800K)
> 1640 1024 2 freebsd-boot (512K)
> 2664 16777216 3 freebsd-swap (8.0G)
> 16779880 3889165680 4 freebsd-zfs (1.8T) 
> 
> => 40 3905945520 mfid5 GPT (1.8T)
> 40 1600 1 efi (800K)
> 1640 1024 2 freebsd-boot (512K)
> 2664 1432 - free - (716K)
> 4096 16777216 3 freebsd-swap (8.0G)
> 16781312 3889162240 4 freebsd-zfs (1.8T)
> 3905943552 2008 - free - (1.0M) 
> 
> => 40 3905945520 mfid4 GPT (1.8T)
> 40 1600 1 efi (800K)
> 1640 1024 2 freebsd-boot (512K)
> 2664 1432 - free - (716K)
> 4096 16777216 3 freebsd-swap (8.0G)
> 16781312 3889160192 4 freebsd-zfs (1.8T)
> 3905941504 4056 - free - (2.0M) 
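
For a quick comparison of the freebsd-zfs partitions, the sketch below converts the sector counts from the gpart show output above into byte differences relative to mfid0p4. The sector size is derived from the listing itself (1600 sectors shown as 800K); the variable names are only for the example.

```python
# freebsd-zfs partition sizes in sectors, from the gpart show output above.
zfs_sectors = {
    "mfid0p4": 3889162240,
    "mfid1p4": 3889162240,
    "mfid2p4": 3889162240,
    "mfid3p4": 3889165680,
    "mfid4p4": 3889160192,
    "mfid5p4": 3889162240,
}

# gpart lists 1600 sectors as 800K, which implies 512-byte sectors.
SECTOR_BYTES = 800 * 1024 // 1600

reference = zfs_sectors["mfid0p4"]
delta_bytes = {
    name: (sectors - reference) * SECTOR_BYTES
    for name, sectors in zfs_sectors.items()
}

for name, delta in sorted(delta_bytes.items()):
    print(f"{name}: {delta:+d} bytes relative to mfid0p4")
```

mfid4p4 comes out 1048576 bytes (1 MiB) smaller than mfid0p4, the same gap as between the 1991246348288 and 1991245299712 figures in the vdev-stats.d output, while mfid3p4 is slightly larger.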
> 
> borg-new /home/ler $ 
> 
> this system was built last week, and I **CAN** rebuild it if necessary, but I didn't do anything strange (so I thought :) )  
> 
> On 01/31/2017 12:30 pm, Steven Hartland wrote:
> 
> Your issue is that the reported vdev_max_asize < vdev_asize:
> vdev_max_asize: 11947471798272
> vdev_asize:     11947478089728
> 
> max asize is smaller than asize by 6291456
> 
> For raidz1 Xsize should be the smallest disk Xsize * disks so:
> 1991245299712 * 6 = 11947471798272
> 
> So your max asize looks right but asize looks too big
> 
> Expand Size is calculated by:
> if (vd->vdev_aux == NULL && tvd != NULL && vd->vdev_max_asize != 0) {
>         vs->vs_esize = P2ALIGN(vd->vdev_max_asize - vd->vdev_asize,
>             1ULL << tvd->vdev_ms_shift);
> }
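
The 16.0E figure falls out of this calculation once the subtraction goes negative in unsigned 64-bit arithmetic. The sketch below emulates that in Python by masking to 64 bits. It assumes vdev_ms_shift is the metaslab_shift of 36 shown in the zdb output, and the clamped variant is purely hypothetical, shown only to contrast with the wrapped result; it is not the patch discussed in this thread.

```python
U64_MASK = (1 << 64) - 1  # emulate C uint64_t arithmetic


def p2align(x: int, align: int) -> int:
    """Round x down to a multiple of align (a power of two)."""
    return x & ~(align - 1) & U64_MASK


def expand_size(max_asize: int, asize: int, ms_shift: int) -> int:
    """Mirror the quoted calculation, including unsigned wraparound."""
    diff = (max_asize - asize) & U64_MASK
    return p2align(diff, 1 << ms_shift)


def expand_size_clamped(max_asize: int, asize: int, ms_shift: int) -> int:
    """Hypothetical variant: report 0 when asize exceeds max_asize."""
    if max_asize <= asize:
        return 0
    return p2align(max_asize - asize, 1 << ms_shift)


MAX_ASIZE = 11947471798272  # vdev_max_asize of the raidz vdev
ASIZE = 11947478089728      # vdev_asize of the raidz vdev
MS_SHIFT = 36               # metaslab_shift from zdb (assumed to be vdev_ms_shift)

esize = expand_size(MAX_ASIZE, ASIZE, MS_SHIFT)
print(esize)                       # just under 2**64 bytes
print(f"{esize / 2**60:.1f}E")     # the value in EiB, to one decimal place
print(expand_size_clamped(MAX_ASIZE, ASIZE, MS_SHIFT))
```

Because max_asize is smaller than asize by 6291456, the difference wraps to just below 2**64, and aligning down to a 2**36 boundary leaves 2**64 - 2**36 bytes, which rounds to 16.0 when expressed in EiB.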
> 
> So the question is why is asize too big?
> 
> Given you seem to have some random disk sizes do you have auto expand turned on?
> 
> On 31/01/2017 17:39, Larry Rosenman wrote:
> 
> vdev_path: n/a, vdev_max_asize: 11947471798272, vdev_asize: 11947478089728

-- 
Larry Rosenman                     http://people.freebsd.org/~ler [1]
Phone: +1 214-642-9640                 E-Mail: ler@FreeBSD.org
US Mail: 17716 Limpia Crk, Round Rock, TX 78664-7281

 

Links:
------
[1] http://people.freebsd.org/%7Eler
From owner-freebsd-fs@freebsd.org  Wed Feb  1 02:43:53 2017
Subject: Re: 16.0E ExpandSize? -- New Server
To: Larry Rosenman <ler@FreeBSD.org>
References: <00db0ab7243ce6368c246ae20f9c075a@FreeBSD.org>
 <1a69057c-dc59-9b78-9762-4f98a071105e@multiplay.co.uk>
 <ce5a1d39612d694077accda33266a3ab@FreeBSD.org>
 <ad07e84e-f297-362a-1398-c5503bb56a8d@multiplay.co.uk>
 <35a9034f91542bb1329ac5104bf3b773@FreeBSD.org>
 <76fc9505-f681-0de0-fe0c-5624b29de321@multiplay.co.uk>
 <22e1bfc5840d972cf93643733682cda1@FreeBSD.org>
 <f2600a53-0dc1-9f41-1405-ed22d96d30cf@multiplay.co.uk>
 <8a710dc75c129f58b0372eeaeca575b5@FreeBSD.org>
 <aef02eb0-0888-6fea-a4b8-4033ca56f4a3@multiplay.co.uk>
 <d3181bd00c827fb99fbcebe6fe097ef8@FreeBSD.org>
 <a3d78923-5046-11c8-daea-713eacf47bd2@multiplay.co.uk>
 <ffc24b7bfacd265d637b633566bbaa51@FreeBSD.org>
 <96534515-4fcb-774e-a599-8d48aec930cd@multiplay.co.uk>
 <a98b3a3da1665c8eac6160633a0bc778@FreeBSD.org>
Cc: Freebsd fs <freebsd-fs@freebsd.org>
From: Steven Hartland <killing@multiplay.co.uk>
Message-ID: <8387d38f-3185-8c07-396b-602c708002a6@multiplay.co.uk>
Date: Wed, 1 Feb 2017 02:43:51 +0000
In-Reply-To: <a98b3a3da1665c8eac6160633a0bc778@FreeBSD.org>

Thanks, I've opened a PR upstream to get some eyes on the fix:
https://github.com/openzfs/openzfs/pull/296

If no objections are raised to the approach I've used I'll commit the 
fix to HEAD too.




