From owner-freebsd-current@freebsd.org  Fri Nov 27 06:17:32 2015
Subject: Re: Resizing a zpool as a VMware ESXi guest ...
To: freebsd-current@freebsd.org
References: <543841B8.4070007@shrew.net> <20141016081016.GA4670@brick.home>
From: Matthew Grooms <mgrooms@shrew.net>
Message-ID: <5657F135.6080902@shrew.net>
Date: Thu, 26 Nov 2015 23:59:17 -0600
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101
 Thunderbird/38.3.0
MIME-Version: 1.0
In-Reply-To: <20141016081016.GA4670@brick.home>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

On 10/16/2014 3:10 AM, Edward Tomasz Napierała wrote:
> On 1010T1529, Matthew Grooms wrote:
>> All,
>>
>> I am a long time user and advocate of FreeBSD and manage several
>> deployments of FreeBSD in a few data centers. Now that these
>> environments are almost always virtual, it would make sense for FreeBSD
>> to support basic features such as dynamic disk resizing. It looks like
>> most of the parts are intended to work. Kudos to the FreeBSD foundation
>> for seeing the need and sponsoring dynamic increase of online UFS
>> filesystems via growfs. Unfortunately, it would appear that there are
>> still problems in this area, such as ...
>>
>> a) cam/geom recognizing when a drive's size has increased
>> b) zpool recognizing when a gpt partition size has increased
>>
>> For example, if I do an install of FreeBSD 10 on VMware using ZFS, I see
>> the following ...
>>
>> root@zpool-test:~ #  gpart show
>> =>      34  16777149  da0  GPT  (8.0G)
>>           34      1024    1  freebsd-boot  (512K)
>>         1058   4194304    2  freebsd-swap  (2.0G)
>>      4195362  12581821    3  freebsd-zfs  (6.0G)
>>
>> If I increase the VM disk size using VMware to 16G and rescan using
>> camcontrol, this is what I see ...
> "camcontrol rescan" does not force fetching the updated disk size.
> AFAIK there is no way to do that.  However, this should happen
> automatically, if the "other side" properly sends a Unit Attention
> after resizing.  No idea why this doesn't happen with VMware.
> Reboot obviously clears things up.
>
> [..]
>
>> Now I want to claim the additional 14 gigs of space for my zpool ...
>>
>> root@zpool-test:~ # zpool status
>>     pool: zroot
>>    state: ONLINE
>>     scan: none requested
>> config:
>>
>>           NAME                                          STATE     READ WRITE CKSUM
>>           zroot                                         ONLINE       0     0     0
>>             gptid/352086bd-50b5-11e4-95b8-0050569b2a04  ONLINE       0     0     0
>>
>> root@zpool-test:~ # zpool set autoexpand=on zroot
>> root@zpool-test:~ # zpool online -e zroot gptid/352086bd-50b5-11e4-95b8-0050569b2a04
>> root@zpool-test:~ # zpool list
>> NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
>> zroot  5.97G   876M  5.11G    14%  1.00x  ONLINE  -
>>
>> The zpool appears to still only have 5.11G free. Let's reboot and try
>> again ...
> Interesting.  This used to work; actually either of those (autoexpand or
> online -e) should do the trick.
>
>> root@zpool-test:~ # zpool set autoexpand=on zroot
>> root@zpool-test:~ # zpool online -e zroot gptid/352086bd-50b5-11e4-95b8-0050569b2a04
>> root@zpool-test:~ # zpool list
>> NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
>> zroot  14.0G   876M  13.1G     6%  1.00x  ONLINE  -
>>
>> Now I have 13.1G free. I can add this space to any of my zfs volumes and
>> it picks the change up immediately. So the question remains, why do I
>> need to reboot the OS twice to allocate new disk space to a volume?
>> FreeBSD is first and foremost a server operating system. Servers are
>> commonly deployed in data centers. Virtual environments are now
>> commonplace in data centers, not the exception to the rule. VMware still
>> has the vast majority of the private virtual environment market. I
>> assume that most would expect things like this to work out of the box.
>> Did I miss a required step or is this fixed in CURRENT?
> Looks like genuine bugs (or rather, one missing feature and one bug).
> Filing PRs for those might be a good idea.
>

All,

I know this is a very late follow-up, but I spent some more time looking
at this today and found some additional information that I think is quite
interesting. I set up two VMs, one that acts as an iSCSI initiator
( CURRENT ) and another that acts as a target ( 10.2-RELEASE ). Both are
running under ESXi v5.5. There are two block devices on the initiator,
da1 and da2, that I used for resize testing ...

[root@iscsi-i /home/mgrooms]# camcontrol devlist
<NECVMWar VMware IDE CDR10 1.00>   at scbus1 target 0 lun 0 (cd0,pass0)
<VMware Virtual disk 1.0>          at scbus2 target 0 lun 0 (pass1,da0)
<VMware Virtual disk 1.0>          at scbus2 target 1 lun 0 (pass2,da1)
<FREEBSD CTLDISK 0001>             at scbus3 target 0 lun 0 (da2,pass3)

The da1 device is a virtual disk hanging off of a VMware virtual SAS 
controller ...

[root@iscsi-i /home/mgrooms]# pciconf -lv
...
mpt0@pci0:3:0:0:        class=0x010700 card=0x197615ad chip=0x00541000 rev=0x01 hdr=0x00
     vendor     = 'LSI Logic / Symbios Logic'
     device     = 'SAS1068 PCI-X Fusion-MPT SAS'
     class      = mass storage
     subclass   = SAS

[root@iscsi-i /home/mgrooms]# camcontrol readcap da1 -h
Device Size: 10 G, Block Length: 512 bytes

[root@iscsi-i /home/mgrooms]# gpart show da1
=>      40  20971440  da1  GPT  (10G)
         40  20971440    1  freebsd-ufs  (10G)

The da2 device is an iSCSI LUN mounted from my FreeBSD 10.2 VM running 
ctld ...

[root@iscsi-i /home/mgrooms]# iscsictl
Target name                          Target portal    State
iqn.2015-01.lab.shrew:target0        iscsi-t.shrew.lab Connected: da2

[root@iscsi-i /home/mgrooms]# camcontrol readcap da2 -h
Device Size: 10 G, Block Length: 512 bytes

[root@iscsi-i /home/mgrooms]# gpart show da2
=>      40  20971440  da2  GPT  (10G)
         40        24       - free -  (12K)
         64  20971392    1  freebsd-ufs  (10G)
   20971456        24       - free -  (12K)

When I increased the size of da1 ( the VMDK ) and then re-ran 
'camcontrol readcap' without a reboot, it clearly showed that the disk 
size had increased. However, geom failed to recognize the additional 
capacity ...

[root@iscsi-i /home/mgrooms]# camcontrol readcap da1 -h
Device Size: 16 G, Block Length: 512 bytes

[root@iscsi-i /home/mgrooms]# gpart show da1
=>      40  20971440  da1  GPT  (10G)
         40  20971440    1  freebsd-ufs  (10G)
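
One way to spot this mismatch from a script is to compare the block count
CAM reports with the last sector GEOM knows about. The sketch below is
only an illustration: the helper names are made up, it parses the "=>"
summary line of 'gpart show' as it appears above, the 64-sector slack is
an arbitrary allowance for the backup GPT, and the 33554432-block figure
assumes the 16 G device is exactly 16 GiB of 512-byte blocks.

```sh
#!/bin/sh
# Hypothetical helpers for illustration; not part of FreeBSD.

# Read `gpart show <disk>` output on stdin and print the first sector past
# the usable area (start + size from the "=>" summary line).
gpart_usable_end() {
    awk '$1 == "=>" { print $2 + $3; exit }'
}

# Succeed (exit 0) when CAM reports noticeably more blocks than GEOM's
# usable area. The 64-sector slack is an assumption, meant to cover the
# backup GPT that lives at the end of the disk.
geom_is_stale() {
    cam_blocks=$1
    geom_end=$2
    [ $((cam_blocks - geom_end)) -gt 64 ]
}

# Sample data copied from the da1 output above.
da1_end=$(printf '%s\n' \
    '=>      40  20971440  da1  GPT  (10G)' \
    '        40  20971440    1  freebsd-ufs  (10G)' | gpart_usable_end)

# Assumed: 16 GiB / 512 bytes = 33554432 blocks reported by CAM.
if geom_is_stale 33554432 "$da1_end"; then
    echo "da1: GEOM still sees the old size (usable area ends at $da1_end)"
fi
```

On a live system the two numbers would have to come from camcontrol and
gpart themselves; here they are hard-coded so the comparison can be read
on its own.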

Here is the interesting bit. I increased the size of da2 by modifying
the LUN size in ctld.conf on the target and then issued a /etc/rc.d/ctld
reload. When I re-ran 'camcontrol readcap' on the initiator without a
reboot, it also showed that the disk size had increased, but this time
geom recognized the additional capacity as well ...

[root@iscsi-i /home/mgrooms]# camcontrol readcap da2 -h
Device Size: 16 G, Block Length: 512 bytes

[root@iscsi-i /home/mgrooms]# gpart show da2
=>      40  33554352  da2  GPT  (16G)
         40        24       - free -  (12K)
         64  20971392    1  freebsd-ufs  (10G)
   20971456  12582936       - free -  (6.0G)
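
The sector counts in this output line up with the new LUN size. A quick
arithmetic check, assuming the 512-byte block length that readcap reports
(so 2048 sectors per MiB); the function name is just for illustration:

```sh
#!/bin/sh
# Convert a count of 512-byte sectors to whole MiB (integer division).
# 1 MiB = 1048576 bytes = 2048 sectors of 512 bytes.
sectors_to_mib() {
    echo $(( $1 / 2048 ))
}

sectors_to_mib 33554352   # usable area of da2 after the grow: 16383
sectors_to_mib 20971392   # existing freebsd-ufs partition:    10239
sectors_to_mib 12582936   # trailing free space:               6144
```

Those values round to the 16G, 10G and 6.0G that gpart prints.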

I was then able to resize the partition and then grow the UFS 
filesystem, all without rebooting the VM ...

[root@iscsi-i /home/mgrooms]# gpart resize -i 1 da2
da2p1 resized

[root@iscsi-i /home/mgrooms]# gpart show da2
=>      40  33554352  da2  GPT  (16G)
         40        24       - free -  (12K)
         64  33554304    1  freebsd-ufs  (16G)
   33554368        24       - free -  (12K)

[root@iscsi-i /home/mgrooms]# growfs da2p1
Device is mounted read-write; resizing will result in temporary write suspension for /var/data2.
It's strongly recommended to make a backup before growing the file system.
OK to grow filesystem on /dev/da2p1, mounted on /var/data2, from 10GB to 16GB? [Yes/No] Yes
super-block backups (for fsck_ffs -b #) at:
 21798272, 23080512, 24362752, 25644992, 26927232, 28209472, 29491712, 30773952, 32056192, 33338432

[root@iscsi-i /home/mgrooms]# df -h
Filesystem    Size    Used   Avail Capacity  Mounted on
/dev/da0p3     15G    1.2G     12G     9%    /
devfs         1.0K    1.0K      0B   100%    /dev
/dev/da1p1    9.7G     32M    8.9G     0%    /var/data1
/dev/da2p1     15G     32M     14G     0%    /var/data2

It's also worth noting that the additional space was not recognized by 
gpart/geom on the initiator until after the 'camcontrol readcap da2' 
command was run. In other words, I'm skeptical that it was a Unit 
Attention notification that made the right thing happen since it still 
took manual prodding of cam to get the new disk geometry up into the 
geom layer.
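
For reference, the manual sequence that worked on da2 can be sketched as
a small script. This is only a sketch of the steps shown above, not a
tested tool: the function and variable names are hypothetical, and it
defaults to a dry run that prints the commands instead of executing them.

```sh
#!/bin/sh
# Sketch of the manual sequence used on da2 above. Names are hypothetical.
# With DRY_RUN=1 (the default here) commands are only printed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

grow_ufs_disk() {
    disk=$1    # e.g. da2
    index=$2   # GPT partition index, e.g. 1

    run camcontrol readcap "$disk" -h      # prod CAM to re-read the capacity
    run gpart show "$disk"                 # check whether GEOM sees the new size
    run gpart resize -i "$index" "$disk"   # grow the partition into the free space
    run growfs "${disk}p${index}"          # grow UFS; asks for confirmation
}

grow_ufs_disk da2 1
```

Running it with DRY_RUN=0 would execute the same four commands in order;
whether the second step shows the new size is exactly the open question
for da1.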

So what's the difference between the virtual SAS block device and the
iSCSI block device? I'm sure I have no idea. But in my mind this
invalidates two previous notions that were floated when I brought this
problem up late last year ...

1) There is in fact a command that can be manually run to force cam to 
read new disk geometry. And when that new geometry is read, it is, at 
least in some cases, passed on to geom.

2) While ESXi may or may not be issuing the correct SCSI notifications
to help the OS pick up the new disk geometry automatically, it surely
reports the new size when asked. Additionally, all the plumbing is in
place to allow the entire disk, geom, fs resize process to work without
a reboot. I'm just not sure why it seems to work with iSCSI but doesn't
with the virtualized SAS controller.

Any thoughts?

Thanks,

-Matthew