Date: Fri, 17 Jun 2011 06:45:22 +0000
From: Marcus Reid <marcus@blazingdot.com>
To: Per von Zweigbergk <pvz@itassistans.se>
Cc: freebsd-fs@freebsd.org
Subject: Re: Disk usage and ZFS deduplication
Message-ID: <20110617064522.GA91945@blazingdot.com>
In-Reply-To: <9544F7B9-E286-4266-86E3-B4D1A667CBBD@itassistans.se>
References: <9544F7B9-E286-4266-86E3-B4D1A667CBBD@itassistans.se>

On Tue, Jun 14, 2011 at 09:19:32AM +0200, Per von Zweigbergk wrote:
> I've been following the "Impossible compression ratio on ZFS" thread
> with some interest, and it made me ask myself this:
>
> Let us say we have a hypothetical zfs filesystem with the equally
> hypothetical files A and B. The filesystem has deduplication enabled.
> Both files have an apparent file size of 100 MB, but 50 MB of that
> data is common between the two files and thus can be deduplicated.
> This would mean that total disk usage would be 150 MB.
>
> If you use "du" to determine disk size for a deduplication, what would
> be the result? Which file would the common data be accounted to? Or
> would it be accounted to both files somehow, in part or in
> full?

Pretty simple test.

[root@luna /root]# zfs create -o mountpoint=/dedup -o dedup=on data/dedup
[root@luna /usr/data]# dd if=/dev/urandom of=set_a_50MiB bs=1m count=50
[root@luna /usr/data]# dd if=/dev/urandom of=set_b_50MiB bs=1m count=50
[root@luna /usr/data]# dd if=/dev/urandom of=set_c_50MiB bs=1m count=50
[root@luna /usr/data]# cat set_a_50MiB set_b_50MiB > file_1
[root@luna /usr/data]# cat set_a_50MiB set_c_50MiB > file_2
[root@luna /usr/data]# cp file_1 /dedup
[root@luna /usr/data]# cp file_2 /dedup
[root@luna /usr/data]# zpool list
NAME   SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
data   101G  32.8G  68.2G    32%  1.33x  ONLINE  -
[root@luna /usr/data]# cd /dedup
[root@luna /dedup]# du -sk *
102479  file_1
102479  file_2

Marcus
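
The numbers in the test above can be cross-checked with a little arithmetic. This is only a sketch: the variable names are made up for illustration, and it assumes the DEDUP column reflects only the deduplicated blocks written by this test (logical size divided by unique size). Each file is 100 MiB logical, and the three random 50 MiB sets are the only unique data.

```shell
#!/bin/sh
# Hypothetical variable names; sizes taken from the transcript above.
logical_mib=$((100 + 100))        # file_1 + file_2 as seen by applications
unique_mib=$((50 + 50 + 50))      # set_a + set_b + set_c actually stored

# Assumed definition of the ratio: logical data / unique data.
ratio=$(awk -v l="$logical_mib" -v u="$unique_mib" 'BEGIN { printf "%.2f", l / u }')

# Total reported by "du -sk" for both files, in KiB.
du_total_kib=$((102479 * 2))

echo "expected dedup ratio: ${ratio}x"
echo "du total: ${du_total_kib} KiB for ${unique_mib} MiB of unique data"
```

Under that assumption the result matches the 1.33x shown by zpool list, while du reports the full ~100 MiB for each file, i.e. the shared 50 MiB appears to be charged to both files in full.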