From: Oliver Fromme
To: freebsd-hackers@FreeBSD.ORG
Date: Fri, 21 Oct 2005 19:18:44 +0200 (CEST)
Message-Id: <200510211718.j9LHIi1o060488@lurza.secnetix.de>
Subject: Re: FreeBSD UFS2 snapshots, and math ...
List-Id: Technical Discussions relating to FreeBSD

user wrote:
> Let's say I have a filesystem, and on that filesystem I create a snapshot
> every single night, and every night I delete the snapshot from 5 nights
> ago. This means that at all times, I have four snapshots running on that
> filesystem, one from 1 day ago, one from 2 days ago, one from 3 days ago,
> and one from 4 days ago.
>
> Let's also assume that the percent change of the filesystem is 5% (every
> day 5% of the blocks in the filesystem are either changed or deleted).
>
> Does this mean that if that 5% change is a different 5% every day, that
> the one day ago snapshot will be size 5%_of_filesystem, and that the 2 day
> ago snapshot will be size 10%_of_filesystem, day 3 15% and day 4 20%, for
> a total of 50% of the total filesystem taken up with snapshot data ?

No, the size requirement of every new snapshot should be 5%.
Only the data that is modified requires new space in every
case, even if there are multiple snapshots.

In other words: If you have five snapshots, and you modify
a file, should the original content of the file be copied
to every snapshot, i.e. five times? That would be terribly
inefficient. That's _not_ how snapshots work.

Instead, when you modify the file, the new data will be
written to new disk blocks, and the blocks containing the
original data are assigned to the snapshots. There is no
copying involved, and the data exists only once on the disk.

Therefore, when you change 5%, the size requirement for
the snapshots grows by 5%, no matter how many snapshots
you have and how old they are.

> If the 5% data changed per day is the _same_ 5% every day (perhaps
> changing the same table in a DB every day,

That depends on the DB. For PostgreSQL the WAL files will
almost always occupy new (different) disk space, even if you
only modify the same table over and over again. That's a
feature, not a bug. ;-)

> or perhaps changing the same
> block of lines in a text file every day)

That depends on the editor. Some editors write a completely
new file and mv(2) it into the place of the original one.
You can check for this case by watching the inode number of
the file ("ls -li"). If it changed after editing, then the
editor wrote a new file, so it has occupied different blocks
on the filesystem.
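
The inode check described above can be sketched in Python. This is
only an illustration under stated assumptions: the helper names
(inode_of, edit_in_place, edit_by_replace) are hypothetical, and the
two "edit" functions merely imitate the two saving strategies an
editor might use. An unchanged inode only tells you that the file was
not replaced; it says nothing certain about which blocks were
rewritten.

```python
import os
import tempfile


def inode_of(path):
    """Return the inode number of path (the first column of `ls -li`)."""
    return os.stat(path).st_ino


def edit_in_place(path, text):
    """Imitate an editor that rewrites the existing file (inode is kept)."""
    with open(path, "r+") as f:
        f.write(text)
        f.truncate()  # drop any leftover bytes of the old content


def edit_by_replace(path, text):
    """Imitate an editor that writes a new file and renames it over the old one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so the rename
    # stays on the same filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        f.write(text)
    os.replace(tmp, path)  # path now refers to the new file's inode
```

Comparing inode_of(path) before and after a real editing session
gives the same information as running "ls -li" twice: a different
number means the editor wrote a new file.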
> does that mean that every day
> simply represents 5%_of_filesystem, for a total of 20% of the total
> filesystem in use at all times for snapshot data ?

That should happen in all cases, no matter what data you
modify, whether it's the same as the previous night or not.

> Finally, are there any snapshot diag tools at all ? Like, something that
> reports snapshot sizes

Well, it's not easy to define "snapshot size". Of course
it has a virtual size which is reported by df(1), and a
physical size reported by "ls -l". But the data of the
snapshots consists of regular filesystem blocks which have
not been modified yet, and blocks of original content that
has been modified -- but these might be shared between
multiple snapshots, so how would you account for them?

> percent of disk used for snapshots,

Well, that should be easy to calculate from df(1). I think
there's already a tool in the ports collection which does
that.

> and maybe even
> a way for me to actually calculate what the percent change for time period
> X is for a particular filesystem ?

Basically, it depends exactly on the amount of data that
you modify. When you modify or delete n blocks that have
not been modified since the last snapshot was created,
the space requirement of the snapshot data will grow by
n blocks. The number of snapshots in existence does not
matter.

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing
Services with a focus on FreeBSD: http://www.secnetix.de/bsd
Any opinions expressed in this message may be personal to the author
and may not necessarily reflect the opinions of secnetix in any way.

"When your hammer is C++, everything begins to look like a thumb."
        -- Steve Haflich, in comp.lang.c++