Date:      Fri, 14 Jan 2005 14:57:40 -0600
From:      John <john@starfire.mn.org>
To:        Tabor Kelly <tkelly-freebsd-questions@taborandtashell.net>
Cc:        questions@freebsd.org
Subject:   Re: Backups / Dump etc
Message-ID:  <20050114145740.A11218@starfire.mn.org>
In-Reply-To: <41E82BF9.2030505@taborandtashell.net>; 2005 at 12:30:49PM -0800
References:  <f17daf04050114090052fc6253@mail.gmail.com> <41E82BF9.2030505@taborandtashell.net>

On Fri, Jan 14, 2005 at 12:30:49PM -0800, Tabor Kelly wrote:
> Jeff MacDonald wrote:
> 
> <snip>
> 
> On a related note:
> 
> If I want to do complete dumps of all of my file systems do I need to be 
> in single user mode? Will running in multiuser mode (with all of my 
> normal daemons running) mess up my dumps?

If you are running FreeBSD 5.x, you get the cool -L option on
dump, which automatically snapshots the mounted filesystem and
dumps the snapshot instead of the live data.
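
For illustration, here is a minimal sketch of what such an invocation
might look like, written as a small Python helper that only builds the
command line and does not run it.  The helper name, the /usr filesystem
and the /backup/usr.dump output path are made up for the example, and
the -a and -u flags are my own choice rather than anything required by
-L; check dump(8) on your release before relying on any of it.

```python
import shlex


def dump_command(filesystem, outfile, level=0, live=True):
    """Build (but do not run) a dump(8) command line for one filesystem.

    Hypothetical helper: the flag selection is an assumption, not a
    recommendation from the dump(8) manual page.
    """
    if not 0 <= level <= 9:
        raise ValueError("dump level must be between 0 and 9")
    argv = ["dump", "-%d" % level]
    if live:
        # -L: the filesystem is live, so dump a snapshot of it
        argv.append("-L")
    # -a: auto-size the output, -u: update the dumpdates file, -f: output file
    argv += ["-a", "-u", "-f", outfile, filesystem]
    return argv


if __name__ == "__main__":
    cmd = dump_command("/usr", "/backup/usr.dump")
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

You would run something like this once per filesystem, as root.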

NOTE!  This is WONDERFUL, but not the same as shutting things down.
You will have a point-in-time image, but that may not be the end of
the story.

WARNING!  What follows is a true depiction of the situation,
but don't let it panic you.  This deals with fairly pathological
timing situations, which occur infrequently in small server
situations with relatively few users.  If you make regular, frequent
backups, you will be much better off than 99% of us, anyway...

What you will be protected from is a situation where files
change size during the backup, so that the filesystem meta (inode)
information captured at the beginning will not reflect the state
of the filesystem at the end.  This is a good thing to avoid.

Let's take the case where you've just edited a file with vi, for
instance.  Let us further conjecture that you have added many lines
to the top of the file, so that it has grown from the front.
Let us posit that the dump kicks off just as you are writing the
new file, let's say half-way through.  The top half of the file is
the new data, the bottom half is the old data, and may be internally
inconsistent with the front.

From an fsck point of view, there is no problem.  The state of the
inode and the state of the data blocks are CONSISTENT with respect
to where they were at the moment the snapshot was taken (NOT
guaranteed without the snapshot - in fact, without the snapshot,
you could theoretically be backing up data blocks that may not even
be part of that file any longer).  You got a snapshot of the file
between "write()" operations.
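
The effect can be imitated in user space.  The sketch below is only a
simulation under stated assumptions: it rewrites a small file in place
with two write() calls and uses an ordinary file copy between them as a
stand-in for the snapshot.  The file names and contents are invented,
and a real editor may well save files differently.

```python
import os
import shutil
import tempfile

OLD = b"alpha\nbeta\ngamma\ndelta\n"   # the file as it was on disk
NEW = b"one\ntwo\n" + OLD               # the same file with lines added at the top


def copy_between_writes(workdir):
    """Rewrite a file with two write() calls and copy it in between.

    Returns (copied_contents, final_contents).
    """
    path = os.path.join(workdir, "notes.txt")
    snap = os.path.join(workdir, "notes.snap")
    with open(path, "wb") as f:
        f.write(OLD)

    half = len(NEW) // 2
    fd = os.open(path, os.O_WRONLY)     # overwrite in place, no truncation
    try:
        os.write(fd, NEW[:half])        # first write(): top half is new data
        shutil.copyfile(path, snap)     # stand-in for the snapshot
        os.write(fd, NEW[half:])        # second write(): arrives after the copy
    finally:
        os.close(fd)

    with open(snap, "rb") as f:
        copied = f.read()
    with open(path, "rb") as f:
        final = f.read()
    return copied, final


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        copied, final = copy_between_writes(d)
        print("copy matches old file:", copied == OLD)
        print("copy matches new file:", copied == NEW)
        print("live file matches new file:", final == NEW)
```

The copy is a perfectly readable file, which is all fsck cares about,
yet it is neither the old version nor the new one.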

Please understand that NO PERFECT SOLUTION TO THIS PROBLEM EXISTS
TODAY in any operating system or computing environment that I know
of.  Oracle and other databases can guarantee RESTARTABILITY from
point-in-time images, but that is at the expense of backing out
incomplete transactions.  Anything that the application knows
belongs in the file, but which it hasn't actually WRITTEN yet, will
be lost.

Snapshot is wonderful, provided you can make sure that all your
applications are quiesced at the time of the snapshot.  There is
no universal way, today, of quiescing all applications (think
everything from databases to someone's first C program).  So - if
you can guarantee that everyone's work is in a consistent state
from the application point of view at the moment the snapshot is
taken, then you are golden.  Otherwise, if a file is in the middle
of being posted to the disk when the snapshot is taken, it will
be consistent from a filesystem perspective, but may be useless
as a C program, shell script, or e-mail message.  The operating
system simply has no way to know that the application is holding
data which have not been written yet.
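
An application that you control can at least narrow the window itself.
One common approach, sketched here as an assumption on my part and not
as something dump or the snapshot code does for you, is to write the
new contents to a temporary file in the same directory and then rename
it over the original, so that the name always points at a complete old
file or a complete new file.  The function name is hypothetical.

```python
import os
import tempfile


def write_whole(path, data):
    """Replace path with data via a temporary file and a rename.

    A reader (or a snapshot) that looks up path finds either the
    complete old contents or the complete new contents.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory, so the rename stays within one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())        # push the data to disk before renaming
        os.replace(tmp, path)           # atomic rename on POSIX filesystems
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "notes.txt")
        write_whole(target, b"first version\n")
        write_whole(target, b"second version\n")
        with open(target, "rb") as f:
            print(f.read())
```

Note that mkstemp creates the file readable only by its owner, so the
replaced file will not keep the permissions of the original unless you
copy them over yourself.  This only helps programs that are written
this way; it does nothing for data the application has not handed to
the operating system yet.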

If you make regular backups, no problem.  If your recovery point
happens to contain one inconsistent file (and that could only happen
if the file was in the process of being written at the time the snapshot
was triggered), then you can recover that file from the state it was
in the previous day.  Databases and other sophisticated applications
are written in such a way that they can survive this sort of problem,
though possibly at the expense of in-flight transactions or operations.
Simpler applications, which have a sequential, linear view of data
as a stream of bytes, may not be as fortunate, but the chances of
getting caught at this point are very small.

What can you do?  Well, you can kick everyone off the system for
the moment that it takes the dump snap to trigger, or you can
just grin and bear it - the cost/benefit analysis is for you to
make; we can only advise you of what the technology can do.
You don't have to unmount the filesystems or shut down the system,
but that still remains the only 100%, fool-proof, guaranteed method
of having all data consistent from an application point of view.

Maybe I should submit this to the handbook - or has it been
covered already?  I didn't see it when I was reading the backup
section last night.
-- 

John Lind
john@starfire.MN.ORG


