Date:      Sun, 29 Dec 2002 23:09:13 -0800
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Brad Knowles <brad.knowles@skynet.be>
Cc:        Patrick Cable II <freebsd@slaudiovis.org>, chat@freebsd.org
Subject:   Re: Backup Solutions
Message-ID:  <3E0FF119.7792A270@mindspring.com>
References:  <3E0DC536.8010001@slaudiovis.org> <3E0EBC49.86AD7E28@mindspring.com> <a05200f09ba3573361365@[10.0.1.5]>

Brad Knowles wrote:
> At 1:11 AM -0800 2002/12/29, Terry Lambert wrote:
> >  I expect that the correct thing to do is to have a replica and
> >  a non-volatile backup mechanism, in combination.
> 
>         Sounds good.  But what are good tools to achieve these goals?
> 
>         Myself, I would be interested in extending this question to also
> cover PC/Windows & Macintosh (MacOS X) clients, in addition to the
> FreeBSD server.  So, in addition to backing up the server itself, you
> also need software to back the clients up to the server, which can
> then be rolled into the server "data" to be backed up.

The problem with Windows and Macintosh is that the software doesn't
provide transaction triggers.  Most places where you would want
to do this sort of thing are replica servers for small-business
databases; large businesses already have Veritas with snapshots,
or Oracle or Sybase or some other "real" database.

For Windows and Macintosh, you are most likely to be using some
Microsoft based solution, like Access, or Access with a Microsoft
SQL back end.  Most often, this is MSDE, which is a cut-down SQL
server that comes free with the developer's software, and has a
distribution license, which saves you from the license fees and
the per-user license requirements of MS SQL Server.  In the common
case, a well-written application will close the database between
transactions, so that the implied atomicity guarantees hold.  But
most of these applications are written by non-CS people who are
writing code to get something working, and who never think about
the consequences of big customers, etc.

As an example, one of my dad's businesses has a number of
purpose-built applications that have grown up in order to implement
his business rules and internal practices.  One of them is a
client/server application that uses MS SQL Server, another is mostly
Access based (but thinking of moving the data store over to MS SQL),
and another uses the MSDE "free" SQL server (it's a document imaging
system).  None of them really works properly with a backup that
doesn't go out of its way to deal with open files, so there is a
one-day latency in data recovery, except for tiny windows that would make
the database(s) unavailable for however long the backup or replica
creation took.

The MS SQL one could be handled, though... it's possible to make a
replicating proxy.  But then corrupt data would still be replicated
all over, as would improper deletions, since there is no deleted
record marker and pending index to delay the actual deletion until
a purge operation takes place.
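
To illustrate what a deleted record marker would buy you, here is a
minimal sketch.  It assumes a Python application and uses SQLite
purely as a stand-in; none of the products above work this way out
of the box.  The table layout, the deleted_at column, and all of the
function names are hypothetical.

```python
import sqlite3

def make_store():
    """Create an in-memory table with a deleted-record marker column."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE records ("
        " id INTEGER PRIMARY KEY,"
        " body TEXT,"
        " deleted_at REAL)"  # NULL means the record is live
    )
    return db

def soft_delete(db, rec_id, now):
    """Mark the record as deleted; the row itself stays in place."""
    db.execute(
        "UPDATE records SET deleted_at = ?"
        " WHERE id = ? AND deleted_at IS NULL",
        (now, rec_id),
    )
    db.commit()

def undelete(db, rec_id):
    """Recover from an improper deletion before the purge has run."""
    db.execute("UPDATE records SET deleted_at = NULL WHERE id = ?", (rec_id,))
    db.commit()

def purge(db, now, grace_seconds):
    """Really delete rows whose marker is older than the grace period."""
    cur = db.execute(
        "DELETE FROM records"
        " WHERE deleted_at IS NOT NULL AND deleted_at <= ?",
        (now - grace_seconds,),
    )
    db.commit()
    return cur.rowcount

def live_ids(db):
    """IDs the application should treat as existing."""
    rows = db.execute(
        "SELECT id FROM records WHERE deleted_at IS NULL ORDER BY id"
    )
    return [r[0] for r in rows]
```

A replica that carries the marker instead of the deletion can still
undo a mistake, as long as the purge runs less often than people
notice their mistakes.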

Access can be backed by MS SQL, and you can replace MSDE with MS SQL,
meaning you can proxy all of them, but then you are paying on the
order of $10,000, which is a little steep for a 25 person office.

In either case, the offline backup is still needed for the reasons
stated previously.  That business uses tape to do a rotating offsite
backup, with an incremental and full archival dump schedule.  These
require that people exit the applications, so it can be some work
walking around the office after hours, for the person doing it, if
anyone has left their machine on with a record open in any of
them.


>         Is there an Amanda PC/Windows client?  Or an Amanda MacOS X
> client?

There is one in beta right now.  It's available from:

	http://sourceforge.net/projects/amanda-win32/

It doesn't seem to have been updated since last June.  8-(.

You probably are not going to find it useful, due to the "open
file backup" problem.  The normal way to handle "open file backup"
on Windows is to install an IFSMgr hook to hook calls into the
IFS manager.  You can then use the existing open instance for the
file in question in order to back it up.  This is usually not
very satisfying for database files, since the transactional
representation of atomicity and/or idempotence to the application
is done at the database application level, and can't be guaranteed
at the IFSMgr level (there is no nesting information pushed by the
application to the IFSMgr, so you can't wait for a 1->0 nesting
level transition, meaning the transactions in progress have
completed, before doing your backup).
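
For what it's worth, here is a rough sketch of the kind of nesting
information I mean, written in Python.  It assumes the application
itself cooperates by bracketing its transactions; nothing like this
exists in the IFSMgr interface, and the TransactionGate class and
its method names are made up for illustration.

```python
import threading
from contextlib import contextmanager

class TransactionGate:
    """Tracks open transaction depth so a backup can wait for 1->0."""

    def __init__(self):
        self._cond = threading.Condition()
        self._depth = 0              # open transaction levels, all threads
        self._backup_running = False

    @property
    def depth(self):
        with self._cond:
            return self._depth

    @contextmanager
    def transaction(self):
        with self._cond:
            # A backup only starts at depth 0, so anything arriving
            # while it runs is a new outermost transaction: hold it.
            while self._backup_running:
                self._cond.wait()
            self._depth += 1
        try:
            yield
        finally:
            with self._cond:
                self._depth -= 1
                if self._depth == 0:
                    # The 1->0 transition: wake any waiting backup.
                    self._cond.notify_all()

    @contextmanager
    def quiesced(self):
        """Block until no transaction is open, then hold new ones off."""
        with self._cond:
            while self._depth > 0 or self._backup_running:
                self._cond.wait()
            self._backup_running = True
        try:
            yield
        finally:
            with self._cond:
                self._backup_running = False
                self._cond.notify_all()
```

The application wraps each unit of work in "with gate.transaction():"
and the backup job copies the files inside "with gate.quiesced():".
Note that this simple version lets a steady stream of new
transactions starve the backup; a real one would want a "backup
pending" flag as well.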

The result is "corrupted" database files (they aren't really corrupt;
they look more like there was a crash in the middle of however many
updates were in progress at the time).  Most database software will
not do a check on the data for you automatically, and you may not
be able to trigger crash-recovery behaviour following a restore,
without special knowledge.

The most common method is to export the FS as a share, and then
use Amanda with SAMBA (client) in order to back up the data; this
has the same problems.  See the online book at:

	http://www.backupcentral.com/amanda.html

specifically:

	http://www.backupcentral.com/amanda-13.html


> What about the handling of tape swapping, archiving, and
> other things normally done with stackers and libraries?

You use stackers and libraries.  8-).


> >  I also suggest that you avoid the "active file can't be backed up"
> >  problem, by choosing the correct software (and no, "snapshots" are
> >  not good enough, because they don't trap the right state for the
> >  implied metadata, among other deficiencies).
> 
>         Good point.  What are good tools to avoid this problem, at least
> with regards to FreeBSD?

There aren't any, per se.  You have to have people write the code
the right way, instead.  For Access, the most common method of
making code "backup safe" is to close it between operations, to
mark the ends of transactions.  The database software has the same
problem opening the file as you would, if the database software has
the file open (barring the IFSMgr shim approach).

Even so, this only works to protect the integrity of data or of
metadata, but not of implied metadata.  For example, if you have
a separate index and record file, and there is no guarantee on the
order of operation on the two of them, even if the application were
to sync the data out to guarantee idempotence against reuse of the
data by the application, that doesn't make the backup operation into
an atomic snapshot.  Even if the backup software opened the next file
before closing the last, you can't guarantee that the snapshot that
you will get is accurate.
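
A toy model of the index/record case, in Python, may make this more
concrete.  The two "files" are just an in-memory list and dict, and
every name here is hypothetical; it only shows the ordering problem,
not any real database format.

```python
def insert(store, key, value):
    """Application update: append the record, then point the index at it."""
    store["records"].append(value)
    store["index"][key] = len(store["records"]) - 1

def dangling_keys(index, records):
    """Index entries that point past the end of the copied record file."""
    return sorted(k for k, off in index.items() if off >= len(records))

def file_by_file_backup(store, between_copies):
    """Copy the record file, let the application run, then copy the index."""
    records_copy = list(store["records"])   # first file copied
    between_copies()                        # application keeps working
    index_copy = dict(store["index"])       # second file copied later
    return index_copy, records_copy

store = {"records": [], "index": {}}
insert(store, "a", "alpha")

# One insert lands between the two copies.
index_copy, records_copy = file_by_file_backup(
    store, lambda: insert(store, "b", "beta")
)
print(dangling_keys(index_copy, records_copy))          # the backup pair
print(dangling_keys(store["index"], store["records"]))  # the live pair
```

Each file in the backup is internally fine, and the live pair is
fine, but the copied index refers to a record the copied record file
doesn't have.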

Most professional database software comes with backup software that
accesses the data the same way the database software does, and
sometimes even does its dumps by connecting to the database, locking
records, and dumping them out (MySQL does it that way, but a record
that contains a non-indexed field which is an index for another
database can still have inconsistent state: that's an implied
metadata relationship).

If you have professional database software, then you will back up
the database contents with the vendor supplied software, and not
treat the database files as if they were files.

In most cases, you are talking about spending money for commercial
software, since the professional database backup software is often
a separate product add-on.

That's the case for all MS products using FreeBSD systems as a
file server to store database files on remotely accessed volumes.

If you're talking about FreeBSD databases on FreeBSD, then most of
them have ways to dump the database contents atomically, for backup
purposes.
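
As one example of going through the database instead of copying its
files, here is a small sketch using the sqlite3 module from the
Python standard library.  SQLite is my choice for illustration only,
since it needs no server; the function names are made up, and other
databases have their own dump tools that you should use instead.

```python
import sqlite3

def online_copy(src, dest):
    """Copy src into dest through SQLite's own backup interface."""
    # Connection.backup() reads the source the way the database
    # engine does, so the copy reflects a committed state rather
    # than whatever bytes happen to be in the file.
    src.backup(dest)

def dump_sql(conn):
    """Return the database contents as replayable SQL text."""
    return "\n".join(conn.iterdump())

src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE t (k TEXT PRIMARY KEY, v INTEGER)")
src.executemany("INSERT INTO t VALUES (?, ?)", [("a", 1), ("b", 2)])
src.commit()

# Binary copy through the engine.
copy = sqlite3.connect(":memory:")
online_copy(src, copy)

# Text dump, replayed into a fresh database.
replayed = sqlite3.connect(":memory:")
replayed.executescript(dump_sql(src))
```

Either form can then go to tape as an ordinary file, since nothing
has it open.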

-- Terry




