Date:      Tue, 6 Jun 2000 10:22:25 +0200 (CEST)
From:      Juergen Nickelsen <jnickelsen@acm.org>
To:        Anatoly Vorobey <mellon@pobox.com>
Cc:        chat@freebsd.org
Subject:   Re: Undelete in Unix (Was: Re: Why encourage stupid people to use *BSD)
Message-ID:  <14652.46273.709215.10448@goting.jn.berlin.snafu.de>
In-Reply-To: <20000606042412.A43514@happy.checkpoint.com>
References:  <3939F26A.A405DD4A@mail.ptd.net> <LPBBJIAAFFNFMKJGNIAIIEBJCAAA.keramida@ceid.upatras.gr> <20000605081334.C25970@ecto.greenpeas.org> <001501bfceee$34cb9a00$858c8c96@win2000.cc.ceid.upatras.gr> <20000605162025.A32447@happy.checkpoint.com> <x7d7lvlqs0.fsf@goting.jn.berlin.snafu.de> <20000606042412.A43514@happy.checkpoint.com>

In <20000606042412.A43514@happy.checkpoint.com>, Anatoly Vorobey writes:
> On Tue, Jun 06, 2000 at 12:14:07AM +0200, Juergen Nickelsen wrote:
[...]
> > You would also have to handle the case that multiple instances of
> > the same pathname get removed, so a kind of versioning scheme would
> > be needed.
> 
> I'd just use inode as the unique identifier. Thus links in .dustbin
> will look like origdir-origname-inode-->inode,  and inode hardlinked
> to the actual file. Slashes are encoded in origdir somehow, etc.

You'd also have to encode the inode itself for uniqueness to prevent
clashes with legal filenames. It has to be guaranteed unique,
otherwise it *will* clash sometime (says Murphy).
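
A minimal sketch of one such encoding, in Python for brevity. The
function names and the ":" separator are assumptions for illustration,
not something proposed in this thread. The idea is to percent-encode
the original pathname so that neither "/" nor the separator can appear
in it literally, and then append the inode:

```python
import os
from urllib.parse import quote, unquote_to_bytes

SEP = ":"  # assumed separator; quote(..., safe="") always escapes it


def dustbin_name(orig_path, inode):
    """Build a .dustbin entry name from the original path and the inode."""
    # Percent-encode every byte outside [A-Za-z0-9_.~-], so "/", ":" and "%"
    # in the original path cannot appear literally in the encoded part.
    encoded = quote(os.fsencode(orig_path), safe="")
    return f"{encoded}{SEP}{inode}"


def parse_dustbin_name(name):
    """Invert dustbin_name(): recover the original path and the inode."""
    encoded, inode = name.rsplit(SEP, 1)
    return os.fsdecode(unquote_to_bytes(encoded)), int(inode)
```

Because the separator never occurs unescaped in the encoded part, the
name can be split unambiguously, whatever the original filename
contained. The encoding makes names longer, though, so it does nothing
for the length problem mentioned next.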

There is the problem of the pathname length vs. filename length,
though, or you would have to rebuild the directory hierarchy. I'd
prefer to have a kind of database for performance reasons and
because I think you'd need it anyway (see below).

> When looking to undelete origdir-origname, search for
> origdir-origname-* in .dustbin, and using inode hardlinks present
> helpful information to the user about sizes, times, etc. of
> different candidates for undeletion.

This is not trivial, I think, especially if you have several
versions of the deleted file. Or if you don't exactly know where it
was. There should be a user interface like restore(8)'s, especially
including a -i option and the possibility to browse the different
versions. I can even imagine various frontends (GUI, Emacs, curses).

> > And, of course, a wrapper for the unlink(2) and rmdir(2)
> > system calls to catch other programs than just rm(1).
> 
> Well, that demands kernel level intervention (because of the static
> binaries), I was trying to explicitly show something that only uses
> user-mode code. But yes, simple wrappers for syscalls are possible,
> which will relegate to user-mode the actual handling, for robustness.

I meant wrappers in libc, but of course to catch existing binaries,
you'd have to do it in the kernel, performing the actual save
operation with a daemon.

[configurable]
> I agree. Could be a usermode-visible mount option which newrm(1) or 
> syscalls wrappers check. Per-directory configuration seems too
> bothersome to be worth it, though one can kludge away something like
> touch $dir/.dontundeletefileshere.

The daemon can as well read a database with patterns not to save for
undeletion. I think a file-system-wide option is not enough, as
there are /tmp, /usr/tmp, and /var/tmp, where you may or may not
save files for undeletion. You might also want to decide that you
don't want to save any *~ files, no files under /usr/obj, and no *.o
file in your home directory.
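
Such a pattern check could look roughly like this. It is only a
sketch: the list name, the function name and the home directory
/home/jn are made up for the example, and shell-style globs are just
one possible pattern language.

```python
from fnmatch import fnmatchcase

# Hypothetical exclusion list built from the examples above.
# In fnmatch, "*" also matches "/", so each pattern covers whole subtrees.
DONT_SAVE = [
    "/tmp/*",
    "/usr/tmp/*",
    "/var/tmp/*",
    "*~",            # editor backup files anywhere
    "/usr/obj/*",    # build output
    "/home/jn/*.o",  # object files below one (assumed) home directory
]


def should_save(path, patterns=DONT_SAVE):
    """Return True unless the removed path matches an exclusion pattern."""
    return not any(fnmatchcase(path, pattern) for pattern in patterns)
```

With this list, should_save("/home/jn/src/x.c") is true, while
"/home/jn/src/x.o" and "/etc/passwd~" are skipped.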

> > [...] it might be useful to maintain a log *when* which file was
> > deleted, in order to be able to restore a consistent state of a
> > certain time in the past.
> 
> The log is simply there in inode, isn't its access time changed
> when a directory entry to it is unlinked? If not, it can be made
> to change. Since the inode is effectively being frozen when the
> file is "deleted", the only reference to it staying in .dustbin,
> this information is available later for undeletion considerations.

Probably a ctime change would be most appropriate. But I think it
isn't enough to save a directory entry only if this is the last link
to the file. All entries should be saved (though not . and .. for
directories), because I might not even know where the other link to
a file is, when I deleted one accidentally. Or I might have deleted
both in the same action -- which one is privileged to be saved? To
do it right, (nearly) every removed link has to be saved, and the
deletion time has to be saved separately, since the ctime of the
inode may of course change later for different reasons.

This is the place where the database idea for deleted entries comes
up again. This would also save any filename clash or filename length
trouble.
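
To make the database idea a bit more concrete, here is a sketch using
SQLite from Python. The table layout and all names are assumptions;
the point is only that every removed link gets its own row with its
own deletion time, so pathname length, name clashes and later ctime
changes stop being an issue.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS removed_links (
    id         INTEGER PRIMARY KEY,
    path       TEXT    NOT NULL,   -- full original pathname, any length
    inode      INTEGER NOT NULL,   -- inode the removed link pointed to
    deleted_at REAL    NOT NULL    -- deletion time, independent of ctime
);
CREATE INDEX IF NOT EXISTS removed_by_path
    ON removed_links (path, deleted_at);
"""


def open_db(filename=":memory:"):
    conn = sqlite3.connect(filename)
    conn.executescript(SCHEMA)
    return conn


def record_removal(conn, path, inode, deleted_at):
    """Log one removed link; several rows may share a path or an inode."""
    conn.execute(
        "INSERT INTO removed_links (path, inode, deleted_at) VALUES (?, ?, ?)",
        (path, inode, deleted_at),
    )
    conn.commit()


def candidates(conn, path):
    """Return (inode, deleted_at) for every saved version, newest first."""
    cur = conn.execute(
        "SELECT inode, deleted_at FROM removed_links "
        "WHERE path = ? ORDER BY deleted_at DESC, id DESC",
        (path,),
    )
    return cur.fetchall()


def removed_since(conn, since):
    """Return (path, inode) for links removed at or after `since`."""
    cur = conn.execute(
        "SELECT path, inode FROM removed_links "
        "WHERE deleted_at >= ? ORDER BY deleted_at, id",
        (since,),
    )
    return cur.fetchall()
```

candidates() is what an interactive frontend would browse, and
removed_since() lists the links to bring back when restoring the
state of a certain time in the past.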

> > "echo 'test:*:101:20:test account:/home/test:/bin/sh' > /etc/passwd"
> > where ">>" was meant. Should the same level of protection apply to
> > file that are accidentally overwritten?
> 
> This sounds too complicated. Well, OK, how about this -- catch
> ftruncate's to zero size, and clone the file in such cases,
> storing it in .dustbin in a similar fashion to above.
> 
> Don't even start talking about mmap'ed files, please :)

:-) I wasn't too serious with this one. But catching {f,}truncate()
is a good idea. A versioning file system (as with VMS and its
predecessors) would even make this unnecessary.
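
In user-mode terms, the truncate case could be sketched like this
(Python again; the function name and the naming of the saved copy are
assumptions, and a real version would sit in libc or the kernel as
discussed above):

```python
import os
import shutil


def guarded_truncate(path, length, dustbin):
    """Truncate `path`, first saving a copy if it would be emptied.

    Returns the path of the saved copy, or None if nothing was saved.
    """
    saved = None
    if length == 0 and os.path.getsize(path) > 0:
        inode = os.stat(path).st_ino
        # Hypothetical naming: basename plus inode. A real scheme would
        # need the full original path and some versioning as well.
        saved = os.path.join(dustbin, f"{os.path.basename(path)}.{inode}")
        shutil.copy2(path, saved)  # clone contents and timestamps
    os.truncate(path, length)
    return saved
```

Note that a shell's ">" redirection normally truncates by opening the
file with O_TRUNC rather than by calling ftruncate(), so open(2) would
have to be caught in the same way to cover the /etc/passwd example.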


As you may have guessed by now I am undecided between "forget it, it
is too hard to do right" and "well, what *has* to be done?" It even
begins to sound feasible to me, although also like some work.

Greetings, Juergen.

