Date:      Tue, 30 Apr 2002 09:27:17 -0700
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Joshua Steele <jsteele@CodefusionIS.com>
Cc:        Michael Sierchio <kudzu@tenebras.com>, freebsd-fs@freebsd.org
Subject:   Re: newfs overwrite...
Message-ID:  <3CCEC5E5.FED0CBF@mindspring.com>
References:  <20020429121106.V97112-100000@lilly>

Joshua Steele wrote:
> Well..this was the backup/storage server.  I contacted drivesavers, and
> its going to be about 7,000.00US to get it fixed by them...which is not an
> option because i do not have that much in resources to get the drive fixed
> (i am a small business)
> 
> Are there any other tools, etc. for freebsd that aide in rebuilding the fs
> table?  Or am i basically not going to be able to repair the drive, and
> might as well move on and start salvaging what financial data i do have at
> the current time before the tax quarter is up....

Buy a much-larger-than-60G disk (preferably, more than twice as
large), and:

1)	dd the image of the 60G disk into a single file

	(Note: Not really necessary, but it prevents you from
	screwing up your "live" disk.)

2)	Start copying out chunks of data based on cylinder
	groups, and identification of secondary indirect
	blocks
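
Steps 1 and 2 can be sketched in Python roughly as follows.  This is
a minimal sketch, not a recovery tool: the names copy_image and
read_chunk are made up for illustration, and it assumes the source
disk reads without I/O errors (a failing drive needs more care).

```python
def copy_image(src_path, dst_path, block_size=1 << 20):
    """Copy src_path (a disk device or file) to dst_path; return bytes copied."""
    total = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:          # end of device/file
                break
            dst.write(block)
            total += len(block)
    return total


def read_chunk(image_path, offset, length):
    """Return up to `length` bytes of the image starting at byte `offset`."""
    with open(image_path, "rb") as img:
        img.seek(offset)
        return img.read(length)
```

Working from the image file means every later experiment is a plain
read of a regular file, and the original disk is never touched again.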

The data better be *really* valuable, as this is a manual,
labor-intensive operation.  If it's recognizable to a human, then
you are going to be doing a lot of looking; if it's not, you are
going to be using the remainder of the disk space to write some
programs to recover particular types of file contents.

It's always easiest if the drive is human readable.  I recovered
a good 250,000 lines of source code from a spammed drive this way,
at one time in my misspent youth, so that the project, due in a
couple of days after the fact, would not be turned in late.


The main problem is that when you delete a file, the physical
analogy is to take the contents out of the file folder, rip the
label off the file folder, and then shuffle the pages that were
in the folder into your blank printer paper (knowing that the
printer will erase them before it prints on them), after which
you throw the file folder back into the supply cabinet.

You've basically done this with all your files.

If the papers don't contain binary information (e.g. the moral
equivalent of encrypted data, in terms of being able to identify
which piece of paper goes in which file folder, or which piece
of paper goes in what order), then it's just a big sorting job.
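
A first pass at that sorting job is separating the human-readable
blocks from everything else.  The sketch below is one possible
heuristic, not anything specific to FFS: the 8192-byte block size and
the 95% threshold are assumptions you would tune for your own disk,
and the function names are invented for illustration.

```python
# Printable ASCII plus tab, newline, and carriage return.
PRINTABLE = frozenset(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}


def printable_fraction(block):
    """Fraction of bytes in `block` that look like plain ASCII text."""
    if not block:
        return 0.0
    return sum(b in PRINTABLE for b in block) / len(block)


def text_blocks(image_path, block_size=8192, threshold=0.95):
    """Yield (offset, block) for each block that looks like plain text."""
    with open(image_path, "rb") as img:
        offset = 0
        while True:
            block = img.read(block_size)
            if not block:
                break
            if printable_fraction(block) >= threshold:
                yield offset, block
            offset += len(block)
```

The offsets it yields are what you then eyeball and put in order by
hand.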

If it's binary data, you can basically perform an iterative
search based on your knowledge of the contents, in order to
recover the data.  For an executable, this is probably not
worthwhile (you can always replace it), but identifying "magic
numbers" for things like Postscript, ELF executables, etc.,
are actually very easy; the remainder of the file, less so.
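
As a rough illustration of the magic number hunt: PostScript files
begin with "%!PS" and ELF files with the bytes 0x7f 'E' 'L' 'F'.  The
sketch below only checks the start of each 512-byte sector, on the
assumption that file data begins on a sector boundary; the table and
the function name find_magic are made up for this example.

```python
# Leading bytes of a file -> what kind of file it probably is.
MAGIC = {
    b"%!PS": "PostScript",
    b"\x7fELF": "ELF",
}


def find_magic(image_path, align=512):
    """Return [(offset, kind)] for sectors that start with a known magic."""
    hits = []
    with open(image_path, "rb") as img:
        offset = 0
        while True:
            sector = img.read(align)
            if not sector:
                break
            for magic, kind in MAGIC.items():
                if sector.startswith(magic):
                    hits.append((offset, kind))
            offset += len(sector)
    return hits
```

This finds where a file probably starts; working out where it ends,
and which blocks lie in between, is the hard part.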

The other hint you have is that every set of 9 pages in large
file folders is "stapled together" -- members of the same
cylinder group.

If you have a rough idea of the FS size (which you do), then
examining the post-newfs disk read-only will tell you where
all the FS layout information lives.  From this, you can
probably recover directory information pretty easily, which
can give you inode and relative cylinder group information;
doing this requires a fairly deep understanding of the FS in
question.
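
To give a feel for the directory part, here is a sketch that checks
whether a 512-byte chunk parses as FFS directory entries.  It assumes
the 4.4BSD-style "struct direct" layout (32-bit inode, 16-bit record
length, 8-bit type, 8-bit name length, then the name) in little-endian
byte order; check that against <ufs/ufs/dir.h> on your system before
trusting it.  The function name is invented for illustration.

```python
import struct

DIRBLKSIZ = 512                   # directory chunk size (assumed)
HEADER = struct.Struct("<IHBB")   # d_ino, d_reclen, d_type, d_namlen


def parse_dir_chunk(chunk):
    """Return [(inode, name)] if `chunk` parses as directory entries, else []."""
    entries = []
    pos = 0
    while pos + HEADER.size <= len(chunk):
        ino, reclen, _dtype, namlen = HEADER.unpack_from(chunk, pos)
        # A record must be 4-byte aligned, hold header + name + NUL,
        # and must not run past the end of the chunk.
        if (reclen % 4 or reclen < HEADER.size + namlen + 1
                or pos + reclen > len(chunk)):
            return []
        start = pos + HEADER.size
        if ino != 0:              # inode 0 marks an unused slot
            entries.append((ino, chunk[start:start + namlen].decode("latin-1")))
        pos += reclen
    # The records of a real directory chunk fill it exactly.
    return entries if pos == len(chunk) else []
```

Run over every DIRBLKSIZ-aligned chunk of the image, the non-empty
results are candidate directory blocks, and the inode numbers in them
are the starting point for working out cylinder group placement.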

The drive recovery place might be a deal.  Basically, they copy
the normally readable data off the disk, and then read the disk,
taking head hysteresis into account, to recover the misaligned
track writes, if any, to recover the data (which is why MILSPEC
erasure requires the writing of patterned data to the disk, from
both seek directions, to achieve erasure of "secret" data).


From a theoretical standpoint:

o	Everything above is predicated on the idea you are using
	FFS.  If you use another FS, the recovery details become
	very much easier or very much harder, depending on the
	FS.

o	It's pretty trivial to change the process to lazy-bind
	the contents of deleted information, so that instead of
	writing zero'ed inodes to the disk, you leave the index
	information intact, and only zero it on reallocation;
	this makes undeleting files a lot easier, because it
	doesn't put the unlabelled file folder back into the
	file cabinet.  It also leaves the papers in the folder,
	though they are available for the printer to grab and
	clear at random, if it's asked to print (saving new files
	to the FS may overwrite "deleted" data).  This would be a
	rather simple operation for FFS, actually.

o	It's also pretty trivial to change it so that formatting
	actually scrubs the disk, and then deletion also scrubs
	the disk.  In combination, this would be a bad thing, but
	separately, it would allow you to recover a lot of data
	much more quickly, by being able to rule out large amounts
	of disk space from consideration.

o	It's pretty trivial to change the formatting process to
	resemble the Windows formatting process, which means
	that the newfs can be made largely reversible.  This is
	actually probably a pretty good idea, for general small
	businesses like yours, actually.  No one has seriously
	attempted to productize UNIX, yet... not even Univel,
	back in the day.

	Anecdote time: One thing we often did at the local
	university any time a machine was donated was to first
	undelete everything, and see if there were games on the
	disks.  The FS layout helped us considerably.  This was
	before doing such things was considered illegal.

o	If you are depending on the data being unrecoverable merely
	because you format the disk... it's not going to happen...

o	The data is always recoverable.  How quickly is a matter
	of the effort you are willing to expend.  Depending
	on the unrecoverability of the data is a losing proposition,
	unless it's encrypted, and if it's something like DES, using
	"the crypt breaker's workbench" makes it pretty trivial to
	recover the data, as well.

o	Having some of the financial data on hand in a format that
	allows recreation of partial data gives you enough information
	that you can probably eliminate the data

There are some things that can make a disk unrecoverable, but they
all require the use of cryptographic mechanisms.  If you have used
a good one on your financial data on the disk... it's time to start
over entering the data.

If you have time pressure on you right now, spend the money.  If you
have some leeway, then recover the data the slow way, and if it's not
panning out, then spend the money before the time-to-recover window
closes.


You might also look at this as an opportunity to build the tools
needed to recover the data more quickly.  It's actually not that
difficult to build such tools, and you have a test image (now)
that is a relatively expensive thing to create.  8-(.

Frankly, any time I've done this to a disk, I've always been most
concerned with a small subset of the data, not the whole disk, so
the recovery was simultaneously much easier and much less totally
labor intensive; like a linear search, I could stop after only
having examined about 50% of my total data set.  It also means
that all the tools I wrote for the job were so small that I just
threw them away when I was done with them (e.g. I didn't archive
them for posterity, but I also didn't actively seek to get rid of
them, they just got backed up on tape and ignored, over time).

-- Terry

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message