Date:      Mon, 24 Sep 2001 21:49:34 +1000
From:      Mark Hannon <markhannon@optushome.com.au>
To:        freebsd-hackers@freebsd.org
Subject:   dump files too large, nodump related??
Message-ID:  <3BAF1DCE.4BA21B8D@optushome.com.au>

Hi,

I have noticed some strange behaviour with 4.3-RELEASE and dump.  I have
been dumping my filesystems through gzip into a compressed dumpfile.
Some of the resulting dumps have been MUCH larger than I would expect.
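
For reference, the wrapper that produces the output below boils down to
something like this per filesystem (the exact flags may differ slightly,
so treat this as a sketch):

  /sbin/dump -8uf - /home | /usr/bin/gzip > /var/dumps/home8.gz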

As an example, I have just dumped my /home partition ... note that lots
of directories on this partition are marked nodump, e.g. /home/ftp,
which is one of the biggest users of disk space.
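
(For reference, those directories were flagged along these lines; the
flag shows up in the flags column of ls -lo:)

  chflags nodump /home/ftp
  ls -lod /home/ftp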

Building 8 level dump of /home and writing it to /var/dumps//home8.gz (gzipped)
  DUMP: Date of this level 8 dump: Mon Sep 24 21:13:55 2001
  DUMP: Date of last level 1 dump: Tue Sep 18 20:15:43 2001
  DUMP: Dumping /dev/ad0s1h (/home) to standard output
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 360780 tape blocks.
  DUMP: dumping (Pass III) [directories]
  DUMP: dumping (Pass IV) [regular files]
  DUMP: 30.76% done, finished in 0:11
  DUMP: 60.89% done, finished in 0:06
  DUMP: DUMP: 360664 tape blocks
  DUMP: finished in 849 seconds, throughput 424 KBytes/sec
  DUMP: level 8 dump on Mon Sep 24 21:13:55 2001
  DUMP: DUMP IS DONE

The GZIPPED dumpfile is 289 MB!!!   

I wrote a little perl script (attached) to check the table of contents
and estimate how big the dump should be.
doorway:~> proj/dumpsize/dumpsize.pl /home /var/dumps/home8.gz 
Level 8 dump of /home on doorway.home.lan:/dev/ad0s1h
Label: none
The level 0 dump of /home partition written to /var/dumps/home8.gz
contains 689 files totalling 146450 KB, cf size of dumpfile = 282063 ( 360660 ) KB

The following files are larger than 1024 KB in size:
163264 ./mark/.netscape/xover-cache/host-news/athome.aus.service.snm
1343488 ./mark/.netscape/xover-cache/host-news/athome.aus.support.snm
2097152 ./mark/.netscape/xover-cache/host-news/athome.aus.users.linux.snm
1754819 ./mark/.netscape/xover-cache/host-news/hostinfo.dat
1122336 ./samba/profile.9x/mark/USER.DAT
1441792 ./samba/profile.9x/tuija/History/History.IE5/index.dat
92440996        ./tuija/Mail/Archive/Sent Items 2001
2985510 ./tuija/My Documents/gas1.JPG
2528914 ./tuija/My Documents/gas2.JPG

The interesting thing here is that the sum of all the file sizes in the
dump is only 147MB, cf the 361MB uncompressed dump size!  This is a
discrepancy of 210MB.  (This would line up with the 180MB ISO image,
plus other dribs and drabs, that I have stored in a nodump-flagged
directory since my last dump.)

Any ideas of what is wrong?  Are the nodumped files stored on the dump
for some reason, even though they don't appear in the restore table of
contents?
Regards/Mark
[Attachment: dumpsize.pl]

#!/usr/bin/perl
#
# $Id: dumpsize.pl,v 1.4 2001/09/23 05:33:22 mark Exp mark $
#
# Usage: dumpsize.pl [-list] partition gzipped_dumpfilename
#

use strict;

my ($progname) = "dumpsize.pl";	# Name of this program for errors
my ($progusage) = 
  "[-list] partition gzipped_dumpfilename";
my ($list_flag) = 0;		# 1 if -list specified on command line
my ($threshold) = 1024 * 1024;	# Threshold to print out files
my ($tmp_dump_gzip) = "/tmp/dump_gzip";
my ($tmp_dump_toc) = "/tmp/dump_toc";

my ($partition);		# Name of partition in dumpfile
my ($dumpfile);			# Name of dumpfile
my ($dumplevel) = 0;		# Dump level - not implemented yet!
my ($dump_size);		# Size, in bytes, of unzipped dumpfile
my ($dump_size_gzip);		# Size, in bytes, of gzipped dumpfile
my ($dump_is_gzipped) = 1;	# 1 if dumpfile has been gzipped

my ($i);			# Loop counter

my ($total_size) = 0;		# Sum of file sizes included in dump

my (@line);			# All lines for TOC stored in this array
my (@token);			# Temporary variable used to split lines

my (@leaf_file);		# Unsorted array of leafnodes found in TOC
my (@file_list);		# Sorted array of leafnode filenames
my (@file_size);		# Parallel array of file sizes ...
my ($no_files) = 0;		# Number of files on tape


# -----------------------------------------------------------------------------------
# Parse command line, open dump file and table of contents

unless( defined( $ARGV[0] ) && $ARGV[0] ne "" ){
  die( "Usage: $progname $progusage\n" );
}

if ( $ARGV[0] eq "-list" ){
  $list_flag = 1;
  $partition = $ARGV[1];
  $dumpfile = $ARGV[2];
} else {
  $list_flag = 0;
  $partition = $ARGV[0];
  $dumpfile = $ARGV[1];
}

# NB: the dumpfile should be given as an absolute path, since we chdir
# to the partition before the dumpfile is opened below.
unless( chdir ( $partition ) ){
  die ( "$progname : Unable to chdir to partition $partition\n" );
}

unless( -e $dumpfile ){
  die( "$progname : Unable to open gzipped dumpfile $dumpfile\n");
}

# -----------------------------------------------------------------------------------
# Calculate uncompressed size of dumpfile

$dump_size_gzip = -s $dumpfile;

system( "/usr/bin/gzip -l $dumpfile > $tmp_dump_gzip.$$" );

unless( open( GZIP_DETAILS, "$tmp_dump_gzip.$$" ) ){
  die( "$progname : Unable to open TOC file $tmp_dump_gzip.$$\n" );
}

@line = <GZIP_DETAILS>;
$line[1] =~ s/^ +//;
@token = split( / +/, $line[1] );
$dump_size = $token[1];

# -----------------------------------------------------------------------------------
# Parse restore TOC and look for all leaf entries, store contents into @file_list

system( "/usr/bin/zcat $dumpfile | /sbin/restore tvf - > $tmp_dump_toc.$$" );

unless( open( RESTORE_TOC, "$tmp_dump_toc.$$" ) ){
  die( "$progname : Unable to open TOC file $tmp_dump_toc.$$\n" );
}

@line = <RESTORE_TOC>;

for ( $i = 0 ; $i < @line ; $i++ ){
  if ( $line[$i] =~ /^leaf/ ){		# only regular files, skip "dir" entries
    @token = split(/\t/, $line[$i] );
    chomp( $token[1] );			# filename is the second tab-separated field
    push( @leaf_file, $token[1] );
  }
}

@file_list = sort( @leaf_file );

for ( $i = 0 ; $i < @file_list ; $i++ ){
  $file_size[$i] = -s $file_list[$i] || 0;	# 0 if the file has gone since the dump
  if ( $file_size[$i] > 0 ){
    $no_files++;
  }
  $total_size += $file_size[$i];
}


# -----------------------------------------------------------------------------------
# Print detailed list of dumpfiles

if ( $list_flag == 1 ){
  for ( $i = 0 ; $i < @file_list ; $i++ ){
    printf( "%d\t%s\n" , $file_size[$i], $file_list[$i] );
  }
}

# -----------------------------------------------------------------------------------
# Print summary of results

printf( "The level %d dump of %s partition written to %s\n", 
	$dumplevel, $partition, $dumpfile );
printf( "contains %d files totalling %d KB, cf size of dumpfile = %d ( %d ) KB\n", 
	$no_files, $total_size/1024, $dump_size_gzip/1024, $dump_size/1024 );

printf( "\nThe following files are larger than %d KB in size:\n", $threshold/1024 );
for ( $i = 0 ; $i < @file_list ; $i++ ){
  if ( $file_size[$i] > $threshold ){
    printf( "%d\t%s\n" , $file_size[$i], $file_list[$i] );
  }
}

# -----------------------------------------------------------------------------------
# Cleanup temporary files etc.

unlink ( "$tmp_dump_gzip.$$" );
unlink ( "$tmp_dump_toc.$$" );
