Date:      Thu, 6 Jun 2002 22:49:48 +0200
From:      Andreas Koch <koch@eis.cs.tu-bs.de>
To:        freebsd-stable@freebsd.org
Subject:   4.6-RC: Glacial speed of dump backups
Message-ID:  <20020606204948.GA4540@ultra4.eis.cs.tu-bs.de>

The following applies to 4.6-RC cvsup'ped on May 23.

I noticed that I have considerable problems getting dump to actually
stream on my DLT VS80 tape drive, regardless of the block length
specified (`-b' option to dump).  Instead, the drive operates in
start-stop-rewind-start mode (colloquially known as shoe-shining). 
Note that it is possible to actually operate in streaming mode when I
pipe the output of dump to a buffering program such as team (from the
ports tree).  But then the multi-volume capability, which depends on
the end-of-media recognition of dump (quite essential when writing to
compressed tapes with variable capacity) is of course no longer
available.

So, I looked deeper to enable the direct use of dump without a
buffering utility, and something seems fishy.  But first, some more
details:  I am trying to back up a filesystem hosted on an Adaptec
2400A IDE RAID controller operating in RAID 5 mode.  The machine
itself has an Athlon XP 1700+ CPU and 512 MB of memory.  The sustained
read throughput from the RAID (as measured by reading 1G of data from
a raw partition to /dev/null using dd) is 16-17MB/s.  When using the
file system, this is only a bit lower (roughly 16 MB/s for
non-fragmented files).  The tape drive can accept 3 MB/s natively and
up to 6 MB/s for compressible data.  For a well-compressible file
system such as /usr, a high input data rate is thus a necessity to
keep the tape streaming.  This machine was newly installed and hasn't
been used much, thus the degree of fragmentation on the file systems
is very low (for the /usr example used below, fsck reports a value of
just 0.2%).
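
For reference, a raw read test of this kind can be sketched as follows. The device name in the comment and the 64 KB block size are assumptions for illustration only; the exact dd invocation behind the 16-17MB/s figure is not recorded here.

```shell
#!/bin/sh
# Sketch only: read COUNT blocks of 64 KB from SRC and discard them.
# dd prints its own transfer summary (bytes and elapsed time) on stderr.
raw_read_test() {
    src=$1
    count=$2
    dd if="$src" of=/dev/null bs=64k count="$count"
}

# Hypothetical invocation: 16384 blocks of 64 KB add up to 1 GB.
# raw_read_test /dev/rda0s1e 16384
```

Dividing the byte count by the elapsed time in the dd summary gives the sustained read rate.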

However, when using a command such as

	dump -0af /dev/nsa0 /usr

the average throughput reported by dump is only 2.7MB/s (which explains the
constant shoe shining). Adding a buffering command such as 

	dump -0af - /usr | team 8m 32 >/dev/nsa0

keeps the tape streaming and leads to an average throughput of 5.7MB/s
(but makes multi-volume backups impossible).

After establishing in this way that the path to the tape itself is not at
fault, I performed some more experiments concentrating on dump writing to
/dev/null while keeping an eye on the iostat output for the disk. To my
great astonishment, the command

	dump -0af /dev/null /usr

has roughly the following throughput profile:

1) In the first phase, dump's Pass I and II (mapping files and
directories), I get the following data from iostat:

      tty             da0             acd0             acd1           cpu
 tin tout  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   0   76 16.00 783 12.24   0.00   0  0.00   0.00   0  0.00   1  0  1 2 96
   0   76 16.00 742 11.59   0.00   0  0.00   0.00   0  0.00   1  0  0 1 98
   0   76 16.00 777 12.14   0.00   0  0.00   0.00   0  0.00   1  0  2 0 97
   0   76 16.00 743 11.60   0.00   0  0.00   0.00   0  0.00   1  0  0 0 99
   0   76 16.00 769 12.02   0.00   0  0.00   0.00   0  0.00   0  0  0 0100
   0   76 16.00 770 12.04   0.00   0  0.00   0.00   0  0.00   0  0  0 0100
   0   76 16.00 757 11.83   0.00   0  0.00   0.00   0  0.00   2  0  0 0 98
   0   76 16.00 768 12.00   0.00   0  0.00   0.00   0  0.00   0  0  1 1 98
   0   76 16.00 766 11.97   0.00   0  0.00   0.00   0  0.00   1  0  0 2 97

Thus, dump appears to be reading in 16KB blocks from the disk da0,
leading to a throughput of 11-12 MB/s, which isn't too shabby.

2) Then, in Pass III, dumping directories, the directory data is
supposed to be written to tape (or, in this scenario, to /dev/null).
Now the throughput profile changes to 

      tty             da0             acd0             acd1           cpu
 tin tout  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   0   76  2.32 651  1.47   0.00   0  0.00   0.00   0  0.00   1  0  4 0 95
   0  229  2.33 757  1.72   0.00   0  0.00   0.00   0  0.00   0  0  2 1 97
   0   76  2.34 581  1.33   0.00   0  0.00   0.00   0  0.00   0  0  3 0 97
   0   76  2.34 549  1.25   0.00   0  0.00   0.00   0  0.00   0  0  3 0 97
   0   76  2.41 548  1.29   0.00   0  0.00   0.00   0  0.00   1  0  0 1 98
   0   76  2.35 647  1.49   0.00   0  0.00   0.00   0  0.00   0  0  3 0 97
   0   76  2.32 672  1.52   0.00   0  0.00   0.00   0  0.00   0  0  3 0 97
   0   76  2.36 577  1.33   0.00   0  0.00   0.00   0  0.00   0  0  1 4 95
   0   76  2.34 678  1.55   0.00   0  0.00   0.00   0  0.00   0  0  3 0 97
   0   76  2.34 788  1.80   0.00   0  0.00   0.00   0  0.00   0  0  3 3 94

Two things are immediately noticeable: dump now accesses the disk only
in roughly 2.3 KB transfers, and throughput drops correspondingly to
about 1.3-1.8 MB/s. With a real tape, shoe-shining would begin as soon
as writing starts.
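
The relation between the iostat columns can be checked by hand: MB/s is approximately KB/t times tps, divided by 1024. A minimal sketch, assuming iostat counts a megabyte as 1024 KB; the helper name mbps is made up for this illustration, and the inputs are the first rows of the two tables above.

```shell
#!/bin/sh
# Sketch only: estimate MB/s from the KB/t and tps columns of iostat.
# Assumes 1 MB = 1024 KB; "mbps" is a hypothetical helper name.
mbps() {
    awk -v kbt="$1" -v tps="$2" 'BEGIN { printf "%.2f\n", kbt * tps / 1024 }'
}

mbps 16.00 783   # Pass I/II, first row: prints 12.23 (iostat shows 12.24)
mbps 2.32 651    # Pass III, first row:  prints 1.47  (iostat shows 1.47)
```

The small difference in the first case is presumably due to rounding in the displayed columns. Since the transaction rate is about the same in both phases, the drop in throughput comes almost entirely from the smaller transfer size.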

3) In Pass IV, dumping files, throughput rates vary wildly:

      tty             da0             acd0             acd1           cpu
 tin tout  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   0   76  2.49 273  0.66   0.00   0  0.00   0.00   0  0.00   0  0  0 0100
   0   76  2.45 268  0.64   0.00   0  0.00   0.00   0  0.00   0  0  0 0100
   0   76  2.67 297  0.78   0.00   0  0.00   0.00   0  0.00   0  0  2 1 97
   0   76  3.00 290  0.85   0.00   0  0.00   0.00   0  0.00   0  0  1 0 99
   0   76  2.66 313  0.81   0.00   0  0.00   0.00   0  0.00   0  0  1 0 99
   0   76  3.76 281  1.03   0.00   0  0.00   0.00   0  0.00   0  0  1 0 99
   0   76  2.67 445  1.16   0.00   0  0.00   0.00   0  0.00   1  0  1 1 97
   0   76  3.66 227  0.81   0.00   0  0.00   0.00   0  0.00   1  0  0 0 99
   0  229  2.49 649  1.58   0.00   0  0.00   0.00   0  0.00   0  0  2 0 98
   0   76  2.40 974  2.28   0.00   0  0.00   0.00   0  0.00   0  0  1 3 96
   0   76  4.87 732  3.48   0.00   0  0.00   0.00   0  0.00   0  0  4 1 95
   0   76  3.14 840  2.57   0.00   0  0.00   0.00   0  0.00   1  0  7 1 91
   0   76  4.35 796  3.38   0.00   0  0.00   0.00   0  0.00   3  0  5 1 91
   0   76  3.70 964  3.49   0.00   0  0.00   0.00   0  0.00   0  0  3 1 96
   0   76  2.77 833  2.25   0.00   0  0.00   0.00   0  0.00   1  0  1 0 98
   0   76  2.62 1073  2.75   0.00   0  0.00   0.00   0  0.00   1  0  1 2 96
   0   77  3.56 451  1.57   0.00   0  0.00   0.00   0  0.00   0  0  2 0 98
   0   76  3.29 686  2.20   0.00   0  0.00   0.00   0  0.00   0  0  4 1 95

In general, all of these rates are insufficient to keep the tape
streaming (especially when considering the compressibility of the
data). Furthermore, the transfer sizes used for the reads remain quite
small (2.4-4.9 KB), and the number of transactions per second is lower
on average, although with some peaks.
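
Because the rates fluctuate so much, an average over the samples is easier to compare against the tape's requirements than individual rows. A minimal sketch that averages the MB/s column of the first device from iostat output on stdin; the helper name avg_mbps is made up for this illustration, and it assumes the column layout shown above (MB/s of the first device in field 5).

```shell
#!/bin/sh
# Sketch only: average field 5 (MB/s of the first device) over all
# data rows of iostat output. Header rows are skipped because their
# first field is not a number. "avg_mbps" is a hypothetical name.
avg_mbps() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 5 { sum += $5; n++ }
         END { if (n > 0) printf "%.2f\n", sum / n }'
}

# Demo on three of the Pass IV rows quoted above.
avg_mbps <<'EOF'
 tin tout  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   0   76  2.49 273  0.66   0.00   0  0.00   0.00   0  0.00   0  0  0 0100
   0   76  4.87 732  3.48   0.00   0  0.00   0.00   0  0.00   0  0  4 1 95
   0   76  3.70 964  3.49   0.00   0  0.00   0.00   0  0.00   0  0  3 1 96
EOF
```

On a live system the function could be fed from something like `iostat -w 1 -c 60 | avg_mbps`, keeping in mind that the first row iostat prints is an average since boot rather than a one-second sample.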

Given the low degree of fragmentation, dump should easily be able to
achieve the 6MB/s required to operate the tape drive in streaming mode,
especially since dump appears to perform some double buffering
internally (three processes:  one reading, one for slack, one
writing).

Currently, the only explanation I can think of is this:  dump always
seems to read the entire file at once if the file size is less than
64KB.  Otherwise, the file is read in multiple 64KB chunks.  Each
chunk is sent individually to the tape, even if the block size is
larger (e.g., -b 1000 to set 500KB blocks is capped at 64KB).  Thus,
for small files, the lack of adequate buffering plus the file system
overhead for the large number of files leads to the reduction in
throughput.  The reason `| team 8m 32' helps is that many of these small
files are collected together and the write to tape is only
started after sufficient data has actually been accumulated.  Maybe
someone more familiar with the internal operation of dump could
clarify this (I am not too clear on the interaction between the master
and slave processes from glancing at the source).

If the previous hypothesis is indeed true, dump in its current form
would be severely limited when trying to use reasonably fast tape
drives in a multi-volume backup situation.  As for the speed of the
tape drive:  The DLT VS80 I used is actually on the lower end of the
spectrum.  SDLT has 11 MB/s native, and LTO goes up to 15 MB/s native. 
How do people with those drives keep their tapes streaming? 
Alternatively, does anyone know of a more intelligent dump variant for
FreeBSD that performs better buffering internally?

I would be grateful for any comments. Am I overlooking something?
Or is there a real problem in dump as distributed by
FreeBSD?

Many thanks for any help (and for the patience in reading these ramblings :),
  Andreas Koch

-- 
Andreas Koch                                      Email  : koch@eis.cs.tu-bs.de
Technische Universit"at Braunschweig              Phone  : x49-531-391-2384
Abteilung Entwurf integrierter Schaltungen        FAX    : x49-531-391-5840
M"uhlenpfordtstr. 23, D-38106 Braunschweig, Germany      * PGP key available * 
