Date:      Thu, 15 Aug 2002 14:43:12 -0700
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Patrick Thomas <root@utility.clubscholarship.com>
Cc:        Daniel O'Connor <doconnor@gsoft.com.au>, freebsd-hackers@FreeBSD.ORG
Subject:   Re: possible to expand a file for vn-device FS usage ?
Message-ID:  <3D5C2070.4F2C17A3@mindspring.com>
References:  <20020815102056.J58763-100000@utility.clubscholarship.com>

Patrick Thomas wrote:
> What is the negative effect of this fragmentation, and does it mean I
> won't be able to use all of the space that I added ?


Old disk:

[X X ][XX  ][ XX ][X  X][  XX]

New disk (initial state):

[X X ][XX  ][ XX ][X  X][  XX][    ][    ][    ][    ][    ]

New disk (after 10 allocations):

[XXX ][XX X][XXX ][XX X][ XXX][X   ][ X  ][   X][ X  ][  X ]

New disk (after 20 allocations):

[XXXX][XXXX][XXXX][XXXX][XXXX][X X ][XX  ][ X X][ XX ][  XX]

Result: slowed access times and a very long allocation cycle, even
though there is a lot of free space (100% fill on the bottom half,
50% fill on the top half, when an even spread would put every
cylinder group at the overall 75% fill).
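The bias is easy to reproduce with a toy simulation (a hypothetical Python sketch of the scenario in the diagrams above, not the actual FFS allocator): pre-fill the bottom half of the cylinder groups to 50%, grow the disk, then allocate by picking groups uniformly at random.

```python
import random

GROUPS, SLOTS = 10, 4  # 10 cylinder groups of 4 blocks, as in the diagrams

def simulate(allocations, seed=0):
    rng = random.Random(seed)
    # Old disk: bottom 5 groups start 50% full; the 5 new groups are empty.
    used = [2] * 5 + [0] * 5
    for _ in range(allocations):
        # Pick cylinder groups uniformly at random until one has free space.
        g = rng.randrange(GROUPS)
        while used[g] == SLOTS:
            g = rng.randrange(GROUPS)
        used[g] += 1
    return used

used = simulate(20)
# New allocations land evenly, but the old data never moves: the bottom
# half ends up near 100% full while the top half sits near 50%, even
# though the disk as a whole is only 75% full (30 of 40 blocks).
print(used)
```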

You get to spend all of your time generating random numbers until
you find a cylinder group that isn't full: a 50/50 chance of landing
on one of the (full) bottom groups instead of a top group each time.

Performance falls off exponentially, just as if the whole disk were
almost full, even though the top half is really half empty.  In the
graphic example above, the fill on the disk is an *average* 75%.
Selection of disk space is a hash function which, if you read Knuth
("The Art of Computer Programming, Vol. 3: Sorting and Searching"),
would take no performance hit at all until 85% full... IF the
distribution were perfectly random, which it isn't, because you
added disk space without redistributing the already allocated
"random" allocations -- thus introducing an artificial selection
bias "history".
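The falloff can be made concrete with the standard hashing result: if a fraction f of the cylinder groups are full, uniform random probing needs about 1/(1-f) tries on average to hit one with free space (a back-of-the-envelope model only; the real FFS allocator uses preference heuristics, not pure random probing).

```python
def expected_probes(full_fraction):
    """Expected number of uniform random probes needed to find a
    non-full cylinder group when `full_fraction` of groups are full
    (mean of a geometric distribution)."""
    return 1.0 / (1.0 - full_fraction)

# With half the groups full (the example above) you expect 2 probes
# per allocation; the cost climbs steeply as the full fraction grows.
for f in (0.50, 0.75, 0.85, 0.95):
    print(f"{f:.0%} of groups full: {expected_probes(f):.1f} probes")
```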


So: you'll be able to use the space, but your performance will
suck, probably sooner, rather than later.

FWIW, LFS and EXT2FS and EXT3FS and Reiser and NTFS all have
this issue, but some of those have "cleaners"/defragmenters.


Note: this example started with the bottom disk 50% full; in
practice you are much more likely to add disk space only after you
really need it, and the addition is not likely to double capacity.
Therefore (1) your performance will suck almost immediately, rather
than the suck being delayed, (2) the base of the exponential curve
will be much, much higher, and (3) you will spend all your CPU
looking for an empty cluster in a cylinder group somewhere.


In normal operation, FFS never has this problem, because the free
reserve guarantees that the hash never gets full enough to have a
problem (well, except that FreeBSD has reduced the default free
reserve below the 10% speed/space tradeoff level to 8%, when 15%
would be optimal [100% - 85% = 15%]).  But growing your available
disk space is an abnormal operation that was not considered in the
original design of FFS (all disk packs are 4M, and that's *luxury*).
See the "newfs" man page and the FFS design paper for details.

-- Terry
