Date:      27 Jan 2002 01:46:27 -0800
From:      swear@blarg.net (Gary W. Swearingen)
To:        Terry Lambert <tlambert2@mindspring.com>
Cc:        freebsd-chat@FreeBSD.ORG
Subject:   Re: Bad disk partitioning policies (was: "Re: FreeBSD Intaller (was   "Re: ... RedHat ...")")
Message-ID:  <0s3d0s5dos.d0s@localhost.localdomain>
In-Reply-To: <3C534C4A.35673769@mindspring.com>
References:  <20020123124025.A60889@HAL9000.wox.org> <3C4F5BEE.294FDCF5@mindspring.com> <20020123223104.SM01952@there> <p0510122eb875d9456cf4@[10.0.1.3]> <15440.35155.637495.417404@guru.mired.org> <p0510123fb876493753e0@[10.0.1.3]> <15440.53202.747536.126815@guru.mired.org> <p05101242b876db6cd5d7@[10.0.1.3]> <15441.17382.77737.291074@guru.mired.org> <p05101245b8771d04e19b@[10.0.1.3]> <20020125212742.C75216@over-yonder.net> <p05101203b8788a930767@[10.0.1.14]> <gc1ygc7sfi.ygc@localhost.localdomain> <3C534C4A.35673769@mindspring.com>

Terry Lambert <tlambert2@mindspring.com> writes:

> "Gary W. Swearingen" wrote:
> > It'd be good to have this documented after some more experts express a
> > common opinion on whether absolute or relative size of the reserve
> > matters and how they'd choose the numbers.  I'd hope they'd speak of
> > partition size instead of disk size.  And whether the value should
> > have any dependence on tunefs's -o value.
> > 
> > I suspect that the answer is "absolute", except for the effect big
> > partitions have on the willingness of the SA to reduce risks by
> > increasing their safety margins, at the cost of cheap disk space.
> 
> It's partition size, but X*(N + M) = (X*N) + (X*M).
> 
> Multiplication is commutative and associative.  8-).
> So yes: it's absolute.

That's odd.  Your example there shows relative and I interpret the rest
of your comments about hashing to imply that it's relative.

If it's absolute, you'd have (N + M - A) != (N - A) + (M - A), and
discussing disk size (N+M) would be somewhat misleading.

(Maybe my use of absolute and relative wasn't clear.  Absolute meant
the reserve space for good defragmenting (or the SA's reserve) wasn't
(much) dependent on partition size, while relative meant the reserve
space needed was a set fraction of the partition size.)
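
To make the two readings concrete, here is a minimal sketch of the
arithmetic.  The partition sizes, the 8 percent figure and the 500 MB
fixed reserve are made-up illustration values, not numbers from FFS or
from this thread, and the function names are hypothetical.

```python
# Illustration only: all sizes and reserve figures below are assumptions.

def relative_reserve(size_mb, percent):
    """Reserve that is a set fraction of the partition size."""
    return size_mb * percent // 100

def usable_with_absolute_reserve(size_mb, reserve_mb):
    """Usable space when the reserve is a fixed amount per partition."""
    return size_mb - reserve_mb

n_mb, m_mb = 4000, 6000      # two hypothetical partitions
percent = 8                  # hypothetical relative reserve
fixed_mb = 500               # hypothetical absolute reserve (the "A" above)

# Relative: reserving X of the whole equals reserving X of each part.
whole = relative_reserve(n_mb + m_mb, percent)
parts = relative_reserve(n_mb, percent) + relative_reserve(m_mb, percent)
print("relative:", whole, parts)

# Absolute: one partition pays the reserve once, two partitions pay twice.
one = usable_with_absolute_reserve(n_mb + m_mb, fixed_mb)
two = (usable_with_absolute_reserve(n_mb, fixed_mb)
       + usable_with_absolute_reserve(m_mb, fixed_mb))
print("absolute:", one, two)
```

With a relative reserve the two totals agree, so talking about disk
size or partition size makes no difference; with an absolute reserve
they differ by one extra copy of the fixed amount.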

> Basically, at 85% in a perfect hash, there is 0%
> fragmentation, at 90% that goes up to 7%.
> 
> It's really very easy to understand: you are using a
> statistical function to select a non-colliding subset
> of a set, and you want to know at what point you end
> up with diminishing returns, and collisions occur.

Trust me.  It's not easy to understand from this thread so far, and I
don't expect it to be; I can go to the FFS treatise for understanding.
I feel bad even seeing you spend your time trying to explain reasons.
But I am asking for statements of how the algorithm behaves which
would be helpful in knowing whether to twist the -m knob or how far.

> If you have a friend who is a statistician, you should
> ask them to explain "The Birthday Paradox" to you.

I've read about it several times, always forgetting the math, but I
remember you need only about 23 people for a 0.5 match probability.
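
Since the math is easy to forget, here is a minimal sketch of it,
assuming 365 equally likely birthdays and ignoring leap years.  The
function names are hypothetical; the calculation is the standard
"probability that everyone is distinct" product.

```python
def match_probability(people, days=365):
    """Probability that at least two of `people` share a birthday."""
    p_distinct = 1.0
    for i in range(people):
        # The (i+1)-th person must avoid the i birthdays already taken.
        p_distinct *= (days - i) / days
    return 1.0 - p_distinct

def smallest_group(target=0.5, days=365):
    """Smallest group size whose match probability reaches `target`."""
    people = 1
    while match_probability(people, days) < target:
        people += 1
    return people

print(smallest_group())
print(round(match_probability(50), 3))
```

The first line printed is the group size at which a match becomes
more likely than not; the second is the match probability for a
group of 50.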

> You'd probably benefit from reading the original FFS paper.

No doubt.  Though I trust you that the performance of the algorithm
is not a function of the partition size, but of the reserve relative
to that size (and the space filled relative to that size), I'll need
to read more to believe that I care as much about poor performance with
a relatively full big disk as with a small one.  For example, I might
accept slow performance to get an extra 5 GB when I wouldn't for 50 MB.
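
A rough sketch of that trade-off, assuming the -m reserve is a plain
percentage of the partition.  The partition sizes and the 10 and 5
percent settings are hypothetical, chosen only so the results land on
the 5 GB and 50 MB figures above; reclaimed_mb is a made-up helper,
not anything in tunefs.

```python
def reclaimed_mb(partition_mb, old_percent, new_percent):
    """Space freed for users by lowering a percentage reserve."""
    return partition_mb * (old_percent - new_percent) // 100

# Hypothetical partitions, in decimal megabytes.
big_mb = 100_000    # about 100 GB
small_mb = 1_000    # about 1 GB

# Same knob change on both: reserve lowered from 10% to 5%.
print("big:  ", reclaimed_mb(big_mb, 10, 5), "MB")
print("small:", reclaimed_mb(small_mb, 10, 5), "MB")
```

The same relative change yields a hundred times more space on the
big partition, which is why the absolute payoff can matter to the SA
even if the allocator only cares about the fraction.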

> > The tunefs(8) man page leaves me wondering, when it says
> > 
> >     This value can be set to zero, however up to a factor of three in
> >     throughput will be lost over the performance obtained at a 10%
> >     threshold.
> > 
> > whether that's true even when the filesystem is far from full or only
> > when comparing, say, two filesystems with 0-10% free space (and, I
> > suspect, only a factor of three near 0%).
> 
> You know, you could worry about something else... like
> the fact that a formatted disk has less capacity than an
> unformatted one.

I probably would, if there was a poorly-documented knob for that too.
But when I read silly recommendations to set the swap/RAM knob to 2,
regardless of the size of RAM or applications, I find it easy to
question other recommendations whose justification is found only deep
in the source, developer archives, or hairy treatises, or which seem
wrong (like the tunefs(8) quote above).

Actually, my worry was less about how something worked or could be
optimized than about what I find to be a poorly documented config
setting.  If it just said "leave this to experts" I
probably wouldn't have brought it up.  But when I read the tunefs quote
above, I see an implication that I'm quite sure is absolutely wrong: It
implies that the throughput will always be poor, regardless of how full
the disk is.  That is misleading and tends to make people twist the knob
less far than they would if the statement expressed the truth better.
Maybe it only needs to change "throughput" to "worst-case throughput" or
"near-full throughput".  It's also quite common-sensical to think that
the reserve wouldn't be as necessary for big disks as it was for small
ones.  Better documentation would head off many FAQs on this issue.
