From owner-freebsd-chat Sun Jan 27 16:59:13 2002
Date: Sun, 27 Jan 2002 16:58:51 -0800
From: Terry Lambert
To: "Gary W. Swearingen"
Cc: freebsd-chat@FreeBSD.ORG
Subject: Re: Bad disk partitioning policies (was: "Re: FreeBSD Installer (was "Re: ... RedHat ...")")
Message-ID: <3C54A24B.B0B607F@mindspring.com>

"Gary W. Swearingen" wrote:
> But you probably can't, without rewriting much of the FFS treatise.
> People are willing to trust experts when they say a certain behavior
> is the result of the chosen algorithm, if there's a hint that the
> expert has considered the issue (as you have more than hinted).  I
> thank you for trying to explain the reasons, but this just isn't the
> forum for it.
> I don't want to seem ungrateful, but I think you should know that
> much of your explanation is the sort of thing that is often referred
> to by the very term you used in a physical context, "hand waving"
> (maybe with flakes of "snow job" thrown in).  It's better than
> nothing, but it's probably not worth the effort.  Please don't take
> offense; I'm trying, as I did in my last message less bluntly (and
> unsuccessfully), to convince you not to waste your time on incomplete
> explanations of hard-to-explain reasons, especially when the only
> question is how the system behaves, not why it does so.  (Thank you
> for having enough of the former in the last msg.)

Heh.  How about this:

	"It will hurt if you change things without understanding
	 them.  Read and understand the FFS paper."

?

> > Relative to the size of your disk, people complain about
> > very large disks for even a very small free reserve
> > percentage, mostly because they grew up in an era when
> > "that was a lot of space!".
>
> It's not just that.  It's a hunch that defrag considerations should
> have as much to do with the size of files as they do with the amount
> of unused FS.  If the former stays the same, it seems reasonable
> that the free space/reserve/whatever should remain the same for
> similar defrag performance, regardless of FS size.

OK, the hunch is wrong.  If you don't exceed the optimal free
reserve, the file system doesn't fragment.  There is no such thing
as significant fragmentation in an optimally tuned FFS.  The
fragmentation is avoided mathematically, not as a result of having
a reserved "work area" in which active defragmentation occurs.

Read the paper.  ;^).

> > The reality is that the algorithm needs a certain percentage
> > of the space to work correctly, and if you take that away,
> > then it doesn't work correctly.
>
> People reading about -m (or not even that) need a statement at
> least as blunt as that, to prevent many from guessing that the
> talk of percentages is just another obsolete rule of thumb.

Patches?  8-).

> > Really, it'd probably be a good idea to find a reasonable
> > way to make swap take up disk space until you ran out, on
>
> Interesting.

> > This issue has been discussed many times before.  It's
> > in the literature, and it's in the FreeBSD list archives
> > dozens of times, at least.  8-).
>
> And if it was discussed near the -m option, or an SA-level article
> was referred to, we wouldn't be doing it again.

Patches?  8-).

> > To address your suggestions: this would imply that you
> > could get non-worst-case performance on a full disk near a
> > very small free reserve selected administratively.
>
> OK, so it will take a few lines to explain better.

> > The real answer is that the more data on the disk above
> > the optimal free reserve for the algorithm used for block
> > selection, the worse the performance will be, and "worst
> > case" is defined as "the last write before hitting the
> > free reserve limit".  So disk performance degrades
> > steadily, the fuller it gets over the optimal free reserve
> > (which is ~15%, much higher than the free reserve kept on
> > most disks).
>
> So it should say that performance degrades increasingly, from
> negligible at 85% of the full FS to about 3 times slower near 100%
> full (plus increased permanent fragmentation of files).  And that
> this is a result of the algorithms used, and is independent of FS
> size.  And this needs complication to mention the effects of the
> 5% switch and the -o option.

Sure.
Let's see if other people agree with that; it's a bit simplistic,
in that you don't know whether the degradation is linear or
exponential (exponential).  And even saying that raises more
questions from people who want to have knowledge given to them,
instead of having to learn it.  (Such people should have slots
installed in their skulls before they come bother us, since
without a means of "giving" it to them, like slotting a skills
card into their brain, they are wasting their time.  8-)).

> If I understand this correctly (a bad assumption), the performance
> at 95% full is the same regardless of whether I reserve 10% or 1%.

Yes.

> Since I don't care if the "end" of the FS is slow, the only reason
> for picking a large -m I see is to avoid permanently fragmented
> files.  Wrong?

Yes, if we accept the assumption that you don't care if the "end"
of the FS is slow.

Realize that this is all irrelevant, and what we are really talking
about is not whether or not the disk space is able to be used, but
rather "eye candy" for the system owner, so that they can see a
larger "available disk space" number.

I think the confusion comes because anyone who naively looks at the
man pages, without reading them in depth, can come to the conclusion
that the "free reserve" might be there for root's use in recovering
a nearly completely full system, and so that it's an administrative,
rather than an algorithmic, requirement.

I believe that no matter *how well* you document things, you will
still have problems with tourists who don't take the time to read
what you have written, in depth, to the point of understanding it.

> Again, as it is, the documentation implies that performance with a
> small -m is always bad, regardless of FS space remaining.

	"The steady state of disks is full"
				-- Ken Thompson

If you fill the disk up, and then empty it back out, then for
those files created during the "disk full" time, the performance
*is* always bad.

It's a matter of risk.  Like mounting your FS async.
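For the record, the free reserve under discussion is the "minfree"
value set at newfs time and adjustable afterward with tunefs(8).  A
sketch of inspecting and changing it (the device name is just an
example; the filesystem should be unmounted, or mounted read-only,
when you run tunefs):

```shell
# Print the current tunable settings, including the
# "minimum percentage of free space" (the -m value):
tunefs -p /dev/ad0s1e

# Lower the free reserve to 5% -- accepting the performance
# caveats discussed above.  Device name is an example only.
tunefs -m 5 /dev/ad0s1e
```

Note that FFS also switches its block-allocation optimization
between time and space (the -o option mentioned above) based on
how full the filesystem is relative to this reserve.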
The probability is pretty good that you will eventually fill up
your disk, because humans are, by nature, pack rats.

> > BTWBTW: If you screw up an important file this way, you
> > can fix it by backing it up, deleting it, and restoring
> > it, once the disk has dropped down to the optimal free
> > reserve.  This is known as "the poor man's defragger".
>
> mv file file.bak; cp -p file.bak file; rm file.bak  ## ?

No, actually.  The "cp" program doesn't leave sparse files sparse.
You can use the (GNU) tar program's option for handling sparse
files (which it does by inference), or you can use "backup" and
"restore".  If no files are sparse, then the "cp" is OK.

Funny story: I filled up an AIX disk by moving the documentation
pages from one disk to another with "mv", which degrades to "cp"
if it's moving across partitions on AIX.  The problem is that
there are a *lot* of sparse index files.  Since the original move
was "successful", it took me a while to figure out where the
space went.  8-).

> Thanks again.  I've saved your ID in my PR-to-do list and if I
> ever get the easier ones done and write one for -m, I'll CC it
> to you.

NP.

-- Terry
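The sparse-file effect described above is easy to see directly.  A
sketch, assuming a BSD or GNU userland with dd, ls, and du (file
names are arbitrary):

```shell
# Make a file that is almost all hole: seek past 1 MiB and
# write a single byte.  Nothing before the byte is allocated.
dd if=/dev/zero of=sparse.dat bs=1 count=1 seek=1048576 2>/dev/null

ls -l sparse.dat   # apparent size: 1048577 bytes
du -k sparse.dat   # blocks actually allocated: only a few KB

# Whether a copy keeps the hole depends on the cp implementation:
# the 4.4BSD-era cp filled holes in with real zero-filled blocks,
# while GNU cp tries to detect and preserve them.  GNU tar's -S
# (--sparse) flag preserves sparseness by inference, and
# dump/restore handle it natively.
cp sparse.dat copy.dat
du -k copy.dat

rm -f sparse.dat copy.dat
```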