From: eric@ms.uky.edu
Subject: Re: The VIVA file system (fwd)
To: Terry Lambert
Date: Sun, 25 Aug 1996 21:45:16 -0400 (EDT)
Cc: freebsd-fs@freebsd.org, current@freebsd.org
In-Reply-To: <199608252036.NAA21331@phaeton.artisoft.com> from "Terry Lambert" at Aug 25, 96 01:36:26 pm
Message-ID: <9608252145.aa12275@t2.t2.mscf.uky.edu>
Sender: owner-fs@freebsd.org

I guess I should respond to this thread, since I happen to be on all these nice FreeBSD mailing lists nowadays.

The Linux version was done by one of Raphael's Master's students. It isn't complete, but it does apparently work (I personally have not seen it, nor have I seen the performance figures). As a side note, I am currently working on Viva2, which should be a much more interesting gadget. In any case, it is in active development again, and FreeBSD is the platform this time.

> > > Anybody have opinions on this vs LFS? Are we still waiting for the Lite-2
> > > stuff before LFS can go in?
> >
> > Looks interesting, but LFS is also. Some of the improvements will appear
> > when we get our implementation of properly delayed writes working for
> > UFS. I am sure that someone will take on LFS when the Lite-2 stuff goes
> > in; even I might (shiver :-)).
>
> The VIVA stuff is, I think, overoptimistic.
>
> They have made a number of claims in the University of Kentucky papers
> that were published about two years ago that seem to rely on overly
> optimistic assumptions about policy and usage.

You might explain this one; I'm not sure I know what you mean. The paper was written over three years ago, and the work itself was performed in late 1991-1992. Then the AT&T lawsuit came along, I became distracted with making a living, and I didn't get back to it until a couple of months ago.

For all the discussion below, remember that the platforms for Viva were 1) AT&T SysV and 2) BSDI's BSD/386. We abandoned SysV because I wanted to release the code; then came the AT&T lawsuit :-(

> They also seemed to pick "worst case" scenarios for comparison with FFS,
> and avoided FFS best case.

We did our testing on clean, freshly newfs'd partitions for the graphs. I don't see how that is "worst case", but perhaps you mean the types of tests we ran. Obviously, we ran some tests that showed a difference between FFS and Viva.

> This is not nearly as bad as the MACH MSDOSFS papers, which intentionally
> handicapped FFS through parameter settings and cache reduction, while
> caching the entire DOS FAT in core was seen as acceptable, to
> compare their work to FFS.
>
> But it is certainly not entirely unbiased reporting.

I'm not sure how to react to this. Can one write an "entirely unbiased" report about one's own work? Personally, I don't think so. We tried. I'll leave it at that.

<--stuff deleted>

> The read and rewrite differences are mostly attributable to policy
> issues in the use of FFS "optimizations" which are inappropriate to
> the hardware used.

The read and rewrite differences are due to the fact that FFS didn't do clustering well at all. BSDI *still* doesn't do it well, but FreeBSD appears to be much better at it. I'm still running tests, though, and probably will be for some time yet.
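[Editor's note: as a rough illustration of what "clustering" buys here, the sketch below (mine, not from the Viva paper; the metric and names are hypothetical) scores a file's on-disk layout by the fraction of logically adjacent blocks that are also physically adjacent. Sequential reads over contiguous runs avoid seeks, which is where the clustering win comes from.]

```python
# Hypothetical illustration, not code from Viva or FFS: score a block
# layout by how many logical-block transitions land on adjacent disk
# blocks.  1.0 means perfectly contiguous; 0.0 means fully scattered.

def contiguity(physical_blocks):
    """Fraction of adjacent logical blocks placed on adjacent disk blocks."""
    if len(physical_blocks) < 2:
        return 1.0
    adjacent = sum(1 for a, b in zip(physical_blocks, physical_blocks[1:])
                   if b == a + 1)
    return adjacent / (len(physical_blocks) - 1)

# A perfectly clustered 8-block file vs. one scattered across the disk:
print(contiguity([100, 101, 102, 103, 104, 105, 106, 107]))  # 1.0
print(contiguity([100, 517, 33, 902, 260, 74, 611, 388]))    # 0.0
```

A real filesystem benchmark would of course measure elapsed read time rather than a layout score, but a metric like this is one way to talk about "effective clustering" independent of the disk hardware.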
<--more stuff deleted>

> Finally, I am interested in, but suspicious of, their compression
> claims, since they also claim that the FFS performance degradation,
> which Knuth clearly shows to be a hash effect to be expected after
> an 85% fill (in "Sorting and Searching"), is nonexistent.

Well, the results are in the paper. This is what we saw, but you should look at the table carefully. There are places where the effective clustering of a particular file degrades by over 50%, but that was (at the time) about as good as FFS ever did anyway. The mean effective clustering always remained very high (90%+). I should have some more modern numbers in a few months; Raphael's student probably has some for Linux now. I think he used the original algorithms.

> INRE: "where the wins come from", the "Discussion" reverses the
> claims made earlier in the paper -- we see that the avoidance of
> indirect blocks is not the primary win (a conclusion we came to
> on our own from viewing the earlier graphs).

This is correct; the big performance wins came from:

1) Large block sizes and small frag sizes
2) Good clustering
3) Multiple read-ahead

> We also see in "Discussion" that caching file beginnings/ends in the
> inode itself is not the win they had hoped for. In fact, compilation
> times are pessimized by 25% by it.

Yes, we were disappointed by that, but it just confirmed what others (Tanenbaum, for example) had seen earlier. Remember that one reason for the degradation was that it threw off all the code that tries to read things in FS-block-sized chunks. We wanted to be able to read headers in files quickly, and it *does* do that well. We just thought it would be nice to provide that capability (some people have entire file systems dedicated to particular tasks). Some space in the inode can be used for lots of things; perhaps it would be most useful at user level and disjoint from the bytes in the file.

Eric