Date:      Thu, 4 Apr 2002 15:17:34 -0500 (EST)
From:      Mikhail Teterin <mi@aldan.algebra.com>
To:        anarcat@anarcat.dyndns.org
Cc:        jhb@FreeBSD.org, imp@village.org, des@ofug.org, pst@pst.org, obrien@FreeBSD.org, cvs-committers@FreeBSD.org, cvs-all@FreeBSD.org, winter@jurai.net, jkh@winston.freebsd.org, rwatson@FreeBSD.org
Subject:   Re: cvs commit: src/usr.sbin/sysinstall install.c installUpgrade
Message-ID:  <200204042017.g34KHYnF006405@aldan.algebra.com>
In-Reply-To: <20020404181423.GB279@lenny.anarcat.dyndns.org>

[Reply-To set]

On  4 Apr, The Anarcat wrote:

> Indexed packages might take up more space on a CD, but regardless of
> the network connection, it should speed up package installs at least
> 2-fold.

But if that makes them 15% bigger, I think I'd rather wait. A 15%
increase in download time is more than 100% of the extraction time for
too many people. And then you store the 15% bigger archives forever...
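
To put rough numbers on that comparison, here is a minimal sketch. The
package size, link speed and disk throughput are hypothetical figures
chosen only for illustration; the 15% growth is the figure used in this
thread, not a measurement.

```python
def extra_download_seconds(size_bytes, link_bits_per_s, growth=0.15):
    """Extra transfer time caused by an archive that is `growth` larger."""
    return size_bytes * growth * 8 / link_bits_per_s


def extraction_seconds(size_bytes, disk_bytes_per_s):
    """Time to stream the whole archive once from local storage."""
    return size_bytes / disk_bytes_per_s


# Hypothetical case: a 10 MB package, a 56 kbit/s modem, a 2 MB/s disk.
size = 10_000_000
extra = extra_download_seconds(size, 56_000)   # a few minutes
unpack = extraction_seconds(size, 2_000_000)   # a few seconds
print(f"extra download: {extra:.0f} s, full extraction: {unpack:.0f} s")
```

With these assumed figures the extra download time dwarfs the whole
extraction; on a fast link the balance tips the other way.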
 
> I'm not sure I understand what you mean by seekable. Some network
> connections (HTTP 1.1 and FTP, IIRC) are seekable, ie you can start
> downloading http files at any given location.

By "seekable" I mean that the same data can be read multiple times.
True, you can do that over the network too, but the network bandwidth
available to even the most fortunate of us is nowhere close to local
storage bandwidth.

> The problem is with non-seekable (non-indexed would be the proper
> word) archives. For .tgz (or .tbz2), whether you have the seekable file
> or network connection doesn't matter since you must extract the whole
> file in order to seek individual files in the archive.
>
> Repeat after me: there's no way to access a given individual file in a
> tar(1) archive without extracting the archive up to the given file.

That is true. ZIP provides a _generic_ index, which is good enough for
many. We can do better by placing the "important" files -- such as the
"install" script or "+CONTENTS" -- at the beginning of the file at
archiving time... In fact, I think that's what happens now; the package
tools just don't rely on that fact...
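
A minimal sketch of relying on that ordering, using Python's tarfile
module in streaming mode (no seeking, so it would work on a pipe or a
socket as well). The helper names build_package and
read_leading_metadata are made up for this illustration and are not
part of the package tools.

```python
import io
import tarfile


def build_package(files):
    """Build an in-memory .tgz; `files` is an ordered list of (name, bytes)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_leading_metadata(stream, wanted="+CONTENTS"):
    """Return the bytes of `wanted` if it is the first member, else None.

    Mode "r|gz" reads strictly forward, so nothing past the first member
    has to be unpacked to get at the metadata.
    """
    with tarfile.open(fileobj=stream, mode="r|gz") as tar:
        first = tar.next()
        if first is None or first.name != wanted:
            return None
        return tar.extractfile(first).read()


pkg = build_package([("+CONTENTS", b"@name demo-1.0\n"),
                     ("bin/demo", b"x" * 100000)])
print(read_leading_metadata(io.BytesIO(pkg)))
```

If the metadata is not the first member, the helper gives up rather than
scanning the whole archive, which is the case an index would cover.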
 
>> What's left are the people, who like to install directly from the
>> network and don't mind redownloading in case of a failure. My
>> guesstimate is those are not big in number and mostly don't care for the
>> method chosen one way or the other...
> 
> Choosing an indexed archive format doesn't mean you can't keep a local
> copy, and actually, right now, libh does keep a copy of the .zip
> locally, as a temporary, yes, but that is a simple toggle.

What I was saying is that if you are likely to have the local copy
anyway, it does not matter much whether it is indexed or not --
extraction is very fast anyway... Again -- indexing saves you time but
wastes space. Some (myself included) think space is more important.
 
>> >> And I suspect, those who disagree are simply blinded by their
>> >> blazingly fast connections and fat disks. :-)
>> 
>> > No, the fact is that we have thought about some of the problems the
>> > current scheme doesn't address and which you haven't apparently
>> > thought about how to address either.
>> 
>> Mmm, sounds familiar :( Can you explain, what those are, or point me to
>> the mail archive, where this was discussed?
> 
> I can point you to the libh design document on /projects/libh.html.

Ok... I just read it. It does not contain anything that was not already
expressed in this thread -- regarding the package format, that is. Zip
is advocated there as the most suitable... And I remain convinced that
the overhead of compressing each file individually (which is what Zip
does) is too high a price to pay...
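
The per-file overhead is easy to see with zlib alone. This is only a
sketch: the sample files are invented, and it ignores tar headers and
the ZIP directory, so it says nothing about the actual percentage for
real packages.

```python
import zlib


def per_file_size(files):
    """Total bytes when every file is deflated on its own (ZIP-like)."""
    return sum(len(zlib.compress(data, 9)) for data in files)


def solid_size(files):
    """Total bytes when all files are deflated as one stream (.tgz-like)."""
    return len(zlib.compress(b"".join(files), 9))


# Hypothetical sample: 200 small, similar shell scripts.
sample = [
    ("#!/bin/sh\necho 'hello from script %d'\nexit 0\n" % i).encode()
    for i in range(200)
]
print("per-file:", per_file_size(sample), "solid:", solid_size(sample))
```

Many small, similar files are the worst case for per-file compression,
because each file starts with an empty dictionary; a few large,
unrelated files would show a much smaller gap.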

The present pkg_add can read +CONTENTS (or whatever the meta-data
file(s) end up being named) from the beginning of the tarball and
proceed to extract the rest of the file, preferably directly into the
right place, or else into a temporary directory _on the same
filesystem_, so that the bits can be quickly mv-ed to the right place.

The document describes having to extract into a temporary location as
"evil", which is not necessarily true. If the location is chosen on
the same filesystem as the final destination, there will be enough
space, and there will be very little overhead -- rename(2) is very
quick...
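
A minimal sketch of that staging approach, assuming flat file names (no
subdirectories) for brevity. install_via_staging is a hypothetical
helper, not something pkg_add or libh provides.

```python
import os
import shutil
import tempfile


def install_via_staging(dest_dir, files):
    """Write `files` ({name: bytes}) into dest_dir through a staging dir.

    The staging directory is created inside dest_dir, so it lives on the
    same filesystem and os.rename() only has to update directory entries.
    """
    os.makedirs(dest_dir, exist_ok=True)
    staging = tempfile.mkdtemp(prefix=".pkg-staging-", dir=dest_dir)
    try:
        # Step 1: unpack everything into the staging area.
        for name, data in files.items():
            with open(os.path.join(staging, name), "wb") as f:
                f.write(data)
        # Step 2: move each file into place with a cheap rename.
        for name in files:
            os.rename(os.path.join(staging, name),
                      os.path.join(dest_dir, name))
    finally:
        # Remove the staging dir (and any leftovers after a failure).
        shutil.rmtree(staging, ignore_errors=True)


demo_root = tempfile.mkdtemp()
install_via_staging(os.path.join(demo_root, "prefix"), {"demo": b"hello\n"})
print(os.listdir(os.path.join(demo_root, "prefix")))
```

Creating the staging directory under the destination is what guarantees
the same filesystem; a staging area under /tmp could silently turn each
rename into a full copy, or fail outright.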

	-mi

