Date:      Thu, 4 Apr 2002 15:33:56 -0500
From:      The Anarcat <anarcat@anarcat.dyndns.org>
To:        libh@FreeBSD.org, mi@aldan.algebra.com
Cc:        jhb@FreeBSD.org, imp@village.org, des@ofug.org, pst@pst.org, obrien@FreeBSD.org, cvs-committers@FreeBSD.org, cvs-all@FreeBSD.org, winter@jurai.net, jkh@winston.freebsd.org, rwatson@FreeBSD.org
Subject:   Re: cvs commit: src/usr.sbin/sysinstall install.c installUpgrade
Message-ID:  <20020404203356.GG279@lenny.anarcat.dyndns.org>
In-Reply-To: <200204042017.g34KHYnF006405@aldan.algebra.com>
References:  <20020404181423.GB279@lenny.anarcat.dyndns.org> <200204042017.g34KHYnF006405@aldan.algebra.com>

On Thu Apr 04, 2002 at 03:17:34PM -0500, Mikhail Teterin wrote:
> [Reply-To set]
>
> On  4 Apr, The Anarcat wrote:
>
> > Indexed packages might take up more space on a CD, but regardless of
> > the network connection, they should speed up package installs at
> > least two-fold.
>
> But if that makes them 15% bigger, I think I'd rather wait. A 15% increase
> in download time is more than 100% of the extraction time for too
> many people. And then you store the 15% bigger archives forever...

The thing is, our problem is not *only* package size but also the
architecture of the package suite itself.

I think having to extract the archive to a temporary location is a
problem in limited environments, since an installation then needs free
space equal to the size of its biggest package in order to succeed.

Arguably, if this is a problem, such an installation is doomed to run
out of space anyway. ;)

> > I'm not sure I understand what you mean by seekable. Some network
> > connections (HTTP 1.1 and FTP, IIRC) are seekable, i.e. you can start
> > downloading a file at any given offset.
>
> By "seekable" I mean that the same data can be read multiple times.

Ever heard about caching? :)

> True, you can do that over the network too, but the net-bandwidth
> available to even the most fortunate of us is nowhere close to the local
> storage bandwidth.

If the file is remote, just cache the read data locally and you're
done with bandwidth.
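
To make the caching idea concrete, here is a minimal sketch, not anything
libh actually does. It assumes a hypothetical fetch(start, end) callable
that returns the bytes [start, end) of the remote file (for instance through
an HTTP range request); the class name and block size are also made up for
illustration. Each block is downloaded at most once, so re-reading the same
data costs no bandwidth.

```python
class CachingReader:
    """Read byte ranges from a slow source, keeping fetched blocks locally."""

    def __init__(self, fetch, block_size=4096):
        # fetch(start, end) is a hypothetical callable returning the bytes
        # [start, end) of the remote file; it may return fewer at end of file.
        self.fetch = fetch
        self.block_size = block_size
        self.blocks = {}    # block index -> bytes already downloaded
        self.fetches = 0    # number of times the remote side was hit

    def _block(self, index):
        # Download a block only the first time it is needed.
        if index not in self.blocks:
            start = index * self.block_size
            self.blocks[index] = self.fetch(start, start + self.block_size)
            self.fetches += 1
        return self.blocks[index]

    def read(self, offset, length):
        """Return up to `length` bytes starting at `offset`."""
        out = bytearray()
        end = offset + length
        while offset < end:
            index, skip = divmod(offset, self.block_size)
            chunk = self._block(index)[skip:skip + (end - offset)]
            if not chunk:   # past the end of the remote file
                break
            out += chunk
            offset += len(chunk)
        return bytes(out)
```

In this sketch the cache lives in memory; writing the blocks to a local file
instead would give the "local copy" discussed further down.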

> > The problem is with non-seekable (non-indexed would be the proper
> > word) archives. For .tgz (or .tbz2), whether you have a seekable file
> > or network connection doesn't matter, since you must extract the whole
> > file in order to reach individual files in the archive.
> >
> > Repeat after me: there's no way to access a given individual file in a
> > tar(1) archive without extracting the archive up to the given file.
>
> It is true. ZIP provides a _generic_ index, which is good for many. We
> can do better by placing the "important" files -- such as the "install"
> script -- or "+CONTENTS" at the beginning of the file at archiving
> time...

No. That's not better since the whole file still needs to be *read*
(note: not seeked) in order to find the file.

Bandwidth-wise, indexed archives are better.
Storage-wise, non-indexed are better.
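
A small sketch of that difference, assuming Python's standard zipfile and
tarfile modules; the member names and contents are hypothetical, and this is
not how pkg_add or libh read packages. The zip reader goes through the
index, while the tar reader is opened in stream mode, which forbids seeking
just like a pipe or a plain download would.

```python
import io
import tarfile
import zipfile

# Hypothetical package contents, for illustration only.
FILES = {
    "+CONTENTS": b"@name demo-1.0\n",
    "bin/demo": b"#!/bin/sh\necho demo\n",
    "share/doc/demo/README": b"hypothetical package\n",
}

def build_zip(files):
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()

def build_tgz(files):
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tf:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def read_from_zip(blob, name):
    # The central directory at the end of the archive maps names to
    # offsets, so only the index and the wanted member are touched.
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return zf.read(name)

def read_from_tgz(blob, name):
    # Stream mode ("r|gz"): every member stored before the wanted one
    # has to be decompressed and skipped over first.
    skipped = 0
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r|gz") as tf:
        for member in tf:
            if member.name == name:
                return tf.extractfile(member).read(), skipped
            skipped += 1
    raise KeyError(name)
```

Both functions return the same bytes for a given member; the second one also
reports how many members it had to walk past to get there.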

I think I'll start working on adding .tar support to libh. ;)

But I still think we should offer support for network .zip installs.

> In fact, I think, that's what happens now, the package tools
> just don't rely on that fact...

It wouldn't make much of a difference except it might be possible to
avoid extracting the archive to a temporary location.

> >> What's left are the people, who like to install directly from the
> >> network and don't mind redownloading in case of a failure. My
> >> guesstimate is those are not big in number and mostly don't care for the
> >> method chosen one way or the other...
> >
> > Choosing an indexed archive format doesn't mean you can't keep a local
> > copy, and actually, right now, libh does keep a copy of the .zip
> > locally, as a temporary file, yes, but that is a simple toggle.
>
> What I was saying is if you are likely to have the local copy anyway,
> it does not matter that much if it is indexed or not -- extraction is
> very fast anyway... Again -- indexing saves you time but wastes space.
> Some (myself included) think space is more important.

For some others, bandwidth is more important. I think there's an
irreconcilable conflict here, and that therefore both approaches should
be supported.

> >> >> And I suspect, those who disagree are simply blinded by their
> >> >> blazingly fast connections and fat disks. :-)
> >>
> >> > No, the fact is that we have thought about some of the problems the
> >> > current scheme doesn't address and which you haven't apparently
> >> > thought about how to address either.
> >>
> >> Mmm, sounds familiar :( Can you explain, what those are, or point me to
> >> the mail archive, where this was discussed?
> >
> > I can point you to the libh design document on /projects/libh.html.
>
> Ok... I just read it. It does not contain anything, that was not
> expressed in this thread -- regarding package format that is. Zip is
> advocated as the most suitable in there... And I remain convinced, that
> the overhead of compressing each file individually (which is what Zip
> does) is too much of a price to pay...
>
> The present pkg_add can read +CONTENTS (or whatever the meta-data
> file(s) is(are) going to be named) from the beginning of the tarball and
> proceed to extracting from the rest of the file, preferably -- directly
> into the right place, or into the temporary directory _on the same
> filesystem_, so that the bits can be quickly mv-ed to the right place.

That requires reading/downloading the whole archive, which is not
always necessary.

> The document describes having to extract into a temporary location as
> "evil", which is not necessarily true. If the location is chosen on
> the same filesystem as the final destination, there will be enough
> space, and there will be very little overhead -- rename(2) is very
> quick...

Good point.
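
For the record, a minimal sketch of that approach, assuming Python's os and
tempfile modules; install_file and the ".pkg-tmp-" prefix are hypothetical
names, and this is not what pkg_add or libh currently do. The temporary file
is created in the destination directory itself, so it is guaranteed to sit
on the same filesystem and the last step is a rename rather than a copy.

```python
import os
import tempfile

def install_file(data, destination):
    """Write data next to its destination, then rename it into place."""
    dest_dir = os.path.dirname(os.path.abspath(destination))
    # A temporary file inside the destination directory is on the same
    # filesystem as the final path, so no data is copied at the end.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".pkg-tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        os.replace(tmp_path, destination)   # rename(2) on POSIX systems
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The space needed is then that of the file being written, on the filesystem
where it will end up anyway.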

I guess I need to think this over.

A.

-- 
Jesus died for his own sins. Not mine. (CRASS, 1978)
