Date:      Wed, 23 Sep 1998 01:52:29 +0200
From:      Stefan Esser <se@mi.uni-koeln.de>
To:        Adrian Penisoara <ady@warpnet.ro>, Satoshi Asami <asami@cs.berkeley.edu>
Cc:        ports@FreeBSD.ORG, Stefan Esser <se@mi.uni-koeln.de>
Subject:   Re: fetch + size
Message-ID:  <19980923015229.A7423@mi.uni-koeln.de>
In-Reply-To: <Pine.BSF.3.96.980922224533.12618K-100000@ady.warpnet.ro>; from Adrian Penisoara on Tue, Sep 22, 1998 at 10:53:36PM +0300
References:  <199809221627.JAA08248@silvia.hip.berkeley.edu> <Pine.BSF.3.96.980922224533.12618K-100000@ady.warpnet.ro>

On 1998-09-22 22:53 +0300, Adrian Penisoara <ady@warpnet.ro> wrote:

Hi!

Please allow me to give a rationale for the suggested addition of SIZE
information for the ports' distfiles:

> > (1) The -S flag of "fetch" just went into -current, if it doesn't work
> >     as well as it's supposed to this might cause a great instability
> >     in ports
> 
>  This won't help too many people as for the moment I assume that the vast
> majority of FreeBSD production machines run 2.2-stable...
>  And if it might be buggy in bleeding edge -current...

The implementation of -S is very simple; it took only a few added lines
in fetch. The -S option can be enabled or disabled via bsd.port.mk, and
that setting can be overridden in make.conf.

> > (2) The patch adds a SIZE line to files/md5.  That is quite
> >     displeasing aesthetically (there's a reason why the file was
> >     called "md5" in the first place...)
> 
>  I don't see how a file with different size would have the same MD5
> checksum !...

Yes, that's why I want to add the SIZE info!
I agree that the name "md5" is not fully appropriate for a file that
holds other information as well, but the designer of the md5 file
syntax obviously expected other information to go into that file, or
he would not have added the "MD5" lead-in to each line.
Adding another file wastes lots of disk space (an estimated 1700 files
of 1KB minimum each, or 1.7MB), but the "md5" file is far smaller than
0.5KB for just about every port, so its on-disk size stays unchanged
if the SIZE lines go in as well.
The name of the "md5" file could be "checksums", but changing this now
would be more of a mess than just allowing different contents in that 
file.
If you look at the suggested patch, then you see that calculation of
the SIZE info takes just a single line in bsd.port.mk.

The SIZE lines can't possibly cause problems, since all lines not starting
with "MD5" are ignored if an old bsd.port.mk is used (e.g. in 2.2.x). 
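
A minimal sketch of that compatibility argument, in Python rather than make.
The "MD5 (file) = digest" line shape is the usual one; the SIZE line shape,
the file name, the digest and the helper names parse_old and parse_new are
assumptions made for illustration and are not taken from the patch:

```python
import re

# Assumed line shapes: "MD5 (distfile) = <hex digest>" and, hypothetically,
# "SIZE (distfile) = <bytes>". The real patch may use a different layout.
LINE_RE = re.compile(r"^(MD5|SIZE) \((.+)\) = (\S+)$")

def parse_old(text):
    """Old-style reader: only lines starting with MD5 are recognised."""
    sums = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m and m.group(1) == "MD5":
            sums[m.group(2)] = m.group(3)
    return sums

def parse_new(text):
    """New-style reader: also collects SIZE lines as integers."""
    sums, sizes = {}, {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # unknown lines are skipped, as before
        kind, name, value = m.groups()
        if kind == "MD5":
            sums[name] = value
        else:
            sizes[name] = int(value)
    return sums, sizes

# Made-up example content for a files/md5 file.
SAMPLE = (
    "MD5 (foo-1.0.tar.gz) = 0123456789abcdef0123456789abcdef\n"
    "SIZE (foo-1.0.tar.gz) = 123456\n"
)
```

The old reader returns the same checksums whether or not the SIZE line is
present, which is the point being made above.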

I've often wished I had the size info in **some** place, to manually check
the size of the distribution files before starting the download. And this
is an additional bonus of having the SIZE lines, IMHO.

>  Why should we need a SIZE line on top of the MD5 checksum ?

Because you don't know whether the MD5 checksum will match until after
you have fetched the file (possibly tens of megabytes over a privately
paid modem or ISDN link). The size can be queried from the file server,
and in fact fetch already does this (for the progress bar).

> > What do people think?  I'm leaning towards leaving this out for 3.0R,
> > for reasons stated above.  (We'll be dealing with a whole bunch of
> > diffs to files/md5 if this goes in, so it's going to be a royal pain
> > to change it later....)
> 
>  IMHO I think that this SIZE addition would be an unnecessary complication!

Sure, if you've got 200KB/s bandwidth and don't pay for volume or connect
time. I prefer to have fetch skip files that have changed (and no longer
fit a port) before downloading them, rather than finding that the port
build aborts because of the MD5 mismatch. At that point the ports mechanism
can't go back and fetch another file from a different master site, since
the "fetch" target comes before the "checksum" target, and without the size
check, the fetch target will succeed for files that will never make it
through "checksum".
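
The idea can be sketched roughly as follows. This is not the fetch or
bsd.port.mk code; fetch_with_size_check, query_size and download are
hypothetical names, and proceeding when a server reports no size is an
assumption of this sketch:

```python
def fetch_with_size_check(sites, expected_size, query_size, download):
    """Try master sites in order, skipping any whose file has the wrong size.

    query_size(site) returns the size reported by the server, or None if
    the server does not report one. download(site) returns the file
    contents. Both callables are hypothetical stand-ins for what fetch does.
    """
    for site in sites:
        reported = query_size(site)
        if (expected_size is not None and reported is not None
                and reported != expected_size):
            # Size mismatch: this file can never pass "checksum",
            # so move on to the next site without downloading it.
            continue
        return site, download(site)
    return None, None  # no site offered a file of the expected size
```

With expected_size set to None (no SIZE line for the distfile), the first
site is used unconditionally, which matches the behaviour without the check.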

I do not expect the SIZE info to improve the checksum target, but instead to
put a first consistency check into the fetch target, at little cost. Even if
most distribution files don't change without a change in name, it still
happens, and it happens more often than one would naively expect (check the
CVS logs for all those comments ...)

Again: I consider this a low risk change, since I have tested the "-S" patch
on a variety of systems, even from behind a very restrictive firewall.

The creation of the SIZE lines is automatic from within the "makesum" target.
If a port doesn't have a "SIZE" line for its distfile(s), then the previous
behaviour will be preserved (i.e. fetch will be invoked without "-S size").
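
As a rough illustration of both halves (generating the lines, and falling
back when no SIZE line exists), here is a sketch in Python. The real logic
lives in bsd.port.mk; the SIZE line shape and the names makesum_lines and
fetch_args are assumptions, and only the "-S size" option itself comes from
the description above:

```python
import hashlib

def makesum_lines(name, data):
    """Build the lines for one distfile (SIZE line shape is assumed)."""
    digest = hashlib.md5(data).hexdigest()
    return [
        "MD5 ({}) = {}".format(name, digest),
        "SIZE ({}) = {}".format(name, len(data)),
    ]

def fetch_args(url, name, sizes):
    """Build a fetch command line, adding -S only if a size is recorded."""
    args = ["fetch"]
    if name in sizes:
        args += ["-S", str(sizes[name])]
    return args + [url]
```

A port without a SIZE entry simply yields the plain "fetch URL" invocation.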

The addition of SIZE lines is benign (apart from the unpleasant aspect of
having non-MD5 info in a file named "md5", but I don't consider this too
much of a problem). The SIZE lines will be silently ignored if you don't
want them, but they can be used by those who can take advantage of them.

Regards, STefan

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-ports" in the body of the message


