Date:      Wed, 15 May 1996 12:48:30 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        rkw@dataplex.net (Richard Wackerbarth)
Cc:        terry@lambert.org, questions@freebsd.org, alk@Think.COM
Subject:   Re: Re(2): why so many ways to stay in sync?
Message-ID:  <199605151948.MAA15113@phaeton.artisoft.com>
In-Reply-To: <n1380003191.78965@Richard Wackerbarth> from "Richard Wackerbarth" at May 15, 96 00:14:36 am

> > Sup is a connection per site, whereas CTM can be distributed by
> > FTP mirrors.  CTM distribution is closer to usenet because it's
> > closer to store-and-forward flood model distribution.
> > 
> > The problem with store-and-forward is that it is an unreliable
> > delivery mechanism; SUP is closer to demand mirroring, and so
> > is really more useful.
> 
> I think that you are under-rating the utility of "store and forward". As we
> all know, the entire internet is based on store and forward of packets. The
> primitive delivery mechanism is unreliable.
> However, that does not prevent its effective use.
> 
> The salvation of unreliable delivery is the ability to detect the non-delivery
> and initiate corrective action. With CTM, we have that ability. Out of
> sequence updates are not applied, but are held awaiting the earlier ones. This
> is much like the TCP window. The recovery mechanism is presently FTP.
> However, it could easily be implemented with an FTP-by-mail service.

If you don't get the missing "packets", you are screwed: you can't
request a "retransmit" from an FTP server ("you don't have this file
-- give it to me anyway").

In TCP, the packets are available until they are acked.  Missing
CTM files aren't, because:

1)	They are not held until they are acked
2)	Different sites have different aging policies for FTP mirrors
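
To make the analogy concrete, here is a sketch (Python; the delta
naming and the .ctm_status format are simplified, so treat those
details as hypothetical) of the receive window Richard describes,
and of exactly where it differs from TCP:

    import os, subprocess

    def next_expected(tree):
        # .ctm_status records the last delta applied, e.g. "src-cur 1234"
        name, num = open(os.path.join(tree, ".ctm_status")).read().split()
        return name, int(num) + 1

    def apply_deltas(tree, spool):
        name, want = next_expected(tree)
        held = {}                          # out-of-sequence deltas, held
        for f in os.listdir(spool):
            parts = f.split(".")           # e.g. "src-cur.1235.gz"
            if len(parts) >= 2 and parts[0] == name and parts[1].isdigit():
                held[int(parts[1])] = os.path.abspath(os.path.join(spool, f))
        while want in held:                # apply strictly in sequence
            subprocess.run(["ctm", held[want]], cwd=tree, check=True)
            want += 1
        if held and want <= max(held):
            # The gap: unlike TCP, nobody is holding the missing delta
            # until it is acked.  If every mirror has expired it, you
            # start over from a new baseline.
            print("missing delta %d; re-fetch it or re-baseline" % want)

The while loop is Richard's "TCP window"; the final test is points
1) and 2) above.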


> The advantage is that the end user does not ever REQUIRE a connection to the
> distribution site. It also has the major advantage of giving resource
> allocation control to the source of the information rather than to the
> destination.

That assumes the intermediates have expire times as long as or longer
than the source's.

It's like a usenet FAQ posting... you post the FAQ, and hope it doesn't
expire on someone's site and leave them asking a stupid question which
is answered in the FAQ (the definition of a "stupid question").

If it expires, then there is the potential that between the expiry date
and the next posting date, there will be stupid questions.

The problem with CTM is that the "FAQ" (the baseline from which
subsequent "posts" are derived) is not "posted" frequently enough.

> I also think that you need to recognize that the dropped-update rate
> is not very high at all.

The problem is not the "dropped update rate", it's the "not received
update rate".  There are reasons other than a dropped update for a
non-receipt... the most common one is "I'm a new subscriber".


> > SUP snapshots tend to be more buildable (in my experience).
>
> This is a "locking" problem on the master source and is related to the lack of
> disipline on the part of committers who do not always (often?) make atomic
> updates.

Yes; nolo contendere.  I have suggested a "protocol" fix for this
(multiple reader/single writer/virtual writer tree snapshot) many
times in the past.  I am always shouted down.
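
A sketch of the discipline I mean (illustrative Python, not CVS
internals; the names are mine):

    import threading

    class SnapshotTree:
        # Commits are serialized by a single writer lock; the export
        # tools (sup scanners, CTM generation) read an immutable
        # snapshot that is only swapped *between* commits, so a
        # snapshot can never contain half of a multi-file commit.
        def __init__(self, files):
            self._writer = threading.Lock()
            self._snapshot = dict(files)

        def commit(self, changes):
            with self._writer:               # single writer
                new = dict(self._snapshot)   # "virtual writer" copy
                new.update(changes)
                self._snapshot = new         # atomic pointer swap

        def snapshot(self):
            return self._snapshot            # readers never block

    tree = SnapshotTree({"Makefile": "v1", "main.c": "v1"})
    tree.commit({"Makefile": "v2", "main.c": "v2"})  # visible as a pair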

This of course is only intended to speak to the ability to employ CTM
vs. SUP.  The policy that makes this The Way It Is is not as important
as the fact that This Is The Way It Is, Live With It.


> > Finally, using SUP seems to let me sync multiple trees easier than CTM.
> I don't understand this. There is only one tree. You should be using the
> composite tree (CVS). The method of delivery does not affect the usefulness.
> 
> > The real problem is that CVS sucks for people with commit privs... it
> > would be better if there were a method ...
> 
> Again, this is a problem with the message and not with the messenger.

The difference is whether I have to allocate two trees' worth of disk
space (1.2G), or a single tree (600M) that serves for both checkout
and mirroring.

> The big advantage that I see to the sup scheme is that it provides a mechanism
> to restore a partially trashed tree without transferring or storing the entire
> source. However, in most cases that I have seen, the need to do this was
> created by the user's misuse of the distribution. IMHO, the distribution needs
> to be treated as "read only".

The problem here is that I then have to locally mirror the tree to do
anything with it.

I locally maintain an FS experimentation branch, a devfs experimentation
branch, an SMP experimentation branch, a PPC porting branch, an ELF
experimentation branch, and an LKM autoload branch.  I also have an
inactive DEC Alpha porting branch.

You are suggesting that I duplicate ~3.4G of data simply to make your
scheme work.

If I buy another 4G disk, it's going to be at least 9G to let me test
large drive boundary conditions, and it's going to be assembled using
logical-to-physical device autorecognition, etc.  It will be for
cutting edge stuff, not for making it so I can "take advantage" of
CTM (or more correctly, so CTM can take advantage of me -- or my disks).

I won't be able to trust it with my data anyway, if I'm using it for
testing.  Currently, I have only 2G of test disk available to me.

> I think it would greatly help if we would look at source distribution as a
> mini release (multiple times per day for some things) and have all of the
> necessary information distributed so that it is possible to use any of the
> four distribution channels (tarballs, live tree on CD, CTM, and SUP) to move
> from one point to another. In other words, have sup distribute a CTM release
> (complete with the .ctm_status file identifying it) rather than some other
> arbitrary snapshot. Similarly, we should ensure that the information on
> the CDs matches an identifiable point in the CTM sequence.

This requires changes -- significant changes, which I'm not prepared
to deal with because of GPL -- in the CVS software itself.  It also
requires a change in the CDROM distribution, which has been discussed
at length and then *rejected*.  There are better uses for the space
than mail-order development with surface-mail turnaround on integration
sets.
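
Mechanically, the bookkeeping Richard wants is trivial -- a sketch
(Python; the .ctm_status format and the delta names are simplified
here to hypothetical "name number" and "name.NNNN.gz" forms):

    def deltas_needed(status_path, current):
        # Which CTM deltas move this tree (CD, tarball, or sup mirror)
        # from its recorded point up to `current`?
        name, have = open(status_path).read().split()
        return ["%s.%04d.gz" % (name, n)
                for n in range(int(have) + 1, current + 1)]

    # e.g. a tree stamped at 1236, current delta 1240:
    # -> ['src-cur.1237.gz', ..., 'src-cur.1240.gz']

The hard part is not this; it is making every distribution channel
emit and honor the stamp, which is where the CVS and CDROM changes
come in.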


> I recognize that this does not really address the problems of committing
> changes to a rapidly moving target

People who have enough disk to deal with CTM can afford it; I don't.
Almost all of my fixed storage is tied up right now with various code
branches or completely non-Intel code versions, and I don't have enough
to spare to implement what you suggest.  Yes, it would be convenient
for the badly connected people if everyone could treat their local
archive updates as read-only for everything but the update program.
Practically, however, this is not an option for most of us, especially
if we are actively hacking code in one or more areas and CVS doesn't
support organizing its database for branch-level synchronization so
we can do our work on local vendor branches.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


