Date:      Fri, 06 Feb 2009 21:50:34 +0200
From:      Giorgos Keramidas <>
To:        cpghost <>
Cc:        "" <>
Subject:   Re: OT: SVN checkout checksumming
Message-ID:  <87myczz6np.fsf@kobe.laptop>
In-Reply-To: <> ('s message of "Fri, 6 Feb 2009 20:11:57 +0100")
References:  <> <878wolpydl.fsf@kobe.laptop> <> <871vubv66x.fsf@kobe.laptop> <>

On Fri, 6 Feb 2009 20:11:57 +0100, cpghost <> wrote:
>On Fri, Feb 06, 2009 at 07:14:14PM +0200, Giorgos Keramidas wrote:
>>On Fri, 6 Feb 2009 17:58:00 +0100, cpghost <> wrote:
>>>> Let's assume for a moment that you install a post-commit hook that
>>>> generates a SHA-256 checksum of all the files in the latest repo
>>>> revision on the svn server.
>>>> For the sake of simplicity, let's assume that this file is a simple,
>>>> plain text file that is named db/revs/NUMBER.sha256 where 'NUMBER' is
>>>> the revision number you are check-summing.
>>>> How are you going to *safely* transmit those SHA-256 checksums to the
>>>> client on 'svn checkout'?
>>> Well, sorry to bring this back up, but again: how about signing
>>> NUMBER.sha256 with a GnuPG private key belonging to the FreeBSD
>>> Project? If there's a way to *safely* get the corresponding
>>> public key, checking the signature of the NUMBER.sha256 files
>>> would be trivial.
>> If the signed data is not part of the actual repository, you have a
>> signature for a numeric value, not a signature for the *contents* of the
>> repository itself.
> Hmmm... yes, you're right. Only the digest would be signed in this
> case, and that's not enough. But if the (digest, revision) pair is
> signed, that would at least be useful (somewhat).
> So, let's say that NUMBER.sha256 starts with something like a comment:
> # r123456
> <path1 / digest1>
> <path2 / digest2>
> <path3 / digest3>
> ...
> and all this signed, would it be enough?

Sorry, but no, it wouldn't be enough.  There are other SCM systems where
a cryptographic hash is *part* of the history, like Mercurial, Git and Darcs.

If you really want to be _certain_ that a particular revision is truly
what it is supposed to be, using something that makes cryptographically
secure hashes an integral part of the history is probably the only way
to achieve that goal :/
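
To make "an integral part of the history" a bit more concrete, here is a
minimal sketch of the general idea, not the actual algorithm of Git,
Mercurial or Darcs.  The function names, the all-zero root id and the
exact byte layout are assumptions made up for this illustration.  Each
revision id is a hash over the parent's id, the commit message and the
digests of the files, so changing any old revision changes every id that
comes after it:

```python
import hashlib


def file_digest(data):
    """SHA-256 hex digest of a file's contents (bytes)."""
    return hashlib.sha256(data).hexdigest()


def revision_id(parent_id, manifest, message):
    """Hypothetical revision id: hash of parent id, message and manifest.

    `manifest` maps path -> file digest.  Paths are sorted so that the
    same set of files always yields the same id.
    """
    h = hashlib.sha256()
    h.update(parent_id.encode() + b"\0")
    h.update(message.encode() + b"\0")
    for path in sorted(manifest):
        h.update(path.encode() + b"\0" + manifest[path].encode() + b"\n")
    return h.hexdigest()


ROOT = "0" * 64  # assumed placeholder id for "no parent"

r1 = revision_id(ROOT, {"README": file_digest(b"hello\n")}, "initial import")
r2 = revision_id(r1, {"README": file_digest(b"hello world\n")}, "expand README")

# Same history, except that the first revision was tampered with.
r1_bad = revision_id(ROOT, {"README": file_digest(b"evil\n")}, "initial import")
r2_bad = revision_id(r1_bad, {"README": file_digest(b"hello world\n")}, "expand README")

# r2 and r2_bad have identical file contents but different ids,
# because the id of the parent is part of what gets hashed.
print(r2 != r2_bad)
```

With a scheme like this, a signature over the id of the latest revision
vouches for the whole history behind it, which is what a signed
NUMBER.sha256 file kept outside the repository cannot do.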

> Even if the repository isn't signed, one can compute the digests
> locally and check them with the *signed* list of digests. It may not
> catch everything because of possible collisions, but wouldn't that be
> already better than nothing?

Yes, that might be "good enough", but its set of constraints is slightly
hard to define.  For example:

  * Do you publish checksums for all the files in each revision (a
    'manifest' as some systems call the collection of files)?

  * Do you allow checksums to be recorded as a full manifest every time,
    or do you publish only the checksums for the files that changed
    since the last revision?

  * How do you handle separate branches?

  * Do svn:keywords play a role in the calculation of the checksum?  If
    not, why?

These problems are not as easy to solve as they may initially appear :(
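
For what it's worth, the local check itself is the easy part.  Here is a
rough sketch, assuming a full manifest per revision whose first line is
'# rNUMBER' and whose remaining lines are 'DIGEST  PATH' (two spaces).
That layout, the function names and the decision to skip '.svn'
directories are all assumptions of mine, not anything Subversion
provides.  It also assumes the signature on the manifest was already
verified by some other means (for example with 'gpg --verify'):

```python
import hashlib
import os


def local_digests(root):
    """Map relative path -> SHA-256 hex digest for every file under root."""
    digests = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Subversion's own metadata directories (assumed name).
        dirnames[:] = [d for d in dirnames if d != ".svn"]
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            with open(full, "rb") as f:
                digests[rel] = hashlib.sha256(f.read()).hexdigest()
    return digests


def parse_manifest(text):
    """Parse the hypothetical manifest format into (revision, digests)."""
    lines = text.splitlines()
    if not lines or not lines[0].startswith("# r"):
        raise ValueError("manifest does not start with a '# rNUMBER' line")
    revision = int(lines[0][3:])
    digests = {}
    for line in lines[1:]:
        if line.strip():
            digest, path = line.split("  ", 1)
            digests[path] = digest
    return revision, digests


def verify(root, manifest_text, expected_revision):
    """Return a list of problems; an empty list means everything matched."""
    revision, published = parse_manifest(manifest_text)
    if revision != expected_revision:
        return ["manifest is for r%d, not r%d" % (revision, expected_revision)]
    local = local_digests(root)
    problems = []
    for path in sorted(set(published) | set(local)):
        if path not in local:
            problems.append("missing: " + path)
        elif path not in published:
            problems.append("unexpected: " + path)
        elif local[path] != published[path]:
            problems.append("mismatch: " + path)
    return problems
```

Note that this hashes the bytes found in the working copy, so any file
with expanded svn:keywords would only match if the server computed its
digests over the expanded form too.  Branches and incremental manifests
are not handled at all, which is where the hard questions above come in.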
