Date:      Fri, 4 Dec 2015 11:53:39 -0800
From:      Bryan Drewery <bdrewery@FreeBSD.org>
To:        Ulrich Spörlein <uqs@FreeBSD.org>, freebsd-git@freebsd.org
Cc:        freebsd-current@freebsd.org, git-admin@freebsd.org
Subject:   Re: FYI: SVN to GIT converter currently broken, github is falling behind
Message-ID:  <5661EF43.9040406@FreeBSD.org>
In-Reply-To: <CAJ9axoRV1gwpVsTpB_%2BPQX4ZrWpnJRtTJ77dsA0vC_BekR8=9g@mail.gmail.com>
References:  <CAJ9axoTuuBt4%2Bg4o1%2BLy9VmNfAa3pcMhcPr2ws8T1kCm=Om=tg@mail.gmail.com> <CAJ9axoRBcFD=-d=pzJJvYempEO-EyR_kAiK3EZQ_hp%2B7_J1iyQ@mail.gmail.com> <563EAAB8.5020702@freebsd.org> <CAJ9axoQmgT0B23UtmzGeMcvS%2BCHxC16FL53fPGObBZoxEC03aQ@mail.gmail.com> <CAJ9axoRcCUoLzyGN-JkJEn%2ByinWdVoSKUcBr7eas5638t6jBUg@mail.gmail.com> <CAJ9axoRV1gwpVsTpB_%2BPQX4ZrWpnJRtTJ77dsA0vC_BekR8=9g@mail.gmail.com>

On 12/4/2015 10:49 AM, Ulrich Spörlein wrote:
> 2015-11-08 12:06 GMT+01:00 Ulrich Spörlein <uqs@freebsd.org>:
>> 2015-11-08 11:32 GMT+01:00 Ulrich Spörlein <uqs@freebsd.org>:
>>> 2015-11-08 2:51 GMT+01:00 Alfred Perlstein <alfred@freebsd.org>:
>>>>>
>>>> Uli,
>>>>
>>>> One of the biggest concerns I've heard from folks using FreeBSD's git mirror
>>>> is that the hashes can change.
>>>>
>>>> I have a question about this.   Is it possible to keep track of what the
>>>> "official" git mirror (on github) is doing and keep that as a log.  Then
>>>> that log can be used to replay commits when there is a divergence problem.
>>>>
>>>> What I'm basically saying is that let's take this small example:
>>>>
>>>> importer is working fine @rev 10000
>>>> imports 10000
>>>> imports 10001
>>>> imports 10002
>>>> something happens to importer to give indeterminate shas.
>>>> imports 10003 - sha is "unstable" sha3
>>>> imports 10004 - sha is "unstable" sha4
>>>> imports 10005 - sha is "unstable" sha5
>>>> imports 10006 - sha is "unstable" sha6
>>>> importer is fixed
>>>>
>>>>
>>>> At this point normally we'd rewind the importer to 10002 and then force
>>>> update the affected branches.
>>>>
>>>> My question is... can the imports of 10003, 10004, 10005 and 10006 be put
>>>> into the importer such that any "mirror site" that re-does the import using
>>>> the most up to date importer will get the same shas.
>>>>
>>>> That would allow to proceed with 10007, etc without force pushing.
>>>>
>>>> This should be possible based on querying "git" for the meta data associated
>>>> with sha3..sha6 and then forcing those commits to have the same meta data.
>>>>
>>>> This would eliminate the concern about shas in the mirror changing that I've
>>>> heard.
>>>
>>> The goal of the conversion is that everyone can re-do the conversion
>>> in their basement and come up with the same history and checksums.
>>> This was not the case when I first started, as there was some
>>> non-deterministic hash structure being used in svn2git. This was fixed
>>> in the code and then all converter runs produced the very same
>>> results.
>>>
>>> The scenario that we have right now, is that one of the merge commits
>>> done about two weeks ago is being handled different by svn2git w/ svn
>>> v1.8 vs. svn v1.9 and I haven't investigated yet how the API's
>>> behavior changed to cause this. I'm afraid I also swapped out all my
>>> knowledge about svn2git internals and will have to redo this all from
>>> scratch :/
>>>
>>> Your suggestion could only work, if we hard-code this svn revision
>>> special handling into svn2git, either in the code or by providing more
>>> mappings and rules to the process. svn2git should run hermetic and not
>>> poke at github's commits to see how things were handled in the past.
>>> It has to be self-sufficient and must not depend on github.
>>>
>>> This would also only work, if the "breakage" window was very small,
>>> but it is already about two weeks long and will surely increase till I
>>> find the proper fix.
>>>
>>> So, to take a stand here: this sort of kludge is unlikely to ever
>>> happen. Git commit hashes *might* change in the future. I really don't
>>> see how this is a big deal anyway.  It happened once and I'm trying to
>>> have it never happen again. But why are people afraid of this
>>> happening? Every "official" git commit is tagged with a SVN revision
>>> and the contents of those revisions are obviously correct (just not
>>> the ancestry and the commit objects, possibly). So it would be easy to
>>> write a script that replays VendorA's git history and swaps out the
>>> new official commits for the old official commits. There would be no
>>> merge conflicts.
>>>
>>> I can see how this would be annoying if you have 100 developers and
>>> dozens of branches that are far from mainline FreeBSD. But I'm sure
>>> these companies that depend on git will come forward and donate some
>>> of their developer manpower to help me with keeping the converter
>>> stable/deterministic. Right? Right? :) :)
>>>
>>> Cheers,
>>> Uli
>>
>> Quick update: doc is so far unaffected by svn 1.9, but for ports, the
>> drift happened as of Jul 18, so you'd need to special case a lot of
>> commits.
>>
>> Here's the same commit, and the difference between 1.8 and 1.9:
>>
>> % git cat-file commit 803795d
>> tree 7fc83aba022834da5c218114b09ad4640735bcc0
>> parent c96fb0418e545a569b5975b4d878a30a948c29d5
>> author olgeni <olgeni@FreeBSD.org> 1437203525 +0000
>> committer olgeni <olgeni@FreeBSD.org> 1437203525 +0000
>>
>> Upgrade to version 0.4.1.
>> % git cat-file commit 61ca43b
>> tree 7fc83aba022834da5c218114b09ad4640735bcc0
>> parent c96fb0418e545a569b5975b4d878a30a948c29d5
>> author olgeni <olgeni@FreeBSD.org> 1437203529 +0000
>> committer olgeni <olgeni@FreeBSD.org> 1437203529 +0000
>>
>> Upgrade to version 0.4.1.
>>
>>
>> In case you don't see it, there's a 4s difference in the timestamps
>> for authoring and committing. Here's the original:
>>
>> % svn log -vc392405 svn://svn.freebsd.org/ports
>> ------------------------------------------------------------------------
>> r392405 | olgeni | 2015-07-18 09:12:05 +0200 (Sat, 18 Jul 2015) | 2 lines
>> Changed paths:
>>    M /head/www/elixir-maru/Makefile
>>    M /head/www/elixir-maru/distinfo
>>
>> Upgrade to version 0.4.1.
>>
>> ------------------------------------------------------------------------
>>
>> So yeah, svn 1.9 returned a timestamp that was off by 4s. WTF?
>>
>> For base it's actually even more complicated than I had thought so
>> far. But let's take this one step at time ...
>
> An update, which you won't like to hear:
>
> SVN v1.9 is totally innocent, the API changed a little and has been
> patched, this is not the source of the difference between the
> currently published repo and a clean run. The difference stems from
> the fact that the svnsync'ed copy on git.freebsd.org was poisoned and
> is *NOT* in sync with our main repo. People tell me this is due to a
> shortcoming of svnsync that can race and thus produce different
> metadata for a commit, depending on when it is run.
>
> This is a clusterfuck.
>
> Both freebsd-base and freebsd-ports are no longer reproducible by
> third-parties. It is only a matter of time when freebsd-doc is
> affected.
>
> clusteradm@ sadly has remained rather silent on this issue and unless
> we can move the mirroring to rsync or syncthing or whatever I don't
> see how the project can continue to provide a so-called git "mirror"
>

Running svnsync in 2 places and then calling them mirrors seems odd.
It's only needed once. (svnsync hurts global warming too). Then just
rsync or use the git mirroring features.
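
To make the timestamp problem quoted above concrete, here is a minimal
sketch of how git derives a commit id, and why a 4s drift in svnsync'ed
metadata is enough to change it. The helper names (git_object_id,
commit_body) are made up for illustration, and the exact bytes of the
real commit objects (trailing newline, full log message) are an
assumption, so the printed ids are not expected to match 803795d or
61ca43b.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """Return the SHA-1 id git assigns to an object: sha1("<kind> <len>\\0" + body)."""
    header = f"{kind} {len(body)}\0".encode("ascii")
    return hashlib.sha1(header + body).hexdigest()


def commit_body(timestamp: int) -> bytes:
    """Rebuild the quoted commit text, varying only the timestamp (assumed layout)."""
    return (
        "tree 7fc83aba022834da5c218114b09ad4640735bcc0\n"
        "parent c96fb0418e545a569b5975b4d878a30a948c29d5\n"
        f"author olgeni <olgeni@FreeBSD.org> {timestamp} +0000\n"
        f"committer olgeni <olgeni@FreeBSD.org> {timestamp} +0000\n"
        "\n"
        "Upgrade to version 0.4.1.\n"
    ).encode("utf-8")


# Same tree, same parent, same author; only the timestamp differs by 4s.
id_main = git_object_id("commit", commit_body(1437203525))
id_sync = git_object_id("commit", commit_body(1437203529))

print(id_main)
print(id_sync)
print("ids differ:", id_main != id_sync)
```

Every descendant commit names its parent by id, so one differing
timestamp changes every hash after it on that branch.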


-- 
Regards,
Bryan Drewery





Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?5661EF43.9040406>