Date:      Thu, 26 Sep 2019 10:26:43 -0400
From:      Ed Maste <emaste@freebsd.org>
To:        Warner Losh <imp@bsdimp.com>
Cc:        Sean Chittenden <sean@chittenden.org>, freebsd-git@freebsd.org, Ulrich Spörlein <uspoerlein@gmail.com>
Subject:   Re: Service disruption: git converter currently down
Message-ID:  <CAPyFy2C5FNwHOTuamwKQXY9Z_uMJJGnmo_4fG8UOp8expxiN%2BQ@mail.gmail.com>
In-Reply-To: <CANCZdfoBYwp6Gn9nh754yQGXFR0MWkg3hKo8LF-RX_YgdSBycA@mail.gmail.com>
References:  <CAJ9axoR41gM5BGzT-nPJqqjym1cPYv31dDUwXwi4wsApfDJW%2Bw@mail.gmail.com> <CAJ9axoToynYpF=ZdWdtn_CkkA2nVkgtckQSu%2BcMis1NOXgUdnA@mail.gmail.com> <CAJ9axoR2VXFo9_hx9Z1Qwgs7U-dkan56hrUKO9f7uN6Wpd15xQ@mail.gmail.com> <CAHevUJHwDet8pBdrE4SN3nuoAUgP-ixpCz9uOTdwbE31UDDsbA@mail.gmail.com> <CAPyFy2AMqft2EwdZHYnFUOFxSDOmN1Rv0A9jnR3VdE38SP87pw@mail.gmail.com> <CANCZdfq71yYjGGog9qm2-xb0RRZG8=YdCg3g0%2BotLvPn6r3xJw@mail.gmail.com> <CAPyFy2AWOqtb_DNiekKUx07LbQPzvOkw_qvf58DKuopsvHySTQ@mail.gmail.com> <CANCZdfoBYwp6Gn9nh754yQGXFR0MWkg3hKo8LF-RX_YgdSBycA@mail.gmail.com>

On Wed, 25 Sep 2019 at 15:50, Warner Losh <imp@bsdimp.com> wrote:
>
> git log always requires added care. There's not actually 9000 commits
> there. The tree looks fine topologically. It's purely an artifact of
> git log.

This seems to be getting into a philosophical discussion of what it
means for a commit to exist. But, given the constraints in the way git
represents commits, the history crafted by the svn-git exporter does
indeed show thousands of "phantom" commits. The converter should (and
with uqs's tweaks, would) represent the offending commit here as if it
were a cherry pick, not a merge.

In order to really represent this correctly we would need to add
metadata tracking file operations to git. Recording that path d1/f1
was copied from d2/f2 at some hash would allow us to properly
represent this case as well as renames/moves.

>> git log --first-parent isn't really a solution here either, because
>> there are cases where one legitimately does want history from both
>> parents, especially working in downstream projects.
>
> I'm pretty sure it would be fine, even in that case.

It's not fine, because it omits the commits I want to see.

>> > I'd offer the opinion that needing to know about things like
>> > git log --first-parent vs having to rebase every single downstream
>> > fork,
>>
>> We won't need to rebase every fork - in no case should the path
>> forward be worse than uqs's suggestion of a merge from both old/new
>> conversions.
>
> IMHO, uqs suggestion is a complete non-starter, at least the
> "git diff | git patch" one. It destroys all local history, commit
> messages, etc. Except for the most trivial cases, it's not really
> going to fly with our users. His other, followup ones might be
> workable into scripts.

diff | patch is not the suggestion; the suggestion is to perform a
merge from the "new" conversion. Other options (e.g. some sort of
scripted commit replay) are at least no worse than that base case.

> I'm not sure you can merge, as there's no common ancestor that's
> recent enough to give it a chance at succeeding (since the different
> exports would have different hashes starting fairly early in our
> history). My experience with qemu is that long-lived merge-updated
> branches become quite difficult to cope with after a while. It took
> me three weeks to sort out that relatively simple repo.

In fact, the merge works fine, even with completely unrelated
histories. You can try this by merging 'svn_head' (from git svn) to
'master' (from svn2git), using `git merge --allow-unrelated-histories
origin/svn_head`. The resulting history has two copies of every
commit, but the file contents are unchanged over the merge.
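
As a minimal sketch of that experiment in a scratch repository (the
branch names conv_a and conv_b and the file f are made up for
illustration and merely stand in for the two conversions; this does
not touch the real FreeBSD repo):

```shell
set -eu
cd "$(mktemp -d)"
git init -q demo
cd demo
git config user.email "demo@example.org"
git config user.name "Demo User"

# First "conversion": two commits on a hypothetical branch conv_a.
git symbolic-ref HEAD refs/heads/conv_a
echo one > f; git add f; git commit -q -m "r1 (conversion A)"
echo two >> f; git commit -q -am "r2 (conversion A)"

# Second "conversion": same file contents, but an unrelated root commit,
# so no commit is shared with conv_a.
git checkout -q --orphan conv_b
echo one > f; git add f; git commit -q -m "r1 (conversion B)"
echo two >> f; git commit -q -am "r2 (conversion B)"

# Merge the unrelated history and compare the tree before and after.
git checkout -q conv_a
tree_before=$(git rev-parse 'HEAD^{tree}')
git merge -q --allow-unrelated-histories -m "Merge conversion B" conv_b
tree_after=$(git rev-parse 'HEAD^{tree}')
commit_count=$(git rev-list --count HEAD)
echo "commits reachable from HEAD: $commit_count"
[ "$tree_before" = "$tree_after" ] && echo "tree unchanged by the merge"
```

In this toy case HEAD ends up with both copies of each commit plus the
merge commit, while the tree hash is the same on either side of the
merge.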

If you try this in a tree with changes (i.e., try applying it to a
long-running merge-based branch) every modified file will result in a
conflict, but they can be trivially resolved in favour of the first
version. From that point on merging from the "new" conversion will
work as expected.
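
A rough sketch of that second case, again in a scratch repository
(old_conv, new_conv, local and the files f and g are hypothetical
names; resolving with `git checkout --ours` is just one way to take
the first version, and a real branch would have many files to resolve):

```shell
set -eu
cd "$(mktemp -d)"
git init -q demo
cd demo
git config user.email "demo@example.org"
git config user.name "Demo User"

# "Old" and "new" conversions: same content, unrelated root commits.
git symbolic-ref HEAD refs/heads/old_conv
echo upstream > f; git add f; git commit -q -m "r1 (old conversion)"
git checkout -q --orphan new_conv
git commit -q -m "r1 (new conversion)"

# A downstream branch based on the old conversion, with a local change.
git checkout -q -b local old_conv
echo "local change" >> f; git commit -q -am "local change"

# The first merge from the new conversion conflicts on the modified file.
if git merge --allow-unrelated-histories -m "Switch to new conversion" \
    new_conv >/dev/null 2>&1; then
    echo "expected a conflict" >&2; exit 1
fi
git checkout --ours -- f      # resolve in favour of the first (our) version
git add f
git commit -q -m "Merge new conversion, keeping local versions"

# Later upstream work on the new conversion now merges without conflict.
git checkout -q new_conv
echo more > g; git add g; git commit -q -m "r2 (new conversion)"
git checkout -q local
git merge -q -m "Merge r2 from new conversion" new_conv
```

After the one-time resolution the new conversion is an ancestor of the
local branch, so the second merge has a proper merge base to work from.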

> A rebase has a chance of working for people following a 'rebase'
> work flow.

Indeed, for rebase workflow it's fairly straightforward.

> However, for people like CHERIBSD who follow a 'merge from upstream'
> model which never rebases (since that would be anti-social to their
> down streams), I'm having trouble understanding how that could work.
> At work, we basically do the merge from upstream with collapse model,
> which I'm having trouble seeing how to move from old hashes to new.
> I'd like to know what the plan for that would be and would happily
> test any solution there with a copy of our repo. I'd even be happy to
> run experiments in advance of there being something more public
> available to see what options do or don't work.

Could you expand on the "merge from upstream with collapse" -
specifically, can you provide an example command used when merging
from FreeBSD?


