The Uncherrypick Identity Crisis
This scenario was presented to me at the Mercurial Sprint days. It poses a subtle problem for VCSes like JJ and GitButler, which choose to constantly replay history.
We start with a small commit graph that looks like:
* B
| * C
|/
* A
Let’s imagine they have the file contents as follows:
A: a
B: ab
C: abc
We can imagine this scenario as having some bug fix in B, while C contains both the same bug fix and an additional feature.
In our scenario, we then perform two operations:
- Rebase C on top of B, to get C'
- Rebase C' back on top of A, to get C''
After the first step, we achieve a linear history in which C' sits on top of B, and
the contents of C' are abc as expected.
The problem appears when we cherry-pick C' back onto A. In that operation,
the merge base is B, the parent of C': one side removes b to get back to A,
while the other side adds only c to get from B to C'. The three-way merge
combines those changes and produces ac, not abc.
That is surprising because the two rebases look like they should cancel out. We started with C, moved it onto B, then moved it back onto A; but the result is no longer equal to C.
This example shows that when performing the cherry-pick in a VCS that makes use
of three-way merges, merge(l=A, base=B, r=merge(l=B, base=A, r=C)) does not
always equal C.
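To make the arithmetic concrete, here is a minimal sketch of the two merges in Python. It assumes a deliberately simplified model: each file is treated as an unordered set of characters, so ordering and conflicts are ignored. The merge3 function is a hypothetical toy for illustration, not how Git, Mercurial, JJ, or GitButler actually implement merging.

```python
def merge3(l: str, base: str, r: str) -> str:
    """Toy three-way merge that treats each file as a set of characters.

    Assumption for illustration only: real VCS merges work on ordered
    lines and can produce conflicts; this model cannot.
    """
    left, origin, right = set(l), set(base), set(r)
    # Present on both sides: either untouched by both, or added by both.
    kept = left & right
    # Anything one side has that the base did not is an addition.
    added = (left - origin) | (right - origin)
    # Content in the base but missing from one side counts as removed.
    return "".join(sorted(kept | added))


A, B, C = "a", "ab", "abc"

# Step 1: rebase C onto B. The base is C's original parent, A.
rebased = merge3(l=B, base=A, r=C)
print(rebased)  # abc

# Step 2: move the result back onto A. The base is now its parent, B.
round_trip = merge3(l=A, base=B, r=rebased)
print(round_trip)  # ac
```

In the second merge, the left side is seen as deleting b (present in the base B, absent in A), and nothing on the right side re-adds it, so b is dropped from the result.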
If we go back to imagining that the content b was some sort of bug fix, then
we have potentially introduced a regression by reverting what should have
otherwise been an identity operation.
This affects both Git and Mercurial, but it is less of a concern there since users rarely do this by hand. However, in a VCS like JJ or GitButler, we encourage users to drag and drop changes around, with the promise of some notion of consistency and reversibility in these operations.
As a result, there is really an onus on us to take scenarios like this seriously and to try and define better primitives that can avoid pitfalls like this.