The delang is using merge instead of rebase/squash - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » The delang is using merge instead of rebase/squash

Thread overview

The delang is using merge instead of rebase/squash

Mar 15, 2017

Mar 15, 2017

Mar 15, 2017

Mar 20, 2017

Mar 20, 2017

Mar 20, 2017

Mar 21, 2017

Vladimir Panteleev

Mar 21, 2017

Mar 21, 2017

Vladimir Panteleev

Mar 21, 2017

Mar 22, 2017

Vladimir Panteleev

Mar 22, 2017

Sebastien Alaiwan

Mar 22, 2017

Multi-commit PRs vs. multiple single-commit PRs
Mar 21, 2017 Vladimir Panteleev
Mar 21, 2017 deadalnix
Mar 22, 2017 Vladimir Panteleev
Mar 22, 2017 Vladimir Panteleev
Mar 22, 2017 Daniel N
Mar 22, 2017 Vladimir Panteleev
Mar 22, 2017 Daniel N
Mar 22, 2017 Vladimir Panteleev
Mar 22, 2017 Daniel N
Mar 22, 2017 deadalnix
Mar 22, 2017 Vladimir Panteleev
Mar 22, 2017 Andrei Alexandrescu
Mar 23, 2017 Vladimir Panteleev
Mar 23, 2017 Jonathan M Davis
Mar 23, 2017 Andrei Alexandrescu
Mar 24, 2017 Vladimir Panteleev
Mar 24, 2017 Seb
Mar 24, 2017 Vladimir Panteleev
Mar 24, 2017 deadalnix
Mar 24, 2017 Martin Nowak

Mar 21, 2017

Mar 24, 2017

Nov 19, 2017

Feb 03, 2018

March 15, 2017

The delang is using merge instead of rebase/squash

Posted by deadalnix

deadalnix

This is making the history very spaghettified. Is that possible to have the bot rebase/squash commits and then pushing ?

March 15, 2017

Re: The delang is using merge instead of rebase/squash

Posted by deadalnix
in reply to deadalnix

deadalnix

Posted in reply to deadalnix

On Wednesday, 15 March 2017 at 13:14:31 UTC, deadalnix wrote:
> This is making the history very spaghettified. Is that possible to have the bot rebase/squash commits and then pushing ?

Arf I fat fingered the title, i meant the dlang bot.

March 15, 2017

Re: The delang is using merge instead of rebase/squash

Posted by Seb
in reply to deadalnix

Seb

Posted in reply to deadalnix

On Wednesday, 15 March 2017 at 13:23:00 UTC, deadalnix wrote:
> On Wednesday, 15 March 2017 at 13:14:31 UTC, deadalnix wrote:
>> This is making the history very spaghettified. Is that possible to have the bot rebase/squash commits and then pushing ?
>
> Arf I fat fingered the title, i meant the dlang bot.

I absolutely agree with you and in fact Andrei and I pushed for auto-merge-squash, which was enabled for the last three months. However, recently Martin disabled it to prevent "accidental squashes".
Discussion:

https://github.com/dlang-bots/dlang-bot/issues/64

March 20, 2017

Re: The delang is using merge instead of rebase/squash

Posted by Martin Nowak
in reply to deadalnix

Martin Nowak

Posted in reply to deadalnix

On Wednesday, 15 March 2017 at 13:14:31 UTC, deadalnix wrote:
> This is making the history very spaghettified. Is that possible to have the bot rebase/squash commits and then pushing ?

I don't really agree with the argument. A merge commit is a clear way to integrate changes from a PR/branch.
Just rebasing a PR on top of master removes a lot of information from git, only leaving references to github.

Can you be more specific, what you mean w/ spaghetti? The fact that reciew fixes are added to PRs.

Also github's commit view misleadingly shows commits from merged PRs/branches, which aren't actually in master.

March 20, 2017

Re: The delang is using merge instead of rebase/squash

Posted by deadalnix
in reply to Martin Nowak

deadalnix

Posted in reply to Martin Nowak

On Monday, 20 March 2017 at 05:10:04 UTC, Martin Nowak wrote:
> On Wednesday, 15 March 2017 at 13:14:31 UTC, deadalnix wrote:
>> This is making the history very spaghettified. Is that possible to have the bot rebase/squash commits and then pushing ?
>
> I don't really agree with the argument. A merge commit is a clear way to integrate changes from a PR/branch.
> Just rebasing a PR on top of master removes a lot of information from git, only leaving references to github.
>
> Can you be more specific, what you mean w/ spaghetti? The fact that reciew fixes are added to PRs.
>
> Also github's commit view misleadingly shows commits from merged PRs/branches, which aren't actually in master.

Because a picture is clearer than a thousand words:

| | | | | | | |
* | | | | | | |   08ae52d8 The Dlang Bot
|\ \ \ \ \ \ \ \      Merge pull request #5231 from RazvanN7/Update_generated
| |_|_|_|_|/ / /
|/| | | | | | |
| | | | | | | |
| * | | | | | | c6480976 RazvanN7
|/ / / / / / /      Updated posix.mak makefile to use ../tools/checkwhitespace.d
| | | | | | |
* | | | | | |   1181fcf7 The Dlang Bot
|\ \ \ \ \ \ \      Merge pull request #5239 from sprinkle131313/ignore-vscode-lib
| | | | | | | |
| * | | | | | | f1b8d0d4 sprinkle131313
| | | | | | | |     Add temp/tmp folder to gitignore.
| | | | | | | |
| * | | | | | | b67bf9d1 sprinkle131313
| | |_|/ / / /      Add vscode folder and lib files to gitignore.
| |/| | | | |
| | | | | | |
* | | | | | |   0b41c996 The Dlang Bot
|\ \ \ \ \ \ \      Merge pull request #5242 from wilzbach/fix-lref-links
| | | | | | | |
| * | | | | | | 090d5164 Sebastian Wilzbach
|/ / / / / / /      Fix links from $(LREF $(D ...)) -> $(LREF ...)
| | | | | | |
* | | | | | |   f2a019df The Dlang Bot
|\ \ \ \ \ \ \      Merge pull request #5241 from MartinNowak/merge_stable
| | | | | | | |
| * | | | | | | a6cb85b8 Sebastian Wilzbach
| | | | | | | |     Add @safe to std.regex unittest
| | | | | | | |
| * | | | | | |   ad70b082 Martin Nowak
| |\ \ \ \ \ \ \      Merge remote-tracking branch 'upstream/stable' into merge_stable
|/ / / / / / / /
| | | | | | | |
| * | | | | | |   694dd174 Stefan Koch
| |\ \ \ \ \ \ \      Merge pull request #5167 from DmitryOlshansky/fix-freeform-regex
| | | | | | | | |
| | * | | | | | | 62cf615d Dmitry Olshansky
| |/ / / / / / /      Fix issue 17212 std.regex doesn't ignore whitespace after character classes
| | | | | | | |
* | | | | | | | 5b07bd59 Sebastian Wilzbach
| |_|_|_|/ / /      [BOOKTABLES]: Add BOOKTABLE to stdx.checkedint (#5238)
|/| | | | | |
| | | | | | |
* | | | | | |   75059373 Jack Stouffer
|\ \ \ \ \ \ \      Merge pull request #5225 from wilzbach/booktable-std-utf
| |_|_|_|_|_|/
|/| | | | | |

What the hell is going on in there ?

In addition there are a bunch of practical issues with this way of doing things. First there is no given that any intermediate state is sound, or even builds at all. That makes it very hard to bissect anything. There are also a lot of errands and correction that are made during review that are not that interesting to keep in the project history. Knowing that someone did thing the A way and then changed it the B way after review is more noise than useful infos in the general case, and in the rare case when someone actually wants to know, the github PR is still out there (on that note, yes GH PR kind fo sucks, but that's another topic).

March 20, 2017

Re: The delang is using merge instead of rebase/squash

Posted by Martin Nowak
in reply to deadalnix

Martin Nowak

Posted in reply to deadalnix

On Monday, 20 March 2017 at 12:25:22 UTC, deadalnix wrote:
> In addition there are a bunch of practical issues with this way of doing things. First there is no given that any intermediate state is sound, or even builds at all. That makes it very hard to bissect anything.

You bissect on master and there is one merge commit per PR, no intermediate states involved.

> There are also a lot of errands and correction that are made during review that are not that interesting to keep in the project history. Knowing that someone did thing the A way and then changed it the B way after review is more noise than useful infos in the general case, and in the rare case when someone actually wants to know, the github PR is still out there (on that note, yes GH PR kind fo sucks, but that's another topic).

That's a conflict of interest. As said, GH's interface is targetted toward pushing review fixes as new commits, rather than ammending changes.
And yes those commits mostly provide little information, but they're also on a separate branch.

Using auto-squash before merging https://github.com/dlang-bots/dlang-bot/issues/64#issuecomment-284155249 made sense but isn't offered by GH's API and thus requires quite some work.

Just squashing everything to a single commit and putting that on master (as done by GH's squash+rebase) doesn't preserve all information in git.
Also smells like it might cause automation troubles at some point.

March 21, 2017

Re: The delang is using merge instead of rebase/squash

Posted by Vladimir Panteleev
in reply to deadalnix

Vladimir Panteleev

Posted in reply to deadalnix

On Monday, 20 March 2017 at 12:25:22 UTC, deadalnix wrote:
> Because a picture is clearer than a thousand words:

What this tells me is that the default way git-log presents history is not very useful. Consider this presentation of the same information:

08ae52d8 The Dlang Bot: Merge pull request #5231 from RazvanN7/Update_generated
- c6480976 RazvanN7: Updated posix.mak makefile to use ../tools/checkwhitespace.d
1181fcf7 The Dlang Bot: Merge pull request #5239 from sprinkle131313/ignore-vscode-lib
- f1b8d0d4 sprinkle131313: Add temp/tmp folder to gitignore.
- b67bf9d1 sprinkle131313: Add vscode folder and lib files to gitignore.
0b41c996 The Dlang Bot: Merge pull request #5242 from wilzbach/fix-lref-links
- 090d5164 Sebastian Wilzbach: Fix links from $(LREF $(D ...)) -> $(LREF ...)
f2a019df The Dlang Bot: Merge pull request #5241 from MartinNowak/merge_stable
- a6cb85b8 Sebastian Wilzbach: Add @safe to std.regex unittest
- ad70b082 Martin Nowak: Merge remote-tracking branch 'upstream/stable' into merge_st…
  - 694dd174 Stefan Koch: Merge pull request #5167 from DmitryOlshansky/fix-freeform-…
    - 62cf615d Dmitry Olshansky: Fix issue 17212 std.regex doesn't ignore whitespace …
5b07bd59 Sebastian Wilzbach: [BOOKTABLES]: Add BOOKTABLE to stdx.checkedint (#5238)
75059373 Jack Stouffer: Merge pull request #5225 from wilzbach/booktable-std-utf

In particular, the origin commit of a branch is often not interesting; only the list of commits that are on one branch and aren't on another are.

Anyway, personally I don't think there is a severe problem in dire need of fixing with the git log excerpt you pasted. If you're interested in looking at changes to the master branch, look for asterisks in the first column.

> In addition there are a bunch of practical issues with this way of doing things.

There seem to be more practical issues with the opposite approach.

> First there is no given that any intermediate state is sound, or even builds at all. That makes it very hard to bissect anything.

Bisecting D is not something that can be reasonably done by looking at just one repository's history anyway; this is why we have D-dot-git and Digger. Either way, for pull requests that make non-trivial changes or additions, you will need to descend into the pull request itself.

> There are also a lot of errands and correction that are made during review that are not that interesting to keep in the project history. Knowing that someone did thing the A way and then changed it the B way after review is more noise than useful infos in the general case,

Agreed, this is one case where squashing is appropriate. However, consider the worst-case scenarios where either merge strategy is abused:

- If a pull request that should have been squashed has been merged without squashing, the result is:
  - Some clutter in the git history;
  - Possible (but avoidable) complications when doing git-bisect on a single repository, which you shouldn't be doing anyway.

- If a pull request that should not have been squashed has been squashed while merging, the result is:
  - Commit messages are lost and remain available only on GitHub.
  - Any logical separation of changes that might have been represented through separate commits is lost and remains available only on GitHub.
  - "git blame" becomes less useful because it can only lead to the big blob of the squashed changes.
  - "git blame" becomes less useful because in some situations it loses its ability to track moved code, which should and often is done in separate commits.
  - Bisection becomes more difficult because it is no longer easily possible to dive into a PR, as has been occasionally necessary.

In general, I am not opposed to giving reviewers the option to merge pull requests with squashing, assuming we can all agree to not abuse it and only use it for PRs where there nothing useful can be gained by preserving the multiple commits as they are; however, their words and actions have shown that this doesn't seem to be an attainable point of agreement.

March 21, 2017

Re: The delang is using merge instead of rebase/squash

Posted by deadalnix
in reply to Vladimir Panteleev

deadalnix

Posted in reply to Vladimir Panteleev

On Tuesday, 21 March 2017 at 01:39:39 UTC, Vladimir Panteleev wrote:
> On Monday, 20 March 2017 at 12:25:22 UTC, deadalnix wrote:
>> Because a picture is clearer than a thousand words:
>
> What this tells me is that the default way git-log presents history is not very useful. Consider this presentation of the same information:
>

It's not good either. Why would I want to look at a DAG when the serie of event is strictly linear to begin with ?

> In particular, the origin commit of a branch is often not interesting; only the list of commits that are on one branch and aren't on another are.
>

Yes, that's why rebasing makes thing clearer. Nobody care what the master commit was when the work was started.

>> First there is no given that any intermediate state is sound, or even builds at all. That makes it very hard to bissect anything.
>
> Bisecting D is not something that can be reasonably done by looking at just one repository's history anyway; this is why we have D-dot-git and Digger. Either way, for pull requests that make non-trivial changes or additions, you will need to descend into the pull request itself.
>

"Our source control is completely broken, but that's not a problem because we developed 3rd party tools to work around the brokenness"

While I agree with you that things like bisecting are broken in D, I don't see it as a reason to screw things up even more. I'm not a big fan of "it's already broken, so we can break it even more". This should, and can, be fixed.

https://danluu.com/monorepo/

Incidentally, I got a company contacting me last week willing to pay me good money to help them transition toward these kind of workflow.

> - If a pull request that should not have been squashed has been squashed while merging, the result is:
>   - Commit messages are lost and remain available only on GitHub.
>   - Any logical separation of changes that might have been represented through separate commits is lost and remains available only on GitHub.
>   - "git blame" becomes less useful because it can only lead to the big blob of the squashed changes.
>   - "git blame" becomes less useful because in some situations it loses its ability to track moved code, which should and often is done in separate commits.
>   - Bisection becomes more difficult because it is no longer easily possible to dive into a PR, as has been occasionally necessary.
>

Then it should have been 2 PR or more to begin with. Splitting PR in smaller ones is a good practice in general, there are ample proof that is increase the quality of the code review, reduce conflicts surface with other PR, makes reverting easier and more targeted when something happens, etc...

Keeping this PR's commits is just a way to mitigate one of the negative consequences of kitchen sink PRs. It does so by impacting negatively others aspects of source control, and does nothing to mitigate other negatives aspects of kitchen sink PRs, such as review fatigue (see a specific example below).

> In general, I am not opposed to giving reviewers the option to merge pull requests with squashing, assuming we can all agree to not abuse it and only use it for PRs where there nothing useful can be gained by preserving the multiple commits as they are; however, their words and actions have shown that this doesn't seem to be an attainable point of agreement.

If multiple commits are important for the PR, then the PR should have been several PR to begin with. Asking people to split s the way to go.

Consider this PR: https://github.com/BitcoinUnlimited/BitcoinUnlimited/pull/164

You can see in the comments that I asked the original author to split it up because it was a kitchen sink and very hard to review in its current form. This was ignored. The PR ended up containing a bug that would cost about $12 500 to one of the users of the software, plus a fair amount of reputational damage. The change containing the bug did not need to be bundled with the rest of the PR, and would have almost certainly be noticed if it had been made on a PR of its own.

Bundling several changes in the same PR has real world consequences that go beyond screwing up source control.

March 21, 2017

Re: The delang is using merge instead of rebase/squash

Posted by Vladimir Panteleev
in reply to deadalnix

Vladimir Panteleev

Posted in reply to deadalnix

On Tuesday, 21 March 2017 at 11:59:42 UTC, deadalnix wrote:
> It's not good either. Why would I want to look at a DAG when the serie of event is strictly linear to begin with ?

Not sure what you mean here. The way it's presented is not a DAG.

> Yes, that's why rebasing makes thing clearer. Nobody care what the master commit was when the work was started.

Sure, I'm not against rebasing. It's the squashing that's problematic.

> "Our source control is completely broken, but that's not a problem because we developed 3rd party tools to work around the brokenness"

That's fallacious.

> While I agree with you that things like bisecting are broken in D, I don't see it as a reason to screw things up even more. I'm not a big fan of "it's already broken, so we can break it even more". This should, and can, be fixed.
>
> https://danluu.com/monorepo/
>
> Incidentally, I got a company contacting me last week willing to pay me good money to help them transition toward these kind of workflow.

I don't disagree with you, but this is a different discussion that's orthogonal to this one.

> Then it should have been 2 PR or more to begin with. Splitting PR in smaller ones is a good practice in general,

You are changing the subject. I'll reply in another post with a different subject.

March 21, 2017

Multi-commit PRs vs. multiple single-commit PRs

Posted by Vladimir Panteleev
in reply to deadalnix

Vladimir Panteleev

Posted in reply to deadalnix

On Tuesday, 21 March 2017 at 11:59:42 UTC, deadalnix wrote:
> Then it should have been 2 PR or more to begin with. Splitting PR in smaller ones is a good practice in general,

This is probably true for many cases, but I don't think it's a general truth.

First, there are extreme cases like these:

https://github.com/dlang/druntime/pull/1402
https://github.com/dlang/phobos/pull/260

I think we can agree that it would be better to have 1 pull request with 70 commits than 70 pull requests with 1 commit.

Second, there are many cases that fall in the middle: some ancillary change (such as a minor refactoring, or a build file change) is needed by a bigger change, but it is too minor to be a PR of its own. Putting both in one commit is also unwarranted.

I think that the philosophy to prefer squashing is not suited for projects such as D, where we care about history. Pushing for one change per PR also pushes people to put too many things in one commit, and write less descriptive commit messages.

Finally, some of the biggest open source projects merge pull requests consisting of multiple commits, and encourage submitters to divide their changes into as many commits as is logical, which seems to be the workflow they consider optimal.

> there are ample proof that is increase the quality of the code review,

OK, where is the proof?

It is worth pointing out that GitHub's UI is heavily biased towards reviewing PRs in entirety, however it does allow reviewing PRs commit by commit, which is how I think non-trivial submissions and reviews should occur anyway.

> reduce conflicts surface with other PR, makes reverting easier and more targeted when something happens, etc...

Sure, I'm not advocating that all submissions happen as one PR. The way I understand it, it is you who is advocating the extreme position that all PRs should always contain a single commit.

> Keeping this PR's commits is just a way to mitigate one of the negative consequences of kitchen sink PRs. It does so by impacting negatively others aspects of source control, and does nothing to mitigate other negatives aspects of kitchen sink PRs,

Frankly I don't think this makes any sense at all.

> such as review fatigue (see a specific example below).

The other side of the coin is submitter fatigue.

I've seen this happen:

1. Submitter submits a PR, containing two commits: a change, and a refactoring required for the change.
2. Reviewers ask the submitter to split it into two PRs.
3. Submitter resubmits the refactoring as a separate PR.
4. The refactoring PR sits in the review queue forever because at best, it does nothing, at worst it introduces a regression. Reviewers who did not see or bother to read the first PR ask what this refactoring is for and why it's needed.
5. Submitter is fed up and leaves.

> Consider this PR: https://github.com/BitcoinUnlimited/BitcoinUnlimited/pull/164
>
> You can see in the comments that I asked the original author to split it up because it was a kitchen sink and very hard to review in its current form. This was ignored. The PR ended up containing a bug that would cost about $12 500 to one of the users of the software, plus a fair amount of reputational damage. The change containing the bug did not need to be bundled with the rest of the PR, and would have almost certainly be noticed if it had been made on a PR of its own.
>
> Bundling several changes in the same PR has real world consequences that go beyond screwing up source control.

I don't think that's a fair example at all.

What exactly prevents reviewing a PR consisting of multiple commits differently from multiple PRs consisting of one commit?

In both cases, you can:
- look at each change individually
- add review comments on the change, either on the change in entirety or on individual lines
- selectively pick and merge a subset of the submitted changes (though, GitHub makes this more difficult for multi-commit PRs).

I can only guess that by "difficult to review" you mean from only looking at the "diff" tab; however, I think it's disingenuous to say that if you are not using the tools properly.

Anyway, to reiterate, this is a distinct argument from which merge strategy to use. However, generally, I think this approach is more bad than good because it pushes towards destroying information (commit messages and separation) and clumping too many minor changes into single commits.

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation