Clean Version Control with Git and Gerrit

Mark Dappollone
Oct 2, 2018

Version control is not the most interesting problem in software development, and with the ubiquity of GitHub, a lot of organizations simply feel that it’s a problem that has already been solved. But there’s a different way to version your software using Git, and I’m always surprised when people don’t know about it. It’s called Gerrit, a little-known code-review tool originally developed by Google for the development of Android. Nowadays, there are lots of ways to get your hands on Gerrit, but before we get into that, let’s back up a bit.

What is Gerrit?

At its core, Gerrit is a code-review tool, but the way it works can have much larger implications for your team’s workflow. Gerrit acts as a middleman between your developers and your actual Git repository. Instead of pushing and merging commits directly to your repo, you push them to Gerrit, where they are staged for things like code review and verification (more on that later). While staged in Gerrit, a commit can be edited by the original developer on their machine and then pushed to Gerrit again. Through a magic system of “Change-Ids” (an identifier carried in the footer of each commit message), Gerrit keeps track of all the versions of all your commits and groups them into “Patch-Sets”: the successive versions of a single commit, which record all your edits and iterations. It also associates each commit with other related commits, creating a stack of commits that can be individually reviewed, edited and merged. Here’s how it works:

Every time a developer pushes a new version of a commit to Gerrit, the system adds a new patch set to the list for that commit. A code reviewer can then view a line-by-line diff of each patch set against any of its predecessors, to see exactly what’s changed, using the Gerrit web portal:

Notice up in the top right-hand corner where it says “Patch Sets (2/2)”? That means that I pushed two different versions of this commit. I pushed one, someone commented on it, and then I pushed another to address those comments. Now the reviewer can look at not only the changes in the commit, but also the changes between this version of the commit and the last version. This is incredibly useful for checking whether all the code-review comments were addressed.
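To make the patch-set idea concrete, here is a minimal sketch in Python. It is a toy model, not Gerrit’s real API or storage format: the `ToyReviewServer` class, its `push` and `diff` methods, and the sample commit are all made up for illustration. The only detail taken from Gerrit itself is the shape of the `Change-Id` footer (the letter `I` followed by 40 hex characters).

```python
import re
from dataclasses import dataclass, field

# Gerrit's Change-Id footer: the letter "I" followed by 40 hex characters.
CHANGE_ID_RE = re.compile(r"^Change-Id: (I[0-9a-f]{40})$", re.MULTILINE)


@dataclass
class Change:
    """One reviewable change: every pushed version of the same logical commit."""
    change_id: str
    patch_sets: list = field(default_factory=list)  # commit contents, oldest first


class ToyReviewServer:
    """Hypothetical stand-in for Gerrit; not its real API or storage model."""

    def __init__(self):
        self.changes = {}

    def push(self, message, content):
        """Record a pushed commit and return (change_id, patch_set_number)."""
        match = CHANGE_ID_RE.search(message)
        if match is None:
            raise ValueError("commit message has no Change-Id footer")
        change_id = match.group(1)
        change = self.changes.setdefault(change_id, Change(change_id))
        change.patch_sets.append(content)
        return change_id, len(change.patch_sets)

    def diff(self, change_id, old, new):
        """Lines present in patch set `new` but not in `old` (1-based numbers)."""
        sets = self.changes[change_id].patch_sets
        before = set(sets[old - 1].splitlines())
        return [line for line in sets[new - 1].splitlines() if line not in before]


# Usage: the same commit pushed twice, once before and once after review.
server = ToyReviewServer()
footer = "Change-Id: I" + "a" * 40  # made-up identifier
first = server.push("Add login screen\n\n" + footer, "show_form()")
second = server.push("Add login screen\n\n" + footer, "show_form()\nvalidate_input()")
print(first[1], second[1])          # 1 2
print(server.diff(first[0], 1, 2))  # ['validate_input()']
```

The point of the sketch is that the second push does not create a second change. Because the footer is the same, it lands as patch set 2 of the same change, which is what lets a reviewer compare it against patch set 1.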

Notice as well that the commit in the screenshot is one of 11 commits, all in a stack. That means that a reviewer can look at each and every commit individually, comment on each one, and then the developer can edit each one and push a whole new stack. When this gets merged to the repo, the commit log will be a clean, detailed list of granular commits, and each one will have been edited to conform to the team’s standards.

How do you edit a commit? Glad you asked.

The Magic of Rebase

The Gerrit system, when used as designed, trades all your old GitHub-style merge commits for rebases. For anyone unfamiliar with rebases, they’re like merges, except magic. In a merge, Git combines your branch with another branch and captures the result in a new commit. This is a merge commit: a commit that you didn’t write, but that Git generates for you, and that holds the resolution of your changes with the branch you’re merging into. Once a merge commit is in your history, it becomes much more difficult to revert individual commits that are involved in the merge.

With a rebase, a very similar process occurs, but the difference is that your commits are replayed as if they had originally been committed on the tip of the other branch. That means that when you’re done, you have your exact stack of commits sitting neatly on top of the new branch, as if you had started there from the beginning. This makes everything simpler, from reading the commit log to reverting individual commits. It’s so clean that you can actually use your unedited commit log as a complete set of build notes, with no special handling. Rebases are also what allow you to push and re-push whole stacks of edited commits to Gerrit and create patch sets for each commit, while still associating the commits together into a logical list of dependencies.
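The difference between the two histories is easier to see in a small model. The sketch below is a simplification in Python, not how Git is implemented: the `Commit` class and the `merge`, `rebase` and `log` helpers are made up for illustration, the commits carry only a message, and the rebase assumes a linear feature branch with no conflicts.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # identity comparison: two commits are never "equal" by value
class Commit:
    message: str
    parents: tuple = ()


def merge(ours, theirs, message):
    """A merge adds one new commit with two parents; existing commits are untouched."""
    return Commit(message, (ours, theirs))


def rebase(tip, old_base, new_base):
    """Copy the commits after old_base up to tip onto new_base, oldest first."""
    to_replay = []
    commit = tip
    while commit is not old_base:
        to_replay.append(commit)
        commit = commit.parents[0]  # assumes a linear feature branch
    new_tip = new_base
    for commit in reversed(to_replay):
        # A new commit object with the same message but a different parent.
        new_tip = Commit(commit.message, (new_tip,))
    return new_tip


def log(tip):
    """Messages from tip back to the root, following first parents only."""
    messages = []
    commit = tip
    while commit is not None:
        messages.append(commit.message)
        commit = commit.parents[0] if commit.parents else None
    return messages


# Usage: two feature commits branch off the root while master moves ahead.
root = Commit("Initial commit")
master_tip = Commit("Fix crash on startup", (root,))
feature_1 = Commit("Add login screen", (root,))
feature_tip = Commit("Validate login input", (feature_1,))

merged = merge(master_tip, feature_tip, "Merge branch 'feature'")
rebased = rebase(feature_tip, root, master_tip)

print(len(merged.parents))  # 2
print(log(rebased))
# ['Validate login input', 'Add login screen', 'Fix crash on startup', 'Initial commit']
```

In this model the merge result has two parents, while the rebased tip leads back to the root in a single line. Note also that the rebased commits are new objects carrying the old messages, which is why an identifier stored in the commit message, like the Change-Id, can survive a rebase.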

The GitHub Dilemma

Now let’s consider the same situation using a typical GitHub workflow. First, all 11 related commits would typically be part of a single pull request (PR). Normally, a reviewer wouldn’t even bother to review each commit, instead reviewing all the changes in the entire request at once and potentially commenting on multiple commits at the same time. The reason is that the author can’t go back and edit individual commits in the stack, so there’s no point in reviewing each one. If the reviewer were to comment on each commit, the only way the author could edit them individually would be either to create an entirely new PR, in which case the original comments would be lost, or to force-push a new stack of commits over the existing one, dropping the original changes altogether. Because of this, a developer usually winds up pushing a new commit on top of everything else that addresses the review comments.

This system quickly becomes a mess. Not only does your repo become littered with merge commits from PRs, but also, reverting a single commit from one of those merges is tough. And in addition to those problems, you also have a laundry list of commits in the history that simply address disparate, sometimes unrelated code review comments on a bunch of previous commits. As time goes on, those “Address Review Comments” commits become more and more inscrutable. And if you want to know why you made any of those changes, well, good luck.

In addition to keeping your repo clean, and your commit log readable, Gerrit also keeps a history of every commit and every reviewer’s comments on those commits. If you want to know why you made some random change in the code, you can find the commit in Gerrit and look at its entire history to see what prompted every edit.

So, Gerrit makes your version history much cleaner, but it can also help you verify your commits with sanity checks and protect your repo from accidental compilation errors. How does that all work? Stay tuned.

Next Time: Verifying code using Gerrit with Continuous Integration

Mark Dappollone

Director, Mobile Product Engineering at Anywhere Real Estate