Git – Merge vs. Rebase

Version control systems are an essential tool in software development, and branches make it easier to manage code and organize projects, so being able to keep track of feature branches goes a long way toward moving projects along efficiently. I’ll outline what I’ve learned to be good practices for managing code in your local repository as well as on your remote(s). Then I’ll cover the basics of merging vs. rebasing and how both methods can be used to re-integrate feature branches into master.


Managing Branches

Rule #1 – Preserve the master branch

Your master branch should contain all of the latest and greatest code that is worthy of your final product. Ideally, all of the work you do should be completed in feature branches and then merged back into master when it’s been tested and approved. If you’re having conflicts when you pull or push on your master branch, something may have gone terribly wrong, and it’s definitely worth taking the time to figure out what happened and fix the problem.

Rule #2 – Be clear

Visualize a desktop screen on a computer that is littered with over 100 files. You decide you want to organize them so you group them into directories, each with a unique name. But you get careless and name one directory ‘important’, another ‘essential’, and yet another ‘files to keep’. What’s the difference between these directories? Can you tell me their contents by just glancing at them?

Branches are ways to work on groups of files that are all related to one topic. If you’re creating a way for people to purchase things on your website, maybe you’ll call the branch ‘purchase’. Or maybe you can be even more specific and have a ‘shopping_cart’ branch as well as a ‘credit_card_authentication’ branch. The point is that being specific is important to maintaining your sanity while working on multiple projects at once. Name your branch something so specific that you can identify its purpose and contents with a quick glance.

Rule #3 – Be consistent

This rule goes hand in hand with being clear about your work. In short, try to follow conventions for naming branches locally as well as remotely so that you don’t confuse yourself. Specifically, any major remote project repository should be called upstream and your own forked copy on GitHub should be called origin. In addition, if you have a project directory called ‘shopping_cart’, your local and remote branches should also be called ‘shopping_cart.’
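Here’s a minimal sketch of that remote naming convention. To keep it runnable without network access, two local bare repositories stand in for the main project and your GitHub fork; the paths and repository names are made up for illustration, and in real life you would use the actual URLs.

```shell
# Hypothetical setup: two local bare repositories play the roles of the
# main project and your fork, so nothing here touches the network.
work=$(mktemp -d)
git init -q --bare "$work/main_project.git"
git init -q --bare "$work/my_fork.git"

# Your local working repository.
git init -q "$work/shopping_cart"
cd "$work/shopping_cart"

# Follow the convention: the main project is "upstream", your fork is "origin".
git remote add upstream "$work/main_project.git"
git remote add origin "$work/my_fork.git"

# List the remotes and their URLs to confirm the names.
git remote -v
```

If you cloned your fork to begin with, Git has already named it origin for you, so you would only need to add upstream.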

Rule #4 – Commit often and commit well

If you’re working on a feature branch, you could theoretically have one commit for that branch. Just save your work as you go and then make one commit when you’re done coding. WRONG! Commits keep track of the work you’ve done and create a history that reflects the actual changes you made to a file. Good commit messages are important because they summarize what changes you made and, more importantly, why you made them. Do you press the save key incessantly while writing Word documents? Commits are like the save key…it’s hard to overuse them.


How to integrate into master

There are two main methods for integrating code into the master branch, and the first is git merge. Anytime you need to bring code into a branch (even if it’s not master), you can use the merge command to take the commits from one branch and merge them into your target branch. The second method, git rebase, adds both power and complexity to the process. I’ll go into detail about rebasing in a bit, but basically it’s a way of updating a feature branch so that it builds on the most recent changes made on master. It’s sort of like updating your phone: your phone’s contents are still unique to you, but it’s running on the most recent software updates.

As with many things in life, people are often split on whether merge or rebase is the best way to re-integrate code into the master branch. It ultimately boils down to personal preference, as both methods are legitimate ways of managing branches and merging into master. I’ll go over the details of what each method does and then you can decide which you prefer.


Merge

Merging is a fairly simple process by which you can integrate code from a topic branch into the master branch. Let’s imagine that you branch off of master to do some work on a small code fix. You make just two commits and are ready to bring those changes into master. In the meantime, someone else made another commit that got integrated into the master branch. This isn’t necessarily a problem because Git will figure out how to merge the two branches together in the best way possible.

Essentially, Git takes a look at both branches and finds the commit that’s a common ancestor to both branches. In the image below, that’s the first commit represented by a blue circle. It then compares all the commits made on master and on the topic branch, decides if they can be combined without conflicts, and then creates a new merge commit.

Merge commit

This last step is important. When you merge two branches that have diverged, as they have here, Git creates a special commit called a merge commit. Unlike ordinary commits, which have one ‘parent’, a merge commit has two parents. It points at the latest commit on each of the branches it merged together so that you can trace back into both lines of history that resulted in the merge.
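To see this for yourself, here’s a small sketch that recreates the scenario in a throwaway repository: two commits on a topic branch, one new commit on master, and then a merge. The branch name small_fix, the file names, and the user identity are all made up for the example.

```shell
# Throwaway repository; every name below is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example Dev"
git config user.email "dev@example.com"

# The commit that will become the common ancestor.
echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

# Two commits on a topic branch.
git checkout -q -b small_fix
echo "fix, part 1" > fix.txt
git add fix.txt
git commit -q -m "Start the fix"
echo "fix, part 2" >> fix.txt
git commit -q -am "Finish the fix"

# Meanwhile, someone else's commit lands on master.
git checkout -q master
echo "other work" > other.txt
git add other.txt
git commit -q -m "Someone else's change"

# The branches have diverged, so this merge creates a merge commit.
git merge -q --no-edit small_fix

# Draw the history, then print the merge commit and its two parents.
git log --graph --oneline
git log -1 --format="merge commit %h has parents: %p"
```

The graph should show the two lines of history joining at the top, and the last command should list two abbreviated parent hashes for the merge commit.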

Sometimes, you might find that you modified the same file in two branches. If there are any conflicts because that file exists as two different versions, you will have to resolve them manually. Git’s mergetool command, which opens a merge tool for each conflicted file, is a good place to start for those who are new to resolving merge conflicts. Basically, you will have to choose which version of the file you want to keep, line by line.

For example, let’s say you have a file called ‘foo.txt’ in both branches. In the master branch, line 1 says ‘hello’ and in your feature branch, it says ‘hi’. This is a conflict that Git cannot resolve, so you have to use mergetool and decide: do I want line 1 to say ‘hello’, ‘hi’, or both?! If you choose both lines, line 1 will stay as ‘hello’ and your line, ‘hi’, will be inserted as line 2. The important thing to remember is that you probably DON’T ever want to replace the version on master with the version on your feature branch. If it exists on the master branch, it’s probably not code that you want to be changing.
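Here’s a rough sketch of that foo.txt conflict in a throwaway repository. Instead of opening mergetool (which depends on what you have installed), it resolves the conflict from the command line by keeping master’s version. The branch name feature, the starting contents of the file, and the user identity are assumptions made for the example.

```shell
# Throwaway repository; every name below is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Both branches start from the same version of foo.txt.
echo "greeting goes here" > foo.txt
git add foo.txt
git commit -q -m "Add foo.txt"

# On the feature branch, line 1 becomes "hi".
git checkout -q -b feature
echo "hi" > foo.txt
git commit -q -am "Say hi"

# On master, line 1 becomes "hello".
git checkout -q master
echo "hello" > foo.txt
git commit -q -am "Say hello"

# Git cannot pick a winner, so the merge stops with a conflict.
git merge feature || echo "Merge stopped: foo.txt needs a manual fix"

# The file now holds both versions, separated by conflict markers.
cat foo.txt

# Keep the version from master ("ours"), then conclude the merge.
git checkout --ours foo.txt
git add foo.txt
git commit -q --no-edit
```

While the merge is stopped, foo.txt contains both ‘hello’ and ‘hi’ between conflict markers. Taking the ‘ours’ side matches the advice above about leaving master’s version alone.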


Rebase

Rebasing is a powerful Git tool that can modify the history of commits made on a branch. Now, you might be thinking that Git is supposed to preserve history, that the whole point of using this amazing VCS was so that you could remember exactly all of the changes that were made at some point in time. Well, you’re right. Rebasing won’t ERASE your work and it won’t alter the changes that your commits introduce; it replays those commits on top of a new starting point. What we’re talking about is rebasing (updating) your topic branch to reflect the most recent changes in master.

Rebase 1

If you make a topic branch and do some work, you could merge those changes back into master with a merge commit. But maybe you want to make sure that your changes don’t conflict with any commits on master BEFORE you merge. You know…because your colleagues might appreciate it if you fix all merge conflicts before they try to merge your code into the master branch? At this point, you would run rebase on the topic branch and it changes its history. Look at the image below. Notice that it rebases the topic branch so that its ‘base’ is built off the last commit on master.

Rebase 2

This is extremely useful because it lets you see how your changes in the topic branch stack up against the changes made on master. Think about this for a second. Remember that when you merge a branch into master, Git will notify you if you have any merge conflicts. Rebase will do the exact same thing, but right when you rebase. So basically, merge and rebase will both help you get commits onto the master branch, but the difference lies in when you get notified of merge conflicts.

Merge warns you of conflicts while you are trying to merge. Rebase makes you address those conflicts right when you rebase so you can clean them up BEFORE you merge.
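A minimal sketch of a rebase, assuming a throwaway repository with a branch called topic (the file names and user identity are also made up). It prints the commit where topic and master meet before and after the rebase, so you can watch the ‘base’ move.

```shell
# Throwaway repository; every name below is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

# Two commits on the topic branch.
git checkout -q -b topic
echo "topic work" > topic.txt
git add topic.txt
git commit -q -m "Topic commit 1"
echo "more topic work" >> topic.txt
git commit -q -am "Topic commit 2"

# Meanwhile, master moves ahead by one commit.
git checkout -q master
echo "master work" > master.txt
git add master.txt
git commit -q -m "New commit on master"

# Before: topic is based on the initial commit, not on the tip of master.
echo "base before rebase: $(git merge-base topic master)"

# Replay the topic commits on top of the latest master commit.
git checkout -q topic
git rebase -q master

# After: the base of topic is the tip of master.
echo "base after rebase:  $(git merge-base topic master)"
echo "tip of master:      $(git rev-parse master)"
git log --graph --oneline
```

There are no conflicts in this example because the two branches touch different files. If they had edited the same lines, the rebase would have paused and asked you to resolve the conflict before continuing.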

Once you merge those topic branch commits, your master branch will end up looking like a straight line, as if the topic branch commits were just pasted in front of the master commits. This is because, after the rebase, the topic branch already contains every commit on master. If your topic branch just has two extra commits compared to your master branch, all Git does is add those to master (Git calls this a fast-forward).

Rebase 3
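The straight-line result can be sketched like this: rebase a topic branch onto master, then merge it. Because master’s latest commit is already part of the rebased topic branch, Git can simply move master forward (a fast-forward) without creating a merge commit. As before, the repository, branch name, and files are invented for the example.

```shell
# Throwaway repository; every name below is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

# Two commits on the topic branch.
git checkout -q -b topic
echo "topic work" > topic.txt
git add topic.txt
git commit -q -m "Topic commit 1"
echo "more topic work" >> topic.txt
git commit -q -am "Topic commit 2"

# One new commit on master in the meantime.
git checkout -q master
echo "master work" > master.txt
git add master.txt
git commit -q -m "New commit on master"

# Rebase topic onto the latest master.
git checkout -q topic
git rebase -q master

# Merge into master. --ff-only makes Git refuse to do anything
# other than move master forward, so no merge commit can appear.
git checkout -q master
git merge -q --ff-only topic

# The history is a single straight line with no merge commits.
git log --graph --oneline
echo "merge commits on master: $(git rev-list --merges --count master)"
```

If you would rather keep a visible record that a branch was merged, git merge --no-ff topic creates a merge commit even when a fast-forward is possible.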

This is different from a normal merge, where Git notices that each branch has commits the other lacks, so it creates that special merge commit. After a rebase, you still use git merge to bring the topic branch into master, but by default Git just moves master forward without creating a merge commit. The history of your commits will look cleaner, as rebasing simply builds your branch commits on top of the most recent master commits.


An important note about rebasing

Rebasing is extremely useful when you’re managing a large project with a lot of topic branches. It allows developers to compare their code to the recent changes being merged into the master branch so that they can check that their code won’t cause merge conflicts. But there’s a situation when you should NEVER rebase.

If you rebase your topic branch, it will change all of the SHA hashes for the commits on that branch. Git does this automatically because you’re effectively telling Git to change the base of your branch from an old commit to the most recent commit on master. Since the ‘history’ of the branch is being changed, the commit SHAs change as well.
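You can check this with a quick experiment: record the hash at the tip of a branch, rebase, and compare. This is a sketch in a throwaway repository; the branch name topic, the file names, and the user identity are made up.

```shell
# Throwaway repository; every name below is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

# One commit on the topic branch.
git checkout -q -b topic
echo "topic work" > topic.txt
git add topic.txt
git commit -q -m "Topic commit"

# One new commit on master.
git checkout -q master
echo "master work" > master.txt
git add master.txt
git commit -q -m "New commit on master"

# Record the topic commit's hash, rebase, and record it again.
git checkout -q topic
before=$(git rev-parse topic)
git rebase -q master
after=$(git rev-parse topic)

echo "topic before rebase: $before"
echo "topic after rebase:  $after"

# The old commit has not been erased; the branch just no longer points at it.
git cat-file -t "$before"
```

The two hashes differ even though the commit message and the change to topic.txt are the same, because a commit’s hash also depends on its parent.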

So if you ever push your commits onto a public repository, one that other developers are actively pulling from, you should NEVER rebase that branch. Think about it. If your friend pulls all of your commits from a branch and then you rebase that branch, you’ve just changed ALL of the commit SHA hashes for that branch. And then next time they pull from that public repository, Git will tell them that they have some serious problems because the commits on the public repository don’t match the commits on their local one…uh oh. Hopefully, you guys catch the problem and can remedy it, but wouldn’t you just like to avoid the problem altogether?


If you want to read more about Git, check out the Git documentation online. It’s an AMAZING resource, the inspiration for some of my images in this article, and just plain fun reading…if you’re a Git nerd.
