Git 102 – Branching out

So, you’ve just created a local Git repository to keep track of your work and you’ve even pushed your commits onto a remote on GitHub called origin. That’s neat and all, but how is this actually useful for you and your collaborators?

In my last post, I mentioned that Git is extremely powerful when multiple developers are working on the same project. Remember, when you first initialize a Git repository, the software puts you on the master branch. You can think of this as the main source code, which will most likely evolve into your final product. You probably don’t want a dozen developers with differing levels of experience (noobs) all touching that code. What to do, what to do?


Go forth and branch out

Topic branches allow multiple contributors to work on code that already exists in the master branch. In the same way that you can push code into your remote origin, you can also take code from other remotes. So, let’s say you want to help your friend build his company’s website. You can pull code from his master branch and work on it in a topic branch (a.k.a. feature branch) without tinkering with the original code. Let’s get some visuals to help us understand this concept.

[Figure: git102_branching]

In the figure above, each line represents a branch and each circle represents a commit. As you can see, topic branches allow you to make commits that are totally separate from the master branch. This means that you and your friend can work on the same website simultaneously without your changes interfering with his work. Now, you might be thinking, eventually you have to merge your code back with the master branch, right? To that I say…

[Image: patience-young-grasshopper]

First, let’s learn to manage branching using GitHub and our command line.


Forking in GitHub

Now, this next process I’m going to explain isn’t an inherent part of managing topic branches. However, it is a powerful tool for managing large projects and, more importantly, it will strengthen your understanding of Git.

[Image: timsjpark_git_practice]

If you go to any repository on GitHub, there will be an option to ‘Fork.’ All this does is COPY the repository to your own account. If you click it in a repository that you own, you’ll notice that your own account is listed but greyed out, because you can’t fork a repository into the account that already owns it. So, go ahead and try to fork my git_practice repository to your account.

[Image: timsjpark_rails]

Once GitHub is done forking, it will automatically place you into the new repository in your account. There are two very important things to note.

(1) This is just a COPY of the original repository.

(2) Forking is a process that occurs only on GitHub. So far, no changes have been made on your local computer, so the git_practice repository that you just forked exists only on your GitHub account.

In order to copy the GitHub repository you just forked onto your local computer, you’ll have to clone it. Go to the right sidebar on GitHub and click the clipboard icon to copy the SSH URL. Then cd into the directory where you want git_practice to live (your home directory works fine) and type this:

$ git clone URL
Cloning into 'git_practice'...
...some output here...
Checking connectivity... done.
$ cd git_practice
$ git push -u origin master
Branch master set up to track remote branch master from origin.
Everything up-to-date

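Cloning a remote repository just makes a copy of it on your local computer and names the remote you cloned from origin. The clone already sets up your local master branch to track origin’s master, so the push with the -u option isn’t strictly necessary. We run it just to confirm that our local repository is tracking origin, and the ‘Everything up-to-date’ message tells us the two copies match.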
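
If you’d like to check the tracking for yourself without touching GitHub, here’s a minimal sketch. It assumes a bare repository on your own disk (origin.git, a hypothetical stand-in for your fork) in a throwaway directory, and a made-up user name and email so the practice commit goes through. None of those names come from the real git_practice project.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory so nothing on your machine is touched.
sandbox=$(mktemp -d)
cd "$sandbox"

# Hypothetical stand-in for your GitHub fork: a bare repository on disk.
git init --quiet --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/master

# Seed it with one commit so there is a master branch to clone.
git init --quiet seed
cd seed
git symbolic-ref HEAD refs/heads/master
git config user.name "Practice User"          # made-up identity
git config user.email "practice@example.com"  # made-up identity
echo "git practice" > README
git add README
git commit --quiet -m "Initial commit"
git push --quiet "$sandbox/origin.git" master
cd "$sandbox"

# Clone it, just as you would clone the SSH URL from GitHub.
git clone --quiet origin.git git_practice
cd git_practice

# Ask Git which remote branch the local master branch tracks.
tracking=$(git rev-parse --abbrev-ref 'master@{upstream}')
echo "master tracks: $tracking"
```

If the last line reports origin/master, the clone set up the tracking on its own, before any push with -u.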

To summarize, we forked a git repository from another person’s GitHub account, which made a copy of that repository onto our GitHub account. Then we cloned our copy onto our local computer. So…what’s next?


Adding a new remote – ‘upstream’

Now that you’ve cloned your origin repository onto your local computer, it’s time to start adding some complexity into your branching scheme. Go ahead and type the following code. What you’ll notice is that you’re adding a new git remote, just like you did when you connected your local computer to origin. But hey, the new remote you’re adding is called ‘upstream.’ When you type git remote -v, what you’re seeing is all of the remote repositories that you can access from your command line.

$ git remote add upstream git@github.com:timsjpark/git_practice.git
$ git remote -v
origin	git@github.com:YOUR_USERNAME/git_practice.git (fetch)
origin	git@github.com:YOUR_USERNAME/git_practice.git (push)
upstream git@github.com:timsjpark/git_practice.git (fetch)
upstream git@github.com:timsjpark/git_practice.git (push)

In this specific case, if you’re following my instructions, your ‘git_practice’ repository is called origin and MY ‘git_practice’ repository is called upstream. So what’s the purpose of having a connection to two different remote repositories? Well, it’s relevant to your workflow when you help your imaginary friend build their website. NOTE: This isn’t the only way to manage a project, but it’s a good way.

By convention, origin is the remote you cloned from. Git gives it that name automatically when you run git clone, and in this workflow that’s your personal fork. Whenever you are working with another person’s project, it’s also conventional to name their remote repository (on GitHub) upstream. This helps distinguish between the original project repository (upstream) and your forked copy of it (origin).
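
Here’s a rough sketch of the whole fork, clone, and add-upstream dance that you can run offline. It’s only an approximation: two bare repositories on disk (upstream.git and origin.git, both hypothetical names) stand in for my GitHub repository and your fork, and a bare clone stands in for the Fork button. The git fetch at the end is my addition to show that your computer can really see both remotes.

```shell
#!/bin/sh
set -eu

sandbox=$(mktemp -d)
cd "$sandbox"

# Hypothetical stand-in for the original project on GitHub ("upstream").
git init --quiet --bare upstream.git
git --git-dir=upstream.git symbolic-ref HEAD refs/heads/master

# Give the original project one commit.
git init --quiet seed
cd seed
git symbolic-ref HEAD refs/heads/master
git config user.name "Practice User"          # made-up identity
git config user.email "practice@example.com"  # made-up identity
echo "git practice" > README
git add README
git commit --quiet -m "Initial commit"
git push --quiet "$sandbox/upstream.git" master
cd "$sandbox"

# "Fork" it. On GitHub this is the Fork button; here it is a bare copy.
git clone --quiet --bare upstream.git origin.git

# Clone your fork. Git names that remote origin automatically.
git clone --quiet origin.git git_practice
cd git_practice

# Add the original project as a second remote called upstream.
git remote add upstream "$sandbox/upstream.git"
git fetch --quiet upstream

# List the remotes and the remote branches your computer now knows about.
git remote -v
git branch -r

origin_tip=$(git rev-parse origin/master)
upstream_tip=$(git rev-parse upstream/master)
upstream_url=$(git config --get remote.upstream.url)
```

Since nobody has committed anything new yet, origin/master and upstream/master should point at the same commit.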


What about topic branches?

So far, you haven’t made any topic branches. Every repository (local, origin, and upstream) has only a master branch. That’s great, but there’s a problem. Remember that you don’t want to tinker around with any code in the master branch, because that’s the golden code that will eventually become a final product. So how do you start helping your imaginary friend build his website?

Topic branches bud off of master branches and allow you to work on small snippets of code. Usually, those snippets accomplish some feature. For example, if you want your website to have a shopping cart feature, you might build a shopping cart application in a topic branch. Simultaneously, your friend might be working on a user_login application in a different branch.

To create your own branch, type git checkout -b shopping_cart_master. This creates a new branch called ‘shopping_cart_master’ and checks it out in one step. It’s best practice to name your topic branch based on the topic/feature you’re working on and the branch that you built it off of. Since we’re branching off of the master branch, ‘master’ is appended at the end of our branch name. Use git branch to check what branch you’re on and git checkout BRANCH_NAME to move to another branch.

$ git checkout -b shopping_cart_master
Switched to a new branch 'shopping_cart_master'
$ git branch
  master
* shopping_cart_master
$ git checkout master
$ git branch
* master
  shopping_cart_master
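
If you want to double-check what git checkout -b actually did, here’s a small sketch you can run in a scratch repository. The sandbox directory and the user name and email are made up for the example. It compares the commit that each branch name points to right after branching.

```shell
#!/bin/sh
set -eu

# Scratch repository with a single commit on master.
sandbox=$(mktemp -d)
cd "$sandbox"
git init --quiet git_practice
cd git_practice
git symbolic-ref HEAD refs/heads/master
git config user.name "Practice User"          # made-up identity
git config user.email "practice@example.com"  # made-up identity
echo "git practice" > README
git add README
git commit --quiet -m "Initial commit"

# Create the topic branch and switch to it in one step.
git checkout --quiet -b shopping_cart_master

# Which branch are we on, and which commit does each branch point to?
current=$(git rev-parse --abbrev-ref HEAD)
master_tip=$(git rev-parse master)
topic_tip=$(git rev-parse shopping_cart_master)

echo "current branch: $current"
echo "master:               $master_tip"
echo "shopping_cart_master: $topic_tip"
```

Both hashes should match, because a brand-new branch starts out pointing at the same commit as the branch it came from. They only drift apart once you commit on one of them.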

Branches =/= directories

Here is an IMPORTANT concept to understand before you move on.

BRANCHES ARE NOT DIRECTORIES!

Branches are references to commits. Branches are data structures created by Git to allow you to work on different versions of code in ONE directory. Confused? Let’s take a look at a simple example to illustrate this point. Check which branch you’re in and then switch to the shopping_cart_master branch we just made. Now follow along with the code below.

$ git branch
* master
  shopping_cart_master

## Switch to the shopping_cart_master branch
$ git checkout shopping_cart_master
$ git branch
  master
* shopping_cart_master
$ ls
README
$ echo 'Does this file exist in the master branch?' > file.txt
$ git add file.txt
$ git commit -m 'Testing differences in branch structure'
$ cat file.txt
Does this file exist in the master branch?
$ ls
README  file.txt

## Now we'll switch to the master branch and check for file.txt

$ git checkout master
$ git branch
* master
  shopping_cart_master
$ ls
README

Would you look at that? Even though you NEVER changed directories, there’s a file that exists in shopping_cart_master, but NOT master. It’s not magic. What’s happening is that Git keeps track of which files belong to the commits on each branch. When you switch branches, Git updates the files in your directory to match the latest commit on the branch you switched to. Since file.txt was committed only on shopping_cart_master, Git takes it out of your directory when you check out master. The file isn’t gone; it’s safely stored in the topic branch and reappears when you switch back.

THAT’S what enables you to fiddle with code. THAT’S the power of branching.
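
To convince yourself that the file is stored with the branch rather than deleted, here’s a hedged sketch that repeats the experiment in a scratch repository (the sandbox directory and the user name and email are made up). It then asks Git what each branch contains, without switching branches again.

```shell
#!/bin/sh
set -eu

# Scratch repository with a single commit on master.
sandbox=$(mktemp -d)
cd "$sandbox"
git init --quiet git_practice
cd git_practice
git symbolic-ref HEAD refs/heads/master
git config user.name "Practice User"          # made-up identity
git config user.email "practice@example.com"  # made-up identity
echo "git practice" > README
git add README
git commit --quiet -m "Initial commit"

# Commit a new file on the topic branch only, then return to master.
git checkout --quiet -b shopping_cart_master
echo 'Does this file exist in the master branch?' > file.txt
git add file.txt
git commit --quiet -m 'Testing differences in branch structure'
git checkout --quiet master

# List the files recorded in the latest commit of each branch.
master_files=$(git ls-tree --name-only master)
topic_files=$(git ls-tree --name-only shopping_cart_master)

# Read the topic branch's copy of file.txt while still on master.
saved_text=$(git show shopping_cart_master:file.txt)

echo "master has:               $master_files"
echo "shopping_cart_master has: $topic_files"
echo "file.txt on topic branch: $saved_text"
```

The working directory on master has no file.txt, yet git show can still print its contents from shopping_cart_master.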


Breathe Deep…and exhale.

If you have followed all of this and you’re not confounded, then I’m really impressed. It takes some time to absorb all of this material so go ahead and walk away from this for a little while. Try to let it sink in. And then come back and see if you can read this again and have it make more sense.

Go through the practice code. Try and fork my git_practice repository and mess around with it. If things get hairy, just delete your local git_practice repository and delete the repository you forked on GitHub. Try again. Don’t give up.

Next time, I’ll explain how to manage these new branches you’re working on and, eventually, how to get your topic branches to merge back with your master code.

It’s pretty exciting…


Credit for branching images goes to Jason Noble, from DaVinci Coders
