As web developers, a lot of the time we tend to work on local development sites then just upload everything when we’re done. This is fine when it’s just you and the changes are small, but when you’re dealing with more than one person working on something, or on a large project with lots of complicated components, that’s simply not feasible. That’s when we turn to something called version control.
Today I’ll be talking about an open source version control system called Git. It allows more than one person to safely work on the same project without interfering with each other, but it’s so much more than that too.
Why Use Version Control Software?
First and foremost, the name should give it away. Version control software allows you to have “versions” of a project, which show the changes that were made to the code over time, and allows you to backtrack if necessary and undo those changes. This ability alone, to compare two versions or reverse changes, makes it invaluable when working on larger projects.
You’ve probably even done this yourself at some point, saving out copies of a project at different points so you have a backup. In a version control system, just the changes would be saved – a patch file that could be applied to one version, in order to make it the same as the next version. With one developer, this is sufficient.
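To make the “just the changes” idea concrete, here is a minimal sketch in Python. It is not how any particular version control system stores its patches; the helper names make_patch and apply_patch and the sample file contents are made up for illustration. It only shows that the differences between two versions are enough to turn one into the other.

```python
from difflib import SequenceMatcher


def make_patch(old, new):
    """Record only the regions that differ between two lists of lines."""
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (i1, i2, new[j1:j2])  # replace old[i1:i2] with these new lines
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"     # unchanged regions are not stored at all
    ]


def apply_patch(old, patch):
    """Rebuild the newer version from the older one plus the patch."""
    result, cursor = [], 0
    for start, end, replacement in patch:
        result.extend(old[cursor:start])  # copy the unchanged lines
        result.extend(replacement)        # add the changed lines
        cursor = end                      # skip the lines that were replaced
    result.extend(old[cursor:])
    return result


# Two hypothetical versions of the same file.
version_1 = ["<h1>My Site</h1>", "<p>Welcome!</p>"]
version_2 = ["<h1>My Site</h1>", "<p>Welcome back!</p>", "<p>New post below.</p>"]

patch = make_patch(version_1, version_2)
print(patch)
print(apply_patch(version_1, patch) == version_2)
```

The unchanged heading line never appears in the patch, which is why storing patches takes far less space than storing full copies of every version.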
But what if you have more than one developer working on a project? That’s when the idea of a centralised version control server comes in. These have been the standard for a long time, whereby all versions are stored on a central server, and individual developers check out files and upload their changes back to this server. If you’ve ever looked at the edit history of a Wikipedia page, you’ll have a good idea of how this works in a real-world scenario.
The benefit of a system like this is that multiple developers can make changes, and each change can then be attributed to a specific developer. On the downside, the fact that everything is stored in a remote database means no changes can be committed while that server is down, and if the central database is lost, each client only has the current version of whatever they were working on.
That takes us on to Git, and other so-called distributed version control systems. In these systems, clients don’t just check out the current version of the files and work from them – they mirror the entire version history. Each developer always has a complete copy of everything. A central server is still used, but should the worst happen, then everything can still be restored from any of the clients who have the latest versions.
Git specifically works by taking “snapshots” of files; if files remain unchanged in a particular version, it simply links to the previous files – this keeps everything fast and lean.
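The snapshot idea can be sketched in a few lines of Python. This is a toy model, not Git’s actual storage format (which is more involved); the names object_store, store and snapshot are invented for the example. Each snapshot maps filenames to a hash of their contents, so an unchanged file simply points at content that is already stored.

```python
import hashlib

# Content hash -> file contents, shared by every snapshot.
object_store = {}


def store(content: bytes) -> str:
    """Save content under its hash; identical content is stored only once."""
    key = hashlib.sha1(content).hexdigest()
    object_store.setdefault(key, content)
    return key


def snapshot(files: dict) -> dict:
    """Map each filename to the hash of its current contents."""
    return {name: store(content) for name, content in files.items()}


# Two hypothetical versions of a project: only index.html changes.
v1 = snapshot({
    "index.html": b"<h1>Hello</h1>",
    "style.css": b"body { margin: 0; }",
})
v2 = snapshot({
    "index.html": b"<h1>Hello, world</h1>",
    "style.css": b"body { margin: 0; }",
})

print(v1["style.css"] == v2["style.css"])    # True: both snapshots link to the same stored file
print(v1["index.html"] == v2["index.html"])  # False: the changed file gets a new entry
print(len(object_store))                     # 3 stored objects for 4 file entries
```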
It might also interest you to learn that Git is used to manage and develop the core Linux kernel, the base building block upon which all Linux distros are built.
Although you can run your own Git server locally, GitHub is a remote server, a community of developers, and a graphical web interface for managing your Git project, all in one. It’s free to use for up to 5 public repositories (that is, repositories where anyone can view or fork your code), with low cost plans for private projects. I strongly suggest you go sign up for a free account so you can start playing around with your own projects or forking someone else’s.
Forking & Branching
These are core concepts to the Git experience, so let’s take a moment to explain the difference.
You’ve probably heard the word “fork” when dealing with Linux distros. If you’re familiar with the media center app Plex, you’ll know it was originally a fork of the similar open source Xbox Media Center (XBMC). This simply means that at some point in the past, some developers took the code of XBMC and decided to go their own way with it; that became Plex.
This is of course totally allowed when the project is open source: you can take the code and do whatever you want with it. With Git, if you feel your changes are good enough to be rolled back into the “master” project, you can make a “pull request” to the author, asking them to pull your changes back into their original project. This allows you to have hundreds of thousands of developers working on a project at any point, none of whom must necessarily be approved for code access. They just copy the code, make changes, and request that those changes be rolled back into the master. Of course, it’s up to the owner of the original project whether to accept your changes or not.
Branching is something done internally on a project by the authorized developers. It allows you to easily separate specific issues or features, and work on them without breaking the master files. Once you’re satisfied that your branch has dealt with the issue, you merge it back into the master. At any point, there can be as many branches as you like; they don’t interfere with each other. You can also merge changes between branches without touching the master.
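One way to picture branching is as a set of named pointers into a shared history of commits. The Python sketch below is a toy model built on that assumption, not Git’s real implementation; ToyRepo, Commit and the branch names are all hypothetical. It shows two branches moving forward independently, a merge into master, and a merge between branches that leaves master alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    message: str
    parents: tuple = ()  # one parent normally, two for a merge


class ToyRepo:
    def __init__(self):
        # Every branch name is just a pointer to its latest commit.
        self.branches = {"master": Commit("initial commit")}

    def branch(self, name, start="master"):
        """Create a new branch pointing at the same commit as `start`."""
        self.branches[name] = self.branches[start]

    def commit(self, branch, message):
        """Add a commit on top of a branch and move its pointer forward."""
        self.branches[branch] = Commit(message, (self.branches[branch],))

    def merge(self, source, target):
        """Join two histories with a commit that has both tips as parents."""
        self.branches[target] = Commit(
            f"merge {source} into {target}",
            (self.branches[target], self.branches[source]),
        )

    def history(self, branch):
        """Collect every commit message reachable from a branch tip."""
        messages, stack = set(), [self.branches[branch]]
        while stack:
            commit = stack.pop()
            messages.add(commit.message)
            stack.extend(commit.parents)
        return messages


repo = ToyRepo()
repo.branch("fix-login")
repo.branch("new-gallery")
repo.commit("fix-login", "fix login redirect")
repo.commit("new-gallery", "add gallery page")

# Work on the branches has not touched master.
print(repo.history("master") == {"initial commit"})

# Merging one branch into master brings in only that branch's work.
repo.merge("fix-login", "master")
print("fix login redirect" in repo.history("master"))
print("add gallery page" in repo.history("master"))

# Branches can also be merged into each other without touching master.
repo.merge("fix-login", "new-gallery")
print("fix login redirect" in repo.history("new-gallery"))
```

Because a branch here is only a pointer, creating one costs almost nothing, which is part of why having many branches at once is practical.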
Here’s a great diagram of an example workflow by Vincent Driessen:
Next time, we’ll look at how to set up a working Git example and make code changes within branches. Version control is a huge topic. I’ve only given the briefest overview here, but as a developer who is used to just making changes and undoing them if they don’t work, the whole concept has blown my mind – I hope it does yours too.
Are you a seasoned developer with experience in Git? Are you just getting started and think you’d like to have a go? Sound off in the comments!