You need a system that lets you keep track of all the changes made to the code and allows you to travel through these changes – a Version Control System.
There are many version control systems available, such as Apache SVN, Mercurial, Git, Bazaar and CVS; settling on one makes maintaining your projects far less of a hassle. Traditional systems such as SVN and CVS are centralized, i.e. your code lives in a centrally located repository and the entire history is maintained there. If the central repository crashes, any changes made since the latest backup are lost, which may mean weeks’ worth of effort. This is one reason distributed VCSs are popular among developers.
Advantages of Distributed Version Control Systems:
A distributed VCS follows a peer-to-peer strategy, where each peer has a complete working copy of the project. This protects against data loss, as each copy is effectively a backup with local changes, and it avoids a single point of failure. Operations like committing, branching, history tracking and reverting changes are available locally; an internet connection is required only when pushing data to or pulling data from remote servers. This is a major advantage over a centralized VCS, where all of these operations depend on your network connectivity with the central server. Non-linear development is also easy, thanks to branching.
Git and its awesomeness:
Git is one such distributed Version Control System. It was created by Linus Torvalds, primarily to maintain the Linux kernel, and is now also used by well-known projects such as Ruby, Rails and jQuery.
Git is a particularly popular distributed VCS among developers because it is very fast and has a lower memory footprint.*
As a version control system, Git lets you:
- Take snapshots of your projects (git commit).
- View the complete history of files (git log).
- Compare any two versions of files (git diff).
- Create different branches of your repository (git branch).
- Merge branches to combine and collaborate (git merge).
- Access any version of any file, anytime! (git checkout).
- Efficiently switch between branches (git checkout my_branch).
- Create remotes to store your repository online (git remote).
- Manage your changes before committing them by using a staging area.
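A few of the commands listed above can be tried out in a throwaway repository. The following is only a minimal sketch: the temporary directory, the file name notes.txt and the "Demo User" identity are placeholders invented for illustration, not part of any real project.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory so nothing outside it is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Pin the initial branch name to "master" to match this article;
# Git versions that do not know this setting simply ignore it.
git -c init.defaultBranch=master init -q .
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"  # placeholder identity

# Stage a new file (git add), then take the first snapshot (git commit).
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Change the file, stage the change, and take a second snapshot.
echo "second line" >> notes.txt
git add notes.txt
git commit -q -m "Extend notes"

# View the history of the file, newest commit first.
git log --oneline -- notes.txt

# Compare the two committed versions of the file.
git diff HEAD~1 HEAD -- notes.txt

# Bring back the file as it was one commit ago.
git checkout -q HEAD~1 -- notes.txt
cat notes.txt
```

Everything here runs locally; no remote server is involved until you decide to push or pull.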
Git in a collaborative environment:
More often than not, product development involves a lot of parallel work, which is where branching comes into play. Whenever a new feature/story needs to be developed, all you have to do is create a new branch and start working on it, so that your changes do not affect your main branch/code base. Once the feature is complete you can just merge this new branch into master (or send a pull request to your reviewer, as per your workflow). So how does this help when multiple people are working on a single project? We could probably keep our main repository online and have each developer update it as and when required. But that's not an elegant solution, is it?
There are various online Git repository management systems available which provide some great statistics and tools for tracking projects. Once we have selected a service, we need to set up a workflow strategy for contributing to the project. Git is so flexible that it can handle most of the workflows that a company adopts.
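The feature-branch cycle described above might look like this on the command line. This is a sketch under assumptions: the branch name feature/login, the file names and the identity are hypothetical, and --no-ff is just one possible choice that records an explicit merge commit.

```shell
#!/bin/sh
set -eu

repo_dir=$(mktemp -d)
cd "$repo_dir"

# "master" is pinned as the initial branch name to match this article.
git -c init.defaultBranch=master init -q .
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"  # placeholder identity

# The main code base starts with a single commit on master.
echo "stable code" > app.txt
git add app.txt
git commit -q -m "Initial commit on master"

# Start the new feature on its own branch (hypothetical name).
git checkout -q -b feature/login
echo "login form" > login.txt
git add login.txt
git commit -q -m "Add login form"

# Back on master, the feature's file does not exist yet:
# work on the branch has not affected the main code base.
git checkout -q master
files_before_merge=$(ls)
echo "$files_before_merge"

# Once the feature is complete, merge it into master.
git merge -q --no-ff -m "Merge feature/login" feature/login

# Show the resulting history and clean up the merged branch.
git log --oneline --graph
git branch -d feature/login
```

In a team setting the merge step is often replaced by a pull request on the hosting service, which is a feature of that service rather than a Git command.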
Git at Sell.Do:
At Sell.Do we use a simple strategy that works wonders for code tracking. We have a master branch which is deployable at all times and a staging branch which is production-ready with new changes. Each developer has their own fork (a complete copy of the repository) for development. A developer always works on their own fork, creating a new branch from staging for every new module or feature. The staging branch is always up-to-date and hence the developer always starts from updated content. Once the feature is completed, the branch is merged into the local staging branch and a pull-request is sent to the main repository’s staging branch (note: pull-requests are accepted for the staging branch only). We have a staging environment which emulates production and is used for QA. Once QA is complete, the staging branch is merged into master, which is then ready for production.
This strategy has many advantages: since new code is added only via pull-requests, any bug introduced can be traced back to its original commit. When multiple members are working on the same module/feature they work on their own forks, pulling code from the main repository periodically to keep their code up-to-date. This also helps a lot during code reviews, as the reviewers can pinpoint exactly where changes were made.
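The fork-and-staging workflow above can be simulated entirely on one machine. The sketch below is an illustration, not Sell.Do's actual setup: local bare repositories stand in for the hosted main repository and the developer's fork, all names (main.git, fork.git, feature/report, the identities) are hypothetical, and the pull request itself is replaced by a plain fetch because pull requests belong to the hosting service, not to Git.

```shell
#!/bin/sh
set -eu

work_dir=$(mktemp -d)
cd "$work_dir"

# Main repository: a local bare repo stands in for the hosted one.
git -c init.defaultBranch=master init -q --bare main.git

# Seed it with a master branch and a staging branch.
git -c init.defaultBranch=master init -q maintainer
cd maintainer
git config user.name "Maintainer"               # placeholder identity
git config user.email "maintainer@example.com"  # placeholder identity
echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"
git branch staging
git remote add origin "$work_dir/main.git"
git push -q origin master staging
cd "$work_dir"

# Developer's fork: a bare clone stands in for the hosting service's fork.
git clone -q --bare main.git fork.git

# Developer's working copy of the fork, with the main repo as "upstream".
git clone -q fork.git dev
cd dev
git config user.name "Developer"          # placeholder identity
git config user.email "dev@example.com"   # placeholder identity
git remote add upstream "$work_dir/main.git"
git fetch -q upstream

# New feature branch cut from the up-to-date staging branch.
git checkout -q --no-track -b feature/report upstream/staging
echo "report module" > report.txt
git add report.txt
git commit -q -m "Add report module"

# Merge the finished feature into local staging and publish it to the fork.
git checkout -q --no-track -b staging upstream/staging
git merge -q feature/report
git push -q origin staging

# A pull request from the fork's staging to the main staging branch would be
# opened on the hosting service here. This fetch stands in for accepting it.
git --git-dir="$work_dir/main.git" fetch -q "$work_dir/fork.git" staging:staging

# After QA on staging, the maintainer merges staging into master.
cd "$work_dir/maintainer"
git fetch -q origin
git merge -q origin/staging
git push -q origin master
```

Because the feature reaches the main repository only through its staging branch, every change on master can be traced back through staging to the developer's original commit.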
The dark side of Git:
We cannot simply check out a single subdirectory of a Git repository; the complete repository has to be cloned even if we only need one subdirectory. (Git's sparse checkout feature can limit which directories appear in the working tree, but by default the whole repository is still cloned.) This may be problematic when dealing with large projects. Access control is not directory-specific either, so the project needs to be split into small repositories according to requirement and then cloned into a parent repository.
Git has a bit of a learning curve for beginners. There are lots of commands at your disposal, and deciding which command to use where can sometimes be confusing (especially with the dual-natured git checkout, which is used both for switching branches and for undoing uncommitted changes).
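The two faces of git checkout mentioned above are easier to tell apart side by side. In this small sketch the file name config.txt and the identity are placeholders, and my_branch is the same illustrative branch name used earlier in this article.

```shell
#!/bin/sh
set -eu

scratch_dir=$(mktemp -d)
cd "$scratch_dir"

git -c init.defaultBranch=master init -q .
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"  # placeholder identity

echo "original" > config.txt
git add config.txt
git commit -q -m "Add config"
git branch my_branch

# Use 1: the argument is a branch name, so checkout switches branches.
git checkout -q my_branch
git rev-parse --abbrev-ref HEAD   # prints the name of the current branch

# Use 2: the argument after "--" is a path, so checkout discards the
# uncommitted changes in that file and restores the committed version.
echo "accidental edit" >> config.txt
git checkout -q -- config.txt
cat config.txt
```

Git 2.23 and later also provide git switch and git restore, which split these two roles into separate commands; the sketch sticks to git checkout because that is the command discussed here.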
The Git community is very active and there are lots of tutorials and services available.
Here are some of my personal picks that may help deepen your Git knowledge.
*Graphs source: http://git-scm.com/about/small-and-fast
Credits: Kiran Narasareddy and Vaidehi Mirashi