01. Introduction to Git: History of Version Control

Welcome to our Git tutorial! In this series, we'll go from the basic commands of Git to more advanced topics. Before we dive into Git, let's talk a bit about its history. If you already have Git installed and are ready to learn, you may jump directly into Git Fundamentals.

Why Version Control?

When working on any project that involves revisions, it's imperative that you keep track of the changes you make. As a programmer, you'll need to pinpoint where a bug might have been introduced by sifting through each update checkpoint. As a data scientist, you'll need to keep track of each dataset and analysis script used to obtain your results.

But how can we handle version control? Here are just some of the naive ways we could do it:

  • Keeping specific versions of files using names such as textfile-version1.txt for each major edit. This is neither the cleanest nor the most efficient way to organize files within your directory.
  • Compressing and backing up our files frequently. Storing these backup files can eat up a ton of hard drive space.
  • Copying files into specific time-stamped folders. This can be error-prone and time-consuming.

[Figure: How not to do version control. I'm sure most of us are familiar with how we version-controlled our college essays.]

Any of the above may work, but you can see how tedious each solution is. We must also note that backing up and keeping track of our files' evolution is just half the picture. The other half lies in collaboration, where a project is built simultaneously by tens of contributors. So what system can we implement that not only keeps track of our project's progress, but also allows for development with others in real time?

To share, we could use a file-sharing platform, such as Dropbox, but this too has its drawbacks. Files can go missing and edit conflicts may occur. As great as cloud-sharing services are, they are not specifically designed for collaboration and version control.

[Figure: Dropbox. Dropbox is a great way to keep your files in sync across different devices and share data among collaborators. It even stores the history of revisions for up to a month! However, it was not built specifically for version control.]

The best solution that meets our needs is a Version Control System. Many different systems have been implemented since the early days of computing, but to understand where Git came from, we must take a look at the Linux Kernel Project.

Developing the Linux Kernel

The Linux Kernel project was initiated in 1991 by Linus Torvalds as an open-source, collaborative project. The aim was to develop a working operating system kernel under GNU licensing. Since the project included contributors from all over the world, it required a system that could track all edits and resolve any conflicts.

Linus and his team had been using a proprietary Distributed Version Control System (DVCS) known as BitKeeper. Although the kernel developers had been able to use BitKeeper free of charge for years, its owner, Larry McVoy, withdrew that free license in 2005. Initially, Linus looked towards other free version control systems, but found none good enough for his project. That same year, Linus decided to create his own management system; this new Version Control System came to be known as Git.

[Figure: BitKeeper, the very first version control system used by the Linux project. The Linux Kernel Project, with its hundreds of collaborators around the world, formerly used the BitKeeper version control system.]

Git and its Design Criteria

The word git is British slang for "an unpleasant person." Torvalds said, "I'm an egotistical bastard, and I name all my projects after myself. First 'Linux', now 'git'."

[Figure: Linus Torvalds and the Git logo. Linus Torvalds, creator of Git and the Linux Kernel.]

Linus had several design criteria in mind when developing Git:

  • Support a distributed workflow. Each user has their own copy of the entire code base, making it possible to make changes without an Internet connection. We'll see how advantageous this is on the next page.
  • Safeguard against malicious attacks or accidents. Everything that is committed should be checksummed before it is stored. This makes it nearly impossible to change contents without Git knowing about it. Furthermore, since Git operations almost always add data, it's difficult to lose data or perform an action that cannot be undone.
  • Support non-linear development. Contributors can create a branch and keep their changes separate from the production, or main, branch. When all edits on the branch are complete, they may merge them into the main branch.
  • Speed. Running Git subcommands should be quick and efficient.
  • Local. Git's core operations do not require an Internet connection, making development fast and convenient. Even browsing the entire history of a project can be done locally.
  • Free. Git should be free software distributed under the GNU General Public License.
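
To make the checksumming criterion a little more concrete, here is a minimal Python sketch of how Git derives the ID of a stored file (a "blob"). It assumes Git's default SHA-1 object format, where the hash covers a short header of the form `blob <size>\0` followed by the file's raw bytes. The function name `git_blob_id` and the sample contents are our own, chosen only for illustration; this is not code taken from Git itself.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the ID Git would give a blob with this content (SHA-1 format assumed)."""
    # Git hashes a small header ("blob <size in bytes>\0") followed by the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical sample contents: the second differs by a single character.
original = b"test content\n"
tampered = b"test c0ntent\n"

print(git_blob_id(original))
# Any change to the content yields a different ID, so it cannot go unnoticed.
print(git_blob_id(original) == git_blob_id(tampered))
```

If you have Git installed, you should be able to compare the first printed ID with the output of `git hash-object` run on a file with the same contents. Because the ID depends on every byte of the content, a silently altered file no longer matches the ID recorded for it.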
