01. Introduction to Git: History of Version Control

Welcome to our Git tutorial! In this series, we'll go from the basic commands of Git to more advanced topics. Before we dive into Git, let's talk a bit about history. If you already have Git installed and are ready to learn, you may jump directly into Git Fundamentals.

Why Version Control?

When working on any project that involves revisions, it's imperative that you keep track of the changes you make. As a programmer, you'll need to pinpoint where a bug might have been introduced by sifting through each update checkpoint. As a data scientist, you'll need to keep track of each dataset and analysis script used to obtain your results.

But how can we handle version control? Here are just some of the naive ways we could go about it:

  • Keeping specific versions of files using names such as textfile-version1.txt for each major edit. This is neither the cleanest nor the most efficient way to organize files within your directory.
  • Compressing and backing up our files frequently. Storing these backup files can eat up a ton of hard drive space.
  • Copying files into specific time-stamped folders. This can be error-prone and time-consuming.
How not to do version control.
I'm sure most of us are familiar with how we version controlled our college essays.

Any of the above may work, but you can see how tedious each solution is. And we must note that backing up and keeping track of our files' evolution is just half the picture. The other half lies in collaboration, where a project is built simultaneously by tens of contributors. So what system can we implement that not only keeps track of our project's progress, but also allows for development with others in real time?

To share, we could use a file-sharing platform, such as Dropbox, but this too has its drawbacks. Files can go missing and edit conflicts may occur. As great as cloud-sharing services are, they are not specifically designed for collaboration and version control.

Dropbox is a great way to keep your files in sync across different devices and share data among collaborators. It even stores the history of the revisions for up to a month! However, it was not built specifically for version control.

The best solution that meets our needs is to use a Version Control System. Many different systems have been implemented since the early days of computing, but to understand where Git came from, we must take a look at the Linux Kernel Project.

Developing the Linux Kernel

The Linux Kernel project was initiated in 1991 by Linus Torvalds as an open-source, collaborative project. The aim was to develop a working operating system kernel under GNU licensing. Since the project included contributors from all over the world, it required a system that could track all edits and resolve any conflicts.

Linus and his team had been using a proprietary Distributed Version Control System (DVCS) known as BitKeeper. Although BitKeeper had been free of charge for the kernel developers for years, in 2005 its owner, Larry McVoy, withdrew the free license. Initially, Linus looked towards other free version control systems, but found none good enough for his project. That same year, Linus decided to create his own system; this new Version Control System came to be known as Git.

BitKeeper was the very first version control system used by the Linux project.
The Linux Kernel Project, with its hundreds of collaborators around the world, used to rely on the BitKeeper Version Control System.

Git and its Design Criteria

The word git is British slang for "an unpleasant person." Torvalds said, "I'm an egotistical bastard, and I name all my projects after myself. First 'Linux', now 'git'."

Linus Torvalds, creator of Git and the Linux Kernel.

Linus had several design criteria in mind when developing Git:

  • Support a distributed workflow. Each user has their own copy of the entire code base, making it possible to make changes without an Internet connection. We'll see how advantageous this is on the next page.
  • Safeguard against malicious attacks or accidents. Everything that is committed should be checksummed before it is stored. This makes it nearly impossible to change contents without Git knowing about it. Furthermore, since Git operations almost always add data, it's difficult to lose data or perform an action that cannot be undone.
  • Support for non-linear development. Contributors can create a branch and keep their changes separate from the production, or main, branch. When all edits on the branch are complete, they may merge them into the main branch.
  • Speed. Running Git subcommands should be quick and efficient.
  • Local. Git's core operations do not require an Internet connection, making development fast and convenient. Even browsing the entire history of a project can be done locally.
  • Free. Git should be free software distributed under the GNU General Public License.
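
To get a feel for the checksumming idea, here is a minimal Python sketch of how Git derives the ID of a file's contents (a "blob"). It assumes Git's default SHA-1 object format, in which the ID is the SHA-1 hash of a short header followed by the raw bytes. The function name git_blob_id and the sample contents are ours, chosen only for illustration; this is not Git's own code.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the ID Git's default SHA-1 format gives to file contents.

    Git hashes the header "blob <size in bytes>" plus a NUL byte,
    followed by the raw contents.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical file contents: the second differs by a single character.
original = b"hello world\n"
tampered = b"hello w0rld\n"

print(git_blob_id(original))
print(git_blob_id(tampered))
print(git_blob_id(original) == git_blob_id(tampered))  # the two IDs differ
```

Because the ID depends on every byte, even a one-character change produces a completely different ID, which is how Git can notice that stored contents were altered. If you have Git installed, you can compare the first ID against the output of git hash-object for a file with the same contents.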
