Practical 1: Version Control with Git: local#

This notebook is focused on version control with a local repository. You are expected to go through it line by line, step by step. Execute each code cell and follow the instructions to understand how to manage version control locally. Experiment with commands, make changes, and observe how they affect your repository. Your hands-on interaction with the examples will enhance your comprehension of local version control practices.

You might find this reference, this command list, as well as this resource very useful for this practical.

What is Version Control?#

In software development, revision control systems (RCS) are essential tools. They are widely used across all development environments and by developers everywhere.

RCS are versatile and not limited to just software projects; they are also invaluable for managing various types of digital content, including manuscripts, figures, data, and notebooks.

Revision control systems (RCS) serve two primary purposes:

  1. Track Changes in Source Code:

    • Enable tracking and managing changes to the source code.

    • Allow reverting to previous versions if issues arise.

    • Support working on multiple “branches” of the software at the same time.

    • Use tags to identify and manage different versions, such as “release-1.0” or “paper-A-final.”

  2. Facilitate Collaborative Development:

    • Allow multiple contributors to work on the same codebase simultaneously.

    • Enable numerous authors to make and integrate changes.

    • Provide clear communication and visualization of changes to all team members.

Basic Principles and Terminology of Revision Control Systems (RCS)#

In an RCS, source code or digital content is managed within a repository.

  • Repository: Stores not only the latest version of files but also the complete history of all changes made to these files since their initial addition to the repository.

  • Checkout: Users obtain a local working copy of the files from the repository. Changes are made to these local files, allowing for additions, deletions, and updates.

  • Commit: After completing a task, changes made to the local files are saved back to the repository.

  • Conflict Resolution: If changes have been made by others to the same files, conflicts may arise. The system often resolves conflicts automatically, but manual intervention may be necessary to merge conflicting changes.

  • Branches and Forks: For larger experimental developments, it’s common to create a new branch, fork, or clone of the repository. The primary branch is usually called master, main, or trunk. Once work on a branch or fork is finished, it can be merged back into the main branch or repository.

  • Distributed RCS: Systems like Git or Mercurial allow for pulling and pushing changesets between different repositories. For instance, changes can be pushed from a local repository to a central online repository, such as those hosted on platforms like GitHub.

In a few words, version control is a way to keep a backup of the changes in your files and to store a history of those changes. The key characteristic of version control (VC) is that it allows many people in a collaboration to make changes to the same files concurrently. VC is done via a VC system, and there are a lot of them. Wikipedia provides both a nice vocabulary list and a fairly complete table of some popular version control systems and their equivalent commands.

git --help : Getting Help#

The first thing you should know about any tool is how to get help. From the command line type

$ man git

If you remember from the shell class, man tells you more about a command and how to use it. The manual entry for the git version control system will appear before you. You may scroll through it using arrows, or you can search for keywords by typing / followed by the search term. I’m interested in help, so I type /help and then hit enter. It looks like the syntax for getting help with git is git --help.

To exit the manual page, type q.

Let’s see what happens when we type :

$ git --help

Excellent, it gives a list of commands it is able to help with, as well as their descriptions. For more information on a specific command, type:

$ git help <command>

git config : Controls the behavior of git#

A few settings are in order. You don’t have to do it now, but it is recommended.

$ git config --global user.name "YOUR NAME"
$ git config --global user.email "YOUR EMAIL"
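To check what git has recorded, you can read a setting back. The sketch below is only an illustration: it assumes a throwaway repository created with mktemp, uses placeholder values (Example User, user@example.com), and sets them without --global so that your real settings are left untouched.

```shell
# Work in a throwaway repository so your real settings are not touched.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Without --global, a setting applies to this repository only.
git config user.name "Example User"
git config user.email "user@example.com"

# Read the values back.
git config user.name
git config user.email

# List every setting stored in this repository's own config file.
git config --local --list
```

The same read-back works for the global settings above: git config --global user.name prints the name you configured.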

git init : Creating a Local Repository#

To keep track of numerous versions of your work without saving numerous copies, you can make a local repository for it on your computer. Each time you commit, git records a snapshot of the files you are tracking. A file that has not changed since the previous version is not stored again, and git compresses its stored history, so the repository stays compact. With this information, git is able to recreate any version on demand.

To create your own local (on your own machine) repository, you must initialize the repository with the infrastructure git needs in order to keep a record of things within the repository that you’re concerned about. The command to do this is git init.
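As a quick illustration of what "initializing" means, the sketch below creates an empty temporary directory (the mktemp location is an assumption made for the example, not part of the practical), runs git init in it, and then asks git whether the directory is now a working copy.

```shell
# Start from an empty throwaway directory.
repo=$(mktemp -d)
cd "$repo"

# Before git init, the directory is empty: no .git here.
ls -A

git init -q

# After git init, the hidden .git directory holds the repository's infrastructure.
ls -A

# Ask git whether the current directory is inside a working copy.
git rev-parse --is-inside-work-tree
```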


Practical : Create a Local Repository#

Step 1 : Initialize your repository. Navigate to your home directory, create a directory for the project, and initialize it.

$ cd
$ mkdir simplestats
$ cd simplestats
$ git init
Initialized empty Git repository in /home/me/simplestats/.git/

Step 2 : Browse the directory’s hidden files to see what happened here. Open directories, browse file contents. Learn what you can in a minute.

$ ls -A .git
$ cd .git
$ ls -A
HEAD        config      description hooks       info        objects     refs      branches

Step 3 : Use what you’ve learned. You may have noticed the file called description. You can describe your repository by opening the description file and replacing the text with a name for the repository. We will be creating a module with some simple statistical methods, so mine will be called “Some simple methods for statistical analysis”. You may call yours anything you like.

$ nano description

You can use tree in a terminal (or !tree in a notebook cell) to display the directory structure in a tree-like format, provided the tree program is installed.


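An interesting command I would like you to test is git status. I will describe it later, but let’s see what it displays now. First return to the top level of your repository, since git status does not work from inside the .git directory.

$ cd ..
$ git status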
On branch master

No commits yet

nothing to commit (create/copy files and use "git add" to track)

git add : Adding a File To Version Control#

For the git repository to know which files within this directory you would like to keep track of, you must add them. First, you’ll need to create one, then we’ll learn the git add command.


Practical : Add a File to Your Local Repository#

Step 1 : Create a file to add to your repository.

$ touch README.md

Step 2: Verify that git has seen the file.

$ git status
# On branch master

# No commits yet

# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#       README.md

# nothing added to commit but untracked files present (use "git add" to track)

Step 3 : Inform git that you would like to keep track of future changes in this file.

$ git add README.md
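One way to confirm that git add did its job is to list the files git currently has staged or tracked. The following is a minimal sketch in a temporary repository (the mktemp directory is an assumption made for the example); it uses git ls-files, which prints the files recorded in the staging area.

```shell
# Throwaway repository for the demonstration.
repo=$(mktemp -d)
cd "$repo"
git init -q
touch README.md

# Nothing has been added yet, so this prints nothing.
git ls-files

git add README.md

# README.md is now listed.
git ls-files
```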

git status : Checking the status of your local copy#

The files you’ve created on your machine are your local “working” copy. The changes you make in this local copy aren’t backed up online automatically. Until you commit them, the changes you make are local changes. When you change anything, your set of files becomes different from the files in the official repository copy. To find out what’s different about them in the terminal, try:

$ git status
# On branch master
#
# No commits yet
#
# Changes to be committed:
#   (use "git rm --cached <file>..." to unstage)
#
#       new file:   README.md
#

A null result (nothing to commit) would mean that your working copy is up to date with the latest version recorded in the repository. Here, the result instead indicates that the current difference between the repository HEAD (which, so far, is empty) and your simplestats directory is this new README.md file.
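For a more compact view, git status accepts a --short flag that prints one line per file with a two-character code. The sketch below walks a file through the states you will meet most often. It assumes a temporary repository and a placeholder identity (Example User), both invented for the example.

```shell
# Throwaway repository with a placeholder identity for the commit.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

touch README.md
git status --short      # "?? README.md": the file is untracked

git add README.md
git status --short      # "A  README.md": staged as a new file

git commit -q -m "Add an empty README file."
git status --short      # no output: nothing to commit

echo "Simple statistics" >> README.md
git status --short      # " M README.md": modified but not yet staged
```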

git commit : Saving a snapshot#

In order to save a snapshot of the current state (revision) of the repository, we use the commit command. This command is always associated with a message describing the changes since the last commit and indicating their purpose. Informative commit messages will serve you well someday, so make a habit of never committing changes without at least a full sentence description.

ADVICE: Commit often

In the same way that it is wise to often save a document that you are working on, so too is it wise to save numerous revisions of your code. More frequent commits increase the granularity of your undo button.

ADVICE: Good commit messages

There are no hard and fast rules, but good commits are atomic: each is the smallest change that remains meaningful. A good commit message usually contains a one-line description followed by a longer explanation if necessary.
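One way to write such a message from the command line is to pass -m twice: the first becomes the one-line description and the second becomes the longer explanation. This is a sketch under assumptions (temporary repository, placeholder identity, made-up message text), not the only way to do it; running git commit without -m opens an editor instead.

```shell
# Throwaway repository with a placeholder identity.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "Simple statistics" > README.md
git add README.md

# Each -m adds one paragraph: first the summary line, then the explanation.
git commit -q -m "Add a README file." \
    -m "The README names the project so that new readers know what the repository contains."

git log -1 --format=%s   # prints the one-line description
git log -1 --format=%b   # prints the longer explanation
```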


Practical : Commit Your Changes#

Step 1 : Commit the file you’ve added to your repository.

$ git commit -am "This is the first commit. It adds a readme file."
  [master (root-commit) 664867c] This is the first commit. It adds a readme file.
  1 file changed, 0 insertions(+), 0 deletions(-)
  create mode 100644 README.md

Step 2 : Admire your work.

$ git status
# On branch master
nothing to commit, working tree clean

git diff : Viewing the Differences#

There are many diff tools.

If you have a favorite you can set your default git diff tool to execute that one. Git, however, comes with its own diff system.

Let’s recall the behavior of the Linux diff command on the command line. The equivalent command for Windows is fc. Choosing two files that are similar, the command:

$ diff file1 file2

will output the lines that differ between the two files. This information can be saved as what’s known as a patch, but we won’t go deeply into that just now.

The only difference between the command line diff tool and git’s diff tool is that the git tool is aware of all of the revisions in your repository, allowing each revision of each file to be treated as a full file.

Thus, git diff will output the changes in your working directory that are not yet staged for a commit. To see how this works, make a change in your README.md file, but don’t yet commit it.

$ git diff

A summarized version of this output can be produced with the --stat flag :

$ git diff --stat

To see only the differences in a certain path, compared with the last commit, try:

$ git diff HEAD -- [path]

To see what IS staged for commit (that is, what will be committed if you type git commit without the -a flag), you can try :

$ git diff --cached
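The difference between these variants is easiest to see when a file has both a staged and an unstaged change. The following sketch sets that up in a temporary repository (directory, identity, and file contents are all assumptions made for the example).

```shell
# Throwaway repository with one committed file.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "first line" > README.md
git add README.md
git commit -q -m "Add a README file."

# One change that is staged, then one that is not.
echo "a staged line" >> README.md
git add README.md
echo "an unstaged line" >> README.md

git diff            # the added (+) line is "an unstaged line"
git diff --cached   # the added (+) line is "a staged line"
git diff HEAD       # both lines appear as additions
```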

git log : Viewing the History#

A log of the commit messages is kept by the repository and can be reviewed with the log command.

$ git log
commit 664867c42a05461702388310155b785a287d0308 (HEAD -> master)
Author: Techni Preneurs <ai.technipreneurs@gmail.com>
Date:   Wed Sep 11 20:31:07 2024 +0200

    This is the first commit. It adds a readme file.

There are some useful flags for this command, such as

-p
-3
--stat
--oneline
--graph
--pretty=short/full/fuller/oneline
--since=X.minutes/hours/days/weeks/months/years or YY-MM-DD-HH:MM
--until=X.minutes/hours/days/weeks/months/years or YY-MM-DD-HH:MM
--author=<pattern>
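These flags can be combined. As a small illustration, the sketch below builds a short history in a temporary repository (the notes.txt file, the identity, and the commit messages are invented for the example) and then queries it in three ways.

```shell
# Throwaway repository with four small commits.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
for n in 1 2 3 4; do
    echo "line $n" >> notes.txt
    git add notes.txt
    git commit -q -m "Add line $n to the notes."
done

# The three most recent commits, one line each.
git log --oneline -3

# The most recent commit, with a summary of the files it changed.
git log --stat -1

# Commits from the last two weeks whose author matches "Example".
git log --since=2.weeks --author="Example" --oneline
```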

git reset : Unstaging a staged file#

There are a number of ways that you may accidentally stage a file that you don’t want to commit. Create a file called temp_notes that describes what you had for breakfast, and then add that file to your repo. Check with status to see that it is added but not committed.

You can now unstage that file with:

$ git reset temp_notes

Check with status.
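The whole round trip can be sketched as follows, again in a temporary repository with a placeholder identity (both assumptions made for the example). Note that unstaging does not delete the file; it only goes back to being untracked.

```shell
# Throwaway repository with one commit already in place.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "Simple statistics" > README.md
git add README.md
git commit -q -m "Add a README file."

# Stage a file by accident.
echo "toast and coffee" > temp_notes
git add temp_notes
git status --short      # "A  temp_notes": staged

# Unstage it; the file itself stays on disk.
git reset -q temp_notes
git status --short      # "?? temp_notes": untracked again
```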

git checkout : Discarding unstaged modifications (git checkout has other purposes)#

Perhaps you have made a number of changes that you realize are not going anywhere. Add a line to README.md that describes your dinner last night. Check with status to see that the file is changed and ready to be added.

You can now return to the previously checked-in version with:

$ git checkout -- README.md

Check with status and take a look at the file.
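Here is a minimal sketch of that sequence in a temporary repository (the directory, identity, and file contents are assumptions made for the example). Be aware that the discarded change cannot be recovered afterwards, since it was never committed. Versions of git from 2.23 onward also provide git restore README.md for the same purpose.

```shell
# Throwaway repository with one committed file.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "Simple statistics" > README.md
git add README.md
git commit -q -m "Add a README file."

# Make a change that is going nowhere.
echo "Dinner: pasta" >> README.md
git status --short      # " M README.md": modified, not staged

# Throw the change away and return to the committed version.
git checkout -- README.md
git status --short      # no output
cat README.md           # only the committed line remains
```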

git rm : Removing files#

There are a variety of reasons you may want to remove a file from the repository after it has been committed. Create a file called READYOU.md with the first names of all your immediate family members, and add/commit it to the repository.

You can now remove the file from the repository with:

$ git rm READYOU.md

List the directory to see that you have no file named READYOU.md. Use status to determine if you need any additional steps.

What if you delete a file in the shell without git rm? Try deleting README.md

$ rm README.md

What does git status say? Oops! How can you recover this important file?

$ git checkout -- README.md
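Both cases can be tried side by side. The sketch below assumes a temporary repository, a placeholder identity, and made-up file contents. It shows that git rm stages the deletion (which still needs a commit), while a plain rm only removes the working copy, which git can restore.

```shell
# Throwaway repository with two committed files.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "Simple statistics" > README.md
echo "Ada, Grace" > READYOU.md
git add README.md READYOU.md
git commit -q -m "Add the README and READYOU files."

# git rm deletes the file and stages the deletion in one step.
git rm -q READYOU.md
git status --short      # "D  READYOU.md": deletion staged
git commit -q -m "Remove the READYOU file."

# A plain rm deletes only the working copy; git still has the file.
rm README.md
git status --short      # " D README.md": deleted, not staged
git checkout -- README.md
ls                      # README.md is back
```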

git revert : the promised “undo” button#

It is possible that after many commits, you decide that you really want to undo a commit, or a set of commits. It is easy to undo the changes that a commit introduced.

You can use git log and git diff to explore your history and determine which commit you want to undo. Note the hash for that commit. (Let’s assume abc456)

$ git revert abc456

Importantly, this will not erase any commits. It creates a new commit that applies the opposite of the changes made in abc456, and every other commit stays in the history. To roll back several commits, revert each of them, starting with the most recent. This retains a complete provenance of your software, and can be compared to the prohibition on removing pages from a lab notebook.
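To see that a revert adds to the history rather than rewriting it, consider the following sketch. It assumes a temporary repository with three commits that each add one file (a.txt, b.txt, c.txt, all invented for the example) and reverts the middle one. The --no-edit flag accepts the default commit message instead of opening an editor.

```shell
# Throwaway repository with three commits, one file each.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
for name in a b c; do
    echo "content of $name" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "Add $name.txt."
done

# HEAD~1 names the commit just before the latest one: the one that added b.txt.
target=$(git rev-parse HEAD~1)

# Create a new commit that undoes the target commit.
git revert --no-edit "$target"

git log --oneline       # four commits: the original three plus the revert
ls                      # a.txt and c.txt remain; b.txt is gone
```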

Visualize Git Log Tree#

For this section, you can read the necessary information here. The idea is just to make your commits a bit fancy. To really see the beauty of it, try to play with your files and do quite a number of commits.

$ git log

or

$ git log --pretty=oneline

or

$ git log --graph --pretty='%Cred%h%Creset -%C(auto)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --all
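Long log commands like the last one are tedious to retype, so one option is to store a shorter variant as a git alias. The sketch below is an illustration only: the alias name lg is an arbitrary choice, the format is simplified to --graph --oneline --all, and the alias is set for a temporary repository rather than globally.

```shell
# Throwaway repository with a few commits to display.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
for n in 1 2 3; do
    echo "line $n" >> notes.txt
    git add notes.txt
    git commit -q -m "Add line $n to the notes."
done

# Define an alias (hypothetical name "lg") for this repository only.
git config alias.lg "log --graph --oneline --all"

# "git lg" now runs the full command.
git lg
```

Adding --global to the git config line would make the alias available in every repository on your machine.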

Exercise :#

  1. Create 5 files in your directory with one line of content in each file.

  2. Commit the files to the repository.

  3. Change 2 of the 5 files and commit them.

  4. Undo the changes in step 3.

  5. Print out the last entry in the log.


Resources#

  1. git book - Free and Open