Notes on Source Control with GIT

The Basics of Using Git

Understanding the Git File System

Directory/File – Description
COMMIT_EDITMSG – Plain file with the most recent commit message
HEAD – Contains a reference to the current branch
config – Configuration information about our repository
description – Contains the name of the repository
hooks – Contains scripts that you can add to set up automation tasks
index – Keeps track of files in the staging area, i.e. files that are ready to be committed
info – Contains items such as the exclude file (a local, unshared gitignore)
logs – Log files
objects – Mainly our database
refs – Reference files for branches and tags of the repository

Creating a Local Repository (Empty)

git init /path/to/directory – Initializes a git repository, either by creating a new directory or by adding the git repository files to an existing directory
git init --bare /path/to/directory – Initializes a bare git repository, which contains no working area; typically used as a shared (central) repository that others push to and pull from

The most common method is to navigate to the project folder and type git init
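
As a minimal sketch of the two variants above, the following creates one normal and one bare repository and compares them. The temporary directory and the names demo and demo.git are assumptions made for illustration only.

```shell
set -e
work="$(mktemp -d)"                  # throwaway directory (assumes mktemp is available)
git init -q "$work/demo"             # normal repository: working area plus .git/
git init -q --bare "$work/demo.git"  # bare repository: repository files only

ls "$work/demo/.git"                 # entries such as HEAD, config, objects, refs
ls "$work/demo.git"                  # the same entries, directly at the top level
git -C "$work/demo.git" rev-parse --is-bare-repository   # reports whether the repo is bare
```

In the bare repository the files that normally live under .git/ sit at the top level, which is why there is no room for a working area.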

Basic Configuration of Git

First, we need to set up our identity:

git config --global user.name "Peter Parker"
git config --global user.email "pparker@dbugle.com"

Check above with git config --list.

We can also add a preferred editor. For example:

git config core.editor "/usr/bin/vim"

We can see all of the global settings in ~/.gitconfig (settings made without --global are stored in the repository’s .git/config)

Note: To change information for a particular repo only, omit the --global option. For example, navigate to your repo and type
git config user.email "user@email.com"
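
A small sketch of how the two scopes interact. To avoid touching a real ~/.gitconfig it points HOME at a temporary directory; that sandboxing and the directory names are assumptions of this example, not part of the notes.

```shell
set -e
work="$(mktemp -d)"
export HOME="$work/home"                 # sandbox: --global now writes to $work/home/.gitconfig
export XDG_CONFIG_HOME="$HOME/.config"
mkdir -p "$HOME"
git init -q "$work/repo"
cd "$work/repo"

git config --global user.email "pparker@dbugle.com"   # global scope
git config user.email "user@email.com"                # this repository only (.git/config)

git config --global user.email    # value stored in the global file
git config user.email             # effective value inside this repo: the local one wins
git config --list                 # all settings visible from this repo
```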

Adding Files to a Project

Create a new file and run git add to track/add it to our staging area. Do a cat .git/index to see that it is in the staging area (waiting to be committed)

There is another, better method to view files: git status

Create an empty directory, and git add it. Next, run git status

Where is our new directory? There is nothing there because git doesn’t care about empty directories. It only cares about files.

To make git track an otherwise empty directory, we can add a placeholder file such as .keep to it (the name is only a convention). However, we usually don’t want to stage all files.

To delete files from git:

git rm <file> – Careful though, as this also deletes the file from the working directory
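
The behaviour with empty directories can be checked with a short sketch like the one below. The directory name empty and the use of git status --porcelain (a script-friendly form of the short status) are choices made for this example.

```shell
set -e
work="$(mktemp -d)"
git init -q "$work/demo"
cd "$work/demo"

mkdir empty
git add empty || true                 # nothing to stage: the directory holds no files
before="$(git status --porcelain)"

touch empty/.keep                     # placeholder file; any file name would do
git add empty
after="$(git status --porcelain)"

echo "before: '$before'"
echo "after:  '$after'"
```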

The Status of your Project

Create 2 new files, and git add one of them to add it to our staging area:

touch file1
touch file2
git add file1

Check what happened with:

git status
git status -s (short)

Stage the second file and check again:

git add file2
git status

Modify one of the files and check the git status:

echo "this is some text in file2" > file2
git status
❯ git status -s
AM file2

A = status of the staged content (the file was added)
M = status of the unstaged content (the file was modified)

So the first character refers to the staged state and the second to the unstaged state. This means our modification has not been staged: if we make a commit now, this change will not be committed. We need to stage the file again: git add file2
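
The two-column short status can be reproduced with a sketch like this, run in a throwaway repository (the temporary directory is an assumption of the example):

```shell
set -e
work="$(mktemp -d)"
git init -q "$work/demo"
cd "$work/demo"

touch file2
git add file2                                  # staged as a new file
echo "this is some text in file2" > file2      # modified after staging
first="$(git status -s)"                       # first column: staged, second: unstaged

git add file2                                  # stage the modification as well
second="$(git status -s)"

printf '%s\n%s\n' "$first" "$second"
```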

To check verbose output, use git status -v

For reference, check the man pages with man git-status for more info.

Committing to Git

Now we have all the files staged, we need to commit them to the database.

There are a few ways to do this:

  • git commit (the long way)
  • git commit -a -m "What this commit is for"
  • gc -am "What this commit is for" (the fastest way; gc here is a shell alias for git commit)

Try committing using the long way:

> git commit

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
#
# On branch master
# Changes to be committed:
#       modified:   file2
#
# Changes not staged for commit:
#       modified:   file2
#

This will launch your default editor and ask to enter the commit message.

To make things easier, we can simply type git commit -m "This is a test commit"

Commit messages are written to .git/COMMIT_EDITMSG

❯ cat .git/COMMIT_EDITMSG
This is a test commit
# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
#
# On branch master
# Changes to be committed:
#   modified:   file2
#
# Changes not staged for commit:
#   modified:   file2
#

To check our commits, we can issue git log. The output will contain something similar to the following:

commit c702b2bfb4939fb247b852e2f485c0362cab9a09 (HEAD -> master)
Author: Peter Parker <pparker@dbugle.com>
Date:   Mon May 17 17:54:10 2021 +0300

Git will take the first two characters of this hash (in this case c7) and create a directory under the objects folder:

❯ ls .git/objects/c7
02b2bfb4939fb247b852e2f485c0362cab9a09

The directory name (c7) plus the file name (02b2bfb4939fb247b852e2f485c0362cab9a09) make up the full hash. This file holds the commit object itself. Note that objects are stored compressed, so we cannot read them directly; git cat-file -p <hash> prints their contents in readable form.
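
To see the mapping from hash to object file, a sketch along these lines can be run in a scratch repository. The temporary directory is an assumption; the identity and file names are reused from the notes.

```shell
set -e
work="$(mktemp -d)"
git init -q "$work/demo"
cd "$work/demo"
git config user.name "Peter Parker"
git config user.email "pparker@dbugle.com"
echo "A test" > file2
git add file2
git commit -q -m "This is a test commit"

hash="$(git rev-parse HEAD)"                 # full hash of the commit just made
dir="$(printf '%s' "$hash" | cut -c1-2)"     # first two characters: directory name
file="$(printf '%s' "$hash" | cut -c3-)"     # remaining characters: file name
ls ".git/objects/$dir/$file"                 # the stored object file
git cat-file -t "$hash"                      # object type
git cat-file -p "$hash"                      # readable contents (tree, author, message)
```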

To remove a file from git’s staging area but not from the system itself, we use the --cached option:

❯ git rm --cached file2
rm 'file2'

❯ ls -l
-rw-r--r--  1 pparker     7 May 17 17:47 file1
-rw-r--r--  1 pparker    68 May 17 05:55 file2

❯ git status
On branch master
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
	deleted:    file2

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	file2
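
The same effect can be checked in a scratch repository with a sketch such as the following (temporary directory assumed; --porcelain is used only to get stable, script-friendly output):

```shell
set -e
work="$(mktemp -d)"
git init -q "$work/demo"
cd "$work/demo"
git config user.name "Peter Parker"
git config user.email "pparker@dbugle.com"
echo "A test" > file2
git add file2
git commit -q -m "added file2"

git rm -q --cached file2     # drop the file from the index only
ls file2                     # the file is still on disk
git ls-files                 # file2 is no longer listed as tracked
git status --porcelain       # staged deletion plus an untracked file2
```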

Ignoring certain file types

  • .git/info/exclude – Per-repository file with patterns that git will not track; it is not committed, so it is not shared with others
  • .gitignore – Ignore file kept in the repository (and usually committed), commonly used to exclude files based on patterns
  • git check-ignore <path> – Checks whether a path is ignored; man gitignore holds the local documentation for the gitignore file

Let’s create a new directory:

> mkdir build

We want git to ignore files in this directory

The most popular method is the gitignore file:

touch .gitignore  
git add .gitignore
git status
git commit -m "Adding a gitignore file"

Note that we need to add this file to our Repo.

Everything we list in this file will be ignored and not tracked. Let’s add the build directory:

echo "build/*" >> .gitignore
git commit -a -m "added gitignore file to exclude build directory"

So if, for example, I compile my code to the build directory, it will be excluded from git. Test by creating a file in there; it should be ignored.

Some developers use vim to create a backup file using :set backup. This creates a new file with an appended ~ containing the original info/configuration. Let’s see if we can ignore this backup as well:

echo "*~" >> .gitignore
gc -am "updated .gitignore file"

To see which ignore pattern matches a file, we can run git check-ignore -v *~ (the shell expands *~ to the existing backup files)
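
A short sketch of both patterns in action. The file names app.o, notes.txt~ and main.c are made up for this example.

```shell
set -e
work="$(mktemp -d)"
git init -q "$work/demo"
cd "$work/demo"

mkdir build
touch build/app.o notes.txt~ main.c
printf '%s\n' 'build/*' '*~' > .gitignore     # quoted so the shell does not expand the globs

git check-ignore -v build/app.o notes.txt~    # shows the file, line and pattern behind each match
git check-ignore -q main.c || echo "main.c is not ignored"
git status --porcelain                        # only .gitignore and main.c show up as untracked
```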

Tags, Branching, Merging and Reverting

Using Tags

A tag is used to mark a specific commit in your project. It lets you put a “sticky note” on a particular point in your project’s history, for example to mark a new version of the project.

There are 2 types of tags:

  • Annotated tags (preferred)
  • Lightweight tags

Example:

> git tag -a v0.1 -m "First tag for the project"

We can check for tags with git tag

Difference between annotated and lightweight tags: annotated tags are stored as full objects in the git database. They record the user who tagged, the date and a message, and they are checksummed. Lightweight tags are only a pointer to a commit; they are used less often and usually denote a private or temporary “label”.

To create a lightweight tag: git tag v0.2 (Note this is run without the -a (annotated) flag)

To delete a tag, use -d:

git tag -d v0.1

You can check out any major milestone (version) by its tag, for example git checkout v0.1

Checking out the tag v0.1 this way restores all the files to the state they were in at tagging. To change files from this point, however, create a new branch first; otherwise the new commits belong to no branch and can be lost when you switch back to the main branch.
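
The difference between the two tag types is visible in the object database. A minimal sketch, assuming a scratch repository with a single commit:

```shell
set -e
work="$(mktemp -d)"
git init -q "$work/demo"
cd "$work/demo"
git config user.name "Peter Parker"
git config user.email "pparker@dbugle.com"
echo "A test" > file1
git add file1
git commit -q -m "first commit"

git tag -a v0.1 -m "First tag for the project"   # annotated: a tag object of its own
git tag v0.2                                     # lightweight: a plain reference to the commit

git cat-file -t v0.1     # the annotated tag resolves to a tag object
git cat-file -t v0.2     # the lightweight tag resolves straight to the commit
git tag                  # lists both tags
```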

Using Branches

Let’s create a new branch:

> git branch development
> git status

We see that we are still on master branch. Why? Let’s run git log --oneline --decorate

> git log --oneline --decorate
6996b44 (HEAD -> master) added file2
c702b2b This is a test commit

We see here that HEAD is pointing to master. HEAD is the pointer to the current branch being worked on, and we can use git log and git status to view which branch HEAD is pointing to.

To switch branches: git checkout development

Let’s make some edits:

❯ echo "New Features: " >> changelog.md
❯ gc -am "adding features section to changelog"
[development ee3bd38] adding features section to changelog
1 file changed, 1 insertion(+)

Note that this does not affect the master branch, and all of our changes are being recorded in the development branch only. Let’s add some more stuff to changelog.md and commit again.

Switch to the master branch: git checkout master

Run cat changelog.md … uh-oh! Where are our changes? That’s because HEAD is pointing to the last commit of the master branch.

Let’s switch back: git checkout development. Our changes are visible again, as HEAD is now pointing to the last commit of the development branch.
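
The branch walkthrough above can be replayed end to end with a sketch like this. The scratch repository and the initial changelog content are assumptions; the branch name master is forced so the example does not depend on the local default branch setting.

```shell
set -e
work="$(mktemp -d)"
git init -q "$work/demo"
cd "$work/demo"
git symbolic-ref HEAD refs/heads/master      # use the branch name from these notes
git config user.name "Peter Parker"
git config user.email "pparker@dbugle.com"
echo "Changelog" > changelog.md
git add changelog.md
git commit -q -m "add changelog"

git branch development
git checkout -q development
echo "New Features: " >> changelog.md
git commit -q -a -m "adding features section to changelog"

git checkout -q master
on_master="$(cat changelog.md)"              # master does not have the new line
git checkout -q development
on_dev="$(cat changelog.md)"                 # development does
cat .git/HEAD                                # shows which branch HEAD points to
```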

Merging Branches

To check where we last were, let’s check the HEAD file:

> cat .git/HEAD
ref: refs/heads/development

Let’s check this file:

> cat .git/refs/heads/development
1dd7ad864f783708af12e7c4dc0b95586898869d

This is the SHA-1 hash of the last commit on that branch

Let’s do a basic merge: The first thing that we need to do is be on the branch that we need to merge something into:

> git checkout master
> git merge development
Updating ff12f31..8bb12cd
Fast-forward
  changelog.md | 1 +
  1 file changed, 1 insertion(+)

And that’s it!

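“Fast-forward” means that branch master caught up with the changes in the development branch: master had no new commits of its own, so git only had to move the master pointer forward to the latest commit of development.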
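
A fast-forward can be verified by comparing where the two branch names point after the merge. This is a sketch in a scratch repository (temporary directory and initial file content are assumptions):

```shell
set -e
work="$(mktemp -d)"
git init -q "$work/demo"
cd "$work/demo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Peter Parker"
git config user.email "pparker@dbugle.com"
echo "Changelog" > changelog.md
git add changelog.md
git commit -q -m "add changelog"

git checkout -q -b development               # create the branch and switch to it
echo "New Features: " >> changelog.md
git commit -q -a -m "adding features section to changelog"

git checkout -q master
git merge development                        # master has no commits of its own
git rev-parse master development             # both names resolve to the same commit
git log --merges --oneline                   # prints nothing: no merge commit was needed
```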

Let’s set up a situation where we have a problem. We’ll create 2 new branches:

❯ git branch trial1
❯ git branch trial2

Let’s jump to the trial1 branch:

❯ git checkout trial1
Switched to branch 'trial1'

And create a new file and commit it to this branch:

❯ echo "stuff" >> info.txt
❯ git add info.txt
❯ git commit -m "new info file - stuff"
[trial1 127558f] new info file - stuff
 1 file changed, 1 insertion(+)
 create mode 100644 info.txt

Let’s do the same on trial2, where a developer will also create an info.txt file but with different text:

❯ git checkout trial2
Switched to branch 'trial2'
❯ echo "things" >> info.txt
❯ git add info.txt
❯ git commit -m "new info file - things"
[trial2 892337f] new info file - things
 1 file changed, 1 insertion(+)
 create mode 100644 info.txt

Let’s go back to the trial1 branch and attempt to merge trial2 into it:

❯ git checkout trial1
Switched to branch 'trial1'
❯ git merge trial2
CONFLICT (add/add): Merge conflict in info.txt
Auto-merging info.txt
Automatic merge failed; fix conflicts and then commit the result.

This is called a merge conflict. Let’s check our git status output:

❯ git status
On branch trial1
You have unmerged paths.
  (fix conflicts and run "git commit")
  (use "git merge --abort" to abort the merge)

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both added:      info.txt

no changes added to commit (use "git add" and/or "git commit -a")

We can take a look at the conflicting files to see if there is something we can do:

> vim info.txt
<<<<<<< HEAD
stuff
=======
things
>>>>>>> trial2

Git has labeled the problem areas for us in our file.

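The first half of this output (between <<<<<<< and =======) comes from our current branch (HEAD), and everything between ======= and >>>>>>> comes from the other branch. Let’s combine the 2 words by editing the file, removing the conflict markers so that only this line remains: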

stuff things

Once we are satisfied with the result, we can save and close the file.

❯ git add info.txt
❯ git commit -m "added trial2's work, resolved merge conflict"
[trial1 5d1222b] added trial2's work, resolved merge conflict
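
The whole conflict scenario can be scripted as a sketch. The base commit with a README file is an assumption added so that both branches have a common starting point; the resolution is written with echo instead of an editor.

```shell
set -e
work="$(mktemp -d)"
git init -q "$work/demo"
cd "$work/demo"
git config user.name "Peter Parker"
git config user.email "pparker@dbugle.com"
echo "base" > README
git add README
git commit -q -m "base"
git branch trial1
git branch trial2

git checkout -q trial1
echo "stuff" > info.txt
git add info.txt
git commit -q -m "new info file - stuff"
git checkout -q trial2
echo "things" > info.txt
git add info.txt
git commit -q -m "new info file - things"

git checkout -q trial1
if git merge trial2; then
  echo "unexpected: merged without conflict"
else
  markers="$(grep -c '^<<<<<<<' info.txt)"   # conflict markers written by git
  echo "stuff things" > info.txt             # resolve by hand: keep both words
  git add info.txt                           # mark the conflict as resolved
  git commit -q -m "added trial2's work, resolved merge conflict"
fi
cat info.txt
```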

Let’s switch to the master branch and merge trial1

❯ git checkout master
Switched to branch 'master'
❯ git merge trial1
Updating 6996b44..5d1222b
Fast-forward
 info.txt | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 info.txt
❯ cat info.txt
stuff things

Let’s remove those branches:

❯ git branch -d trial1
Deleted branch trial1 (was 5d1222b).
❯ git branch -d trial2
Deleted branch trial2 (was 892337f).

Rebasing

Unlike merge, where we combine 2 branches into 1 and create a merge commit at the end of the process, rebase is more like appending the changes from one branch onto another.

Let’s try this out:

❯ git checkout development
Switched to branch 'development'
❯ vim changelog.md
❯ git commit -a -m "updated some things in changelog"
[development 6e25d8c] updated some things in changelog
 1 file changed, 2 insertions(+)
❯ git checkout master
Switched to branch 'master'

This time, instead of merging the development branch, we are going to rebase master onto it. git rebase development replays the commits that exist only on master on top of the latest commit of development; if master has no commits of its own, it simply moves forward to that commit.

git rebase development

Now master has our updated changelog file. Why do a rebase instead of a merge? For one thing, the history of our project will look cleaner: the git log will read as if we had made all of these changes on one branch, one after another, instead of showing work in development being merged into master. This really pays off when we push local changes to a remote repository, because a linear history is simpler for the maintainer of the project to review.

However, you should only rebase commits that exist only in your local repository and have not been pushed or shared. Merge preserves a detailed, accurate history, whereas rebase rewrites commits to present a cleaner one.
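
To make the replaying visible, the sketch below gives master one commit of its own before rebasing. That extra commit (fix.txt) is an assumption added for illustration and is not part of the walkthrough above.

```shell
set -e
work="$(mktemp -d)"
git init -q "$work/demo"
cd "$work/demo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Peter Parker"
git config user.email "pparker@dbugle.com"
echo "Changelog" > changelog.md
git add changelog.md
git commit -q -m "base"
git branch development

echo "hotfix" > fix.txt                      # hypothetical master-only change
git add fix.txt
git commit -q -m "hotfix on master"

git checkout -q development
echo "New Features: " >> changelog.md
git commit -q -a -m "updated some things in changelog"

git checkout -q master
git rebase -q development                    # replay master's own commit on top of development
git log --oneline --graph                    # one straight line, no merge commit
```

After the rebase, development is an ancestor of master, so the history reads as if the hotfix had been written after the changelog update.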

Revert a commit

Say we have committed a bunch of nonsense to one of our files, and that we need to revert.

To undo the last commit, we can use git revert HEAD:

❯ echo "asdhkjashdlkjashdkljhaskljhdlkajshd" >> file2
❯ gc -am "accidental nonsense"
[master e65c17c] accidental nonsense
 1 file changed, 1 insertion(+)
❯ cat file2
A test
asdhkjashdlkjashdkljhaskljhdlkajshd
❯ git revert HEAD
❯ git log --oneline --decorate
40356d3 (HEAD -> master) Revert "accidental nonsense"
e65c17c accidental nonsense
❯ cat file2
A test

We can also jump back additional commits, counting back from HEAD. For example:

> git log --oneline --decorate
40356d3 (HEAD -> master) Revert "accidental nonsense"      [0]   
e65c17c accidental nonsense                                 1  
03ae641 Revert "some stuff in t30"                          2
7c851c8 Revert "important info - walked on the keyboard"    3
e261e7c important info                                      4 <---
877bac3 some stuff in t30                                   5
bb6a75a some test files
15ec51e Merge branch 'development'
d589b0c info.txt
6e25d8c updated some things in changelog
> git revert HEAD~4

This way, we are reverting commit e261e7c (careful when choosing where to revert).

Note that this command will create a brand new commit on the branch that we are on and this commit will contain information that reverses the changes. Understand that we are not going back in time with one commit – We are creating a new one that will revert the previous commits. This way our historical record retains its integrity while removing any mistakes.
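
A non-interactive sketch of the same idea. The --no-edit option accepts the default revert message, so no editor opens; the scratch repository is an assumption.

```shell
set -e
work="$(mktemp -d)"
git init -q "$work/demo"
cd "$work/demo"
git config user.name "Peter Parker"
git config user.email "pparker@dbugle.com"
echo "A test" > file2
git add file2
git commit -q -m "added file2"
echo "asdhkjashdlkjashdkljhaskljhdlkajshd" >> file2
git commit -q -a -m "accidental nonsense"

git revert --no-edit HEAD        # adds a new commit that undoes the last one
cat file2                        # the nonsense line is gone
git log --pretty=format:%s       # the bad commit is still in the history, followed by its revert
```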

Using the diff Command

Very useful to see differences between commits.

Run git diff. Nothing is shown, because we haven’t changed anything yet. Let’s add a file, commit it, and then modify it:

❯ echo "This is some new information. We need to do the following: " >> newInfo.txt
❯ git add newInfo.txt
❯ git commit -am "newinfo.txt"
❯ vim newInfo.txt   (Adding new entries here)
> git diff
diff --git a/newInfo.txt b/newInfo.txt
index 08a810e..c9efb9b 100644
--- a/newInfo.txt
+++ b/newInfo.txt
@@ -1 +1,2 @@
 This is some new information. We need to do the following:
+Adding some additional stuff

The output shows that we added the line “Adding some additional stuff”, indicated by the + sign.

Let’s see what changed from our initial commit until commit 4

> git log --oneline --decorate 
126b230 (HEAD -> master) newinfo.txt
9c1245c Revert "Commit 2"
a5d531e Commit 5
4701393 Commit 4
d2a2f36 Commit 3
29eb5f1 Commit 2
7634f1b Commit 1
> git diff 7634f1b 4701393
diff --git a/commits b/commits
index 5852f44..682ff4d 100644
--- a/commits
+++ b/commits
@@ -1 +1,4 @@
 Initial commit
+Second commit
+Third commit
+Fourth commit

To see a short summary instead, add the --summary option, which lists the files that were created, deleted or renamed (and mode changes) between these commits.
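
The different forms of git diff can be compared side by side in a scratch repository. The file names commits and extra and the commit messages are assumptions loosely modelled on the output above.

```shell
set -e
work="$(mktemp -d)"
git init -q "$work/demo"
cd "$work/demo"
git config user.name "Peter Parker"
git config user.email "pparker@dbugle.com"
echo "Initial commit" > commits
git add commits
git commit -q -m "Commit 1"
echo "Second commit" >> commits
git commit -q -a -m "Commit 2"
touch extra
git add extra
git commit -q -m "Commit 3"
echo "Uncommitted line" >> commits

git diff                          # working directory vs. staging area
git diff HEAD~2 HEAD              # everything that changed between two commits
git diff --stat HEAD~2 HEAD       # per-file change counts
git diff --summary HEAD~2 HEAD    # created/deleted files and mode changes
```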

How Garbage Collection Works

As our project grows, we need to do some maintenance. Failed tasks, deleted branches and rewritten commits can leave orphaned (unreachable) objects behind in the database.

The git garbage collection command (git gc) removes old objects that are no longer reachable from any branch, tag or other reference, and compresses the contents of the .git directory to save space. Let’s have a look at the --prune option:

❯ git gc --prune
Enumerating objects: 21, done.
Counting objects: 100% (21/21), done.
Delta compression using up to 8 threads
Compressing objects: 100% (12/12), done.
Writing objects: 100% (21/21), done.
Total 21 (delta 1), reused 0 (delta 0), pack-reused 0

This removes unnecessary files and compacts the database for better performance.

You can also use the --auto option, which first checks whether the repo needs cleaning and only then does the work

> git gc --auto

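You can also configure how old an unreachable object must be before the garbage collector prunes it:

> git config gc.pruneExpire "30 days"

With this setting, git gc only prunes unreachable objects that are more than 30 days old (the default is 2 weeks). It does not schedule the garbage collector; git runs git gc --auto on its own after certain commands.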
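
The effect of git gc can be observed with git count-objects, which reports loose and packed objects. A rough sketch in a scratch repository (file name and commit messages are assumptions):

```shell
set -e
work="$(mktemp -d)"
git init -q "$work/demo"
cd "$work/demo"
git config user.name "Peter Parker"
git config user.email "pparker@dbugle.com"
for i in 1 2 3; do
  echo "line $i" >> notes.txt
  git add notes.txt
  git commit -q -m "Commit $i"
done

git count-objects -v             # before: objects are stored as loose files
git gc -q --prune=now            # pack reachable objects, prune unreachable ones of any age
git count-objects -v             # after: the objects live in a pack file

git config gc.pruneExpire "30 days"   # age threshold for pruning in this repo, not a schedule
git config gc.pruneExpire
```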

Git’s Logs and Auditing

Using Git’s logs

Apart from git log, we can use a few additional tools:

git log --graph – draws a text graph of the commit history next to the log, showing how branches and merges are linked

We can also collect logs from a given period, for example: git log --since="4 days ago"

There is also a handy method to search for a specific string: git log -S build lists the commits whose changes add or remove an occurrence of the keyword build.

We can also view some basic statistics: git log --stat

There is also a --shortstat option

Oneliners:

  • git log --pretty=oneline --abbrev-commit
  • git log --oneline

We can also provide our own format: git log --pretty=format:"%h - %an - %ar - %s"

This log format will provide:

%h – Hash

%an – Author name

%ar – Relative time of commit by the author

%s – Subject line

> git log --pretty=format:"%h - %an - %ar - %s"
126b230 - Calvin Hobbes - 44 minutes ago - newinfo.txt
9c1245c - Calvin Hobbes - 50 minutes ago - Revert "Commit 2"
a5d531e - Calvin Hobbes - 51 minutes ago - Commit 5
4701393 - Calvin Hobbes - 51 minutes ago - Commit 4
d2a2f36 - Calvin Hobbes - 52 minutes ago - Commit 3
29eb5f1 - Calvin Hobbes - 52 minutes ago - Commit 2
7634f1b - Calvin Hobbes - 53 minutes ago - Commit 1

View last 3 commits with git log -3
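
A few of these log options combined in one sketch. The file notes.txt and its contents are assumptions chosen so that only one commit introduces the word build.

```shell
set -e
work="$(mktemp -d)"
git init -q "$work/demo"
cd "$work/demo"
git config user.name "Peter Parker"
git config user.email "pparker@dbugle.com"
echo "hello" > notes.txt
git add notes.txt
git commit -q -m "Commit 1"
echo "build step" >> notes.txt
git commit -q -a -m "Commit 2"
echo "more" >> notes.txt
git commit -q -a -m "Commit 3"

git log --pretty=format:"%h - %an - %ar - %s"    # custom one-line format
git log -2 --oneline                             # only the last two commits
git log -S build --pretty=format:%s              # commits that add or remove the string "build"
git log --since="4 days ago" --oneline --graph   # recent commits with the history graph
```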

Cloning Repositories

Cloning Local Repositories

Why: When I want to try something risky, such as reverting a commit, without messing up the original repository.

git clone <local repo> <new repo>. For example:

> ls -l
drwxr-xr-x    5 pparker  pparker May 17 22:19 playground

> git clone playground playground-test

> ls -l
drwxr-xr-x    5 pparker  pparker May 17 22:19 playground
drwxr-xr-x    5 pparker  pparker May 17 23:20 playground-test

Overall, it’s a good safety net.
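
A sketch of a local clone, assuming a scratch playground repository with a single commit:

```shell
set -e
work="$(mktemp -d)"
cd "$work"
git init -q playground
git -C playground config user.name "Peter Parker"
git -C playground config user.email "pparker@dbugle.com"
echo "A test" > playground/file1
git -C playground add file1
git -C playground commit -q -m "first commit"

git clone -q playground playground-test      # full copy, history included
git -C playground-test remote -v             # origin points back at the original directory
git -C playground-test log --oneline         # same history as the original
```

Experiments in playground-test do not touch playground unless they are pushed back to it.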

Cloning Remote Repositories over HTTPS

Why: To download source code for an app that may be available, or if you want to contribute to a project

From the Terminal: git clone <url> (for example Netdata)

Forking

Why: Use a fork to update a project, or use a fork to start a new one based on another project

Push, Pull and Tracking Remote Repositories

Tracking Remote Repositories

After cloning a remote repository, run git remote

> git remote
origin

origin is just an alias that refers to the URL where the project came from.

We can view the real URL by passing the -v option:

> git remote -v
origin	https://github.com/pparker/apex-lemons.git (fetch)
origin	https://github.com/pparker/apex-lemons.git (push)

Some more detailed information:

> git remote show origin

* remote origin
  Fetch URL: https://github.com/pparker/apex-lemons.git
  Push  URL: https://github.com/pparker/apex-lemons.git
  HEAD branch: master
  Remote branch:
    master tracked
  Local branch configured for 'git pull':
    master merges with remote master
  Local ref configured for 'git push':
    master pushes to master (up to date)

Pushing to Remote Repositories

It is considered bad form to push something straight to the master branch of a project that you do not control. Instead, create a branch for your changes and push that branch:

> git branch readme-updates
> git checkout readme-updates
> echo "Be sure to have your SSH key set up and that your public key is added to your GitHub account" >> README.md
> git commit -a -m "Updated README to include info on SSH key"
> git push -u origin readme-updates

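The push workflow can be rehearsed offline. In the sketch below a local bare repository (remote.git) stands in for the hosted one; that stand-in, the upstream seed repository and the directory names are assumptions of this example.

```shell
set -e
work="$(mktemp -d)"
cd "$work"
git init -q upstream
git -C upstream config user.name "Peter Parker"
git -C upstream config user.email "pparker@dbugle.com"
echo "# apex-lemons" > upstream/README.md
git -C upstream add README.md
git -C upstream commit -q -m "initial commit"
git clone -q --bare upstream remote.git      # stand-in for the hosted repository

git clone -q remote.git work-copy
cd work-copy
git config user.name "Peter Parker"
git config user.email "pparker@dbugle.com"
git checkout -q -b readme-updates            # create the branch and switch to it
echo "Be sure to have your SSH key set up" >> README.md
git commit -q -a -m "Updated README to include info on SSH key"
git push -q -u origin readme-updates         # -u records origin/readme-updates as the upstream
git branch -vv                               # shows the tracking relationship
```
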
CheatSheet

.git/:

COMMIT_EDITMSG – Plain file with the most recent commit message

HEAD – Contains reference to current branch

config – Configuration information about our repository

description – Contains name of the repository

hooks – Contains scripts that you can add to set up automation tasks

index – Keeps track of files in the staging area, i.e. files that are ready to be committed

info – Contains items such as the exclude file (a local, unshared gitignore)

logs – Log files

objects – Mainly our database

refs – Reference files for branches and tags of the repository

Tags:

git tag -a <tagName> -m <message> – Create an annotated tag

git tag – View all tags in history

git tag <tagName> – Create a lightweight tag (without the -a and -m options)

git tag -d <tagName> – Delete a specific tag

man git-tag – Local documentation for the git tag command

Branches:

git branch <branchName> – Creates a new branch of the project at the current commit

git checkout <branchName> – Switches to another branch

HEAD – Pointer to the current branch being worked on, can use git log and git status to view which branch HEAD is pointing to

man git-branch – Local documentation for the git branch command

man git-checkout – Local documentation for the git-checkout command

git merge <branch> – Merges the given branch into the current branch

git branch -d <branch> – Deletes the specified branch

man git-merge – Local documentation on using the command

Revert commits:

git revert HEAD – Revert last commit

git revert HEAD~3 – Revert the commit 3 steps before HEAD

Diff:

git diff – Shows unstaged changes in the working directory

git diff HEAD^ HEAD – Shows the differences between the previous commit and the last commit

git diff <commit> <commit> – Shows all differences between these two commits

git diff --summary <commit> <commit> – Shows created, deleted and renamed files and mode (permission) changes between commits