Moving code around with branches
This is the first of a set of posts supporting a workshop on Git for Beginners. The main source for the workshop is the set of slides I've prepared.
What is Git?
Git is a distributed version control system. Which probably means nothing.
A version control system is a system that keeps track of changes in a bunch of files, such as a programming project.
The distributed part means that it works for teams, with everyone having a copy of the project. With git, everyone is on an equal footing; there are no bosses.
GitHub is just a website that makes sharing projects easy.
You can install the latest version of Git from the Git website. If you're using Linux or a Mac, you're probably best installing Git from the standard software sources. If you're using Windows, Git for Windows is a good choice to get started (and these workshops will be easier to follow if you install "Git Bash").
GitHub Desktop is a good front-end for using GitHub, so long as you're only doing straightforward things; as soon as you want to get complicated, you'll need to use a different tool.
You can do a lot of things with the Git-gui tool, but the command line is the most powerful and flexible way of using git. Once you know the concepts of git, picking up a GUI tool should be fairly simple.
Repositories and commits
Concept: the repository
A repository is everything in a project that Git knows about. It's also the complete history of everything in the project. Nothing is ever removed from that history, so you're always able to go back to a previous version. This means that mistakes aren't as bad as they could be.
Commits are cheap: git stores each unchanged file only once and compresses what it keeps, so each commit doesn't take up much space. If in doubt, include things in the repository. Include:
- configuration files
- database schemas
Don't include things like:
- secret information (API keys, passwords)
- automatically-generated files (compiled or minified files, pulled-in dependencies, etc.)
A .gitignore file will look after some of that for you, but that's outside the scope of this workshop.
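If you're curious what that looks like, here is a minimal sketch. It assumes a throwaway directory and made-up file names (secrets.env, build/, app.min.js and schema.sql are all hypothetical), and it only shows the idea: files matching a pattern in .gitignore never show up as candidates for the repository.

```shell
set -e
# Hypothetical scratch project; every file name here is invented for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Patterns git should leave out of the repository.
printf '%s\n' 'secrets.env' 'build/' '*.min.js' > .gitignore

# One file worth keeping, plus three that match the patterns above.
mkdir build
echo 'CREATE TABLE books (id INTEGER);' > schema.sql
echo 'API_KEY=not-a-real-key' > secrets.env
echo 'compiled output' > build/app.bin
echo 'minified' > app.min.js

# Only .gitignore and schema.sql are reported as untracked.
visible=$(git status --porcelain)
echo "$visible"
```

The .gitignore file itself is an ordinary file, so it gets added and committed like anything else.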
To show git in action, we'll take on the "project" of copying some old novels. Pick your favourite two novels from Project Gutenberg (I'm using Frankenstein by Mary Shelley and Carmilla by J. Sheridan Le Fanu).
Making a repository
    $ mkdir yourname-git-workshop
    $ cd yourname-git-workshop
    $ git init
    $ ls -a
    .  ..  .git

The .git directory is where git stores all the history. You are unlikely to need to look inside it.
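If you ever want to check whether a directory is already a repository without poking around in .git, git can tell you. This is a small sketch using a temporary directory; it assumes that temporary directory isn't itself inside some other repository.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"

# Before "git init", git does not recognise the directory as a repository.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    before=yes
else
    before=no
fi

git init -q

# Afterwards it does, and it reports where the history is kept.
after=$(git rev-parse --is-inside-work-tree)
gitdir=$(git rev-parse --git-dir)
echo "before: $before, after: $after, history in: $gitdir"
```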
Including some files
- For each novel, create one text file (name it something with all lower case, no spaces, and ending in .txt)
- In each file, put the first couple of sentences of each novel.
- Add the marker "# Commit 1" before the chunk of text.
- Save the files.
The files should look like these:
    # Frankenstein, by Mary Shelley

    > Text from https://www.gutenberg.org/files/84/84-0.txt

    # Commit 1

    ## Letter 1

    _To Mrs. Saville, England._

    St. Petersburgh, Dec. 11th, 17—.

    You will rejoice to hear that no disaster has accompanied the commencement of an enterprise which you have regarded with such evil forebodings. I arrived here yesterday, and my first task is to assure my dear sister of my welfare and increasing confidence in the success of my undertaking.


    # Carmilla, by J. Sheridan Le Fanu

    > Text from https://www.gutenberg.org/cache/epub/10007/pg10007.txt

    # Commit 1

    In Styria, we, though by no means magnificent people, inhabit a castle, or schloss. A small income, in that part of the world, goes a great way. Eight or nine hundred a year does wonders. Scantily enough ours would have answered among wealthy people at home. My father is English, and I bear an English name, although I never saw England. But here, in this lonely and primitive place, where everything is so marvelously cheap, I really don't see how ever so much more money would at all materially add to our comforts, or even luxuries.
We're now ready to include these files in the repository. This means we need to understand commits.
Concept: a commit
A commit is a snapshot of a project. It contains all the files in all the directories in the project. It's taken at a particular moment in time, and exists forever in the project's history. Each commit knows its parent commit, and that allows you to go back through the entire history of a project.
Each commit has a unique key (such as 1dbb1a9). You can always refer to another commit, get files from that commit, or even rewind history to a commit.
Commits are cheap to make, as a file that hasn't changed since the parent commit isn't stored again. So commit early and often, so that you don't lose too much work when you need to recover from a mistake.
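To make the idea of keys and parents concrete, here is a rough sketch in a scratch repository. The file name novel.txt and the demo identity are placeholders of mine, not part of the workshop project. It makes two commits, then checks that the parent of the second is the first and pulls the old version of the file straight out of the first commit.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Workshop Demo"       # placeholder identity for the demo
git config user.email "demo@example.com"

echo 'Commit 1' > novel.txt
git add novel.txt
git commit -q -m "First commit"
first=$(git rev-parse HEAD)                # full key of the first commit

echo 'Commit 2' >> novel.txt
git add novel.txt
git commit -q -m "Second commit"
second=$(git rev-parse HEAD)               # full key of the second commit

# The second commit records the first one as its parent.
parent=$(git rev-parse HEAD~1)
echo "second: $second"
echo "parent: $parent"

# Read the file exactly as it was in the first commit, without changing anything.
old=$(git show "$first:novel.txt")
echo "novel.txt in the first commit: $old"
```

The short keys you see in git log, like 1dbb1a9, are just the first few characters of these full keys.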
Making a commit
With that understanding, let's make our first commit. The command git status is your friend and will tell you a lot about what git thinks is going on.
    $ git status
    On branch master

    No commits yet

    Untracked files:
      (use "git add <file>..." to include in what will be committed)
            carmilla.txt
            frankenstein.txt

    nothing added to commit but untracked files present (use "git add" to track)
This tells us git has found two files it could include in the repository, but we've not yet asked it to keep an eye on them.
We ask git to include these files.
    $ git add --all
    $ git status
    On branch master

    No commits yet

    Changes to be committed:
      (use "git rm --cached <file>..." to unstage)
            new file:   carmilla.txt
            new file:   frankenstein.txt
Now we commit these files to the repository, and git will forever remember them.
    $ git commit -m "First commit"
    [master (root-commit) 0a411f8] First commit
     2 files changed, 33 insertions(+)
     create mode 100644 carmilla.txt
     create mode 100644 frankenstein.txt
    $ git status
    On branch master
    nothing to commit, working tree clean
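Once the long form of git status makes sense, the short form is handy for a quick look. The sketch below walks one file through the same three stages in a scratch repository (the demo identity is a placeholder) and captures the short status at each step.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Workshop Demo"       # placeholder identity for the demo
git config user.email "demo@example.com"

echo 'In Styria, we inhabit a castle.' > carmilla.txt
untracked=$(git status --short)    # "??" means git sees the file but isn't tracking it

git add carmilla.txt
staged=$(git status --short)       # "A " means a new file is staged for the next commit

git commit -q -m "First commit"
clean=$(git status --short)        # nothing printed: working tree clean

echo "untracked: $untracked"
echo "staged:    $staged"
echo "clean:     [$clean]"
```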
We'll now add two more commits.
- In each file, add another paragraph
- Head the paragraph with the "# Commit 2" heading (you'll need these headings later)
- Add and commit the changed files
- Add a third paragraph to each file
- Add and commit the changes again
    # Frankenstein, by Mary Shelley

    # Commit 1

    You will rejoice to hear that no disaster has accompanied ...

    # Commit 2

    I am already far north of London, and as I walk in the streets ...

    # Commit 3

    I try in vain to be persuaded that the pole is the seat of ...
git status should show "working tree clean".
git log should show the three commits you have made, with an identifying code for each one (your codes will differ).
    $ git log --oneline
    7b2f7e5 Third commit
    707372d Second commit
    0a411f8 First commit
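If you'd like to double-check the history without reading codes, git can count the commits and list just the messages. This is a sketch against a scratch repository with a stand-in file (novel.txt is hypothetical), building the same three-commit history in a loop.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Workshop Demo"       # placeholder identity for the demo
git config user.email "demo@example.com"

# Three commits, each appending one marked paragraph to the same file.
n=1
for name in First Second Third; do
    printf '# Commit %s\nparagraph %s\n' "$n" "$n" >> novel.txt
    git add novel.txt
    git commit -q -m "$name commit"
    n=$((n + 1))
done

count=$(git rev-list --count HEAD)   # how many commits lead up to HEAD
subjects=$(git log --format=%s)      # commit messages only, newest first
echo "$count commits:"
echo "$subjects"
```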
Making, and fixing, mistakes
In one of the files, change every 'e' to '!'. Save the file.
This is a mistake.
    # Frank!nst!in, by Mary Sh!ll!y

    > T!xt from https://www.gut!nb!rg.org/fil!s/84/84-0.txt

    # Commit 1

    ## L!tt!r 1

    _To Mrs. Savill!, !ngland._

    St. P!t!rsburgh, D!c. 11th, 17—.

    You will r!joic! to h!ar that no disast!r has accompani!d th! comm!nc!m!nt of an !nt!rpris! which you hav! r!gard!d with such !vil ...
We can now use git to recover from this mistake.
git status gives us a clue:
    $ git status
    On branch master
    Changes not staged for commit:
      (use "git add <file>..." to update what will be committed)
      (use "git restore <file>..." to discard changes in working directory)
            modified:   frankenstein.txt

    no changes added to commit (use "git add" and/or "git commit -a")

The restore command might do what we want. Try it!

    $ git restore frankenstein.txt
    $ git status
    On branch master
    nothing to commit, working tree clean
Now look at the file. The changes have been reversed.
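If you want to replay the whole accident in one go, the sketch below does it in a scratch repository. It assumes git 2.23 or later (for the restore command), uses tr to do the search-and-replace, and uses a shortened stand-in for the novel text.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Workshop Demo"       # placeholder identity for the demo
git config user.email "demo@example.com"

printf '%s\n' '# Commit 1' 'You will rejoice to hear that no disaster has accompanied' > frankenstein.txt
git add frankenstein.txt
git commit -q -m "First commit"
good=$(cat frankenstein.txt)

# The mistake: swap every 'e' for '!' and save over the original file.
tr 'e' '!' < frankenstein.txt > damaged.tmp
mv damaged.tmp frankenstein.txt
damaged=$(cat frankenstein.txt)
git --no-pager diff --stat                 # git notices that frankenstein.txt changed

# The fix: put back the version git already has.
git restore frankenstein.txt
restored=$(cat frankenstein.txt)
echo "$restored"
```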
But what happened?
Concept: three trees
Git knows of three places where data sits: the working directory, the Index, and the commits (the current one is called HEAD).
- your working directory is what's on your local machine, independent of git
- the Index is what will go into the next commit
- HEAD is the most recent commit Git knows about
(The Index is separate so that, if you make many changes at once, you can bundle some changes into one commit and others into another commit. But it's also very confusing.)
The add command puts files in the Index (also known as staging the files). The commit command creates a new commit from the Index (and updates HEAD). The restore command, by default, takes things from the Index and puts them in the working directory.
That's what happened above. The good version of the file was in the Index, and the bad version was in the working directory.
git restore took the good version from the Index and replaced the bad version.
- git restore file.txt copies the file in the Index to the working directory
- git restore --staged file.txt copies the file in HEAD to the Index
- git restore --staged --worktree --source=HEAD file.txt copies the file in HEAD into both the Index and the working directory
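The three trees are easier to believe when each one holds something different. Here is a sketch, again in a scratch repository and assuming git 2.23 or later, that puts "version 1" in HEAD, "version 2" in the Index and "version 3" in the working directory, then runs each form of restore and looks at what moved. (The ":file.txt" notation for reading the Index copy is standard git, but it isn't something the workshop relies on.)

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Workshop Demo"       # placeholder identity for the demo
git config user.email "demo@example.com"

echo 'version 1' > file.txt
git add file.txt
git commit -q -m "First commit"            # all three trees hold version 1

echo 'version 2' > file.txt
git add file.txt                           # the Index now holds version 2
echo 'version 3' > file.txt                # the working directory holds version 3

in_head=$(git show HEAD:file.txt)          # version 1
in_index=$(git show :file.txt)             # version 2
in_work=$(cat file.txt)                    # version 3

git restore file.txt                       # Index -> working directory
work_after_plain=$(cat file.txt)           # version 2

git restore --staged file.txt              # HEAD -> Index, working directory untouched
index_after_staged=$(git show :file.txt)   # version 1
work_after_staged=$(cat file.txt)          # still version 2

git restore --staged --worktree --source=HEAD file.txt   # HEAD -> both
work_final=$(cat file.txt)                 # version 1
echo "HEAD: $in_head, Index: $in_index, working directory: $in_work"
```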
A note on versions and commands. Before git version 2.23, the checkout command was used to restore files. Confusingly, checkout was also used to switch between branches, and the syntax for both uses was itself confusing. In version 2.23, the work of checkout was split between the two commands restore and switch. But many tutorials online were written using the old commands, so you may see lots of references to checkout being used to do what restore and switch now do.
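So that older tutorials are less of a puzzle, here is a sketch showing the old and new spellings of "throw away my changes to this file" side by side. It uses a scratch repository and a hypothetical file.txt, and assumes git 2.23 or later so that both forms are available.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Workshop Demo"       # placeholder identity for the demo
git config user.email "demo@example.com"

echo 'good text' > file.txt
git add file.txt
git commit -q -m "First commit"

# New style (git 2.23 and later).
echo 'oops' > file.txt
git restore file.txt
new_style=$(cat file.txt)

# Old style, as seen in older tutorials. The "--" tells checkout that
# what follows is a file name and not a branch name.
echo 'oops' > file.txt
git checkout -- file.txt
old_style=$(cat file.txt)

echo "new style gives: $new_style"
echo "old style gives: $old_style"
```

In the same way, older tutorials tend to write git reset HEAD file.txt where this post uses git restore --staged file.txt.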
Fixing bigger mistakes
Let's make a mistake and commit it.
$ git status
- Replace all the
$ git status
$ git add frankenstein.txt
$ git status
$ git commit -m "No more vowels"
$ git status
The status of the three trees looks like this:
What have we done? How do we get the good file back?
Hint: HEAD~1 (which can be shortened to HEAD~) means "the parent of HEAD"
That means we can get the good file from an earlier commit.
$ git restore --source=HEAD~ file.txt
will get the file back. We can now continue to work on it, and commit it, as we would any other file.
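Here is the whole rescue as one sketch you can run in a scratch repository. It assumes git 2.23 or later, and I've assumed the mistake was deleting the lower-case vowels with tr; any other damage would be recovered the same way. The final commit message is my own invention.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Workshop Demo"       # placeholder identity for the demo
git config user.email "demo@example.com"

printf '%s\n' '# Commit 1' 'I arrived here yesterday' > frankenstein.txt
git add frankenstein.txt
git commit -q -m "First commit"
good=$(cat frankenstein.txt)

# Make the mistake and commit it (assumption: the vowels are simply deleted).
tr -d 'aeiou' < frankenstein.txt > damaged.tmp
mv damaged.tmp frankenstein.txt
git add frankenstein.txt
git commit -q -m "No more vowels"

# HEAD and the Index now both hold the bad version, so a plain
# "git restore" would not help. Reach back to the parent of HEAD instead.
git restore --source=HEAD~ frankenstein.txt
recovered=$(cat frankenstein.txt)

# The good text is back in the working directory; commit it like any other change.
git add frankenstein.txt
git commit -q -m "Put the vowels back"
git --no-pager log --oneline
```

Notice that the bad commit is still in the history. Nothing was erased; a new commit simply put the good text back on top.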
End of the first part
That's enough for the first post. It's covered several concepts, including:
- three trees
- The commands