JochemKuijpers.nl Personal blog and portfolio

How to share your code through GitHub

github, git, programming

I was recently asked how to properly share a piece of code. I don't think there's one proper way, but there are certainly less convenient ways of sharing your code.

Small disclaimer: I'm largely self-taught in the use of git. I've used version control software (including git) during my Computer Science courses, and I often use git for my personal projects. I think I have a pretty good idea of how git works, good enough to explain the basics. Note that this is merely an introduction. I will not go into details about branches or other advanced operations.

I will start by explaining the terminology, then I will give a few examples on how to use git and GitHub.

So what is git, and what is GitHub?

Let's ignore GitHub for now, we'll first have a look at what problems git solves for us.

What problem does git solve?

When working on a program, you sometimes want to restore a previous version of your code, because you tried to fix something but accidentally broke a lot of other code. Without version control this is not possible: every change overwrites your old code files and you end up rewriting it all from memory.

Another problem you may encounter when writing software is that multiple programmers may be working on the same files. Say you store your code on a shared network folder: you could then overwrite changes another programmer made, just because you edited a small line of code somewhere else in the file.

There are all kinds of issues with multiple people trying to work on the same set of code files. Beginning programmers often move towards Dropbox, Google Drive or similar file sharing services, but may notice errors or duplicate files whenever two people edit the same file.

Git solves all of these problems by giving each programmer access to their own set of code files to edit. Git then records the changes made to these files, and applies them to the files of other programmers, though git doesn't do this automatically. The record of your current and all previous versions of your software is called a repository.

Index

When you're making changes to your local files, it is important that git knows which files are important to track. You need to specify to git which files it needs to track by adding them to the git index. You can do this at any point in time, even mid-project, but keep in mind that any file not in the index will not be tracked by git and that file will not be shared with the other programmers on your project.

Files you don't necessarily want to track in git are files that differ for each programmer, for example configuration files that hold absolute paths. Often, you also want to keep database passwords and such out of the git index. A good alternative is a text file (often called README.md) which describes that these files need to be set up manually by anyone using the code in this repository. You can prevent configuration files or other unwanted files from ending up in your git index by adding them to a file called .gitignore ('dot-git-ignore', indeed). More on that later.

Committing

Once you're done making changes, it is time to record these changes and group the changes in all the files together. This is called committing. This is where you write a small summary (called a commit message) of your changes so everyone knows why you made them.

The word 'commit' should be seen the way it's used in the sentence: "to commit to an idea". In other words, you are determined to share the current state of your code with other programmers. Implicitly, this means you agree that this is a good change to the source code. You commit to this change, so to speak. Often, a team will only accept working and compiling code. If your code does not compile, generates errors or does not work correctly yet, it's not time to commit your changes.

Exceptions to this rule may occur when setting up a new project. It may be helpful in the very beginning to set up a skeleton project that does not necessarily compile, but allows multiple programmers to start working on getting the initial working version.

When you commit, your changes are grouped into one big report. This report of changes, including a changelog and your name and e-mail address (the author of the commit), is called a commit. This commit is stored in your local repository.

Wait..? Aren't we sending our changes to other programmers? Why is it stored in our local repository? Read on.. :)

Pushing

This is where remotes and GitHub come in..

Every programmer on your team has their own local repository, tracking their own local changes and keeping a record of all commits ever made to that repository. To share and use each other's commits, you need some way to get your local commits to the repository of another programmer, and vice versa. You can do this by pushing your changes to a remote repository. Pushing here means uploading.

A remote is basically a normal repository, but accessible to everyone on your team and sometimes to the rest of the world as well (if the repository is a public one). Your repository may be a remote of someone else's repository. Often though, GitHub hosts the remote repository for all programmers. They all push to GitHub and pull from GitHub. This makes the GitHub repository a centralized repository. You could just as easily use one of your team members' repositories as a remote. GitHub is just an easy solution which, in addition, has a lot of nice features on their website (such as issue tracking, wikis, etc.).

There's an important issue here: What if your commit is not compatible with a commit made by someone else? For example, say you added a line of code to the function doSomething() but before your commit got to the remote repository, someone else removed the entire function and pushed their commit first. Now your commit can no longer be applied to the remote repository as-is, because commits were made there that you've never applied to your local repository. This is when your push gets rejected.
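To see a rejected push happen without touching GitHub, here is a minimal sketch. It assumes a local bare repository standing in for the GitHub repository, and two made-up programmers, alice and bob, each with their own clone. The file name and the identities are placeholders.

```shell
# Sketch only: remote.git stands in for GitHub; alice and bob are made up.
cd "$(mktemp -d)"
git init -q --bare remote.git

# Alice creates the first commit and pushes it.
git clone -q remote.git alice
cd alice
git config user.name "Alice"; git config user.email "alice@example.com"
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes.txt"
git push -q -u origin HEAD        # HEAD means "the current branch"
cd ..

# Bob clones, so both programmers start from the same history.
git clone -q remote.git bob

# Alice pushes a second commit.
cd alice
echo "alice was here" >> notes.txt
git commit -q -am "Alice edits notes.txt"
git push -q
cd ..

# Bob commits without pulling first, then tries to push.
cd bob
git config user.name "Bob"; git config user.email "bob@example.com"
echo "bob was here" >> notes.txt
git commit -q -am "Bob edits notes.txt"
if git push -q 2>/dev/null; then
    push_result="accepted"
else
    push_result="rejected"
fi
echo "Bob's push was $push_result"
```

The remote contains a commit by Alice that Bob's local repository has never seen, so git refuses Bob's push until he pulls.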

Pulling and merge conflict resolving

Just like we can push, we can also pull changes from the remote repository. This is how other programmers would get your commits to their repositories: they would pull from the remote repository after you pushed your commits there. Pulling means downloading here.

Let's stick with our scenario from the last section, where you wanted to push some commits, but got rejected. Remember how we will only commit working code (see the commit section)? We will have to fix this with another commit before we're allowed to push to the remote repository again. So we will pull all the changes that have been recorded in the remote repository since the last time. These changes are incompatible with ours, so this will create a merge conflict. Git will indicate which files have merge conflicts and we will have to manually inspect which changes were made in our version and in the remote version of the file.

The commit messages of the remote commits will be available to us to figure out how to resolve this merge conflict. This is why it is important to write a good commit message.

Once you have resolved the merge conflict, you can commit these changes and try to push them to the remote repository again. Be sure to check your local code actually compiles before committing. We don't want to break the remote repository everyone depends on!
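The whole pull, resolve, commit and push cycle can be sketched as follows. As before, this assumes a local bare repository standing in for GitHub and two made-up clones, alice and bob, who both change the same line of a placeholder file. The --no-rebase option explicitly asks git to merge, which newer git versions want you to spell out when both sides have new commits.

```shell
# Sketch only: remote.git stands in for GitHub; alice and bob are made up.
cd "$(mktemp -d)"
git init -q --bare remote.git
git clone -q remote.git alice
cd alice
git config user.name "Alice"; git config user.email "alice@example.com"
echo "original line" > notes.txt
git add notes.txt
git commit -q -m "Add notes.txt"
git push -q -u origin HEAD
cd ..
git clone -q remote.git bob

# Alice changes the line and pushes first.
cd alice
echo "alice's version" > notes.txt
git commit -q -am "Reword the note (Alice)"
git push -q
cd ..

# Bob changes the same line; his push would be rejected, so he pulls.
cd bob
git config user.name "Bob"; git config user.email "bob@example.com"
echo "bob's version" > notes.txt
git commit -q -am "Reword the note (Bob)"
git pull -q --no-rebase || echo "merge conflict, resolve by hand"

# The conflicted file is listed here; inside the file, both versions
# sit between <<<<<<< and >>>>>>> markers.
git status --short

# Resolve by writing the content we actually want, then commit and push.
echo "alice's and bob's version" > notes.txt
git add notes.txt
git commit -q -m "Merge remote changes into notes.txt"
git push -q
```

In a real project you would open the conflicted file in your editor and decide line by line instead of overwriting it, but the commands around that step stay the same.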

Now that the remote repository contains your changes, other programmers can pull the remote repository to their own local repositories. This allows other programmers to work with our changes. In the same way, you should pull every time you start working on a new feature or bug, to ensure you are working with the latest version of the code.

Cloning

We have yet to discuss a small, but important, part of git. Sometimes, a new programmer joins the team. To set up their local repository, they can simply clone the remote repository. This means all previous changes to the repository are now also on the machine of this new programmer. It's a simple command, but it's important to know. This is how you can participate in other programmers' projects!

If you understand the above text, then you understand the basic concepts of git. Now take a look at the examples below and see if you can execute them yourself.

Example usage of git

Below are some examples on how to use git. I assume you have installed the command line program git. You can use GUI clients as well, which do largely the same thing, but as a programmer, a command line shouldn't scare you.

To name a few git GUIs you can use:

  • Git GUI (the download contains both command line and a graphical interface for git)
  • TortoiseGit (shell integration for Windows)
  • GitHub Desktop (streamlined but somewhat limited interface by GitHub)

If you're confident and you want to use the command line, be sure to navigate to the repository root directory. This is also called the working directory. It's the directory that holds all your files and the .git folder.

As a side note: You may or may not see files and folders starting with a dot. Under Linux, these files are hidden by default, and on Windows the .git folder is usually marked as hidden as well. If you can't see these files, that doesn't mean they're not there. As a developer, you'll probably want to see them to give you absolute control over what is in your git repository, so you should look up how to achieve that with your file explorer.

Create a local repository

Creating a local repository is simple. It's one single command:

git init [directory]

This will create a git repository (and a folder named .git) in the directory you specified, or in the current directory if you did not specify a directory (note: directory means folder. It's the same thing).

The .git folder contains all the repository data. You should not touch it: it holds everything your git repository needs to function properly. The parent folder, which holds the .git folder, is called the working directory. This is where you put all your code and other project files. You can include documentation here as well, but GitHub has a nice feature called Wikis, which is better suited for shared documentation.
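As a quick sketch of what this looks like in practice (my-project is a made-up directory name, and the temporary directory is only there to keep the experiment out of your real files):

```shell
# Sketch only: "my-project" is a placeholder name.
cd "$(mktemp -d)"
git init my-project
cd my-project
ls -a          # lists the .git folder next to your (still empty) project
git status     # reports that there are no commits yet
```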

You can find the git-init documentation here.

Adding files to the index

When adding files to the working directory, they're not immediately tracked by git. As I described in the index section above, you may not necessarily want to track every file. Configuration files and such are often left out. Most of the time, you only want to share the code files.

Adding files in the working directory is done using the following command:

git add <path>

Note that <path> is not optional. If you specify a directory here, all files in the directory are added to the index. You can combine this with the fact that a single dot refers to the current directory, in order to arrive at this command: "git add ." (that's right: "git, add, dot"). When run from the repository root, this adds all files in the working directory to the index, including hidden files.
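Here is a small sketch of both forms, adding one file and adding everything. The file names are placeholders; git status --short prints one line per file, so you can watch files move from untracked (??) to added (A).

```shell
# Sketch only: file names are placeholders.
cd "$(mktemp -d)"
git init -q
echo "int main() { return 0; }" > helloworld.cpp
echo "scratch" > notes.txt

git add helloworld.cpp    # add a single file to the index
git status --short        # helloworld.cpp is added, notes.txt is untracked

git add .                 # add everything in the current directory
git status --short        # now both files are in the index
```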

You can find the git-add documentation here.

.gitignore

The one exception to git add is that files matching one or more patterns in the .gitignore file are skipped when adding to the index. If they're already in the index, adding a matching pattern to .gitignore will not remove them from the index.

For example, take the following files:

conf/database.ini
helloworld.cpp
build/executable.so

And take the following content of .gitignore:

conf
*.so

If we were to execute git add . on this working directory, the only code file added would be helloworld.cpp (plus the .gitignore file itself, which is normally tracked too), since the other files match one of the lines in the .gitignore file.
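You can try this example yourself with the sketch below. It recreates the three files with dummy content (the content is made up) and uses git ls-files to list what ended up in the index, and git check-ignore to ask git which .gitignore line matched a path.

```shell
# Sketch only: the file contents are dummies.
cd "$(mktemp -d)"
git init -q
mkdir conf build
echo "password=secret" > conf/database.ini
echo "int main() { return 0; }" > helloworld.cpp
echo "binary" > build/executable.so
printf 'conf\n*.so\n' > .gitignore

git add .
git ls-files     # lists the files that ended up in the index

# Ask git which .gitignore line is responsible for skipping each path.
git check-ignore -v conf/database.ini build/executable.so
```

Note that the listing also contains .gitignore itself. That is usually what you want, because it shares the ignore rules with the rest of the team.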

You can find the gitignore documentation here.

Committing changes

Now imagine you added some files, deleted some others, made changes to even more files. Now you want to commit them.

You can do that using the following command:

git commit --all --message="Your commit message goes here"

This is a pretty long command, and you probably don't want to type it all the time. So here's exactly the same command, but shorter:

git commit -am "Your commit message goes here"

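This takes all the changes to files that are already in the git index (including deletions) and creates a commit out of them, with the given commit message. Note that files you have never added are not picked up this way; use git add for those first.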
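A minimal sketch of the short form in action. The name and e-mail address are placeholders for the author information that git stores in each commit, and A.txt and B.txt are made-up files.

```shell
# Sketch only: identity and file names are placeholders.
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "version 1" > A.txt
git add A.txt
git commit -q -m "Add A.txt"

echo "version 2" > A.txt     # change a file that is already tracked
echo "new file" > B.txt      # create a file that was never added

git commit -q -am "Update A.txt"
git log --oneline            # shows both commits with their messages
git status --short           # B.txt is still untracked
```

The second commit contains the change to A.txt, but B.txt is left alone because it was never added to the index.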

Pushing to a remote repository

Let's push our previous commits to a remote repository.

There's a little bit of setup left to do if this is a repository that we created ourselves. First off, head to GitHub.com and log into your account. Now create a new repository.

GitHub should now show you the exact steps I'm about to tell you. So here's what you do:

Copy the HTTPS link GitHub provides. It looks like this: https://github.com/<username>/<repositoryname>.git

Now paste it into the following command:

git remote add origin <HTTPS link goes here>

Now that we've told git where the remote repository called origin can be found, all that's left to do is push our changes there:

git push -u origin master

This command pushes the local changes to the remote repository. I have not discussed branches, and I will not do that in this guide, but that's where the master part comes from: we're pushing the current changes to the master branch of the remote repository called origin. (Newer versions of git and GitHub may name the default branch main instead; git status tells you which branch you are on.) The -u part is only required when setting up. When the remote repository is already set up, you can leave it out and the full command becomes git push origin master.
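If you want to rehearse these steps before involving GitHub, the sketch below uses a local bare repository as a stand-in for the empty GitHub repository. That stand-in, the project directory and the identity are all assumptions for the sake of the example. It pushes HEAD, which means "the current branch", so it works whether your branch is called master or something else.

```shell
# Sketch only: remote.git stands in for the empty GitHub repository.
cd "$(mktemp -d)"
git init -q --bare remote.git
remote_url="$PWD/remote.git"     # on GitHub this would be the HTTPS link

git init -q project
cd project
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > A.txt
git add A.txt
git commit -q -m "Add A.txt"

git remote add origin "$remote_url"
git remote -v                    # lists origin for fetching and pushing
# With a branch called master this is the same as:
#   git push -u origin master
git push -q -u origin HEAD
git status -sb                   # shows the branch and what it tracks
```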

A side-note regarding remote repositories

A remote is added automatically to our local repository when we clone an existing repository; by default, git names it origin. The repository you are cloning from is often referred to as 'upstream', because all changes made there also affect us and any of the 'downstream' repositories that depend on our repository.

GitHub repositories are usually upstream from the developer's local repository.

You can find the git-push documentation here.

Pulling from the remote repository

To retrieve and apply commits from the remote repository, execute the following command:

git pull

That's it. In more advanced set-ups, you may need to specify more options.
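A minimal sketch of a pull that brings in someone else's commit, again assuming a local bare repository as a stand-in for GitHub and two made-up clones, alice and bob:

```shell
# Sketch only: remote.git stands in for GitHub; alice and bob are made up.
cd "$(mktemp -d)"
git init -q --bare remote.git
git clone -q remote.git alice
cd alice
git config user.name "Alice"; git config user.email "alice@example.com"
echo "hello" > A.txt
git add A.txt
git commit -q -m "Add A.txt"
git push -q -u origin HEAD
cd ..
git clone -q remote.git bob

# Alice publishes another commit...
cd alice
echo "more" >> A.txt
git commit -q -am "Extend A.txt"
git push -q
cd ..

# ...and Bob downloads and applies it.
cd bob
git pull -q
git log --oneline     # Bob now has both of Alice's commits
```

Bob made no commits of his own here, so the pull simply moves his repository forward to Alice's latest commit.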

You can find the git-pull documentation here.

Cloning an existing repository from GitHub

Let's say you've found a cool project on GitHub. Perhaps it's your own project, but you've never worked on it from this computer. Now we're going to clone this repository to your local machine so you can work on it and later on push your commits back to the repository.

Note that you need write access to the remote repository if you are going to push. In general, if you find a repository that is not yours, you can fork it on GitHub (basically copy it) to create your own remote repository with the same code. Then you can clone this repository onto your local machine, and commit and push changes to it, since it is your remote repository. Then on GitHub, you can issue a pull request. This is a request which asks the maintainer of the original GitHub repository to look at the changes you've made and consider pulling changes from your GitHub fork into the original repository.

That's quite a mouthful.. Here's the command to clone a repository:

git clone https://github.com/<username>/<repositoryname>.git [directory]

This clones the specified repository into the given directory. If you leave out the directory, git creates a new directory named after the repository (<repositoryname>) inside the current directory and clones into that. You can find the HTTPS URL on the GitHub page of the repository you want to clone. Just click the green Clone button.
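The sketch below shows both variants, with and without a target directory. Since it can't know which GitHub repository you want, it assumes a small local repository called cool-project.git as a stand-in; with a real project you would pass the HTTPS URL instead. All names are made up.

```shell
# Sketch only: cool-project.git stands in for a repository on GitHub.
cd "$(mktemp -d)"
git init -q seed
cd seed
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > A.txt
git add A.txt
git commit -q -m "Add A.txt"
cd ..
git clone -q --bare seed cool-project.git

git clone -q cool-project.git            # creates ./cool-project
git clone -q cool-project.git my-copy    # creates ./my-copy

git -C my-copy remote -v                 # origin was added automatically
git -C my-copy log --oneline             # the full history came along
```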

You can find the git-clone documentation here.

Reverting commits

We haven't really touched upon this, but since it was one of the reasons I made you read all of this, I thought I should share a bit of information.

Reverting commits is done using the git revert command. How you should go about this depends entirely on the commit you want to revert, but in general, reverting a commit that has been pushed to other repositories than your own, will always require a new commit to "undo" the changes. You're not actually changing the repository history. Instead, you apply new changes, that revert the repository to the old state of a certain commit.
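For the simplest case, undoing the most recent commit, a sketch could look like this. The file and the identity are placeholders; --no-edit tells git to use its default commit message for the new commit instead of opening an editor.

```shell
# Sketch only: identity and file name are placeholders.
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "good" > A.txt
git add A.txt
git commit -q -m "Add A.txt"

echo "broken" > A.txt
git commit -q -am "Break A.txt"

git revert --no-edit HEAD    # new commit that undoes the latest one
cat A.txt                    # the file is back to its earlier content
git log --oneline            # all three commits are still in the history
```

The broken commit stays in the history. The revert is just a third commit that applies the opposite change.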

You can find the git-revert documentation here.

Exercise

To test whether you understand the basics of git and GitHub, try to execute the following steps. You can use the command git status to inspect the current state of your repository between each step. I've uploaded my command line output here. In case you're not familiar with the syntax, the lines starting with dollar signs are commands entered by me. The other lines are output.

  1. Create a local repository somewhere on your computer
  2. Create two text documents: A.txt and B.txt
  3. Add only A.txt to the git index
  4. Create a commit, add a message stating which document is added.
  5. Create a new repository on GitHub.
  6. Add the remote to the local repository
  7. Push to the remote repository
  8. Delete the entire local repository
  9. Clone the remote repository

If you have done everything correctly, you should now only see one file, named A.txt. B.txt has been lost because we never added it to the index.

If you have any questions that Google or the linked documentation cannot answer for you (and please do check those first!), feel free to contact me via the contact page. At some point in the future, I will implement a comment system on my website and you'll be able to ask questions there as well.

About me

I'm a 22-year-old Computer Science student at Eindhoven University of Technology.
