Version control and GitHub

Continuing my quest to master R, I learnt about using GitHub as a way to manage all the files and their different drafts or ‘versions’. I will say that GitHub is only one of many options but it is very popular.

I thought that version control and GitHub were things that fancy developers used to manage their coding projects, but I had no idea that you could link them to your projects in RStudio. In fact, you can manage website pages and upload any file – even a Word doc!

Here’s a quick rundown of some keywords:

Repository (also called a ‘repo’, if you want to be cool): basically think of this as a master hard drive or folder where you want everything in your project to be stored.

Master: within the repo, the master is the branch where the main or current version of the files is kept (newer GitHub repos call this branch ‘main’ by default). It’s the source of all the files in the repo. Think of it like the trunk of a tree and everything branches off from it.

Branch: these are… well, branches or different versions of the master. It’s useful when you have a team working together on different sections of a file or if you want to be able to go back to past versions of a file. Creating your own branch lets you make the edits to the file without worrying about deleting or overwriting changes that have been made.

Merging: if you have multiple branches and have made changes, then it’s important to try and merge these back into the master to create the update. GitHub has a handy feature that compares the original file with the one being uploaded to show you the changes you’re making.
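
To make these keywords a bit more concrete, here is a minimal command-line sketch of a repo, a branch and a merge. It runs in a throwaway folder, and the file name (notes.txt), branch name (my-edits) and identity are placeholders I made up for illustration, not anything GitHub requires.

```shell
set -e
demo_dir=$(mktemp -d)                      # throwaway folder, nothing real is touched
cd "$demo_dir"
git init -q                                # this folder is now a repo
git symbolic-ref HEAD refs/heads/master    # name the first branch "master" (newer setups default to "main")
git config user.name "Demo User"           # placeholder identity, for this repo only
git config user.email "demo@example.com"

echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"               # the first version lives on master

git checkout -q -b my-edits                # create a branch and switch to it
echo "second draft" >> notes.txt
git commit -q -am "Extend notes on a branch"

git checkout -q master                     # back to the trunk
git --no-pager diff master my-edits        # compare the two versions before merging
git merge -q my-edits                      # bring the branch's changes into master
git --no-pager log --oneline               # both commits are now on master
```

The diff step is the command-line cousin of GitHub's comparison view: lines starting with + were added on the branch.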

Getting started with GitHub

Creating a new repository

First you need to create a GitHub account and log into it. You can create a new repo by clicking the + at the top right corner of any page or clicking the green ‘new’ button that has a little book logo (see figure below).

Ways of creating a new repo in GitHub: 1) the green ‘new’ button or 2) the “+” menu available on any page

From there you’ll enter the details of your new repo (name, description, default settings and visibility) and create it. The page should look something like this:

Naming and choosing the settings for your new repo 🙂

Repositories for RStudio

Installing Git

Before creating a repo for RStudio, you need to install Git – the system on which GitHub is built (it’s free and open source!). Follow the installation instructions for your operating system with the default settings (for Windows, it might give you a security warning but click “Run” or “Allow” to continue).

After it’s installed, it needs to be connected to GitHub, so we need to tell Git your GitHub account details (username and email) so it knows who is uploading or making changes. Launch the Git command prompt and enter the following code:

git config --global user.name "Your Username"
git config --global user.email "your_email@example.com"

To confirm these changes have been made, type the following code and check the last 2 lines in the output/results for the assigned username and email.

git config --list
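
As a small aside, the same command without --global sets the details for one repo only, and --get prints a single setting instead of the whole list. The sketch below uses a throwaway folder and a made-up name and email, so it doesn't touch your real settings.

```shell
set -e
repo_dir=$(mktemp -d)                          # throwaway folder for the demo
cd "$repo_dir"
git init -q

git config user.name "Project Persona"         # no --global: applies to this repo only
git config user.email "project@example.com"    # placeholder address

git config --get user.name                     # print one setting
git config --get user.email
git config --local --list                      # list only this repo's settings
```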

Link RStudio with Git

In RStudio, go to Tools > Global Options > Git/SVN and check that “Git executable” points to the correct folder where Git is located.

In this same window, click the “Create RSA Key” button and close it. Navigate back to the window and then click “View public key” – copy this key into your GitHub account by heading to your Account Settings in GitHub (click your profile in the top right corner then “Settings”). Go to the “SSH and GPG keys” tab > “New SSH Key” > paste the copied public key and give it a name that you’d understand (e.g. RStudio) > Add.
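
If you're curious what that button does, my understanding is that it is roughly what ssh-keygen does on the command line. This is only a sketch: it writes a demo key pair to a throwaway folder with no passphrase, and the file name (id_rsa_demo) and comment (rstudio-demo) are placeholders. A real key would normally live in ~/.ssh and be protected with a passphrase.

```shell
set -e
key_dir=$(mktemp -d)                           # throwaway folder, not your real ~/.ssh

# -t rsa -b 4096: a 4096-bit RSA key; -C: a label; -N "": empty passphrase (demo only)
ssh-keygen -q -t rsa -b 4096 -C "rstudio-demo" -N "" -f "$key_dir/id_rsa_demo"

cat "$key_dir/id_rsa_demo.pub"                 # the public half is what gets pasted into GitHub
```

Only the .pub file is ever shared; the other file is the private key and stays on your computer. Once a real key is added to your account, running ssh -T git@github.com is a quick way to check that GitHub recognises it.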

Repositories and RStudio

There are a few ways to link RStudio with a GitHub repository:

New project and repo

  1. Create a new GitHub Repo (following the instructions above)
  2. Copy the URL of the new repo
  3. In RStudio, create a new project (File > New Project). Then in the window select “Version Control” > Git > paste the copied URL, set a name, and choose where to keep the project > “Create project”

Linking an existing project to a repo

  1. Go to Git Bash or the Terminal and move into the directory of your R project (type cd followed by the path to the project, e.g. cd ~/main_folder/R_project_folder)
  2. Then type git init and git add . (this will create a git repository and add all the files in the directory into it)
  3. Commit this change with git commit -m "Initial commit"
  4. Go to GitHub and now create a new repository, ideally with the same name as the R project (Git doesn’t require the names to match, but it keeps things easy to find), making sure not to check any of the initialise options (because you’re importing an existing repo that you just made)
  5. Choose to “push an existing repository from the command line” and copy that code into Git Bash or the Terminal. Check it’s all linked by refreshing the page and you should see the dashboard of a new repository
  6. Restart the project in RStudio and you should now see the “Git” tab in the upper right quadrant.
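
Put together, steps 1 to 5 look something like the sketch below. So that it can run anywhere, a local folder (github_stand_in.git) plays the role of the empty GitHub repo; on the real thing you would paste the URL that GitHub shows you instead. The file analysis.R and the identity are placeholders too.

```shell
set -e
work_dir=$(mktemp -d)
cd "$work_dir"
git init -q --bare github_stand_in.git         # stand-in for the empty repo made on GitHub

mkdir R_project_folder
cd R_project_folder
echo 'print("hello")' > analysis.R             # hypothetical project file

git init -q                                    # step 2: make the folder a repo
git config user.name "Demo User"               # placeholder identity
git config user.email "demo@example.com"
git add .                                      # step 2: stage every file
git commit -q -m "Initial commit"              # step 3

# step 5: the "push an existing repository" lines, pointed at the stand-in
git remote add origin "$work_dir/github_stand_in.git"
git branch -M main                             # rename the current branch to main
git push -q -u origin main                     # upload and remember where to push next time

git ls-remote --heads origin                   # the remote now has a main branch
```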

Forking a repository

Now we know how to create a new repo, but what if you wanted to use files that someone has shared on their account? This is where forking is really handy.

Forking lets you create a copy of someone else’s repository that you can edit and change under your own account. To create a fork, you’d need to go to the repo you want to copy and click “Fork” in the toolbar in the top right corner (see figure below). This will automatically create and direct you to a copy under your account.

Forking a repository

Clone to RStudio

Clone the repository (this creates a local copy so that you can edit files offline on your computer) by clicking the green “Code” button on the top right and copying the link (see image below). You’ll then need to create a new RStudio project (New Project > Version Control > Git) and use the copied URL.

Clone a GitHub Repository by copying the link
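
Behind the scenes this is, as far as I can tell, a plain git clone. The sketch below fakes the GitHub side with local folders so it can run without an account: original.git stands in for someone else's repo, my_fork.git for the copy that the “Fork” button makes, and data.csv is a made-up file.

```shell
set -e
work_dir=$(mktemp -d)
cd "$work_dir"

# Stand-in for someone else's repo, seeded with one commit
git init -q seed
cd seed
git config user.name "Original Author"         # placeholder identity
git config user.email "author@example.com"
echo "shared data" > data.csv
git add data.csv
git commit -q -m "Share data"
cd "$work_dir"
git clone -q --bare seed original.git          # "their" repo on GitHub
git clone -q --bare original.git my_fork.git   # what clicking "Fork" gives you

# The actual clone step: on GitHub, the path would be the copied URL
git clone -q "$work_dir/my_fork.git" my_copy
cd my_copy
git remote -v                                  # "origin" points back at the fork
git --no-pager log --oneline                   # their history came along too
```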

Pushing and committing

After you’ve made your changes, you then need to update your repo by staging, committing and pushing them.

For RStudio

Create or change a file. When you’re ready to upload, go to the “Git” tab in the environment quadrant (usually top right), click the checkbox next to the file you want to upload (it’ll say “Staged” above the checkbox), and press “Commit”. This will open a new window showing you all the changes that have been made compared to the previous version (green = added; red = removed). You can also add a commit message in the top right section to outline the changes. Then click “Commit” and “Push” to push the changes through to GitHub.
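
For anyone who prefers typing to clicking, the same stage, commit and push routine can be sketched on the command line. Again, a local folder (github_stand_in.git) pretends to be GitHub so that this runs anywhere, and script.R and the identity are placeholders.

```shell
set -e
work_dir=$(mktemp -d)
cd "$work_dir"
git init -q --bare github_stand_in.git         # stand-in for the repo on GitHub
git clone -q "$work_dir/github_stand_in.git" project
cd project
git config user.name "Demo User"               # placeholder identity
git config user.email "demo@example.com"

echo "x <- 1" > script.R                       # create or change a file
git status --short                             # "??" marks a file Git isn't tracking yet
git add script.R                               # stage it (ticking the checkbox)
git --no-pager diff --staged                   # review what is about to be committed
git commit -q -m "Add script.R"                # commit with a message
git push -q origin HEAD                        # push the current branch to the remote
```

The diff step matches the review window in RStudio: added lines start with + and removed lines with -.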

The final commit (conclusion)

There are so many useful ways we can use GitHub and other version control systems to manage all our files.

I’m definitely no expert at this but I hope that this quick introduction was helpful. If you want to learn more, check out the GitHub documents and GitHub for the useR for more information and help.
