
Collaborating with LaTeX and git

James · October 16, 2012

This article was originally published on the ShareLaTeX blog and is reproduced here for archival purposes.

This article is part of a series about how to work together well in a collaboration with your coauthors when using LaTeX. You can sign up to get the rest of the articles delivered by email to your inbox each week.

In this article I'm going to discuss how to keep track of the changes that have been made to a document. You'll be able to easily see what has changed since you last worked on it, see which parts have been edited by whom, and undo any older changes that you don't like. Sound useful? Then keep reading!

The program I'm going to recommend to do this is called git. I have no idea why it's called git, other than it being a short and memorable name, but it's a really powerful program. Before we proceed I should warn you that git is a command line program and to use it you need to type in commands rather than pointing and clicking. As a LaTeX user this probably isn't too unfamiliar to you but you should be prepared. At the end of this article I'll talk about some point and click interfaces to git that you might find easier.

Git is a bit like LaTeX in the sense that it can look quite intimidating if you've never used it before, but once you've understood some of the key concepts it's not so bad. This does mean that it might not be for everyone though. I recommend that you skim through this article to get a feel for what git can do and then decide if it's worth your time to learn properly (I certainly won't be able to teach everything in a single article).

Let's start with a brief overview of what git does. When you save your LaTeX file you are normally overwriting the old version of the file. If you've deleted some text then this is gone forever, and if you've made some changes then you can't get the previous version back. With git, you can take a snapshot of your file whenever you want to so that you can always come back to that version, even if you make changes in the meantime. Git lets you manage and view all of the different versions, including letting you restore an older version and see a summary of the changes made between versions.

Git is also really useful for working across multiple computers, or with your collaborators. It comes with built-in tools for sending your changes directly to your collaborators' computers, or grabbing any changes they have made and syncing them to your computer. The real power comes if you've both made changes and need to combine them. Since git has a record of different versions, it can look back and find the latest version you both had in common and then replay all of the changes that you've both made. Hopefully if you haven't been stepping on each other's toes then both sets of your changes can be automatically combined into one latest version. Even if you've both edited the same parts, git points this out and lets you decide how best to fix it.

I'll talk about all of these features in turn and how to use them in git. First of all though, you're going to need to install git. You can get it from http://git-scm.com along with installation instructions for different operating systems. The git-scm website also has some very good documentation and tutorials that can teach a lot more about git than I will be able to here.

Tracking your first set of changes

Once you have git installed, go to the command line and navigate to the directory with your work in and then run the following command to set up git:

cd my-project
git init

Git needs somewhere to store all of the different versions of your files, and this command tells git to set up a directory called .git to store these in. In general you shouldn't touch any of the files in the .git directory by hand, only use the git command.

Git only tracks changes to files that you tell it about so at the moment it's still not doing anything. If you have a file called project.tex in this directory then you can tell git to monitor it with the following command:

git add project.tex

You can also add entire directories using this command and use a wildcard (*) to add lots of files. E.g.

git add figures/     # Adds the figures directory
git add *.tex        # Adds all .tex files in the current directory
git add .            # Adds all of the files (. is the current directory)

Now that git is monitoring those files, you can create your first saved version, which in git is called a commit. To do this, run the following command

git commit -m "My first commit"

You should provide a message in quotes describing changes you have made since the last version to make it easier to find this version later on in the version list.

Gotcha #1: You need to run git add whenever you change a file, not just at the beginning. This tells git that it should include these changes in the next snapshot. Alternatively you can pass the -a option to git commit to tell it to commit all changes to files that it is monitoring:

git commit -a -m "My second commit"
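As a rough, self-contained sketch of the steps so far, the following script builds a throwaway project in a temporary directory and creates two versions of it. The file contents, the author name and the email address are made up for illustration; your own project will of course look different.

```shell
#!/bin/sh
set -e

# Work in a scratch directory so that no real project is touched.
demo=$(mktemp -d)
cd "$demo"

git init -q
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

# First version: tell git about the file, then take a snapshot.
printf 'The quick brown fox\n' > project.tex
git add project.tex
git commit -q -m "My first commit"

# Second version: -a includes changes to files git is already monitoring.
printf 'A quick brown fox\n' > project.tex
git commit -q -a -m "My second commit"

# Count the versions recorded so far.
commit_count=$(git rev-list --count HEAD)
echo "Commits so far: $commit_count"
```

Note that -a only picks up files git already knows about; a brand new file still needs git add first.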

Viewing your history

Hopefully you've made it this far, because now git starts to pay off. After you've been editing for a while and have created a few versions you can go back and view how the document has changed over time. The following command will give you a list of the versions, with the summary messages you've passed (so make sure they're useful!):

git log

The output from git log looks like this:

commit f69606d7e24ad45b31bb6eb4b38192bd07f274fc
Author: James Allen
Date:   Tue Oct 9 16:14:54 2012 +0100

My second commit

commit 8c3e0d50be899e86787076f787532e7cb189e045
Author: James Allen
Date:   Tue Oct 9 16:12:57 2012 +0100

My first commit

Each entry in git log has a funny list of numbers and letters at the top of it. These are unique identifiers that correspond to your versions internally in git. If you want to ask git about a specific version, you can tell it which version using this identifier. For example, the git show command tells you what has changed between the version you ask about and the previous one (in this case we ask about the version labelled "My second commit" and it returns the changes made in that version):

git show f69606d7e24ad45b31bb6eb4b38192bd07f274fc --word-diff=color

The output is

commit f69606d7e24ad45b31bb6eb4b38192bd07f274fc
Author: James Allen
Date:   Tue Oct 9 16:14:54 2012 +0100

My second commit

diff --git a/project.tex b/project.tex
index 18df063..95ad7b2 100644
--- a/project.tex
+++ b/project.tex
@@ -2,6 +2,6 @@

\begin{document}

TheA quick brown fox jumps over the lazysleazy dog

\end{document}

The second half of the output here shows you any changes that have been made. Text that was added is highlighted in green and text that was removed is in red. Without the colours the removed and added words run together in the output above: "The" and "lazy" were removed, and "A" and "sleazy" were added. By default git compares documents line by line, so that if anything has changed on a line then it shows the whole line as changed. With LaTeX documents lines can be quite long and it can be difficult to quickly see what has changed between similar lines. The option --word-diff=color tells git to instead compare the changes word by word and makes it much clearer.
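If colour isn't available, for example when you paste a diff into an email, the same option without =color marks removed words as [-...-] and added words as {+...+}. Here is a minimal sketch that recreates the fox example in a scratch directory; the file contents and the author identity are assumptions made for the demonstration.

```shell
#!/bin/sh
set -e

demo=$(mktemp -d)
cd "$demo"

git init -q
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

printf 'The quick brown fox jumps over the lazy dog\n' > project.tex
git add project.tex
git commit -q -m "My first commit"

printf 'A quick brown fox jumps over the sleazy dog\n' > project.tex
git commit -q -a -m "My second commit"

# Plain word diff of the latest version (HEAD) against the one before it.
word_diff=$(git show --word-diff HEAD)
echo "$word_diff"
```

The changed line should show up with the old and new words side by side inside those markers, which stays readable even in plain text.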

These features are very useful if you want to see the changes made by one of your collaborators (we'll come to passing around versions a little later), or to remind yourself what you were working on last time.
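The long identifiers are awkward to type, and git also accepts an abbreviated form as long as it is unambiguous within your project. The sketch below, again in a scratch directory with made-up contents and identity, prints a compact history and checks that the short and long identifiers name the same version.

```shell
#!/bin/sh
set -e

demo=$(mktemp -d)
cd "$demo"

git init -q
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

printf 'The quick brown fox\n' > project.tex
git add project.tex
git commit -q -m "My first commit"

full_id=$(git rev-parse HEAD)            # the full identifier
short_id=$(git rev-parse --short HEAD)   # an abbreviated form of it

# One line per version: abbreviated identifier followed by the message.
git log --oneline

# Both spellings resolve to the same version.
resolved_id=$(git rev-parse "$short_id")
echo "$short_id is short for $resolved_id"
```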

Undoing changes

Of course, you don't want to just be able to see changes, you want to be able to go back and use the history to undo changes. Perhaps you want to restore the document to a previous version because you don't like the changes you've made or you realise you made a mistake. Git doesn't just rewind the history of your document because then you would lose the changes made between then and now. Instead, git lets you restore the files you are working with to an earlier version by creating a new version which comes after all of the intervening history. If you decide you didn't actually want to do this after all then nothing has been lost.

To restore your files to an earlier version, simply run:

git checkout f69606d7e24ad45b31bb6eb4b38192bd07f274fc -- .    # "." restores every file in that version under the current directory

This tells git to 'checkout' the files from the older version specified by the third argument (remember we refer to versions by their long identifier of numbers and letters). The files in your project now have their contents restored to the older version, but you still need to create a new version to mark this change in git:

git commit -a -m "Restore files to previous version"

You could of course edit your files at this previous version before creating a new version in git.
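The whole restore can be tried out safely in a scratch directory. This is only a sketch with invented file contents and identity. It passes "-- ." as the path so that git restores every file it has recorded for that version; a shell wildcard would also hand git any untracked LaTeX by-products, which it refuses to check out.

```shell
#!/bin/sh
set -e

demo=$(mktemp -d)
cd "$demo"

git init -q
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

printf 'Draft one\n' > project.tex
git add project.tex
git commit -q -m "First draft"
first_id=$(git rev-parse HEAD)               # remember the version to return to

printf 'Draft two, which we later regret\n' > project.tex
git commit -q -a -m "Second draft"

# Bring back the files as they were in the first version...
git checkout "$first_id" -- .
# ...and record that as a new version on top of the existing history.
git commit -q -a -m "Restore files to previous version"

restored_text=$(cat project.tex)
history_length=$(git rev-list --count HEAD)
echo "project.tex now reads: $restored_text"
echo "Versions in history: $history_length"
```

The history ends up with three versions rather than one, so the regretted draft is still there if you change your mind.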

Another common situation is that you decide you want to undo only one set of changes from a while ago. Perhaps you've edited the document in three steps: 1) Adding an abstract, 2) Updating your acknowledgements, 3) Adding in a figure. At each step you've created a new version in git, but now you decide that you didn't really want to update your acknowledgements. Unfortunately this is sandwiched between other changes so a simple rollback like before won't do. Fear not, because git is clever enough to do what you want, with the revert command:

git revert f69606d7e24ad45b31bb6eb4b38192bd07f274fc

The final parameter should be the identifier of the version that introduced the changes you want to undo.

Gotcha #2: git revert undoes the changes introduced by the version that you pass to it. It does not revert the project to that version. This is a commonly confused command so be careful! To put the whole project back to a previous version, use the first method described at the top of this section.
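To make the difference concrete, here is a small sketch of the three-step scenario in a scratch directory. The file names, their contents and the identity are invented for the example, and --no-edit simply accepts git's default message for the new version instead of opening an editor.

```shell
#!/bin/sh
set -e

demo=$(mktemp -d)
cd "$demo"

git init -q
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

printf 'We thank our funders.\n' > acknowledgements.tex
git add acknowledgements.tex
git commit -q -m "Initial acknowledgements"

# Step 1: add an abstract.
printf 'An abstract.\n' > abstract.tex
git add abstract.tex
git commit -q -m "Add abstract"

# Step 2: update the acknowledgements (the change we will regret).
printf 'We thank our funders and our cats.\n' > acknowledgements.tex
git commit -q -a -m "Update acknowledgements"
unwanted_id=$(git rev-parse HEAD)

# Step 3: add a figure.
printf '%s\n' '\includegraphics{plot}' > figure.tex
git add figure.tex
git commit -q -m "Add figure"

# Undo only step 2; steps 1 and 3 are left alone.
git revert --no-edit "$unwanted_id" > /dev/null

acknowledgements=$(cat acknowledgements.tex)
echo "Acknowledgements now read: $acknowledgements"
```

After the revert the abstract and the figure are still in place, and only the acknowledgements have gone back to their earlier wording.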

Passing your changes around

So all of this is great, but where git really excels is in letting you easily pass around your changes to your collaborators in a way that makes sure everyone stays in sync and nothing gets overwritten or forgotten about. In fact, this is git's primary purpose. Git was originally written to keep track of the source code of the Linux kernel and to make it possible to manage the contributions from thousands of developers all over the world. Collaborating on a paper in LaTeX is on a slightly smaller scale, but the principle is the same.

Git has built-in tools that will connect to your collaborator's computer and pass over the latest changes you've made, or download the changes they have made to your computer. If you and your collaborator have both made changes in the meantime, git can use the history which is stored with your documents to find the last place that you were both working on the document and then automatically apply the changes you have both made since then. If you've been editing different parts of the document then it can combine your changes automatically, but if you've both made changes to the same sections then it will highlight these and ask you to combine the two manually.

First you need to tell git about the other computers you want to send your changes to and receive changes from. Generally people find it easier to have a single server that everyone talks to directly rather than trying to send changes to each other individually. This way the central server can store the canonical version of your document and everyone knows where to get the latest version. If you are in a university department or workplace you might be able to ask your IT staff to set up a server for you with a git repository. There are also some great online services that will provide this, along with clear instructions for getting set up and using git with them. GitHub and Bitbucket are two good examples. These are both aimed at hosting source code for computer programs but work equally well with LaTeX documents.

To tell git how to connect to the central server or your collaborator's computer, run the following command:

git remote add origin git@github.com:username/my-project.git

The example given assumes you are using GitHub. This tells git to let you connect to a server which you want to refer to as 'origin' and is located at the URL given by the final argument. If you are using another service you will need to find out from that service what to replace the final argument with. In subsequent commands we can now refer to this remote computer as 'origin' rather than needing to remember its full name. You can call the server anything you want but 'origin' has become accepted within the git community as the default name for the main server that you want to stay in sync with.

To send your changes to the central server you can now run:

git push origin master

Git uses the term 'pushing' to mean sending your changes to someone else. Here, origin is the reference to another computer that you defined above. The master option tells git which branch to send your changes to on the other computer. Git lets you have multiple versions of your files at once, with the different versions stored in what are called branches. I won't go into this in any more detail here but the default branch is called 'master' and is the one that you will normally be using.
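If you'd like to try pushing before setting up a real server, a so-called bare repository on your own disk can stand in for it. The following is only a sketch: the directory names and the identity are hypothetical, and the git symbolic-ref lines are there purely to make sure the branch is called master whatever your local defaults are.

```shell
#!/bin/sh
set -e

sandbox=$(mktemp -d)
cd "$sandbox"

# A bare repository plays the role of the central server.
git init -q --bare central.git
git --git-dir=central.git symbolic-ref HEAD refs/heads/master

# Your own working copy of the project.
git init -q my-project
cd my-project
git symbolic-ref HEAD refs/heads/master      # call the branch master
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

printf '%s\n' '\documentclass{article}' > project.tex
git add project.tex
git commit -q -m "My first commit"

# Register the stand-in server as 'origin' and send it the master branch.
git remote add origin "$sandbox/central.git"
git push -q origin master

local_id=$(git rev-parse master)
server_id=$(git --git-dir="$sandbox/central.git" rev-parse master)
echo "Local master:  $local_id"
echo "Server master: $server_id"
```

Once the push has gone through, both identifiers should be the same, which is git's way of saying the two computers hold the same latest version.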

The other possibility is that you want to download changes made by someone else. This is a process which git calls 'pulling'. To do this, simply run

git pull origin master

This will then download any changes that have been made to the document on the central server and integrate them with the version on your computer. If you have made changes as well then git will try to automatically merge the two (or more) sets of changes using the intermediate versions to guide it. Hopefully the two sets of changes will be automatically merged by git but if not it will tell you and let you know how to go about manually combining the sections that it couldn't do automatically.
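Here is a sketch of that automatic merge, with two pretend collaborators editing different parts of the same file. Everything in it is assumed for the demonstration: the names Alice and Bob, the local bare repository standing in for the server, and the file contents. The options --no-rebase and --no-edit ask git pull for a plain merge with the default message, which newer versions of git may otherwise prompt about.

```shell
#!/bin/sh
set -e

sandbox=$(mktemp -d)
cd "$sandbox"

# Stand-in for the central server.
git init -q --bare central.git
git --git-dir=central.git symbolic-ref HEAD refs/heads/master

# Alice starts the document and pushes it.
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice Example"
git config user.email "alice@example.com"
printf '%s\n' 'Introduction' '' 'Method' '' 'Results' '' 'Conclusion' > project.tex
git add project.tex
git commit -q -m "Outline"
git remote add origin "$sandbox/central.git"
git push -q origin master

# Bob gets a copy, edits the last line and pushes.
cd "$sandbox"
git clone -q central.git bob
cd bob
git config user.name "Bob Example"
git config user.email "bob@example.com"
sed 's/^Conclusion$/Conclusions and outlook/' project.tex > project.tmp
mv project.tmp project.tex
git commit -q -a -m "Rename conclusion"
git push -q origin master

# Meanwhile Alice edits the first line, then pulls Bob's work.
cd "$sandbox/alice"
sed 's/^Introduction$/Introduction and motivation/' project.tex > project.tmp
mv project.tmp project.tex
git commit -q -a -m "Rename introduction"
git pull -q --no-rebase --no-edit origin master

merged=$(cat project.tex)
echo "$merged"
```

Because the two edits are several lines apart, Alice's copy ends up with both new headings without anyone having to combine them by hand.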

Final tips

Hopefully you're still with me by now and looking forward to diving into learning git properly. It would take me far longer than this article to give you a complete guide, but I hope to have given enough of a flavour of what is possible with git and the benefits that it provides to encourage you to learn how to use it. I recommend the book Pro Git by Scott Chacon as the next step in your study. The book is aimed at beginners but goes into detail about almost every aspect of git in later chapters. It is free to download electronically or can be purchased in print via its website. Before you go though, I'll give you a few tips that are specific to LaTeX that you probably won't find in the common git literature.

The following command gives you a list of the files that git is aware of having changed, and a list of files that git isn't watching:

git status

This is really useful for getting the status of your document and checking whether or not it's a good time to create a new version. Unfortunately the output of this command can get a bit crowded since LaTeX produces a lot of extra files that you probably don't want to take versions of. Things like .aux, .toc, and .bbl files are changed every time you run LaTeX but don't necessarily reflect any changes made to your document. There is no point keeping them versioned since they'll just make your logs and diffs more crowded without giving you any useful information. Fortunately you can tell git to ignore these files. Create a new file in your project folder called .gitignore and put the following content into it:

# In .gitignore
*.aux
*.glo
*.idx
*.log
*.toc
*.ist
*.acn
*.acr
*.alg
*.bbl
*.blg
*.dvi
*.glg
*.gls
*.ilg
*.ind
*.lof
*.lot
*.maf
*.mtc
*.mtc1
*.out
*.synctex.gz

This tells git to ignore all files with those file extensions. This also allows you to run the command git add . to add all of your changes without adding in the extra files that are generated by LaTeX.
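You can check that the ignore rules are doing their job with a quick experiment in a scratch directory. This sketch uses a cut-down .gitignore with just two of the patterns above and some empty placeholder files; the file names are assumptions for the example.

```shell
#!/bin/sh
set -e

demo=$(mktemp -d)
cd "$demo"

git init -q

# A shortened ignore list, just for the experiment.
printf '%s\n' '*.aux' '*.log' > .gitignore

# One real source file and two LaTeX by-products.
touch project.tex project.aux project.log

# --porcelain prints one short line per file; ignored files are left out.
status=$(git status --porcelain)
echo "$status"
```

Only .gitignore and project.tex should be listed as untracked, with no mention of the .aux or .log files.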

Bye for now

So that's almost all for this article. But first I promised you I'd mention point and click interfaces to git. I don't find that these make git much easier to learn since you still need to understand the underlying concepts, but some people prefer using them to the command line. I don't have any particular recommendations but the official git website has a nice page of graphical interfaces to git depending on which operating system you are using.

I hope I've been able to give you enough of an overview of git to persuade you that it's worth learning to use. As with LaTeX, the learning curve can be steep, but git is such a powerful tool once you know how to use it that it's well worth it. If you have any comments or questions about anything I've said please get in touch with me at james - at - sharelatex.com. I try hard to answer every email I receive.

Are there any topics you would like me to discuss in future? If so then please get in touch and let me know! What am I doing right and what am I doing wrong?

I hope you're having a great day!
