There are a lot of Git tutorials on the web that teach people to use git pull
when first teaching them about working with remote repositories and collaboration.
I would like to put forward the position that this is a Bad Idea (TM), and that
it is more instructive to teach people to use git fetch followed by an explicit
git merge.
I understand the temptation of teaching people to just git pull, because it’s
a single command (rather than 2) and often it “just werks”. On the other hand,
I get the impression that teaching people only git pull reinforces an incorrect
mental model that causes a ton of confusion when there are (as there inevitably
are) conflicts with the remote repository.
In addition, I’ve noticed that often people just want to see what their collaborators have
done, without necessarily incorporating those changes into their own work.
Teaching the two operations separately enables this workflow; without it you have
to introduce git reset just so that people can get themselves back to their
previous state!
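To make the "just looking" workflow concrete, here is a minimal sketch in a throwaway sandbox. The repository names (seed, origin.git, me, buddy), the file names and the commit messages are all made up for the demo; only the git commands themselves matter.

```shell
set -e
# Throwaway sandbox: every repository name and path here is hypothetical.
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A seed repository, a bare "remote" made from it, and two clones of that remote.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master   # force the branch name used in this post
echo "hello" > seed/hello-world
git -C seed add hello-world
git -C seed commit -q -m "initial commit"
git clone -q --bare seed origin.git
git clone -q origin.git me
git clone -q origin.git buddy

# The collaborator commits and pushes.
echo "from buddy" > buddy/notes
git -C buddy add notes
git -C buddy commit -q -m "buddy: add notes"
git -C buddy push -q origin master

# Back in our clone: fetch, then look, without merging anything.
cd me
before=$(git rev-parse HEAD)
git fetch -q origin
incoming=$(git log --oneline master..origin/master)   # commits on origin/master that master lacks
echo "$incoming"
after=$(git rev-parse HEAD)
```

In this sketch master and the working directory are the same before and after the fetch, so there is nothing to undo and no need to reach for git reset.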
Because working with a remote repository is essentially (pedants, please contain
yourselves) working with multiple branches, I personally think that it is really
useful to teach branches before remote repositories [1]. Once people have the
concept of branches down, it’s then a pretty small leap to “by the way, you can
fetch the state of other people’s branches with git fetch”. You then explain that
the branch shows up on your local machine as origin/whatever-branch-name, and
that you shouldn’t try to make commits directly on this branch because it’s
“owned” by origin. At this point it’s probably a good idea to show what happens
when the remote repository is updated by somebody else, so that there is a “fork”
in the history:
      ◯—◯  ← origin/master
     ╱
◯—◯—◯—◯—◯  ← master
You can then say “ok, origin/master and master now contain different things;
we need to combine the changes on origin/master with our own”.
With that you introduce git merge, and can show the updated history after that
operation:
      ◯—◯      ← origin/master
     ╱   ╲
◯—◯—◯—◯—◯—◯  ← master
Then you can git push origin master and show what that does locally:
      ◯—◯
     ╱   ╲
◯—◯—◯—◯—◯—◯  ← master, origin/master
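The three diagrams above can be reproduced in a sandbox with something like the following sketch. As before, the repositories (seed, origin.git, me, buddy), files and commit messages are invented for illustration, and the two sides touch different files so the merge goes through cleanly.

```shell
set -e
# Throwaway sandbox: every repository name and path here is hypothetical.
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "hello" > seed/hello-world
git -C seed add hello-world
git -C seed commit -q -m "initial commit"
git clone -q --bare seed origin.git
git clone -q origin.git me
git clone -q origin.git buddy

# Somebody else updates the remote...
echo "from buddy" > buddy/notes
git -C buddy add notes
git -C buddy commit -q -m "buddy: add notes"
git -C buddy push -q origin master

# ...while we commit locally, which creates the "fork".
echo "local work" > me/todo
git -C me add todo
git -C me commit -q -m "me: add todo"

cd me
git fetch -q origin                              # only origin/master moves
git log --oneline --graph master origin/master   # first diagram: the fork

git merge -q --no-edit origin/master             # master gains a merge commit
git log --oneline --graph master origin/master   # second diagram

git push -q origin master                        # origin/master catches up with master
git log --oneline --graph master origin/master   # third diagram
```

The git log --graph output is Git's own rendering of the same pictures, which makes it handy to show next to the hand-drawn versions.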
Teaching this sequence of operations makes it abundantly clear that git fetch
only updates origin/master; it will never affect what you are working on right
now. It’s the way that you see what other people are working on, while you also
continue working on your own thing. It’s also clear that git merge totally
affects what you’re working on right now, so you’d better get yourself into a
place where you’re ready to have your files modified as git magically
incorporates all those sweet sweet changes that your buddy just pushed.
This workflow also mitigates the common pitfall of:
$ git push
To git-example-origin
! [rejected] master -> master (fetch first)
error: failed to push some refs to 'git-example-origin'
$ git pull
Auto-merging
CONFLICT (content): Merge conflict in hello-world
Recorded preimage for 'hello-world'
Automatic merge failed; fix conflicts and then commit the result.
So instead of “congratulations, your code is now full of conflict markers, have fun!” you get to inspect the changes that were introduced by the remote before you try to merge them in. This means you can anticipate if there will be any problems, and know what to expect when you try to merge.
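One way to do that inspection, sketched in the same kind of throwaway sandbox (all repository, file and commit names are made up). Listing the files changed on both sides since the fork is only a rough heuristic of my own for spotting trouble: two edits to the same file do not always conflict, but it tells you where to look first.

```shell
set -e
# Throwaway sandbox: every repository name and path here is hypothetical.
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "hello" > seed/hello-world
git -C seed add hello-world
git -C seed commit -q -m "initial commit"
git clone -q --bare seed origin.git
git clone -q origin.git me
git clone -q origin.git buddy

# Both sides edit hello-world; buddy also adds an unrelated file and pushes.
echo "hello from buddy" > buddy/hello-world
echo "from buddy" > buddy/notes
git -C buddy add hello-world notes
git -C buddy commit -q -m "buddy: edit hello-world, add notes"
git -C buddy push -q origin master

echo "hello from me" > me/hello-world
git -C me commit -q -a -m "me: edit hello-world"

cd me
git fetch -q origin

# What did the remote change since the two histories forked?
git log --oneline master..origin/master
git diff --stat master...origin/master

# Files changed on both sides since the fork: the likely trouble spots.
git diff --name-only master...origin/master | sort > ../theirs.txt
git diff --name-only origin/master...master | sort > ../ours.txt
both=$(comm -12 ../ours.txt ../theirs.txt)
echo "changed on both sides: $both"
```

Nothing in the working directory has been touched at this point, so you can read the diff at your leisure and only run git merge once you know what is coming.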
You could even imagine running git fetch periodically to keep origin up to date
with any changes on the remote. This would be complete madness if you tried to
do the same thing with git pull!
[1] This is, of course, tough if you are teaching a GitHub-centric workflow. One way around this may be to get people to initialize their local repositories by cloning, and then forget about the remote entirely until the time is right.