Me: Dr James Hetherington -- UCL Research Software Development Team -- blogs.ucl.ac.uk/research-software-development/ -- programming
Version Control and Issue Tracking
Managing code inventory
–“When did I introduce this bug?”
–Undoing mistakes
Working with other programmers
–How can I merge my work with Jim’s?
What’s the most important bug to fix next?
What is version control? (Solo version)
Do some programming
> my_vcs commit
Program some more
–Realise mistake
> my_vcs rollback
–Mistake is undone
Syntax here is example only!
What is version control? (team version)
Sue:
… wait …
Join the team
> my_vcs checkout
Do some programming
> my_vcs commit
Do some programming … more programming…
> my_vcs commit
… more programming …
> my_vcs commit
Error again…
Jim:
Create some code
> my_vcs commit
…wait…
> my_vcs update
Do some programming … program some more
> my_vcs commit
Oh Noes! Error message!
> my_vcs update
> my_vcs merge
> my_vcs commit
More programming…
Centralised VCS concepts
There is one, linear history of changes on the server or repository
Each revision has a unique identifier
You have a working copy
You update the working copy to match the state of the repository
You commit your changes to the repository
If someone else has changed it, you have to resolve conflicts between your changes and the repository, and then commit
Centralised VCS Server With All Committed Versions Client At v4 Client At v4+ Client At v3
Centralised VCS solo workflow
Commands for this in Subversion (in time order):
svn checkout
vim myfile.py
svn commit
touch mynewfile.yml
svn add mynewfile.yml
vim mynewfile.yml
svn commit
Centralised VCS Team workflow: no conflicts
Jim’s commands:
svn checkout
vim jimfile.py
svn commit
Sue’s commands:
svn co
svn update
cat jimfile.py # Sue can see changes
vim suefile.py
svn commit
Centralised VCS with conflicts
Jim’s commands:
svn update
vim sharedfile.py
svn commit
Sue’s commands:
svn up
vim sharedfile.py
svn commit
svn: Out of date: 'sharedfile.py'
svn up
vim sharedfile.py
svn ci
Resolving conflicts
On update, you get a prompt like:
> svn update
Conflict discovered in 'sharedfile.py'.
Select: (p) postpone, (df) diff-full, (e) edit, (mc) mine-conflict, (tc) theirs-conflict, (s) show all options:
If you choose (e) or (p) the conflicted file will look something like:
> cat sharedfile.py
previous content
<<<<<<< .mine
Sue’s content
=======
Jim’s content
>>>>>>> .r4
Previous content
You edit to fix this, then save.
If you postponed, mark the file as fixed with svn resolve --accept working sharedfile.py before you commit.
Revisiting history
You can update to a particular revision
–svn up -r 3
You can see the differences between your working area and a revision
–svn diff (against the revision your working copy is based on)
–svn diff -r 3
You can see which files you’ve changed or added
–svn status
You can get rid of changes to a file
–svn revert myfile.py
Distributed and Centralized Version Control
Centralized:
–Some server contains the remote version
–Your computer has your copy
–To switch back to an old copy you need the internet
–E.g. cvs, subversion (svn)
Distributed:
–Every user has a version of the full history
–Users can synchronize their history with each other
–Having a central “master” copy is a policy option (most groups do this)
–E.g. git, mercurial (hg), bazaar (bzr)
Distributed VCS In principle: Master copy (with v0,1,2,4,5) Sue’s copy (with v0,1,2,3) Phil’s copy (with v0,1,2,4,5,6) Jim’s copy (with v0,1,2,4,5) In practice:
Pragmatic distributed VCS
Subversion        Git
svn checkout      git clone
svn commit        git commit -a, then git push
svn up            git pull
svn status        git status
svn diff          git diff
Why go distributed?
–Easy to start a repository (no server needed)
–Easy to start a server
–Can work without an internet connection
–Better merges
–Easy branching
–More widespread support
Why not go distributed?
–More complex commands
–Easier to get confused!
Distributed VCS concepts (1)
Each revision has a parent that it is based on
These revisions form a graph
The most recent in each copy is the head or tip
Each revision has a globally unique hash-code
–E.g. in Sue’s copy revision 43 is ab3578d6
–Jim thinks that is revision 38
When you bring in information from a remote the histories might conflict
–Histories from different copies are merged together
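A minimal sketch of the parent and hash ideas above, assuming git is installed. The directory names sue and jim, the file notes.txt and the placeholder identity are invented for illustration.

```shell
# Throwaway sandbox; the identity below is a placeholder, not a real account.
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.org"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.org"

git init -q sue && cd sue
echo "v0" > notes.txt && git add notes.txt && git commit -q -m "first revision"
echo "v1" >> notes.txt && git commit -q -a -m "second revision"

head_hash=$(git rev-parse HEAD)      # hash of the newest revision (the head)
parent_hash=$(git rev-parse HEAD^)   # hash of the revision it is based on
git log --format='%h  parent: %p  %s'   # every revision with its parent

# A clone carries the same hashes, so both copies agree on what a revision is,
# whatever position it has in each person's local history.
git clone -q "$work/sue" "$work/jim"
jim_head=$(git -C "$work/jim" rev-parse HEAD)
echo "Sue's head: $head_hash"
echo "Jim's head: $jim_head"
```

The hash, unlike a revision number, does not depend on whose copy you are looking at.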
Distributed VCS concepts (2)
You have a working copy
You pick a subset of the files in your working copy to add to the next commit: these go into the staging area or index
When you commit, you commit:
–from the staging area
–to the local repository
You push to remote repositories to share or publish your changes
You pull or fetch to bring in from a remote
Distributed VCS solo workflow
Commands for this in Git:
Create a file myfile1.py
  git init
  git add .
  git commit
Create a new file myfile2.py and edit myfile1.py
  git add myfile2.py
  git commit
  (only changes to myfile2.py get into the commit)
Edit both files
  git commit -a
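The solo workflow can be tried end to end in a scratch directory. This is a sketch that assumes git is installed; the file contents and the placeholder identity are made up, and -m is used only so that no editor opens.

```shell
# Throwaway sandbox; the identity below is a placeholder, not a real account.
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.org"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.org"

git init -q
echo "print('one')" > myfile1.py
git add . && git commit -q -m "add myfile1.py"

echo "print('two')" > myfile2.py            # a new file
echo "print('one again')" >> myfile1.py     # an edit to a tracked file
git add myfile2.py                          # stage only the new file
git commit -q -m "add myfile2.py"

# Files recorded in the last commit, and what is still uncommitted
committed=$(git diff-tree --no-commit-id --name-only -r HEAD)
pending=$(git status --porcelain)
echo "committed: $committed"
echo "pending:  $pending"

git commit -q -a -m "edit myfile1.py"       # -a stages every modified tracked file
```

After the second commit, myfile1.py still shows as modified: it was edited but never staged.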
Distributed VCS solo with publishing
Commands for this in Git:
git clone
Edit a few files
git add --update
git commit
git push
Edit a few files
git commit -a
git push
Distributed VCS Team workflow: no conflicts
Jim’s commands:
git init
git add mycode
git commit
git remote add --track master origin
git push
git fetch
git merge
git commit -a
git push
Sue’s commands:
git clone
git commit -a
git push
git pull
Distributed team workflow with conflicts
Jim’s commands:
git commit -a
git push
Error: ! [rejected]
git pull
git commit -a
git push
Sue’s commands:
git commit -a
git push
git commit -a
git push
git commit -a
git commit -a
git pull
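The rejected push and its fix can be reproduced locally. This sketch assumes git is installed and uses a bare repository in a temporary directory as a stand-in for the team's server; shared.git, the jim and sue directories, the file contents and the placeholder identity are all invented for illustration.

```shell
# Throwaway sandbox; the identity below is a placeholder, not a real account.
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.org"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.org"

git init -q --bare shared.git
git --git-dir=shared.git symbolic-ref HEAD refs/heads/master

# Jim publishes the first version, then Sue clones it
git clone -q shared.git jim 2>/dev/null
(cd jim && git symbolic-ref HEAD refs/heads/master &&
  echo "previous content" > sharedfile.py &&
  git add sharedfile.py && git commit -q -m "initial" && git push -q origin master)
git clone -q shared.git sue

# Sue changes the shared file and pushes first
(cd sue && echo "Sue's content" > sharedfile.py &&
  git commit -q -a -m "Sue's change" && git push -q origin master)

# Jim edits the same line without pulling, so his push is rejected
cd jim
echo "Jim's content" > sharedfile.py
git commit -q -a -m "Jim's change"
if git push -q origin master 2>/dev/null; then rejected=no; else rejected=yes; fi

# Pulling merges Sue's history into Jim's and stops at the conflict
git pull -q --no-rebase origin master >/dev/null 2>&1 || true
markers=$(grep -c '^<<<<<<<' sharedfile.py)

# Jim edits the file to keep both changes, commits the merge, and pushes
printf "Sue's content\nJim's content\n" > sharedfile.py
git add sharedfile.py
git commit -q -m "Merge Sue's change"
git push -q origin master
echo "push rejected first time: $rejected, conflict blocks seen: $markers"
```

Here the conflict is resolved by overwriting the file; in real work you would edit between the markers by hand.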
Really distributed: more than one remote
git remote add sue ssh://sue.ucl.ac.uk/somerepo    add a second remote
git remote                                         list available remotes
git push sue                                       push to a specific remote
git push origin
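A sketch of two remotes, assuming git is installed. Local bare repositories (origin.git and sue.git, both invented here) stand in for the ssh:// URLs, and the identity is a placeholder.

```shell
# Throwaway sandbox; the identity below is a placeholder, not a real account.
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.org"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.org"

git init -q --bare origin.git     # stand-in for the usual remote
git init -q --bare sue.git        # stand-in for Sue's machine

git init -q mine && cd mine
git symbolic-ref HEAD refs/heads/master
echo "data" > file.txt && git add file.txt && git commit -q -m "initial"

git remote add origin "$work/origin.git"
git remote add sue "$work/sue.git"      # add a second remote
git remote -v                           # list remotes with their URLs
remote_count=$(git remote | grep -c .)

git push -q origin master               # push to a specific remote
git push -q sue master
```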
Working with branches
Working with branches in git
> git branch
* master
> git branch experiment
> git branch
  experiment
* master
> git checkout experiment
> git branch
* experiment
  master
Sharing branches in git
git push origin experiment               publish the branch to remote
git push -u origin experiment            publish the branch to remote (first time)
git branch -r                            discover branches on remote(s)
git checkout --track origin/experiment   get a new local branch from a remote
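A sketch of publishing a branch and picking it up in a second clone, assuming git is installed. The bare repository shared.git stands in for the remote; it, the clone directories, the file code.txt and the placeholder identity are invented for illustration.

```shell
# Throwaway sandbox; the identity below is a placeholder, not a real account.
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.org"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.org"

git init -q --bare shared.git
git --git-dir=shared.git symbolic-ref HEAD refs/heads/master

# Jim creates master and an experiment branch, and publishes both
git clone -q shared.git jim 2>/dev/null && cd jim
git symbolic-ref HEAD refs/heads/master
echo "stable" > code.txt && git add code.txt && git commit -q -m "initial"
git push -q -u origin master
git checkout -q -b experiment
echo "risky idea" >> code.txt && git commit -q -a -m "try an idea"
git push -q -u origin experiment        # first publication of the branch

# Sue clones, discovers the branch, and gets a local copy that tracks it
cd "$work" && git clone -q shared.git sue && cd sue
remote_branches=$(git branch -r)
git checkout -q --track origin/experiment
current=$(git rev-parse --abbrev-ref HEAD)
echo "$remote_branches"
echo "now on: $current"
```

Checking out origin/experiment directly, without --track or a local branch name, leaves you on a detached HEAD rather than on a branch you can commit to and push.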
Merging and deleting branches
git checkout master                    switch back to master branch
git merge experiment                   take all the changes from experiment into master, exactly like merging someone else’s work
git branch -d experiment               the experiment is done, get rid of local branch
git push origin --delete experiment    get rid of the branch on the remote
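The local half of this lifecycle, as a sketch that assumes git is installed; the file code.txt and the placeholder identity are invented for illustration.

```shell
# Throwaway sandbox; the identity below is a placeholder, not a real account.
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.org"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.org"

git init -q && git symbolic-ref HEAD refs/heads/master
echo "stable" > code.txt && git add code.txt && git commit -q -m "initial"

git branch experiment                  # create the branch
git checkout -q experiment             # switch to it
echo "new feature" >> code.txt && git commit -q -a -m "experiment"

git checkout -q master                 # switch back to master
git merge -q --no-edit experiment      # bring the experiment's changes in
git branch -q -d experiment            # the experiment is done

branches=$(git for-each-ref --format='%(refname:short)' refs/heads)
echo "remaining branches: $branches"
```

git branch -d refuses to delete a branch that has not been merged, which makes it a safe default.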
Working with branches
You should have a development branch and a stable branch
You should create temporary branches for experimental changes
If you release code to others, you should make a release branch
–Then you can make fixes to bugs they find
–And control which of your work goes in the release
Tagging You should tag working versions You should produce real science only with specific tagged versions, and note which one
Tagging
git tag -a v1.3           add a tag, labelling last commit
git tag -a v1.3 ab48dc    tag an old commit
git push --tags           publish the tags to origin
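A sketch of tagging an older commit and then finding out which tagged version you are running, assuming git is installed. The file model.py, the tag message and the placeholder identity are invented; -m is passed so that git tag -a does not open an editor.

```shell
# Throwaway sandbox; the identity below is a placeholder, not a real account.
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.org"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.org"

git init -q
echo "v1" > model.py && git add model.py && git commit -q -m "working version"
old=$(git rev-parse HEAD)              # the commit we will want to cite later
echo "v2" >> model.py && git commit -q -a -m "work in progress"

git tag -a v1.3 -m "version used for the results" "$old"   # tag an old commit
tagged=$(git rev-list -n 1 v1.3)       # the commit the tag points at
described=$(git describe)              # nearest tag plus distance from it
echo "tag v1.3 -> $tagged"
echo "current version: $described"
```

Recording the git describe output alongside your results is one way to note which version produced them.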
Branching and tagging in subversion
You can do branches and tags in subversion too
–But it’s harder
–svn doesn’t have real branches or tags, instead you make copies of code inside the repo
–and you can merge between the copies
–It works, but it’s cleaner in git
–see subversion references if you need this
More comparisons
Subversion                       Git
svn up -r54 myfile.py            git checkout ab39d myfile.py
svn revert myfile.py             git checkout myfile.py
svn revert --depth=infinity .    git reset --hard
But git can do more, e.g.:
git reset HEAD^ will undo the last commit from your local repository (providing you haven’t pushed)
Both git and svn have many options – have a look on the web!
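A sketch of undoing a local commit and then discarding the edit, assuming git is installed; the file f.txt and the placeholder identity are invented for illustration.

```shell
# Throwaway sandbox; the identity below is a placeholder, not a real account.
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.org"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.org"

git init -q
echo "good" > f.txt && git add f.txt && git commit -q -m "good"
echo "mistake" >> f.txt && git commit -q -a -m "oops"

git reset -q HEAD^                     # drop the last commit, keep the edit on disk
count=$(git rev-list --count HEAD)     # commits left in the history
status=$(git status --porcelain)       # the edit is back as an uncommitted change
echo "commits: $count, status:$status"

git checkout -- f.txt                  # discard the edit too (like svn revert)
```

git reset HEAD^ only rewrites your local history, which is why it is safe only before you have pushed.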
Other version control systems
rcs
–Really old!
–Works by “locking” files instead of resolving conflicts
cvs
–Very like svn
hg
–“mercurial”
–Very like git
Hosting a server
In git, any repository can be a remote for pulls
–Just use git pull ssh://theircomputer/theirrepo
–There are problems with pushing to someone else’s working repo: don’t!
–You can, however, create a git repo with git init --bare
–This bare repo has no working directory; use it as a remote for push and pull via ssh://
In subversion, the procedure is more complicated
–You have to configure a server ‘daemon’
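A sketch of what a bare repository looks like, assuming git is installed. A local path is used in place of an ssh:// URL so the example runs on one machine; theirrepo.git, the clone directories, readme.txt and the placeholder identity are invented for illustration.

```shell
# Throwaway sandbox; the identity below is a placeholder, not a real account.
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.org"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.org"

git init -q --bare theirrepo.git
git --git-dir=theirrepo.git symbolic-ref HEAD refs/heads/master
is_bare=$(git --git-dir=theirrepo.git rev-parse --is-bare-repository)
echo "bare: $is_bare"

# Push into the bare repo from one clone
git clone -q theirrepo.git mine 2>/dev/null && cd mine
git symbolic-ref HEAD refs/heads/master
echo "hello" > readme.txt && git add readme.txt && git commit -q -m "initial"
git push -q origin master

# The bare repo holds the history but no checked-out files;
# a second clone gets the file from it.
ls "$work/theirrepo.git"
git clone -q "$work/theirrepo.git" "$work/other"
```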
Hosting a server in the cloud
There are many services which allow you to create git, mercurial, and subversion repositories online
–Typically free for open source
–Typically a fee for private repositories
I recommend GitHub
–Create an account at
–Students can get five free private repositories at
–Can interact with GitHub repositories as either svn or git
Bitbucket is also good
Working with GitHub
Set up ssh keys
Create repository
Social coding
Browse changes
Comment on and discuss code
Issue tracking
Your code has bugs (defects)
Your code has things you want to do (enhancements)
The best way to keep track of all this is with an issue tracker
Issue tracking
Anatomy of an issue
Type
–Defect, enhancement, task
Severity
–Critical, blocker, major, minor, trivial
Owner
Status
–Open, fixed, duplicate, blocked, under review, won’t fix, invalid, new…
Estimated time and time spent?
Tags
Timeline of an issue
Some questions
Public or private issues
Organising issues into milestones
Estimate effort?
Who can close an issue?
Review processes
Integration with version control
Some issue trackers
Trac
Redmine
Jira
GitHub’s issue tracker
Bitbucket’s issue tracker
Issues on GitHub
The Pull Request
Conclusions
Tools can make your development easier, safer, more reliable, more correct, and more collaborative
They can be complicated and take time to learn
Learn by practicing
–Use the tools
–Pick an open source project on GitHub or Bitbucket and start contributing