Version Control at Google

Presentation transcript:

Version Control at Google: A Case Study
Source: "Why Google Stores Billions of Lines of Code in a Single Repository" by Rachel Potvin and Josh Levenberg, CACM Vol. 59, No. 7, July 2016.
Copyright © 2017 Curt Hill

Introduction
Google is a major software developer.
Its software is a critical asset to its growth and profitability, so protecting this code is a priority.
Google has developed its own version control system, which is used by the majority (95%) of its developers.
It is a distributed version control system.
It is massive.

History
For the first 10 years or so, the source repository existed on a single machine.
Google examined many versioning systems but found all of them unsuitable.
After outgrowing the single-machine approach, Google switched to an internally developed system called Piper.

Piper
A distributed and redundant version control system.
It stores data at 10 Google data centers worldwide.
Google's developers can see essentially the entire code base, regardless of where they work.

Statistics
Taken from January 2015:
Number of files: 1 billion
Source files: 9 million
Lines of code: 2 billion
Number of commits: 36 million
Size on disk: 86 TB
Commits per workday: 40,000
Does this constitute a serious repository?

Workflow
As in most versioning systems, the developer creates a local copy of the needed files.
The developer writes, modifies, and tests as usual.
The commit process is somewhat more extensive.

Commit
Code that is ready for commit must first go through a code review, in which other developers inspect and evaluate it.
A variety of tools can be used to verify the quality of the commit before it is actually added to the code base.

Access
Piper may also be accessed through Clients in the Cloud (CitC).
CitC may be used by any developer who has access to the Google cloud.
The local workspace is now in the cloud instead of on the local machine.
A developer may work on any machine with cloud access as if at their personal workstation.
Files are also easy to share.

CitC
A CitC workspace looks like a piece of the entire Piper codebase.
Any file in Piper may be browsed.
Only files that are modified end up with local entries in the cloud workspace.
Thus a build/make transparently uses some files directly from the repository together with others that have been modified.
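The overlay idea on this slide can be illustrated with a small sketch. This is not CitC's implementation; the class `OverlayWorkspace`, its method names, and the sample paths are hypothetical, and the repository is modeled as a plain dictionary. It only shows how reads can fall through to the shared repository while modified files are stored locally.

```python
class OverlayWorkspace:
    """Toy copy-on-write view: a read-only repository plus local modifications."""

    def __init__(self, repository):
        self._repository = repository  # path -> contents, never mutated here
        self._modified = {}            # only files the developer has changed

    def read(self, path):
        # A modified file shadows the repository copy; all others fall through.
        if path in self._modified:
            return self._modified[path]
        return self._repository[path]

    def write(self, path, contents):
        # Writes land in the workspace only, leaving the repository untouched.
        self._modified[path] = contents

    def modified_paths(self):
        return sorted(self._modified)


# Hypothetical repository contents, for illustration only.
repository = {"lib/util.py": "v1", "app/main.py": "v1"}
workspace = OverlayWorkspace(repository)
workspace.write("lib/util.py", "v2-local")
```

A build run against `workspace.read` would see the locally edited `lib/util.py` and the repository version of `app/main.py`, which is the transparent mix the slide describes.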

Trunks
Like other systems, Piper uses trunk-based development.
Most developers work on the head version of a file, which is the most recent commit.
In general, separate branches are short-lived.
This avoids the problems of merging the results together.

Owners and Reviewers
The entire code base is visible to every developer; the exceptions are a few pieces of confidential and secret code.
The code base is organized as a tree of directories.
Each directory has a set of owners, who are primarily responsible for the product of the directory.
Anyone may do a code review, but an owner must approve.
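The owner-approval rule can be sketched as a lookup over the directory tree. The owner table, the names in it, and the functions `owners_for` and `can_commit` are hypothetical. Falling back to a parent directory when a directory lists no owners is an assumption of this sketch, not something the slides state.

```python
from pathlib import PurePosixPath

# Hypothetical owner table: directory -> set of owner names.
OWNERS = {
    "search": {"alice"},
    "search/ranking": {"bob", "carol"},
}


def owners_for(file_path, owners=OWNERS):
    """Return the owners of the nearest enclosing directory that lists any.

    Walking up to parent directories is an assumption of this sketch.
    """
    for directory in PurePosixPath(file_path).parents:
        found = owners.get(str(directory))
        if found:
            return found
    return set()


def can_commit(file_path, approvers, owners=OWNERS):
    """A change may go in only if at least one approver is an owner."""
    return bool(owners_for(file_path, owners) & set(approvers))
```

For example, `can_commit("search/ranking/score.py", {"dave"})` is rejected because no owner approved, while adding `"bob"` to the approvers lets it through.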

Code Reviewers
A reviewer may comment on many aspects of the code.
There are language-specific guides.
Reviewers use a code review tool named Critique.
It receives the reviewer's comments and attaches them to the file in question, optionally at specific line numbers.
When the code is approved, the review is available to the owner.

The Commit
A number of custom tools help ensure that a commit is of high quality.
These follow the code review process.
There is an automatic testing structure: each commit triggers a series of tests on code that depends on the committed file.
Breakage of these dependencies forces the commit to be rescinded.
A pre-submit check also triggers this testing.

Breakage Example
Suppose module M is used in programs A, B and C. This is good code reuse.
When an update to M is committed, A, B and C are all rebuilt.
If the builds do not break (no compile issues), the commit may be accepted.
Another form of breakage is when the automated unit tests for the modules of A, B and C fail.
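The example of M, A, B and C can be sketched as a reverse-dependency search. The dependency table (including the extra module N and program D), and the functions `affected_by` and `accept_commit`, are hypothetical and stand in for whatever Google's build and test tooling actually does. The `build_and_test` callback is a placeholder for a real build and unit-test run.

```python
from collections import deque

# Hypothetical dependency edges: each target maps to the modules it uses.
DEPENDS_ON = {
    "A": ["M"],
    "B": ["M"],
    "C": ["M", "N"],
    "D": ["N"],
}


def affected_by(changed, depends_on=DEPENDS_ON):
    """Return every target that uses `changed`, directly or transitively."""
    # Invert the edges so we can walk from a module to its users.
    used_by = {}
    for target, deps in depends_on.items():
        for dep in deps:
            used_by.setdefault(dep, []).append(target)

    affected, queue = set(), deque([changed])
    while queue:
        current = queue.popleft()
        for user in used_by.get(current, []):
            if user not in affected:
                affected.add(user)
                queue.append(user)
    return affected


def accept_commit(changed, build_and_test, depends_on=DEPENDS_ON):
    """Accept only if every affected target builds and passes its tests."""
    targets = sorted(affected_by(changed, depends_on))
    return all(build_and_test(target) for target in targets)
```

A commit to M would be checked with something like `accept_commit("M", run_target)`, where `run_target` is a hypothetical function that rebuilds one program and runs its tests.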

Automated Analysis
The Tricorder system is part of the pre-submit and commit facility.
It performs static analysis on the code in question.
It may suggest one-line fixes.
Other tools handle profiling and test code coverage.
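To make the idea of a static check with a one-line fix concrete, here is a toy analyzer built on Python's standard `ast` module. It is unrelated to Tricorder's real analyzers; the function `find_none_comparisons` and the rule it enforces are chosen purely for illustration.

```python
import ast


def find_none_comparisons(source):
    """Return (line number, suggestion) pairs for `== None` / `!= None` checks."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Compare):
            continue
        # A comparison may chain several operators; check each one.
        for op, right in zip(node.ops, node.comparators):
            is_none = isinstance(right, ast.Constant) and right.value is None
            if is_none and isinstance(op, (ast.Eq, ast.NotEq)):
                fix = "is None" if isinstance(op, ast.Eq) else "is not None"
                findings.append((node.lineno, f"use '{fix}'"))
    return findings


# Hypothetical snippet under review.
sample = "x = 1\nif x == None:\n    pass\n"
```

The analyzer never runs the code under review; it only inspects the parsed syntax tree, which is what makes the analysis static.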

Pros
One repository
No question where the most recent version is
Simplified dependency information
Atomic changes
Easy to collaborate with different teams
All the code is visible to every developer

Cons
There is no commercial off-the-shelf (COTS) software; it is all Google developed and maintained.
The massive size of the code base makes it hard to discover existing code.
There are startup costs for new developers who are used to less complex versioning systems.

Git
Git has been considered and is used.
Git is used for Android and Chrome, which are not in Piper.
These are open source and need to be usable by developers outside Google's employ.
Converting the code base to Git would require thousands of repositories.
It would also be a massive shock to Google's culture.

Conclusion
This approach would not work for everyone.
The collaborative culture of Google makes this a good choice.
It has presented several serious technical obstacles to implementation.
Who better than Google to solve them?