Dr. Tanusri Bhattacharya

Name: Dr. Tanusri Bhattacharya
Uploaded: 2018-01-13T12:10:11+00:00
Duration: PTM12S42
Channel: Prudence Hill
Description: Dr. Tanusri Bhattacharya

Dr. Tanusri Bhattacharya tanusri@iiitd.ac.in
Object Oriented Programming and Design (OOPD) Version Control, Introducing Git Dr. Tanusri Bhattacharya

In the last lecture… Integrated testing Continuous Integration

Today’s… Version Control Introducing Git (and GitHub a bit)

What is Version Control?
A system that records changes to a file or a set of files over time. You can recall specific version later. If you screw things up or lose you files, you can easily recover through Version Control System (VCS). You can do version control for any type of computer file Example: a graphic designer may want to keep every version of an image or layout through a VCS. As software developers we are interested in version control of our source code files.

Why Version Control A VCS allows you to -
Revert back files to a previous state. Revert the entire project back to a previous state. Compare changes over time. See who modified last which might cause the problem. Who introduced an issue, when, and more.

Local Version Control Systems
Many people’s version control methodology is to copy files into another directory (may be a time-stamped directory). Simple approach, but extremely error prone. It is easy to forget which directory you’re in and accidentally write to the wrong file or copy over files you don’t mean to. Programmers long ago developed local VCSs that had a simple database that kept all the changes to files under revision control. A popular VCS tools was a system called RCS, which is still distributed with many computers today. The popular Mac OS X operating system includes the rcs command when you install the Developer Tools. RCS works by keeping patch sets (the differences between files) in a special format on disk; it can then re-create what any file looked like at any point in time by adding up all the patches.

Local VCS

Centralized VCS What if you need to collaborate with other developers in other systems? Cannot solve using local VCS Centralized Version Control Systems (CVCSs) have been developed to resolve the above issue. Example: CVS, Subversion, and Perforce. CVCSs have a single server that contains all the versioned files, and a number of clients that check out files from that central place. For many years, this has been the standard for version control.

Centralized VCS

Pros and Cons of CVCSs Everyone knows to a certain degree what everyone else on the project is doing. Administrators have fine-grained control over who can do what. It’s far easier to administer a CVCS than it is to deal with local databases on every client. Downside: Centralized control – if the central server fails for an hour, no body can collaborate at all or save version changes during that time. You lose the complete history of project versions if the hard disk of the central server is corrupted and proper back-up is not maintained.

Distributed Version Control Systems
Whenever you have the entire history of the project in a single place, you risk losing everything. This is where Distributed VCSs or DVCSs step in. Example: Git, Mercurial, Bazaar or Darcs. In DVCSs, the client not only check out the latest snapshot of the file, but they fully mirror the repository. If any server dies, any of the client repositories can be copied back up to the server to restore it. Every clone is a full backup of all the data. Many of these systems deal pretty well with having several remote repositories they can work with allows collaboration with different groups of people in different ways simultaneously within the same project. Allows setup of different types of workflow – not possible in CVCSs.

Distributed VCS

Introducing Git

Introducing Git and GitHub
Git is a Distributed VCS. GitHub is a web-based Git repository hosting service. offers all of the distributed version control and source code management (SCM) functionality of Git as well as adding its own features. In 2005, Git has been developed by the Linux development community with the following things in mind: Speed Simple Design Strong support for non-linear development (through several parallel branches) Fully distributed Able to handle large projects like Linux kernel efficiently. Git is incredibly fast, very efficient for large projects, and has an incredible branching system (Git Branching) for non-linear development.

Git Basics Git stores and thinks about information much differently than other VCS even though the user interface could be similar. Other VCSs such as CVS, Subversion, Perforce, Bazaar, and so on think of the information they keep as a set of files and the changes made to each file over time. Git thinks about its data more like a stream of snapshots. Every time you commit, or save the state of your project in Git, it takes a picture of what all your files look like at that moment and stores a reference to that snapshot. To be efficient, if files have not changed, Git doesn’t store the file again, just a link to the previous identical file it has already stored.

Git Basics Storing data as changes to base version of each file (other VCSs). Storing data as snapshots of the project over time (Git)

Git Basics Nearly every operations in Git is only need local files and resources to operate – facilitates extra ordinary speed avoiding network latency. For example, to browse the history of the project, Git doesn’t need to go out to the server to get the history and display it – simplay reads it from the local database. Very little you can’t do if you’re offline or off VPN. Git has integrity through checksums - impossible to change the contents of any file or directory without Git knowing about it. Git uses a SHA-1 hash for checksumming. A 40 character string composed of hexadecimal characters (0–9 and a–f) and calculated based on the contents of a file or directory structure in Git. Looks like 24b9da aa493b52f8696cd6d3b00373. Git stores everything in its database not by file name but by the hash value of its contents.

Git Basics Git generally only adds data - when you do actions in Git, almost all of them only add data to the Git database. Unlike other VCSs, once you commit a snapshot into Git, it is very difficult to lose data, especially if you regularly push your database to another repository. Git has three main states that a file can reside in: committed, modified, and staged. Committed means that the data is safely stored in your local database. Modified means that you have changed the file but have not committed it to your database yet. Staged means that you have marked a modified file in its current version to go into your next commit snapshot.

Three main sections of a Git project

Three main sections of a Git project
The Git directory is where Git stores the metadata and object database for your project. This is the most important part of Git, and it is what is copied when you clone a repository from another computer. The working directory is a single checkout of one version of the project. These files are pulled out of the compressed database in the Git directory and placed on disk for you to use or modify. The staging area is a file, generally contained in your Git directory, that stores information about what will go into your next commit. It’s sometimes referred to as the “index”, but it’s also common to refer to it as the staging area.

Git Workflow You modify files in your working directory.
You stage the files, adding snapshots of them to your staging area. You do a commit, which takes the files as they are in the staging area and stores that snapshot permanently to your Git directory. A particular version of a file in the Git directory is considered as committed. If a file has been modified and was added to the staging area, it is staged. If the file was changed since it was checked out but has not been staged yet, it is considered as modified.

Installing Git Follow the tutorial link –

Using a git repository $ git config - lets you get and set configuration variables that control all aspects of how Git looks and operates. $ git init - to initialize a new project, in the project directory $ git clone <url> - to retrieve an existing repository from the specified location in <url> $ git add <filename> - tell git the existence of a new file to be added – staging the file $ git commit – m - now you commit the file $ git branch - list, create or delete branches $ git checkout - checking out a file from the git repository $ git push - Update remote refs along with associated objects $ git pull - Fetch from and integrate with another repository or a local branch

Git Branching and Merging
To understand what a branch is, let’s first go back to what a commit is. A commit is a snapshot of a repository at a certain time. Each commit contains metadata: a hash to identify the commit, the author name, date etc. It also contains a link to the parent commit. Hence, committing creates a sort of linked list of commits. A branch is just a pointer to a commit:

Git Branching and Merging
master is the main branch where every commits is made by default. Creating a new branch (for example, testing) just adds a pointer to a commit. git branch testing To incorporate the changes of the branch testing into master, you need to merge testing in master. Make sure you are in branch master (using git branch), and run the command: git merge testing

Summary What is VCS Why do we need it? Local VCS Centralized VCS
Distributed VCS Basics of Git – a Distributed VCS Some popular commands in Git

References and Further Readings

Dr. Tanusri Bhattacharya

Similar presentations

Presentation on theme: "Dr. Tanusri Bhattacharya"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Dr. Tanusri Bhattacharya

Similar presentations

Presentation on theme: "Dr. Tanusri Bhattacharya"— Presentation transcript:

Similar presentations

About project

Feedback