1
The Google File System
Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung (Google)
Presented by Jiamin Huang, EECS 582 – W16
2
Problem
Component failures are the norm
Files are huge
Appends are common; random writes are rare
Co-designing the applications and the file system API increases flexibility
3
Architecture
4
Master
Single master
Metadata:
File and chunk namespaces
Mapping from files to chunks
Locations of each chunk's replicas
Replicated using the operation log
Read-only shadow masters
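A minimal sketch of how the master's in-memory metadata could be organized around the three tables above; the class and field names are illustrative assumptions, not taken from GFS's actual code. Replica locations are deliberately not part of the persisted state, since the master rebuilds them from chunkserver reports.

```python
from dataclasses import dataclass, field

@dataclass
class ChunkInfo:
    chunk_handle: int                    # globally unique id assigned by the master
    version: int = 0                     # used to detect stale replicas
    replica_locations: list = field(default_factory=list)  # chunkserver addresses, not persisted

@dataclass
class MasterMetadata:
    # File namespace: full path -> ordered list of chunk handles (persisted via the operation log).
    files: dict = field(default_factory=dict)
    # Chunk handle -> ChunkInfo (replica locations are rebuilt from chunkserver reports).
    chunks: dict = field(default_factory=dict)

    def chunks_of(self, path: str) -> list:
        """Return the ChunkInfo records for every chunk of a file."""
        return [self.chunks[handle] for handle in self.files.get(path, [])]
```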
5
Master operations
Namespace management using locks
Places chunk replicas across racks (see the sketch below)
Replicates chunks for higher availability
Moves chunks around for disk space and load balancing
Garbage collection: deleted files and stale replicas, reclaimed lazily
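A hedged sketch of rack-aware replica placement, assuming the master knows each chunkserver's rack and free space; the tie-breaking heuristic here is an illustrative simplification, not the paper's exact policy.

```python
import random
from collections import defaultdict

def place_replicas(chunkservers, num_replicas=3):
    """Pick chunkservers for a new chunk, spreading replicas across racks.

    `chunkservers` is assumed to be a list of (server_id, rack_id, free_space)
    tuples. Prefer one replica per rack; break ties by free space.
    """
    by_rack = defaultdict(list)
    for server_id, rack_id, free_space in chunkservers:
        by_rack[rack_id].append((free_space, server_id))

    # Take the emptiest server on each rack, then prefer the emptiest racks overall.
    candidates = []
    for rack_id, servers in by_rack.items():
        servers.sort(reverse=True)               # most free space first
        candidates.append((servers[0][0], servers[0][1], rack_id))
    candidates.sort(reverse=True)

    chosen = [server_id for _, server_id, _ in candidates[:num_replicas]]

    # If there are fewer racks than replicas, fall back to any remaining servers.
    if len(chosen) < num_replicas:
        remaining = [s for s, _, _ in chunkservers if s not in chosen]
        chosen += random.sample(remaining, min(num_replicas - len(chosen), len(remaining)))
    return chosen
```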
6
Chunkserver
Multiple chunkservers, free to join and leave
Store the actual data
Report chunk locations to the master
Checksum the data for integrity (see the sketch below)
Chunks are replicated by the master
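A small sketch of per-block checksumming as a chunkserver might do it: GFS keeps a 32-bit checksum for every 64 KB block of a chunk. CRC32 is assumed here only because the paper does not name the exact checksum function.

```python
import zlib

BLOCK_SIZE = 64 * 1024  # GFS checksums chunk data in 64 KB blocks

def block_checksums(chunk_data: bytes) -> list:
    """Compute one CRC32 per 64 KB block of a chunk."""
    return [zlib.crc32(chunk_data[i:i + BLOCK_SIZE])
            for i in range(0, len(chunk_data), BLOCK_SIZE)]

def verify_read(chunk_data: bytes, stored_checksums: list) -> bool:
    """Recompute checksums before returning data to a reader; a mismatch means
    this replica is corrupt and the data must be fetched from another replica."""
    return block_checksums(chunk_data) == stored_checksums
```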
7
Interface
Normal operations: create, delete, open, close, read, write
Additional operations:
Snapshot: creates a copy of a file or directory tree using copy-on-write
Record append: atomic; returns the resulting offset to the client
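An illustrative client-facing interface covering these operations; the method names and signatures are assumptions sketched from the slide, not the real GFS client library.

```python
from abc import ABC, abstractmethod

class GFSClient(ABC):
    """Hypothetical client API mirroring the operations listed above."""

    @abstractmethod
    def create(self, path: str) -> None: ...
    @abstractmethod
    def delete(self, path: str) -> None: ...
    @abstractmethod
    def read(self, path: str, offset: int, length: int) -> bytes: ...
    @abstractmethod
    def write(self, path: str, offset: int, data: bytes) -> None: ...

    @abstractmethod
    def snapshot(self, src_path: str, dst_path: str) -> None:
        """Copy a file or directory tree; chunks are shared copy-on-write."""

    @abstractmethod
    def record_append(self, path: str, data: bytes) -> int:
        """Atomically append `data` at an offset GFS chooses; return that offset."""
```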
8
System Interaction - Read
Client sends the file name and chunk index to the master (a request can cover multiple chunks)
Master returns the replica locations (and may also return locations for the next chunks)
Client sends its request to a chunkserver
Chunkserver returns the data
Further reads of the same chunks require no client-master interaction
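A sketch of this read path, assuming hypothetical master/chunkserver RPC stubs (`lookup_chunk` and `read_chunk` are made-up names). The client translates a byte offset into a chunk index, asks the master once, caches the answer, and reads directly from a chunkserver.

```python
def gfs_read(master, path, offset, length, chunk_size=64 * 1024 * 1024):
    """Read `length` bytes of `path` starting at `offset` (single-chunk case)."""
    chunk_index = offset // chunk_size
    chunk_offset = offset % chunk_size

    # One master round trip: file name + chunk index -> chunk handle + replica locations.
    chunk_handle, replicas = master.lookup_chunk(path, chunk_index)

    # Read from any replica (e.g. the closest); later reads of this chunk
    # need no further master contact because the locations are cached.
    chunkserver = replicas[0]
    return chunkserver.read_chunk(chunk_handle, chunk_offset, length)
```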
9
System Interaction - Write
Master selects a primary chunkserver and grants it a lease
Client asks the master for the locations of the primary and the secondaries
Client pushes the data to all replicas
All replicas acknowledge receipt to the client
Client sends the write request to the primary
Primary executes the request and forwards it to the secondaries
Secondaries apply all mutations in the order chosen by the primary
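A sketch of the client side of this write path, again with hypothetical RPC stubs; the point it illustrates is that data flow (pushed to every replica) is decoupled from control flow (the write request goes only to the primary, which serializes the mutation).

```python
def gfs_write(master, path, chunk_index, offset, data):
    """Write `data` at `offset` within one chunk of `path` (illustrative stubs)."""
    # 1. Control: find the lease-holding primary and the secondary replicas.
    primary, secondaries = master.get_lease_holder(path, chunk_index)

    # 2. Data: push the bytes to every replica; each buffers it until commit.
    for replica in [primary] + secondaries:
        replica.push_data(data)

    # 3. Control: ask the primary to commit. It assigns a serial order, applies
    #    the mutation locally, and forwards the request to the secondaries,
    #    which apply it in that same order.
    ok = primary.write(offset, len(data), secondaries)
    if not ok:
        raise IOError("write failed; the client should retry the mutation")
    return ok
```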
10
System Interaction - Append
Same control flow as a write
If the chunk lacks enough space, the primary pads it and the client retries on the next chunk
Each append is at most ¼ of the chunk size
Larger appends are broken into multiple operations
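A sketch of the primary's record-append decision under these rules; the `chunk` object and its `pad`/`append` methods are assumed for illustration.

```python
CHUNK_SIZE = 64 * 1024 * 1024          # 64 MB chunks
MAX_APPEND = CHUNK_SIZE // 4           # appends are limited to 1/4 of the chunk size

def record_append_at_primary(chunk, data):
    """Return ("ok", offset) or ("retry", None); on retry the client re-issues
    the append, which lands on the next chunk of the file."""
    if len(data) > MAX_APPEND:
        raise ValueError("client must break large appends into multiple operations")

    if chunk.used + len(data) > CHUNK_SIZE:
        # Not enough room: pad the rest of the chunk (on all replicas) and
        # tell the client to retry.
        chunk.pad(CHUNK_SIZE - chunk.used)
        return "retry", None

    offset = chunk.used
    chunk.append(data)                  # also forwarded to the secondaries
    return "ok", offset                 # this offset is returned to the client
```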
11
Consistency Model
Consistency levels: defined, consistent (but undefined), inconsistent
Implications for applications:
Rely on appends rather than overwrites
Checkpoint
Use self-validating, self-identifying records (see the sketch below)
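One possible encoding (an assumption, not a format from the paper) of self-validating, self-identifying records: a unique id lets readers drop duplicates left by append retries, and a checksum lets them skip padding and garbage.

```python
import hashlib
import struct
import uuid

HEADER = 4 + 16 + 20   # length prefix + record id + SHA-1 checksum

def encode_record(payload: bytes) -> bytes:
    """Prefix a payload with its length, a unique id, and a checksum."""
    record_id = uuid.uuid4().bytes                          # self-identifying
    checksum = hashlib.sha1(record_id + payload).digest()   # self-validating
    return struct.pack(">I", len(payload)) + record_id + checksum + payload

def decode_records(region: bytes):
    """Yield each valid payload once, skipping padding, garbage, and duplicates."""
    seen, pos = set(), 0
    while pos + HEADER <= len(region):
        (length,) = struct.unpack(">I", region[pos:pos + 4])
        record_id = region[pos + 4:pos + 20]
        checksum = region[pos + 20:pos + 40]
        payload = region[pos + 40:pos + 40 + length]
        if len(payload) == length and hashlib.sha1(record_id + payload).digest() == checksum:
            if record_id not in seen:
                seen.add(record_id)
                yield payload
            pos += HEADER + length
        else:
            pos += 1   # resynchronize byte by byte past padding or a torn record
```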
12
Evaluation - Micro-benchmarks
13
Evaluation - Real world clusters
Cluster A: research and development; a few MBs to a few TBs of data; tasks run for up to a few hours
Cluster B: production use; continuously generates and processes multi-TB data sets; long-running tasks
14
Storage and Metadata
15
Read/Write Rate
16
Recovery Time
Kill one chunkserver: 15,000 chunks containing 600 GB of data; all chunks restored within 23.2 minutes
Kill two chunkservers: each holding chunks totaling 660 GB of data; leaves 266 chunks with only a single replica; those chunks restored to at least 2x replication within 2 minutes
17
Conclusion
Some traditional file-system assumptions no longer hold:
Failures are normal
Optimize for large files
Optimize for appends
Fault tolerance through constant monitoring, replication, and fast recovery
18
Flat Datacenter Storage (FDS)
Full-bisection-bandwidth network
Flat storage model
Non-blocking API
Single master, multiple tractservers
Deterministic data placement (see the sketch below)
Dynamic work allocation with small work units
Parallel writes to all replicas
Parallel replication
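A sketch of FDS-style deterministic placement: the row of the tract locator table (TLT) responsible for tract i of a blob is derived from a hash of the blob's GUID plus the tract number, so clients can locate data without a per-access metadata lookup. SHA-1 is an assumption here; FDS only requires a uniform hash.

```python
import hashlib

def tract_locator(blob_guid: bytes, tract_number: int, tlt_length: int) -> int:
    """Map (blob, tract) deterministically onto a TLT row."""
    h = int.from_bytes(hashlib.sha1(blob_guid).digest(), "big")
    return (h + tract_number) % tlt_length

def locate(tlt, blob_guid: bytes, tract_number: int):
    """Return the tractservers (one per replica) responsible for a tract;
    `tlt` is assumed to be a list of replica lists handed to clients by the master."""
    return tlt[tract_locator(blob_guid, tract_number, len(tlt))]
```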
19
Tachyon
Pushes lineage into the storage layer
Lineage information is persisted before the actual data (see the sketch below)
Asynchronous, selective checkpointing: leaves and hot files first
Resource allocation based on job priority
Uses client-side caching to increase the replication factor
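A minimal sketch of the "lineage before data" ordering, with an assumed `lineage_log` file object and in-memory store; the record layout is illustrative, not Tachyon's actual format.

```python
import json

def write_with_lineage(lineage_log, memory_store, output_file, inputs, job, data):
    """Persist how `output_file` can be recomputed before storing its data.

    If the in-memory copy is lost before it is checkpointed, the file can be
    regenerated by re-running `job` on `inputs` as recorded in the log.
    """
    # 1. Make the lineage durable first (e.g. a file on disk or a replicated log).
    lineage_log.write(json.dumps({"output": output_file,
                                  "inputs": inputs,
                                  "job": job}) + "\n")
    lineage_log.flush()

    # 2. Only after the lineage is safe, write the data into memory-speed storage.
    memory_store[output_file] = data
```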
20
Discussion
How should the design be changed to handle small files?
How could multiple masters be used to avoid a single point of failure?
How can consistency be improved?