The Google File System
Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung
Presenter: Chao-Han Tsai (some slides adapted from Google's lecture series)
EECS 582 – W16
Motivation
- Google needed a good distributed file system
  - Redundant storage of massive amounts of data on commodity computers (cheap and unreliable)
- Why not use an existing file system?
  - Google's problems are different from others' in terms of workload and design priorities
  - The Google File System is designed for Google applications, and Google applications are designed for GFS
Assumptions
- High component failure rates
  - Inexpensive commodity components often fail
- Modest number of huge files
  - A few million files of 100 MB or larger
- Files are write-once, and mostly appended to
- Large streaming reads
- High sustained bandwidth is favored over low latency
Design Decisions
- Files stored as chunks
  - Fixed size (64 MB); see the offset-to-chunk sketch below
- Reliability through replication
  - Each chunk is replicated across 3+ chunkservers
- Single master to coordinate access and keep metadata
  - Simple centralized management
- No data caching
  - Little benefit, due to large data sets and streaming reads
- Familiar interface, but a customized API
  - Snapshot and record append
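Because the chunk size is fixed at 64 MB, a client can compute which chunk a given file offset falls in by simple arithmetic, before ever contacting the master. A minimal sketch of that calculation (the function names are mine, not part of any GFS API):

```python
# Minimal sketch (not actual GFS client code): mapping a file byte offset
# to a chunk index, assuming the fixed 64 MB chunk size from the slide.
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB

def chunk_index(byte_offset: int) -> int:
    """Return the index of the chunk containing byte_offset."""
    return byte_offset // CHUNK_SIZE

def chunk_range(index: int) -> tuple[int, int]:
    """Return the [start, end) byte range covered by chunk `index`."""
    return index * CHUNK_SIZE, (index + 1) * CHUNK_SIZE

if __name__ == "__main__":
    # A read at byte 200,000,000 falls in chunk 2 (offsets 128 MiB .. 192 MiB).
    print(chunk_index(200_000_000))   # -> 2
    print(chunk_range(2))             # -> (134217728, 201326592)
```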
Architecture
(GFS architecture diagram: clients, a single master, and many chunkservers holding replicated chunks)
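The key point of the architecture is the separation of metadata and data paths: a client asks the master only for a chunk's handle and replica locations, then reads the bytes directly from a chunkserver. Below is a toy, single-process sketch of that read path; the classes, method names, and data structures are illustrative stand-ins, not the real GFS interfaces.

```python
# Rough sketch of the GFS read path: metadata from the master, data
# directly from a chunkserver.  All names here are illustrative.
CHUNK_SIZE = 64 * 1024 * 1024

class Master:
    def __init__(self):
        # filename -> list of (chunk_handle, [replica chunkserver ids])
        self.file_table = {"/logs/web.0": [("h1", ["cs1", "cs2", "cs3"])]}

    def lookup(self, filename, chunk_index):
        return self.file_table[filename][chunk_index]

class Chunkserver:
    def __init__(self, chunks):
        self.chunks = chunks                     # chunk_handle -> bytes

    def read(self, handle, offset, length):
        return self.chunks[handle][offset:offset + length]

def client_read(master, chunkservers, filename, offset, length):
    handle, replicas = master.lookup(filename, offset // CHUNK_SIZE)  # metadata only
    server = chunkservers[replicas[0]]           # pick any replica (e.g. the closest)
    return server.read(handle, offset % CHUNK_SIZE, length)           # data bypasses the master

chunkservers = {cs: Chunkserver({"h1": b"hello gfs"}) for cs in ("cs1", "cs2", "cs3")}
print(client_read(Master(), chunkservers, "/logs/web.0", 0, 5))       # b'hello'
```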
Single Master
- Problem
  - Single point of failure
  - Scalability bottleneck
- GFS solutions
  - Shadow masters
  - Minimize master involvement
    - Never move data through the master; it is used only for metadata
    - Large chunk size
    - Master delegates authority to primary replicas for data mutations (chunk leases)
Metadata
- Metadata is stored on the master
  - File and chunk namespaces
  - Mapping from files to chunks
  - Locations of each chunk's replicas
- All in memory (~64 bytes per chunk; see the sizing estimate below)
  - Fast
  - Easily accessible
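To see why keeping all metadata in memory is feasible, here is a back-of-the-envelope estimate using the slide's ~64 bytes per chunk and the 64 MB chunk size; the 1 PB of file data is an arbitrary example, not a figure from the paper.

```python
# Back-of-the-envelope estimate of the master's in-memory metadata,
# using ~64 bytes per chunk and 64 MB chunks from the slides.
CHUNK_SIZE = 64 * 1024 * 1024          # 64 MB
METADATA_PER_CHUNK = 64                # bytes, per the slide

data_bytes = 1024 ** 5                 # 1 PB of file data (example only)
chunks = data_bytes // CHUNK_SIZE      # ~16.8 million chunks
metadata_bytes = chunks * METADATA_PER_CHUNK

print(chunks)                          # 16777216
print(metadata_bytes / 1024 ** 3)      # 1.0  -> about 1 GB of master RAM
```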
Metadata
- The master keeps an operation log for persistent logging of critical metadata updates
  - Persisted on local disk
  - Replicated
  - Checkpoints for faster recovery
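The ordering is what makes the log useful: a metadata mutation is appended and flushed to the operation log before it is applied to the in-memory state, and periodic checkpoints bound how much of the log must be replayed after a crash. A minimal write-ahead-log sketch under those assumptions (file names and formats are invented; real GFS also replicates the log to remote machines before acknowledging):

```python
import json, os

class MetadataStore:
    """Toy master metadata store: write-ahead log plus checkpoint."""

    def __init__(self, log_path="oplog.jsonl", ckpt_path="checkpoint.json"):
        self.log_path, self.ckpt_path = log_path, ckpt_path
        self.state = {}                          # e.g. file name -> list of chunk handles
        self._recover()

    def apply(self, op):
        # Persist the record first, and only then mutate in-memory state.
        with open(self.log_path, "a") as f:
            f.write(json.dumps(op) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.state[op["file"]] = op["chunks"]

    def checkpoint(self):
        # Dump the current state and truncate the log; recovery now only
        # replays operations logged after this point.
        with open(self.ckpt_path, "w") as f:
            json.dump(self.state, f)
        open(self.log_path, "w").close()

    def _recover(self):
        if os.path.exists(self.ckpt_path):
            with open(self.ckpt_path) as f:
                self.state = json.load(f)        # load the last checkpoint
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for line in f:                   # replay the log tail
                    op = json.loads(line)
                    self.state[op["file"]] = op["chunks"]
```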
Mutations
- Mutation = write or record append
  - Must be performed on all replicas
- Goal: minimize master involvement
- Lease mechanism (sketched below)
  - The master picks one replica as primary and gives it a "lease" for mutations
  - The primary defines a serial order of mutations
  - All replicas follow this order
- Data flow is decoupled from control flow
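A toy sketch of the lease-based ordering: the primary (which holds the master-granted lease) assigns serial numbers, and every replica applies mutations in that order. The data-flow chain that actually pushes the bytes between chunkservers is omitted, and the classes are illustrative only.

```python
# Toy sketch of lease-based mutation ordering; not the real GFS protocol.
class Replica:
    def __init__(self, name):
        self.name, self.log = name, []

    def apply(self, serial, mutation):
        self.log.append((serial, mutation))      # apply mutations in serial order

class Primary(Replica):
    def __init__(self, name, secondaries):
        super().__init__(name)
        self.secondaries, self.next_serial = secondaries, 0

    def mutate(self, mutation):
        serial = self.next_serial                # the primary defines the order
        self.next_serial += 1
        self.apply(serial, mutation)
        for s in self.secondaries:               # forward the same order to all replicas
            s.apply(serial, mutation)
        return serial

secondaries = [Replica("cs2"), Replica("cs3")]
primary = Primary("cs1", secondaries)
primary.mutate(b"append A")
primary.mutate(b"append B")
print(secondaries[0].log == primary.log)         # True: identical order on every replica
```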
Atomic Record Append
- GFS appends the record to the file atomically, at least once
  - GFS picks the offset
  - Works for concurrent writers
- Used heavily by Google applications
  - For files that serve as multiple-producer/single-consumer queues
  - To merge results from many machines into one file
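The contract is "at least once, at an offset GFS chooses": if an acknowledgement is lost, the client retries, so a record may appear in the file more than once and readers must tolerate duplicates. A small deterministic sketch of that behavior (not the real client protocol):

```python
# Illustrative sketch of record-append semantics: the client supplies only
# the data, the chunkserver picks the offset, and a retry after a lost ack
# leaves a duplicate record behind ("at least once").
class Chunk:
    def __init__(self):
        self.data = bytearray()

    def record_append(self, record: bytes) -> int:
        offset = len(self.data)          # GFS, not the client, picks the offset
        self.data += record
        return offset

def append_with_one_lost_ack(chunk, record):
    # Simulate one lost acknowledgement: the append happened, but the client
    # never hears back, so it retries and the record is duplicated.
    chunk.record_append(record)          # attempt 1: ack "lost"
    return chunk.record_append(record)   # attempt 2: succeeds, offset returned

chunk = Chunk()
chunk.record_append(b"rec1|")
offset = append_with_one_lost_ack(chunk, b"rec2|")
print(bytes(chunk.data), offset)         # b'rec1|rec2|rec2|' 10 -> readers must skip duplicates
```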
Master's Responsibilities
- Metadata storage
- Namespace management/locking
- Heartbeats with chunkservers
  - Give instructions, collect state, track cluster health
- Chunk creation, re-replication, rebalancing
  - Balance space utilization and access speed
  - Re-replicate data if redundancy drops below a threshold (sketched below)
  - Rebalance data to smooth out storage and request load
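A simplified sketch of the re-replication decision: when a chunk's live replica count falls below the target (three by default), the master plans copies from a surviving replica onto other chunkservers. The data structures and the placement policy here are placeholders; real GFS also weighs disk utilization and load when choosing destinations.

```python
# Simplified re-replication planner; assumes each under-replicated chunk
# still has at least one live replica to copy from.
TARGET_REPLICAS = 3

def plan_re_replication(chunk_replicas, live_servers):
    """chunk_replicas: handle -> set of chunkservers currently holding it."""
    plan = []
    for handle, holders in chunk_replicas.items():
        holders = holders & live_servers                  # drop failed servers
        missing = TARGET_REPLICAS - len(holders)
        candidates = sorted(live_servers - holders)       # real GFS also balances load/space
        for dst in candidates[:missing]:
            plan.append((handle, next(iter(holders)), dst))   # (chunk, copy-from, copy-to)
    return plan

replicas = {"h1": {"cs1", "cs2", "cs3"}, "h2": {"cs1", "cs4"}}
live = {"cs1", "cs2", "cs3", "cs5"}                       # cs4 has failed
print(plan_re_replication(replicas, live))
# h2 lost cs4 and had only two replicas to begin with, so two copies are
# planned from cs1: [('h2', 'cs1', 'cs2'), ('h2', 'cs1', 'cs3')]
```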
Master's Responsibilities
- Garbage collection
  - Simple and reliable in a distributed system where failures are common
  - The master logs the deletion and renames the file to a hidden name
  - Hidden files are lazily garbage-collected (by default, after three days)
- Stale replica deletion
  - Detect stale replicas using chunk version numbers (sketched below)
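A small sketch of stale-replica detection: the master records the latest version number of each chunk, and any replica that reports an older version in a heartbeat (for example, because it was down while a mutation happened) is marked stale and left for the garbage collector. The structures here are placeholders, not the real master state.

```python
# Toy stale-replica check based on chunk version numbers.
def find_stale_replicas(master_versions, reported):
    """
    master_versions: handle -> latest version number known to the master
    reported: list of (chunkserver, handle, version) from heartbeats
    """
    stale = []
    for server, handle, version in reported:
        if version < master_versions[handle]:
            stale.append((server, handle))       # replica missed mutations -> garbage-collect
    return stale

master_versions = {"h1": 7}
heartbeats = [("cs1", "h1", 7), ("cs2", "h1", 7), ("cs3", "h1", 6)]  # cs3 missed a mutation
print(find_stale_replicas(master_versions, heartbeats))              # [('cs3', 'h1')]
```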
Fault Tolerance
- High availability
  - Fast recovery: the master and chunkservers can restart in a few seconds
  - Chunk replication: default is three replicas
  - Shadow masters
- Data integrity
  - Checksum every 64 KB block in each chunk (sketched below)
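A sketch of the per-block integrity check: each 64 KB block of a chunk carries its own checksum, so a read can be verified without rescanning the whole 64 MB chunk. CRC32 is used here purely for brevity; treat the checksum choice as an assumption.

```python
# Per-block checksumming sketch: one checksum per 64 KB block of a chunk.
import zlib

BLOCK_SIZE = 64 * 1024   # 64 KB, per the slide

def block_checksums(chunk: bytes) -> list[int]:
    return [zlib.crc32(chunk[i:i + BLOCK_SIZE])
            for i in range(0, len(chunk), BLOCK_SIZE)]

def verify_block(chunk: bytes, checksums: list[int], block_index: int) -> bool:
    start = block_index * BLOCK_SIZE
    return zlib.crc32(chunk[start:start + BLOCK_SIZE]) == checksums[block_index]

chunk = bytes(3 * BLOCK_SIZE)                                        # a 192 KB chunk of zeros
sums = block_checksums(chunk)
corrupted = chunk[:BLOCK_SIZE] + b"\x01" + chunk[BLOCK_SIZE + 1:]    # flip a byte in block 1
print(verify_block(chunk, sums, 1), verify_block(corrupted, sums, 1))  # True False
```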
Evaluation
(performance figures not reproduced in this transcript)
Conclusion
- GFS shows how to support large-scale processing workloads on commodity hardware
  - Designed to tolerate frequent component failures
  - Optimized for huge files that are mostly appended to and then read
  - Simple solution (single master)