Published by Amy Singleton. Modified over 9 years ago.
1
Google File System
Robert Nishihara
2
What is GFS?
Distributed filesystem for large-scale distributed applications
3
Setting
Frequent hardware failures
Large files
Most writes are appends
Most reads are sequential
Throughput > latency
4
Architecture
Files divided into 64MB “chunks”
Chunkservers store/write/serve chunks
Master maps files -> chunks
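As a rough illustration of the file -> chunk mapping, a byte range in a file can be translated into chunk indexes with integer arithmetic. This is a minimal sketch, not GFS client code: the function names `chunk_index` and `split_read` are hypothetical, and only the 64MB chunk size comes from the slide.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64MB chunks, as stated on the slide


def chunk_index(byte_offset: int) -> int:
    """Index of the chunk that holds the given byte offset of a file."""
    return byte_offset // CHUNK_SIZE


def split_read(offset: int, length: int):
    """Split a read into per-chunk pieces.

    Returns a list of (chunk_index, offset_within_chunk, num_bytes).
    Hypothetical helper for illustration only.
    """
    pieces = []
    end = offset + length
    while offset < end:
        idx = offset // CHUNK_SIZE
        within = offset % CHUNK_SIZE
        # Read up to the end of this chunk or the end of the request.
        n = min(CHUNK_SIZE - within, end - offset)
        pieces.append((idx, within, n))
        offset += n
    return pieces
```

A read that straddles a chunk boundary becomes two pieces, each of which would be looked up separately in the master's file -> chunks map.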
6
Design Decisions
Large chunks (64MB)
  – Pro: fewer client/master interactions
  – Pro: less metadata
No caching of file data
Writes optimized for “appends”
Single master => optimizations
Master metadata stored in memory
  – Pro: master operations are fast
  – Con: limits number of files
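The trade-off between chunk size, metadata volume, and the number of files can be made concrete with a back-of-the-envelope estimate. The sketch below assumes roughly 64 bytes of master metadata per chunk; that figure is an assumption for illustration (it is not on the slides), and per-file namespace metadata is ignored. The names `chunks_needed` and `master_memory_bytes` are hypothetical.

```python
CHUNK_SIZE = 64 * 2**20          # 64MB chunks, as stated on the slide
BYTES_PER_CHUNK_RECORD = 64      # ASSUMPTION: metadata kept per chunk


def chunks_needed(file_size: int, chunk_size: int = CHUNK_SIZE) -> int:
    """Number of chunks a file of the given size occupies (integer ceiling)."""
    return -(-file_size // chunk_size)


def master_memory_bytes(file_sizes, chunk_size: int = CHUNK_SIZE) -> int:
    """Estimated in-memory chunk metadata for a collection of files."""
    total_chunks = sum(chunks_needed(s, chunk_size) for s in file_sizes)
    return total_chunks * BYTES_PER_CHUNK_RECORD
```

Under these assumptions, 1 PiB stored in 64MB chunks needs 2^24 chunk records, about 1 GiB of metadata, while the same data in 64KB chunks would need about 1 TiB. Every small file still costs a full chunk record, which is why the number of files, not just total bytes, is what the master's memory limits.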
7
Fault Tolerance
Chunks replicated (3x by default)
Master state replicated (both logs and checkpoints)
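One way to picture chunk replication is as a check the master could run over its chunk -> chunkserver map to find chunks that have fallen below the replication target. This is a minimal sketch under assumed data structures: the dictionary layout, the server names, and the function `under_replicated` are all hypothetical; only the default of 3 replicas comes from the slide.

```python
REPLICATION_TARGET = 3  # 3x by default, as stated on the slide


def under_replicated(chunk_locations, live_servers, target=REPLICATION_TARGET):
    """Return {chunk_id: missing_replica_count} for chunks below target.

    chunk_locations: dict mapping chunk id -> set of chunkserver ids
                     (hypothetical layout, for illustration only).
    live_servers:    set of chunkserver ids currently believed alive.
    """
    missing = {}
    for chunk_id, servers in chunk_locations.items():
        live_replicas = servers & live_servers
        if len(live_replicas) < target:
            missing[chunk_id] = target - len(live_replicas)
    return missing
```

For example, if chunkserver "e" fails, any chunk that had a replica on "e" shows up with one missing replica and becomes a candidate for re-replication.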
8
Consistency
Namespace mutation (e.g., file creation) is atomic
Relaxed guarantees (“inconsistent” regions may be interspersed between “consistent” ones)
Clients can handle de-duplication
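Client-side de-duplication can be sketched as a reader that skips records it has already seen. The sketch assumes each record carries a unique identifier chosen by the writer; the `(record_id, payload)` layout and the function `deduplicate` are hypothetical and are not a GFS API.

```python
def deduplicate(records):
    """Yield each record once, in first-seen order.

    records: iterable of (record_id, payload) pairs, where record_id is a
             writer-assigned unique identifier (ASSUMPTION for illustration).
    """
    seen = set()
    for record_id, payload in records:
        if record_id in seen:
            continue  # duplicate left behind by a retried append
        seen.add(record_id)
        yield record_id, payload
```

With this approach a retried append that lands twice in the file is harmless to the application, at the cost of the reader tracking the identifiers it has already processed.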
9
Conclusion
GFS is a filesystem designed for large-scale distributed applications
Optimized for appends and sequential reads
Fault tolerance via replication, monitoring, fast recovery, checksumming