Published by Göran Bergström. Modified over 5 years ago.
1
Federated, Available, and Reliable Storage for an Incompletely Trusted Environment
Atul Adya, William J. Bolosky, Miguel Castro, Gerald Cermak, Ronnie Chaiken, John R. Douceur, Jon Howell, Jacob R. Lorch, Marvin Theimer, Roger P. Wattenhofer
2
Overview
Secure, serverless distributed filesystem on untrusted hosts
Implemented in MS Windows
Byzantine state machine approach for metadata
File data replicated for availability
Cryptographically secured for privacy
Performance improvements through caching, leasing, delayed updates
3
Design motivations
Serverless
Lower-cost maintenance and administration
No centralized failure points
Ensures user privacy
Design decision based on file access traces obtained at Microsoft
Untrusted hosts can be used for trusted storage
4
Design Assumptions
About 10^5 machines, typical of a corporate or campus network
A small portion of the machines are malicious
Machine availability is moderately high
Cryptographic operations do not add significant overhead
A high percentage of disk space on office machines is unused
5
Filesystem overview
[Figure: example namespace tree rooted at /, with nodes labeled Users, Shared, Cusp, Cruft, Alice, Bob, Charlie, Docs, emacs, PowerPoint, Exchange]
6
Filesystem overview
Machines are classed as clients, directory groups, and file hosts
Metadata is stored and maintained on a directory group using a Byzantine fault-tolerant protocol
File data is stored on file hosts using raw replication
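The fault-tolerance arithmetic behind these two replication styles can be sketched as follows. This is a minimal illustration, assuming the standard Byzantine bound (a group needs 3f + 1 members to tolerate f faulty ones) and assuming that a plainly replicated file stays readable while at least one replica survives. The function names are hypothetical and do not come from the presentation.

```python
def byzantine_faults_tolerated(group_size: int) -> int:
    """Largest f such that group_size >= 3f + 1 (standard BFT bound)."""
    if group_size < 1:
        raise ValueError("a directory group needs at least one member")
    return (group_size - 1) // 3


def replica_losses_tolerated(replica_count: int) -> int:
    """With raw replication, data survives while one replica remains."""
    if replica_count < 1:
        raise ValueError("a file needs at least one replica")
    return replica_count - 1


if __name__ == "__main__":
    # Compare the two bounds for a few illustrative group sizes.
    for n in (4, 7, 10):
        print(n, byzantine_faults_tolerated(n), replica_losses_tolerated(n))
```

The comparison suggests why the two kinds of state can be treated differently: for the same number of machines, plain replication survives far more failures than a Byzantine group does, but it offers no protection against hosts that return wrong data, which is where the hash kept in the metadata comes in.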
7
Metadata operations
Directory groups maintain metadata
When load increases, the group can delegate a subtree to another set of machines
Metadata includes a one-way hash of the file data (more on that later)
Manages access control
Metadata is aggressively re-replicated if any machine in the group fails
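The role of the one-way hash can be sketched as a check a client runs on data fetched from an untrusted file host. This is an illustration only: SHA-256 is an assumed choice of hash, and `content_digest` and `verify_file_data` are hypothetical names, not part of the system described here.

```python
import hashlib
import hmac


def content_digest(data: bytes) -> bytes:
    """One-way hash of file data (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(data).digest()


def verify_file_data(data: bytes, expected_digest: bytes) -> bool:
    """Accept data from a file host only if it matches the trusted digest."""
    return hmac.compare_digest(content_digest(data), expected_digest)


if __name__ == "__main__":
    trusted = content_digest(b"quarterly report")  # held by the directory group
    print(verify_file_data(b"quarterly report", trusted))
    print(verify_file_data(b"tampered report", trusted))
```

Because the digest lives in metadata that is protected by the Byzantine protocol, a file host that corrupts or substitutes data can be detected without having to trust the host itself.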
8
Features
Reliability and Availability
  Data and metadata are replicated
Security
  File data is encrypted so that only authorized users can decrypt (read) it
  Updates are committed only if the user is authorized to write
  Privacy is managed by encrypting all data and metadata
9
Features
Security
  Integrity is maintained by keeping a Merkle hash tree
Durability
  Updates are logged to prevent inconsistent data and metadata
  Logs are periodically pushed out and possibly compressed
  File data is also logged so that modify operations are atomic
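A Merkle hash tree lets a single root hash vouch for every block of a file. The sketch below is a generic construction under stated assumptions, not the layout used by the system in this presentation: SHA-256 as the hash, one-byte prefixes to separate leaves from interior nodes, and an unpaired node promoted unchanged to the next level. The name `merkle_root` is hypothetical.

```python
import hashlib
from typing import Sequence


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(blocks: Sequence[bytes]) -> bytes:
    """Root hash over file blocks; any changed block changes the root."""
    if not blocks:
        raise ValueError("need at least one block")
    # Prefix leaves with 0x00 and interior nodes with 0x01 so that a leaf
    # can never be confused with an interior node.
    level = [_h(b"\x00" + block) for block in blocks]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            nxt.append(_h(b"\x01" + level[i] + level[i + 1]))
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # promote the unpaired node unchanged
        level = nxt
    return level[0]


if __name__ == "__main__":
    original = [b"block-0", b"block-1", b"block-2"]
    tampered = [b"block-0", b"block-X", b"block-2"]
    print(merkle_root(original) == merkle_root(tampered))
```

The practical benefit of the tree shape is that checking or updating one block only requires the hashes along that block's path to the root, rather than rehashing the whole file.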
10
Performance decisions
Does not use erasure codes
Uses local caching and leases to minimize remote operations
Updates are batched to reduce overhead
Read/write sharing is kept to a minimum
Reclaims space from identical files
Hint-based pathname translation and delayed directory-change notification
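How a lease lets a client work from its local cache without contacting remote machines can be sketched as below. This is a deliberately simplified model, assuming a single exclusive lease per path with a fixed duration. The slides do not describe the system's actual lease types or recall mechanism, and `Lease` and `LeaseTable` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Lease:
    holder: str
    expires_at: float

    def is_valid(self, now: float) -> bool:
        return now < self.expires_at


class LeaseTable:
    """Grants one exclusive, time-limited lease per path (an assumption)."""

    def __init__(self, duration: float) -> None:
        self.duration = duration
        self._leases: Dict[str, Lease] = {}

    def acquire(self, path: str, holder: str, now: float) -> Optional[Lease]:
        current = self._leases.get(path)
        if current and current.is_valid(now) and current.holder != holder:
            return None  # another client still holds a live lease
        lease = Lease(holder, now + self.duration)
        self._leases[path] = lease
        return lease


if __name__ == "__main__":
    table = LeaseTable(duration=10.0)
    print(table.acquire("/Users/Alice/Docs", "alice", now=0.0) is not None)
    print(table.acquire("/Users/Alice/Docs", "bob", now=5.0) is None)
    print(table.acquire("/Users/Alice/Docs", "bob", now=10.0) is not None)
```

While a lease is valid, its holder can serve reads and buffer writes locally. The expiry time bounds how long other clients may have to wait if the holder becomes unreachable.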
11
Implementation
Windows user-level service, kernel-level FS driver
Uses RDBSS
Uniquely, it follows NTFS semantics (with a few exceptions)
Provides Windows file semantics instead of Unix
12
Performance
Machine trace used for performance measurement
20% faster than CIFS
600% slower than NTFS
13
Future Work
Quota management
Monitoring availability for smart replication
Implementation of recovery, group delegation, and duplicate file detection