From Boxwood to Eclipse
Eclipse Evolution
A Quick Overview of Boxwood
  Virtualized distributed storage that provides high-level abstractions
  An evolution path for distributed storage:
    Storage Applications on a Virtual Disk
    Storage Applications on Tree, Table, List, …
Boxwood Architecture
  [Figure: layered architecture diagram. Labels: Storage Application; High-level Storage Abstractions (B-Tree); Chunk Store; Reliable “Media” (Replicated Logical Device); Magnetic Media; Services (Locking, Logging, Consensus).]
Open Question
  What happens if the system experiences massive failures?
  Graceful degradation (or mitigating the “availability cliff”)
Availability Cliff
  [Figure: availability (from 1 down to 0) plotted against failures (from tolerable to more severe), contrasting the availability cliff with graceful degradation.]
Boxwood is Inherently NOT Gracefully Degradable
  [Figure: a B-Tree whose nodes (labelled A through H) are placed on machines (M: machine), together with the Paxos Service/Monitor Service.]
  Dependencies are the main problem!
Eclipse: Gracefully Degradable Storage Abstractions
  Courtesy of Shirin Observatory and Science Center, MIT
Why Eclipse?
  [Figure: system view of the current scenario along the failure axis, from tolerable failures to more severe failures.]
Degraded Availability
  A client might have a partial view of data
  A client might be allowed to perform a subset of operations on data
  We are NOT talking about graceful performance degradation!
Benefits of Graceful Degradation
  Seamless disaster recovery from massive permanent failures
  During massive transient failures or network partitions:
    Offer a partial system view until the system heals
    Return to a consistent/complete state when the system heals
Eclipse Concepts
  Gracefully degradable storage abstractions
    Sets or a collection of sets
    A subset can be considered a degradation
  Gracefully degradable and self-restoring system architecture
    Failure isolation: failure of one unit should not cause the information on other units to be inaccessible
    Paxos for the self-restoring point, but operational even without Paxos
Is Set Abstraction Useful?
  A mail service, where each mailbox can be implemented as a set
  A data retention/backup system
  MSN Spaces
  An emerging trend: a flat structure with search capability to replace a hierarchical structure
Availability Cliff Revisited
  [Figure: availability (from 1 down to 0) plotted against failures of increasing severity, with regions labelled fault tolerance, fault isolation, and self-restoration.]
Storing a Set
  Set elements are stored on multiple servers
  A local index is maintained for each locally stored element
  As long as a server is available, its local elements are accessible
  A global view can be constructed from local views
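The idea of building a global view from local views can be sketched as follows. This is a minimal illustration, not the Eclipse implementation: the `Server` class, the `global_view` function, and the choice to place each element on exactly one server (no replication) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical storage server holding some elements of one set."""
    name: str
    up: bool = True
    # Local index: element id -> payload, for locally stored elements only.
    local_index: dict = field(default_factory=dict)

    def local_view(self):
        """Return the ids of the elements stored on this server."""
        if not self.up:
            raise ConnectionError(f"{self.name} is unreachable")
        return set(self.local_index)


def global_view(servers):
    """Union of the local views of every reachable server.

    Returns (element ids, complete flag). The flag is False when at least
    one server could not be reached, so the view may be a strict subset.
    """
    seen, complete = set(), True
    for server in servers:
        try:
            seen |= server.local_view()
        except ConnectionError:
            complete = False
    return seen, complete


servers = [
    Server("s1", local_index={"a": b"...", "b": b"..."}),
    Server("s2", local_index={"c": b"..."}),
    Server("s3", local_index={"d": b"...", "e": b"..."}),
]
full, ok = global_view(servers)
servers[1].up = False                      # simulate the failure of s2
partial, ok_after = global_view(servers)   # degraded: a subset of `full`
```

With `s2` down, the client still sees the elements on `s1` and `s3`; the returned flag tells it that the view is partial rather than complete.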
Global Index as Soft State
  Global index for each set
    Can also maintain metadata for each element
    Soft state, for performance improvements
    Can support more complex data structures
  Map set id to the server maintaining the set’s global index
    Paxos maintains the authoritative mapping
    Mapping disseminated to all servers as hints
    Same set might have multiple index servers during massive failures
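One way to picture the authoritative mapping and its hints is the sketch below. It is only an illustration under stated assumptions: `Authority` is a plain in-memory stand-in for the Paxos-maintained mapping (no consensus protocol is implemented), `HintCache` and its fallback policy are hypothetical, and the set ids and server names are made up.

```python
class Authority:
    """Stand-in for the Paxos-maintained mapping: set id -> index server."""

    def __init__(self):
        self.mapping = {}
        self.available = True

    def lookup(self, set_id):
        if not self.available:
            raise ConnectionError("authority unreachable")
        return self.mapping[set_id]


class HintCache:
    """Per-server cache of the mapping; entries are hints and may be stale."""

    def __init__(self, authority):
        self.authority = authority
        self.hints = {}

    def index_server_for(self, set_id):
        """Prefer the authoritative answer; fall back to the cached hint.

        Returns (server name or None, whether the answer is authoritative).
        """
        try:
            owner = self.authority.lookup(set_id)
            self.hints[set_id] = owner          # refresh the hint
            return owner, True
        except ConnectionError:
            return self.hints.get(set_id), False


authority = Authority()
authority.mapping["mailbox-42"] = "s1"
cache = HintCache(authority)

first = cache.index_server_for("mailbox-42")    # authoritative answer
authority.available = False                     # authority becomes unreachable
authority.mapping["mailbox-42"] = "s2"          # change not visible to the cache
second = cache.index_server_for("mailbox-42")   # served from a stale hint
unknown = cache.index_server_for("mailbox-7")   # no hint was ever cached
```

Because the hint can be stale, two servers may briefly disagree about which index server is responsible for a set, which matches the slide's remark that a set might have multiple index servers during massive failures.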
Replication Strategies
  Limited operations during degraded mode (e.g., no updates)
  Optimistic replication: hard to figure out the stabilization point
  Inherently weak semantics: immutable elements, tolerance to reappearance of deleted elements
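The first and third strategies can be illustrated together with a small sketch. The `SetReplica` class, its method names, and the union-based merge without deletion records are assumptions chosen for the example; the slides do not specify how Eclipse reconciles replicas.

```python
class DegradedModeError(Exception):
    """Raised when an update is attempted while the replica is degraded."""


class SetReplica:
    """Hypothetical replica of one set; elements are immutable once added."""

    def __init__(self):
        self.elements = {}      # element id -> immutable payload
        self.degraded = False   # True while only a partial view is available

    def add(self, eid, payload):
        self._require_healthy()
        # Immutability: re-adding an id with different content is rejected.
        if self.elements.get(eid, payload) != payload:
            raise ValueError(f"element {eid!r} is immutable")
        self.elements[eid] = payload

    def remove(self, eid):
        self._require_healthy()
        self.elements.pop(eid, None)

    def contains(self, eid):
        return eid in self.elements   # reads stay available when degraded

    def merge_from(self, other):
        """Reconcile by set union. No deletion records are kept, so an
        element deleted here can reappear if `other` still holds it."""
        for eid, payload in other.elements.items():
            self.elements.setdefault(eid, payload)

    def _require_healthy(self):
        if self.degraded:
            raise DegradedModeError("updates are disabled in degraded mode")


r1, r2 = SetReplica(), SetReplica()
for replica in (r1, r2):
    replica.add("x", b"msg")

r1.remove("x")                  # r2 never observes this delete
r1.merge_from(r2)               # union-based reconciliation
reappeared = r1.contains("x")   # the deleted element is back

r2.degraded = True
try:
    r2.add("y", b"new")
    rejected = False
except DegradedModeError:
    rejected = True
```

Since elements are immutable, a union merge never has to choose between conflicting versions of the same element; the price is that a deleted element may come back, which the application has to tolerate.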
Related Work
  Optimistic Replication (Saito and Shapiro)
    Bayou (Terry et al.), Coda (Kumar and Satyanarayanan), Ficus (Reiher et al.), and Locus (Walker et al.)
  Fault Isolation
    Hive (Chapin et al.), Archipelago (Ji et al.), Porcupine (Saito et al.), Pangaea (Saito et al.), and D-GRAID (Sivathanu et al.)
  Other Related Work
    Harvest, Yield, and Scalable Tolerant Systems (Fox and Brewer), TACT (Yu and Vahdat)