RobuSTore: Performance Isolation for Distributed Storage and Parallel Disk Arrays
Justin Burke, Huaxia Xia, and Andrew A. Chien
Department of Computer Science and Engineering and Center for Networked Systems, University of California, San Diego

Supported in part by the National Science Foundation under awards NSF Cooperative Agreement ANI (OptIPuter), NSF CCR (VGrADS), NSF ACI, and NSF Research Infrastructure Grant EIA. Support from the UCSD Center for Networked Systems, BigBangwidth, and Fujitsu is also gratefully acknowledged.

Storage Systems in the OptIPuter Project
"Optical IP Computer" (OptIPuter) is a project to develop a powerful distributed infrastructure that tightly couples computational, storage, and visualization resources over optical DWDM networks using novel software.
[Figure: OptIPuter Software Architecture. Labels shown: OptIPuter Applications; DVC #1, DVC #2, DVC #3; Higher Level Grid Services; Security Models; Data Services: DWTP; Real-Time Objects; Grid and Web Middleware (Globus/OGSA/WebServices/J2EE); Layer 5: SABUL, RBUDP, Fast, GTP; Layer 4: XCP; Node Operating Systems (Storage Systems); λ-configuration, Net Management; Physical Resources.]

Overview

Workload Assumptions
Multi-terabyte datasets.
Interactive applications, which benefit less from prefetching.

Goals
QoS Guarantees: Provide statistical guarantees about storage system performance.
Performance Isolation in Shared Environment: Minimize jitter in a competitive (shared) environment.
Performance Resilience to Failures: Cope with node failures while still maintaining QoS guarantees.
High Performance: Match hardware speeds.

RobuSTore for Distributed Storage

Design Approach
Manages remote storage nodes in a SAN-like fashion.
Additional capacity can be added independently of the current configuration.
Erasure codes make block retrieval order-independent.
Storage nodes and file blocks are managed by a metadata server (MDS).
The MDS is used to locate file blocks and provide user authentication.
Achieves performance goals by exploiting parallelism.

From Traditional Storage Methods to Erasure Encoding
[Figure: Traditional Data Storage Methods. A file is divided into segments, which are stored using striping and replication. Criteria shown: High Performance, Fault Tolerance, Performance Isolation.]
[Figure: Erasure Encoded. A file is divided into segments, which are erasure encoded into encoded segments. Criteria shown: High Performance, Fault Tolerance, Performance Isolation.]

Erasure Encoding
Encoding creates interdependencies among the encoded segments.
Any K of the N encoded segments are sufficient to reconstruct the original file.

Design Approach
Reconstructing the file from the first set of blocks returned by a large group of storage servers improves both latency and bandwidth.
The ability to reconstruct the file from any set of blocks yields robust and isolated performance.
[Figure: Received blocks and reconstructed blocks in two settings, "Grid with slow network" and "Lambda Grids". Status labels shown: "Complete!" and "Delayed!".]

RobuSTore for Parallel Drive Arrays
Exploit detailed knowledge of drive internals to further improve performance and performance isolation.
Store segments of data files as encoded blocks. Erasure codes create freedom of choice among blocks.
Distribute the encoded blocks across the drive array.
When a segment is requested, identify candidate encoded blocks for retrieval. A model of the current disk state is used to optimize for head motion, and the erasure codes leave the choice of which blocks to read open.
Reconstruct the original data segment in the device driver. This leverages the disparity between host processing capabilities and disk speed.
[Figure: Segment retrieval flow. Labels shown: Requested Segment, Candidate Blocks, Current Disk State, Set of Reads, Reconstruct Segment.]
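
The "any K of N" property stated above can be illustrated with a small sketch. The poster does not say which erasure code RobuSTore uses, so the code below assumes a Reed-Solomon-style polynomial code over the prime field GF(257), chosen only because it is short and easy to check. The names encode, decode, and _interpolate are hypothetical and do not come from the poster.

```python
import random

# Assumption: a small prime field, just large enough to hold one byte per symbol.
P = 257


def _interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    that passes through the given (xi, yi) points, working mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, n):
    """Turn K data symbols (ints in 0..255) into N encoded blocks.

    Each block is an (index, value) pair. Data symbol i is treated as the
    polynomial's value at x = i, so the first K blocks equal the data."""
    k = len(data)
    assert 0 < k <= n < P
    points = list(enumerate(data))
    return [(x, _interpolate(points, x)) for x in range(n)]


def decode(blocks, k):
    """Rebuild the K data symbols from any K blocks with distinct indices."""
    chosen = list(blocks)[:k]
    assert len({x for x, _ in chosen}) == k, "need K distinct blocks"
    return [_interpolate(chosen, x) for x in range(k)]


if __name__ == "__main__":
    segment = list(b"RobuSTore")          # K = 9 data symbols
    blocks = encode(segment, n=14)        # N = 14 encoded blocks
    random.shuffle(blocks)                # blocks arrive in arbitrary order
    first_k = blocks[:len(segment)]       # use whichever K arrive first
    assert bytes(decode(first_k, len(segment))) == b"RobuSTore"
```

Because the decoder accepts whichever K blocks arrive first, a slow or failed storage node only matters if fewer than K blocks arrive in total, which is the order independence the design relies on.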