Boxwood: Abstractions as the Foundation for Storage Infrastructure Lidong Zhou, Microsoft Research Silicon Valley Joint work with Chandu Thekkath, John MacCormick, Nick Murphy, and Marc Najork

Slide 2 (Boxwood, 12/06/2004): Distributed Storage Applications Are Hard to Build
- Distributed storage: low hardware cost, but high development/deployment cost
- Application logic on low-level storage interface
- Hardware parallelism and concurrency control
- Fault tolerance a necessity
- Incremental expansion and dynamic reconfiguration vs. system consistency
- Our goal: distributed storage applications made easy to design, build, and deploy

Slide 3: Target Application and Setting
- Enterprise storage applications and back-end storage for data-intensive Internet services

Slide 4: Roadmap
- Boxwood Vision
- Boxwood Architecture
- Building Applications on Boxwood
- Performance
- Related Work and Conclusion

Slide 5: Boxwood Vision
- Incorporate rich virtualized abstractions into low levels of the storage
- An evolution path for distributed storage:
  [Diagram: Storage Applications]

Slide 6: Boxwood Vision
- Incorporate rich virtualized abstractions into low levels of the storage
- An evolution path for distributed storage:
  [Diagram: Virtual Disk; Storage Applications]

Slide 7: Boxwood Vision
- Incorporate rich virtualized abstractions into low levels of the storage
- An evolution path for distributed storage:
  [Diagram: Storage Applications; Tree, Table, List, ...]

Slide 8: Why High-Level Abstractions
- Reduce the complexity of distributed storage applications
- Natural continuum of storage virtualization
- High-level programming language for building distributed storage applications
- Potential built-in performance optimization by exploiting structural information
  - Caching
  - Prefetching

Slide 9: Roadmap
- Boxwood Vision
- Boxwood Architecture
- Building Applications on Boxwood
- Performance
- Related Work and Conclusion

Slide 10: Boxwood Architecture
[Diagram labels: Storage Application; High-level Storage Abstractions; B-Tree; Chunk Store; Reliable Media; Replicated Logical Device; Magnetic Media; Services: Locking, Logging, Consensus]

Slide 11: Chunk Store
- Persistent storage with malloc-like interface
- Virtualization layer that hides the distributed nature
- Manage address space or free space for higher layers
- Reliable storage through replicated logical device
[Diagram: Chunk Store (Allocate, De-allocate, Read, Write) over Replicated Logical Device]
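
A rough sketch of what a malloc-like chunk interface with these four operations could look like. This is not Boxwood's API: the class name, method signatures, and integer handles are assumptions made for illustration, and the store below lives in one process's memory with no persistence or replication.

```python
import itertools


class InMemoryChunkStore:
    """Toy, single-process stand-in for a chunk store (hypothetical API)."""

    def __init__(self):
        self._chunks = {}                      # handle -> bytearray
        self._next_handle = itertools.count(1)

    def allocate(self, size):
        """Reserve a zero-filled chunk and return an opaque handle."""
        handle = next(self._next_handle)
        self._chunks[handle] = bytearray(size)
        return handle

    def deallocate(self, handle):
        """Release a chunk; later reads or writes on the handle raise KeyError."""
        del self._chunks[handle]

    def write(self, handle, data, offset=0):
        """Overwrite part of a chunk in place; the chunk never grows."""
        chunk = self._chunks[handle]
        if offset < 0 or offset + len(data) > len(chunk):
            raise ValueError("write outside chunk bounds")
        chunk[offset:offset + len(data)] = data

    def read(self, handle, offset=0, length=None):
        """Return a copy of the requested bytes (to the end if length is None)."""
        chunk = self._chunks[handle]
        end = len(chunk) if length is None else offset + length
        return bytes(chunk[offset:end])
```

Callers only ever see handles, which is what would let a real implementation place chunks on any machine without the higher layers noticing.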

Slide 12: B-Tree Abstraction
- B-Tree: a proven useful data structure for storage applications
- Distributed/reliable B-Link trees in Boxwood
  - B-Link trees: high concurrency with simple locking
  - Distributed reliable storage from chunk store
  - Caching for performance
  - Distributed lock service for consistency
  - Logging for recovery
[Diagram: B-Link Tree (Insert, Delete, Lookup, Enumerate, Create) over Chunk Store, Locking, Logging]
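
The "high concurrency with simple locking" property comes from the right link and high key that every B-Link tree node carries (Lehman and Yao, 1981): a reader holding a stale pointer to a node that has since split just follows the link to the right. A minimal sketch of that idea, restricted to leaf nodes, with no locking or storage layer; all names here are illustrative and not taken from Boxwood.

```python
class Leaf:
    """Leaf node of a B-link tree: items, a high key, and a right link."""

    def __init__(self, items=None, high_key=None, right=None):
        self.items = dict(items or {})   # key -> value
        self.high_key = high_key         # upper bound on keys here; None means +infinity
        self.right = right               # right sibling, or None


def split(leaf):
    """Move the upper half of a leaf's keys into a new right sibling."""
    keys = sorted(leaf.items)
    if len(keys) < 2:
        raise ValueError("need at least two keys to split")
    mid = len(keys) // 2
    upper = {k: leaf.items.pop(k) for k in keys[mid:]}
    sibling = Leaf(upper, high_key=leaf.high_key, right=leaf.right)
    leaf.high_key = keys[mid - 1]        # largest key that stays behind
    leaf.right = sibling
    return sibling


def lookup(leaf, key):
    """Search from a possibly stale leaf pointer, moving right as needed."""
    while leaf.high_key is not None and key > leaf.high_key:
        leaf = leaf.right
    return leaf.items.get(key)
```

Because a lookup can recover from a concurrent split on its own, readers do not need to hold locks on the path from the root, which keeps the locking protocol simple.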

Slide 13: Boxwood Services
- Distributed lock service for coordinating concurrent access to shared data
- Logging and recovery service for atomicity in face of transient failures
- Consensus service for system consistency
- Clean design of these services is crucial for scalability and for managing complexity
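
To make "coordinating concurrent access to shared data" concrete, here is a sketch of the bookkeeping a lock service might keep per resource. The shared/exclusive modes, the non-blocking `try_acquire`, and every name below are assumptions for illustration; the slides do not describe the lock service's interface, and this toy runs in a single process with no failure handling.

```python
class LockTable:
    """Toy lock table with shared and exclusive modes (hypothetical API)."""

    def __init__(self):
        self._holders = {}   # resource -> (mode, set of client ids)

    def try_acquire(self, client, resource, mode):
        """Grant the lock if compatible with current holders; never blocks."""
        if mode not in ("shared", "exclusive"):
            raise ValueError("mode must be 'shared' or 'exclusive'")
        held = self._holders.get(resource)
        if held is None:
            self._holders[resource] = (mode, {client})
            return True
        held_mode, clients = held
        if mode == "shared" and held_mode == "shared":
            clients.add(client)          # many readers may share a resource
            return True
        return False                     # any conflict involving a writer

    def release(self, client, resource):
        """Drop one client's hold; the entry disappears with the last holder."""
        held = self._holders.get(resource)
        if held is None or client not in held[1]:
            raise KeyError("client does not hold this lock")
        held[1].discard(client)
        if not held[1]:
            del self._holders[resource]
```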

Slide 14: Roadmap
- Boxwood Vision
- Boxwood Architecture
- Building Applications on Boxwood
- Performance
- Related Work and Conclusion

Slide 15: Distributed Storage Applications on Boxwood: A Recipe
1. Design applications for local storage
   - Map application logic to storage abstractions
2. Adapt the design for a distributed storage infrastructure
   - Boxwood abstractions are virtualized
   - Boxwood offers facilitating distributed services
Separating algorithmic design from distributed system concerns is attractive.

Slide 16: From B-Link Tree Algorithm to Distributed Reliable B-Link Trees
- B-Link trees on a single machine
[Diagram labels: B-Link Tree Algorithm; Local Locks; Logging; Local Disks]

Slide 17: From B-Link Tree Algorithm to Distributed Reliable B-Link Trees
- Distributed and reliable B-Link trees
[Diagram labels: B-Link Tree Algorithm; Global Lock Service; Reliable Logging; Chunk Store; Replicated Logical Device]
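
The move from the single-machine picture to the distributed one keeps the algorithm and swaps what sits underneath it. A small sketch of that separation, assuming the algorithm only talks to a store, a lock, and a log through narrow interfaces. The classes are illustrative stand-ins, not Boxwood components: `ReplicatedStore` simply mirrors writes to in-process replicas, and a plain `threading.Lock` stands in where a global lock service client would go.

```python
import threading


class LocalStore:
    """Stand-in for a local disk: one copy of each block."""

    def __init__(self):
        self.blocks = {}

    def write(self, addr, data):
        self.blocks[addr] = data

    def read(self, addr):
        return self.blocks[addr]


class ReplicatedStore:
    """Stand-in for a replicated device: every write goes to all replicas."""

    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, addr, data):
        for replica in self.replicas:
            replica.write(addr, data)

    def read(self, addr):
        return self.replicas[0].read(addr)


class UpdateLogic:
    """Algorithm code, written once against store, lock, and log interfaces."""

    def __init__(self, store, lock, log):
        self.store, self.lock, self.log = store, lock, log

    def update(self, addr, data):
        with self.lock:                              # local lock or lock-service client
            self.log.append(("write", addr, data))   # log the intent first
            self.store.write(addr, data)             # then apply it


# Same logic, two deployments: only the injected components differ.
single = UpdateLogic(LocalStore(), threading.Lock(), [])
mirrored = UpdateLogic(ReplicatedStore([LocalStore(), LocalStore()]),
                       threading.Lock(), [])
```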

Slide 18: BoxFS: Multi-Node File Server on Boxwood
- Exported via NFS v2
- Directory/File: B-Tree
  - Directory: maps names to NFS file handle with embedded B-tree handle
  - File: maps block number to chunk handle
- File blocks: chunks
- Locking/caching at file system level
- ~2500 lines of C# code
[Diagram labels: BoxFS; B-Link Tree; Chunk Store; Services]
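
The two mappings on this slide can be traced with plain dictionaries standing in for B-trees and for the chunk store. This is a sketch of the data layout only: the helper names, the tiny block size, and the use of a bare tree handle in place of an NFS file handle are assumptions for illustration, and none of BoxFS's locking, caching, or NFS handling is modeled.

```python
import itertools

BLOCK_SIZE = 4                 # tiny block size so the example is easy to follow (assumption)

chunks = {}                    # chunk handle -> bytes; stand-in for the chunk store
trees = {}                     # tree handle -> dict; each dict stands in for one B-tree
_handles = itertools.count(1)  # shared source of fresh handles


def create_tree():
    """Create an empty tree and return its handle."""
    handle = next(_handles)
    trees[handle] = {}
    return handle


def write_file(dir_tree, name, data):
    """Store data as blocks in chunks and record the file in a directory."""
    file_tree = create_tree()
    for block_no, start in enumerate(range(0, len(data), BLOCK_SIZE)):
        chunk = next(_handles)
        chunks[chunk] = data[start:start + BLOCK_SIZE]
        trees[file_tree][block_no] = chunk     # file: block number -> chunk handle
    # Directory: name -> handle. BoxFS stores an NFS file handle with the
    # B-tree handle embedded; the bare tree handle is used here instead.
    trees[dir_tree][name] = file_tree


def read_file(dir_tree, name):
    """Follow name -> file tree -> chunk handles and reassemble the bytes."""
    file_tree = trees[trees[dir_tree][name]]
    return b"".join(chunks[file_tree[n]] for n in sorted(file_tree))


root = create_tree()
write_file(root, "notes.txt", b"hello world")
```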

Slide 19: Roadmap
- Boxwood Vision
- Boxwood Architecture
- Building Applications on Boxwood
- Performance
- Related Work and Conclusion

Slide 20: Prototype Deployment and Performance Evaluation
- System setup
  - Eight Dell PowerEdge 2650 servers with a single 2.4 GHz Xeon processor, 1 GB of RAM
  - Gigabit Ethernet switch
  - Adaptec AIC-7899 dual SCSI adapter, and 5 SCSI drives
- Performance evaluation
  - Single-machine non-replicated performance (BoxFS vs. NFS)
  - B-tree operation scalability
  - BoxFS operation scalability

Slide 21: BoxFS vs. NFS over NTFS: Connectathon Benchmarks

Slide 22: B-Tree Scaling (Private Tree)

Slide 23: BoxFS Scaling (Read)

Slide 24: B-Tree Scaling (Shared Tree)

Slide 25: BoxFS Scaling (Write/MkDirEnt)

Slide 26: Roadmap
- Boxwood Vision
- Boxwood Architecture
- Building Applications on Boxwood
- Performance
- Related Work and Conclusion

Slide 27: Related Work
- Distributed Storage/Operating Systems
  - Virtual/Logical disks
  - File systems
  - Database systems
- Scalable Distributed Data Structures
  - Linear Hash Table (LH) and its variants (Litwin, present)
  - Scalable distributed hash table (Gribble et al., 2000)
  - Highly concurrent B-trees (Lehman and Yao, 1981; Sagiv, 1986)

Slide 28: Conclusion and Future Directions
- A storage infrastructure offering virtualized high-level abstractions is promising
- Future work:
  - Explore more abstractions and applications; expose flexible interfaces (e.g., through hints)
  - Leverage high-level abstractions for better load balancing, prefetching, and caching
  - Graceful degradation during massive failures