OceanStore: An Architecture for Global-Scale Persistent Storage Professor John Kubiatowicz, University of California at Berkeley

Overview
OceanStore is a global-scale data utility for Internet services.

How OceanStore is used
- Application/user data is stored in objects
- Objects are placed in the global OceanStore infrastructure
- Objects are accessed via Globally Unique Identifiers (GUIDs)
- Objects are modified via action/predicate pairs
- Each operation creates a new version of the object
- Internet services (applications) define object format and content

Potential Internet services
- Web caches, global file systems, Hotmail-like mail portals, etc.

Goals
- Global scale
- Extreme durability of data
- Use of untrusted infrastructure
- Maintenance-free operation
- Privacy of data
- Automatic performance tuning

Enabling technologies
- Peer-to-peer and overlay networks
- Erasure encoding and replication
- Byzantine agreement
- Repair and automatic node failover
- Encryption and access control
- Introspection and data clustering

Key components: Tapestry and Inner Ring

Tapestry
- Decentralized Object Location and Routing (DOLR)
- Provides routing to an object independent of its location
- Automatically reroutes to backup nodes when failures occur
- Based on the Plaxton algorithm
- Overlay network that scales to systems with large numbers of nodes
- See the Tapestry poster for more information

Inner Ring
- A set of nodes per object, chosen by the Responsible Party
- Applies updates/writes requested by the user
- Checks all predicates and access control lists
- Byzantine agreement is used to check and serialize updates
- Based on the algorithm by Castro and Liskov
- Ensures correctness even with f of 3f+1 nodes compromised
- Threshold signatures used

Key components: Archival Storage and Replicas

Archival Storage
- Provides extreme durability of data objects
- Disseminates archival fragments throughout the infrastructure
- Fragment replication and repair ensure durability
- Utilizes erasure codes: redundancy without the overhead of complete replication
- Data objects are coded at a rate r = m/n, producing n fragments of which any m can reconstruct the object
- Storage overhead is n/m (see the sketch after this section)

Replicas
- Full copies of data objects stored in the peer-to-peer infrastructure
- Enable fast access
- Introspection allows replicas to self-organize; replicas migrate toward client accesses
- Encryption of objects ensures data privacy
- A dissemination tree is used to alert replicas of object updates
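The erasure-coding parameters above lend themselves to a quick back-of-the-envelope check. The following is a minimal sketch in Java with assumed values m = 16 and n = 32 (not taken from the poster), and it is not OceanStore's actual fragmentation code; it only illustrates the rate and overhead relations r = m/n and n/m, and the "any m of n fragments" recovery condition.

```java
// Hedged sketch: erasure-coding arithmetic with assumed parameters m = 16, n = 32.
// Not OceanStore's actual API; it only illustrates rate, overhead, and recoverability.
public class ErasureOverheadSketch {
    public static void main(String[] args) {
        int m = 16;   // fragments required to reconstruct the object (assumed)
        int n = 32;   // total fragments produced and disseminated (assumed)

        double rate = (double) m / n;       // r = m/n = 0.50
        double overhead = (double) n / m;   // storage overhead = n/m = 2.0x

        System.out.printf("rate r = %.2f, storage overhead = %.1fx%n", rate, overhead);

        // The object remains recoverable as long as at least m fragments survive.
        int lost = 10;                      // hypothetical number of lost fragments
        boolean recoverable = (n - lost) >= m;
        System.out.println("recoverable after " + lost + " losses: " + recoverable);

        // Full replication tolerating the same n - m = 16 losses would need
        // n - m + 1 = 17 complete copies (17x storage) instead of 2x.
        System.out.println("equivalent full-replication copies: " + ((n - m) + 1));
    }
}
```

Under these assumed parameters, the archive pays roughly 2x in storage while tolerating the loss of half of all fragments, which is the sense in which erasure coding gives redundancy without the overhead of complete replication.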
Pond prototype benchmarks

Update latency (median time):
- 512-bit key, 4kB update: 40 ms
- 512-bit key, 2MB update: 1086 ms
- 1024-bit key, 4kB update: 99 ms
- 1024-bit key, 2MB update: 1150 ms

Latency breakdown by phase:
- Check: 0.3 ms
- Serialize: 6.1 ms
- Apply: 1.5 ms
- Archive: 4.5 ms
- Sign: 77.8 ms

Object update latency
- Measures the latency of the inner ring's Byzantine agreement commit
- Shows that the threshold signature is costly
- ~100 ms latency on object writes

Object update throughput
- Measures object write throughput
- Base system provides 8 MB/s
- Batch updates to get good performance

Conclusions and future directions

OceanStore's accomplishments
- Major prototype completed
- Several fully functional Internet services built and deployed
- Demonstrated feasibility of the approach
- Published results on the system's performance
- Collaborating with other global-scale research initiatives

Current research directions
- Investigate new introspective data placement strategies
- Finish adding features: tentative update sharing between sessions, archival repair, replica management
- Improve existing performance and deploy to larger networks: examine bottlenecks, improve stability, improve data structures
- Develop more applications

Current status: Pond implementation complete

Pond implementation
- All major subsystems completed: fault-tolerant inner ring, erasure-coding archive
- Software released to the developer community outside Berkeley
- 280K lines of Java, plus JNI libraries for cryptography and the archive
- Several applications implemented
- See the FAST paper on the Pond prototype and its benchmarks

Deployed on PlanetLab
- Initiative to provide researchers with a wide-area testbed
- ~100 hosts, ~40 sites, multiple continents
- Allows Pond to run up to 1000 virtual nodes
- Applications have been run successfully in the wide area
- Created tools to allow quick deployment to PlanetLab

Internet services built on OceanStore

MINNO
- Global-scale e-mail system built on OceanStore
- Enables storage of and access to user accounts
- Send via an SMTP proxy; read and organize via IMAP
- Stores data in four types of OceanStore objects: folder list, folder, message, and maildrop
- A relaxed consistency model enables fast wide-area access

Riptide
- Web caching infrastructure
- Uses data migration to move web objects closer to users
- Verifies the integrity of web content

NFS
- Provides traditional file system support
- Enables time travel (reverting files/directories) through OceanStore's versioning primitives

Many others
- Palm Pilot synchronizer, AFS, etc.

Application benchmarks

[Architecture figure: Client, Inner Ring, Replicas, Archival Storage]

NFS: Andrew benchmark
- Client in Berkeley, server in Seattle
- 4.6x slower than NFS in read-intensive phases
- 7.3x slower in write-intensive phases
- Reasonable times with a 512-bit key size
- Signature time is the bottleneck

MINNO: Login time
- Client cache synchronization time with new message retrieval
- Measured time vs. latency to the inner ring
- Simulates mobile clients
- MINNO adapts well with data migration and tentative commits enabled
- Outperforms a traditional IMAP server with no processing overhead
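The claim that signature time is the bottleneck is easy to verify with arithmetic from the latency breakdown above. The sketch below sums the measured phase times and computes the fraction spent in the threshold signature; the batch size k = 10 and the simple amortization model are my own illustrative assumptions, not measurements reported by the Pond prototype.

```java
// Hedged sketch: arithmetic over the measured phase times from the latency
// breakdown above. The batch size and amortization model are illustrative
// assumptions, not results reported for Pond.
public class UpdateLatencySketch {
    public static void main(String[] args) {
        // Per-phase inner-ring times in ms (from the breakdown above).
        double check = 0.3, serialize = 6.1, apply = 1.5, archive = 4.5, sign = 77.8;
        double total = check + serialize + apply + archive + sign;   // ~90.2 ms

        System.out.printf("total inner-ring time per update: %.1f ms%n", total);
        System.out.printf("threshold signature share: %.0f%%%n", 100.0 * sign / total); // ~86%

        // If k updates are serialized, applied, and signed as one batch, the
        // expensive signature is amortized across the batch (simple model).
        int k = 10;                                  // assumed batch size
        double perUpdate = (check + serialize + apply + archive) + sign / k;
        System.out.printf("approx. per-update cost with batch of %d: %.1f ms%n", k, perUpdate);
    }
}
```

Under these assumed numbers, batching brings the per-update cost from about 90 ms down to about 20 ms, which is consistent with the poster's advice to batch updates to get good throughput.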