OceanStore Global-Scale Persistent Storage John Kubiatowicz.

Presentation transcript:

OceanStore Global-Scale Persistent Storage John Kubiatowicz

Ubiquitous Devices ⇒ Ubiquitous Storage
Consumers of data move, change from one device to another, work in cafes, cars, airplanes, the office, etc.
Properties REQUIRED for Endeavour storage substrate:
– Strong Security: data must be encrypted whenever in the infrastructure; resistance to monitoring
– Coherence: too much data for naïve users to keep coherent “by hand”
– Automatic replica management and optimization: huge quantities of data cannot be managed manually
– Simple and automatic recovery from disasters: probability of failure increases with size of system
– Utility model: world-scale system requires cooperation across administrative boundaries

Utility-based Infrastructure
[Figure: network of service providers: Pac Bell, Sprint, IBM, AT&T, Canadian OceanStore]
Service provided by confederation of companies
– Monthly fee paid to one service provider
– Companies buy and sell capacity from each other

Amusing back-of-the-envelope computation
How many files in OceanStore?
– Assume 10^10 people in world
– Say 10,000 files/person (very conservative?)
– So 10^14 files in OceanStore!
– If 1 gig files (not likely), get on the order of 1 mole of bytes (10^23)!
Truly impressive numbers of elements… but this is small relative to physical constants
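
The arithmetic on this slide can be checked in a few lines. This is a minimal sketch: the population figure of 10^10 is an assumed round number, "1 gig" is taken as 10^9 bytes, and the constant names are made up for illustration.

```python
# Back-of-the-envelope check of the file-count estimate.
PEOPLE = 10**10              # assumed round figure for world population
FILES_PER_PERSON = 10_000    # figure given on the slide
BYTES_PER_FILE = 10**9       # "1 gig" per file, taken as 10^9 bytes
AVOGADRO = 6.022e23          # entities per mole

total_files = PEOPLE * FILES_PER_PERSON
total_bytes = total_files * BYTES_PER_FILE
moles_of_bytes = total_bytes / AVOGADRO

print(f"files: {total_files:.0e}")
print(f"bytes: {total_bytes:.0e}")
print(f"moles of bytes: {moles_of_bytes:.2f}")
```

Under these assumptions the byte count lands within an order of magnitude of Avogadro's number, which is the sense in which the total approaches a mole.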

OceanStore Assumptions
Untrusted Infrastructure:
– Infrastructure is composed of untrusted components
– Only ciphertext within the infrastructure
– Must be careful to avoid leaking information
Mostly Well-Connected:
– Data producers and consumers are connected to a high-bandwidth network most of the time
– Exploit mechanisms such as multicast for quicker consistency between replicas
Promiscuous Caching:
– Data may be cached anywhere, anytime
– Global optimization through tacit information collection
Operations Interface with Conflict Resolution:
– Applications employ an operations-oriented interface, rather than a file-systems interface
– Coherence is centered around conflict resolution

OceanStore Technologies I: Naming and Data Location
Requirements:
– Find nearby data without global communication
– Don’t get in way of rapid relocation of data
– Search should reflect locality and network efficiency
– System-level names should help to authenticate data
OceanStore Technology:
– Underlying namespace is flat and built from cryptographic hashes (160-bit SHA-1)
– Data location is a form of gradient search of local pools of data (use of attenuated Bloom filters)
– Fallback to global, “exact” indexing structure in case data not found with local search
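
One way a flat 160-bit name can help authenticate data is to derive the name from the bytes it refers to. The sketch below assumes exactly that; the slide does not say what is hashed, and `guid_for` and `verify` are hypothetical helper names.

```python
import hashlib

def guid_for(content: bytes) -> str:
    """Return a 160-bit name (40 hex digits) derived from the content.

    Assumption: the name is the SHA-1 digest of the data itself.
    """
    return hashlib.sha1(content).hexdigest()

def verify(name: str, content: bytes) -> bool:
    """Check that fetched content matches the name it was requested under."""
    return guid_for(content) == name

block = b"archival fragment"
name = guid_for(block)
print(name, verify(name, block), verify(name, b"tampered"))
```

Because the check needs only the name and the returned bytes, a client can accept data from any untrusted cache without contacting a trusted server.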

Progress Last Term: Sean Rhea and Westley Weimer
– Built data location facility on simulated network
– Uses attenuated Bloom filters
– Performs search by passing messages from node to node. All state kept in messages!
– Updates filters through semi-chaotic passing of information between neighbors
  – Resembles compiler dataflow algorithm
  – Can be shown to converge
Future?
– Find other “holographic representations of location”
– Whole new approach to data location?
– Unified name service, data location, routing
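
The search described above can be sketched as follows. This is an illustrative toy, not the students' implementation: the filter width, hash count and depth are assumed values, every function name is hypothetical, and the per-link filters are computed centrally by breadth-first search rather than by the neighbor-to-neighbor exchange the slide describes.

```python
import hashlib

BITS = 256      # filter width (assumed)
HASHES = 3      # hash functions per key (assumed)
DEPTH = 3       # levels in each attenuated filter (assumed)

def positions(key):
    """Bit positions for a key, derived from salted SHA-1 digests."""
    return [int.from_bytes(hashlib.sha1(f"{i}:{key}".encode()).digest()[:4], "big") % BITS
            for i in range(HASHES)]

def bloom(keys):
    """Build a Bloom filter, stored as an integer bitmask, over the keys."""
    mask = 0
    for k in keys:
        for p in positions(k):
            mask |= 1 << p
    return mask

def maybe_contains(mask, key):
    """True if the key may be in the filter (false positives are possible)."""
    return all((mask >> p) & 1 for p in positions(key))

def attenuated(graph, store, u, v):
    """Filters for link u->v: level d summarizes objects d hops beyond v."""
    levels, seen, frontier = [], {u, v}, [v]
    for _ in range(DEPTH):
        levels.append(bloom(k for n in frontier for k in store[n]))
        nxt = []
        for n in frontier:
            for m in graph[n]:
                if m not in seen:
                    seen.add(m)
                    nxt.append(m)
        frontier = nxt
    return levels

def search(graph, store, start, key, max_hops=10):
    """Route a query node to node; all query state travels in the message."""
    message = {"key": key, "path": [start]}
    node = start
    for _ in range(max_hops + 1):
        if key in store[node]:
            return message["path"]
        best = None
        for nbr in graph[node]:
            if nbr in message["path"]:
                continue
            for depth, mask in enumerate(attenuated(graph, store, node, nbr)):
                if maybe_contains(mask, key):
                    if best is None or depth < best[0]:
                        best = (depth, nbr)   # prefer the shallowest match
                    break
        if best is None:
            return None   # caller would fall back to the global index
        node = best[1]
        message["path"].append(node)
    return None

graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
store = {"a": set(), "b": set(), "c": set(), "d": {"doc"}}
print(search(graph, store, "a", "doc"))
```

Each hop follows the link whose filter matches at the shallowest level, so the query moves toward the nearest likely copy. A `None` result stands for the case where local search fails and the global index takes over.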

OceanStore Technologies II: High-Availability and Disaster Recovery
Requirements:
– Handle diverse, unstable participants in OceanStore
– Eliminate backup as independent (and fallible) technology
– Flexible “disaster recovery” for everyone
OceanStore Technologies:
– Use of erasure codes (Tornado codes) to provide stable storage for archival copies and snapshots of live data
– Mobile replicas are self-contained centers for logging and conflict resolution
– Version-based update for painless recovery
– Redundancy exploited to tolerate variation of performance from network servers (RIVERS)
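
To make the erasure-code idea concrete, here is a deliberately simple stand-in: a single XOR parity fragment, which survives the loss of any one fragment. It is not a Tornado code (those tolerate many losses), and `encode` and `decode` are hypothetical names chosen for this sketch.

```python
def encode(data: bytes, k: int):
    """Split data into k equal fragments and append one XOR parity fragment."""
    size = -(-len(data) // k)                  # ceiling division
    padded = data.ljust(size * k, b"\0")       # pad so fragments are equal length
    frags = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = bytes(size)
    for f in frags:
        parity = bytes(a ^ b for a, b in zip(parity, f))
    return frags + [parity]

def decode(frags, length: int):
    """Rebuild the data when at most one fragment (marked None) is missing."""
    missing = [i for i, f in enumerate(frags) if f is None]
    if len(missing) > 1:
        raise ValueError("single-parity code tolerates only one lost fragment")
    frags = list(frags)
    if missing:
        size = len(next(f for f in frags if f is not None))
        rebuilt = bytes(size)
        for f in frags:
            if f is not None:
                rebuilt = bytes(a ^ b for a, b in zip(rebuilt, f))
        frags[missing[0]] = rebuilt            # XOR of survivors equals the lost one
    return b"".join(frags[:-1])[:length]       # drop parity and padding

data = b"OceanStore archival snapshot"
fragments = encode(data, 4)
fragments[2] = None                            # simulate a failed server
recovered = decode(fragments, len(data))
print(recovered == data)
```

The fragments can live on different servers, and any four of the five are enough to rebuild the snapshot. That is the property that lets archival storage absorb the role of a separate backup system.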

Progress Last Term: Hakim Weatherspoon, Shelley Zhuang and Matthew Delco
– Designed a storage system using erasure codes
– Compared Reed-Solomon codes to Tornado codes: over 1000 to 1 performance advantage in favor of Tornado codes!
– Explored different distribution and gathering techniques
Future?
– Can this system be turned into a generic replacement for standard UNIX backup?
– Transform into underlying archival piece of OceanStore
– Use of Tornado codes for Rivers-like adaptation to variations in latency
– Self-repairing data structures???

OceanStore Technologies III: Introspective Monitoring and Optimization
Requirements:
– Reasonable job on a global-scale optimization problem
– Take advantage of locality whenever possible
– Sensitivity to limited storage and bandwidth at endpoints
– Stability in chaotic environment
OceanStore Technologies:
– Introspective monitoring and analysis of relationships:
  – between different pieces of data
  – between users of a given piece of data
– Rearrangement of data in response to monitoring:
  – Economic models with analogies to simulated annealing
– Subproblem of Tacit Information Analysis (option 5)

Progress Last Term
Patrick R. Eaton, Dennis Geels and Greg Mori
– Introspective monitoring of local file system
  – Clustering of related data together
  – Identifying patterns for prefetching
– Built filesystem simulation system in which to explore techniques
Byung Hoon Kang, Sarika Sahni and H. Wilson So, in collaboration with Laurent El Ghaoui
– Time-series extraction of patterns
– Do people move predictably? Can we use this?
Future?
– Kalman filters, hidden Markov models, and other statistical methods for automatically migrating data
– More realistic traces (collaboration with Mary Baker?)
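
The slide does not say which pattern-detection method was used. As one plausible illustration, a first-order successor model counts which file tends to follow which in an access trace and suggests the most frequent follower as a prefetch candidate. The trace and function names below are invented for the example.

```python
from collections import Counter, defaultdict

def build_successors(trace):
    """Count, for every file, which files were accessed immediately after it."""
    counts = defaultdict(Counter)
    for cur, nxt in zip(trace, trace[1:]):
        counts[cur][nxt] += 1
    return counts

def predict_next(counts, current):
    """Return the most frequent follower of `current`, or None if unseen."""
    followers = counts.get(current)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical access trace from a monitored local file system.
trace = ["a.h", "a.c", "Makefile", "a.h", "a.c", "b.c", "a.h", "a.c"]
counts = build_successors(trace)
print(predict_next(counts, "a.h"))
```

The same counts can drive clustering: files that frequently follow one another are candidates for being stored or migrated together.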

OceanStore Technologies IV: Rapid Update in an Untrusted Infrastructure
Requirements:
– Scalable coherence mechanism which provides performance even though replicas widely separated
– Operate directly on encrypted data
– Updates should not reveal info to untrusted servers
OceanStore Technologies:
– Operations-based interface using conflict resolution
– Use of incremental cryptographic techniques: no time to decrypt/update/re-encrypt
– Use of oblivious function techniques to perform this update (fallback to secure hardware in general case)
– Use of automatic techniques to verify security protocols
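
An operations-based interface with conflict resolution could take roughly the following shape. This is an assumed design for illustration only: an update carries an ordered list of (predicate, action) clauses, and a replica applies the first clause whose predicate holds. The class and function names are hypothetical, and the sketch treats blocks as opaque bytes without modelling encryption.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

@dataclass
class Replica:
    """Toy replica state: a version counter plus opaque (encrypted) blocks."""
    version: int = 0
    blocks: List[bytes] = field(default_factory=list)

Predicate = Callable[[Replica], bool]
Action = Callable[[Replica], None]

def apply_update(replica: Replica, clauses: List[Tuple[Predicate, Action]]) -> bool:
    """Apply the first clause whose predicate holds; return False if none does."""
    for predicate, action in clauses:
        if predicate(replica):
            action(replica)
            replica.version += 1
            return True
    return False   # update aborts; the client must resolve the conflict

def append(block: bytes) -> Action:
    """Build an action that appends one opaque block."""
    return lambda r: r.blocks.append(block)

replica = Replica()
update = [
    (lambda r: r.version == 0, append(b"ciphertext-A")),          # expected case
    (lambda r: True, append(b"ciphertext-A (merged)")),           # fallback resolution
]
first = apply_update(replica, update)    # version matches, first clause runs
second = apply_update(replica, update)   # version moved on, fallback runs
print(first, second, replica.version, replica.blocks)
```

Shipping the resolution logic with the update lets any replica decide the outcome locally, which matters when replicas are widely separated.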

Progress Last Term
Monica Chew, Chris Wells and David Bindel
– Designed ECFS, the extended cryptographic filesystem
  – Explored metadata in an untrusted infrastructure
  – Uses encryption and signatures to provide protection against substitution attacks
Dawn Song, David Wagner, Doug Tygar
– New technique for encrypting data in a way that is searchable
– Could perform general “grep” functionality at server without revealing what you are searching for
– Use in conflict resolution seems plausible
Future?
– Key problem: Denial of Service
– Conflict resolution interfaces
– Computation on encrypted data?
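
A substitution attack swaps one valid encrypted block for another valid one. A common defense is to bind each block's authenticator to where the block belongs. The sketch below uses an HMAC as a stand-in for the signatures mentioned on the slide; the ECFS format is not described in the source, so the field layout and function names here are assumptions.

```python
import hashlib
import hmac

def tag(key: bytes, path: str, index: int, ciphertext: bytes) -> bytes:
    """Authenticate a block together with its file path and block index."""
    msg = path.encode() + b"\0" + index.to_bytes(8, "big") + ciphertext
    return hmac.new(key, msg, hashlib.sha256).digest()

def check(key: bytes, path: str, index: int, ciphertext: bytes, expected: bytes) -> bool:
    """Verify that a block returned by the server belongs at (path, index)."""
    return hmac.compare_digest(tag(key, path, index, ciphertext), expected)

key = b"client-held secret"                 # never given to the server
block = b"\x8f\x11 opaque ciphertext"
stored_tag = tag(key, "/home/alice/notes", 0, block)

genuine = check(key, "/home/alice/notes", 0, block, stored_tag)
moved = check(key, "/home/alice/other", 0, block, stored_tag)      # swapped between files
reordered = check(key, "/home/alice/notes", 1, block, stored_tag)  # swapped within a file
print(genuine, moved, reordered)
```

Because the tag covers the location as well as the bytes, a server that returns a correctly encrypted block from the wrong place is detected.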

Grab Bag
Collaboration opportunities
– 100TB of spinning storage (Brewster Kahle)
– EMC data collaborations
– Microsoft (Bill Bolosky)
Use of archival system to handle portions of the Berkeley backup?
– To get “same level of service” need 12TB of spinning storage
– Want it to be off site for disaster recovery
OceanStore as a software distribution technology: Microsoft Windows in the net?
– Versioning mechanism for handling software upgrades

Two-Phase Implementation
This term: Read-Mostly Prototype
– Construction of data location facility
– Initial introspective gathering of tacit info and adaptation
– Initial archival techniques (use of erasure codes)
– Unix file-system interface under Linux (“legacy apps”)
Later?: Full Prototype
– Final conflict resolution and encryption techniques
– More sophisticated tacit info gathering and rearrangement
– Final object interface and integration with Endeavour applications
– Wide-scale deployment via NTON and Internet-2