OceanStore Status and Directions ROC/OceanStore Retreat 1/16/01 John Kubiatowicz University of California at Berkeley.



OceanStore:2 ROC/OceanStore Jan’01
Questions about ubiquitous information:
Where is persistent information stored?
  – Want: Geographic independence for availability, durability, and freedom to adapt to circumstances
How is it protected?
  – Want: Encryption for privacy, signatures for authenticity, and Byzantine commitment for integrity
Can we make it indestructible?
  – Want: Redundancy with continuous repair and redistribution for long-term durability
Is it hard to manage?
  – Want: automatic optimization, diagnosis and repair

Everyone’s Data, One Utility
Millions of servers, billions of clients
… YEAR durability (excepting fall of society)
Maintains Privacy, Access Control, Authenticity
Incrementally Scalable (“Evolvable”)
Self Maintaining!
Not quite peer-to-peer:
  – Utilizing servers in infrastructure
  – Some computational nodes more equal than others

Want Automatic Maintenance
Can’t possibly manage billions of servers by hand!
System should:
  – Be fault-tolerant (high MTTF)
  – Repair itself (low MTTR through adaptation)
  – Incorporate new elements
Can we guarantee data is available for 1000 years?
  – New servers added from time to time
  – Old servers removed from time to time
  – Everything just works
Many components with geographic separation
  – System not disabled by natural disasters
  – Can adapt to changes in demand and regional outages
  – Gain in stability through statistics

OceanStore Assumptions
Untrusted Infrastructure:
  – The OceanStore is composed of untrusted components
  – Only ciphertext within the infrastructure
Responsible Party:
  – Some organization (e.g. a service provider) guarantees that your data is consistent and durable
  – Not trusted with content of data, merely its integrity
Mostly Well-Connected:
  – Data producers and consumers are connected to a high-bandwidth network most of the time
  – Exploit multicast for quicker consistency when possible
Promiscuous Caching:
  – Data may be cached anywhere, anytime

This Talk: making it real! (Or: you will hear reality from my students)

The Path of an OceanStore Update
[Figure: update path; labels are Clients, Inner-Ring Servers, Multicast trees, Second-Tier Caches]

Important Components:
Data Object: (Distribution-enabled data format)
  – Must support copy-on-write and versioning efficiently
  – Must allow sparse population of data in caches
  – Must smoothly interface with archive
Inner Ring: (Byzantine Agreement)
  – Check write access control
  – Serialize updates/resolve micro-conflicts
  – Sign result with Threshold Signature
  – Erasure code result and send fragments
Second Tier Server: (Promiscuous Caches)
  – Serve local clients
  – Tie itself into Dissemination tree
  – Apply updates that it receives through tree
  – Decision point for caching policies: tentative vs committed

Implementation Framework
[Figure: implementation stack; labels include Asynchronous Disk, Asynchronous Network, Network, Operating System, Java Virtual Machine, Thread Scheduler, Dispatch, and the modules X, Y, Consistency, Location & Routing, Archival, Introspection]
Event-driven Implementation Model in Java
  – Divided into a sequence of communicating “stages”
  – Communication between stages in the form of “snoopable” messages
  – > 100,000 lines of Java, Comments, Test scripts
  – Substantially functioning!
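
The stage model above can be sketched in a few lines. This is only an illustration in Python (the prototype itself is Java, per the slide); the `Dispatcher` class, the message types, and the `demo` function are hypothetical names, not part of OceanStore.

```python
from collections import deque

class Dispatcher:
    """Delivers each message to the stages subscribed to its type.
    Snoopers observe every message without being its addressee."""

    def __init__(self):
        self.handlers = {}    # message type -> list of stage handlers
        self.snoopers = []    # callables that see all messages
        self.queue = deque()  # pending (type, payload) messages

    def subscribe(self, msg_type, handler):
        self.handlers.setdefault(msg_type, []).append(handler)

    def snoop(self, observer):
        self.snoopers.append(observer)

    def post(self, msg_type, payload):
        self.queue.append((msg_type, payload))

    def run(self):
        # Single-threaded event loop: handlers may post further messages.
        while self.queue:
            msg_type, payload = self.queue.popleft()
            for observer in self.snoopers:
                observer(msg_type, payload)
            for handler in self.handlers.get(msg_type, []):
                handler(payload)

def demo():
    """Two hypothetical stages: 'update' forwards to 'archive'."""
    d = Dispatcher()
    archived, seen = [], []
    d.snoop(lambda msg_type, payload: seen.append(msg_type))
    d.subscribe("update", lambda p: d.post("archive", p.upper()))
    d.subscribe("archive", archived.append)
    d.post("update", "block-1")
    d.run()
    return archived, seen
```

Because every message passes through the dispatcher, an observer can watch all traffic without any stage knowing about it.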

GUIDs for Naming
Unique, location independent identifiers:
  – Every version of every unique entity has a permanent, Version-GUID (or VGUID):
      – Hash over content
      – Versioning supports time-travel
  – Each object has a permanent (version-independent) Archival-GUID (or AGUID)
  – Signed Associations between AGUIDs and latest VGUIDs are produced by inner ring (called Heartbeats)
Naming hierarchy:
  – Users map from names to AGUIDs via hierarchy of OceanStore objects
      – Each link is an AGUID
[Figure: naming hierarchy with nodes Foo, Bar, Baz, Myfile and an out-of-band “Root link”]
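
A minimal sketch of the naming scheme, under loud assumptions: SHA-256 stands in for the unspecified content hash, and an HMAC with a shared demo key stands in for the inner ring's signature (the slides describe threshold signatures, which are not modeled here). All function names are hypothetical.

```python
import hashlib
import hmac

# Assumption: a shared demo key replaces the inner ring's real signing key.
RING_KEY = b"demo-inner-ring-key"

def vguid(content: bytes) -> str:
    """Version GUID: a hash over the content (SHA-256 is an assumption)."""
    return hashlib.sha256(content).hexdigest()

def make_heartbeat(aguid: str, latest_vguid: str) -> dict:
    """Signed association between an AGUID and its latest VGUID."""
    msg = f"{aguid}:{latest_vguid}".encode()
    sig = hmac.new(RING_KEY, msg, hashlib.sha256).hexdigest()
    return {"aguid": aguid, "vguid": latest_vguid, "sig": sig}

def verify_heartbeat(hb: dict) -> bool:
    msg = f"{hb['aguid']}:{hb['vguid']}".encode()
    expected = hmac.new(RING_KEY, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, hb["sig"])

def read_latest(aguid: str, heartbeat: dict, store: dict) -> bytes:
    """Resolve AGUID -> latest VGUID -> content, checking each step."""
    if heartbeat["aguid"] != aguid or not verify_heartbeat(heartbeat):
        raise ValueError("heartbeat does not verify")
    content = store[heartbeat["vguid"]]
    if vguid(content) != heartbeat["vguid"]:
        raise ValueError("content does not match its VGUID")
    return content
```

Since a VGUID is derived from the content, any server may return the block and the reader can still check it; older versions stay addressable under their own VGUIDs, which is what makes time-travel possible.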

Data Object Structure
All about flexibility and validation

Status: Data Object Development
Second-Tier Replica support: functional
  – Second-tier caches can hold multiple versions
  – Tie themselves into multicast trees
      – Several dissemination tree algorithms explored
      – Updates forwarded from inner ring through trees
Complete B-Tree object structure developed
  – Data blocks named with unforgeable hashes
      – Hashes can point to archival fragments/live blocks
  – Supports copy on write
  – Top block defines complete version
      – Missing blocks filled in from archive or other replicas
Update commits with distributed threshold signatures
  – Byzantine commitment not quite integrated into prototype
Traffic generator for testing
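
To illustrate hash-named blocks with copy-on-write, here is a deliberately simplified sketch: a single top block listing child hashes replaces the B-tree, SHA-256 is an assumed hash, and a Python dict plays the block store. The function names are hypothetical.

```python
import hashlib
import json

def put(store: dict, data: bytes) -> str:
    """Store a block under the hash of its contents and return that name."""
    name = hashlib.sha256(data).hexdigest()
    store[name] = data
    return name

def make_version(store: dict, blocks: list) -> str:
    """Write the data blocks, then a top block listing their names.
    The top block's name identifies the complete version."""
    child_names = [put(store, b) for b in blocks]
    return put(store, json.dumps(child_names).encode())

def update_block(store: dict, top_name: str, index: int, new_data: bytes) -> str:
    """Copy-on-write: only the changed block and a new top block are written;
    unchanged blocks are shared with the previous version."""
    child_names = json.loads(store[top_name])
    child_names[index] = put(store, new_data)
    return put(store, json.dumps(child_names).encode())

def read_version(store: dict, top_name: str) -> list:
    return [store[name] for name in json.loads(store[top_name])]
```

Updating one block of a three-block object adds just two entries to the store, and both versions remain readable from their respective top blocks.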

Exploiting Law of Large Numbers for Durability
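
A small worked calculation shows why spreading many fragments helps. The numbers below are illustrative assumptions, not results from the talk: each fragment is assumed to survive independently with probability p, and an object is recoverable from any m of its n fragments.

```python
from math import comb

def p_object_survives(n: int, m: int, p: float) -> float:
    """Probability that at least m of n fragments survive, assuming each
    fragment survives independently with probability p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(m, n + 1))

# Same 2x storage overhead in both cases (illustrative parameters):
two_replicas = p_object_survives(n=2, m=1, p=0.9)    # plain replication
rate_half = p_object_survives(n=32, m=16, p=0.9)     # erasure-coded fragments
```

With two full replicas the object survives with probability 1 - 0.1^2 = 0.99, while 32 fragments of which any 16 suffice lose the object only if 17 or more fail, which is far less likely at the same storage cost. The independence assumption carries the argument; correlated failures would weaken it.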

The Dissemination Process
[Figure: dissemination pipeline; labels are Model Builder, Set Creator, Disseminator, Introspection, Human Input, Network Monitoring, model, set, probe, type, fragments]

Achieving Low MTTR: Global Heartbeats
Trigger repair when level of redundancy too low
Continuous sweep (slowly over time)

Status: Archival Infrastructure
Archival Fragments generated by Inner Ring
  – Multi-stage-based implementation at inner ring
  – Storage servers hold fragments
  – Caching servers (2nd-tier replicas) hold data objects
Independence Analysis (mostly there)
  – Node discovery technique exists
  – Analysis of long-running reliability data
  – Dissemination-set creator: initial versions
Storage servers (Naïve but functional):
  – Initial implementation: cache + object store
  – Ongoing tuning efforts
  – Redesign in the works

Location Independent Routing
Paradigm: Routing
  – Route messages to objects by GUID regardless of location
Fast, probabilistic search for “routing cache”:
  – Built from attenuated bloom filters
  – Approximation to gradient search
Redundant Plaxton Mesh used for underlying routing infrastructure:
  – Randomized data structure with locality properties
  – Redundant, insensitive to faults, and repairable
  – Amenable to continuous adaptation to adjust for:
      – Changing network behavior
      – Faulty servers
      – Denial of service attacks
Tomorrow: 3 talks on Routing
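
The probabilistic "routing cache" can be sketched as follows. This is an assumption-laden illustration: each neighbor link carries a list of Bloom filters in which entry i summarizes the objects reachable i+1 hops away through that neighbor, and a query is forwarded toward the neighbor with the shallowest match. The filter size, hash count, and function names are all hypothetical.

```python
import hashlib

M = 256  # bits per Bloom filter (assumption)
K = 3    # hash functions per item (assumption)

def bit_positions(guid: str) -> list:
    """K bit positions for a GUID, derived from salted SHA-256 digests."""
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{guid}".encode()).digest()[:4], "big") % M
        for i in range(K)
    ]

def bloom_add(bits: int, guid: str) -> int:
    """Return the filter (held as an int bitmask) with the GUID added."""
    for pos in bit_positions(guid):
        bits |= 1 << pos
    return bits

def bloom_maybe_contains(bits: int, guid: str) -> bool:
    """True if the GUID may be present; false positives are possible."""
    return all((bits >> pos) & 1 for pos in bit_positions(guid))

def best_next_hop(attenuated: dict, guid: str):
    """attenuated maps neighbor -> list of filters, one per hop distance.
    Returns (neighbor, distance) for the shallowest match, or None."""
    best = None
    for neighbor, levels in attenuated.items():
        for distance, bits in enumerate(levels, start=1):
            if bloom_maybe_contains(bits, guid):
                if best is None or distance < best[1]:
                    best = (neighbor, distance)
                break  # deeper levels of this neighbor cannot be closer
    return best
```

When no filter matches, the sketch returns `None`, which corresponds to falling back on the underlying routing mesh.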

Status: Location Independent Routing
Basic Tapestry infrastructure is operational
  – Single-path static routing: works
  – Multi-path adaptive routing: mostly there
  – Dynamic Integration of new nodes: implemented
Network adaptation almost there (Patchwork)
  – Framework for Measurement of network properties
  – Periodic beacons measure loss and network latency
Exploitation of Differences in nodes:
  – Brocade backbone supplement to Tapestry: Improves routing
  – Differentiation in service experiments ongoing
Theoretical Results on Tapestry
  – Construction/Analysis of Dynamic Integration Algorithms
  – Voluntary/involuntary node deletion algorithms
  – View of Tapestry as data structure for solving nearest neighbor
Attenuated Bloom Filters are operational
  – Implemented and functional
  – Optimizes short-distance routing infrastructure!

Introspection: The New Architectural Creed
Using Moore’s law gains for something other than performance
Examples:
  – Online algorithmic validation
  – Model building for data rearrangement
      – Availability
      – Better prefetching
  – Extreme Durability (1000-year time scale?)
      – Use of erasure coding and continuous repair
  – Stability through Statistics
      – Use of redundancy to gain more predictable behavior
      – Systems version of Thermodynamics!
  – Continuous Dynamic Optimization of other sorts
[Figure: cycle labeled Adapt, Compute, Monitor]

Status: Introspection
Development of OIL framework for introspection: this framework is operational
  – Collection facilities can observe all events in the system
  – Multiple aggregation models available
Example 1: Clustering for prefetching
  – Currently builds hidden Markov model of access patterns utilizing OIL framework
  – Almost there: Use models to better prefetch objects
  – Placement of replicas assisted by bloom filters (almost)
Example 2: Observation of network behavior
  – Framework for observation of network latencies
  – Adaptation of network topology: almost there
Example 3: Grammar building for prefetching
  – Experiment of introspection at processor level
  – Talk later today about this (Mark Whitney)
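
As a rough illustration of model-driven prefetching, the sketch below uses a plain first-order Markov chain over observed accesses. This is a stand-in and much simpler than the hidden Markov model named on the slide; the `AccessModel` class is hypothetical and does not reflect the OIL framework's API.

```python
from collections import Counter, defaultdict

class AccessModel:
    """Counts which object tends to follow which in the access stream."""

    def __init__(self):
        self.transitions = defaultdict(Counter)  # guid -> Counter of successors
        self.previous = None

    def observe(self, guid: str) -> None:
        """Record one access event, as an introspection observer might."""
        if self.previous is not None:
            self.transitions[self.previous][guid] += 1
        self.previous = guid

    def predict_next(self, guid: str):
        """Most frequently observed successor of guid, or None if unseen."""
        followers = self.transitions.get(guid)
        if not followers:
            return None
        return followers.most_common(1)[0][0]
```

A cache could call `predict_next` on each access and fetch the predicted object ahead of demand; how confident a prediction must be before prefetching is a policy choice left open here.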

Status: Medium Scale Test and Emulation
Two medium clusters from IBM SUR Grant
  – Each cluster 21 servers:
      – Each with two 1 GHz processors
      – One GByte of RAM, 73 GB of Disk
  – 1 GB Switch per cluster
  – MIRNET switch
Plan to have continuous OceanStore components running
  – in approximately 1 month
Emulation technology: currently works
  – Able to simulate large-scale network by simulating network latencies
  – Multiple OceanStore nodes emulated/node

Reality: Web Caching through OceanStore

Day Dreams? (Becoming real)
NFS File system built in OceanStore (Exists)
  – Still have to integrate ACLs
  – Update to latest prototype
Windows Installable File system (Planning)
  – “USB Keys” hold cryptographic keys and personal identity
  – Automatic downloading and verification of filesystem
IMAP/OceanStore gateway (Planning)
Lotus Notes Domino Server
  – Exploring use of work flow on top of OceanStore

OceanStore Conclusions
OceanStore: everyone’s data, one big utility
  – Global Utility model for persistent data storage
Very Soon: Working OceanStore cluster!!!!
  – Event-driven programming in Java
  – You will hear about components today and tomorrow
OceanStore assumptions:
  – Untrusted infrastructure with a responsible party
  – Mostly connected with conflict resolution
  – Continuous on-line optimization

For more info:
OceanStore vision paper for ASPLOS 2000
  – “OceanStore: An Architecture for Global-Scale Persistent Storage”
OceanStore paper on Maintenance (IEEE IC):
  – “Maintenance-Free Global Data Storage”
Both available on OceanStore web site: