OceanStore An Architecture for Global-Scale Persistent Storage Motivation Feature Application Specific Components - Secure Naming - Update - Access Control-

Slides:



Advertisements
Similar presentations
Tapestry: Decentralized Routing and Location SPAM Summer 2001 Ben Y. Zhao CS Division, U. C. Berkeley.
Advertisements

Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan MIT and Berkeley presented by Daniel Figueiredo Chord: A Scalable Peer-to-peer.
What is OceanStore? - 10^10 users with files each - Goals: Durability, Availability, Enc. & Auth, High performance - Worldwide infrastructure to.
Peer-to-Peer Systems Chapter 25. What is Peer-to-Peer (P2P)? Napster? Gnutella? Most people think of P2P as music sharing.
Storage management and caching in PAST Antony Rowstron and Peter Druschel Presented to cs294-4 by Owen Cooper.
Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility Antony Rowstron, Peter Druschel Presented by: Cristian Borcea.
Pond: the OceanStore Prototype CS 6464 Cornell University Presented by Yeounoh Chung.
Pond The OceanStore Prototype. Introduction Problem: Rising cost of storage management Observations: Universal connectivity via Internet $100 terabyte.
David Choffnes, Winter 2006 OceanStore Maintenance-Free Global Data StorageMaintenance-Free Global Data Storage, S. Rhea, C. Wells, P. Eaton, D. Geels,
1 Accessing nearby copies of replicated objects Greg Plaxton, Rajmohan Rajaraman, Andrea Richa SPAA 1997.
OceanStore: An Infrastructure for Global-Scale Persistent Storage John Kubiatowicz, David Bindel, Yan Chen, Steven Czerwinski, Patrick Eaton, Dennis Geels,
Common approach 1. Define space: assign random ID (160-bit) to each node and key 2. Define a metric topology in this space,  that is, the space of keys.
Option 2: The Oceanic Data Utility: Global-Scale Persistent Storage John Kubiatowicz.
Outline for today Structured overlay as infrastructures Survey of design solutions Analysis of designs.
OceanStore Status and Directions ROC/OceanStore Retreat 1/16/01 John Kubiatowicz University of California at Berkeley.
OceanStore: An Architecture for Global-Scale Persistent Storage John Kubiatowicz University of California at Berkeley.
1 OceanStore Global-Scale Persistent Storage Ying Lu CSCE496/896 Spring 2011.
OceanStore An Architecture for Global-scale Persistent Storage By John Kubiatowicz, David Bindel, Yan Chen, Steven Czerwinski, Patrick Eaton, Dennis Geels,
Each mesh represents a single hop on the route to a given root. Sibling nodes maintain pointers to each other. Each referrer has pointers to the desired.
Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing Presenter: Chunyuan Liao March 6, 2002 Ben Y.Zhao, John Kubiatowicz, and.
Naming and Integrity: Self-Verifying Data in Peer-to-Peer Systems Hakim Weatherspoon, Chris Wells, John Kubiatowicz University of California, Berkeley.
OceanStore: Data Security in an Insecure world John Kubiatowicz.
OceanStore Theoretical Issues and Open Problems John Kubiatowicz University of California at Berkeley.
Introspective Replica Management Yan Chen, Hakim Weatherspoon, and Dennis Geels Our project developed and evaluated a replica management algorithm suitable.
Weaving a Tapestry Distributed Algorithms for Secure Node Integration, Routing and Fault Handling Ben Y. Zhao (John Kubiatowicz, Anthony Joseph) Fault-tolerant.
OceanStore: An Architecture for Global-Scale Persistent Storage Professor John Kubiatowicz, University of California at Berkeley
Opportunities for Continuous Tuning in a Global Scale File System John Kubiatowicz University of California at Berkeley.
Concurrency Control & Caching Consistency Issues and Survey Dingshan He November 18, 2002.
OceanStore/Tapestry Toward Global-Scale, Self-Repairing, Secure and Persistent Storage Anthony D. Joseph John Kubiatowicz Sahara Retreat, January 2003.
Or, Providing High Availability and Adaptability in a Decentralized System Tapestry: Fault-resilient Wide-area Location and Routing Issues Facing Wide-area.
Or, Providing Scalable, Decentralized Location and Routing Network Services Tapestry: Fault-tolerant Wide-area Application Infrastructure Motivation and.
OceanStore Global-Scale Persistent Storage John Kubiatowicz University of California at Berkeley.
Long Term Durability with Seagull Hakim Weatherspoon (Joint work with Jeremy Stribling and OceanStore group) University of California, Berkeley ROC/Sahara/OceanStore.
OceanStore: An Architecture for Global - Scale Persistent Storage John Kubiatowicz, David Bindel, Yan Chen, Steven Czerwinski, Patric Eaton, Dennis Geels,
Tapestry GTK Devaroy (07CS1012) Kintali Bala Kishan (07CS1024) G Rahul (07CS3009)
1 Plaxton Routing. 2 Introduction Plaxton routing is a scalable mechanism for accessing nearby copies of objects. Plaxton mesh is a data structure that.
OceanStore: An Architecture for Global-Scale Persistent Storage John Kubiatowicz, et al ASPLOS 2000.
Arnold N. Pears, CoRE Group Uppsala University 3 rd Swedish Networking Workshop Marholmen, September Why Tapestry is not Pastry Presenter.
Failure Resilience in the Peer-to-Peer-System OceanStore Speaker: Corinna Richter.
Pond: the OceanStore Prototype Sean Rhea, Patric Eaton, Dennis Gells, Hakim Weatherspoon, Ben Zhao, and John Kubiatowicz University of California, Berkeley.
OceanStore: An Infrastructure for Global-Scale Persistent Storage John Kubiatowicz, David Bindel, Yan Chen, Steven Czerwinski, Patrick Eaton, Dennis Geels,
OceanStore: In Search of Global-Scale, Persistent Storage John Kubiatowicz UC Berkeley.
Distributed Architectures. Introduction r Computing everywhere: m Desktop, Laptop, Palmtop m Cars, Cellphones m Shoes? Clothing? Walls? r Connectivity.
1 More on Plaxton routing There are n nodes, and log B n digits in the id, where B = 2 b The neighbor table of each node consists of - primary neighbors.
Plethora: A Wide-Area Read-Write Storage Repository Design Goals, Objectives, and Applications Suresh Jagannathan, Christoph Hoffmann, Ananth Grama Computer.
OceanStore: An Architecture for Global- Scale Persistent Storage.
Tapestry: A Resilient Global-scale Overlay for Service Deployment 1 Ben Y. Zhao, Ling Huang, Jeremy Stribling, Sean C. Rhea, Anthony D. Joseph, and John.
1 Secure Peer-to-Peer File Sharing Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, Hari Balakrishnan MIT Laboratory.
Toward Achieving Tapeless Backup at PB Scales Hakim Weatherspoon University of California, Berkeley Frontiers in Distributed Information Systems San Francisco.
Plethora: Infrastructure and System Design. Introduction Peer-to-Peer (P2P) networks: –Self-organizing distributed systems –Nodes receive and provide.
POND: THE OCEANSTORE PROTOTYPE S. Rea, P. Eaton, D. Geels, H. Weatherspoon, J. Kubiatowicz U. C. Berkeley.
Peer to Peer Network Design Discovery and Routing algorithms
Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing Presenter : Lee Youn Do Oct 5, 2005 Ben Y.Zhao, John Kubiatowicz, and Anthony.
Large Scale Sharing Marco F. Duarte COMP 520: Distributed Systems September 19, 2004.
1 Plaxton Routing. 2 History Greg Plaxton, Rajmohan Rajaraman, Andrea Richa. Accessing nearby copies of replicated objects, SPAA 1997 Used in several.
CS791Aravind Elango Maintenance-Free Global Data Storage Sean Rhea, Chris Wells, Patrick Eaten, Dennis Geels, Ben Zhao, Hakim Weatherspoon and John Kubiatowicz.
Grid Services for Digital Archive Tao-Sheng Chen Academia Sinica Computing Centre
OceanStore Global-Scale Persistent Storage John Kubiatowicz University of California at Berkeley.
OceanStore : An Architecture for Global-Scale Persistent Storage Jaewoo Kim, Youngho Yi, Minsik Cho.
Presented by Edith Ngai MPhil Term 3 Presentation
Option 2: The Oceanic Data Utility: Global-Scale Persistent Storage
OceanStore: An Architecture for Global-Scale Persistent Storage
Plethora: Infrastructure and System Design
Accessing nearby copies of replicated objects
OceanStore: Data Security in an Insecure world
John D. Kubiatowicz UC Berkeley
OceanStore: An Architecture for Global-Scale Persistent Storage
Tapestry: Scalable and Fault-tolerant Routing and Location
Content Distribution Network
Outline for today Oceanstore: An architecture for Global-Scale Persistent Storage – University of California, Berkeley. ASPLOS 2000 Feasibility of a Serverless.
Presentation transcript:

OceanStore An Architecture for Global-Scale Persistent Storage Motivation Feature Application Specific Components - Secure Naming - Update - Access Control- Deep Archival Storage - Data Location and Routing- Introspection Conclusion

provides persistent storage for ubiquitous computing Secure information Durable information Automatic and reliable archiving of information Geographically distributed data and cache users * 10,000 files/user = files Motivation

Data Utility Model - user - service provider / responsible party Untrusted Infrastructure - privacy & integrity & robustness Nomadic Data/Promiscuous Caching floating replicas Deep Archival Storage archival form of data object self-verifying data Introspection Features Computation OptimizationObservation

Groupware and personal information management tools challenge: concurrent updates from many people solution: flexible update mechanism Digital libraries and repositories for scientific data challenge: Massive quantities of storage, reliability, complicated management solution: Deep archival storage + seamless data migration New streaming applications challenge: data aggregation and dissemination solution: uniform infrastructure Applications

Secure Naming Fundamental Unit: Persistent Object GUID: secure hash, 160bit, location independent Uniqueness + unforgeability + verification AGUID Active data: SHA-1(human-readable name + owner’s public key) VGUID Archival data: SHA-1(data) NodeID Server: SHA-1(public key of server) Directory object: securely mapping human-readable names to GUID

GUIDs  Secure Pointers Name+Key Active GUID Global Object Resolutions Floating Replica (Active Object) Active Data Commit Logs CKPoint GUID Archival GUID Signature RP Keys ACLs MetaData Global Object Resolution Archival copy or snapshot Archival copy or snapshot Archival copy or snapshot Erasure Coded Archival GUID Signature Inactive Object Global Object Resolution

Access Control Reader restriction restrict key distribution only to readers Writer restriction ACL require all writes be signed

Data Location and Routing Two levels: Fast probabilistic search for “routing cache” Attenuated Bloom filter first Bloom filter: record of the objects contained locally on the current node ith Bloom filter:union of all of the Bloom filters for all of the nodes a distance i through any path from the current node. fully distributed, constant amount of storage locality – provided by introspection mechanism Slow guaranteed global search plaxton mesh

Global Algorithm Nodes : NodeID Data Object: GUID –Each object has Root node f (ObjectID) = RootID, randomly mapped –Root node is responsible for storing object’s location –Publish process : deposit a pointer at every hop along the path to root node Plaxton mesh Incremental suffix based routing

Plaxton Mesh Incremental suffix-based routing NodeID 0x43FE NodeID 0x13FE NodeID 0xABFE NodeID 0x1290 NodeID 0x239E NodeID 0x73FE NodeID 0x423E NodeID 0x79FE NodeID 0x23FE NodeID 0x73FF NodeID 0x555E NodeID 0x035E NodeID 0x44FE NodeID 0x9990 NodeID 0xF990 NodeID 0x993E NodeID 0x04FE NodeID 0x43FE

Object Location Randomization and Locality

Fault-tolerant Routing Multiple roots of each object using salted hash Additional neighbor links & neighbor link repair Repeat publishing process to repair location pointers Detect failures via soft-state probe packets Dynamic insertion & deletion

Update Model TimeStamp Client ID {Pred1, Update1} {Pred2, Update2} {Pred3, Update3} Client Signature Update message format: Conflict resolution Predicate-action pairs write restriction All updates submitted to Inner Ring servers which use byzantine agreement protocol to choose the final commit order Responsible party decides the inner ring Use plaxton mesh to disseminate commit order to secondary tier replicas Flexible update: support a range of consistency semantics (e.g. ACID) Untrusted infrastructure, limitation to work over ciphertext. Performance:- requirement of network bandwidth - latency of the client side

OceanStore Update

Deep Archive Storage Archival Data in Erasure Coded Fragments - Erasure codes produce n fragments, where any m is sufficient to reconstruct data. m < n. rate r = m/n. Storage overhead is 1/r. OceanStore equivalent of stable store Archival Fragments generated by Inner Ring Fragments are self-verifying

Deep Archive Storage- update

Deep Archival Storage - Self Verifying Data Fragment 3: Fragment 4: Data: Fragment 1: Fragment 2: H2H34HdF1 - fragment data H14 data H1H34HdF2 - fragment data H4H12HdF3 - fragment data H3H12HdF4 - fragment data F1F2F3F4 H1H2H3H4 H12H34 H14 B-GUID Hd Data Encoded Fragments F1 H2 H34 Hd Fragment 1: H2H34HdF1 - fragment data

Introspection Monitoring and adaptation of routing substrate –Optimization of Plaxton Mesh –Adaptation of second-tier multicast tree Continuous monitoring of access patterns: –Clustering algorithms to discover object relationships Clustered prefetching: demand-fetching related objects Proactive-prefetching: get data there before needed –Time series-analysis of user and data motion Continuous testing and repair of information –Slow sweep through all information to make sure there are sufficient erasure-coded fragments –Continuously reevaluate risk and redistribute data –Diagnosis and repair of routing and location infrastructure

Conclusions OceanStore: everyone’s data, one big utility –Global Utility model for persistent data storage OceanStore properties: –Provides security, privacy, and integrity –Provides extreme durability –Lower maintenance cost through continuous adaptation, self-diagnosis and repair –Large scale system has good statistical properties

Difference: Oceanstore: persistent storage infrastructure, untrusted infrastructure, passive data object OSD: active/dynamic object, trust model can not be too active over ciphertext. Common issues: - data security(privacy, integrity, reliability) - authentication and authorization - naming and routing - data consistency - caching - maintain-free - applications OceanStore vs OSD