Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility Antony Rowstron, Peter Druschel Presented by: Cristian Borcea.

What is PAST? An archival storage and content distribution utility, not a general-purpose file system. It stores multiple replicas of files and caches additional copies of popular files in the local file system.

How it works: PAST is built over a self-organizing, Internet-based overlay network and is based on the Pastry routing scheme. It offers persistent storage services for replicated read-only files. Owners can insert and reclaim files; clients can only look them up.

PAST nodes: The collection of PAST nodes forms an overlay network. Minimally, a PAST node is an access point. Optionally, it contributes storage and participates in routing.

PAST operations: fileId = Insert(name, owner-credentials, k, file); file = Lookup(fileId); Reclaim(fileId, owner-credentials).

Insertion: The fileId is computed as the secure hash of the file name, the owner's public key, and a salt. The file is stored on the k nodes whose nodeIds are numerically closest to the 128 most significant bits (msb) of the fileId. Recall from Pastry that each node has a 128-bit nodeId in a circular namespace.
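The placement rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes SHA-1 as the secure hash (giving a 160-bit fileId), assumes a plain byte concatenation of the three inputs, and the function names are hypothetical.

```python
import hashlib
from typing import List

NODE_ID_BITS = 128   # nodeId width, as stated on the slide
FILE_ID_BITS = 160   # assumption: SHA-1 digest width


def compute_file_id(name: str, owner_public_key: bytes, salt: bytes) -> int:
    """Hash the file name, owner's public key, and salt into a fileId.

    The concatenation order and encoding are assumptions for illustration.
    """
    digest = hashlib.sha1(name.encode("utf-8") + owner_public_key + salt).digest()
    return int.from_bytes(digest, "big")


def routing_key(file_id: int) -> int:
    """Keep only the 128 most significant bits of the fileId."""
    return file_id >> (FILE_ID_BITS - NODE_ID_BITS)


def circular_distance(a: int, b: int) -> int:
    """Numerical distance between two ids on the circular 128-bit namespace."""
    size = 1 << NODE_ID_BITS
    d = abs(a - b) % size
    return min(d, size - d)


def k_closest_nodes(file_id: int, node_ids: List[int], k: int) -> List[int]:
    """Return the k nodeIds numerically closest to the fileId's 128 msb."""
    key = routing_key(file_id)
    return sorted(node_ids, key=lambda n: circular_distance(n, key))[:k]
```

In a real deployment no node knows the full list of nodeIds; Pastry routing finds the closest nodes. The global sort here only shows which nodes the rule selects.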

Insert, continued: The required storage is debited against the owner's storage quota. A file certificate is returned; it is signed with the owner's private key and contains the fileId, a hash of the content, the replication factor, and other fields. The file and certificate are routed via Pastry. Each of the k replica-storing nodes attaches a store receipt. An acknowledgment is sent back after all k nodes have accepted the file.

Lookup and reclaim: Lookup: Pastry locates a "near" node that has a copy and retrieves it. Reclaim has weak consistency: after a reclaim, a lookup is no longer guaranteed to retrieve the file, but there is also no guarantee that the file is no longer available.

Security: Each PAST node and each user of the system holds a smartcard. A private/public key pair is associated with each card. Smartcards generate and verify certificates and maintain storage quotas.

More on security: Smartcards ensure the integrity of nodeId and fileId assignments. Store receipts prevent malicious nodes from creating fewer than k copies. File certificates allow storage nodes and clients to verify the integrity and authenticity of stored content, and to enforce the storage quota.

Storage management: Based on local coordination among nodes with nearby nodeIds. Responsibilities: balance the free storage among nodes, and maintain the invariant that the replicas of each file are stored on the k nodes closest to its fileId.

Causes of storage imbalance and solutions: Causes: the number of files assigned to each node may vary; the size of the inserted files may vary; the storage capacity of PAST nodes differs. Solutions: replica diversion and file diversion.

Replica diversion: Recall that each node maintains a leaf set: the l nodes with nodeIds numerically closest to the given node. If a node A cannot accommodate a copy locally, it considers replica diversion: A chooses a node B in its leaf set and asks it to store the replica. A then enters a pointer to B's copy in its table and issues a store receipt.

Policies for accepting a replica: A node rejects a file if (file size / remaining free storage) > t, where t is a fixed threshold. t has different values for a primary replica (nodes among the k numerically closest) and a diverted replica (nodes in the same leaf set, but not among the k closest): t(primary) > t(diverted).
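As a minimal sketch of the acceptance rule above: the threshold values below are illustrative assumptions (the slide only states that t(primary) > t(diverted)), and the function name is hypothetical.

```python
# Illustrative thresholds; the slide only requires T_PRIMARY > T_DIVERTED.
T_PRIMARY = 0.1
T_DIVERTED = 0.05


def accepts_replica(file_size: int, free_space: int, threshold: float) -> bool:
    """Accept unless (file size / remaining free storage) exceeds the threshold."""
    if free_space <= 0:
        # A full node cannot accept anything.
        return False
    return file_size / free_space <= threshold
```

Because t(primary) > t(diverted), a node may accept a file as a primary replica yet reject the same file as a diverted replica. For example, an 8 MB file on a node with 100 MB free has a ratio of 0.08, which passes the assumed primary threshold but fails the assumed diverted one.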

File diversion: When one of the k nodes declines to store a replica, replica diversion is tried. If the node chosen for the diverted replica also declines, the entire file is diverted: a negative acknowledgment is sent, and the client generates another fileId and starts again. After 3 rejections the user is notified.
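The client-side retry loop for file diversion might look like the sketch below. It rests on stated assumptions: a new fileId is obtained by choosing a new salt (the fileId is a hash of name, public key, and salt), "3 rejections" is read as three attempts in total, and `try_insert` is a hypothetical callback that returns a fileId on success or None on a negative acknowledgment.

```python
import os
from typing import Callable, Optional

MAX_REJECTIONS = 3  # per the slide: after 3 rejections the user is notified


def insert_with_file_diversion(
    try_insert: Callable[[bytes], Optional[str]],
    new_salt: Callable[[], bytes] = lambda: os.urandom(8),
) -> Optional[str]:
    """Retry an insert with a fresh salt after each negative acknowledgment.

    Returns the fileId of the first accepted attempt, or None once
    MAX_REJECTIONS attempts have all been rejected.
    """
    for _ in range(MAX_REJECTIONS):
        file_id = try_insert(new_salt())  # a new salt yields a new fileId
        if file_id is not None:
            return file_id
    return None  # the caller reports the failure to the user
```

A new fileId maps the file to a different region of the nodeId space, so the retry targets a different set of k nodes that may have more free storage.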

Maintaining replicas: Pastry uses keep-alive messages and adjusts the leaf set after failures. The same adjustment takes place when a node joins. What happens to the copies stored by a failed node? What about the copies stored by a node that leaves or enters a new leaf set?

Maintaining replicas, continued: To maintain the invariant (k copies), the replicas have to be re-created in the previous cases, which incurs a large overhead. Proposed solution for joins: lazy re-creation. First insert a pointer to the node that holds the replicas, then migrate them gradually.

Caching: The k replicas are maintained in PAST for availability. The fetch distance is measured in overlay network hops (which says little about the actual network distance). Caching is used to improve performance.

Caching, continued: PAST nodes use the "unused" portion of their advertised disk space to cache files. When storing a new primary or diverted replica, a node evicts one or more cached copies. How it works: a file that Pastry routes through a node (during an insert or lookup) is inserted into the local cache if its size is less than c, where c is a fraction of the current cache size.
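The cache admission rule can be sketched as below. The class and method names are hypothetical, and the eviction order (oldest cached file first) is a simplifying assumption; the slide does not specify a replacement policy.

```python
from collections import OrderedDict


class FileCache:
    """Cache living in the unused part of a node's advertised disk space."""

    def __init__(self, capacity: int, fraction_c: float):
        self.capacity = capacity      # current cache size in bytes
        self.fraction_c = fraction_c  # assumed to be at most 1.0
        self.files = OrderedDict()    # fileId -> size, oldest first

    def used(self) -> int:
        return sum(self.files.values())

    def _evict_until(self, limit: int) -> None:
        # Assumption: evict the oldest cached copy first.
        while self.files and self.used() > limit:
            self.files.popitem(last=False)

    def shrink(self, new_capacity: int) -> None:
        """Storing a new replica takes space away from the cache."""
        self.capacity = new_capacity
        self._evict_until(self.capacity)

    def offer(self, file_id: str, size: int) -> bool:
        """Cache a file routed through this node if its size is below c."""
        if size >= self.fraction_c * self.capacity:
            return False
        if file_id in self.files:
            return True
        self._evict_until(self.capacity - size)
        self.files[file_id] = size
        return True
```

Usage: a node calls `offer` for every file that passes through it during an insert or lookup, and calls `shrink` when it stores a new primary or diverted replica, so cached copies never displace replicas.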

Conclusions: Along with Tapestry, Chord (CFS), and CAN, PAST represents the peer-to-peer routing and location schemes for storage. The ideas are almost the same in all of them. Questions raised at SOSP about them: Is there any real application for them? Who will trust these infrastructures to store their files?