Storage management and caching in PAST
Antony Rowstron and Peter Druschel
Presented to cs294-4 by Owen Cooper

Outline
  PAST goals
  PAST API
  File storage overview
  File and replica diversion
  Replica management
  Caching
  Performance
  Discussion

PAST (non)goals
  P2P global storage network
    – Use properties of existing p2p systems (Pastry)
    – Support for strong persistence
      Via a core set of replicas
    – High availability
      Via local caching
    – Scalable
      Obtain high storage utilization via local cooperation
    – Secure
  Design goals do not include
    – Replacing the file system
    – Updatable files
    – Directory or lookup service

Security Model
  Pastry node ids are a hash of a public key
  Smartcard-based security
    – Provides keys
    – Quota management
  Nodeid and fileid generation controlled
    – Try to stop nodes from getting consecutive ids
    – Or clients from overloading parts of the network
  But node id and real-world identity may not be linked
  Data not encrypted

PAST API
  In PAST, files are immutable
  fileid = Insert(filename, credentials, k, file)
    – Insert k copies of the file into the network, or fail
    – Fileid is a hash of (filename, credentials, salt)
    – Successful if acknowledged with receipts from k nodes
  file = Lookup(fileid)
    – Return a copy of the file if it exists
  Reclaim(fileid, credentials)
    – Reclaim accepted if requested by the owner
    – Allows, but does not require, storage reclamation

File insertion
  Insert(name, c, k, file)
    – Computes a storage certificate
      Contains fileid, hash of content, k, salt
    – Deducts k*filesize from quota
    – Routes the file and storage certificate via Pastry, using the fileid as the key
    – Node verifies the integrity of the file, stores it, and asks the k-1 closest nodes to store the file
      K-1 nodes in leaf set (k-1 <= l)
    – Node returns an ack with k signed storage receipts, or a nak
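The client-side steps of an insert can be sketched as follows. This is a minimal illustration, not PAST's implementation: the names `StorageCertificate`, `make_file_id`, `make_certificate`, and `charge_quota` are hypothetical, SHA-1 is assumed as the hash function (the slides do not name one), and signing by the smartcard is left out entirely.

```python
import hashlib
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageCertificate:
    """Hypothetical container for the fields the slide lists."""
    file_id: bytes
    content_hash: bytes
    k: int
    salt: bytes


def make_file_id(filename: str, owner_key: bytes, salt: bytes) -> bytes:
    # Hash (filename, credentials, salt); each part is length-prefixed so
    # that different splits of the same bytes cannot collide.
    h = hashlib.sha1()
    for part in (filename.encode("utf-8"), owner_key, salt):
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.digest()


def make_certificate(filename, owner_key, k, content, salt=None):
    # A fresh random salt is drawn unless the caller supplies one.
    if salt is None:
        salt = os.urandom(8)
    return StorageCertificate(
        file_id=make_file_id(filename, owner_key, salt),
        content_hash=hashlib.sha1(content).digest(),
        k=k,
        salt=salt,
    )


def charge_quota(quota_remaining: int, k: int, file_size: int) -> int:
    # The insert costs k * filesize; refuse if the quota cannot cover it.
    cost = k * file_size
    if cost > quota_remaining:
        raise ValueError("insufficient quota")
    return quota_remaining - cost
```

Because the salt is part of the hashed input, changing only the salt yields a different fileid for the same file, which is what file diversion relies on.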

Lookup and Reclamation
  Pastry ensures replica is found
    – Since a lookup is routed to the closest nodeid
  Reclamation
    – Client generates a reclaim certificate
    – Sends it to the fileid via Pastry
    – Recipients verify the certificate & issue receipt
    – Client reclaims quota
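"Closest nodeid" here means numerically closest on a circular id space. The sketch below picks the k node ids closest to a fileid by brute force over a known list; it illustrates only the distance notion, not Pastry's routing. The function name `k_closest` and the `id_bits` parameter are assumptions made for the example.

```python
def k_closest(node_ids, file_id, k, id_bits=128):
    """Return the k node ids numerically closest to file_id on a ring."""
    space = 1 << id_bits

    def ring_distance(node_id):
        # Distance may wrap around the end of the id space.
        d = abs(node_id - file_id) % space
        return min(d, space - d)

    return sorted(node_ids, key=ring_distance)[:k]
```

With an 8-bit id space, nodes 10, 20 and 250, and fileid 5, the two closest nodes are 10 (distance 5) and 250 (distance 11, wrapping around), so the replica set is not simply the next ids in increasing order.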

Diversion
  A file or replica can be relocated
  For a replica, to another close node
    – If one of the k closest is overloaded
  For a file, to another set of nodes in the idspace
    – If the nodes around a fileid are (possibly locally) congested
  Why is this necessary?
    – Differing storage capacity at nodes
    – Differing file size for inserted files

Replica Diversion
  Node responsible for fileid asks k-1 neighbors to store the file
  Neighbor (N) may divert a copy to a node in its leaf set
    – Pointer to copy inserted at N
    – N issues storage certificate
    – N also inserts a pointer on the k+1th closest node
      No orphan if N fails
      N remains responsible for pointer maintenance

File Diversion
  Replica diversion is local
    – Allows storage choice between nodes around fileid
  File diversion
    – Triggered when an insert with a fileid fails
    – Insert is tried a total of three times
    – New fileid generated by changing the salt
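The retry loop behind file diversion can be sketched as below. This is an assumption-laden illustration: `try_insert` stands in for the whole network insert and is hypothetical, as is the 8-byte random salt; the limit of three attempts follows the slide.

```python
import os
from typing import Callable, Optional


def insert_with_file_diversion(
    try_insert: Callable[[bytes], bool],
    max_attempts: int = 3,
    new_salt: Callable[[], bytes] = lambda: os.urandom(8),
) -> Optional[bytes]:
    """Retry an insert with a fresh salt each time.

    Returns the salt of the successful attempt, or None if every
    attempt was rejected.
    """
    for _ in range(max_attempts):
        # A new salt gives a new fileid, so the file is routed to a
        # different region of the id space.
        salt = new_salt()
        if try_insert(salt):
            return salt
    return None
```

The caller only learns which salt worked; the resulting fileid must then be derived from that salt so later lookups reach the same nodes.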

Storage Policy
  How does a node choose to accept or reject a replica?
    – Computes sizeof(file)/sizeof(free_space)
    – Compares to T_pri or T_div, depending on the node's role
    – T_pri > T_div
  How is a node chosen for replica diversion?
    – Search leaf set for the node that
      Has maximal free space
      Doesn't already hold a diverted or primary replica
  File diversion: used when k copies cannot be placed (via primary storage or replica diversion)
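A sketch of the acceptance test and the choice of a diversion target follows. It assumes a node rejects a file when the ratio exceeds its threshold, which the slide implies but does not state; the threshold values 0.1 and 0.05 and the names `Node`, `accepts`, and `choose_diversion_target` are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

T_PRI = 0.1   # illustrative threshold for primary replica stores
T_DIV = 0.05  # illustrative threshold for diverted replica stores (T_PRI > T_DIV)


@dataclass
class Node:
    node_id: int
    free_space: int
    holds_replica: bool = False  # already holds a primary or diverted copy


def accepts(file_size: int, free_space: int, threshold: float) -> bool:
    # Reject when the file would take too large a share of the free space.
    if free_space <= 0:
        return False
    return file_size / free_space <= threshold


def choose_diversion_target(
    leaf_set: Iterable[Node], file_size: int, t_div: float = T_DIV
) -> Optional[Node]:
    # Consider only leaf-set nodes that do not already hold a replica.
    candidates = [n for n in leaf_set if not n.holds_replica]
    if not candidates:
        return None
    # Pick the node with maximal free space, then apply the stricter
    # diverted-replica threshold to it.
    best = max(candidates, key=lambda n: n.free_space)
    return best if accepts(file_size, best.free_space, t_div) else None
```

Since T_pri > T_div, a node is more willing to store a file it is primarily responsible for than one diverted to it, and large files are rejected earlier than small ones as free space shrinks.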

Replica maintenance
  Node join/leave causes responsibility shift
    – Pastry node failure detection will cause leaf set updates
      PAST detects responsibility shifts this way
  Newly responsible node must copy files
    – Make a copy immediately, OR
    – Keep a pointer to the old owner & copy lazily
  Diverted replicas
    – Target of diversion may move out of leaf set
      Node to store replica can be any one in leaf set
    – Must exchange keepalive messages themselves
    – Should be relocated

Replica maintenance (2)
  Node failure may cause storage shortage
    – No node in leaf set can take over ownership
  Search space is widened
    – Ask most extreme nodes to locate storage
      Increases search space to 2l nodes
    – If no storage space found, fail

Caching
  Pastry's locality-based routing will tend to direct requests to nearby copies
  PAST also stores cached copies
    – Along routing path between client and fileid
    – For insert and lookup operations
    – Cache maintained using GD-size algorithm
      Weight per file: 1/size(file)
      Eviction:
        – Pick file with minimum weight
        – Subtract weight of evicted file from all others
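The eviction rule above can be sketched as a small cache class. The weight and eviction steps follow the slide; resetting a file's weight to 1/size on a hit is the usual GreedyDual-Size behavior and is an assumption here, as are the class and method names.

```python
class GDSizeCache:
    """Minimal GD-size style cache: weight 1/size, evict the minimum."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.used = 0
        self.sizes = {}    # file_id -> size
        self.weights = {}  # file_id -> current weight

    def access(self, file_id) -> bool:
        # On a hit, restore the file's weight to 1/size (assumed behavior).
        if file_id not in self.weights:
            return False
        self.weights[file_id] = 1.0 / self.sizes[file_id]
        return True

    def insert(self, file_id, size: int) -> bool:
        if size <= 0 or size > self.capacity or file_id in self.sizes:
            return False
        # Evict until the new file fits.
        while self.used + size > self.capacity:
            self._evict_one()
        self.sizes[file_id] = size
        self.weights[file_id] = 1.0 / size
        self.used += size
        return True

    def _evict_one(self) -> None:
        # Pick the file with minimum weight ...
        victim = min(self.weights, key=self.weights.get)
        evicted_weight = self.weights.pop(victim)
        self.used -= self.sizes.pop(victim)
        # ... and subtract its weight from all remaining files.
        for fid in self.weights:
            self.weights[fid] -= evicted_weight
```

With weight 1/size, large files have the lowest weight and are evicted first, while the subtraction step ages files that have not been accessed recently.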

Experiments: without diversion
  Experiments use
    – Large trace from web server
    – Files from local web server
  The case for diversion with web trace
    – Without diversion:
      51.1% of insertions failed
      60.8% storage utilization

Experiments (2): with diversion
  With diversion
    – Bigger leaf set size a plus

Experiments (3): varying T_pri
  Effects of varying T_pri
  Number of files stored vs. size of file

Experiments (4): varying T_div
  Varying T_div
  T_pri is constant

File and Replica Diversion

Caching
  8 traces combined
  Requests from clients in each trace are mapped to close PAST nodes