An IP Address Based Caching Scheme for Peer-to-Peer Networks Ronaldo Alves Ferreira Joint work with Ananth Grama and Suresh Jagannathan Department of Computer.

Presentation transcript:

An IP Address Based Caching Scheme for Peer-to-Peer Networks Ronaldo Alves Ferreira Joint work with Ananth Grama and Suresh Jagannathan Department of Computer Sciences – Purdue University December

Outline Introduction Motivation IP Addresses as Virtual IDs Cache Organization Simulation Results Conclusions

Introduction Peer-to-Peer (P2P) networks are self-organizing distributed systems where participating nodes both provide and receive services from each other in a cooperative manner without distinguished roles as pure clients or pure servers. P2P Internet applications have recently been popularized by file sharing applications like Napster and Gnutella. P2P systems have many interesting technical aspects such as decentralized control, self-organization, adaptation and scalability. One of the key problems in large-scale P2P applications is to provide efficient algorithms for object location and routing within the network.

Location and Routing - DHT Most known proposals take a key as input and, in response, route a message to the node responsible for that key. The keys are strings of digits of some fixed length (generally 128 bits). Nodes have identifiers taken from the same space as the keys (same number of digits). Each node maintains a routing table consisting of a small subset of the nodes in the system. Nodes route queries to the neighbor node that makes the most “progress” towards resolving the query.

Location and Routing - DHT The notion of progress differs from algorithm to algorithm. Plaxton developed the first ideas that could be applied in a scalable manner. Although intended for a static node population, Plaxton's algorithm provides efficient routing of queries. The algorithm works by “correcting” a single digit at a time. Chord, Pastry, and Tapestry are variants of Plaxton's algorithm.

Location and Routing - DHT [Figure: ID space divided by first digit into 0XXX, 1XXX, 2XXX, 3XXX.] START: node 0112 routes a message to key 2000. The first hop fixes the first digit (2). The second hop fixes the second digit (20). END: node 2001, the closest live node to 2000.

Location and Routing - DHT [Figure: routing table and leaf set of node 0.]

Location and Routing - DHT [Figure: routing table of node 0.]

Location and Routing - Pastry Computers (nodes) have a unique ID: typically 128 bits long; assignment should lead to a uniform distribution in the node ID space, for example a hash of the node’s IP address. Primitive: route(msg, key) delivers msg to the currently alive node with ID numerically closest to key. Node state: routing table, neighborhood set, leaf set. Scalable and efficient: O(log(N)) routing table entries per node; routes in O(log(N)) hops.

DHT Performance Issues Virtualization destroys locality. Messages may have to travel around the world to reach a node in the same LAN. Query responses do not contain locality information. Heuristics to minimize the problem: proximity routing, topology-based node ID assignment, proximity neighbor selection.
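A minimal sketch of the two ideas on this slide, hashing an IP address into a 128-bit node ID and delivering to the numerically closest live node. The choice of MD5 (picked only because its digest is exactly 128 bits), the wrap-around distance, the sample addresses, and the function names are all assumptions for illustration, not details taken from the talk.

```python
import hashlib

ID_BITS = 128
ID_SPACE = 1 << ID_BITS


def node_id(name: str) -> int:
    """Hash a string (for example an IP address) into a 128-bit identifier."""
    digest = hashlib.md5(name.encode("ascii")).digest()  # 16 bytes = 128 bits
    return int.from_bytes(digest, "big")


def ring_distance(a: int, b: int) -> int:
    """Distance between two IDs, assuming the ID space wraps around."""
    d = abs(a - b)
    return min(d, ID_SPACE - d)


def closest_live_node(key: int, live_ids) -> int:
    """ID of the live node numerically closest to key."""
    return min(live_ids, key=lambda nid: ring_distance(nid, key))


# Hypothetical peers and object name.
ips = ["10.1.2.5", "10.1.3.9", "192.0.2.1"]
ids = {node_id(ip): ip for ip in ips}
key = node_id("some-object-name")
owner = ids[closest_live_node(key, ids)]
print(f"key {key:032x} is handled by {owner}")
```

Because the hash scatters addresses across the ID space, two peers on the same subnet usually end up far apart in the overlay, which is the loss of locality discussed on the following slides.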

Motivation Virtualization destroys locality. Query responses do not contain locality information. Recent studies show that queries for multiple keys in P2P networks follow a Zipf-like distribution. For many wide-area distributed applications, nodes in the same region share common interests, for example in music sharing applications. Networking-intensive applications, such as distributed file systems, have been built using P2P networks.

IP Addresses as Virtual IDs A natural way of building locality into an overlay network is to exploit the addressing scheme of the underlying network. In most cases, nodes with IP addresses that are numerically close are also physically close. The Internet is organized into autonomous systems (ASs). By correcting a few bits in each hop, the last hops would stay inside an AS.

IP Addresses as Virtual IDs The IP space is not uniformly populated by peers. This causes load imbalance at the peers. The O(log n) upper bound on the number of hops can no longer be guaranteed.

IP Addresses as Virtual IDs How severe would the load imbalance be if we used the IP address of a node as its overlay identifier? Is it possible to find a boundary in the IP address such that the distribution of peers is uniform and some form of locality is still captured? Experimental basis: Gnutella traces from June 2002 with 56M messages and 62,000 different IP addresses. Addresses were validated using a whois server and ping.
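To make "numerically close" concrete, the sketch below treats an IPv4 address as a 32-bit identifier and counts how many leading bits two addresses share. The sample addresses and function names are illustrative assumptions; the talk does not give code for this.

```python
import ipaddress


def ip_to_id(ip: str) -> int:
    """Use the IPv4 address itself as a 32-bit overlay identifier."""
    return int(ipaddress.IPv4Address(ip))


def common_prefix_bits(a: str, b: str) -> int:
    """Number of leading bits shared by two IPv4 addresses."""
    diff = ip_to_id(a) ^ ip_to_id(b)
    return 32 - diff.bit_length()


# Hypothetical peers: the first two sit on neighbouring subnets.
print(common_prefix_bits("10.1.2.5", "10.1.3.9"))
print(common_prefix_bits("10.1.2.5", "192.0.2.1"))
```

With bit-by-bit correction, a message whose destination shares a long prefix with the current node has only a few low-order bits left to fix, so its remaining hops stay among nearby addresses.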
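The load imbalance can be seen in a toy model. The sketch assumes an 8-bit ID space, uniformly spread keys, and a "numerically closest node owns the key" rule with ties going to the node listed first; none of these numbers come from the talk's traces.

```python
from collections import Counter


def key_load(node_ids: list[int], id_space: int = 256) -> dict[int, int]:
    """Count how many keys in [0, id_space) each node owns under closest-ID assignment."""
    load = Counter({n: 0 for n in node_ids})
    for key in range(id_space):
        owner = min(node_ids, key=lambda n: abs(n - key))  # ties go to the first node listed
        load[owner] += 1
    return dict(load)


uniform = key_load([32, 96, 160, 224])    # IDs spread evenly, as hashing would give
clustered = key_load([10, 11, 12, 200])   # IDs bunched together, as in a sparse IP space
print("uniform:  ", uniform)
print("clustered:", clustered)
```

In the clustered case the single node in the sparsely populated region ends up owning most of the keys, while the bunched nodes own very few.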

IP Addresses as Virtual IDs

2,420 nodes. 20 keys per node.

IP Addresses as Virtual IDs The average CIDR prefix length for the addresses is over 19 bits. This is a negative result, but it provides the insight behind a two-level overlay architecture: one global overlay and several local overlays. A local overlay is formed by the nodes that share the first 8 bits of their IP addresses.
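Grouping peers into local overlays by the first 8 bits of their address can be sketched as follows. The helper names and sample addresses are assumptions; only the 8-bit boundary comes from the slide.

```python
import ipaddress
from collections import defaultdict

PREFIX_BITS = 8  # boundary stated on the slide


def local_overlay_key(ip: str, prefix_bits: int = PREFIX_BITS) -> int:
    """Leading prefix_bits of an IPv4 address, used to name a local overlay."""
    return int(ipaddress.IPv4Address(ip)) >> (32 - prefix_bits)


def build_local_overlays(ips: list[str]) -> dict[int, list[str]]:
    """Partition peers into local overlays keyed by their shared prefix."""
    overlays = defaultdict(list)
    for ip in ips:
        overlays[local_overlay_key(ip)].append(ip)
    return dict(overlays)


# Hypothetical peers.
peers = ["10.1.2.5", "10.200.0.7", "192.0.2.1"]
overlays = build_local_overlays(peers)
print(overlays)
```

Every peer still belongs to the single global overlay; the dictionary above only decides which local overlay it joins in addition.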

Cache Organization

Node Arrivals When joining the network, a node first joins the global overlay using the specific DHT protocol. After joining the global overlay, the new node contacts the rendezvous point of its domain to determine which local overlay it will join.

Simulation Setup Internet topology generated using GT-ITM topology generator. 10,000 overlay nodes selected randomly from the hosts. NLANR web proxy trace with 500,254 objects. Zipf distribution parameters: {0.75, 0.80, 0.85, 0.90, 0.95} Local cache size: 5MB (LRU replacement policy).
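The workload and cache model in this setup can be approximated with a short simulation: request popularity follows a Zipf-like law, where the object of rank i is requested with probability proportional to 1/i^α, and each cache uses LRU replacement. The sketch simplifies the setup by measuring capacity in objects rather than megabytes, and the object count, request count, capacity, and seed are arbitrary assumptions, so its output is not comparable to the reported results.

```python
import random
from collections import OrderedDict


def zipf_weights(num_objects: int, alpha: float) -> list[float]:
    """Normalised Zipf-like probabilities: P(rank i) proportional to 1 / i**alpha."""
    raw = [1.0 / (rank ** alpha) for rank in range(1, num_objects + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def lru_hit_ratio(requests, capacity: int) -> float:
    """Replay requests against an LRU cache holding at most capacity objects."""
    cache = OrderedDict()
    hits = 0
    for obj in requests:
        if obj in cache:
            hits += 1
            cache.move_to_end(obj)      # mark as most recently used
        else:
            cache[obj] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used object
    return hits / len(requests)


rng = random.Random(0)  # fixed seed so runs are repeatable
for alpha in (0.75, 0.85, 0.95):
    weights = zipf_weights(1000, alpha)
    requests = rng.choices(range(1000), weights=weights, k=20000)
    print(alpha, round(lru_hit_ratio(requests, 50), 3))
```

A larger α concentrates requests on fewer popular objects, which is why the Zipf parameter is varied in the experiments.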

IP Addresses as Virtual IDs

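Simulation Results Columns: Zipf parameter, Cache Hit Ratio, Gain. The Gain column reads 31.0%, 33.5%, 36.0%, 38.7%, 41.3% across the five rows; the Zipf parameter and cache hit ratio values are not recoverable from this transcript.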

Conclusions Using IP addresses as virtual IDs would probably produce overlays with good locality properties, but the non-uniform population of nodes in the IP space leads to severe load imbalance and removes any guarantee on the number of hops. A two-level overlay architecture is proposed instead: local overlays are created to cluster nodes that are close in the underlying network. The performance gains of the two-level architecture over a single global overlay are significant, and the costs of maintaining it are very low.