Building Low-Diameter P2P Networks Eli Upfal Department of Computer Science Brown University Joint work with Gopal Pandurangan and Prabhakar Raghavan.

Slides:



Advertisements
Similar presentations
Peer-to-Peer and Social Networks An overview of Gnutella.
Advertisements

Supporting Cooperative Caching in Disruption Tolerant Networks
The Theory of Zeta Graphs with an Application to Random Networks Christopher Ré Stanford.
1 Continuous random variables Continuous random variable Let X be such a random variable Takes on values in the real space  (-infinity; +infinity)  (lower.
Week 5 - Models of Complex Networks I Dr. Anthony Bonato Ryerson University AM8002 Fall 2014.
Chord: A scalable peer-to- peer lookup service for Internet applications Ion Stoica, Robert Morris, David Karger, M. Frans Kaashock, Hari Balakrishnan.
Online Social Networks and Media Navigation in a small world.
Delay and Throughput in Random Access Wireless Mesh Networks Nabhendra Bisnik, Alhussein Abouzeid ECSE Department Rensselaer Polytechnic Institute (RPI)
Rumors and Routes Rajmohan Rajaraman Northeastern University, Boston May 2012 Chennai Network Optimization WorkshopRumors and Routes1.
Structuring Unstructured Peer-to-Peer Networks Stefan Schmid Roger Wattenhofer Distributed Computing Group HiPC 2007 Goa, India.
A Distributed and Oblivious Heap Christian Scheideler and Stefan Schmid Dept. of Computer Science University of Paderborn.
Information Networks Small World Networks Lecture 5.
Technion –Israel Institute of Technology Computer Networks Laboratory A Comparison of Peer-to-Peer systems by Gomon Dmitri and Kritsmer Ilya under Roi.
1 An Overview of Gnutella. 2 History The Gnutella network is a fully distributed alternative to the centralized Napster. Initial popularity of the network.
Lecture 7 CS 728 Searchable Networks. Errata: Differences between Copying and Preferential Attachment In generative model: let p k be fraction of nodes.
Farnoush Banaei-Kashani and Cyrus Shahabi Criticality-based Analysis and Design of Unstructured P2P Networks as “ Complex Systems ” Mohammad Al-Rifai.
On the Spread of Viruses on the Internet Noam Berger Joint work with C. Borgs, J.T. Chayes and A. Saberi.
Antonis Papadogiannakis
Networks. Graphs (undirected, unweighted) has a set of vertices V has a set of undirected, unweighted edges E graph G = (V, E), where.
1 Maximal Independent Set. 2 Independent Set (IS): In a graph, any set of nodes that are not adjacent.
Online Ramsey Games in Random Graphs Reto Spöhel Joint work with Martin Marciniszyn and Angelika Steger.
Small-world Overlay P2P Network
1 Complexity of Network Synchronization Raeda Naamnieh.
P2p, Spring 05 1 Topics in Database Systems: Data Management in Peer-to-Peer Systems March 29, 2005.
Evaluation of Ad hoc Routing Protocols under a Peer-to-Peer Application Authors: Leonardo Barbosa Isabela Siqueira Antonio A. Loureiro Federal University.
CPSC 689: Discrete Algorithms for Mobile and Wireless Systems Spring 2009 Prof. Jennifer Welch.
Online Ramsey Games in Random Graphs Reto Spöhel Joint work with Martin Marciniszyn and Angelika Steger.
1 Analyzing Kleinberg’s (and other) Small-world Models Chip Martel and Van Nguyen Computer Science Department; University of California at Davis.
Dynamic Hypercube Topology Stefan Schmid URAW 2005 Upper Rhine Algorithms Workshop University of Tübingen, Germany.
Advanced Topics in Data Mining Special focus: Social Networks.
presented by Hasan SÖZER1 Scalable P2P Search Daniel A. Menascé George Mason University.
A Distributed Search Service for Peer-to-Peer File Sharing in Mobile Application Presented by Tony Sung On Loy, MC Lab, CUHK IE 1 A Distributed Search.
1 Analyzing Kleinberg’s (and other) Small-world Models Chip Martel and Van Nguyen Computer Science Department; University of California at Davis.
Searching in Unstructured Networks Joining Theory with P-P2P.
CPSC 689: Discrete Algorithms for Mobile and Wireless Systems Spring 2009 Prof. Jennifer Welch.
Correctness of Gossip-Based Membership under Message Loss Maxim Gurevich, Idit Keidar Technion.
1CS 6401 Peer-to-Peer Networks Outline Overview Gnutella Structured Overlays BitTorrent.
Cache Updates in a Peer-to-Peer Network of Mobile Agents Elias Leontiadis Vassilios V. Dimakopoulos Evaggelia Pitoura Department of Computer Science University.
Queuing Networks. Input source Queue Service mechanism arriving customers exiting customers Structure of Single Queuing Systems Note: 1.Customers need.
1 Napster & Gnutella An Overview. 2 About Napster Distributed application allowing users to search and exchange MP3 files. Written by Shawn Fanning in.
Developing Analytical Framework to Measure Robustness of Peer-to-Peer Networks Niloy Ganguly.
1 BitHoc: BitTorrent for wireless ad hoc networks Jointly with: Chadi Barakat Jayeoung Choi Anwar Al Hamra Thierry Turletti EPI PLANETE 28/02/2008 MAESTRO/PLANETE.
1 Constant Density Spanners for Wireless Ad-Hoc Networks Discrete Mathematics and Algorithms Seminar Melih Onus April
CCAN: Cache-based CAN Using the Small World Model Shanghai Jiaotong University Internet Computing R&D Center.
A Scalable Content-Addressable Network (CAN) Seminar “Peer-to-peer Information Systems” Speaker Vladimir Eske Advisor Dr. Ralf Schenkel November 2003.
Online Social Networks and Media
Peer Pressure: Distributed Recovery in Gnutella Pedram Keyani Brian Larson Muthukumar Senthil Computer Science Department Stanford University.
An IP Address Based Caching Scheme for Peer-to-Peer Networks Ronaldo Alves Ferreira Joint work with Ananth Grama and Suresh Jagannathan Department of Computer.
Analyzing the Vulnerability of Superpeer Networks Against Attack Niloy Ganguly Department of Computer Science & Engineering Indian Institute of Technology,
K-Anycast Routing Schemes for Mobile Ad Hoc Networks 指導老師 : 黃鈴玲 教授 學生 : 李京釜.
Chord Advanced issues. Analysis Theorem. Search takes O (log N) time (Note that in general, 2 m may be much larger than N) Proof. After log N forwarding.
Peer to Peer Network Design Discovery and Routing algorithms
Algorithms and Techniques in Structured Scalable Peer-to-Peer Networks
Models of Web-Like Graphs: Integrated Approach
Analysis of the Evolution of Peer-to-Peer Systems Proseminar “Peer – to – Peer Information Systems” WS 04/05 Prof. Gerhard Weikum Speaker : Emil Zankov.
1 Roie Melamed, Technion AT&T Labs Araneola: A Scalable Reliable Multicast System for Dynamic Wide Area Environments Roie Melamed, Idit Keidar Technion.
1 “Hybrid Search Schemes for Unstructured Peer- to-Peer Networks” “Random Walks in Peer-to-Peer Networks” Christos Gkantsidis, Milena Mihail, Amin Saberi.
Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications * CS587x Lecture Department of Computer Science Iowa State University *I. Stoica,
An example of peer-to-peer application
Data Center Network Architectures
Peer-to-Peer and Social Networks
Peer-to-Peer and Social Networks
Early Measurements of a Cluster-based Architecture for P2P Systems
Determining the Peer Resource Contributions in a P2P Contract
Peer-to-Peer Information Systems Week 6: Performance
Introduction Wireless Ad-Hoc Network
Joydeep Chandra, Santosh Shaw and Niloy Ganguly
Deterministic and Semantically Organized Network Topology
Modelling and Searching Networks Lecture 6 – PA models
Network Science: A Short Introduction i3 Workshop
Presentation transcript:

Building Low-Diameter P2P Networks Eli Upfal Department of Computer Science Brown University Joint work with Gopal Pandurangan and Prabhakar Raghavan.

2 P2P Network A distributed network of computers; no distinction between a server and a client. A dynamic network: nodes (peers) and edges (currently established connections) appear and disappear over time. Nodes communicate using only local information. Advantages: decentralized computing (e.g., search), sharing data and resources. Real-life Systems: Gnutella, Freenet.

3 Gnutella Joining: Nodes contact a (central) host server to get entry-points to the network. Search: A node sends query to its neighbors. They in turn forward it to their neighbors: −decrement “Time to Live (TTL)’’ for query; −query dies when TTL = 0. Search answers sent back along requested path.

4 Desired Global Network Properties Connectivity and low-diameter are two global properties crucial for doing search. Maintaining (even) global connectivity under a dynamic setting is a non-trivial issue. Current real-life systems (e.g., Gnutella): − take an ad hoc approach − results in partitioning of the network into disconnected pieces Challenge is to design distributed protocols which operate with only local knowledge.

5 A Bad Scenario Each incoming node attaches to two recently joined nodes - Long chains - prone to disconnections

6 Our P2P Protocol Pandurangan, Raghavan, and Upfal; IEEE Symposium on Foundations of Computer Science (FOCS), A distributed protocol to build P2P networks with provable guarantees under a reasonable model: − connectivity. − logarithmic diameter. − constant degree. − low overhead. − operates with no global knowledge. − can be easily implemented with local message passing. To our knowledge, this is the first P2P protocol with provable guarantees on connectivity and diameter under a realistic dynamic setting.

7 The P2P Protocol: Preliminaries A set of rules applicable to various situations a node may find itself in: −How to join the network ? −What happens if a neighbor drops out ? −How to maintain bounded number of connections ? A central host server: −a gateway mechanism to enter the network. −maintains a cache - a list of (= constant) nodes (i.e., their IP addresses) at all times. −is reachable by all nodes at all times. −need not know the network topology nor the identities of all the nodes in the network.

8 The P2P Protocol On Arrival: Connect to ( ) random nodes chosen from the cache. Cache Replacement Rule: When a cache node reaches degree ( ) it is replaced from the cache by a node having degree. (Degree of all nodes bounded by a constant.) Reconnect Rule: If a node (say ) loses its neighbor it (re-)connects to a random node in cache with probability. : degree of before losing the neighbor. (The above probabilistic reconnection is crucial to maintaining bounded degree. Degree of any node.)

9 Illustration cache node c-node d-node c-node: node that was a cache node at some time. d-node: all other nodes.

10 The P2P Protocol (contd.) These rules turn out to be crucial for maintaining connectivity. Preferred Connection Rule: When a cache node leaves the cache it maintains a preferred connection to the node that replaced it in the cache. Preferred Reconnect Rule: If a node’s preferred connection is lost, then it reconnects to a random node in the cache which becomes its new preferred connection.

11 Illustration: Preferred Connection cache node c-node d-node c-node: node that was a cache node at some time. d-node: all other nodes. preferred connection

12 Our Stochastic Model Nodes arrive and depart in an uncoordinated and unpredictable fashion. A stochastic model for the dynamic setting: −Arrival of nodes: Poisson process with rate. −Duration of nodes: Independently and Exponentially distributed with parameter. A reasonable model: −Used in modeling similar scenarios e.g., the classical telephone trunking model in queuing theory. −Approximates real-life data fairly well; [Sariou et al. MMCN 2002] study of real P2P systems. −Insight into real-life performance.

13 A Stochastic Graph Process Let be the network at time. We analyze the evolution in time of the graph process. Network size depends only on the ratio. Theorem 1. a) For any, with high probability (w.h.p.). b) If then w.h.p.. (w.h.p. = ) w.l.o.g, we let, thus.

14 Connectivity Theorem 2. There is a constant such that at any given time, “The protocol maintains connectivity with large probability at any time after a short initial period.” Proof ideas: −preferred connections (form a “backbone”). −random selection of cache nodes in the arrival rule. Has a nice “self-correcting” property to recover from failures.

15 Connectivity: Intuition d-node cache node c-node

16 Theorem 3. For any, such that, has diameter with probability. “The protocol maintains logarithmic diameter with large probability at any point of time after the network has “stabilized” (say, after time from start).” Proof Ideas: −“reconnect” connections: “long-range”, “random”. −“ good” cache nodes: many reconnect connections. − many good cache nodes. − distance between any two nodes is. Diameter

17 Diameter: Intuition It is sufficient to analyze the distance between c-nodes. Let and be two c-nodes; then Pr( and are connected by a “reconnect” edge) Although the reconnect edges are “random”, their occurrence is not independent (unlike model of random graphs). More sophisticated analysis needed.

18 Good Cache Nodes A cache node is good if during its time in cache it receives at least (= fixed constant) connections such that: −they are “reconnect” connections; −they are not preferred connections; −they resulted from different nodes leaving the network. Color the above connections blue; Color all other connections red. Lemma 1. Let node enter the cache at time, where. Then, Pr( leaves the cache as a good node) −the blue edges are distributed uniformly at random among the nodes in the current network. −is independent of other c-nodes being good.

19 Proof Sketch of Lemma 1 Nodes join the network according to a Poisson process with rate. The expected number of connections to from an incoming node is. Nodes leave the network according to a Poisson process with rate. The expected number of connections to as a result of a node leaving the network is: Each connection to has a constant probability of being a reconnect connection. The probability of a node connecting to is All nodes have equal probability of connecting to. For a sufficiently large, the lemma holds.

20 Diameter: Expanding Neighborhoods For a c-node : : arbitrary connected cluster of c-nodes including, using red edges. : c-nodes that are connected by blue edges to, but are not in.

21 Expansion Lemma: Proof Sketch Lemma 2. If then w.h.p.. Proof sketch: is a good cache node with probability (Lemma 1) We use an exposure martingale to prove that is concentrated around its mean with high probability.

22 Replacing Cache Nodes Theorem 4. There is a constant such that at any time, a) With high probability there is always a d-node in the network to replace a saturated cache node. b) With probability the protocol finds a replacement d-node by searching only nodes.

23 Further Issues The network can maintain a bounded number of connections with high probability. The overhead involved per protocol step is constant. What if there are no preferred connections ? − Running the protocol without it leads to formation of many small disconnected components.