Distributed Computing Group Roger Wattenhofer P2P P2P 2005 Past 2 Present.

Slides:



Advertisements
Similar presentations
P2P data retrieval DHT (Distributed Hash Tables) Partially based on Hellerstein’s presentation at VLDB2004.
Advertisements

Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan MIT and Berkeley presented by Daniel Figueiredo Chord: A Scalable Peer-to-peer.
Peer-to-Peer Systems Chapter 25. What is Peer-to-Peer (P2P)? Napster? Gnutella? Most people think of P2P as music sharing.
Kademlia: A Peer-to-peer Information System Based on the XOR Metric Petar Mayamounkov David Mazières A few slides are taken from the authors’ original.
CHORD – peer to peer lookup protocol Shankar Karthik Vaithianathan & Aravind Sivaraman University of Central Florida.
Chord: A scalable peer-to- peer lookup service for Internet applications Ion Stoica, Robert Morris, David Karger, M. Frans Kaashock, Hari Balakrishnan.
A Blueprint for Constructing Peer-to-Peer Systems Robust to Dynamic Worst-Case Joins and Leaves Fabian Kuhn, Microsoft Research, Silicon Valley Stefan.
Fabian Kuhn, Microsoft Research, Silicon Valley
Structuring Unstructured Peer-to-Peer Networks Stefan Schmid Roger Wattenhofer Distributed Computing Group HiPC 2007 Goa, India.
CPSC 689: Discrete Algorithms for Mobile and Wireless Systems Spring 2009 Prof. Jennifer Welch.
Distributed Computing Group Roger Wattenhofer Ad-Hoc and Sensor Networks Worst-Case vs. Average-Case IZS 2004.
1 Accessing nearby copies of replicated objects Greg Plaxton, Rajmohan Rajaraman, Andrea Richa SPAA 1997.
Common approach 1. Define space: assign random ID (160-bit) to each node and key 2. Define a metric topology in this space,  that is, the space of keys.
Algorithms for Ad Hoc Networks Roger Wattenhofer MedHocNet 2005 alg hoc net 2005.
Ad-Hoc Networks Beyond Unit Disk Graphs
CPSC 689: Discrete Algorithms for Mobile and Wireless Systems Spring 2009 Prof. Jennifer Welch.
Routing, Anycast, and Multicast for Mesh and Sensor Networks Roland Flury Roger Wattenhofer RAM Distributed Computing Group.
Churn and Selfishness: Two Peer-to-Peer Computing Challenges Invited Talk University of California, Berkeley 380 Soda Hall March, 2006 Stefan Schmid Distributed.
On the Topologies Formed by Selfish Peers Thomas Moscibroda Stefan Schmid Roger Wattenhofer IPTPS 2006 Santa Barbara, California, USA.
CPSC 689: Discrete Algorithms for Mobile and Wireless Systems Spring 2009 Prof. Jennifer Welch.
CPSC 689: Discrete Algorithms for Mobile and Wireless Systems Spring 2009 Prof. Jennifer Welch.
Taming Dynamic and Selfish Peers “Peer-to-Peer Systems and Applications” Dagstuhl Seminar March 26th-29th, 2006 Stefan Schmid Distributed Computing Group.
Randomized 3D Geographic Routing Roland Flury Roger Wattenhofer Distributed Computing Group.
Distributed Computing Group A Self-Repairing Peer-to-Peer System Resilient to Dynamic Adversarial Churn Fabian Kuhn Stefan Schmid Roger Wattenhofer IPTPS.
Dynamic Hypercube Topology Stefan Schmid URAW 2005 Upper Rhine Algorithms Workshop University of Tübingen, Germany.
Fault-tolerant Routing in Peer-to-Peer Systems James Aspnes Zoë Diamadi Gauri Shah Yale University PODC 2002.
Building Low-Diameter P2P Networks Eli Upfal Department of Computer Science Brown University Joint work with Gopal Pandurangan and Prabhakar Raghavan.
Efficient Content Location Using Interest-based Locality in Peer-to-Peer Systems Presented by: Lin Wing Kai.
ETH Zurich – Distributed Computing Group Jasmin Smula 1ETH Zurich – Distributed Computing – Stephan Holzer Yvonne Anne Pignolet Jasmin.
A TCP With Guaranteed Performance in Networks with Dynamic Congestion and Random Wireless Losses Stefan Schmid, ETH Zurich Roger Wattenhofer, ETH Zurich.
Aggregating Information in Peer-to-Peer Systems for Improved Join and Leave Distributed Computing Group Keno Albrecht Ruedi Arnold Michael Gähwiler Roger.
Distributed Computing Group Locality and the Hardness of Distributed Approximation Thomas Moscibroda Joint work with: Fabian Kuhn, Roger Wattenhofer.
Chord-over-Chord Overlay Sudhindra Rao Ph.D Qualifier Exam Department of ECECS.
Topics in Reliable Distributed Systems Fall Dr. Idit Keidar.
Algorithmic Models for Sensor Networks Stefan Schmid and Roger Wattenhofer WPDRTS, Island of Rhodes, Greece, 2006.
GS 3 GS 3 : Scalable Self-configuration and Self-healing in Wireless Networks Hongwei Zhang & Anish Arora.
CPSC 689: Discrete Algorithms for Mobile and Wireless Systems Spring 2009 Prof. Jennifer Welch.
1CS 6401 Peer-to-Peer Networks Outline Overview Gnutella Structured Overlays BitTorrent.
Introduction to Peer-to-Peer Networks. What is a P2P network Uses the vast resource of the machines at the edge of the Internet to build a network that.
1 PASTRY. 2 Pastry paper “ Pastry: Scalable, decentralized object location and routing for large- scale peer-to-peer systems ” by Antony Rowstron (Microsoft.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Chapter 2 ARCHITECTURES.
Improved Sparse Covers for Graphs Excluding a Fixed Minor Ryan LaFortune (RPI), Costas Busch (LSU), and Srikanta Tirthapura (ISU)
Introduction to Peer-to-Peer Networks. What is a P2P network A P2P network is a large distributed system. It uses the vast resource of PCs distributed.
Models and Techniques for Communication in Dynamic Networks Christian Scheideler Dept. of Computer Science Johns Hopkins University.
Peer to Peer Research survey TingYang Chang. Intro. Of P2P Computers of the system was known as peers which sharing data files with each other. Build.
1 BitHoc: BitTorrent for wireless ad hoc networks Jointly with: Chadi Barakat Jayeoung Choi Anwar Al Hamra Thierry Turletti EPI PLANETE 28/02/2008 MAESTRO/PLANETE.
1 Constant Density Spanners for Wireless Ad-Hoc Networks Discrete Mathematics and Algorithms Seminar Melih Onus April
CCAN: Cache-based CAN Using the Small World Model Shanghai Jiaotong University Internet Computing R&D Center.
Architectures of distributed systems Fundamental Models
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan MIT and Berkeley presented by Daniel Figueiredo Chord: A Scalable Peer-to-peer.
PODC Distributed Computation of the Mode Fabian Kuhn Thomas Locher ETH Zurich, Switzerland Stefan Schmid TU Munich, Germany TexPoint fonts used in.
An IP Address Based Caching Scheme for Peer-to-Peer Networks Ronaldo Alves Ferreira Joint work with Ananth Grama and Suresh Jagannathan Department of Computer.
A Membership Management Protocol for Mobile P2P Networks Mohamed Karim SBAI, Emna SALHI, Chadi BARAKAT.
1 Secure Peer-to-Peer File Sharing Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, Hari Balakrishnan MIT Laboratory.
Plethora: Infrastructure and System Design. Introduction Peer-to-Peer (P2P) networks: –Self-organizing distributed systems –Nodes receive and provide.
Peer to Peer Network Design Discovery and Routing algorithms
LOOKING UP DATA IN P2P SYSTEMS Hari Balakrishnan M. Frans Kaashoek David Karger Robert Morris Ion Stoica MIT LCS.
INTERNET TECHNOLOGIES Week 10 Peer to Peer Paradigm 1.
1 Plaxton Routing. 2 History Greg Plaxton, Rajmohan Rajaraman, Andrea Richa. Accessing nearby copies of replicated objects, SPAA 1997 Used in several.
CS 425 / ECE 428 Distributed Systems Fall 2015 Indranil Gupta (Indy) Peer-to-peer Systems All slides © IG.
Peer-to-Peer Networks 04: Chord Christian Schindelhauer Technical Faculty Computer-Networks and Telematics University of Freiburg.
Peer-to-Peer Information Systems Week 12: Naming
Monitoring Churn in Wireless Networks
Wireless Sensor Networks 7. Geometric Routing
Accessing nearby copies of replicated objects
SKIP GRAPHS James Aspnes Gauri Shah SODA 2003.
DHT Routing Geometries and Chord
Algorithms for Ad Hoc Networks
Peer-to-Peer Information Systems Week 12: Naming
Routing in Networks with Low Doubling Dimension
Presentation transcript:

Distributed Computing Group Roger Wattenhofer P2P P2P 2005 Past 2 Present

Roger Wattenhofer, ETH P2P My Research Distributed Computing Theory Wireless Networking (Ad Hoc, Sensor) Distributed Systems P2P Disclaimer: I’m a P2P ignorant giving a P2P talk

Roger Wattenhofer, ETH P2P P2P vs. Ad Hoc/Sensor Networking Often considered to be “similar” –Without infrastructure, without servers, etc. –Routing is essential –Both feature some sort of topology control (“What are the neighbors?”) Major differences –Internet vs. wireless (interference, MAC layer, etc.) –Graph theory vs. geometry (…really?!?) –Churn vs. mobility –Completely different applications

Roger Wattenhofer, ETH P2P P2P vs. Distributed Computing/Systems Ignorant’s Lemma 1: P2P research is the child of successful file sharing applications a la Napster and the distributed computing/systems community Let’s first try to “prove” Lemma 1 … and then convince you about Corollary 2 Ignorant’s Corollary 2: A child should learn from his/her parents

Roger Wattenhofer, ETH P2P Overview Introduction Past –What is the first P2P paper/system? –Really? Present

Roger Wattenhofer, ETH P2P The Four P2P Evangelists If you read your average P2P paper, there are (almost) always four papers cited who “invented” efficient P2P in 2001: These papers are somewhat similar, with the exception of CAN (which is not really efficient) So what’s the „Dead Sea Scrolls of P2P”? ChordCANPastryTapestry

Roger Wattenhofer, ETH P2P “Dead Sea Scrolls of P2P” „Accessing Nearby Copies of Replicated Objects in a Distributed Environment“, by Greg Plaxton, Rajmohan Rajaraman, and Andrea Richa, at SPAA Basically, the paper proposes an efficient search routine (similar to the evangelist papers). In particular search, insert, delete, storage costs are all logarithmic, the base of the logarithm is a parameter. However, it‘s a theory paper, so that alone would be too simple... So the paper takes into account latency; in particular it is assumed that nodes are living in a metric, and that the graph is of „bounded growth“ (meaning that node densities do not change abruptly).

Roger Wattenhofer, ETH P2P Genealogy of P2P Chord CAN Pastry Tapestry 2001 Napster KademliaP-GridViceroy SkipGraphSkipNet 2003 Plaxton et al. Koorde Gnutella Kazaa Gnutella-2 eDonkey BitTorrent Skype Steam WWW, POTS, etc. PS3 The parents of Plaxton et al.?

Roger Wattenhofer, ETH P2P Overview Introduction Past –What is the first P2P paper/system? –Really? Present

Roger Wattenhofer, ETH P2P Consistent Hashing “Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web.” David Karger, Eric Lehman, Tom Leighton, Matthew Levine, Daniel Lewin and Rina Panigrahy, at STOC Big difference: still a client/server paradigm.

Roger Wattenhofer, ETH P2P Locating Shared Objects “Sparse Partitions”. Baruch Awerbuch and David Peleg, at FOCS “Concurrent Online Tracking of Mobile Users”. Baruch Awerbuch and David Peleg, at SIGCOMM “Locating Nearby Copies of Replicated Internet Servers”. James Guyton and Michael Schwartz, at SIGCOMM “A Model for Worldwide Tracking of Distributed Objects”. Marteen van Steen, Franz Hauck, Andrew Tanenbaum, at TINA Maintaining a distributed directory.

Roger Wattenhofer, ETH P2P Compact Routing “A trade-off between space and efficiency for routing tables”. David Peleg and Eli Upfal, at STOC Trade-off routing table memory space vs. stretch (quality of routes) Name-independent vs. labeled routing –Name-independent: the node names are fixed (like in a regular network) –Labeled: a designer can choose names (P2P) In particular interesting if latency does matter.

Roger Wattenhofer, ETH P2P Hypercubic Topologies In my lecture Distributed Computing I teach six topologies: –Hypercube –Butterfly / Benes Network –DeBruijn Graph –Skip List –Pancake Graph –Cube-Connected-Cycles ChordKademlia Viceroy SkipGraphSkipNet Plaxton et al. Koorde Your-name-here Kuhn et al.

Roger Wattenhofer, ETH P2P Overview Introduction Past Present* –Dynamic systems & mobility –Fault-tolerance (crash failures) –Security (Byzantine failures) –Selfish agents & computational economy –Simple and implementable algorithms –Local algorithms –Geometry, metrics, bounded growth, etc. –Applications *current hot topics in distributed computing

Roger Wattenhofer, ETH P2P Dynamic Peer-to-Peer Systems Properties compared to centralized client/server approach –Availability, Reliability, Efficiency  Peers may join and leave the network at any time! However, P2P systems are –composed of unreliable desktop machines –under control of individual users “A Self-Repairing Peer-to-Peer System Resilient to Dynamic Adversarial Churn”. Fabian Kuhn, Stefan Schmid, Roger Wattenhofer, at IPTPS 2005.

Roger Wattenhofer, ETH P2P Churn (permanent joins and leaves) How to maintain desirable properties such as –Connectivity, –Network diameter, –Peer degree?

Roger Wattenhofer, ETH P2P Motivation Why permanent churn? Saroiu et al.: „A Measurement Study of P2P File Sharing Systems“ Peers join system for one hour on average  Hundreds of changes per second with millions of peers in system! Why adversarial (worst-case) churn? E.g., a crawler takes down neighboring machines rather than randomly chosen peers!

Roger Wattenhofer, ETH P2P The Adversary Model worst-case faults with an adversary ADV(J,L, ) ADV(J,L, ) has complete visibility of the entire state of the system May add at most J and remove at most L peers in any time period of length Note: Adversary is not Byzantine!

Roger Wattenhofer, ETH P2P Synchronous Model Our system is synchronous, i.e., our algorithms run in rounds. One round: –receive messages, –local computation, –send messages However: Real distributed systems are asynchronous! But: Notion of time necessary to bound the adversary

Roger Wattenhofer, ETH P2P A First Approach What if number of peers is not 2 i ? How to prevent degeneration? Where to store data? Idea: Simulate the hypercube Fault-tolerant hypercube?

Roger Wattenhofer, ETH P2P Simulated Hypercube System Basic components: Simulation: Each node consists of several peers Route peers to sparse areas Adapt dimension Token distribution Information aggregation

Roger Wattenhofer, ETH P2P Example: Information Aggregation Algorithm: Count peers in every sub-cube by exchange with corresponding neighbor Correct number after d steps!

Roger Wattenhofer, ETH P2P Results All our algorithms (token distribution and data aggregation) consistently run in the background. We can tolerate an adversary who can insert/delete O(log n) peers per maximum message delay. Our system is never fully repaired, but always fully functional. In detail, we have in spite of ADV(O(log n),O(log n),1): –always at least one peer per node, –at most O(log n) peers per node, –network diameter O(log n), –peer degree O(log n).

Roger Wattenhofer, ETH P2P Overview Introduction Past Present –Dynamic systems & mobility –Fault-tolerance (crash failures) –Security (Byzantine failures) –Selfish agents & computational economy –Simple and implementable algorithms –Local algorithms –Geometry, metrics, bounded growth, etc. –Applications

Roger Wattenhofer, ETH P2P Byzantine Failures If adversary controls more and more corrupted nodes and then crashes all of them at the same time (“sleepers”), we stand no chance. “Robust Distributed Name Service”. Baruch Awerbuch and Christian Scheideler, at IPTPS Idea: Assume that the Byzantine peers are the minority. If the corrupted nodes are the majority in a specific part of the system, they can be detected (because of their unusual high density).

Roger Wattenhofer, ETH P2P Overview Introduction Past Present –Dynamic systems & mobility –Fault-tolerance (crash failures) –Security (Byzantine failures) –Selfish agents & computational economy –Simple and implementable algorithms –Local algorithms –Geometry, metrics, bounded growth, etc. –Applications

Roger Wattenhofer, ETH P2P Selfish Agents Freeloading…How to generalize BitTorrent’s “tit4tat” mechanism? But also: In unstructured P2P systems: Who should I connect to? –I want to be highly connected since this improves my searches –I want to have few neighbors only (forward too many searches) –Hypercubic networks probably are a “socially efficient” solution, however, if every node acts selfishly, do we end up with a hypercubic network?!? “On a network creation game”. Alex Fabrikant, Ankur Luthra, Elitza Maneva, Christos H. Papadimitriou, Scott Shenker, at PODC 2003

Roger Wattenhofer, ETH P2P Overview Introduction Past Present –Dynamic systems & mobility –Fault-tolerance (crash failures) –Security (Byzantine failures) –Selfish agents & computational economy –Simple and implementable algorithms –Local algorithms –Geometry, metrics, bounded growth, etc. –Applications

Roger Wattenhofer, ETH P2P Unstructured P2P: Who should I connect to? How do I figure out that the yellow node is farther away? Idea: Cluster the network using a generalized MIS (  -net). “Structuring Unstructured P2P Networks”. Stefan Schmid, Roger Wattenhofer, in submission. unstructured P2P ?

Roger Wattenhofer, ETH P2P Overview Introduction Past Present –Dynamic systems & mobility –Fault-tolerance (crash failures) –Security (Byzantine failures) –Selfish agents & computational economy –Simple and implementable algorithms –Local algorithms –Geometry, metrics, bounded growth, etc. –Applications

Roger Wattenhofer, ETH P2P Local Algorithms A Dominating Set DS is a subset of nodes such that each node is either in DS or has a neighbor in DS. It might be favorable to have few nodes in the DS. This is known as the Minimum DS problem. This by itself is a hard problem, however, the solution must be local (global solutions are impractical in dynamic P2P networks) – topology of graph “far away” should not influence a local decision. “Constant-Time Distributed Dominating Set Approximation”. Fabian Kuhn, Roger Wattenhofer, at PODC 2003.

Roger Wattenhofer, ETH P2P Algorithm Overview Input: Local Graph Fractional Dominating Set Dominating Set 0.5 Phase B: Probabilistic algorithm Phase A: Distributed linear program rel. high degree gives high value

Roger Wattenhofer, ETH P2P A Lower Bound “What cannot be computed locally!” Fabian Kuhn, Thomas Moscibroda, Roger Wattenhofer, at PODC Model: In a network/graph G (nodes = processors), each node can exchange a message with all its neighbors for k rounds. After k rounds, the node needs to decide. We construct a graph such that there are nodes that see the same neighborhood up to distance k. We show that node ID’s do not help, and using Yao’s principle also randomization does not.

Roger Wattenhofer, ETH P2P Lower Bound for Dominating Sets: Intuition… m n-1 complete n m m … n n n Two graphs (m << n). Optimal dominating sets are marked red. |DS OPT | = 2. |DS OPT | = m+1.

Roger Wattenhofer, ETH P2P Lower Bound for Dominating Sets: Intuition… In local algorithms, nodes must decide only using local knowledge. In the example green nodes see exactly the same neighborhood. So these green nodes must decide the same way! m n-1 n m …

Roger Wattenhofer, ETH P2P Lower Bound for Dominating Sets: Intuition… m n-1 complete n m m … n n n But however they decide, one way will be devastating (with n = m 2 )! |DS OPT | = 2. |DS OPT without green | ¸ m. |DS OPT | = m+1. |DS OPT with green | > n

Roger Wattenhofer, ETH P2P Results Many problems (vertex cover, dominating set, matching, independent set,  -net, etc.) cannot be approximated better than  (n c/k 2 / k) and/or  (  1/k / k). It follows that a polylogarithmic approximation of many standard problems needs at least  (log  / loglog  ) and/or  ((log n / loglog n) 1/2 ) time. For some (exotic) problems this is tight.

Roger Wattenhofer, ETH P2P Overview Introduction Past Present –Dynamic systems & mobility –Fault-tolerance (crash failures) –Security (Byzantine failures) –Selfish agents & computational economy –Simple and implementable algorithms –Local algorithms –Geometry, metrics, bounded growth, etc. –Applications

Roger Wattenhofer, ETH P2P Geometry strikes back! Having a logarithmic number of hops is nice, however, hopping back and forth over continents is a major nuisance. So instead of placing joining nodes randomly into the structured P2P system, one might think of placing nodes such that the total latency of a search is small. In other words, geographically close nodes should also be close in the topology. In fact, this was already the topic of the Plaxton et al. paper, but it’s certainly coming back. These days people have new models for the Internet graph (“almost metric”) which allow for new exciting results. “Competitive Algorithms for Distributed Data Management”. Yair Bartal, Amos Fiat, and Yuval Rabani, at STOC 1992.

Roger Wattenhofer, ETH P2P Overview Introduction Past Present –Dynamic systems & mobility –Fault-tolerance (crash failures) –Security (Byzantine failures) –Selfish agents & computational economy –Simple and implementable algorithms –Local algorithms –Geometry, metrics, bounded growth, etc. –Applications

Roger Wattenhofer, ETH P2P SP.a.M/\TØ – An Extendable Spam Filter System Collaborative spam filter, users report spam: Principle Idea: Reported spam is stored in DHT repository. Problems: –These days spams are personalized and/or randomized, so the DHT needs some form of proximity search. –What about Mr. Bad Guy filling in wrong reports (or what about that some classify as spam and others as ham)? A trust system is needed. Available for Windows/Outlook, Thunderbird, Mozilla, and all other mail clients through a proxy. More info on

Roger Wattenhofer, ETH P2P Conclusions The most exciting years of P2P still to come! On the file-sharing side we see the first structured systems (Kad) On the research/theory side there are a bunch of stimulating areas: –Dynamic systems & mobility –Fault-tolerance (crash failures) –Security (Byzantine failures) –Selfish agents & computational economy –Simple (implementable) algorithms –Local algorithms –Geometry, metrics, bounded growth, etc. Last not least applications beyond file sharing are emerging!

Questions? Comments? Distributed Computing Group Roger Wattenhofer