P2P Systems - 1 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Information Systems: A Short Overview of Peer-2-Peer Systems.

Slides:



Advertisements
Similar presentations
Performance in Decentralized Filesharing Networks Theodore Hong Freenet Project.
Advertisements

Peer-to-Peer and Social Networks Power law graphs Small world graphs.
1 Analyzing Kleinberg’s Small-world Model Chip Martel and Van Nguyen Computer Science Department; University of California at Davis.
Small-world networks.
P2P data retrieval DHT (Distributed Hash Tables) Partially based on Hellerstein’s presentation at VLDB2004.
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan MIT and Berkeley presented by Daniel Figueiredo Chord: A Scalable Peer-to-peer.
Peer to Peer and Distributed Hash Tables
Scalable Content-Addressable Network Lintao Liu
Online Social Networks and Media Navigation in a small world.
Information Networks Small World Networks Lecture 5.
1 An Overview of Gnutella. 2 History The Gnutella network is a fully distributed alternative to the centralized Napster. Initial popularity of the network.
Identity and search in social networks Presented by Pooja Deodhar Duncan J. Watts, Peter Sheridan Dodds and M. E. J. Newman.
Company LOGO 1 Identity and Search in Social Networks D.J.Watts, P.S. Dodds, M.E.J. Newman Maryam Fazel-Zarandi.
Search in Power-Law Networks Presented by Hakim Weatherspoon CS294-4: Peer-to-Peer Systems Slides also borrowed from the following paper Path Finding Strategies.
CSE 522 – Algorithmic and Economic Aspects of the Internet Instructors: Nicole Immorlica Mohammad Mahdian.
Emergence of Scaling in Random Networks Barabasi & Albert Science, 1999 Routing map of the internet
Common approach 1. Define space: assign random ID (160-bit) to each node and key 2. Define a metric topology in this space,  that is, the space of keys.
Small-World Graphs for High Performance Networking Reem Alshahrani Kent State University.
Small Worlds Presented by Geetha Akula For the Faculty of Department of Computer Science, CALSTATE LA. On 8 th June 07.
Peer-to-Peer Networks João Guerreiro Truong Cong Thanh Department of Information Technology Uppsala University.
P2p, Spring 05 1 Topics in Database Systems: Data Management in Peer-to-Peer Systems March 29, 2005.
Cis e-commerce -- lecture #6: Content Distribution Networks and P2P (based on notes from Dr Peter McBurney © )
Peer-to-Peer and Grid Computing Exercise Session 3 (TUD Student Use Only) ‏
1 Analyzing Kleinberg’s (and other) Small-world Models Chip Martel and Van Nguyen Computer Science Department; University of California at Davis.
Small World Networks Somsubhra Sharangi Computing Science, Simon Fraser University.
A. Frank 1 Internet Resources Discovery (IRD) Peer-to-Peer (P2P) Technology (1) Thanks to Carmit Valit and Olga Gamayunov.
presented by Hasan SÖZER1 Scalable P2P Search Daniel A. Menascé George Mason University.
Chord-over-Chord Overlay Sudhindra Rao Ph.D Qualifier Exam Department of ECECS.
The Small World Phenomenon: An Algorithmic Perspective by Anton Karatoun.
Topics in Reliable Distributed Systems Fall Dr. Idit Keidar.
1 Analyzing Kleinberg’s (and other) Small-world Models Chip Martel and Van Nguyen Computer Science Department; University of California at Davis.
Improving Data Access in P2P Systems Karl Aberer and Magdalena Punceva Swiss Federal Institute of Technology Manfred Hauswirth and Roman Schmidt Technical.
P-Grid Presentation by Thierry Lopez P-Grid: A Self-organizing Structured P2P System Karl Aberer, Philippe Cudré-Mauroux, Anwitaman Datta, Zoran Despotovic,
Introduction to Peer-to-Peer Networks. What is a P2P network Uses the vast resource of the machines at the edge of the Internet to build a network that.
P2P File Sharing Systems
INTRODUCTION TO PEER TO PEER NETWORKS Z.M. Joseph CSE 6392 – DB Exploration Spring 2006 CSE, UT Arlington.
(Social) Networks Analysis III Prof. Dr. Daning Hu Department of Informatics University of Zurich Oct 16th, 2012.
1 Napster & Gnutella An Overview. 2 About Napster Distributed application allowing users to search and exchange MP3 files. Written by Shawn Fanning in.
P2P Architecture Case Study: Gnutella Network
1 Reading Report 4 Yin Chen 26 Feb 2004 Reference: Peer-to-Peer Architecture Case Study: Gnutella Network, Matei Ruoeanu, In Int. Conf. on Peer-to-Peer.
Developing Analytical Framework to Measure Robustness of Peer-to-Peer Networks Niloy Ganguly.
Introduction to Peer-to-Peer Networks. What is a P2P network A P2P network is a large distributed system. It uses the vast resource of PCs distributed.
Small-world networks. What is it? Everyone talks about the small world phenomenon, but truly what is it? There are three landmark papers: Stanley Milgram.
Using the Small-World Model to Improve Freenet Performance Hui Zhang Ashish Goel Ramesh Govindan USC.
Optimisation des DHT à partir des propriétés physiques, logiques et sociologiques des clients Pierre Fraigniaud CNRS LRI, Univ. Paris-Sud
Gennaro Cordasco - How Much Independent Should Individual Contacts be to Form a Small-World? - 19/12/2006 How Much Independent Should Individual Contacts.
Online Social Networks and Media
© 2002, Magdalena Punceva, EPFL-IC, Laboratoire de systèmes d'informations répartis Self-Organized Construction of Distributed Access Structures: A Comparative.
An IP Address Based Caching Scheme for Peer-to-Peer Networks Ronaldo Alves Ferreira Joint work with Ananth Grama and Suresh Jagannathan Department of Computer.
Navigation in small worlds Social Networks: Models and Applications Seminar Toronto, Fall 2007 (based on a presentation by Stratis Ioannidis)
1 Peer-to-Peer Technologies Seminar by: Kunal Goswami (05IT6006) School of Information Technology Guided by: Prof. C.R.Mandal, School of Information Technology.
Scalable Content- Addressable Networks Prepared by Kuhan Paramsothy March 5, 2007.
Peer to Peer A Survey and comparison of peer-to-peer overlay network schemes And so on… Chulhyun Park
ADVANCED COMPUTER NETWORKS Peer-Peer (P2P) Networks 1.
Peer to Peer Network Design Discovery and Routing algorithms
Algorithms and Techniques in Structured Scalable Peer-to-Peer Networks
LOOKING UP DATA IN P2P SYSTEMS Hari Balakrishnan M. Frans Kaashoek David Karger Robert Morris Ion Stoica MIT LCS.
CS Spring 2014 CS 414 – Multimedia Systems Design Lecture 37 – Introduction to P2P (Part 1) Klara Nahrstedt.
Performance Evaluation Lecture 1: Complex Networks Giovanni Neglia INRIA – EPI Maestro 10 December 2012.
INTERNET TECHNOLOGIES Week 10 Peer to Peer Paradigm 1.
CS 347Notes081 CS 347: Parallel and Distributed Data Management Notes 08: P2P Systems.
P2P Search COP6731 Advanced Database Systems. P2P Computing  Powerful personal computer Share computing resources P2P Computing  Advantages: Shared.
P2P Search COP P2P Search Techniques Centralized P2P systems  e.g. Napster, Decentralized & unstructured P2P systems  e.g. Gnutella.
Models and Algorithms for Complex Networks
CS694 - DHT1 Distributed Hash Table Systems Hui Zhang University of Southern California.
Lecture 23: Structure of Networks
Identity and Search in Social Networks
Peer-to-Peer and Social Networks
Lecture 23: Structure of Networks
Navigation and Propagation in Networks
Presentation transcript:

P2P Systems - 1 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Information Systems: A Short Overview of Peer-2-Peer Systems Karl Aberer EPFL-IC lsirwww.epfl.ch

P2P Systems - 2 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Overview 1.P2P Systems - Motivation 2.Unstructured P2P Overlay Networks 3.Hierarchical P2P Overlay Networks 4.Structured P2P Overlay Networks 5.Small World Graphs 6.Conclusions

P2P Systems - 3 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Web search engine –Global scale application Example: Google – Mio searches/day –4 10^9 Web pages Client-Server Information Systems processors disks 1 Find "aberer" 2 Result home page of Karl Aberer … Google Server Client Strengths –Global ranking –Fast response time Weaknesses –Infrastructure, administration, cost –A new company for every global application ?

P2P Systems - 4 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis (Semi-)Decentralized Information Systems P2P Music file sharing –Global scale application Example: Napster –1.57 Mio. Users –10 TeraByte of data (2 Mio songs, 220 songs per user) (February 2001) 1 Find "brick in the wall" "pink floyd" "1 MB" "rock" schema 2 Result you find f.mp3 at peer x 3 Request and transfer file f.mp3 from peer X directly Napster Server Peer PeerX 100 servers

P2P Systems - 5 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Lessons Learned from Napster Strengths: Resource Sharing –Every node “pays” its participation by providing access to its resources physical resources (disk, network), knowledge (annotations), ownership (files) –Every participating node acts as both a client and a server (“servent”): P2P –global information system without huge investment –decentralization of cost and administration = avoiding resource bottlenecks Weaknesses: Centralization –server is single point of failure –unique entity required for controlling the system = design bottleneck –copying copyrighted material made Napster target of legal attack Centralized System Decentralized System increasing degree of resource sharing and decentralization

P2P Systems - 6 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis 2. Unstructured P2P Overlay Networks P2P file sharing –Global scale application Example: Gnutella – nodes, 3 Mio files (August 2000) Gnutella: no servers Strengths –Good response time, scalable –No infrastructure, no administration –No single point of failure Weaknesses –High network traffic –No structured search –Free-riding Find "brick in the wall" I have "brick_in_the_wall.mp3" …. Self-organizing System

P2P Systems - 7 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Self-Organization Self-organized systems well known from physics, biology, cybernetics –distribution of control ( = decentralization = P2P) –local interactions, information and decisions –emergence of global structures –failure resilience Self-organization in information systems –new hot topic in research –strongly motivated by P2P systems

P2P Systems - 8 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Connectivity in Gnutella Follows a power-law distribution: P(k) ~ k -g –k number of links a node is connected to, g constant (e.g. g=2) –distribution independent of number of nodes N –explanation: preferential attachment (self-organization process)

P2P Systems - 9 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis 3. Hierarchical P2P Overlay Networks Dedicated servers provide index information, i.e. know which peer holds which file (Napster) Simplest Approach –one central server –user register files –service (file exchange) is organized as P2P architecture k="madonna" index server

P2P Systems - 10 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Superpeer Networks Improvement of Central Index Server (Morpheus, Kaaza) –multiple index servers build a P2P network –clients are associated with one (or more) superpeers –superpeers use message flooding to forward search requests

P2P Systems - 11 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis 4. Structured P2P Overlay Networks Unstructured overlay networks – what we learned –simplicity (simple protocol) –robustness (almost impossible to “kill” – no central authority) Performance –search latency O(logN) –update cost low Drawbacks –tremendous bandwidth consumption Can we do better?

P2P Systems - 12 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis "Napster" bottleneck Search Trees Search tree: search keys are binary keys ?01?10?11? 0??1?? ??? index 101 ? ? ? ! peer 1peer 2peer 3peer 4

P2P Systems - 13 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Non-scalable Distribution of Search Tree Distribute search tree over peers ?01?10?11? 0??1?? ??? peer 1peer 2peer 3peer 4 bottleneck

P2P Systems - 14 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Scalable Distribution of Search Tree (P-Grid) ?01?10?11? 0??1?? ??? peer 1peer 2peer 3peer 4 Associate each peer with a complete path

P2P Systems - 15 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Routing Information ? 1?? ??? peer 1 peer 2 peer 3 peer 4 know more about this part of the tree knows more about this part of the tree

P2P Systems - 16 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Prefix Routing 11? 1?? ??? peer 4 peer 1 peer 2 peer ? 1?? ??? peer 1peer 2 peer 3 peer ? ? ? ? ! Message to peer ? prefixpeer 0??peer1 peer2 10?peer3 routing table of peer4 search(p. k) find in routing table peer i with longest prefix matching k if last entry then found else search(peer i, k)

P2P Systems - 17 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Efficient Resource Location search cost maximal bandwidth update cost low high BROADCAST (e.g. Gnutella) SERVER (e.g. Napster) FULL REPLICATION STRUCTURED OVERLAY NETWORKS (e.g. prefix routing)

P2P Systems - 18 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis 5. Small World Graphs Each overlay network can be interpreted as a directed graph –peers correspond to nodes –routing table entries as directed links Task –Find a decentralized algorithm (greedy routing) to route a message from any node A to any other node B with few hops compared to the size of the graph –Requires the existence of short paths in the graph

P2P Systems - 19 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Milgram’s Experiment Finding short chains of acquaintances linking pairs of people in USA who didn’t know each other; –Source person in Nebraska –Sends message with first name and location –Target person in Massachusetts. Average length of the chains that were completed was between 5 and 6 steps “Six degrees of separation” principle BIG QUESTION: –WHY there should be short chains of acquaintances linking together arbitrary pairs of strangers???

P2P Systems - 20 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Random Graphs For many years typical explanation was - random graphs –Vertices are selected uniformly at random –Low diameter: expected distance between two nodes is log k N (where k is the outdegree and N the number of nodes) But there are some inaccuracies –If A and B have a common friend C it is more likely that they themselves will be friends! (clustering) –Many real world networks (social networks, biological networks in nature, artificial networks – power grid, WWW) exhibit this clustering property –Random networks are NOT clustered.

P2P Systems - 21 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Clustering Clustering measures the fraction of neighbors of a node that are connected themselves Regular Graphs have a high clustering coefficient –but also a high diameter Random Graphs have a low clustering coefficient –but a low diameter Random Graph (k=4) Short path length L ~ log k N Almost no clustering C ~ k/n Regular Graph (k=4) Long paths L ~ n/(2k) Highly clustered C ~ 3/4

P2P Systems - 22 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Small-World Networks Random rewiring of regular graph (by Watts and Strogatz) –With probability p rewire each link in a regular graph to a randomly selected node –Resulting graph has high clustering and short path length BUT! Watts-Strogatz explains the structure of the graph –existence of short paths, high clustering It does not explain how the shortest paths are found –Gnutella networks are also small-world graphs

P2P Systems - 23 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis P2P Overlay Networks as Graphs Each structured overlay network can be interpreted as a directed graph … –peers correspond to nodes –routing table entries as directed links … embedded in some space –P-Grid: interval [0,1] –others: d-dimensional space –etc. Task –Find a decentralized algorithm (greedy routing) to route a message from any node A to any other node B with few hops compared to the size of the graph each node has a coordinate!

P2P Systems - 24 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Kleinberg’s Small-World Model Kleinberg’s Small-World’s model –Embed the graph into an r-dimensional grid –constant number of short range links (neighborhood) –q long range links: choose long-range links such that the probability to have a long range contact is proportional to 1/d r Importance of r ! –Decentralized (greedy) routing performs best iff. r = dimension of space r = 2

P2P Systems - 25 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Influence of “r” Given r = dim each long range contact of u is nearly equally likely to belong to any of the sets A i When q = logN – on average each node will have a link in each set of A i A i, consists of all nodes whose distance from u is between 2 i and 2 i+1, i=0..logN-1.

P2P Systems - 26 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Structured Overlay Networks and Kleinberg's model P-Grid’s model Kleinberg’s model

P2P Systems - 27 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis 6. Conclusions P2P systems –started out from some "hacker-type" applications –initiated lots of original research –basis for novel, highly scalable information systems

P2P Systems - 28 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Small Guide to Information Systems courses Introduction to information systems (Sem 6, mandatory) Basis of all: relational databases and Web technology, fun project Middleware (Masters) Everything you need in industry Distributed Information systems (Masters) Fun part: Web, P2P, Search Engines, and much more

P2P Systems - 29 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Masters Specialization "Internet Computing" INFNatural LanguageRajman et al6(4+2)SS INFAdvanced Databases Spaccapietra6(3+3)SS INFMultimedia DocumentsVanoirbeek6(4+2)SS ALGDistributed Inf. Sys.Aberer6(4+2)WS ALGIntelligent AgentsFaltings6(3+3)WS ALGDistributed algorithms: Schiper4(2+1)WS message passing SYSMiddlewareGuerraoui6(4+2)SS SYSPerformanceLeBoudec6(4+2)SS SYSMobile Networks Hubaux4(2+2)SS SYSCryptography and SecurityVaudenay6(4+2)WS HISHuman-Computer InteractionPu4(2+1)SS HISEnterprise ArchitectureWegmann6(4+2)WS HISE-BusinessPigneur6(4+2) WS 66 Credits Total