An Overview of Peer-to-Peer Sami Rollins 11/14/02.

Slides:



Advertisements
Similar presentations
Peer-to-Peer Infrastructure and Applications Andrew Herbert Microsoft Research, Cambridge
Advertisements

SkipNet: A Scalable Overlay Network with Practical Locality Properties Nick Harvey, Mike Jones, Stefan Saroiu, Marvin Theimer, Alec Wolman Microsoft Research.
An Overview of Peer-to-Peer Sami Rollins
Peer-to-peer and agent-based computing Peer-to-Peer Computing: Introduction.
Peer-to-Peer and Social Networks An overview of Gnutella.
Distributed Hash Tables
One Hop Lookups for Peer-to-Peer Overlays Anjali Gupta, Barbara Liskov, Rodrigo Rodrigues Laboratory for Computer Science, MIT.
P2P data retrieval DHT (Distributed Hash Tables) Partially based on Hellerstein’s presentation at VLDB2004.
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan MIT and Berkeley presented by Daniel Figueiredo Chord: A Scalable Peer-to-peer.
Peer to Peer and Distributed Hash Tables
Pastry Peter Druschel, Rice University Antony Rowstron, Microsoft Research UK Some slides are borrowed from the original presentation by the authors.
Scalable Content-Addressable Network Lintao Liu
Peer-to-Peer Systems Chapter 25. What is Peer-to-Peer (P2P)? Napster? Gnutella? Most people think of P2P as music sharing.
An Overview of Peer-to-Peer Networking CPSC 441 (with thanks to Sami Rollins, UCSB)
Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.
Peer-to-Peer Content Sharing. P2P File Sharing Benefits Why use a P2P model for a file sharing application?
Distributed Lookup Systems
Overlay Networks EECS 122: Lecture 18 Department of Electrical Engineering and Computer Sciences University of California Berkeley.
Beyond Napster: An Overview of Peer-to-Peer Systems and Applications Sami Rollins.
Object Naming & Content based Object Search 2/3/2003.
Content Addressable Networks. CAN Associate with each node and item a unique id in a d-dimensional space Goals –Scales to hundreds of thousands of nodes.
Chord-over-Chord Overlay Sudhindra Rao Ph.D Qualifier Exam Department of ECECS.
A Measurement Study of Peer-to- Peer File Sharing Systems Sariou, Gummadi, and Gribble.
Topics in Reliable Distributed Systems Fall Dr. Idit Keidar.
1 CS 194: Distributed Systems Distributed Hash Tables Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer.
1 Seminar: Information Management in the Web Gnutella, Freenet and more: an overview of file sharing architectures Thomas Zahn.
CS 268: Overlay Networks: Distributed Hash Tables Kevin Lai May 1, 2001.
Improving Data Access in P2P Systems Karl Aberer and Magdalena Punceva Swiss Federal Institute of Technology Manfred Hauswirth and Roman Schmidt Technical.
Peer-to-Peer Networks Slides largely adopted from Ion Stoica’s lecture at UCB.
1CS 6401 Peer-to-Peer Networks Outline Overview Gnutella Structured Overlays BitTorrent.
Introduction to Peer-to-Peer Networks. What is a P2P network Uses the vast resource of the machines at the edge of the Internet to build a network that.
INTRODUCTION TO PEER TO PEER NETWORKS Z.M. Joseph CSE 6392 – DB Exploration Spring 2006 CSE, UT Arlington.
Introduction Widespread unstructured P2P network
A Survey of Peer-to-Peer Content Distribution Technologies Stephanos Androutsellis-Theotokis and Diomidis Spinellis ACM Computing Surveys, December 2004.
Distributed Systems Concepts and Design Chapter 10: Peer-to-Peer Systems Bruce Hammer, Steve Wallis, Raymond Ho.
Introduction to Peer-to-Peer Networks. What is a P2P network A P2P network is a large distributed system. It uses the vast resource of PCs distributed.
Peer-to-Peer Networking. Presentation Introduction Characteristics and Challenges of Peer-to-Peer Peer-to-Peer Applications Classification of Peer-to-Peer.
Peer to Peer Research survey TingYang Chang. Intro. Of P2P Computers of the system was known as peers which sharing data files with each other. Build.
Colin J. MacDougall.  Class of Systems and Applications  “Employ distributed resources to perform a critical function in a decentralized manner”  Distributed.
Jonathan Walpole CSE515 - Distributed Computing Systems 1 Teaching Assistant for CSE515 Rahul Dubey.
Peer-to-Pee Computing HP Technical Report Chin-Yi Tsai.
Vincent Matossian September 21st 2001 ECE 579 An Overview of Decentralized Discovery mechanisms.
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan MIT and Berkeley presented by Daniel Figueiredo Chord: A Scalable Peer-to-peer.
1 Peer-to-Peer Technologies Seminar by: Kunal Goswami (05IT6006) School of Information Technology Guided by: Prof. C.R.Mandal, School of Information Technology.
Peer to Peer A Survey and comparison of peer-to-peer overlay network schemes And so on… Chulhyun Park
1 Secure Peer-to-Peer File Sharing Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, Hari Balakrishnan MIT Laboratory.
Computer Networking P2P. Why P2P? Scaling: system scales with number of clients, by definition Eliminate centralization: Eliminate single point.
ADVANCED COMPUTER NETWORKS Peer-Peer (P2P) Networks 1.
Peer to Peer Network Design Discovery and Routing algorithms
Algorithms and Techniques in Structured Scalable Peer-to-Peer Networks
Peer-to-Peer Systems: An Overview Hongyu Li. Outline  Introduction  Characteristics of P2P  Algorithms  P2P Applications  Conclusion.
1 A Measurement Study of Peer-to-Peer File Sharing Systems by Stefan Saroiu P. Krishna Gummadi Steven D. Gribble Presentation by Nanda Kishore Lella
LOOKING UP DATA IN P2P SYSTEMS Hari Balakrishnan M. Frans Kaashoek David Karger Robert Morris Ion Stoica MIT LCS.
Bruce Hammer, Steve Wallis, Raymond Ho
Two Peer-to-Peer Networking Approaches Ken Calvert Net Seminar, 23 October 2001 Note: Many slides “borrowed” from S. Ratnasamy’s Qualifying Exam talk.
INTERNET TECHNOLOGIES Week 10 Peer to Peer Paradigm 1.
P2P Search COP6731 Advanced Database Systems. P2P Computing  Powerful personal computer Share computing resources P2P Computing  Advantages: Shared.
P2P Search COP P2P Search Techniques Centralized P2P systems  e.g. Napster, Decentralized & unstructured P2P systems  e.g. Gnutella.
Large Scale Sharing Marco F. Duarte COMP 520: Distributed Systems September 19, 2004.
Distributed Web Systems Peer-to-Peer Systems Lecturer Department University.
Peer-to-Peer Information Systems Week 12: Naming
A Survey of Peer-to-Peer Content Distribution Technologies Stephanos Androutsellis-Theotokis and Diomidis Spinellis ACM Computing Surveys, December 2004.
CS 268: Lecture 22 (Peer-to-Peer Networks)
EE 122: Peer-to-Peer (P2P) Networks
CS 268: Peer-to-Peer Networks and Distributed Hash Tables
A Scalable content-addressable network
CS 162: P2P Networks Computer Science Division
An Overview of Peer-to-Peer
Peer-to-Peer Information Systems Week 12: Naming
Presentation transcript:

An Overview of Peer-to-Peer Sami Rollins 11/14/02

Outline P2P Overview –What is a peer? –Example applications –Benefits of P2P P2P Content Sharing –Challenges –Group management/data placement approaches –Measurement studies

What is Peer-to-Peer (P2P)? Napster? Gnutella? Most people think of P2P as music sharing

What is a peer? Contrasted with Client-Server model Servers are centrally maintained and administered Client has fewer resources than a server

What is a peer? A peer’s resources are similar to the resources of the other participants P2P – peers communicating directly with other peers and sharing resources

Levels of P2P-ness P2P as a mindset –Slashdot P2P as a model –Gnutella P2P as an implementation choice –Application-layer multicast P2P as an inherent property –Ad-hoc networks

P2P Application Taxonomy P2P Systems Distributed Computing File Sharing Gnutella Collaboration Jabber Platforms JXTA

P2P Goals/Benefits Cost sharing Resource aggregation Improved scalability/reliability Increased autonomy Anonymity/privacy Dynamism Ad-hoc communication

P2P File Sharing Content exchange –Gnutella File systems –Oceanstore Filtering/mining –Opencola

P2P File Sharing Benefits Cost sharing Resource aggregation Improved scalability/reliability Anonymity/privacy Dynamism

Research Areas Peer discovery and group management Data location and placement Reliable and efficient file exchange Security/privacy/anonymity/trust

Current Research Group management and data placement –Chord, CAN, Tapestry, Pastry Anonymity –Publius Performance studies –Gnutella measurement study

Management/Placement Challenges Per-node state Bandwidth usage Search time Fault tolerance/resiliency

Approaches Centralized Flooding Document Routing

Centralized Napster model Benefits: –Efficient search –Limited bandwidth usage –No per-node state Drawbacks: –Central point of failure –Limited scale BobAlice JaneJudy

Flooding Gnutella model Benefits: –No central point of failure –Limited per-node state Drawbacks: –Slow searches –Bandwidth intensive Bob Alice Jane Judy Carl

Document Routing FreeNet, Chord, CAN, Tapestry, Pastry model Benefits: –More efficient searching –Limited per-node state Drawbacks: –Limited fault-tolerance vs redundancy ?

Document Routing – CAN Associate to each node and item a unique id in an d-dimensional space Goals –Scales to hundreds of thousands of nodes –Handles rapid arrival and failure of nodes Properties –Routing table size O(d) –Guarantees that a file is found in at most d*n 1/d steps, where n is the total number of nodes Slide modified from another presentation

CAN Example: Two Dimensional Space Space divided between nodes All nodes cover the entire space Each node covers either a square or a rectangular area of ratios 1:2 or 2:1 Example: –Node n1:(1, 2) first node that joins  cover the entire space n1 Slide modified from another presentation

CAN Example: Two Dimensional Space Node n2:(4, 2) joins  space is divided between n1 and n n1 n2 Slide modified from another presentation

CAN Example: Two Dimensional Space Node n2:(4, 2) joins  space is divided between n1 and n n1 n2 n3 Slide modified from another presentation

CAN Example: Two Dimensional Space Nodes n4:(5, 5) and n5:(6,6) join n1 n2 n3 n4 n5 Slide modified from another presentation

CAN Example: Two Dimensional Space Nodes: n1:(1, 2); n2:(4,2); n3:(3, 5); n4:(5,5);n5:(6,6) Items: f1:(2,3); f2:(5,1); f3:(2,1); f4:(7,5); n1 n2 n3 n4 n5 f1 f2 f3 f4 Slide modified from another presentation

CAN Example: Two Dimensional Space Each item is stored by the node who owns its mapping in the space n1 n2 n3 n4 n5 f1 f2 f3 f4 Slide modified from another presentation

CAN: Query Example Each node knows its neighbors in the d-space Forward query to the neighbor that is closest to the query id Example: assume n1 queries f4 Can route around some failures –some failures require local flooding n1 n2 n3 n4 n5 f1 f2 f3 f4 Slide modified from another presentation

CAN: Query Example Each node knows its neighbors in the d-space Forward query to the neighbor that is closest to the query id Example: assume n1 queries f4 Can route around some failures –some failures require local flooding n1 n2 n3 n4 n5 f1 f2 f3 f4 Slide modified from another presentation

CAN: Query Example Each node knows its neighbors in the d-space Forward query to the neighbor that is closest to the query id Example: assume n1 queries f4 Can route around some failures –some failures require local flooding n1 n2 n3 n4 n5 f1 f2 f3 f4 Slide modified from another presentation

CAN: Query Example Each node knows its neighbors in the d-space Forward query to the neighbor that is closest to the query id Example: assume n1 queries f4 Can route around some failures –some failures require local flooding n1 n2 n3 n4 n5 f1 f2 f3 f4 Slide modified from another presentation

Node Failure Recovery Simple failures –know your neighbor’s neighbors –when a node fails, one of its neighbors takes over its zone More complex failure modes –simultaneous failure of multiple adjacent nodes –scoped flooding to discover neighbors –hopefully, a rare event Slide modified from another presentation

Document Routing – Chord MIT project Uni-dimensional ID space Keep track of log N nodes Search through log N nodes to find desired key N32 N10 N5 N20 N110 N99 N80 N60 K19

Doc Routing – Tapestry/Pastry Global mesh Suffix-based routing Uses underlying network distance in constructing mesh 13FE ABFE E 73FE 9990 F E 04FE 43FE

Comparing Guarantees log b NNeighbor map Pastry b log b N log b NGlobal MeshTapestry 2ddN 1/d Multi- dimensional CAN log N Uni- dimensional Chord StateSearchModel b log b N + b

Remaining Problems? Hard to handle highly dynamic environments Usable services Methods don’t consider peer characteristics

Measurement Studies “Free Riding on Gnutella” Most studies focus on Gnutella Want to determine how users behave Recommendations for the best way to design systems

Free Riding Results Who is sharing what? August 2000 The topShareAs percent of whole 333 hosts (1%)1,142,64537% 1,667 hosts (5%)2,182,08770% 3,334 hosts (10%)2,692,08287% 5,000 hosts (15%)2,928,90594% 6,667 hosts (20%)3,037,23298% 8,333 hosts (25%)3,082,57299%

Saroiu et al Study How many peers are server-like…client- like? –Bandwidth, latency Connectivity Who is sharing what?

Saroiu et al Study May 2001 Napster crawl –query index server and keep track of results –query about returned peers –don’t capture users sharing unpopular content Gnutella crawl –send out ping messages with large TTL

Results Overview Lots of heterogeneity between peers –Systems should consider peer capabilities Peers lie –Systems must be able to verify reported peer capabilities or measure true capabilities

Measured Bandwidth

Reported Bandwidth

Measured Latency

Measured Uptime

Number of Shared Files

Connectivity

Points of Discussion Is it all hype? Should P2P be a research area? Do P2P applications/systems have common research questions? What are the “killer apps” for P2P systems?

Conclusion P2P is an interesting and useful model There are lots of technical challenges to be solved