P2P Search COP5711

Presentation transcript:

P2P Search COP5711

2 P2P Search Techniques
- Centralized P2P systems, e.g., Napster
- Decentralized and unstructured P2P systems, e.g., Gnutella
- Hybrid (partially decentralized), e.g., Freenet
- Structured P2P systems: DHT, CAN

3 P2P Network
- A P2P network is an overlay network built on top of a real physical network (e.g., the Internet)
- In a P2P network, peers are network nodes connected by virtual or logical links
- A logical link is a path through many physical links in the underlying network
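
The relationship between logical and physical links can be sketched in a few lines. The topology below (hosts A, B, C and routers R1 to R3) is a made-up example, not taken from the slides; the function simply finds the physical path that one logical link would traverse.

```python
from collections import deque

# Hypothetical physical network: hosts A, B, C attached to routers R1..R3.
PHYSICAL = {
    "A": ["R1"], "B": ["R2"], "C": ["R3"],
    "R1": ["A", "R2"], "R2": ["R1", "B", "R3"], "R3": ["R2", "C"],
}

# Hypothetical overlay: logical links exist between peers only.
OVERLAY = {"A": ["C"], "B": ["C"], "C": ["A", "B"]}


def physical_path(src, dst):
    """Breadth-first search for a shortest physical path from src to dst."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in PHYSICAL[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


# The single logical link A-C is carried by four physical links.
link_path = physical_path("A", "C")
```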

4 Napster: Publish a File
Users upload their IP address and the music titles they wish to share.
[Figure: Napster server (central catalog) holding the entry (xyz.mp3, )]

5 Napster: Query for a File
Users search for peers to download desired files.
[Figure: query "xyz.mp3 ?" sent to the central Napster server]

6 Napster: Transfer Requested File
File transfer is P2P, using a proprietary protocol.
[Figure: central Napster server; request "xyz.mp3 ?" between peers]
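
The publish and query steps above amount to maintaining a central index from titles to peer addresses. The following is a minimal sketch under that reading; the class name, method names, and addresses are illustrative assumptions and do not reproduce the actual Napster protocol.

```python
class CentralCatalog:
    """Toy stand-in for a central directory; not the real Napster protocol."""

    def __init__(self):
        self.index = {}  # music title -> set of peer addresses sharing it

    def publish(self, peer_addr, titles):
        # A peer uploads its address together with the titles it shares.
        for title in titles:
            self.index.setdefault(title, set()).add(peer_addr)

    def query(self, title):
        # The server answers with the peers to contact; the file itself
        # is then transferred directly between peers.
        return sorted(self.index.get(title, set()))


catalog = CentralCatalog()
catalog.publish("192.0.2.10", ["xyz.mp3", "abc.mp3"])  # example addresses
catalog.publish("192.0.2.20", ["xyz.mp3"])
providers = catalog.query("xyz.mp3")
```

Every publish and every query passes through the one catalog object, which is the bottleneck and single point of failure the next slide raises.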

7 Disadvantage of Centralized Directory
- Performance bottleneck
- Single point of failure
Can we do it without a directory?

8 Decentralized P2P - Gnutella
- No catalog
- Pings the network to locate Gnutella peers
- File requests are broadcast to peers: flooding, i.e., breadth-first search
- When a provider is located, the file is transferred via HTTP

9 Gnutella: Join the Network
- Peers are Internet edge nodes
- Special peer maintained by Gnutella
- Pings the network to locate peers
[Figure: a joining peer asks "Who are my neighbors?"]

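10 Gnutella: Broadcast Request to Peers
[Figure: request "xyz.mp3 ?" sent to peers]

11 Gnutella: Flood the Request (Breadth-First Search)
[Figure: a peer answers "I have it."]

12 Gnutella: Reply with the File (via HTTP)
[Figure: the peer that answered "I have it." returns xyz.mp3]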

13 Gnutella - Disadvantages
- Network flooding: unnecessary network traffic
- Using TTL: some files might not be found
Alternatively:
- using ultranodes (or supernodes)
- using depth-first search, as in Freenet
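
Both disadvantages can be seen in a small simulation of TTL-limited flooding. This is a sketch on an invented five-peer topology, not Gnutella's wire protocol; the function name and the message-counting convention (one message per forwarded copy) are assumptions.

```python
from collections import deque


def flood_query(neighbors, files, start, wanted, ttl):
    """Breadth-first flood from `start`.

    Returns (peers holding the file, number of messages sent).
    """
    seen = {start}
    frontier = deque([(start, ttl)])
    hits, messages = [], 0
    while frontier:
        peer, remaining = frontier.popleft()
        if wanted in files.get(peer, ()):
            hits.append(peer)
        if remaining == 0:
            continue  # TTL exhausted: this peer does not forward further
        for n in neighbors[peer]:
            messages += 1          # every forwarded copy costs a message
            if n not in seen:      # duplicate copies are dropped on arrival
                seen.add(n)
                frontier.append((n, remaining - 1))
    return hits, messages


# Hypothetical overlay; only peer E holds the file.
neighbors = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
             "D": ["B", "C", "E"], "E": ["D"]}
files = {"E": {"xyz.mp3"}}

short_ttl = flood_query(neighbors, files, "A", "xyz.mp3", ttl=2)
long_ttl = flood_query(neighbors, files, "A", "xyz.mp3", ttl=3)
```

With a TTL of 2 the query dies one hop short of E even though the file exists; raising the TTL finds it at the cost of more messages, several of which are redundant copies.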

14 Morpheus, Kazaa
Flooding only the supernodes
[Figure: supernode layer]

15 Using Ultranodes
- Queries flood only the network of ultranodes
- Other peer nodes are shielded from query traffic
- Combines the benefits of centralized and decentralized search
- Takes advantage of the heterogeneity in peer capabilities
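
A two-tier search of this kind might look as follows. The sketch assumes that each ordinary peer registers its titles with one ultranode, which then answers on its behalf; that registration step and all names here are illustrative assumptions rather than details given in the slides.

```python
class Ultranode:
    def __init__(self, name):
        self.name = name
        self.peers = []   # other ultranodes this one is connected to
        self.index = {}   # title -> set of ordinary peers attached here

    def register(self, leaf, titles):
        # Assumed step: an ordinary peer reports what it shares.
        for title in titles:
            self.index.setdefault(title, set()).add(leaf)


def search(entry, title):
    """Flood the ultranode layer only; ordinary peers never see the query."""
    seen, pending, hits = {entry.name}, [entry], set()
    while pending:
        node = pending.pop()
        hits |= node.index.get(title, set())
        for peer in node.peers:
            if peer.name not in seen:
                seen.add(peer.name)
                pending.append(peer)
    return sorted(hits)


u1, u2, u3 = Ultranode("U1"), Ultranode("U2"), Ultranode("U3")
u1.peers, u2.peers, u3.peers = [u2], [u1, u3], [u2]
u3.register("leaf-7", ["xyz.mp3"])
u1.register("leaf-1", ["abc.mp3"])
found = search(u1, "xyz.mp3")
```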

16 Freenet - Depth-First Search

17 Freenet - File not Found
- The requested file is not found due to a poor routing decision made at peer D
- In this case, the query backs out of the dead end and tries another peer in depth-first manner
[Figure: a peer answers "I have file X"]
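
Backing out of a dead end is ordinary depth-first search with backtracking. The sketch below uses an invented topology in which the branch through D leads nowhere; it ignores how Freenet actually chooses which neighbor to try first and simply tries neighbors in list order, which is an assumption made for illustration.

```python
def dfs_lookup(neighbors, files, peer, wanted, hops_left, visited=None):
    """Return the path to a peer holding `wanted`, or None if not found."""
    if visited is None:
        visited = set()
    visited.add(peer)
    if wanted in files.get(peer, ()):
        return [peer]
    if hops_left == 0:
        return None
    for n in neighbors[peer]:
        if n in visited:
            continue
        path = dfs_lookup(neighbors, files, n, wanted, hops_left - 1, visited)
        if path is not None:
            return [peer] + path
        # Dead end below n: back out and try the next neighbor.
    return None


# Hypothetical overlay: B tries D first (a poor choice), then C.
neighbors = {"A": ["B"], "B": ["D", "C"], "C": ["F"],
             "D": ["E"], "E": [], "F": []}
files = {"F": {"X"}}
visited = set()
path = dfs_lookup(neighbors, files, "A", "X", 5, visited)
```

Only one query copy is in flight at any time, in contrast to flooding, but a poor first choice costs extra hops before the file is reached.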

18 Using Distributed Directory
- Data objects are everywhere
- Distribute subsets of the data directory among peers
- If we can find the relevant sub-directory, we can locate the data object
[Figure: directory, sub-directory, data objects]

19 How to Bound Search Space?
Basic idea: hashing
- Objects have hash keys: object "y" has hash key H(y)
- Peer nodes also have hash keys in the same hash space: peer "x" has hash key H(x)
- Peer x joins the P2P network with Join(H(x)); object y is published with Publish(H(y))
- Place location information about an object at the peer with the closest hash key (i.e., a distributed directory)

20 Viewed as a Distributed Hash Table
- Each peer node is responsible for a range of the hash table, according to the peer hash key
- Location information about objects is placed in the peer with the closest key (information redundancy)
[Figure: hash table divided among peer nodes]
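
The placement rule can be made concrete with a small sketch. It assumes SHA-1 truncated to 16 bits as the hash function H and a circular notion of "closest"; the slides do not fix either choice, so both are illustrative, as are the peer names.

```python
import hashlib

SPACE = 2 ** 16  # assumed size of the shared hash space


def h(name):
    """Hash a peer name or object name into the same 16-bit space."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:2], "big")


def ring_distance(a, b):
    """Distance between two keys when the hash space wraps around."""
    d = abs(a - b)
    return min(d, SPACE - d)


def responsible_peer(peer_names, object_name):
    """Peer whose hash key is closest to the object's hash key."""
    key = h(object_name)
    # Ties are broken by name so the result is deterministic.
    return min(peer_names, key=lambda p: (ring_distance(h(p), key), p))


peers = ["peer-a", "peer-b", "peer-c", "peer-d"]
owner = responsible_peer(peers, "xyz.mp3")
```

Any peer can compute `owner` locally from the object name alone, which is what bounds the search space: the location information can only be at that one peer.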

21 How to Find an Object?
Look for the peer with the corresponding peer hash key:
- A peer knows its logical neighbors
- Find peer X based on multihop routing
- X knows who has the object
[Figure: hash table; peer node X answers "Peer Y has the file"]
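
Multihop routing with only neighbor knowledge can be sketched as a greedy walk. The example assumes peers are arranged on a ring by hash key and that each peer knows just its predecessor and successor; the slides do not specify the neighbor structure, so this layout, the key values, and the tiny hash space are assumptions for illustration.

```python
SPACE = 64  # assumed tiny hash space so the numbers are easy to follow


def dist(a, b):
    d = abs(a - b)
    return min(d, SPACE - d)


def build_ring(keys):
    """Each peer knows only its predecessor and successor on the ring."""
    ring = sorted(keys)
    n = len(ring)
    return {k: (ring[(i - 1) % n], ring[(i + 1) % n])
            for i, k in enumerate(ring)}


def route(neighbors, start, target_key):
    """Greedy multihop routing: forward to the neighbor closest to the key."""
    path = [start]
    current = start
    while True:
        best = min(neighbors[current], key=lambda p: dist(p, target_key))
        if dist(best, target_key) >= dist(current, target_key):
            return path  # no neighbor is closer: current plays the role of X
        current = best
        path.append(current)


neighbors = build_ring([5, 14, 23, 37, 48, 59])
hops = route(neighbors, start=5, target_key=40)
```

The last peer on the path holds the directory entry for key 40 and can answer which peer has the object. With only two neighbors per peer the hop count grows linearly with the number of peers; structured systems add more neighbor links to shorten it.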

22 Distributed Hash Table (DHT) in action
[Figure: nodes, each holding a (K, V) table]

23 DHT in action
[Figure: nodes, each holding a (K, V) table]

24 DHT in action: put()
Want to share a file: insert(K1, V1)
Operation: route the message "I have the file" to the node holding key K1

25 DHT in action: put()
Operation: take the key as input; route messages to the node holding the key
[Figure: (K1, V1) stored at that node]

26 DHT in action: get()
retrieve(K1)
Operation: retrieve message V1 at the node holding key K1

27 DHT in action
Retrieve the file according to V1
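
The put() and get() operations above can be condensed into a toy, single-process table. Routing is deliberately skipped here (the responsible node is looked up directly), and the class name, SHA-1 hashing, 32-bit key space, and example address are all assumptions made for the sketch.

```python
import hashlib


class TinyDHT:
    """Toy DHT: every node keeps a local (K, V) table for the keys it owns."""

    def __init__(self, node_names, bits=32):
        self.space = 2 ** bits
        # node hash key -> that node's local table
        self.nodes = {self._hash(name): {} for name in node_names}

    def _hash(self, text):
        digest = hashlib.sha1(text.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % self.space

    def _owner(self, key_hash):
        def d(node_key):
            diff = abs(node_key - key_hash)
            return min(diff, self.space - diff)
        return min(self.nodes, key=lambda n: (d(n), n))

    def put(self, key, value):
        # "I have the file": store location info V at the node holding K.
        table = self.nodes[self._owner(self._hash(key))]
        table.setdefault(key, set()).add(value)

    def get(self, key):
        # Retrieve V at the node holding K; the file is then fetched from V.
        table = self.nodes[self._owner(self._hash(key))]
        return sorted(table.get(key, set()))


dht = TinyDHT(["n1", "n2", "n3", "n4"])
dht.put("xyz.mp3", "192.0.2.7")   # example address of the sharing peer
locations = dht.get("xyz.mp3")
```

Note that the table stores where the file is, not the file itself; the final step on slide 27 is a direct transfer from the peer named in V1.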

28 Still Flooding
Still flood the network, although intermediate nodes do not need to search.
Can we avoid flooding?

29 CAN - Content Addressable Network
- Each peer is responsible for one zone, i.e., stores all (key, value) pairs of the zone
- Each peer knows the neighbors of its zone
- Random assignment of peers to zones at startup; split the zone if it is not empty
- Dimension-ordered multihop routing
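
Zone splitting at join time can be sketched as follows. The sketch assumes a two-dimensional unit square, splits the occupied zone in half along its longer side, and always hands the upper half to the newcomer; these rules and the names are simplifying assumptions, not details stated in the slides.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    owner: str
    x0: float
    x1: float
    y0: float
    y1: float

    def contains(self, x, y):
        return self.x0 <= x < self.x1 and self.y0 <= y < self.y1

    def area(self):
        return (self.x1 - self.x0) * (self.y1 - self.y0)


def join(zones, new_peer, point):
    """Split the zone containing `point` and give one half to new_peer."""
    x, y = point
    old = next(z for z in zones if z.contains(x, y))
    if (old.x1 - old.x0) >= (old.y1 - old.y0):
        mid = (old.x0 + old.x1) / 2
        new = Zone(new_peer, mid, old.x1, old.y0, old.y1)
        old.x1 = mid
    else:
        mid = (old.y0 + old.y1) / 2
        new = Zone(new_peer, old.x0, old.x1, mid, old.y1)
        old.y1 = mid
    zones.append(new)
    return new


zones = [Zone("A", 0.0, 1.0, 0.0, 1.0)]   # the first peer owns everything
b = join(zones, "B", (0.7, 0.2))          # in practice the point is random
c = join(zones, "C", (0.8, 0.9))
```

In a fuller version the (key, value) pairs falling into the new half would also be handed over to the joining peer.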

30 CAN: Object Publishing
node I::publish(K,V)

31 CAN: Object Publishing
node I::publish(K,V)
(1) a = h_x(K), giving x = a

32 CAN: Object Publishing
node I::publish(K,V)
(1) a = h_x(K), b = h_y(K), giving x = a, y = b

33 CAN: Object Publishing
node I::publish(K,V)
(1) a = h_x(K), b = h_y(K)
(2) route (K,V) -> J

34 CAN: Object Publishing
node I::publish(K,V)
(1) a = h_x(K), b = h_y(K)
(2) route (K,V) -> J
(3) J stores (K,V)

35 CAN: Object Retrieval
node I::retrieve(K)
(1) a = h_x(K), b = h_y(K)
(2) route "retrieve(K)" to J, which is in charge of (a,b)
[Figure: J holds (K,V)]
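
The publish and retrieve steps can be traced in code. This sketch assumes a fixed 4 x 4 grid of equal zones with one peer per zone and no wrap-around at the edges, and it derives h_x and h_y from SHA-1; all of these are simplifying assumptions for illustration.

```python
import hashlib

GRID = 4  # assumed: a fixed 4 x 4 grid of equal zones, one peer per zone


def h_x(key):
    return hashlib.sha1(("x:" + key).encode("utf-8")).digest()[0] % GRID


def h_y(key):
    return hashlib.sha1(("y:" + key).encode("utf-8")).digest()[0] % GRID


def route(src, dst):
    """Dimension-ordered routing: correct x first, then y, one zone per hop."""
    x, y = src
    path = [src]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


class Can:
    def __init__(self):
        # zone coordinates -> (key, value) pairs stored by that zone's peer
        self.store = {(x, y): {} for x in range(GRID) for y in range(GRID)}

    def publish(self, src, key, value):
        zone = (h_x(key), h_y(key))      # step (1): a = h_x(K), b = h_y(K)
        path = route(src, zone)          # step (2): route (K,V) to J
        self.store[zone][key] = value    # step (3): J stores (K,V)
        return path

    def retrieve(self, src, key):
        zone = (h_x(key), h_y(key))
        route(src, zone)                 # route "retrieve(K)" to J
        return self.store[zone].get(key)


can = Can()
publish_path = can.publish((0, 0), "K", "V")
value = can.retrieve((3, 3), "K")
```

Because every peer computes the same point (a, b) from K, publisher and retriever meet at the same peer J without any flooding.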

36 Maintenance
- Inform neighbors that you are alive at discrete time interval t
- If your neighbor does not send an alive message within time t, take over its zone
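
The alive-message rule can be sketched with explicit timestamps. The class and method names are hypothetical, zones are represented by plain labels, and the sketch ignores how several neighbors would agree on which of them takes over.

```python
class Peer:
    def __init__(self, name, zones, t):
        self.name = name
        self.zones = set(zones)    # zones this peer is responsible for
        self.t = t                 # alive-message interval
        self.last_alive = {}       # neighbor -> time of its last alive message
        self.neighbor_zones = {}   # neighbor -> zones it reported

    def on_alive(self, neighbor, zones, now):
        self.last_alive[neighbor] = now
        self.neighbor_zones[neighbor] = set(zones)

    def check_neighbors(self, now):
        """Take over the zones of neighbors silent for longer than t."""
        taken = []
        for neighbor, seen in list(self.last_alive.items()):
            if now - seen > self.t:
                self.zones |= self.neighbor_zones.pop(neighbor)
                del self.last_alive[neighbor]
                taken.append(neighbor)
        return sorted(taken)


peer = Peer("I", {"z1"}, t=10)
peer.on_alive("J", {"z2"}, now=0)
peer.on_alive("K", {"z3"}, now=5)
failed = peer.check_neighbors(now=12)   # J has been silent for 12 > 10
```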

37 P2P Benefits
- Efficient use of resources: use unused bandwidth, storage, and processing power at the edge of the network
- Scalability: consumers of resources also donate resources
- Reliability: replicas, geographic distribution; no single point of failure
- Ease of administration: self-organized nodes; built-in reliability and load balancing

38 Some Prototypes at UCF
- iSEE (Internet-scale Sensor Exploration Environment): publishing real-time sensor data; browsing and querying real-time sensor data
- P2P Video Streaming for VoD and Live Broadcast Applications