Published by Brandon Thornton. Modified over 9 years ago.
1
Introduction to Peer-to-Peer Networks
2
What is a P2P network?
A P2P network is a large distributed system. It uses the vast resources of PCs distributed at the edge of the Internet to build a network that allows resource sharing without any central authority.
Client-server vs. peer-to-peer: a peer is both a client and a server. Control is decentralized.
Much more than a system for sharing pirated music.
3
Why does P2P need attention?
4
A P2P network is an overlay network
A network of peers. Each link between peers consists of one or more IP links. The overlay network resides in the application layer.
[Figure labels: Alice, Bob, Carol]
5
Well-known P2P Systems
Napster, Gnutella, KaZaA, eDonkey, Chord, Tapestry, CAN, Pastry, BitTorrent
6
Some important issues
Search, storage, security, applications
7
A Distributed Storage Service
[Figure labels: Alice, Bob, Carol, David]
8
Promises
Consider file sharing as an example:
– Available 24/7
– Durable despite machine failures
– Information is protected
– Resilient to denial of service
9
Additional Goals
Massive scalability, anonymity, deniability, resistance to censorship
10
Challenges
A P2P network must be self-organizing. Join and leave operations must be self-managed. The infrastructure is untrusted and the components are unreliable. The number of faulty nodes grows linearly with system size. Yet the aggregate behavior has to be trustworthy.
11
Challenges
– Tolerance to failures and churn
– Efficient routing even if the structure of the network is unpredictable
– Dealing with freeriders
– Load balancing
– Security issues
12
Looking up data
How do you locate data, files, or objects in a scalable manner in a large P2P system built around a dynamic set of nodes, without any centralized server or hierarchy?
Napster's index servers used a central database, which gave questionable scalability and poor resilience.
Check how names are looked up in the Internet's DNS.
13
Napster
Developed by Shawn Fanning in 1999; shut down after 2 years for copyright infringement. Centralized directory servers were a bottleneck.
[Figure labels: Root/Redirector, Directory servers, Internet, Users. Stores indices of songs only.]
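The centralized directory on this slide can be illustrated with a minimal sketch. It assumes a single in-memory index that maps song titles to the peers sharing them. The class name `DirectoryServer`, its methods, and the peer addresses are hypothetical and are not taken from Napster's actual protocol. Only the index lives on the server; the files themselves stay on the peers.

```python
from collections import defaultdict


class DirectoryServer:
    """Hypothetical central index: maps song titles to peers that hold them."""

    def __init__(self):
        self.index = defaultdict(set)  # song title -> set of peer addresses

    def register(self, peer, songs):
        # A peer announces the songs it shares; only index entries are stored.
        for song in songs:
            self.index[song].add(peer)

    def unregister(self, peer):
        # Remove a departing peer from every index entry.
        for peers in self.index.values():
            peers.discard(peer)

    def lookup(self, song):
        # Return the peers that claim to hold the song (sorted for determinism).
        return sorted(self.index.get(song, set()))


# Made-up peers and titles, for illustration only.
directory = DirectoryServer()
directory.register("alice:6699", ["Double Helix", "Song A"])
directory.register("bob:6699", ["Double Helix"])
holders = directory.lookup("Double Helix")
```

Every lookup goes through this one object, which is why the slide calls the directory servers a bottleneck: if the index is unavailable, no peer can find anything.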
14
Gnutella
A truly decentralized system. A search such as “Where is Double Helix?” works by flooding the query over a graph of arbitrary topology. This has an obvious scalability problem, and the wasted bandwidth caused serious inefficiencies.
15
Gnutella graph
[Figure labels: client looking for “double helix”; “double helix”]
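Query flooding can be sketched as a breadth-first traversal with a hop limit. This is a simplified model, not Gnutella's wire protocol. The function `flood_search`, the `ttl` parameter, and the five-node graph are assumptions made for illustration. A node forwards the query to every neighbor, and each forwarded copy is counted as one message, which shows where the bandwidth goes.

```python
from collections import deque


def flood_search(graph, holders, start, wanted, ttl):
    """Flood a query from `start`; return (peers holding `wanted`, messages sent)."""
    seen = {start}
    queue = deque([(start, ttl)])
    found = []
    messages = 0
    while queue:
        node, hops_left = queue.popleft()
        if wanted in holders.get(node, ()):
            found.append(node)
        if hops_left == 0:
            continue  # hop limit reached: this node does not forward further
        for neighbor in graph[node]:
            messages += 1  # every forwarded copy costs bandwidth
            if neighbor not in seen:  # a node processes each query only once
                seen.add(neighbor)
                queue.append((neighbor, hops_left - 1))
    return found, messages


# Hypothetical overlay of five peers; only E holds the file.
graph = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "E"],
    "D": ["B", "E"],
    "E": ["C", "D"],
}
holders = {"E": {"double helix"}}
found, messages = flood_search(graph, holders, "A", "double helix", ttl=2)
```

In this small graph the query reaches E with a hop limit of 2 but misses it with a hop limit of 1. The message count grows with the number of overlay links reached, not with the number of peers that actually hold the file.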
16
Unstructured vs. Structured
Unstructured P2P networks allow resources to be placed at any node. The network topology is arbitrary, and the growth is spontaneous.
Structured P2P networks simplify resource location and load balancing by defining a topology and defining rules for resource placement.
17
Distributed Hash Table (DHT)
Object-to-machine mapping uses unique keys.
H(object name) = key (H = hash function)
H(machine name) = key
An object whose name is mapped to key k is placed on the machine whose name is mapped to key k. This simplifies object location.
18
Distributed Hash Table (DHT): basic idea
[Figure: keyspace from 0 to N-1 with points a, c, b marked; a machine name and an object name are both hashed to b]
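The basic idea can be sketched in a few lines. In the slide's idealized picture, the object name and the machine name hash to exactly the same key. In practice an exact match is unlikely, so this sketch assumes a successor rule: the object goes to the first machine whose key is at or after the object's key, wrapping around at the end of the keyspace. That rule, the keyspace size, the use of SHA-1, and the helper names `build_ring` and `locate` are all assumptions, not part of the slides.

```python
import hashlib
from bisect import bisect_left

KEYSPACE = 2 ** 16  # N: keys range over 0 .. N-1 (size chosen for illustration)


def H(name):
    """Hash a name to a key in 0 .. KEYSPACE-1."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % KEYSPACE


def build_ring(machine_names):
    # Sorted list of (key, machine name) pairs along the keyspace.
    return sorted((H(m), m) for m in machine_names)


def locate(ring, object_name):
    # Assumption: store on the first machine whose key is >= the object's key,
    # wrapping around to the smallest machine key past the end of the keyspace.
    keys = [k for k, _ in ring]
    i = bisect_left(keys, H(object_name)) % len(ring)
    return ring[i][1]


# Hypothetical machines; any peer can compute the owner locally.
ring = build_ring(["alice", "bob", "carol"])
owner = locate(ring, "double helix")
```

Because every peer applies the same hash function, any peer can work out which machine is responsible for an object without consulting a central index or flooding the network.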