Algorithms and Techniques in Structured Scalable Peer-to-Peer Networks Mohammed Nadeem Ahmed Department of Computer Science and Engineering Professor: Dr. Gautam Das T.A : Mr. Ashish Chawla
Outline What is Peer-to-Peer? Types of Peer-to-Peer Networks Chord Unstructured P2P Networks Napster, Gnutella & KaZaA Structured P2P Networks Chord Overview & Implementation Distributed Hash Tables Routing in Chord Chord Features Chord Performance Comparison in performances in Unstructured and Structured P2P networks
What is Peer-to-Peer Sharing of Free MP3 Files !! : ) Peer-to-Peer unlike Traditional Client-Server models Scalable Flexible Fault-Tolerant Easily Deployable
Types of Peer-to-Peer Networks Unstructured P2P Networks No global protocol Maintained by a Central Server or Peer Leader Easy to Deploy & Maintain Poor Search Efficiency Not Scalable
Types of Peer-to-Peer Systems Structured P2P Networks Global protocol ( Hash Functions) High Search Efficiency Natural Load Balancing Highly Scalable
Examples of Unstructured P2P Systems Napster Peers update central server about Content Peers get files by querying Central Server File transfer directly between peers
Examples of Unstructured P2P Systems Gnutella Peers form a logical overlay network Queries are broadcasted to neighbors Neighbors broadcast to their neighbors (Flooding) Query radius is limited (Not Scalable)
Examples of Unstructured P2P Systems KaZaA Hybrid approach Concept of Peer-leaders (just like C.S in Napster) Peer-Leaders broadcast queries to other Peer-Leaders. (just like in Gnutella)
Problems with Unstructured P2P What we have? Centralized : Napster Table size – O (n) Number of Hops – O (1) Flooding of Queries : Gnutella Table Size – O (1) Number of Hops – O (n)
What we want from P2P Efficiency Scalability Robustness: O (log (N)) messages per look up (where N is the total number of nodes in the network) Scalability O (log (N)) states per node Robustness: Ability to handle a lot of instabilities
Chord Overview Provides just one operation: Lookup (key) -> Node (IP Address) Chord is a lookup service, not a search engine. It’s a building block for P2P applications
Distributed Hash Tables The resource file (e.g. monalisa.jpg) is translated into a 160 bit identifier through the hash function SHA - 1 160 bit resource ID Similarly the node (IP address) is also translated into a 160 bit identifier through the same function. 160 bit node ID Node with the NodeID closest to that resource name ID is responsible for that resource.
Scalable Routing in Chord Chord can scale millions of nodes Uses Finger Tables Each node is responsible for a bunch of nodes In finger table row I at node n = successor (n + 2^i-1) Finger intervals increase with distance from node n If close, short hops If far, long hops Thus in a N-node network, needs to contact only O (log (N) nodes to find any node.
Chord Features Load Balancing Decentralization Scalability Uses distributed Hash Function Decentralization No node is as important as any other Scalability Only uses O (log (N)) to reach N nodes Availability Node responsible for a key can always be found Robustness Minimal join/leave disruption
Average messages per look up Chord Performance Average messages per look up Number of nodes
Summary: Unstructured vs Structured P2P No global protocol used Maintained by a C.S or Peer-Leader All nodes need to know every other node (O (n) Look-ups) Structured Global protocol involved (Hash Function) Completely Distributed Nodes need to know only a handful of others ( O log (N) Look-ups)
Unstructured Vs Structured P2P Contd: Unstable when nodes frequently join or leave. Scalability Problems Phrase-match queries Structured Stable in all conditions Highly Scalable Exact-Key match queries
Thank You