Published by Amie Allison. Modified over 6 years ago.
1
Early Measurements of a Cluster-based Architecture for P2P Systems
Yinglian Xie, Carnegie Mellon University
Balachander Krishnamurthy and Jia Wang, AT&T Labs-Research
2
Motivation
Peer-to-peer (P2P) applications provide a new content service model
  End hosts self-organize into an overlay network and share content with each other
Wide deployment of P2P applications requires
  A scalable content location and routing scheme in the application layer
  An understanding of P2P traffic patterns
11/8/2018
3
Recent Work
Existing approaches for content location
  Napster: uses a centralized server
  Gnutella: relies on flooding of queries
Recent designs
  Distributed indexing schemes based on hash functions
  CAN, Chord, Pastry, Tapestry
4
Our Work
A cluster-based architecture for P2P systems (CAP)
  Example application: distributed search (supports keyword searching)
  Design: uses network-aware clustering
Early measurements of CAP
  Trace analysis + simulations
5
CAP System Design
Network-aware clustering
  B. Krishnamurthy and J. Wang. On Network-Aware Clustering of Web Clients. In Proceedings of ACM SIGCOMM, August 2000.
  An effective technique to group clients that are topologically close and under a common administrative domain
Apply network-aware clustering to P2P applications
  An additional level in the hierarchy
  Less dynamism
  More scalability
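The grouping step can be sketched as a longest-prefix match of each client address against a list of network prefixes, which is the approach described in the cited paper. This is only an illustration: the function name, the prefix list, and the client addresses below are made up (they use documentation address ranges), and the sketch is not the authors' implementation.

```python
import ipaddress
from collections import defaultdict


def cluster_clients(client_ips, prefixes):
    """Group client IPs by their longest matching network prefix."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    clusters = defaultdict(list)
    for ip_text in client_ips:
        ip = ipaddress.ip_address(ip_text)
        matches = [n for n in networks if ip in n]
        if not matches:
            # No known prefix covers this client; keep it in a catch-all group.
            clusters[None].append(ip_text)
            continue
        # The most specific (longest) prefix identifies the cluster.
        best = max(matches, key=lambda n: n.prefixlen)
        clusters[str(best)].append(ip_text)
    return dict(clusters)


# Hypothetical prefixes and clients, for illustration only.
PREFIXES = ["192.0.2.0/24", "192.0.2.128/25", "198.51.100.0/24"]
CLIENTS = ["192.0.2.10", "192.0.2.200", "198.51.100.7", "203.0.113.5"]
clusters = cluster_clients(CLIENTS, PREFIXES)
```

A linear scan over the prefixes keeps the sketch short; a real prefix table would normally be stored in a trie for fast lookups.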
6
CAP Architecture
[Figure: clients, delegates, and the clustering server]
Three entities
  Clustering server
  Delegate
  Client
Two operations
  Node join and node leave
  Query lookup
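The slide names the entities and operations but not their interfaces, so the sketch below is one possible reading rather than the CAP design itself. It assumes that the clustering server maps a joining client to its cluster, and that each cluster's delegate keeps a directory of the files held by clients in that cluster. Every class, method, and identifier here is hypothetical.

```python
class Delegate:
    """Per-cluster directory: maps file names to the clients that hold them."""

    def __init__(self, cluster_id):
        self.cluster_id = cluster_id
        self.index = {}  # file name -> set of client ids

    def register(self, client_id, files):
        for name in files:
            self.index.setdefault(name, set()).add(client_id)

    def unregister(self, client_id):
        for name in list(self.index):
            self.index[name].discard(client_id)
            if not self.index[name]:
                del self.index[name]  # drop files nobody holds any more

    def lookup(self, name):
        return set(self.index.get(name, ()))


class ClusteringServer:
    """Maps each joining client to a cluster and that cluster's delegate."""

    def __init__(self, cluster_of):
        self.cluster_of = cluster_of  # assumed callable: client id -> cluster id
        self.delegates = {}

    def join(self, client_id, files):
        cluster_id = self.cluster_of(client_id)
        if cluster_id not in self.delegates:
            self.delegates[cluster_id] = Delegate(cluster_id)
        delegate = self.delegates[cluster_id]
        delegate.register(client_id, files)
        return delegate

    def leave(self, client_id):
        delegate = self.delegates.get(self.cluster_of(client_id))
        if delegate is not None:
            delegate.unregister(client_id)


# Toy cluster function: the part of the id before the dot names the cluster.
server = ClusteringServer(cluster_of=lambda cid: cid.split(".")[0])
delegate_a = server.join("a.1", ["song.mp3"])
server.join("a.2", ["song.mp3", "paper.pdf"])
before_leave = delegate_a.lookup("song.mp3")
server.leave("a.1")
after_leave = delegate_a.lookup("song.mp3")
```

In the sketch the delegate is a plain object; in a deployed system it would be a node in the cluster, and join and leave would be network messages.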
7
Inter-cluster Routing
Each query has a maximum search depth
Each delegate keeps a neighbor list
  Assigned randomly when the delegate joins the network
  Updated gradually based on application requirements
Depth-first search among neighbors
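A minimal sketch of a depth-bounded, depth-first lookup over delegate neighbor lists follows. The slide does not say whether a failed branch is retried through another neighbor or how visited delegates are tracked, so the backtracking and the visited set are assumptions, as are all names and the toy topology.

```python
def dfs_lookup(start, name, neighbors, index, max_depth):
    """Depth-first search over delegates, bounded by max_depth hops.

    neighbors: delegate id -> list of neighbor delegate ids
    index:     delegate id -> set of file names known in that cluster
    Returns (id of a delegate that knows the file, or None; forwards used).
    """
    visited = set()
    forwards = 0

    def visit(node, depth):
        nonlocal forwards
        visited.add(node)
        if name in index.get(node, ()):
            return node
        if depth == max_depth:
            return None  # search depth exhausted on this branch
        for nxt in neighbors.get(node, []):
            if nxt in visited:
                continue
            forwards += 1  # one query message forwarded to a neighbor
            found = visit(nxt, depth + 1)
            if found is not None:
                return found
        return None

    return visit(start, 0), forwards


# Hypothetical four-delegate overlay.
NEIGHBORS = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
INDEX = {"A": set(), "B": set(), "C": {"x"}, "D": {"y"}}
deep_hit = dfs_lookup("A", "y", NEIGHBORS, INDEX, max_depth=2)
too_shallow = dfs_lookup("A", "y", NEIGHBORS, INDEX, max_depth=1)
```

With a depth limit of 2 the query reaches delegate D through B; with a limit of 1 it tries B and C and then gives up.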
8
CAP Evaluation
Collect Gnutella traces and apply network-aware clustering in trace data analysis
  To examine the potential advantage of using network-aware clustering
Trace-driven simulations
Measure CAP system performance based on real deployment (ongoing work)
9
Collecting Gnutella Trace
A modified open-source Gnutella client (gnut) passively monitors and logs all Gnutella messages

Table 1: Traces with unlimited connections
Location   Trace length   Number of IP addresses
CMU        10 hours       799,386
AT&T       14 hours       302,262
ACIRI      6 hours        185,905

Table 2: Traces with limited connections
Location   Trace length   Number of IP addresses
CMU        89 hours       301,025
AT&T       139 hours      261,094
UKY        96 hours       409,084
           75 hours       292,759
WPI        10 hours       69,285
10
Cluster Distribution
CMU trace: 5/24/2001 – 5/25/2001, 799,386 IP addresses, 45,129 clusters
Clustering helps reduce query latency by caching repeated queries
11
Client and Cluster Distribution along Time
Network-aware clustering helps reduce dynamism in the P2P network
12
Simulation
Trace-driven simulation
  Use the Gnutella trace to generate "join", "leave", and "search" events
  Assume the query distribution follows the file distribution
Performance metrics
  Hit rate
  Overhead
  Search latency
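One way to read "the query distribution follows the file distribution" is that a file is queried with probability proportional to the number of copies of it in the network. The sketch below draws queries that way and computes a hit rate as the fraction of queries that find their target. The function names, the copy counts, and the notion of a "reachable" set are assumptions for illustration, not the simulator used in the study.

```python
import random


def generate_queries(file_copies, n_queries, seed=0):
    """Draw query targets with probability proportional to each file's copy count."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    names = sorted(file_copies)
    weights = [file_copies[n] for n in names]
    return rng.choices(names, weights=weights, k=n_queries)


def hit_rate(queries, reachable_files):
    """Fraction of queries whose target is among the files the search can reach."""
    if not queries:
        return 0.0
    hits = sum(1 for q in queries if q in reachable_files)
    return hits / len(queries)


# Hypothetical copy counts: file "a" is far more widely replicated than "b" or "c".
FILE_COPIES = {"a": 8, "b": 1, "c": 1}
queries = generate_queries(FILE_COPIES, 1000)
rate = hit_rate(queries, reachable_files={"a"})
```

Because popular files are both queried more often and stored in more places, a search that reaches only well-replicated files can still score a high hit rate under this assumption.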
13
Hit Rate
Use CMU trace
  1,000-node stationary network
  311 clusters
  4,615 search messages
  3,793 unique files
14
Overhead and Search Latency
Overhead
  Messages per search, forward operations per delegate
  In Gnutella, overhead grows exponentially
  In CAP, overhead grows linearly
Search latency
  Application-level hop length
  In CAP, search path length is short
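The slide does not state what the growth is measured against. Assuming it is search depth, the contrast can be illustrated with two idealized message counts: flooding on a loop-free overlay where every node has the same number of neighbors, and a single forwarding chain that sends one message per hop. Both models and both function names are simplifications introduced here, not measurements from the study.

```python
def flooding_messages(degree, ttl):
    """Messages for one flooded query on an idealized loop-free overlay.

    The origin sends to `degree` neighbors; every receiver forwards to its
    other `degree - 1` neighbors until the TTL runs out.
    """
    total = 0
    frontier = degree
    for _ in range(ttl):
        total += frontier
        frontier *= degree - 1
    return total


def chain_messages(max_depth):
    """Messages for one query forwarded along a single path, one per hop."""
    return max_depth


# Hypothetical overlay degree of 4, depths 1 through 5.
flood_counts = [flooding_messages(4, depth) for depth in range(1, 6)]
chain_counts = [chain_messages(depth) for depth in range(1, 6)]
```

Under these assumptions the flooding count multiplies by roughly `degree - 1` with each extra hop, while the chain count rises by one.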
15
Summary
CAP is a promising way to increase the stability and scalability of distributed applications
Ongoing work: we are implementing CAP, deploying it on machines around the world, and measuring its performance