Download presentation
Presentation is loading. Please wait.
1
Rendezvous Points-Based Scalable Content Discovery with Load Balancing Jun Gao Peter Steenkiste Computer Science Department Carnegie Mellon University October 24th, 2002 NGC 2002, Boston, USA
2
Jun GaoCarnegie Mellon University2 Outline Content Discovery System (CDS) Existing solutions CDS system design Simulation evaluation Conclusions
3
Jun GaoCarnegie Mellon University3 Content Discovery System (CDS) Example: a highway monitoring service Cameras and sensors monitor road and traffic status Users issue flexible queries CDS enables content discovery Locate contents that match queries Example services Service discovery; P2P; pub/sub; sensor networks Snapshot from traffic.com
4
Jun GaoCarnegie Mellon University4 Comparison of Existing Solutions Searchability Centralized Distributed Registration flooding Query broadcasting Hash-based Graph-based Tree-based Scalability Robustness yes Hierarchical names Look-up yes?no yes no yes no Load balancing yes?no yes? no Design Goals Solutions
5
Jun GaoCarnegie Mellon University5 CDS Design Attribute-value pair based naming scheme Enable searchability Peer-to-peer system architecture Robust distributed system Rendezvous Points-based content discovery Improve scalability Load Balancing Matrix Dynamic balance load
6
Jun GaoCarnegie Mellon University6 Naming Scheme Based on Attribute-Value pairs CN: {a 1 =v 1, a 2 =v 2,..., a n =v n } Not necessarily hierarchical Attribute can be dynamic Searchable via subset matching Q CN Number of matches for a CN is large 2 n -1 Camera ID = 5562 Highway = I-279 Exit = 4 City = Pittsburgh Speed = 25mph Road condition = Icy Highway = I-279 Exit = 4 City = Pittsburgh Q1 Q2 CN1 City = Pittsburgh Speed = 25mph
7
Jun GaoCarnegie Mellon University7 Distributed Infrastructure Hash-based overlay substrate Routing, forwarding, management Node ID Hash function H(node) Application layer publishes contents or issues queries CDS layer determines where to register contents and send queries Centralized and network-wide flooding are not scalable Idea: use a small set of nodes as Rendezvous Points TCP/IP Hash-based Overlay CDS Application N1 N6 N4 N5 N3 N2 N9 N8 N7
8
Jun GaoCarnegie Mellon University8 RP-based Scheme Hash each AV-pair to get a set of RPs | RP | = n RP node stores names that share the same pair Maintain with soft state Query is sent directly to an RP node Use the least loaded RP RP node fully resolves locally N3 N5 CN1 N4 N6 CN1: {a1=v1, a2=v2, a3=v3, a4=v4} RP2 CN2 CN2: {a1=v1, a2=v2, a5=v5, a6=v6} N2 N1 RP1 N i H(a i =v i ) N7 N8 N9 ? Q:{a1=v1, a2=v2}
9
Jun GaoCarnegie Mellon University9 System Properties Efficient registration and query O(n) registration messages; n small O(m) messages for query with probing Hashing AV-pair individually ensures subset matching Query may contain only 1 AV-pair No inter-RP node communication for query resolution Tradeoff between CPU and Bandwidth Load is spread across nodes Different names use different RP set
10
Jun GaoCarnegie Mellon University10 Load Concentration Problem RP node may be overloaded Some AV-pairs more popular than others Speed=55mph vs. Speed=95mph P2P keyword follows Zipf distribution However, many nodes are underutilized Intuition: use a set of nodes to share load caused by popular pairs Challenge: accomplish load balancing in a distributed and self-adaptive fashion 10 100 1000 10000 100000 1 10 100 1000 10000 AV-pair rank #of names Example Zipf distribution
11
Jun GaoCarnegie Mellon University11 Load Balancing Matrix (LBM) Organize nodes into a logical matrix Each column holds a partition Rows are replicas of each other Node IDs are determined by: H(a1=v1, p, r) N 1 (p,r) Matrix expands itself to accommodate extra load Increase P when registration load reaches threshold Query load R 1,12,1 3,1 1,22,2 3,2 1,32,3 3,3 CN1 CN2 CN3 CN4 CN3CN4 Q1 Q2 Q3 Q4 Q3 Q4 LBM for AV-pair: {a1=v1} R P
12
Jun GaoCarnegie Mellon University12 Registration and Query with LBM Determine LBM size Register with one random column of each matrix Compute IDs locally 1,1 1,2 2,1 2,2 1,32,3 CN1:{a1v1, a2v2, a3v3} 0,0 P=?, R=? P=3, R=3 LBM1 LBM2 LBM3 3,1 3,2 3,3 Q:{a1v1, a2v2} Load is balanced within an LBM Registration Query can be sent to the matrix of any AV-pair Use LBM with min(P) Sent to one random node in each column Query
13
Jun GaoCarnegie Mellon University13 System Properties with LBM Registration and query cost for one pair increases O(R) registration messages O(P) query messages Matrix size depends on current load LBM must be kept small for efficiency Query optimization helps, e.g., large P small R Matrix shrinking mechanism E.g., May query a subset of the partitions Load on each RP node is upper-bounded Efficient processing Underutilized nodes are recruited as LBM expands
14
Jun GaoCarnegie Mellon University14 Simulation Evaluation Implement in an event-driven simulator Each node monitors its registration and query load Assume Chord-like underlying routing mechanism Experiment setup 10,000 nodes in the CDS network 10,000 distinct AV-pairs (50 attributes, 200 values/attribute) Use synthetic registration and query workload Performance metric: success rate System should maintain high success rate as load increases
15
Jun GaoCarnegie Mellon University15 Workload Rank of AV-pairs Number of names Number of queries Registration load Query load
16
Jun GaoCarnegie Mellon University16 Registration Success Rate Comparison Avg. registration arrival rate (1000 reg/sec) Registration success rate (%) Threshold = 50 reg/sec Poisson arrival P max = 10 100,000 names 1 registration every 10 secs.
17
Jun GaoCarnegie Mellon University17 Query Success Rate Comparison Avg. query arrival rate (1000q/sec) Query success rate (%) 1 million users, 1 query every 10 secs. Threshold = 200 q/sec Poisson arrival R max = 10
18
Jun GaoCarnegie Mellon University18 Conclusions Proposed a distributed and scalable design to the content discovery problem RP-based approach addresses scalability Avoid flooding LBMs improve system throughput Balance load Distributed algorithms Decisions are made locally
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.