A Prediction-based Fair Replication Algorithm in Structured P2P Systems
Xianshu Zhu, Dafang Zhang, Wenjia Li, Kun Huang
Presented by: Xianshu Zhu
College of Computer & Communication, Hunan University, P.R. China
Outline
- Introduction
- Contribution
- PFR (Prediction-based Fair Replication)
- Performance Evaluation
- Conclusion and Future Work
Introduction
- Query Hotspot
- Structured Peer-to-Peer Network
- Summary of Replication Schemes
Query Hotspot
[Figure: overlay nodes B to J and a file]
Query Hotspot: the number of requests for popular objects increases dramatically, leading to dropped queries and severe performance failures.
Structured P2P Network
Advantages:
- Scalability
- Efficient Searching
Disadvantages:
- The Implementation of a Structured P2P Network Assumes that All Data Items Are of the Same Popularity
- No Mechanism Handles the Hotspot Problem
Replication Schemes
Basic Idea:
- Distribute Replicas of the Popular Data Items to Various Lightly Loaded Nodes
- Fairly Distribute Load onto Each Node
When Applying a Replication Technique:
- Replica Creation: Time, Number, Location
- Replica Utilization
Replication Schemes
Classification According to Replica Location:
- Path Replication
- Owner Replication
- Random Replication
[Figure: nodes A to F with many file replicas]
- High Replication Overhead
Replication Schemes
Classification According to Replica Location:
- Path Replication
- Owner Replication: Gopalakrishnan proposed LAR
- Random Replication
[Figure: nodes A to F with replicas of File A, File B, and File D]
Drawbacks of Owner Replication:
1. New Query Hotspot
2. Low Replication Speed
Replication Schemes
Classification According to Replica Location:
- Path Replication
- Owner Replication
- Random Replication
[Figure: nodes A to F with file replicas]
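The three placement schemes can be contrasted with a minimal sketch. It uses the common textbook definitions (path: every node on the query path; owner: only the requesting node; random: randomly chosen nodes on the path), which the slides themselves do not spell out. The function names, the path `A` to `F`, and the convention that the first node is the requester and the last node is the provider are all assumptions made for illustration.

```python
import random

def path_replication(path):
    """Place a replica on every node of the query path except the provider."""
    return list(path[:-1])

def owner_replication(path):
    """Place a replica only on the requesting node (first node of the path)."""
    return [path[0]]

def random_replication(path, count, rng=None):
    """Place replicas on `count` nodes drawn at random from the query path."""
    rng = rng or random.Random()
    candidates = list(path[:-1])  # the provider already holds the file
    return rng.sample(candidates, min(count, len(candidates)))

# Hypothetical query path: A issues the query, F holds the file.
path = ["A", "B", "C", "D", "E", "F"]
print(path_replication(path))   # ['A', 'B', 'C', 'D', 'E']
print(owner_replication(path))  # ['A']
print(random_replication(path, 2, random.Random(0)))
```

Under these definitions, path replication creates the most copies per query and owner replication the fewest, which matches the overhead and speed drawbacks listed on the slides.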
Contribution
Design Goals:
- Reduce Dropped Queries While Introducing Only Minimal Replication Overhead
- Minimize the Drawbacks of the LAR Algorithm (Owner Replication)
Approach: a Prediction-based Fair Replication Algorithm (PFR) that Can Distribute Load Almost Fairly onto Each Node, So As to Meet the Above Design Goals.
Contribution
Fairness Goal of PFR:
- Adaptively Determine the Replication Speed and Replication Location According to Each Node's Predicted Load Fraction
[Figure: nodes A to G]
PFR: Appropriate Replication Time
To Keep System Performance at a High Level, Preventive Actions Should Be Taken Before a Query Hotspot Actually Happens.
Period Exponential Weight Prediction Algorithm:
Predict(n+1) = Current(n) + PredictDiff(n+1)
- PredictDiff(n+1): Predicted Possible Traffic Difference Between the nth and (n+1)th Interval
[Figure: timeline of intervals n-1, n, n+1 with the current time and Predict(n+1) marked]
PFR: Appropriate Replication Time
Period Exponential Weight Prediction Algorithm:
- Incurs Only Low Computation Overhead
- Applicable to Online Prediction
Our Replication Strategy Is Set Based on the Predicted Load
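The prediction rule Predict(n+1) = Current(n) + PredictDiff(n+1) can be sketched as follows. The slides do not say how PredictDiff is computed, so this sketch assumes an exponentially weighted average of past interval-to-interval differences; the class name `LoadPredictor`, the smoothing factor `alpha`, and the sample loads are all hypothetical.

```python
class LoadPredictor:
    """Online predictor: Predict(n+1) = Current(n) + PredictDiff(n+1)."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha      # assumed weight given to the newest difference
        self.prev = None        # Current(n-1)
        self.pred_diff = 0.0    # running PredictDiff

    def observe(self, current):
        """Feed Current(n) and return Predict(n+1)."""
        if self.prev is not None:
            diff = current - self.prev
            # Assumed update rule: exponentially weighted average of differences.
            self.pred_diff = self.alpha * diff + (1 - self.alpha) * self.pred_diff
        self.prev = current
        return current + self.pred_diff

# Hypothetical per-interval query counts on one node.
predictor = LoadPredictor(alpha=0.5)
predictions = [predictor.observe(load) for load in [10, 12, 16, 24]]
print(predictions)  # [10.0, 13.0, 18.5, 29.25]
```

Each update is a constant number of arithmetic operations and needs only the previous observation, which is in line with the low overhead and online use claimed for the algorithm.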
PFR: Fairly-decided Replication Speed
Replication Speed = (the Number of Nodes Chosen to Hold Replicas) / (the Number of All Nodes Encountered Along the Query Path)
[Figure: query path through nodes A to F; replication speed 3/6]
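The replication speed ratio can be computed directly from its definition. The sketch below reproduces the 3/6 value from the slide; which three nodes hold replicas is not recoverable from the figure, so the holder list here is hypothetical, as is the function name.

```python
from fractions import Fraction

def replication_speed(replica_holders, path_nodes):
    """Nodes chosen to hold replicas divided by all nodes on the query path."""
    if not path_nodes:
        raise ValueError("query path must contain at least one node")
    return Fraction(len(replica_holders), len(path_nodes))

# Six nodes on the path, three of them (hypothetically B, D, F) chosen as holders.
speed = replication_speed(["B", "D", "F"], ["A", "B", "C", "D", "E", "F"])
print(speed, float(speed))  # 1/2 0.5
```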
PFR: Fairly-decided Replication Speed
Replication Level (Node Homogeneity Assumed):
[Figure: replication speed versus predicted load fraction]
- Replication Speed Levels: DON'T Create Replicas, 1, N/4, N/2, 3N/4, N
- Predicted Load Fraction Marks: 0.3, 0.5, 0.6, 0.7, 0.8, 1
- N: Total Number of Nodes Along a Query Path
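One way to read the replication levels is as a step function from predicted load fraction to the number of replicas. The pairing of load-fraction marks with speed levels is not recoverable from the slide, so the thresholds below are an assumption, chosen only so that they agree with the two values in the worked example that follows (RS = N/4 = 1 at load 0.55 with N = 6, and RS = N at load 0.9). The function name, the rounding down, and the treatment of boundary values are also assumptions.

```python
def replica_count(load_fraction, n):
    """Map a predicted load fraction to a replica count (assumed thresholds)."""
    if load_fraction < 0.3:
        return 0                      # DON'T create replicas
    if load_fraction < 0.5:
        return 1
    if load_fraction < 0.6:
        return max(1, n // 4)         # N/4, rounded down
    if load_fraction < 0.7:
        return max(1, n // 2)         # N/2
    if load_fraction < 0.8:
        return max(1, (3 * n) // 4)   # 3N/4, rounded down
    return n                          # N

# N = 6 nodes along the query path.
print(replica_count(0.55, 6))  # 1
print(replica_count(0.9, 6))   # 6
print(replica_count(0.15, 6))  # 0
```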
PFR: Replication & Replica Utilization
[Figure: animated example on a query path through nodes A to G, N = 6]
- Predicted Load Fractions Shown: A: 0.9, B: 0.3, C: 0.55, D: 0.3, E: 0.15, F: 0.25
- C: File, RS: N/4 = 1; figure label "E:C"
- A: File, RS: N; figure labels "B,D,E,F:A" and "D:A"
Performance Evaluation
Simulator: Highly Modified Chord Simulator from MIT and LAR Implementation Code
Simulation Parameters:
- System Size: 1000
- Time Each Network Hop Takes: 25 ms
- Number of Data Items: (value not shown)
- Average System Load: 25%
- Node Capacity: 10 per sec
- Number of Queries Generated per Sec: 500
- Node's Queue Length: 32
- Prediction Interval: 1 s
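As a rough sanity check, the listed parameters appear mutually consistent under a simple model. The slides do not say how the 25% average load was obtained; the sketch assumes that the system size counts nodes, that node capacity means queries handled per second, that every hop of a lookup consumes one unit of capacity on one node, and that a Chord lookup takes about (1/2) log2 N hops on average.

```python
import math

nodes = 1000              # system size (assumed to count nodes)
capacity_per_node = 10    # queries a node can handle per second (assumed meaning)
queries_per_sec = 500     # queries generated per second, system-wide

# Assumed average Chord lookup path length of (1/2) * log2(N) hops.
avg_hops = 0.5 * math.log2(nodes)

# Each hop is assumed to cost one unit of capacity on the node it visits.
load = queries_per_sec * avg_hops / (nodes * capacity_per_node)
print(f"{avg_hops:.2f} hops, load {load:.1%}")  # 4.98 hops, load 24.9%
```

Under these assumptions the estimate lands close to the 25% average system load in the parameter list.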
Performance Evaluation: Number of Queries Dropped Over Time
Workload: 90% of the Input Queries Are Directed to 1 Item
[Figure: number of queries dropped over time, LAR vs. PFR; annotation: 28%]
Performance Evaluation: Total Number of Documents Replicated
[Figure: LAR vs. PFR]
Performance Evaluation: Total Number of Finger Tables Replicated
[Figure: LAR vs. PFR]
Performance Evaluation: Total Number of Replica Location Hints Created
[Figure: LAR vs. PFR]
Conclusion
The Prediction-based Fair Replication Algorithm Can Conduct Fair Replication through:
- Appropriate Replication Time
- Fairly-decided Replication Speed
- Fairly-decided Replication Location
- High Replica Utilization Rate
Performance Evaluation:
- Notably Decreases the Number of Dropped Queries
- Low Replication Overhead
Future Work
- Taking Node Heterogeneity into Consideration
Thank you!