Published by Gwendolyn Matilda Briggs. Modified over 9 years ago.
1
Evaluating the Running Time of a Communication Round over the Internet
Omar Bakr (MIT), Idit Keidar (MIT/Technion)
PODC 2002
2
© Omar Bakr and Idit Keidar; PODC July 2002
Communication Round
Exchange of information from all hosts to all hosts
Part of many distributed algorithms and systems
–consensus, atomic commit, replication, ...
3
Common Metric for Evaluating Algorithms
Number of rounds (or steps) they require
4
Questions
What is the best way to implement a communication round over the Internet?
–decentralized vs. centralized
How long is a communication round over the Internet?
5
Prediction is Hard
Internet is unpredictable, diverse, …
Different answers for different topologies, different times
Different performance metrics
–local running time: the time one host is engaged in the algorithm
–overall running time: from when the first host starts to when the last host finishes
6
“Communication Round” Primitive
Initiated by some host
Propagates data from every host to every other host connected to it
7
Example Implementations
All-to-all
Leader
Secondary Leader
8
Experiment I
10 hosts: Taiwan, Korea, US academia, ISPs
TCP/IP (connections always up)
Algorithms:
–All-to-all
–Leader (initiator)
–Secondary leader (not initiator)
Periodically initiated at each host: 650 times over 3.5 days
9
Computing Overall Running Time
Elapsed time from initiation (at initiator) until all hosts terminate
Requires estimating clock differences
–Clocks not synchronized, drift
–We compute difference over short intervals
–Compute 3 different ways
–Achieve accuracy within 20 ms on 90% of runs
10
Teaser: Comparing Performance Based on Number of Steps
All-to-all: 2
Leader: 3
Secondary Leader: 4
11
Predicting Overall Running Times From MIT
Ping-measured latencies (IP):
–Longest link latency: 240 milliseconds
–Longest link to MIT: 150 milliseconds
All-to-all: 150 + 240 = 390
Leader: 150 + 150 + 150 = 450
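The arithmetic behind these predictions can be sketched in a few lines of Python. This is only an illustration: it assumes each step costs the latency of the slowest link used in that step, and the constant and function names are hypothetical, not taken from the talk.

```python
# Ping-measured link latencies quoted on the slide, in milliseconds.
LONGEST_LINK_MS = 240         # slowest link between any two hosts
LONGEST_LINK_TO_MIT_MS = 150  # slowest link between MIT and another host


def predict_all_to_all_ms() -> int:
    """Two steps: MIT reaches every host, then every host sends to every host."""
    return LONGEST_LINK_TO_MIT_MS + LONGEST_LINK_MS


def predict_leader_ms() -> int:
    """Three steps, each over links between MIT (the leader) and the other hosts."""
    return 3 * LONGEST_LINK_TO_MIT_MS


if __name__ == "__main__":
    print("All-to-all:", predict_all_to_all_ms(), "ms")
    print("Leader:", predict_leader_ms(), "ms")
```

Counting steps alone would favor all-to-all here (2 steps vs. 3), and so does this latency-based prediction; neither accounts for message loss.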
12
Measured Running Times: Runs Initiated at MIT

                              All-to-All         Leader
                              Overall   Local    Overall   Local
Prediction                    390       300      450       300
Average (runs under 2 sec)    811       295      541       335
% runs over 2 seconds         55%       3%       13%       6%

Running times in milliseconds
13
What’s Going On?
Loss rates on two links are very high
–42% and 37%
–Taiwan to two ISPs in the US
Loss rates on other links up to 8%
Upon loss, TCP’s timeout is big
–More than round-trip-time
All-to-all sends messages on lossy links
–Often delayed by loss
14
Distribution of Running Times Up to 1.3 sec. at MIT
15
Running Times: Runs Initiated at Taiwan
Sec. Leader overall local; Leader overall local; All-to-all overall local
% runs over 2 seconds: 7% 13% 43% 64% 24% 54%
Average (runs under 2 sec): 607 679 844 1120 645 866
Running times in milliseconds
16
Distribution of Running Times in Taiwan
17
What’s Going On?
[Diagram: message flows for All-to-all, Leader, and Secondary Leader among Taiwan, MIT, hosts with bad links to Taiwan, and other hosts. Legend: good link, lossy link.]
18
Experiment II: Removing Taiwan
Overall running times much better
–For every initiator and algorithm, less than 10% of runs over 2 seconds (as opposed to 55% previously)
All-to-all overall still worse than others!
–either Leader or Secondary Leader best, depending on initiator
–loss rates of 2% - 8% are not negligible
–all-to-all sends O(n^2) messages; suffers
But, all-to-all has best local running times
19
Probability of Delay due to Loss
If all links had the same latency
–assume 1% loss on all links; 10 hosts (n=10)
–Leader sends 3(n-1) = 27 messages
  probability of at least one loss: 1 - 0.99^27 ≈ 24%
–All-to-all sends n(n-1) = 90 messages
  probability of at least one loss: 1 - 0.99^90 ≈ 60%
In reality, links don’t have the same latency
–only loss on long links matters
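The two probabilities can be reproduced with a short calculation. The sketch below assumes that losses on different messages are independent, which the slide's formula implies but does not state; the function name is hypothetical.

```python
def prob_at_least_one_loss(messages: int, loss_rate: float) -> float:
    """Probability that at least one of `messages` independent sends is lost."""
    return 1.0 - (1.0 - loss_rate) ** messages


if __name__ == "__main__":
    n = 10             # number of hosts, as on the slide
    loss_rate = 0.01   # 1% loss on every link, as on the slide

    leader_messages = 3 * (n - 1)      # 27 messages
    all_to_all_messages = n * (n - 1)  # 90 messages

    print(f"Leader:     {prob_at_least_one_loss(leader_messages, loss_rate):.0%}")
    print(f"All-to-all: {prob_at_least_one_loss(all_to_all_messages, loss_rate):.0%}")
```

Even at a 1% loss rate, the quadratic message count of all-to-all makes a loss in a round more likely than not.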
20
Conclusions
Message loss causes high variation in TCP link latencies
–latency distribution has high variance, heavy tail
Latency distribution determines expected time for receiving O(n) concurrent messages
Secondary leader helps
–No triangle inequality, especially for loss
Different for overall vs. local running times
Number of rounds/steps not sufficient metric
–One-to-all and all-to-all have different costs
21
Getting a Clock Difference Sample
Assume the latency between A and B is half of the RTT
–Can be estimated by sending messages
A computes the clock difference as: t_A - RTT/2 - t_B
[Diagram: timelines of hosts A and B with times t_A and t_B marked]
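A single sample could be computed as in the sketch below. It assumes that B stamps a message with its local send time t_B, that A records its local receive time t_A, and that the one-way latency is RTT/2; the direction of the message and all names are assumptions made for illustration.

```python
def clock_difference_sample(t_a_ms: float, t_b_ms: float, rtt_ms: float) -> float:
    """Estimate how far A's clock is ahead of B's from one timestamped message.

    t_b_ms: B's local clock when it sent the message (assumed direction B -> A).
    t_a_ms: A's local clock when it received the message.
    rtt_ms: measured round-trip time; the one-way latency is taken as rtt_ms / 2.
    """
    return t_a_ms - rtt_ms / 2.0 - t_b_ms


if __name__ == "__main__":
    # B sends at 1000 ms on its clock; A receives at 1075 ms on its clock;
    # with a 100 ms RTT, A's clock is estimated to be 25 ms ahead of B's.
    print(clock_difference_sample(t_a_ms=1075.0, t_b_ms=1000.0, rtt_ms=100.0))
```

The estimate is only as good as the RTT/2 assumption, so asymmetric or loss-delayed links skew individual samples.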
22
Estimating the clock difference
Average over multiple samples during a time period
Period must be long enough to contain enough samples
Period must be short enough to minimize effect of clock drift
We chose a period length of 15 minutes
–With roughly 50 samples between every pair of hosts
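One way to average samples per period is sketched below. It assumes each sample is a (timestamp, difference) pair in milliseconds and that periods are fixed 15-minute windows aligned to timestamp zero; the data layout and names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

PERIOD_MS = 15 * 60 * 1000  # the 15-minute period length chosen in the talk


def average_difference_per_period(samples):
    """Group (timestamp_ms, difference_ms) samples into periods and average each.

    Returns a dict mapping period index to the mean clock difference in that period.
    """
    buckets = defaultdict(list)
    for timestamp_ms, difference_ms in samples:
        buckets[int(timestamp_ms // PERIOD_MS)].append(difference_ms)
    return {period: mean(values) for period, values in buckets.items()}


if __name__ == "__main__":
    samples = [(0, 10.0), (1000, 20.0), (PERIOD_MS + 5, 40.0)]
    print(average_difference_per_period(samples))
```

A plain mean is used here because the slide says "average"; a median would be more robust to samples delayed by TCP retransmission, but that is a different technique from the one described.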
23
Computing Overall Running Times
Pick a host (MIT)
–For each period, adjust all clocks to MIT
–E.g., if computed difference from Utah to MIT is 15 ms, subtract 15 ms from all logged start and termination times at Utah
Pick a second host (Cornell)
–Repeat the above
–Compare the results
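The adjustment and the resulting overall running time can be sketched as follows. The functions assume that the per-period clock difference to the reference host is already known and that logged times are in milliseconds; both function names are hypothetical.

```python
def adjust_to_reference(logged_times_ms, difference_to_reference_ms):
    """Shift one host's logged times onto the reference host's clock."""
    return [t - difference_to_reference_ms for t in logged_times_ms]


def overall_running_time(initiation_ms, adjusted_termination_times_ms):
    """Elapsed time from initiation until the last host terminates."""
    return max(adjusted_termination_times_ms) - initiation_ms


if __name__ == "__main__":
    # Utah's clock is estimated to be 15 ms ahead of MIT's in this period.
    utah_terminations = adjust_to_reference([1515.0], 15.0)
    mit_terminations = [1300.0]
    print(overall_running_time(1000.0, mit_terminations + utah_terminations))
```

Repeating the same computation with a second reference host gives an independent set of running times to compare against.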
24
Limitations of our Method
RTTs between different hosts in the Internet vary over time
Asymmetries in links
–latency is not the same in both directions
Since all messages are exchanged over TCP, latencies vary substantially
–Sensitive to packet loss (TCP's exponential backoff algorithm)
25
What do we do?
Pick a host with reliable links to all other hosts as baseline for computing differences (MIT)
Repeat the process with another host that also has reliable links to every other host (Cornell)
Check consistency of the results:
–Overall running times of “good samples” (under 2 seconds) were within 20 ms (90% within 10 ms)
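A consistency check of this kind could look like the sketch below. It assumes two equally long lists of overall running times for the same runs, one computed with each baseline host; the function name and the sample values are made up for illustration and are not measurements from the talk.

```python
def fraction_within(times_a_ms, times_b_ms, tolerance_ms):
    """Fraction of runs whose two running-time estimates differ by at most tolerance_ms."""
    pairs = list(zip(times_a_ms, times_b_ms))
    if not pairs:
        raise ValueError("need at least one pair of running times")
    close = sum(1 for a, b in pairs if abs(a - b) <= tolerance_ms)
    return close / len(pairs)


if __name__ == "__main__":
    # Illustrative values only: the same four runs, measured against two baselines.
    via_first_baseline = [500, 610, 700, 800]
    via_second_baseline = [505, 625, 700, 830]
    print(fraction_within(via_first_baseline, via_second_baseline, 10))
    print(fraction_within(via_first_baseline, via_second_baseline, 20))
```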