Evaluating the Running Time of a Communication Round over the Internet
Omar Bakr (MIT) and Idit Keidar (MIT/Technion)
PODC 2002

© Omar Bakr and Idit Keidar; PODC July 2002

Communication Round
Exchange of information from all hosts to all hosts
Part of many distributed algorithms and systems
  – consensus, atomic commit, replication, ...

Common Metric for Evaluating Algorithms
Number of rounds (or steps) they require

Questions
What is the best way to implement a communication round over the Internet?
  – decentralized vs. centralized
How long is a communication round over the Internet?

Prediction is Hard
Internet is unpredictable, diverse, …
Different answers for different topologies, different times
Different performance metrics
  – local running time: the time during which one host is engaged in the algorithm
  – overall running time: from when the first host starts to when the last host finishes

“Communication Round” Primitive
Initiated by some host
Propagates data from every host to every other host connected to it

Example Implementations
All-to-all
Leader
Secondary Leader

Experiment I
10 hosts: Taiwan, Korea, US academia, ISPs
TCP/IP (connections always up)
Algorithms:
  – All-to-all
  – Leader (initiator)
  – Secondary leader (not initiator)
Periodically initiated at each host, over 3.5 days

Computing Overall Running Time
Elapsed time from initiation (at initiator) until all hosts terminate
Requires estimating clock differences
  – Clocks are not synchronized, and they drift
  – We compute the difference over short intervals
  – Compute 3 different ways
  – Achieve accuracy within 20 ms on 90% of runs

Teaser: Comparing Performance Based on Number of Steps
All-to-all: 2
Leader: 3
Secondary Leader: 4

Predicting Overall Running Times From MIT
Ping-measured latencies (IP):
  – Longest link latency: 240 milliseconds
  – Longest link to MIT: 150 milliseconds
= = 450
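
The prediction arithmetic can be sketched by multiplying step counts by worst-case link latencies. The model below is an assumption made for illustration, not a formula taken from the talk: it charges every Leader step the longest link to MIT, and charges all-to-all one step on the longest link to MIT plus one step on the longest link overall. The function names are hypothetical.

```python
LONGEST_LINK_MS = 240         # longest ping-measured link latency (from the slide)
LONGEST_LINK_TO_MIT_MS = 150  # longest ping-measured link to MIT (from the slide)


def predict_leader_ms(steps=3, link_to_leader_ms=LONGEST_LINK_TO_MIT_MS):
    """Assumed model: every Leader step crosses a link to the leader (MIT)."""
    return steps * link_to_leader_ms


def predict_all_to_all_ms(to_initiator_ms=LONGEST_LINK_TO_MIT_MS,
                          longest_ms=LONGEST_LINK_MS):
    """Assumed model: the initiator reaches every host, then all hosts exchange."""
    return to_initiator_ms + longest_ms


print(predict_leader_ms())      # 450
print(predict_all_to_all_ms())  # 390
```

The Leader value agrees with the 450 that survives on the slide; the all-to-all value is only what this assumed model yields.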

Measured Running Times: Runs Initiated at MIT
(running times in milliseconds)

                              All-to-All          Leader
                              Overall   Local     Overall   Local
Prediction
Average (runs under 2 sec)
% runs over 2 seconds         55%       3%        13%       6%

What’s Going On?
Loss rates on two links are very high
  – 42% and 37%
  – Taiwan to two ISPs in the US
Loss rates on other links are up to 8%
Upon loss, TCP’s timeout is big
  – More than the round-trip time
All-to-all sends messages on lossy links
  – Often delayed by loss

Distribution of Running Times
Up to 1.3 sec. at MIT

Running Times: Runs Initiated at Taiwan
(running times in milliseconds)

                              All-to-all          Leader              Sec. Leader
                              overall   local     overall   local     overall   local
Average (runs under 2 sec)
% runs over 2 seconds         54%       24%       64%       43%       13%       7%

Distribution of Running Times in Taiwan

What’s Going On?
[Figure: All-to-all, Leader, and Secondary Leader message patterns among Taiwan, MIT, hosts with bad links to Taiwan, and other hosts. Legend: good link, lossy link.]

Experiment II: Removing Taiwan
Overall running times much better
  – For every initiator and algorithm, less than 10% of runs over 2 seconds (as opposed to 55% previously)
All-to-all overall still worse than others!
  – either Leader or Secondary Leader is best, depending on initiator
  – loss rates of 2% - 8% are not negligible
  – all-to-all sends O(n^2) messages; suffers
But all-to-all has the best local running times

Probability of Delay due to Loss
If all links had the same latency
  – assume 1% loss on all links; 10 hosts (n=10)
  – Leader sends 3(n-1) = 27 messages
    probability of at least one loss: ≈ 24%
  – All-to-all sends n(n-1) = 90 messages
    probability of at least one loss: ≈ 60%
In reality, links don’t have the same latency
  – only loss on long links matters
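
The two percentages follow from 1 - (1 - p)^m for m messages, each lost with probability p. A minimal sketch, assuming losses are independent across messages (the slide does not state this, and the function names are hypothetical):

```python
def prob_at_least_one_loss(num_messages, loss_rate=0.01):
    """P(at least one of num_messages is lost), assuming independent losses."""
    return 1.0 - (1.0 - loss_rate) ** num_messages


def leader_messages(n):
    """Leader sends 3(n-1) messages for n hosts (count from the slide)."""
    return 3 * (n - 1)


def all_to_all_messages(n):
    """All-to-all sends n(n-1) messages for n hosts (count from the slide)."""
    return n * (n - 1)


n = 10
for name, count in (("Leader", leader_messages(n)),
                    ("All-to-all", all_to_all_messages(n))):
    p = prob_at_least_one_loss(count)
    print(f"{name}: {count} messages, P(at least one loss) = {p:.0%}")
# Leader: 27 messages, P(at least one loss) = 24%
# All-to-all: 90 messages, P(at least one loss) = 60%
```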

Conclusions
Message loss causes high variation in TCP link latencies
  – latency distribution has high variance, heavy tail
Latency distribution determines expected time for receiving O(n) concurrent messages
Secondary leader helps
  – No triangle inequality, especially for loss
Different for overall vs. local running times
Number of rounds/steps is not a sufficient metric
  – One-to-all and all-to-all have different costs

Getting a Clock Difference Sample
Assume δ ≈ half of the RTT between A and B
Can be estimated by sending messages
A computes the clock difference as: t_A - δ - t_B
[Figure: hosts A and B, with times t_B and t_A]
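
One clock difference sample can be sketched as follows, taking the one-way latency to be half the measured round-trip time as the slide assumes. The reading that t_B is the timestamp B puts on a message and t_A is A's local receive time is an assumption here, and the function and parameter names are hypothetical.

```python
def clock_difference_sample(t_a_recv_ms, t_b_send_ms, rtt_ms):
    """Estimate (A's clock - B's clock) in ms from one timestamped message.

    Assumes the one-way latency from B to A is half of rtt_ms.
    """
    one_way_ms = rtt_ms / 2.0
    return t_a_recv_ms - one_way_ms - t_b_send_ms


# B stamps a message at 1000 ms (B's clock); A receives it at 1065 ms
# (A's clock); the measured RTT between A and B is 100 ms.
print(clock_difference_sample(1065.0, 1000.0, 100.0))  # 15.0
```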

Estimating the Clock Difference
Average over multiple samples during a time period
Period must be long enough to contain enough samples
Period must be short enough to minimize the effect of clock drift
We chose a period length of 15 minutes
  – With roughly 50 samples between every pair of hosts
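
Averaging samples per period can be sketched as below. The 15-minute period comes from the slide; the data layout (pairs of timestamp in seconds and difference in milliseconds) and the function name are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

PERIOD_S = 15 * 60  # period length chosen in the talk: 15 minutes


def average_by_period(samples, period_s=PERIOD_S):
    """Average clock difference samples within each fixed-length period.

    samples: iterable of (timestamp_s, difference_ms) pairs for one host pair.
    Returns {period_index: mean difference_ms}.
    """
    buckets = defaultdict(list)
    for timestamp_s, difference_ms in samples:
        buckets[int(timestamp_s // period_s)].append(difference_ms)
    return {period: mean(diffs) for period, diffs in sorted(buckets.items())}


samples = [(0, 14.0), (300, 16.0), (900, 20.0), (1200, 22.0)]
print(average_by_period(samples))  # {0: 15.0, 1: 21.0}
```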

Computing Overall Running Times
Pick a host (MIT)
  – For each period, adjust all clocks to MIT
  – E.g., if the computed difference from Utah to MIT is 15 ms, subtract 15 ms from all logged start and termination times at Utah
Pick a second host (Cornell)
  – Repeat the above
  – Compare the results
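
The adjustment step can be sketched as follows: each logged time is shifted onto the reference host's clock by subtracting that host's computed difference, and the overall running time is the latest adjusted termination minus the adjusted start. The data layout and function names are hypothetical.

```python
def adjust_to_reference(logged_ms, diff_to_reference_ms):
    """Shift a host's logged time onto the reference host's clock."""
    return logged_ms - diff_to_reference_ms


def overall_running_time(start_ms, initiator_diff_ms, terminations):
    """Overall running time of one run, measured on the reference clock.

    start_ms: start time logged at the initiator.
    initiator_diff_ms: initiator's clock difference from the reference host.
    terminations: list of (logged_termination_ms, diff_to_reference_ms),
                  one entry per host.
    """
    start = adjust_to_reference(start_ms, initiator_diff_ms)
    end = max(adjust_to_reference(t, d) for t, d in terminations)
    return end - start


# MIT is the reference (difference 0); Utah's clock is 15 ms ahead of MIT's.
terminations = [(1300.0, 0.0), (1415.0, 15.0)]
print(overall_running_time(1000.0, 0.0, terminations))  # 400.0
```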

Limitations of our Method
RTTs between different hosts in the Internet vary over time
Asymmetries in links
  – latency is not the same in both directions
Since all messages are exchanged over TCP, latencies vary substantially
  – Sensitive to packet loss (TCP's exponential backoff algorithm)

What do we do?
Pick a host with reliable links to all other hosts as the baseline for computing differences (MIT)
Repeat the process with another host that also has reliable links to every other host (Cornell)
Check consistency of the results:
  – Overall running times of “good samples” (under 2 seconds) were within 20 ms (90% within 10 ms)
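
The consistency check can be sketched as a comparison of the overall running times obtained with the two baselines. Treating a run as a "good sample" only when both estimates are under 2 seconds is an assumption, as are the function name and the sample values, which are made up for illustration.

```python
GOOD_SAMPLE_LIMIT_MS = 2000  # "good samples" are runs under 2 seconds


def fraction_within(times_a_ms, times_b_ms, tolerance_ms,
                    limit_ms=GOOD_SAMPLE_LIMIT_MS):
    """Fraction of good runs whose two estimates agree within tolerance_ms.

    times_a_ms, times_b_ms: overall running times of the same runs, computed
    with two different baseline hosts (e.g. MIT and Cornell).
    """
    good = [(a, b) for a, b in zip(times_a_ms, times_b_ms)
            if a < limit_ms and b < limit_ms]
    if not good:
        return 0.0
    close = sum(1 for a, b in good if abs(a - b) <= tolerance_ms)
    return close / len(good)


# Illustrative values only, not measurements from the talk.
via_mit = [400.0, 620.0, 1500.0, 2500.0]
via_cornell = [405.0, 640.0, 1512.0, 2490.0]
print(fraction_within(via_mit, via_cornell, 20))  # 1.0
```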