FAST TCP Cheng Jin David Wei Steven Low netlab.CALTECH.edu
Acknowledgments Caltech Bunn, Choe, Doyle, Hegde, Jayaraman, Newman, Ravot, Singh, X. Su, J. Wang, Xia UCLA Paganini, Z. Wang CERN Martin SLAC Cottrell Internet2 Almes, Shalunov MIT Haystack Observatory Lapsley, Whitney TeraGrid Linda Winkler Cisco Aiken, Doraiswami, McGugan, Yip Level(3) Fernes LANL Wu
Outline Motivation & approach FAST architecture Window control algorithm Experimental evaluation skip: theoretical foundation
Performance at large windows. ns-2 simulation (J. Wang, Caltech, June 02): capacity = 155 Mbps, 622 Mbps, 2.5 Gbps, 5 Gbps, 10 Gbps; 100 ms round trip latency; 100 flows. DataTAG network experiment (C. Jin, D. Wei, S. Ravot, et al., Caltech, Nov 02): CERN (Geneva) – StarLight (Chicago) – SLAC/Level3 (Sunnyvale); capacity = 1 Gbps; 180 ms round trip latency; 1 flow; txq = 100. [Figure: average utilization of Linux TCP vs FAST; values of 19% and 27% shown for Linux TCP.]
Congestion control: source rates x_i(t), link congestion measures p_l(t). Example congestion measure p_l(t): loss probability (Reno), queueing delay (Vegas).
TCP/AQM. Congestion control is a distributed asynchronous algorithm to share bandwidth. It has two components: TCP adapts the sending rate (window) x_i(t) to congestion; AQM adjusts and feeds back congestion information p_l(t). Together they form a distributed feedback control system, as in the toy sketch below. Equilibrium and stability depend on both TCP and AQM, and on delay, capacity, routing, and the number of connections. TCP: Reno, Vegas. AQM: DropTail, RED, REM/PI, AVQ.
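A minimal sketch of this feedback loop, assuming a single bottleneck, log-utility sources, and a simple price-like AQM; the weights, step size, and update rules are illustrative, not the actual Reno/RED/REM dynamics:

# Toy discrete-time model of the TCP/AQM feedback loop on one bottleneck link.
# Illustrative only: weights, step size and update rules are hypothetical.

def simulate(capacity=100.0, weights=(1.0, 2.0, 1.0), gamma=2e-4, steps=500):
    p = 1.0                                    # link congestion measure p_l(t) ("price")
    for _ in range(steps):
        # TCP side: each source sets its rate x_i(t) from the fed-back price
        # (weighted log-utility sources, so x_i = w_i / p).
        x = [w / p for w in weights]
        # AQM side: raise the price when aggregate rate exceeds capacity,
        # lower it when the link is underutilized.
        p = max(1e-6, p + gamma * (sum(x) - capacity))
    return x, p

if __name__ == "__main__":
    rates, price = simulate()
    # Expect rates proportional to the weights and summing to ~capacity.
    print("rates:", [round(r, 1) for r in rates], "price:", round(price, 4))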
Difficulties at large window. Equilibrium problem: packet level, AI too slow, MD too drastic; flow level, required loss probability too small. Dynamic problem: packet level, must oscillate on a binary signal; flow level, unstable at large window.
Packet & flow level. Reno TCP packet level: ACK: W ← W + 1/W; Loss: W ← W − 0.5W. Flow level: equilibrium and dynamics; equilibrium window W ≈ 1.22/√p pkts (Mathis formula).
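To see why the required loss probability becomes too small (previous slide), plug 10 Gbps, 100 ms RTT, and an assumed 1500-byte packet size into the Mathis formula:

# Loss probability Reno would need to sustain a full 10 Gbps window (Mathis formula).
capacity_bps, rtt_s, pkt_bytes = 10e9, 0.1, 1500
w = capacity_bps * rtt_s / (pkt_bytes * 8)   # bandwidth-delay product, ~83,000 pkts
p = (1.22 / w) ** 2                          # invert W ~ 1.22 / sqrt(p)
print(f"window = {w:.0f} pkts, required loss probability ~ {p:.1e}")   # ~2e-10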
Reno TCP. Packet level: designed and implemented first. Flow level: understood afterwards. Flow level dynamics determine equilibrium (performance, fairness) and stability. Design flow level equilibrium & stability; implement flow level goals at the packet level.
Reno TCP. Packet level: designed and implemented first. Flow level: understood afterwards. Flow level dynamics determine equilibrium (performance, fairness) and stability. Packet level design of FAST, HSTCP, STCP is guided by flow level properties.
Packet level. Reno AIMD(1, 0.5): ACK: W ← W + 1/W; Loss: W ← W − 0.5W. HSTCP AIMD(a(w), b(w)): ACK: W ← W + a(w)/W; Loss: W ← W − b(w)·W. STCP MIMD(a, b): ACK: W ← W + a; Loss: W ← W − 0.125W. FAST: see the window control algorithm and the sketch below.
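For reference, a minimal Python sketch of these packet-level rules; the HSTCP gain functions a(w), b(w) are placeholders for the real table-driven values, the STCP increment uses the usual Scalable TCP constant a = 0.01, and FAST (which updates once per RTT) is sketched after the window control slide:

# Per-ACK / per-loss window updates for the loss-based protocols above.
def reno_update(w, loss):
    return w - 0.5 * w if loss else w + 1.0 / w           # AIMD(1, 0.5)

def hstcp_update(w, loss, a=lambda w: 1.0, b=lambda w: 0.5):
    return w - b(w) * w if loss else w + a(w) / w         # AIMD(a(w), b(w)); a, b placeholders

def stcp_update(w, loss, a=0.01, b=0.125):
    return w - b * w if loss else w + a                   # MIMD(a, b)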
Flow level: Reno, HSTCP, STCP, FAST. Similar flow level equilibrium: throughput x_i ≈ α_i / (T_i · p_i^k) pkts/sec, with k = 0.5 for Reno (Mathis formula), k ≈ 0.84 for HSTCP, k = 1 for STCP.
Flow level: Reno, HSTCP, STCP, FAST. Different gain and utility function U_i: they determine equilibrium and stability. Different congestion measure p_i: loss probability (Reno, HSTCP, STCP), queueing delay (Vegas, FAST). Common flow level dynamics: window adjustment = (control gain) × (flow level goal).
Implementation strategy. Common flow level dynamics: window adjustment = (control gain) × (flow level goal). Small adjustment when close to the target, large when far away; need to estimate how far the current state is from the target; scalable. By contrast, a window adjustment that is independent of p_i and depends only on the current window is difficult to scale.
Outline Motivation & approach FAST architecture Window control algorithm Experimental evaluation skip: theoretical foundation
Architecture. [Diagram labels: RTT timescale; loss recovery; <RTT timescale.]
Architecture. Each component is designed independently and upgraded asynchronously. [Diagram: Window Control.]
FAST TCP basic idea: use delay as the congestion measure. Delay provides finer congestion information, scales correctly with network capacity, and allows operation with low queueing delay. [Figure: congestion window vs queueing delay and loss, FAST vs loss-based TCP.]
Window control algorithm: full utilization regardless of bandwidth-delay product; globally stable, with exponential convergence; fairness: weighted proportional fairness, weights set by a protocol parameter.
Window control algorithm (see the Infocom 2004 paper): periodically, w ← min{ 2w, (1 − γ)·w + γ·( (baseRTT/RTT)·w + α ) }, where α is the target backlog (packets the flow aims to keep buffered in the network) and w·(RTT − baseRTT)/RTT is the measured backlog.
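A minimal Python sketch of this update on a toy single-bottleneck model; gamma, alpha, and the queueing model are illustrative assumptions, not tuned values from the paper:

# FAST window update:  w <- min{ 2w, (1-gamma)*w + gamma*(baseRTT/RTT * w + alpha) }
# alpha = target backlog (pkts queued in the network);
# w * (RTT - baseRTT) / RTT = measured backlog.

def fast_update(w, base_rtt, rtt, alpha=200.0, gamma=0.5):
    return min(2.0 * w, (1.0 - gamma) * w + gamma * ((base_rtt / rtt) * w + alpha))

if __name__ == "__main__":
    capacity = 10000.0             # bottleneck rate, pkts/sec
    base_rtt = 0.1                 # 100 ms propagation round trip
    w = 100.0                      # congestion window, pkts
    for _ in range(40):
        backlog = max(0.0, w - capacity * base_rtt)    # pkts queued at the bottleneck
        rtt = base_rtt + backlog / capacity            # queueing delay inflates the RTT
        w = fast_update(w, base_rtt, rtt)
    # Expect w ~= bandwidth-delay product + alpha and backlog ~= alpha.
    print("window:", round(w, 1), "backlog:", round(w - capacity * base_rtt, 1))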
Outline Motivation & approach FAST architecture Window control algorithm Experimental evaluation Abilene-HENP network Haystack Observatory DummyNet
Abilene test, OC48 and OC192 (Yang Xia, Harvey Newman, Caltech). Periodic losses every 10 minutes.
(Yang Xia, Harvey Newman, Caltech) Periodic losses every 10 minutes. FAST backs off to make room for Reno.
“Ultrascale” protocol development: FAST TCP. Based on TCP Vegas; uses end-to-end delay and loss to dynamically adjust the congestion window; defines an explicit equilibrium. [Figure: bandwidth utilization of Linux TCP, Westwood+, BIC TCP, and FAST; values of 30%, 40%, 50%, and 79% shown; capacity = OC Gbps; 264 ms round trip latency; 1 flow.] (Yang Xia, Caltech)
Haystack Experiments Lapsley, MIT Haystack
Haystack – 1 flow (Atlanta → Japan). Iperf used to generate traffic; sender is a 2.6 GHz Xeon. Window was constant; burstiness in rate is due to host processing and ACK spacing. Lapsley, MIT Haystack
Haystack – 2 Flows from 1 machine (Atlanta -> Japan) Lapsley, MIT Haystack
Linux loss recovery. On timeout, all outstanding packets are marked as lost. 1. SACKs reduce the number of packets marked as lost. 2. Lost packets are retransmitted slowly because cwnd is capped at 1 (bug).
DummyNet experiments. Experiments using an emulated network: 800 Mbps emulated bottleneck in DummyNet. [Testbed: Sender PC, dual Xeon 2.6 GHz, 2 GB, Intel GbE, Linux; DummyNet PC, dual Xeon 3.06 GHz, 2 GB, FreeBSD, 800 Mbps bottleneck; Receiver PC, dual Xeon 2.6 GHz, 2 GB, Intel GbE, Linux.]
Dynamic sharing: 3 flows. [Figure panels: FAST, Linux.] Dynamic sharing on DummyNet: capacity = 800 Mbps, delay = 120 ms, 3 flows, iperf throughput, Linux 2.4.x (HSTCP: UCL).
Dynamic sharing: 3 flows. [Figure panels: FAST, Linux, HSTCP, BIC; steady throughput.]
[Figure panels: FAST, Linux, STCP, HSTCP; traces of throughput, loss, and queue over 30 min.] Dynamic sharing on DummyNet: capacity = 800 Mbps, delay = 120 ms, 14 flows, iperf throughput, Linux 2.4.x (HSTCP: UCL).
[Figure panels: FAST, Linux, HSTCP, BIC; traces of throughput, loss, and queue over 30 min.] Room for mice!
Average queue vs buffer size. DummyNet: capacity = 800 Mbps, delay = 200 ms, 1 flow; buffer size: 50, …, 8000 pkts. (S. Hegde, B. Wydrowski, et al., Caltech)
Is a large queue necessary for high throughput?
FAST TCP: motivation, architecture, algorithms, performance. IEEE Infocom, March 2004. Release: April 2004. Source freely available for any non-profit use: netlab.caltech.edu/FAST
Aggregate throughput: ideal performance. DummyNet: capacity = 800 Mbps; delay = … ms; #flows = 1–14; 29 experiments.
Aggregate throughput. [Figure annotations: small window, 800 pkts; large window, 8000 pkts.] DummyNet: capacity = 800 Mbps; delay = … ms; #flows = 1–14; 29 experiments.
Fairness: Jain's index; HSTCP ~ Reno. DummyNet: capacity = 800 Mbps; delay = … ms; #flows = 1–14; 29 experiments.
Stability: stable in diverse scenarios. DummyNet: capacity = 800 Mbps; delay = … ms; #flows = 1–14; 29 experiments.
FAST TCP: motivation, architecture, algorithms, performance. IEEE Infocom, March 2004. Release: April 2004. Source freely available for any non-profit use: netlab.caltech.edu/FAST
BACKUP Slides
IP rights. Caltech owns the IP; the rights are applicable more broadly than TCP, so all options are left open. IP freely available if FAST TCP becomes an IETF standard. Code available on the FAST website for any non-commercial use.
WAN in Lab Caltech: John Doyle, Raj Jayaraman, George Lee, Steven Low (PI), Harvey Newman, Demetri Psaltis, Xun Su, Yang Xia Cisco: Bob Aiken, Vijay Doraiswami, Chris McGugan, Steven Yip netlab.caltech.edu NSF
Key Personnel Steven Low, CS/EE Harvey Newman, Physics John Doyle, EE/CDS Demetri Psaltis, EE Cisco Bob Aiken Vijay Doraiswami Chris McGugan Steven Yip Raj Jayaraman, CS Xun Su, Physics Yang Xia, Physics George Lee, CS 2 grad students 3 summer students Cisco engineers
Spectrum of tools. [Chart: log(cost) vs log(abstraction), spanning math, simulation, emulation, live network, and WAN in Lab.] Math: Mathis formula, optimization, control theory, nonlinear models, stochastic models. Simulation: NS, SSFNet, QualNet, JavaSim. Emulation: DummyNet, EmuLab, ModelNet, WAIL. Live networks: PlanetLab, Abilene, NLR, DataTAG, CENIC, WAIL, etc. WAN in Lab? …we use them all.
Spectrum of tools. [Table comparing math, simulation, emulation, live networks, and WAN in Lab on distance, speed, realism, traffic, configurability, monitoring, and cost, rated High/Medium/Low.] Critical in development, e.g. Web100.
Goal: a state-of-the-art hybrid WAN. High speed, large distance: 2.5G → 10G, 50 – 200 ms. Wireless devices connected by an optical core. Controlled & repeatable experiments. Reconfigurable & evolvable. Built-in monitoring capability.
WAN in Lab 5-year plan: 6 Cisco ONS15454, 4 routers, 10s of servers, wireless devices, 800 km fiber, ~100 ms RTT. V. Doraiswami (Cisco), R. Jayaraman (Caltech)
WAN in Lab year-1 plan: 3 Cisco ONS, 2 routers, 10s of servers, wireless devices. V. Doraiswami (Cisco), R. Jayaraman (Caltech)
Hybrid network scenarios: ad hoc networks, cellular networks, sensor networks. How does the optical core support wireless edges? X. Su (Caltech)
Experiments Transport & network layer TCP, AQM, TCP/IP interaction Wireless hybrid networking Wireless media delivery Fixed wireless access Sensor networks Optical control plane Grid computing UltraLight
WAN in Lab. Capacity: 2.5 – 10 Gbps. Delay: 0 – 100 ms round trip; 0 – 400 ms round trip. Configurable & evolvable: topology, rate, delays, routing; always at the cutting edge. Flexible, active debugging: passive monitoring, AQM. Integral part of R&A networks: transition from theory to implementation, demonstration, deployment; transition from lab to marketplace. Global resource: part of global infrastructure; UltraLight led by Newman. Unique capabilities. [Network diagram: Calren2/Abilene, Chicago, Amsterdam, CERN Geneva, SURFNet, StarLight, WAN in Lab, Caltech research & production networks; multi-Gbps links, ms-scale delay experiments.]
Network debugging. Performance problems in real networks: simulation will miss them, emulation might miss them, and live networks are hard to debug. WAN in Lab: passive monitoring inside the network; active debugging possible.
Passive monitoring. [Diagram: fiber splitter → DAG capture card → timestamp + header → RAID; GPS-synchronized monitor.] No overhead on the system; can capture full information at OC48 (the University of Waikato's DAG card captures at OC48 speed); can filter if necessary. Disk speed for header capture = 2.5 Gbps × 40/1500 ≈ 66 Mbps. Monitors synchronized by GPS or cheaper alternatives; data stored for offline analysis. D. Wei (Caltech)
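The header-capture arithmetic from this slide written out, assuming 40 bytes captured per 1500-byte packet at OC-48 line rate:

# Disk bandwidth needed to capture only packet headers at OC-48 line rate.
line_rate_bps = 2.5e9                     # OC-48
header_bytes, pkt_bytes = 40, 1500
disk_bw_bps = line_rate_bps * header_bytes / pkt_bytes
print(round(disk_bw_bps / 1e6), "Mbps")   # ~67 Mbps, i.e. the ~66 Mbps figure on the slide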
Passive monitoring. [Diagram: fiber splitter → DAG card → timestamp + header → RAID, GPS-synchronized monitor; server, router, and monitor instrumented with Web100, MonALISA.] D. Wei (Caltech)
UltraLight testbed UltraLight team (Newman)
Status. Hardware: optical transport design finalized; IP infrastructure design finalized (almost); wireless infrastructure design finalized; price negotiation/ordering/delivery: summer 04. Software: passive monitoring: summer student; management software. Physical lab: renovation to be completed by summer 04.
Status. [Timeline: NSF funds 10/03 → usable testbed 12/04 → useful testbed 12/05; ARO funds 5/04; items: hardware design, physical building, fund raising, monitoring, traffic generation, connected to UltraLight, expansion, support, management.]
[Diagram: CS Dept, Jorgensen Lab, Net Lab, WAN in Lab.] G. Lee, R. Jayaraman, E. Nixon (Caltech)
Summary. Testbed driven by research agenda: rich and strong networking effort; integrated approach: theory + implementation + experiments; “a network that can break”. Integral part of real testbeds: part of global infrastructure; UltraLight led by Harvey Newman (Caltech). Integrated monitoring & measurement facility: fiber splitter passive monitors, MonALISA.