Web100/Net100 at Oak Ridge National Lab Tom Dunigan August 1, 2002.

Web100 at ORNL
Funding and goals
Web100 tools and insights
  – Java bandwidth server
  – instrumented probes and log daemon
  – trace daemons
  – my favorite Web100 variables
TCP tuning with Web100
  – tuning daemon (WAD)
  – tuning buffer sizes, slow-start, AIMD/VMSS, delayed ACK, reordering, parallel streams
Web100 needs

Net100: developing network-aware operating systems
DOE-funded (Office of Science) project ($1M/yr, 3 yrs beginning 9/01)
Principal investigators
  – Matt Mathis, PSC
  – Brian Tierney, LBNL
  – Tom Dunigan, ORNL (with Florence Fowler and Nagi Rao)
Objective:
  – measure and understand end-to-end network and application performance
  – tune network applications (grid and bulk transfer)
  – first-year emphasis: bulk transfer over high delay/bandwidth nets
Components (leverage Web100)
  – Network Tool Analysis Framework (NTAF)
      tool design and analysis
      active network probes and passive sensors
      network metrics database
  – transport protocol analysis
  – tuning daemon (WAD) to tune network flows based on network metrics

Web100 tools
Java applet bandwidth/client tester
  – measures in/out data rates
  – reports flow characteristics
  – try it
  – INSIGHTS: what happened and what you can expect; from the server log:
      25,755 flows; 53% with loss, 23% with timeouts
Post-transfer statistics
  – ttcp100/iperf100
  – Web100 daemon
      avoids modifying applications
      logs designated paths/ports/variables
  – INSIGHTS: later...

Web100 tools
Tracer daemon
  – collects Web100 variables at 0.1-second intervals
  – config file specifies source/port, dest/port, and the Web100 variables to log (current value or delta)
  – logs to disk with timestamp and connection ID (CID)
  – C and Python (LBL-based) implementations
  – INSIGHTS:
      watch uninstrumented applications (GridFTP)
      analyze flow dynamics with plots (cwnd, ssthresh, retransmits, RTT, ...)
      analyze tuned flows
      aggregate parallel-flow data
Example traced config file:
  # traced config file
  # local lport remote rport
  # v=value d=delta
  d PktsOut
  d PktsRetrans
  v CurrentCwnd
  v SampledRTT
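
A minimal sketch of such a tracer loop in Python, illustrative only: read_vars() is a hypothetical stand-in for however per-connection Web100 variables are read on a given system (e.g., via libweb100 or /proc/web100):

import time

VARS = [("d", "PktsOut"), ("d", "PktsRetrans"),
        ("v", "CurrentCwnd"), ("v", "SampledRTT")]

def read_vars(cid):
    """Hypothetical: return {name: value} for Web100 connection `cid`."""
    raise NotImplementedError

def trace(cid, logfile, interval=0.1):
    prev = read_vars(cid)
    with open(logfile, "a") as log:
        while True:
            time.sleep(interval)
            cur = read_vars(cid)
            fields = [f"{time.time():.1f}", str(cid)]
            for mode, name in VARS:
                # "d" logs the delta since the last sample, "v" the raw value
                val = cur[name] - prev[name] if mode == "d" else cur[name]
                fields.append(str(val))
            log.write(" ".join(fields) + "\n")
            prev = cur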

My favorite Web100 variables
Post-transfer
  – CurrentMSS/Timeouts: PIX firewall problems
  – RetransThresh: out-of-order packets
  – MaxCwnd/MaxSsthresh: path capacity, Linux 2.4 caching
  – MinRTT/MaxRTT/*RTO: queuing, bandwidth-delay
  – SendStall/OtherReductions: Linux 2.4 slowups
  – MaxRwinRcvd/Sndbuf: buffer limits, Web100 wscale clamp
  – CongestionSignals/PacketsRetrans: loss intensity
  – SndLimTime*: what limited the sender (the bottleneck)
Dynamic
  – CongestionSignals/PacketsRetrans/CurrentCwnd: type of loss, and when it occurred (e.g., during slow start)
  – SampledRTT: queueing delays
  – CurrentSsthresh/PktsOut: recovery, timeouts
  – CurrentRwinRcvd: Linux 2.4 window advertisement
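
As a rough illustration, a few of these post-transfer variables can be combined into quick diagnostics. A sketch: the variable names follow the Web100 MIB, but the wording and example values are invented:

def diagnose(v):
    """v: dict of post-transfer Web100 variable values."""
    notes = []
    if v["Timeouts"]:
        notes.append(f'{v["Timeouts"]} timeouts (check middleboxes, e.g. the PIX/SACK bug)')
    if v["PktsOut"]:
        notes.append(f'retransmit rate {v["PktsRetrans"] / v["PktsOut"]:.2%}')
    # which limit dominated the send rate: receiver window, cwnd, or sender?
    limits = {k: v[k] for k in ("SndLimTimeRwin", "SndLimTimeCwnd", "SndLimTimeSender")}
    notes.append("dominant limit: " + max(limits, key=limits.get))
    return notes

print(diagnose({"Timeouts": 2, "PktsOut": 50000, "PktsRetrans": 340,
                "SndLimTimeRwin": 7.0, "SndLimTimeCwnd": 1.5, "SndLimTimeSender": 0.2}))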

PIX SACK problem
Web100 reported timeouts into ORNL, but not at other sites. Why?
Theory 1: yet another Linux 2.4 TCP feature? No: our TCP-over-UDP showed no timeouts.
tcpdump/tcptrace/xplot of the flow, both inside and outside ORNL?
A tcptrace bug? The SACK blocks looked wrong in one of the dumps... NOT.
The ORNL PIX firewall was randomizing TCP sequence numbers but failed to adjust the SACK blocks.
RESULT: TCP timeouts.
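
A toy sketch of the failure mode (the offset and all numbers are made up): the firewall rewrites the sequence and ACK fields it forwards but leaves the SACK option untouched, so the sender receives SACK blocks from the wrong sequence space and fast recovery cannot use them:

OFFSET = 0x5A5A0000                     # made-up per-connection randomization

def rewrite_seq(seq):                   # outbound: firewall shifts SEQ numbers
    return (seq + OFFSET) & 0xFFFFFFFF

lost_seq = 1000                         # sender's sequence number for a lost segment
rx_view = rewrite_seq(lost_seq)         # the sequence space the receiver sees
sack = (rx_view + 1460, rx_view + 2920) # receiver SACKs the data after the hole

# Bug: the ACK field is translated back on the return path, the SACK blocks are not.
print("sender's sequence space starts near", lost_seq)
print("SACK block arrives as", sack, "-> meaningless to the sender")
# With SACK useless, loss recovery degenerates into retransmission timeouts.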

TCP tuning with Web100+/Net100
Path characterization (NTAF)
  – both active and passive measurement
  – database of measurement data
  – NTAF/Web100 hosts at PSC, NCAR, LBL, ORNL
Application tuning (tuning daemon, WAD)
  – Web100 extensions
      disable Linux 2.4 caching/SendStall
      event notification
      more tuning options
  – daemon tunes the application at startup (a sketch follows below)
      static tuning information
      query NTAF and calculate optimum TCP parameters
  – dynamically tune the application (Web100 feedback)
      adjust parameters during the flow
      split the optimum among parallel flows
Transport protocol optimizations
  – what to tune?
  – is it fair? stable?
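
A sketch of the startup-tuning idea, under stated assumptions: ntaf_lookup() is a hypothetical stand-in for querying the NTAF measurement database, and the buffer rule is the bandwidth*RTT product from the next slide:

def ntaf_lookup(src, dst):
    """Hypothetical NTAF query: return (bandwidth_bps, rtt_seconds) for a path."""
    return 622e6, 0.080          # e.g. an OC12 path with an 80 ms RTT

def startup_tuning(src, dst):
    bw, rtt = ntaf_lookup(src, dst)
    bufsize = int(bw * rtt / 8)  # bandwidth*RTT in bytes (~6.2 MB here)
    return {"sndbuf": bufsize, "rcvbuf": bufsize}

print(startup_tuning("ornl-host", "nersc-host"))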

Net100 TCP tuning
TCP performance
  – reliable/stable/fair
  – needs buffer = bandwidth*RTT: ORNL/NERSC (80 ms, OC12) needs 6 MB
  – TCP slow-start and loss recovery are proportional to MSS/RTT: slow on today's high delay/bandwidth paths
  – TCP is lossy by design
TCP tuning
  – set optimal (?) buffer size
  – avoid losses
      modified slow-start
      reduce bursts
      anticipate loss (Vegas?)
      reorder threshold
  – speed recovery
      bigger MTU or "virtual MSS"
      modified AIMD (0.5, 1)
      delayed ACKs and initial window
[Figure: ns simulation, 500 Mb/s link, 80 ms RTT. Packet loss early in slow start: standard TCP with delayed ACKs takes 10 minutes to recover!]
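
The arithmetic behind that 10-minute figure, as a rough sketch: assume a 1460-byte MSS and linear cwnd growth of 1 segment per RTT, halved in rate by delayed ACKs:

rate = 500e6                 # simulated link rate, bits/s
rtt = 0.080                  # seconds
mss = 1460 * 8               # bits per segment (assumed 1460-byte MSS)

window = rate * rtt / mss    # ~3425 segments needed to fill the pipe
# A loss early in slow start leaves ssthresh tiny, so cwnd climbs almost the
# whole way linearly: 1 segment per RTT, or 1 per 2 RTTs with delayed ACKs.
print(f"recovery: ~{window * rtt / 60:.1f} min, "
      f"~{2 * window * rtt / 60:.1f} min with delayed ACKs")
# -> ~4.6 min and ~9.1 min: the "10 minutes to recover" on the slide.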

Net100 TCP tuning
Work-around Daemon (WAD)
  – tunes an unknowing sender/receiver at startup and/or during the flow
  – Web100 kernel extensions use netlink to alert the daemon of socket open/close
  – besides the existing Web100 buffer tuning: new code and WAD_* variables, knobs to disable Linux 2.4 caching and SendStall
  – config file with static tuning data; mode specifies dynamic tuning (Floyd AIMD, NTAF buffer size, concurrent streams)
  – daemon periodically polls NTAF for fresh tuning data
  – written in C (LBL has a Python version)
Example WAD config file:
  [bob]
  src_addr:
  src_port: 0
  dst_addr:
  dst_port: 0
  mode: 1
  sndbuf:
  rcvbuf:
  wadai: 6
  wadmd: 0.3
  maxssth: 100
  divide: 1
  reorder: 9
  delack: 0
  floyd: 1
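
The format above happens to match Python's configparser syntax; a sketch of reading it (illustrative only: the real WAD is written in C, the filename is hypothetical, and the field semantics are whatever the WAD defines):

import configparser

cfg = configparser.ConfigParser()       # blank values parse as empty strings
cfg.read("wad.conf")                    # hypothetical filename
for flow in cfg.sections():             # e.g. "bob"
    f = cfg[flow]
    print(flow, "AI =", f.getint("wadai"), "MD =", f.getfloat("wadmd"),
          "reorder =", f.getint("reorder"), "Floyd =", f.getboolean("floyd"))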

WAD tuning results (your mileage may vary...)
Classic buffer tuning: ORNL to PSC, OC12, 80 ms RTT
  – network-challenged app gets 10 Mb/s
  – same app with WAD/NTAF-tuned buffers gets 143 Mb/s
Virtual MSS: tune TCP's additive increase (WAD_AI)
  – add k segments per RTT during recovery
  – k=6 behaves like a GigE jumbo frame
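
A quick sketch of the virtual-MSS effect, using a simple AIMD model that ignores slow start (the window size is the ~3425-segment pipe from the earlier arithmetic, rounded):

def rtts_to_recover(window, ai=1, md=0.5):
    """RTTs for AIMD to climb back to `window` after a multiplicative decrease."""
    cwnd, rtts = window * (1 - md), 0
    while cwnd < window:
        cwnd += ai              # additive increase of `ai` segments per RTT
        rtts += 1
    return rtts

W, RTT = 3400, 0.08
for k in (1, 6):
    n = rtts_to_recover(W, ai=k)
    print(f"AI={k}: {n} RTTs (~{n * RTT:.0f} s)")
# AI=6 recovers roughly 6x faster, much like running a jumbo-frame-sized MSS.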

WAD tuning
Modified slow-start and AI: ORNL to NERSC, OC12, 80 ms RTT
  – often losses in slow start
  – WAD-tuned Floyd slow-start (WAD_MaxThresh) and AI (6)
WAD-tuned AIMD and slow start: ORNL to CERN, OC12, 150 ms RTT
  – parallel streams behave like AIMD (1/(2k), k)
  – WAD tunes a single stream to (0.125, 4) (WAD_MD)
Can a tuned single stream compete with parallel streams?
  – pre-tune Floyd AIMD, or dynamically adjust
  – tune concurrent flows: subdivide the buffer
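
The (1/(2k), k) rule of thumb, sketched; it assumes each loss event hits only one of the k streams:

def equivalent_aimd(k):
    # k standard AIMD(1/2, 1) streams: a single loss halves one stream,
    # cutting the aggregate window by 1/(2k), while together the streams
    # add k segments per RTT. A single flow with these parameters mimics that.
    return 1.0 / (2 * k), k

print(equivalent_aimd(4))   # -> (0.125, 4), the WAD-tuned values above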

Net100 TCP tuning
Reorder threshold
  – seeing more out-of-order packets; WAD tunes a bigger reorder threshold
  – Linux 2.4 does a good job already
  – LBL to ORNL (using our TCP-over-UDP): the dup3 case had 289 retransmits, but all were unneeded!
Delayed ACKs
  – WAD could turn off delayed ACKs: 2x improvement in recovery rate and slow start
  – Linux 2.4 already turns off delayed ACKs for the initial slow-start
  – WARNING: could be unfair, though probably stable; use only on an intranet
Web100 has proven very useful for experimenting with TCP tuning options.
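
A toy sketch of the dup3-vs-dup9 trade-off (the displacement samples are invented): a segment reordered by d positions generates d duplicate ACKs, and any d at or above the threshold triggers a needless fast retransmit:

def spurious_fast_retransmits(displacements, dupthresh):
    return sum(1 for d in displacements if d >= dupthresh)

reorder = [1, 2, 4, 5, 9, 3, 7, 2, 6]   # invented reorder displacements
for t in (3, 9):
    print(f"dupthresh={t}: {spurious_fast_retransmits(reorder, t)} spurious retransmits")
# A larger threshold (the WAD reorder knob) suppresses these, at the cost of
# reacting more slowly to genuine loss.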

Futures
Net100
  – analyze effectiveness of current tuning options
  – NTAF probes: characterizing a path to tune a flow
  – additional tuning algorithms
  – parallel/multipath selection and tuning
  – WAD-to-WAD tuning
Web100 extensions
  – Web100 trace files: log all data efficiently
  – a variable counting duplicate data segments at the receiver
  – remove the wscale restriction