Performance Diagnostic Research at PSC
Matt Mathis, John Heffner, Ragu Reddy
5/12/05 (PathDiag20050512.ppt)

The Wizard Gap

The non-experts are falling behind
[Table: Year / Experts / Non-experts / Ratio. Expert throughput has grown from the Mb/s range into the Gb/s range, while non-expert throughput has grown only from about 300 kb/s to 3 Mb/s; the ratio has widened from roughly 3:1 to 3000:1.]
Why?

TCP tuning requires expert knowledge
By design, TCP/IP hides the 'net from the upper layers
– TCP/IP provides basic reliable data delivery
– The "hour glass" between applications and networks
This is a good thing, because it allows:
– Old applications to use new networks
– New applications to use old networks
– Invisible recovery from data loss, etc.
But then (nearly) all problems have the same symptom
– Less than expected performance
– The details are hidden from nearly everyone

TCP tuning is really debugging
Application problems:
– Inefficient or inappropriate application designs
Operating System or TCP problems:
– Negotiated TCP features (SACK, WSCALE, etc.)
– Failed MTU discovery
– Too-small retransmission or reassembly buffers
Network problems:
– Packet losses, congestion, etc.
– Packets arriving out of order or even duplicated
– "Scenic" IP routing or excessive round trip times
– Improper packet size limits (MTU)
(A configuration-check sketch for the OS-level items follows below.)
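Several of the OS-level items above (SACK, window scaling, buffer limits, MTU recovery) correspond to standard Linux sysctls, so a first-pass check can be automated. The sketch below is purely illustrative and is not part of the PSC toolset; the sysctl names are real Linux ones, but the "expected" values and the assumed 100 Mb/s x 100 ms path are assumptions chosen only for illustration.

    #!/usr/bin/env python
    """Minimal sketch: check a few Linux TCP sysctls that commonly limit WAN performance.

    Illustrative only; the "expected" values are assumptions, not PSC recommendations.
    """
    from pathlib import Path

    # sysctl name -> expected value (assumed for a modern high bandwidth*delay path)
    CHECKS = {
        "net/ipv4/tcp_sack": "1",             # SACK can be negotiated
        "net/ipv4/tcp_window_scaling": "1",   # window scaling (WSCALE) can be negotiated
        "net/ipv4/tcp_timestamps": "1",       # RFC 1323 timestamps
        "net/ipv4/tcp_mtu_probing": "1",      # recover from failed path MTU discovery
    }

    def read_sysctl(name: str) -> str:
        return Path("/proc/sys", name).read_text().strip()

    def main() -> None:
        for name, expected in CHECKS.items():
            try:
                value = read_sysctl(name)
            except OSError:
                print(f"Warning: cannot read {name}")
                continue
            status = "Pass" if value == expected else "Fail"
            print(f"{status}: {name} = {value} (expected {expected})")

        # Buffer space: tcp_rmem/tcp_wmem hold "min default max" triples in bytes.
        # For an assumed 100 Mb/s, 100 ms path the window must reach rate * RTT / 8 bytes.
        target_window = int(100e6 * 0.100 / 8)
        for name in ("net/ipv4/tcp_rmem", "net/ipv4/tcp_wmem"):
            try:
                max_bytes = int(read_sysctl(name).split()[2])
            except OSError:
                print(f"Warning: cannot read {name}")
                continue
            status = "Pass" if max_bytes >= target_window else "Fail"
            print(f"{status}: {name} max = {max_bytes} bytes "
                  f"(need >= {target_window} for the assumed 100 Mb/s x 100 ms path)")

    if __name__ == "__main__":
        main()

A check like this only covers the OS side; the network-side flaws in the list above still need active measurement, which is where the tools described later come in.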

TCP tuning is painful debugging
All problems reduce performance
– But the specific symptoms are hidden
Yet any one problem can prevent good performance
– Completely masking all other problems
Trying to fix the weakest link of an invisible chain
– The general tendency is to guess and "fix" random parts
– Repairs are sometimes "random walks"
– At best, one problem is repaired at a time

The Web100 project
When there is a problem, just ask TCP
– TCP has the ideal vantage point: in between the application and the network
– TCP already "measures" key network parameters
  Round Trip Time (RTT) and available data capacity
  Can add more
– TCP can identify the bottleneck
  Why did it stop sending data?
– TCP can even adjust itself
  "Autotuning" eliminates one major class of bugs
See:
(A small sketch of reading this kind of per-connection instrumentation follows below.)
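Web100 itself exposes its instrumentation through a patched kernel (roughly 120 per-connection variables, described on the next slide). As a rough analogue rather than the Web100 interface, stock Linux already exports a small subset of the same kind of per-connection state through the standard TCP_INFO socket option. The sketch below reads RTT, congestion window, and retransmission counters that way; the struct offsets follow the long-stable beginning of Linux's struct tcp_info, but treat them as an assumption to verify against your kernel headers, and the target host is just an example.

    #!/usr/bin/env python
    """Sketch: read a few TCP-level statistics for a live connection via TCP_INFO (Linux).

    A rough stand-in for the kind of per-connection instrumentation Web100 provides;
    this is NOT the Web100 interface itself.
    """
    import socket
    import struct

    TCP_INFO = getattr(socket, "TCP_INFO", 11)   # 11 is the Linux option number

    # Beginning of struct tcp_info: 8 one-byte fields, then 32-bit counters.
    _FMT = "8B21I"

    def tcp_stats(sock: socket.socket) -> dict:
        raw = sock.getsockopt(socket.IPPROTO_TCP, TCP_INFO, 256)
        fields = struct.unpack(_FMT, raw[:struct.calcsize(_FMT)])
        u32 = fields[8:]                  # the 32-bit counters
        return {
            "retransmit_timeouts": fields[2],   # tcpi_retransmits
            "lost_pkts": u32[6],                # tcpi_lost
            "retrans_pkts": u32[7],             # tcpi_retrans
            "path_mtu_bytes": u32[13],          # tcpi_pmtu
            "srtt_us": u32[15],                 # tcpi_rtt (smoothed, microseconds)
            "rttvar_us": u32[16],               # tcpi_rttvar
            "snd_cwnd_pkts": u32[18],           # tcpi_snd_cwnd
        }

    if __name__ == "__main__":
        # Usage example: open a connection, move a little data, then "ask TCP".
        with socket.create_connection(("www.psc.edu", 80)) as s:   # example target host
            s.sendall(b"HEAD / HTTP/1.0\r\nHost: www.psc.edu\r\n\r\n")
            s.recv(4096)
            for name, value in tcp_stats(s).items():
                print(f"{name:>20}: {value}")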

Key Web100 components
Better instrumentation within TCP
– 120 internal performance monitors
– Poised to become an Internet standard "MIB"
TCP autotuning
– Selects the ideal buffer sizes for TCP
– Eliminates the need for user expertise
Basic network diagnostic tools
– Require less expertise than prior tools
  Excellent for network admins
  But still not useful for end users

Web100 status
Two-year no-cost extension
– Can only push standardization after most of the work
– Ongoing support of research users
Partial adoption
– Current Linux includes (most of) autotuning
  John Heffner is maintaining patches for the rest of Web100
– Microsoft
  Experimental TCP instrumentation
  Working on autotuning (to support FTTH)
– IBM "z/OS Communications Server"
  Experimental TCP instrumentation

The next step
Web100 tools still require too much expertise
– They are not really end-user tools
– Too easy to overlook problems
– Current diagnostic procedures are still cumbersome
New insight from Web100 experience
– Nearly all symptoms scale with round trip time
New NSF funding
– Network Path and Application Diagnosis (NPAD)
– 3 years; we are at the midpoint

Nearly all symptoms scale with RTT
For example: TCP buffer space, network loss and reordering, etc.
On a short path, TCP can compensate for the flaw
– Local client to server: all applications work
  Including all standard diagnostics
– Remote client to server: all applications fail
  Leading to faulty implication of other components

Examples of flaws that scale
Chatty application (e.g., 50 transactions per request)
– On a 1 ms LAN, this adds 50 ms to user response time
– On a 100 ms WAN, this adds 5 s to user response time
Fixed TCP socket buffer space (e.g., 32 kBytes)
– On a 1 ms LAN, limits throughput to 200 Mb/s
– On a 100 ms WAN, limits throughput to 2 Mb/s
Packet loss (e.g., 0.1% loss at 1500 bytes)
– On a 1 ms LAN, models predict 300 Mb/s
– On a 100 ms WAN, models predict 3 Mb/s
(The arithmetic behind these examples is sketched below.)
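The scaling in all three examples falls out of standard back-of-the-envelope formulas: response time grows as transactions x RTT, throughput is bounded by window/RTT, and the Mathis et al. loss model gives rate ~ (MSS/RTT) * C/sqrt(p). The sketch below recomputes the examples; the model constant C and the rounding are assumptions, so the absolute numbers differ slightly from the slide's rounded figures, but the 100x change with RTT is exact.

    #!/usr/bin/env python
    """Sketch: how the same flaw scales with round trip time (RTT)."""
    from math import sqrt

    MSS_BYTES = 1500          # packet size used in the loss example
    C = 1.0                   # loss-model constant; roughly 0.7-1.22 depending on assumptions

    def chatty_delay(transactions: int, rtt_s: float) -> float:
        """Added response time when each transaction costs one round trip (seconds)."""
        return transactions * rtt_s

    def window_limited_rate(window_bytes: int, rtt_s: float) -> float:
        """Throughput ceiling imposed by a fixed socket buffer (bits/s)."""
        return window_bytes * 8 / rtt_s

    def loss_limited_rate(loss_rate: float, rtt_s: float, mss_bytes: int = MSS_BYTES) -> float:
        """Mathis-model throughput estimate under random loss (bits/s)."""
        return (mss_bytes * 8 / rtt_s) * C / sqrt(loss_rate)

    for rtt in (0.001, 0.100):                       # 1 ms LAN vs 100 ms WAN
        print(f"RTT = {rtt*1000:.0f} ms")
        print(f"  chatty app (50 round trips): +{chatty_delay(50, rtt):.2f} s")
        print(f"  32 kB socket buffer:  {window_limited_rate(32*1024, rtt)/1e6:6.1f} Mb/s max")
        print(f"  0.1% packet loss:     {loss_limited_rate(0.001, rtt)/1e6:6.1f} Mb/s (model)")

Every line of output is exactly 100 times worse at 100 ms than at 1 ms, which is why a test run over a short local path can pass while the same end systems fail over the wide area.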

Review
For nearly all network flaws:
– The only symptom is reduced performance
– But this reduction is scaled by RTT
On short paths many flaws are undetectable
– A false pass, even for the best conventional diagnostics
– Leads to faulty inductive reasoning about flaw locations
– This is the essence of the "end-to-end" problem
– Current state-of-the-art diagnosis relies on tomography and complicated inference techniques

Our new tool: pathdiag
Specify the end-to-end application performance goal
– Round Trip Time (RTT) of the full path
– Desired application data rate
Measure the performance of a short path section
– Use Web100 to collect detailed statistics
  Loss, delay, queuing properties, etc.
Use models to extrapolate the results to the full path
– Assume that the rest of the path is ideal
Pass/fail on the basis of the extrapolated performance
(A sketch of the extrapolation logic follows below.)
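The extrapolation step can be illustrated with the same loss model used above: given the target rate and the full-path RTT, compute the loss rate the full path could tolerate, and compare the loss measured on the short section against that budget. The sketch below is an illustration of the idea, not pathdiag's implementation; the constant C, the helper names, and the example numbers (loosely mirroring example diagnostic output 1 below) are assumptions.

    #!/usr/bin/env python
    """Sketch of pathdiag-style extrapolation: measure a short section, then ask
    whether the measured loss rate could still meet the end-to-end target."""
    from math import sqrt

    C = 0.7   # assumed loss-model constant (depends on delayed ACKs, etc.)

    def loss_budget(target_bps: float, rtt_s: float, mss_bytes: int) -> float:
        """Maximum tolerable loss rate for the full path, from rate = (MSS/RTT) * C/sqrt(p)."""
        window_pkts = target_bps * rtt_s / (mss_bytes * 8)   # packets in flight at target rate
        return (C / window_pkts) ** 2

    def check_section(measured_loss: float, target_bps: float,
                      full_rtt_s: float, mss_bytes: int = 1448) -> None:
        budget = loss_budget(target_bps, full_rtt_s, mss_bytes)
        pkts_between_losses = 1 / budget
        verdict = "Pass" if measured_loss <= budget else "Fail"
        print(f"{verdict}: measured loss {measured_loss:.4%}, "
              f"budget {budget:.4%} ({pkts_between_losses:.0f} pkts between losses) "
              f"for {target_bps/1e6:.0f} Mb/s over {full_rtt_s*1000:.0f} ms")

    # Usage example: a short section showing 0.025% loss, extrapolated to a
    # 4 Mb/s target over a 200 ms end-to-end path.
    check_section(measured_loss=0.00025, target_bps=4e6, full_rtt_s=0.200)

Because the budget shrinks with the square of the bandwidth*delay product, a loss rate that is harmless on the short test section can still fail the extrapolated end-to-end check.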

Deploy as a Diagnostic Server
Use pathdiag in a Diagnostic Server (DS) in the GigaPop
Specify the end-to-end target performance
– From server (S) to client (C): RTT and data rate
Measure the performance from DS to C
– Use Web100 in the DS to collect detailed statistics
– Extrapolate performance assuming an ideal backbone
Pass/fail on the basis of the extrapolated performance

Example diagnostic output 1
Tester at IP address: xxx.xxx
Target at IP address: xxx.xxx
Warning: TCP connection is not using SACK
Fail: Received window scale is 0, it should be 2.
   Diagnosis: TCP on the test target is not properly configured for this path.
   > See TCP tuning instructions at
Pass data rate check: maximum data rate was Mb/s
Fail: loss event rate: % (3960 pkts between loss events)
   Diagnosis: there is too much background (non-congested) packet loss.
   The events averaged losses each, for a total loss rate of %
FYI: To get 4 Mb/s with a 1448 byte MSS on a 200 ms path the total end-to-end loss budget is % (9733 pkts between losses).
Warning: could not measure queue length due to previously reported bottlenecks
   Diagnosis: there is a bottleneck in the tester itself or the test target (e.g. insufficient buffer space or too much CPU load)
   > Correct previously identified TCP configuration problems
   > Localize all path problems by testing progressively smaller sections of the full path.
FYI: This path may pass with a less strenuous application:
   Try rate=4 Mb/s, rtt=106 ms
   Or if you can raise the MTU:
   Try rate=4 Mb/s, rtt=662 ms, mtu=9000
Some events in this run were not completely diagnosed.

Example diagnostic output 2
Tester at IP address:
Target at IP address:
FYI: TCP negotiated appropriate options (WSCALE=8, SACKok, and Timestamps)
Pass data rate check: maximum data rate was Mb/s
Pass: measured loss rate % (22364 pkts between loss events)
FYI: To get 10 Mb/s with a 1448 byte MSS on a 10 ms path the total end-to-end loss budget is % (152 pkts between losses).
FYI: Measured queue size, Pkts: 33, Bytes: , Drain time: ms
Passed all tests!
FYI: This path may even pass with a more strenuous application:
   Try rate=10 Mb/s, rtt=121 ms
   Try rate=94 Mb/s, rtt=12 ms
   Or if you can raise the MTU:
   Try rate=10 Mb/s, rtt=753 ms, mtu=9000
   Try rate=94 Mb/s, rtt=80 ms, mtu=9000

Example diagnostic output 3
Tester at IP address:
Target at IP address:
Fail: Received window scale is 0, it should be 1.
   Diagnosis: TCP on the test target is not properly configured for this path.
   > See TCP tuning instructions at
Test 1a (7 seconds): Coarse Scan
Test 2a (17 seconds): Search for the knee
Test 2b (10 seconds): Duplex test
Test 3a (8 seconds): Accumulate loss statistics
Test 4a (17 seconds): Measure static queue space
The maximum data rate was Mb/s
This is below the target rate ( ).
   Diagnosis: there seems to be a hard data rate limit
   > Double check the path: is it via the route and equipment that you expect?
Pass: measured loss rate % (7834 pkts between loss events)
FYI: To get 10 Mb/s with a 1448 byte MSS on a 50 ms path the total end-to-end loss budget is % (3802 pkts between losses).

Key DS features
Nearly complete coverage for OS and network flaws
– Does not address flawed routing at all
– May fail to detect flaws that only affect outbound data
  Unless you have Web100 in the client or a (future) portable DS
– May fail to detect a few rare corner cases
– Eliminates all other false pass results
Tests become more sensitive on shorter paths
– Conventional diagnostics become less sensitive
– Depending on the models, perhaps too sensitive
  The new problem is false fail (queue space tests)
Flaws no longer completely mask other flaws
– A single test often detects several flaws
  E.g. both OS and network flaws in the same test
– They can be repaired in parallel

Key features, continued
Results are specific and less geeky
– Intended for end users
– Provides a list of action items to be corrected
  Failed tests are showstoppers for high-performance applications
– Details for escalation to network or system admins
Archived results include raw data
– Can be reprocessed with updated reporting software

The future
The current service is "pre-alpha"
– Please use it so we can validate the tool
  We can often tell when it got something wrong
– Please report confusing results
  So we can improve the reports
– Please get us involved if it is not helpful
  We need interesting pathologies
Will soon have another server near the FRGP
– NCAR in Boulder, CO
Will someday be in a position to deploy more
– Should there be one at PSU?

What about flaws in applications?
NPAD is also thinking about applications, using an entirely different collection of techniques
– Symptom scaling still applies
Tools to emulate ideal long paths on a LAN (see the sketch below)
– Prove or bench-test applications in the lab
– Also checks some OS and TCP features
– If it fails in the lab, it cannot work on a WAN
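The slides do not say which emulation mechanism the NPAD bench tests use. A common way to approximate an "ideal" long path on a LAN today is Linux's netem queueing discipline, which adds delay (and optionally a rate limit) without adding loss. The sketch below only illustrates that general approach; the interface name and path parameters are assumptions, and this is not the NPAD tool.

    #!/usr/bin/env python
    """Sketch: emulate a long, clean path on a LAN test machine using Linux netem.

    Generic netem recipe for illustration; run as root on the sending test host.
    """
    import subprocess

    IFACE = "eth0"        # assumed test interface
    DELAY_MS = 100        # emulated one-way delay on egress (adds ~100 ms to the RTT)
    RATE_MBIT = 100       # emulated bottleneck rate

    def run(cmd: list[str]) -> None:
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    def emulate_long_path() -> None:
        # An "ideal" long path adds latency but no loss, so no loss parameter is set.
        run(["tc", "qdisc", "replace", "dev", IFACE, "root", "netem",
             "delay", f"{DELAY_MS}ms", "rate", f"{RATE_MBIT}mbit"])

    def restore() -> None:
        run(["tc", "qdisc", "del", "dev", IFACE, "root"])

    if __name__ == "__main__":
        emulate_long_path()
        # ... run the application under test against a peer on the same LAN ...
        restore()

An application (or TCP stack) that cannot fill this artificial 100 ms path in the lab will not do any better on a real WAN, which is the point of the bench test.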

For example: classic ssh and scp
Long-known performance problems
Recently diagnosed
– Internal flow control for port forwarding
– NOT encryption
Chris Rapier developed a patch
– Updates flow-control windows from the kernel buffer size
– Already running on most PSC systems
See:

NPAD goal
Build a minimal tool set that can detect "every" flaw
– Pathdiag: all flaws affecting inbound data
– Web100 in servers or portable diagnostic servers: all flaws affecting outbound data
– Application bench test: all application flaws
– Traceroute: routing flaws
We believe that this is a complete set.