1 A new infrastructure for high throughput network and application performance measurement. Les Cottrell – SLAC Prepared for the IPAM meeting, UCLA Mar 19 ‘02 Partially funded by DOE/MICS Field Work Proposal on Internet End-to-end Performance Monitoring (IEPM), also supported by IUPAP

2 PingER deployment Measurements from –34 monitors in 14 countries –Over 600 remote hosts –Over 72 countries –Over 3300 monitor-remote site pairs –Measurements go back to Jan-95 –Reports on RTT, loss, reachability, jitter, reorders, duplicates … Countries monitored –Contain 78% of the world's population –99% of the world's online Internet users Lightweight (~100 bps per host pair) –Very useful for inter-regional and poorly connected links, but more intensive measurements are needed for high-performance & Grid sites

3 New stuff EU DataGrid WP7 has extended PingER to throughput measurements (iperf, udpmon …) –Manchester, Daresbury, IN2P3 & UCL SLAC has added iperf, bbcp, bbftp measurements to the PingER tables –www-iepm.slac.stanford.edu/cgi-wrap/pingtable.pl?dataset=iperf Metrics: pipechar, iperf, bbcp, bbftp

4 PingER losses: world by region, Jan ‘02. Packet loss of 5% or more = bad. Russia and S. America are bad; the Balkans, Middle East, Africa, S. Asia and the Caucasus are poor.

5 Throughput quality improvements TCP BW < MSS/(RTT*sqrt(loss)) (1) (1) Macroscopic Behavior of the TCP Congestion Avoidance Algorithm, Mathis, Semke, Mahdavi, Ott, Computer Communication Review 27(3), July 1997. China: ~80% annual improvement (~ a factor of 10 over 4 years); SE Europe is not keeping up.
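As a rough illustration of formula (1), here is a minimal sketch (not part of the original slides) that evaluates the Mathis et al. bound; the MSS, RTT and loss values in the example are assumptions chosen only to show the arithmetic.

```python
import math

def mathis_bound_mbps(mss_bytes: int, rtt_s: float, loss: float) -> float:
    """Upper bound on steady-state TCP throughput from formula (1):
    BW < MSS / (RTT * sqrt(loss)), returned in Mbits/s."""
    return (mss_bytes * 8) / (rtt_s * math.sqrt(loss)) / 1e6

# Example (assumed values): 1460-byte MSS, 150 ms RTT, 0.1% packet loss
print(f"{mathis_bound_mbps(1460, 0.150, 0.001):.1f} Mbits/s")  # ~2.5 Mbits/s
```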

6 We need to better understand. PingER losses are insufficient for high-performance links –need more measurements, and ping losses != TCP losses –need measurements closer to applications, e.g. FTP. Understand how to make throughput measurements: –duration & frequency (balance impact against the granularity needed) –windows and/or parallel streams –OS dependencies, CPU utilization, interface speeds, security (e.g. ssh) –impact on others, variability on different time scales –can we use QBSS, can/should the application self-limit? –how well does simulation work, how can it be improved? –how to relate to simpler measurements –how does file transfer compare to iperf? –is compression useful and when? –how to steer applications

7 IEPM-BW: Main issues being addressed Provide a simple, robust infrastructure for: –continuous/persistent and one-off measurement of high network AND application performance –a management infrastructure with flexible remote host configuration. Optimize the impact of measurements –duration and frequency of active measurements, and use of passive data. Integrate a standard set of measurements including ping, traceroute, pipechar, iperf, bbcp … Allow/encourage adding measurement/application tools. Develop tools to gather, reduce, analyze, and publicly report on the measurements: –web-accessible data, tables, time series, scatterplots, histograms, forecasts … Compare, evaluate and validate various measurement tools and strategies (minimize impact on others, effects of application self-rate-limiting, QoS, compression…), find better/simpler tools, choose the best set. Provide simple forecasting tools to aid applications and to adapt the active measurement frequency. Provide a tool suite for high-throughput monitoring and prediction.

8 IEPM-BW Deliverables Understand and identify resources needed to achieve high throughput performance for Grid and other data intensive applications Provide access to archival and near real-time data and results for eyeballs and applications: –planning and expectation setting, see effects of upgrades –assist in trouble-shooting problems by identifying what is impacted, time and magnitude of changes and anomalies –as input for application steering (e.g. data grid bulk data transfer), changing configuration parameters –for prediction and further analysis Identify critical changes in performance, record and notify administrators and/or users Provide a platform for evaluating new network tools (e.g. pathrate, pathload, GridFTP, INCITE, UDPmon …) Provide measurement/analysis/reporting suite for Grid & hi-perf sites

9 IEPM-BW Deployment

10 Results so far 1/2 Reasonable estimates of achievable throughput with 10-second iperf measurements. Multiple streams and big windows are critical –improve over the default by factors of 5 to 60 –there is an optimum windows x streams. Continuous data at 90-minute intervals from SLAC to 33 hosts in 8 countries since Dec ‘01.

11 Results so far 2/2 [Scatterplot: disk throughput (Mbps) vs. iperf throughput (Mbps)] 1 MHz ~ 1 Mbps. Bbcp mem-to-mem tracks iperf; bbftp & bbcp disk-to-disk track iperf until disk performance limits. High throughput affects RTT for others –e.g. to Europe it adds ~100 ms –QBSS helps reduce the impact. Archival raw throughput data & graphs are already available via http.

12 E.g. iperf vs. file copy (mem-to-mem) [Scatterplot: file-copy mem-to-mem throughput vs. iperf TCP Mbits/s] File copy mem-to-mem ~ 72% of iperf.

13 Iperf vs. file copy (disk-to-disk) [Scatterplot: file-copy disk-to-disk throughput vs. iperf TCP Mbits/s, with Fast Ethernet and OC3 regions marked] Disk limited: above ~60 Mbits/s of iperf, iperf >> file copy.

14 E.g. iperf vs. pipechar [Scatterplot: iperf TCP Mbits/s vs. pipechar minimum throughput Mbits/s] Pipechar does not use TCP but should track it; it disagrees badly above 100 Mbits/s (6 hosts, 20%), while ~50% of hosts show reasonable agreement. Working with the developer.

15 Windows and Streams 1/2 Now well accepted that multiple streams and/or big windows are important to achieve optimal throughput; can be unfriendly to others; the optimum windows & streams change with changes in the path. So working on new methods to predict both windows & streams automatically & optimize: –Netest from LBNL uses UDP & TCP; predicts window sizes and streams using packet streams; needs to be validated –Brute force: run iperf over multiple windows & streams & find the optimum; need to automate measurement & analysis –Only need to run occasionally or when a change is suspected, e.g. a step change in the number of hops, RTT or another measurement.
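A minimal sketch of the brute-force scan described above, assuming the classic iperf client is installed and the remote host runs an iperf server; the candidate windows, stream counts and the output-parsing regex are illustrative and may need adjusting for a particular iperf version.

```python
import re
import subprocess
from itertools import product

def iperf_mbps(host: str, window: str, streams: int, secs: int = 10) -> float:
    """Run one iperf TCP test and return the last reported Mbits/sec figure
    (for multiple streams this is normally the aggregate [SUM] line)."""
    out = subprocess.run(
        ["iperf", "-c", host, "-w", window, "-P", str(streams),
         "-t", str(secs), "-f", "m"],
        capture_output=True, text=True, timeout=secs + 30).stdout
    rates = re.findall(r"([\d.]+)\s+Mbits/sec", out)
    return float(rates[-1]) if rates else 0.0

def find_optimum(host: str):
    """Brute-force scan over window x streams; returns the best combination."""
    windows = ["64K", "256K", "1M"]     # assumed candidate window sizes
    streams = [1, 2, 4, 8, 16]          # assumed candidate stream counts
    results = {(w, p): iperf_mbps(host, w, p) for w, p in product(windows, streams)}
    best = max(results, key=results.get)
    return best, results[best]

# Usage (hypothetical host): (win, n), mbps = find_optimum("remote.example.org")
```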

16 Optimizing CPU vs. window vs. streams vs. throughput [Plot: CPU (MHz) vs. throughput for increasing window and increasing streams] MHz ~ 0.97 * Mbps. Bigger windows = less CPU for a fixed throughput. Hooks at the end of the curves = saturation.

17 Optimizing streams & flows vs. losses Throughput = sum_{i=1..n} MSS_i/(RTT_i*sqrt(loss_i)) (using (1)) ~ (MSS/RTT) * sum_{i=1..n} 1/sqrt(loss_i) ~ [n*MSS]/(RTT*sqrt(loss)). So the problem reduces to optimizing throughput w.r.t. loss (2). (2) “The End-to-End Performance Effects of Parallel TCP Sockets on a Lossy Wide-Area Network”, Hacker & Athey, Proc. 16th Int. Parallel & Distributed Processing Symposium, Ft. Lauderdale, FL, April 2002
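To connect this to the single-stream bound of formula (1), a small illustration (assumed values again) of the n-stream aggregate above, under the simplification that all streams see the same MSS, RTT and loss.

```python
import math

def aggregate_bound_mbps(n_streams: int, mss_bytes: int, rtt_s: float, loss: float) -> float:
    """n parallel streams, each bounded by MSS/(RTT*sqrt(loss)):
    aggregate ~ n * MSS / (RTT * sqrt(loss)), in Mbits/s."""
    per_stream_bps = (mss_bytes * 8) / (rtt_s * math.sqrt(loss))
    return n_streams * per_stream_bps / 1e6

# Example (assumed values): 8 streams, 1460-byte MSS, 150 ms RTT, 0.1% loss
print(f"{aggregate_bound_mbps(8, 1460, 0.150, 0.001):.1f} Mbits/s")  # ~19.7 Mbits/s
```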

18 Forecasting Given access to the data, one can do real-time forecasting for TCP bandwidth and file transfer/copy throughput, e.g. NWS, "Predicting the Performance of Wide Area Data Transfers" by Vazhkudai, Schopf & Foster. Developing a simple prototype using the average of previous measurements –validate predictions versus observations –get better estimates to adapt the frequency of active measurements & reduce impact. Also use ping RTTs and route information –look at the need for diurnal corrections –use for steering applications. Working with NWS for more sophisticated forecasting. Can also use on-demand bandwidth estimators (e.g. pipechar, but need to know their range of applicability).

19 Forecast results (33 nodes). Average % error: iperf TCP 13% ± 11%, bbcp mem 23% ± 18%, bbcp disk 15% ± 13%, bbftp 14% ± 12%, pipechar 13% ± 8%. Predict = moving average of the last 5 measurements. [Time series: iperf TCP throughput (Mbits/s), SLAC to Wisconsin, Jan ‘02, predicted (x) vs. observed] % average error = average(abs(observed - predicted)/observed)
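A minimal sketch (illustrative, not the production IEPM-BW code) of the simple forecaster above: each point is predicted as the moving average of the previous 5 observations and scored with the same % average error metric; the series in the example is made up.

```python
def forecast_and_score(observations, window=5):
    """Predict each measurement as the mean of the previous `window`
    observations; return (predictions, average % error)."""
    predictions, errors = [], []
    for i in range(window, len(observations)):
        pred = sum(observations[i - window:i]) / window
        obs = observations[i]
        predictions.append(pred)
        errors.append(abs(obs - pred) / obs)   # same definition as on the slide
    return predictions, 100.0 * sum(errors) / len(errors)

# Example with made-up iperf throughputs (Mbits/s):
series = [42, 45, 40, 44, 47, 43, 50, 41, 46, 48]
preds, pct_err = forecast_and_score(series)
print(preds, f"{pct_err:.1f}%")
```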

20 Passive (Netflow) 1/2 Use Netflow measurements from the border router –Netflow records time, duration, bytes, packets etc. per flow –Calculate throughput from bytes/duration for big flows (>10 MB) –Validate vs. iperf, bbcp etc. –No extra load on the network, and provides data on other SLAC & remote hosts & applications, ~10-20K flows/day, unique pairs/day
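A minimal sketch of the passive calculation described above, assuming the flow records have already been exported into simple dictionaries; the field names are illustrative, not a specific Netflow export format.

```python
MIN_BYTES = 10 * 1024 * 1024   # only consider big flows (> 10 MB)

def flow_throughputs_mbps(flows):
    """flows: iterable of dicts with assumed keys
    'src', 'dst', 'bytes', 'start_s', 'end_s'.
    Returns (src, dst, Mbits/s) for each qualifying flow."""
    results = []
    for f in flows:
        duration = f["end_s"] - f["start_s"]
        if f["bytes"] >= MIN_BYTES and duration > 0:
            mbps = f["bytes"] * 8 / duration / 1e6
            results.append((f["src"], f["dst"], mbps))
    return results

# Example with one made-up flow record (120 MB in 15 s -> ~64 Mbits/s):
example = [{"src": "slac-host", "dst": "remote-host",
            "bytes": 120_000_000, "start_s": 0.0, "end_s": 15.0}]
print(flow_throughputs_mbps(example))
```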

21 Passive 2/2 [Time series, SLAC to Caltech, Feb-Mar ’02: iperf throughput (Mbits/s) and bbftp throughput (0-80 Mbits/s) vs. date, each with active (+) and passive (+) points] Iperf active and passive measurements match well; bbftp reports less than it actually achieves.

22 Compression 60 Mbyte Objectivity file, using zlib, 8 streams, 64 KB window. Can improve throughput on this link with these hosts (Sun Ultra SPARCs with 360 MHz CPUs) by more than a factor of 2. Working on a formula f(file compressibility, host MHz, link Mbits/s) to decide whether compression is worth it.
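A minimal sketch of the kind of decision rule being worked towards: compression pays off when compressing plus sending the smaller file beats sending the raw file. The compression speed and ratio are inputs the caller must measure; the numbers below are illustrative assumptions, not the formula the slide refers to, and the model ignores any overlap of compression with transmission.

```python
def compression_worth_it(file_mbytes, compress_ratio, compress_mbytes_per_s, link_mbits_per_s):
    """True if (compress + send compressed) is faster than sending raw.
    compress_ratio = compressed size / original size (e.g. 0.5 for 2:1)."""
    link_mbytes_per_s = link_mbits_per_s / 8
    t_raw = file_mbytes / link_mbytes_per_s
    t_compressed = (file_mbytes / compress_mbytes_per_s                     # compress
                    + file_mbytes * compress_ratio / link_mbytes_per_s)     # send smaller file
    return t_compressed < t_raw

# Example (assumed): 60 MB file, 2:1 compression at 10 MB/s, 20 Mbit/s link
print(compression_worth_it(60, 0.5, 10.0, 20.0))   # True: ~18 s vs. ~24 s raw
```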

23 Impact on Others Make ping measurements with & without iperf loading –loss loaded (vs. unloaded) –RTT. Looking at how to avoid impact: e.g. QBSS/LBE, application pacing, a control loop on stdev(RTT) that reduces streams; want to avoid scheduling.
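A minimal sketch of the loaded vs. unloaded comparison: run the system ping during and outside an iperf transfer and compare loss and average RTT. The parsing assumes the usual Linux ping summary output; adjust for other platforms.

```python
import re
import subprocess

def ping_stats(host: str, count: int = 20):
    """Return (loss_percent, avg_rtt_ms) parsed from the Linux ping summary."""
    out = subprocess.run(["ping", "-c", str(count), host],
                         capture_output=True, text=True).stdout
    loss = float(re.search(r"([\d.]+)% packet loss", out).group(1))
    # summary line looks like: rtt min/avg/max/mdev = a/b/c/d ms
    avg_rtt = float(re.search(r"= [\d.]+/([\d.]+)/", out).group(1))
    return loss, avg_rtt

# Usage idea (hypothetical host): measure once while iperf is loading the path
# and once while it is idle, then report the differences:
# loaded = ping_stats("remote.example.org")
# idle = ping_stats("remote.example.org")
# print("delta RTT:", loaded[1] - idle[1], "ms; delta loss:", loaded[0] - idle[0], "%")
```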

24 SC2001 QBSS demo Sent data from 3 SLAC/FNAL booth computers (emulating a tier 0 or 1 HENP site) to over 20 other sites with good connections in about 6 countries –iperf TCP throughputs ranged from 3 Mbps to ~300 Mbps. Saturated the 2 Gbps connection to the floor network –maximum aggregate throughput averaged over 5 min. ~1.6 Gbps. Applied QBSS to the highest-performance site and BE to the rest. [Plots: pings to a host on the show floor comparing Priority (9 ± 2 ms), BE and QBSS; iperf TCP throughput per GE interface vs. time, with and without QBSS, 5-minute averages]

25 Possible HEP usage of QBSS Apply priority to lower-volume interactive voice/video-conferencing and real-time control. Apply QBSS to high-volume data replication. Leave the rest as Best Effort. Since 40-65% of the bytes to/from SLAC come from a single application, we have modified it to enable setting of the TOS bits. Need to identify bottlenecks and implement QBSS there; bottlenecks tend to be at the edges, so we hope to try it with a few HEP sites.
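For illustration, a minimal sketch of how an application can mark its own traffic for scavenger treatment by setting the IP TOS byte on its socket; DSCP 8 (TOS byte 0x20) is the QBone Scavenger Service code point, but whether and how a given transfer tool exposes such a knob is site- and application-specific.

```python
import socket

QBSS_DSCP = 8                  # QBone Scavenger Service code point (CS1)
TOS_BYTE = QBSS_DSCP << 2      # DSCP occupies the upper 6 bits of the TOS byte -> 0x20

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, TOS_BYTE)
# sock.connect(("remote.example.org", 5001))   # hypothetical endpoint; then transfer as usual
```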

26 Possible scenario A BaBar user wants to transfer a large volume (e.g. a TByte) of data from SLAC to IN2P3: –Select initial windows and streams from a table of pre-measured optimal values, or use an on-demand tool (extended iperf), or a reasonable default if none is available. The application uses the data volume to be transferred and a simple forecast to estimate how much time is needed –forecasts come from the active archive, Netflow, or on-demand one-end bandwidth estimation tools (e.g. pipechar, NWS TCP throughput estimator) –if the estimated duration is longer than some threshold, a more careful estimate is made using diurnal forecasting –the application reports to the user, who decides whether to proceed. The application turns on QBSS and starts transferring –for long transfers, it provides progress feedback using progress so far, Netflow measurements of this flow for the last few half-hours, diurnal corrections etc. –if falling behind the required duration, turn off QBSS and go to best effort –if throughput drops below some threshold, check for other sites.
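A rough pseudocode-style sketch of the control flow in the scenario above; every helper named here (optimal_params, forecast_duration, user_confirms, set_qbss, transfer_chunk) is a hypothetical placeholder for the services the slide describes, so this is an outline rather than runnable tooling.

```python
def bulk_transfer(volume_gb, src, dst, required_s=None):
    """Illustrative control flow only; all helpers are placeholders."""
    window, streams = optimal_params(src, dst)       # pre-measured table or on-demand tool
    est = forecast_duration(volume_gb, src, dst)     # from active archive / Netflow / estimators
    if est > 3600:                                   # long transfer: refine with diurnal forecast
        est = forecast_duration(volume_gb, src, dst, diurnal=True)
    if not user_confirms(est):                       # report the estimate, let the user decide
        return
    set_qbss(True)                                   # start as scavenger (QBSS) traffic
    done = elapsed = 0.0
    while done < volume_gb:
        done, elapsed = transfer_chunk(window, streams)            # progress feedback
        remaining = forecast_duration(volume_gb - done, src, dst)  # re-forecast from progress
        if required_s is not None and elapsed + remaining > required_s:
            set_qbss(False)                          # falling behind: revert to best effort
```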

27 Experiences Getting ssh accounts and resources on remote hosts –tremendous variation in account procedures from site to site, takes up to 7 weeks, requires knowing somebody who cares, sites are becoming increasingly circumspect –steep learning curve on ssh, different versions. Getting disk space for file copies (100s of Mbytes). Diversity of OSs, userids, directory structures, where to find perl, iperf..., contacts –required a database to track –also anonymizes hostnames, tracks code versions, whether to execute a command (e.g. no ping if the site blocks ping) & with what options. Developed tools to download software and to check remote configurations. Remote server (e.g. iperf) crashes: start & kill the server remotely for each measurement. Commands lock up or never end: time out all commands; some commands (e.g. pipechar) take a long time, so run them infrequently. AFS tokens to allow access to the .ssh identity timed out; used trscron. Protocol/port blocking –ssh following the Xmas attacks; bbftp, iperf ports; big variation between sites –wrote analyses to recognize blocking and worked with site contacts –an ongoing issue, especially with the increasing need for security, and since we want to measure inside firewalls close to real applications. Simple tool built for tracking problems.

28 Next steps Develop/extend management, analysis, reporting and navigation tools –improve robustness and manageability, work around ssh anomalies. Get improved forecasters and quantify how they work, provide tools to access them. Optimize intervals (using forecasts and lighter-weight measurements) and durations. Evaluate use of compression. Evaluate a self-rate-limiting application (bbcp). Tie in passive Netflow measurements. Add GridFTP (with UDPmon) & new BW measurers – netest, INCITE, pathrate, pathload –Make early data available via http to interested & “friendly” researchers –CAIDA for correlation and validation of pipechar & iperf etc. (sent documentation) –NWS for forecasting with UCSB (sent documentation). Understand correlations & validate the various tools, choose an optimum set. Make data available by standard methods (e.g. MDS, GMA, …) – with Jenny & Make tools portable, set up other monitoring sites, e.g. PPDG sites –Working with INFN/Trieste to port –Work with NIMI/GIMI to evaluate deploying dedicated engines –More uniformity, easier management, greater access granularity & authorization. Still need non-dedicated hosts: want measurements from real application hosts, closer to the real end user; some apps may not be ported to the GIMI OS –Not currently funded for GIMI engines –Use the same analysis, reporting etc.

29 More Information IEPM/PingER home site: –www-iepm.slac.stanford.edu/ Bulk throughput site: –www-iepm.slac.stanford.edu/monitoring/bulk/ SC2001 & high throughput measurements: –www-iepm.slac.stanford.edu/monitoring/bulk/sc2001/ Transfer tools: – – – – TCP Tuning: – –www-didc.lbl.gov/tcp-wan.html QBSS measurements: –www-iepm.slac.stanford.edu/monitoring/qbss/measure.html

30 Further slides

31 Typical QBSS test bed Set up a QBSS testbed and configure router interfaces –3 traffic types: QBSS, BE, Priority –Define the policy, e.g. QBSS > 1%, priority < 30% –Apply the policy to the router interface queues. [Diagram: Cisco 7200s connected at 10 Mbps, 100 Mbps and 1 Gbps]

32 Example of effects Also tried: 1 stream for all, priority at 30%, and 100 Mbps & 2 Gbps bottlenecks. The 2 Gbps bottleneck was a challenge to saturate: achieved at SC2001 with 3 Linux CPUs, 5 x 1 Gbps NIC cards and a 2 Gbps trunk from the subnet to the floor network, sending to 17 hosts in 5 countries. QBSS kicks in fast (<~1 s).

33 Impact on response time (RTT) Ran ping with iperf loading under various QoS settings (iperf ~93 Mbps): –No iperf: ping avg RTT ~300 usec (regardless of QoS) –Iperf = QBSS, ping = BE or Priority: RTT ~550 usec, ~70% greater than unloaded –Iperf = ping QoS (except Priority): RTT ~5 msec, more than a factor of 10 larger than unloaded.

34 File Transfer Used bbcp (written by Andy Hanushevsky) –similar methodology to iperf, except it runs for the file length rather than a fixed time, provides incremental throughput reports, supports /dev/zero (a duration option is being added) –looked at /afs/, /tmp/, /dev/null –checked different file sizes. Behavior with windows & streams is similar to iperf. Throughput with bbcp ~ 0.8 * throughput with iperf. For modest throughputs (< 50 Mbits/s) rates are independent of whether the destination is /afs/, /tmp/ or /dev/null. CPU utilization (~1 MHz per Mbit/s) is ~20% greater than for iperf.

35 Application rate-limiting Bbcp has transfer rate limiting –one could feed network information (e.g. from Web100 or independent pinging) to bbcp to reduce/increase its transfer rate, or change the number of parallel streams. [Plots: no rate limiting, 64 KB window, 32 streams vs. 15 MB/s rate limiting, 64 KB window, 32 streams]
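A minimal sketch of the kind of control loop hinted at above: watch the RTT relative to an unloaded baseline and nudge the sender's rate cap down when the path looks congested, up when it looks idle. The thresholds, step sizes and the idea of steering the tool with a new cap are illustrative assumptions, not bbcp's actual mechanism.

```python
def adjust_rate(current_mbps, rtt_ms, baseline_rtt_ms, min_mbps=1.0, max_mbps=100.0):
    """Return a new rate cap: back off when RTT is well above the unloaded
    baseline (path likely congested), creep up when it is close to it."""
    if rtt_ms > 2.0 * baseline_rtt_ms:
        return max(min_mbps, current_mbps * 0.5)    # multiplicative decrease
    if rtt_ms < 1.2 * baseline_rtt_ms:
        return min(max_mbps, current_mbps + 5.0)    # additive increase
    return current_mbps

# Usage idea: each control interval, measure RTT (e.g. with ping), compute
# new_cap = adjust_rate(cap, rtt, baseline), and steer the transfer tool
# (e.g. a rate-limited bbcp) with that cap or a changed stream count.
```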