Monitoring 10Gbps and beyond


Monitoring 10Gbps and beyond
Les Cottrell, LHC Tier0/Tier1 Network Meeting, CERN, July 2005
www.slac.stanford.edu/grp/scs/net/talk05/hsmon-cern-jul05.ppt

What we need is to understand how to monitor an optical private network: what the best approaches are, what to watch out for, and ideally how monitoring should be implemented so that the network is manageable. Partially funded by DOE/MICS for Internet End-to-end Performance Monitoring (IEPM).

Outline
- Why do we need monitoring?
- Active E2E measurements
- Passive: Netflow, SNMP, packet capture
- Conclusions
Main focus is on E2E, with little emphasis on layer 1.

Uses of Measurements
- Automated problem identification & troubleshooting:
  - Alerts for network administrators, e.g. bandwidth changes in time series, iperf, SNMP
  - Alerts for systems people, e.g. OS/host metrics
- Forecasts for Grid middleware, e.g. replica manager, data placement
- Engineering, planning, SLAs (set & verify)
- Security: spotting anomalies, intrusion detection
- Accounting

Active E2E Monitoring
- Layer 3 or 4. Layers 1 and 2 are less well exploited/understood and less directly related to applications. There are also lots of layer 1/2 instances: FC, FICON, 10GE, SONET, OC192, SDH ... Check vendor specs, e.g. Cisco, Juniper etc.
- Techniques: loopback, test patterns (BERT, e.g. ones & zeroes, various ITU-T specs), loss of signal, out of frame, loss of frame, errored seconds, code violations, unavailable seconds, alarms, near & far end, how often, history.

Notes, SONET monitoring (from Endace): the PHYMON occupies 1U (OC3/12) or 2U (OC48/192) of vertical rack space and is equipped with two 10/100/1000 copper Ethernet interfaces for control and reporting via LAN. Key features:
- Monitors up to two OC3/OC12/OC48/OC192 network links
- Detects link-layer failures: LOS-S, LOF-S, AIS-L, REI-L, RDI-L, AIS-P, LOP-P, UNEQ-P, REI-P, RDI-P
- Derives errors: CV, ES, ESA, ESB, SES and UAS according to the Bellcore GR-253 Issue 2 Rev 2 standard
- Sends SNMP traps for all failures and error thresholds according to user configuration
- Reports current status in real time via telnet, ssh or serial connection
- Reports accumulated status over 15 m, 1 h, 8 h, 24 h and 7 d intervals
- Retains historical data for 35 days
- Supplies all the underlying data for the SNMP SONET MIB (RFC 2558)

Using Active IEPM-BW Measurements
- Focus on high performance for a few hosts needing to send data to a small number of collaborator sites, e.g. the HEP tiered model
- Makes regular measurements (see the sketch below):
  - ping (RTT, connectivity), traceroute
  - pathchirp, ABwE (packet pair dispersion)
  - iperf (single & multi-stream), thrulay
  - bbftp (file transfer application)
  - Looking at GridFTP, but it is complex, requiring certificate renewal
- Lots of analysis and visualization
- Running at CERN, SLAC, FNAL, BNL and Caltech to about 40 remote sites
http://www.slac.stanford.edu/comp/net/iepm-bw.slac.stanford.edu/slac_wan_bw_tests.html
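To make the measurement regime concrete, here is a minimal sketch of such a regular-measurement loop, assuming stock ping and iperf binaries are installed; the host names and the 90-minute interval are illustrative, not IEPM-BW's actual scheduler.

```python
# Minimal sketch of a regular active-measurement loop in the spirit of
# IEPM-BW (hostnames and interval are illustrative assumptions).
import subprocess, time

REMOTE_SITES = ["node1.example.edu", "node2.example.org"]  # hypothetical hosts

def measure(host):
    """Run a lightweight ping probe and a short iperf throughput test."""
    ping = subprocess.run(["ping", "-c", "5", host],
                          capture_output=True, text=True)
    iperf = subprocess.run(["iperf", "-c", host, "-t", "10", "-f", "m"],
                           capture_output=True, text=True)
    return ping.stdout, iperf.stdout

while True:
    for site in REMOTE_SITES:
        rtt_report, bw_report = measure(site)
        print(site, rtt_report.splitlines()[-1])  # ping's rtt summary line
    time.sleep(90 * 60)  # probe interval; a real deployment schedules centrally
```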

Ping/traceroute
- Ping still useful (plus ça change ...)
  - Is the path connected? RTT, loss, jitter
  - Blocking unlikely
- OWAMP similar, but needs a server installed at the other end
- Traceroute: little use for a dedicated λ, but we still want to know the topology of paths

Packet Pair Dispersion
- Send packets with known separation and see how the separation changes due to the bottleneck: minimum spacing at the bottleneck, spacing preserved on higher-speed links (the arithmetic is sketched below)
- Can be minimally network-intrusive: e.g. ABwE uses only 20 packets per direction and is fast (< 1 s)
- From the PAM paper (http://www.pam2005.org/PDF/34310310.pdf), pathchirp is more accurate than ABwE, but:
  - takes ten times as long (10 s vs 1 s)
  - generates more network traffic (~ a factor of 10)
  - Pathload is a further factor of 10 beyond that
- IEPM-BW now supports ABwE, pathchirp and pathload
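The basic packet-pair arithmetic is compact enough to show directly; a sketch, with illustrative numbers:

```python
# The bottleneck stretches the inter-packet gap, and its capacity is
# recovered as packet size over the received spacing.
def bottleneck_capacity(packet_bits, spacing_out_s):
    """Estimate bottleneck capacity in bits/s from the received spacing."""
    return packet_bits / spacing_out_s

# e.g. 1500-byte packets arriving 12 microseconds apart -> ~1 Gbit/s
print(bottleneck_capacity(1500 * 8, 12e-6) / 1e9, "Gbit/s")
```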

BUT…
- Packet pair dispersion relies on accurate timing of the inter-packet separation
  - At > 1 Gbit/s this is getting beyond the resolution of Unix clocks (see the arithmetic below)
- AND 10GE NICs are offloading functions: interrupt coalescing, Large Send & Receive Offload, TOE
  - Need to work with TOE vendors
  - Turn off offload (Neterion supports multiple channels; offload can be eliminated to get more accurate timing in the host)
  - Do the timing in the NICs, but there are no standards for the interfaces
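To see why clock resolution matters, compare the back-to-back packet gap to the timestamp granularity a host can deliver; illustrative arithmetic:

```python
# Back-to-back 1500-byte packets: serialization time vs. link speed.
# At 10 Gbit/s the gap is ~1.2 us, comparable to or below the timestamp
# resolution many 2005-era Unix hosts could deliver to user space.
for gbps in (1, 10):
    gap_us = 1500 * 8 / (gbps * 1e9) * 1e6
    print(f"{gbps:>2} Gbit/s: {gap_us:.2f} us between packets")
```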

Achievable Throughput
- Use TCP or UDP to send as much data as possible, memory to memory, from source to destination
- Tools: iperf, netperf, thrulay; bbcp and GridFTP have a memory-to-memory mode

Iperf vs thrulay
- Iperf has multiple streams
- Thrulay is more manageable & gives RTT
- They agree well
- Throughput ~ 1/avg(RTT)
[Figure: achievable throughput (Mbit/s) plotted against minimum, average and maximum RTT (ms)]
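The observed ~1/avg(RTT) scaling is what window-limited TCP predicts (throughput ≤ window/RTT); a sketch with an assumed 4 MByte socket buffer:

```python
# Window-limited TCP throughput: at most window/RTT per stream.
# The 4 MB buffer and the RTT values are illustrative, not from the talk.
window_bytes = 4 * 1024 * 1024
for rtt_ms in (20, 100, 200):
    mbps = window_bytes * 8 / (rtt_ms / 1000) / 1e6
    print(f"RTT {rtt_ms:>3} ms -> at most {mbps:,.0f} Mbit/s per stream")
```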

Forecasting
- Over-provisioned paths should give a pretty flat time series
  - Short/local-term smoothing
  - Long-term linear trends
  - Seasonal smoothing
- But seasonal trends (diurnal, weekly) need to be accounted for
- Use Holt-Winters triple exponentially weighted moving averages (sketched below)

Notes: Predicting how long a file transfer will take requires forecasting network and application performance. However, such forecasting is beset with problems. These include seasonal (e.g. diurnal) variations in the measurements; the increasing difficulty of making accurate, minimally intrusive active measurements, especially on high-speed (> 1 Gbit/s) networks and with Network Interface Card (NIC) offloading; the intrusiveness of making more realistic active measurements on the network; the differences between network and large-file-transfer performance; and the difficulty of getting sufficient relevant passive measurements to enable forecasting. We discuss each of these problems, compare and contrast the effectiveness of various solutions, look at how some of the methods may be combined, and identify practical ways to move forward.
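A compact sketch of additive Holt-Winters smoothing as it might be applied to such a series; the smoothing constants are illustrative, not the values used in IEPM-BW.

```python
# Additive Holt-Winters (triple exponential smoothing) sketch.
# series: list of throughput samples; season_len: samples per season
# (e.g. one day's worth for a diurnal cycle). Needs len(series) > season_len.
def holt_winters(series, season_len, alpha=0.5, beta=0.1, gamma=0.1):
    """Return one-step-ahead forecasts for each point in the series."""
    level = series[0]
    trend = (series[season_len] - series[0]) / season_len
    seasonal = [series[i] - level for i in range(season_len)]
    forecasts = []
    for t, y in enumerate(series):
        s = seasonal[t % season_len]
        forecasts.append(level + trend + s)     # forecast before updating
        last_level = level
        level = alpha * (y - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - last_level) + (1 - beta) * trend
        seasonal[t % season_len] = gamma * (y - level) + (1 - gamma) * s
    return forecasts
```

Residuals between forecasts and observations can then feed the alerting described on the "Uses of Measurements" slide.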

BUT…
- At 10 Gbit/s on a transatlantic path, slow start takes over 6 seconds (a rough calculation follows below)
  - To get 90% of the measurement in congestion avoidance, one needs to measure for about 1 minute (5.25 GBytes at 7 Gbit/s, today's typical performance)
- Needs scheduling to scale, and even then ...
- It's not disk-to-disk, so use bbcp, bbftp or GridFTP for that
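Rough slow-start arithmetic behind the numbers above, assuming a ~170 ms transatlantic RTT, 1500-byte segments and cwnd doubling once per RTT; delayed ACKs make real slow start noticeably longer than this idealized bound, consistent with the >6 s quoted.

```python
# Idealized slow-start duration: cwnd doubles each RTT until the window
# matches the bandwidth-delay product. All parameters are assumptions.
import math

rtt_s, mss_bytes, target_bps = 0.170, 1500, 7e9
target_window_pkts = target_bps * rtt_s / (mss_bytes * 8)
rounds = math.ceil(math.log2(target_window_pkts))
print(f"~{rounds} RTTs = {rounds * rtt_s:.1f} s just to open the window")
```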

Passive - Netflow

Netflow et al.
- The switch identifies a flow by source/destination ports and protocol, and cuts a record for each flow: source, destination, ports, protocol, TOS, start and end time
- Collect the records and analyze (see the sketch below)
- Can be a lot of data to collect each day and needs a lot of CPU: hundreds of MBytes to GBytes
- No intrusion, real traffic, real collaborators; no accounts/passwords/certificates/keys
- Characterize traffic: top talkers, applications, flow lengths, etc.
- Internet2 backbone: http://netflow.internet2.edu/weekly/
- SLAC: www.slac.stanford.edu/comp/net/slac-netflow/html/SLAC-netflow.html
- NetraMet and SCAMPI are a couple of non-commercial flow projects; IPFIX is an IETF standardization effort for Netflow-type passive monitoring
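A minimal sketch of the top-talkers-by-port reduction over exported flow records; the CSV layout and field names are assumed for illustration, not a real NetFlow export format.

```python
# Aggregate exported flow records by server port to find top applications.
import csv
from collections import Counter

bytes_by_port = Counter()
with open("flows.csv") as f:          # hypothetical flow dump
    for rec in csv.DictReader(f):     # assumed fields: src, dst, dstport, bytes
        bytes_by_port[rec["dstport"]] += int(rec["bytes"])

for port, nbytes in bytes_by_port.most_common(10):
    print(f"port {port:>5}: {nbytes / 1e6:,.0f} MBytes")
```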

Top Talkers by Application/Port
[Figure: MBytes/day per hostname on a log scale (1 to 10000); volume is dominated by a single application, bbcp]

Flow Sizes
[Figure: flow-size distributions; legend: SNMP, Real A/V, AFS file server]
- 60% of TCP flows last less than 1 second; we would expect TCP streams to be longer-lived
- But 60% of UDP flows last over 10 seconds, maybe due to heavy use of AFS
- Heavy-tailed; in ~ out; UDP flows shorter than TCP; packets ~ bytes
- 75% of TCP-in < 5 kBytes, 75% of TCP-out < 1.5 kBytes (< 10 packets); UDP 80% < 600 Bytes (75% < 3 packets); ~10x more TCP than UDP
- Top UDP applications: AFS (> 55%), Real (~25%), SNMP (~1.4%)

Forecasting?
- Use Netflow records at the border; collect records for several weeks
- Filter: 40 major collaborator sites, big (> 100 kBytes) flows, bulk-transport applications/ports (bbcp, bbftp, iperf, thrulay ...)
- Divide by remote site, add up parallel streams
- Fold the data onto one week and see bands at known capacities and RTTs
- ~500K flows (a sketch of the reduction steps follows below)
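A sketch of the two reduction steps, merging parallel streams and folding onto one week; the record fields and the 5-second merge window are assumptions.

```python
# (1) Merge parallel streams of one transfer by bucketing flows between the
#     same pair of hosts that start within a few seconds of each other.
# (2) Fold start times onto a single week to expose diurnal/weekly structure.
from collections import defaultdict

WEEK = 7 * 24 * 3600

def merge_parallel(flows, window_s=5):
    """flows: list of dicts with src, dst, start (epoch s), bytes, dur."""
    merged = defaultdict(lambda: {"bytes": 0, "dur": 0.0})
    for fl in flows:
        key = (fl["src"], fl["dst"], int(fl["start"] // window_s))
        merged[key]["bytes"] += fl["bytes"]
        merged[key]["dur"] = max(merged[key]["dur"], fl["dur"])
    return merged

def fold_week(epoch_s):
    """Seconds since the start of the (folded) week."""
    return epoch_s % WEEK
```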

Netflow et al.
[Figure: folded weekly throughput bands] Peaks occur at known capacities and RTTs; the RTTs suggest windows are not optimized.

How Many Sites Have Enough Flows?
- In May '05, found 15 sites with > 1440 flows (one per 30 minutes), enough for time-series forecasting of seasonal effects
- Three sites (Caltech, BNL, CERN) were actively monitored; the rest were "free"
- Only 10% of sites have big seasonal effects; the remainder need fewer flows
- So promising

Compare Active with Passive
- Predict flow throughputs from Netflow data for SLAC to Padova for May '05
- Compare with E2E active ABwE measurements

Netflow Limitations
- Use of dynamic ports: GridFTP, bbcp and bbftp can use fixed ports, but P2P often uses dynamic ports
- Discriminate the type of flow based on headers, not relying on ports
  - Types: bulk data, interactive ...
  - Discriminators: inter-arrival time, length of flow, packet length, volume of flow
  - Use machine learning/neural nets to cluster flows (a hedged sketch follows below), e.g. http://www.pam2004.org/papers/166.pdf
- SCAMPI/FFPF/MAPI allow more flexible flow definitions; see www.ist-scampi.org/
- Use application logs (OK if there are a small number); for FTP, port 21 is only used as a control channel
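A hedged sketch of the clustering idea, grouping flows by header-derived features rather than ports; it leans on scikit-learn's k-means, and both the feature choice and the toy data are placeholders.

```python
# Cluster flows on behavioural features instead of trusting port numbers.
import numpy as np
from sklearn.cluster import KMeans

# one row per flow: [mean inter-arrival (s), duration (s), mean pkt len (B)]
features = np.array([
    [0.0001, 120.0, 1400.0],   # looks like bulk data
    [0.2000,   3.0,   80.0],   # looks interactive
    [0.0002,  90.0, 1450.0],
    [0.1500,   2.0,  100.0],
])

labels = KMeans(n_clusters=2, n_init=10).fit_predict(features)
print(labels)   # flows grouped by behaviour, not by port
```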

Passive SNMP MIBs

Apply Forecasts to Network Device Utilizations to Find Bottlenecks
- Get measurements from the Internet2/ESnet/Geant SONAR project
  - The ISP reads the MIBs and saves them in an RRD database (a polling sketch follows below)
  - Make the RRD info available via web services
- Save as time series; forecast for each interface
- For a given path and duration, forecast the most probable bottlenecks
- Use MPLS to apply QoS at the bottlenecks (rather than over the entire path) for selected applications
- NSF proposal
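A sketch of the per-interface measurement that would feed such an RRD: poll the 64-bit input octet counter twice via the net-snmp snmpget CLI and divide by the interval. Host name, ifIndex and community string are placeholders.

```python
# Poll IF-MIB::ifHCInOctets via the net-snmp 'snmpget' CLI and compute the
# input rate over a 60 s interval (assumes SNMP v2c and net-snmp installed).
import subprocess, time

def if_octets(host, ifindex, community="public"):
    out = subprocess.run(
        ["snmpget", "-v2c", "-c", community, "-Oqv", host,
         f"IF-MIB::ifHCInOctets.{ifindex}"],
        capture_output=True, text=True, check=True).stdout
    return int(out)

t0, c0 = time.time(), if_octets("router.example.net", 1)
time.sleep(60)
t1, c1 = time.time(), if_octets("router.example.net", 1)
print(f"in: {(c1 - c0) * 8 / (t1 - t0) / 1e6:.1f} Mbit/s")
```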

Passive – Packet capture

10G Passive Capture
- Endace (www.endace.com): OC192 Network Measurement Cards = DAG 6 (offload vs NIC)
  - Commercial OC192Mon, non-commercial SCAMPI
  - Line rate; capture up to >~ 1 Gbps
  - Expensive; massive data capture (e.g. PB/week); requires tap insertion
- D.I.Y. with NICs instead of NMC DAGs
  - Need PCI-E or PCI-X 2.0 (DDR) and a powerful multi-CPU host
  - Apply sampling (see the sketch below); see www.uninett.no/publikasjoner/foredrag/scampi-noms2004.pdf
- Also have tcpdump/libpcap to capture
- CoralReef captures and analyzes packets; NetraMet/FlowScan, where the latter analyzes and reports
- MAPI/FFPF with the Intel IXP1200, plus LOBSTER

Notes, Endace key features: the Endace DAGMON is a 3U rackmount server system with a pair of single-channel DAG 6.2SE OC192c Network Measurement Cards, with CPU, memory and disks according to configuration options; Linux and DAG device drivers come preinstalled (FreeBSD available); it has a conditioned clock with 1PPS input and local synchronization capability, and captures up to 1.064 GBytes/s of network traffic. Applications: network monitoring systems, traffic characterization and delay measurement, packet header capture, network delay measurements, and network security applications such as Intrusion Detection Systems. The DAG card is capable of full-packet capture, uses zero copy (shared memory, reducing interrupts), has a high-precision timestamp and a large static circular buffer. DAGs are expensive (list price for a 10GE NMC is ~ $50K). With an Intel Pro 1GE NIC in a dual 1.8 GHz AMD Athlon at 700 Mbit/s, ~50 loss-free flows can be captured with libpcap; with a DAG, ~100.
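A sketch of simple 1-in-N sampling on a commodity NIC, piping tcpdump's line-per-packet output through Python (needs capture privileges); the interface name and sampling rate are illustrative, and real 10G capture needs the hardware discussed above.

```python
# Keep every Nth packet summary from tcpdump's line-buffered output.
import subprocess

N = 100  # sampling rate: keep 1 packet in 100
proc = subprocess.Popen(
    ["tcpdump", "-i", "eth0", "-n", "-l"],   # -l: line-buffered output
    stdout=subprocess.PIPE, text=True)

for i, line in enumerate(proc.stdout):
    if i % N == 0:
        print(line.rstrip())   # hand sampled packets to the analyzer
```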

LambdaMon (Joerg Micheel, NLANR)
- Taps G.709 signals in DWDM equipment
- Filters out the required wavelength
- Can monitor multiple λ's sequentially (2 tunable filters)

LambdaMon
- Place at a PoP and add a switch to monitor many fibers; more cost-effective
- Multiple G.709 transponders for 10G
- Low-level signals; amplification is expensive
- Even more costly; funding/loans ended ...

Conclusions
- Traceroute is probably dead
- Some things continue to work: ping, OWAMP; iperf, thrulay, bbftp ... but:
- Packet pair dispersion needs work; its time may be over
- Passive looks promising, with Netflow
- SNMP needs the AS to make it accessible
- Capture is expensive: ~$100K (Joerg Micheel) for an OC192Mon

More Information
- Comparisons of active infrastructures:
  www.slac.stanford.edu/grp/scs/net/proposals/infra-mon.html
- Some active public measurement infrastructures:
  www-iepm.slac.stanford.edu/
  e2epi.internet2.edu/owamp/
  amp.nlanr.net/
  www-iepm.slac.stanford.edu/pinger/
- Capture:
  www.endace.com (DAG), www.pam2005.org/PDF/34310233.pdf
  www.ist-scampi.org/ (also MAPI, FFPF), www.ist-lobster.org
- Monitoring tools:
  www.slac.stanford.edu/xorg/nmtf/nmtf-tools.html
  www.caida.org/tools/