Network Monitoring, WAN Performance Analysis, & Data Circuit Support at Fermilab
Phil DeMar, US-CMS Tier-3 Meeting, Fermilab, October 23, 2008
Active Wide-Area Network Monitoring

PerfSONAR: distributed network monitoring infrastructure
- Supported by the US-LHC Tier-1 sites and the Internet2 community

perfSONAR-PS: active monitoring package
- Web-services collection built on trusted monitoring tools: ping, BWCTL (iperf), OWAMP, NPAD, NDT
- Web-service interface for pulling data into other monitoring tools
- Zero-configuration, out-of-the-box deployment
  - Based on a Knoppix Live CD bootable disk
  - Optional software-bundle deployment
- Modest hardware requirements for on-site deployment

PerfSONAR Deployment Status

US-ATLAS is moving ahead with perfSONAR-PS at its Tier-1 & Tier-2s:
- Two dedicated systems per site, one each for latency & bandwidth testing
- Systems are spec'ed devices, $628 each (KOI Computing)
- Utilize Knoppix disks & standard configurations

We've recommended the same model for US-CMS.

Current perfSONAR-PS deployment:
- Both US-LHC Tier-1s (FNAL & BNL)
- UNL (CMS), U-Mich (ATLAS), U-Delaware, Internet2, ESnet
- Complete active monitoring matrix among the above

Background Information

- perfSONAR-PS project
- A tour of the perfSONAR-PS services is available
- Knoppix Live CD bootable disk info
- Appliance PCs:
  - Vendor: KOI Computing – (630)
  - Spec: 1U Intel Pentium Dual-Core E GHz system
  - Cost: $628 each

Performance Analysis Support

In 1999, Matt Mathis coined the term "Wizard's Gap":
- Today, it's still an issue

Users often don't know about:
- Common OS tuning issues for WAN data movement
- The wide-area network path, its characteristics, and the available tools

It's still an end-to-end problem:
- And the world is still short on wizards

Our structured analysis methodology seeks to capture some of that wizardry in a repeatable process.
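The size of that gap is easy to illustrate with two back-of-the-envelope numbers: the bandwidth-delay product (the socket buffer needed to fill a path) and the well-known Mathis et al. steady-state TCP throughput bound, BW <= MSS / (RTT * sqrt(p)). A minimal sketch, with hypothetical path numbers:

```python
import math

def bandwidth_delay_product(rate_bps, rtt_s):
    """Bytes that must be in flight to fill the pipe: rate * RTT / 8."""
    return rate_bps * rtt_s / 8

def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate):
    """Mathis et al. upper bound on steady-state TCP throughput:
    BW <= (MSS / RTT) * (1 / sqrt(p))."""
    return (mss_bytes * 8 / rtt_s) / math.sqrt(loss_rate)

# Hypothetical transatlantic path: 1 Gb/s, 100 ms RTT
bdp = bandwidth_delay_product(1e9, 0.100)
print(f"Socket buffer needed to fill a 1 Gb/s, 100 ms path: {bdp / 1e6:.1f} MB")

# Even 0.01% loss caps an untuned 1460-byte-MSS flow well below line rate:
cap = mathis_throughput_bps(1460, 0.100, 1e-4)
print(f"Mathis bound at 0.01% loss: {cap / 1e6:.1f} Mb/s")
```

A default-configured 2008-era host typically offered far less than the 12.5 MB of buffer this path needs, which is exactly the kind of tuning issue the methodology below is designed to surface.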

Find the performance problem area(s)

Performance Analysis Methodology

Structured approach to performance analysis; model the process like a medical diagnosis:
- Collect the physical characteristics
- Run diagnostic tests
- Record everything; develop a history of the analysis

Strategic approach:
- Sub-divide the problem space:
  - Application-related problems
  - Host diagnosis and tuning
  - Network path analysis
- Then divide and conquer

Network Performance Analysis Architecture

[Architecture diagram slide; only the label "PTDS" survives transcription.]

Performance Analysis Tools

Host diagnosis:
- Script that pulls the system configuration
- Network Diagnostic Tool (NDT): faulty network connections & NICs, duplex mismatches

Network path diagnosis:
- OWAMP to collect and diagnose one-way network path statistics: packet loss, latency, jitter
- Other tools, such as ping and traceroute, as needed

Packet trace diagnosis:
- Port mirror on the border router(s)
- tcpdump to collect packet traces
- tcptrace to analyze packet traces
- xplot for visual examination
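The "script that pulls the system configuration" step amounts to collecting a handful of kernel settings. The sketch below parses sysctl-style "key = value" output and reports the standard Linux TCP buffer sysctls that most often limit WAN throughput; the sample output is hypothetical:

```python
def parse_sysctl(text):
    """Parse `sysctl -a`-style 'key = value' lines into a dict."""
    settings = {}
    for line in text.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            settings[key.strip()] = value.strip()
    return settings

# Standard Linux sysctls that commonly limit wide-area TCP throughput.
WAN_TUNING_KEYS = [
    "net.core.rmem_max",   # max receive socket buffer (bytes)
    "net.core.wmem_max",   # max send socket buffer (bytes)
    "net.ipv4.tcp_rmem",   # min/default/max TCP receive buffer
    "net.ipv4.tcp_wmem",   # min/default/max TCP send buffer
]

def wan_tuning_report(sysctl_output):
    settings = parse_sysctl(sysctl_output)
    return {k: settings.get(k, "(not set)") for k in WAN_TUNING_KEYS}

# Hypothetical excerpt from an untuned host:
sample = """\
net.core.rmem_max = 212992
net.core.wmem_max = 212992
net.ipv4.tcp_rmem = 4096 131072 6291456
net.ipv4.tcp_wmem = 4096 16384 4194304
"""

for key, value in wan_tuning_report(sample).items():
    print(f"{key}: {value}")
```

In practice the output would be recorded alongside the NDT and OWAMP results, so the analysis history shows the host state at the time of each test.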

Network Path Characteristics Collected

- Round-trip time
- Sequence of routers along the paths
- One-way delay, delay variance
- One-way packet drop rate
- Packet reordering
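Most of these characteristics fall out of per-packet OWAMP-style records. A minimal sketch over hypothetical samples (sequence number plus one-way delay in ms, in arrival order, with None marking a drop):

```python
from statistics import mean, pstdev

def path_stats(samples):
    """samples: list of (seq, one_way_delay_ms) in arrival order;
    a delay of None marks a dropped packet."""
    delays = [d for _, d in samples if d is not None]
    arrived = [s for s, d in samples if d is not None]
    sent = len(samples)
    # A packet is reordered if it arrives after one with a larger seq.
    reordered = sum(1 for i in range(1, len(arrived))
                    if arrived[i] < max(arrived[:i]))
    return {
        "one_way_delay_ms": round(mean(delays), 2),
        "delay_variation_ms": round(pstdev(delays), 2),  # jitter proxy
        "drop_rate": (sent - len(delays)) / sent,
        "reordered": reordered,
    }

# Hypothetical 8-packet OWAMP session: one drop, one reordered packet,
# and one delay spike.
samples = [(0, 50.1), (1, 50.3), (3, 50.2), (2, 55.0),
           (4, None), (5, 50.4), (6, 50.2), (7, 50.3)]
stats = path_stats(samples)
print(stats)
```

The same summary computed in both directions of a path is what distinguishes, say, an asymmetric-routing problem from simple congestion.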

Network Performance Analysis Methodology

Step 1: Define the problem space
Step 2: Collect host information & network path characteristics
Step 3: Host tuning & diagnosis
Step 4: Network path performance analysis
- Does the route change frequently?
- Network congestion: is the delay variance large?
- Infrastructure failures: examine the counters one by one
- Packet reordering: load balancing? parallel processing?
Step 5: Evaluate the packet trace pattern

Tier-2/Tier-3 Sites Worked With

- UERJ (Brazil)
- IHEP (China)
- RAL (UK)
- University of Florida
- IFCA (Spain)
- TTU (Texas)
- CIEMAT (Spain)
- Belgium
- OWEA (Austria)
- CSCS (Switzerland)

Performance Analysis Status & Summary

An available service for CMS Tier-2/3 sites:
- A work in progress at this point
- Focus is on the process as well as the results
- Willing to work with others in this area

Future areas of effort:
- Incorporate into workflow & content-management systems
- Make use of the perfSONAR monitoring infrastructure

How to get hold of us:
- Send to
- Wide Area Work Group video-conference meetings every other Friday

Strategic Direction Toward Circuits

The DOE High-Performance Network Planning Workshop established a strategic model to follow:
- High-bandwidth backbones for reliable production IP service (ESnet)
- Separate high-bandwidth network paths for large-scale science data flows (Science Data Network)
- Metropolitan Area Networks (MANs) for local access
  - Fermi LightPath is a cornerstone of the Chicago-area MAN

ESnet4 Core Networks

[Map slide: ESnet4 core topology built from 10 Gb/s circuits, with hubs including Seattle, Sunnyvale, LA, San Diego, Boise, Denver, Albuquerque, Kansas City, Tulsa, Houston, Chicago, Cleveland, Boston, New York, Washington DC, Atlanta, and Jacksonville. Legend: production IP core (10 Gbps); Science Data Network (SDN) core; MANs (20-60 Gbps) or backbone loops for site access; international connections, including CERN (30+ Gbps) via USLHCNet, Europe (GEANT), Canada (CANARIE), GLORIAD (Russia and China), Asia-Pacific, Australia, and South America (AMPATH).]

Topology of Circuit Connections

Circuits utilize the MAN infrastructure:
- 10GE channel(s) reserved for routed IP service (purple in diagram)
- LHCOPN circuit (orange) to CERN
- SDN channels for E2E circuits to CMS Tier-2/3 sites (shades of green)

Circuits are based on end-to-end vLANs:
- Direct BGP peering with the remote site

Multiple provider domains are the norm:
- Deployed technology varies by the domains involved
- Complexity is higher than for IP service

FNAL Alternate Path Circuits

- Supported since 2004
- Serve a wide spectrum of experiments
  - CMS Tier-2s are heavy users
- Implemented on multiple technologies
  - But all based on end-to-end layer-2 paths
- Usefulness has varied

E2E Circuit Summary

FNAL currently supports E2E circuits to the Tier-0 & Tier-2s:
- And a few Tier-3s

Today, circuits are largely static configurations.

Dynamic circuit services are becoming available:
- Driven largely by Internet2 DCN services

Alternate-path support services are also emerging:
- Lambda Station (FNAL)
- TeraPaths (BNL)

Contact for help or