Slide 1 (9/29/15): End-to-End Performance Tuning and Best Practices
Moderator: Charlie McMahon, Tulane University
Panelists: Jan Cheetham, University of Wisconsin-Madison; Chris Rapier, Pittsburgh Supercomputing Center; Paul Gessler, University of Idaho; Maureen Dougherty, USC
Wednesday, September 29, 2015

Slide 2: Paul Gessler, Professor & Director, Northwest Knowledge Network, University of Idaho

Slide 3: Enabling 10 Gbps connections to:
- the Idaho Regional Optical Network
- the UI Moscow campus network core
- the Northwest Knowledge Network and DMZ
- DOE's Idaho National Lab
Implemented perfSONAR monitoring for the Idaho Institute for Biological and Evolutionary Studies

Slide 4: [no transcribed text]

Slide 5: [no transcribed text]

Slide 6: Jan Cheetham, Research and Instructional Technologies Consultant, University of Wisconsin-Madison

Slide 7: University of Wisconsin Campus Network
Diagram: HEP, Biotech, IceCube, SSEC, Engineering, LOCI, WID, WEI, and CHTC connect through the Campus Network Distribution and Science DMZ to the 100G Internet2 Innovation Network, with perfSONAR measurement points.

Slide 8: Diagnosing Network Issues
perfSONAR helps uncover problems with:
- TCP window size issues to San Diego
- An optical fiber cut affecting a latency-sensitive link between SSEC and NOAA
- A line card failure resulting in dropped packets on a research partner's (WID) LAN
- Transfers from internal data stores to distributed compute resources (HTCondor pools)

Slide 9: Dealing with Firewalls
- Research computing that can't use a firewall: covered instead by a security baseline for research computing
- Systems that must be behind a firewall: upgraded the firewall to a high-speed backplane to allow 10G throughput to campus, in preparation for the campus network upgrade
- Plan to use SDN to shunt some traffic (identified uses within our security policy)

Slide 10: Challenges
- 100 GE line card failure (pursuing a buffer overflow issue)
- Separating spiky research traffic from the rest of campus network traffic
- Distributed campus: getting the word out so everyone can take advantage
- Limitations of internal network environments for researchers
- Storage bottlenecks

Slide 11: Chris Rapier, Senior Research Programmer, Pittsburgh Supercomputing Center

Slide 12: XSight & Web10G
Goal: use the metrics provided by Web10G to enhance workflows through early identification of pathological flows.
- A distributed set of Web10G-enabled listeners on Data Transfer Nodes across multiple domains
- Gather data on all flows of interest and collate it in a centralized database
- Analyze the data to find marginal and failing flows
- Provide NOCs with actionable data in near real time

Slide 13: Implementation
- Listener: a C application that periodically polls all TCP flows and applies a rule set
- Database: InfluxDB, a time-series database
- Analysis engine: currently applies a heuristic approach; development of models is in progress
- UI: a web-based logical map that lets engineers drill down to failing flows and display the collected metrics
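The heuristic stage described above might look something like the following sketch. The counter names loosely mirror per-connection TCP instruments of the kind Web10G exposes (segments sent, segments retransmitted), but the field names and thresholds here are illustrative assumptions, not XSight's actual rule set.

```python
# Illustrative heuristic flow classifier in the spirit of the XSight
# analysis engine. Metric names and thresholds are invented for
# illustration; they are not XSight's real rules.

def classify_flow(metrics):
    """Label a polled flow 'failing', 'marginal', 'healthy', or 'idle'."""
    segs_out = metrics.get("SegsOut", 0)
    segs_retrans = metrics.get("SegsRetrans", 0)
    throughput_mbps = metrics.get("ThroughputMbps", 0.0)

    if segs_out == 0:
        return "idle"

    loss_ratio = segs_retrans / segs_out
    if loss_ratio > 0.01:                # >1% retransmits: pathological
        return "failing"
    if loss_ratio > 0.001 or throughput_mbps < 100:
        return "marginal"                # elevated loss or low rate
    return "healthy"
```

A listener would run something like `classify_flow` over each polled flow and only forward the marginal and failing ones to the NOC-facing UI.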

Slide 14: Results
- Analysis engine and UI still in development
- Looking for partners for listener deployment (including NOCs)
- 6 months left under the EAGER grant; will be seeking to renew it

Slide 15: Maureen Dougherty, Director, Center for High-Performance Computing, USC

Slide 16: Trojan Express Network II
Goal: develop a next-generation research network in parallel to the production network to address increasing research data transfer demands
- Leverage the existing 100G Science DMZ
- Instead of expensive routers, use cheaper high-end network switches
- Use OpenFlow running on a server to control the switches
- perfSONAR systems for metrics and monitoring

Slide 17: Trojan Express Network Buildout

Slide 18: Collaborative Bandwidth Tests
- 72.5 ms round trip between USC and Clemson over a shared 100 Gbps link
- 12-machine OrangeFS cluster at USC, each machine directly connected to a Brocade switch at 10 Gbps
- 12 clients at Clemson
- USC ran nuttcp sessions between pairs of USC and Clemson hosts
- Clemson ran file copies to the USC OrangeFS cluster
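A single nuttcp stream is essentially a memory-to-memory TCP push with the achieved rate measured at the end. A minimal localhost approximation with plain sockets is sketched below; the helper name and transfer sizes are illustrative, and this is no substitute for nuttcp on real wide-area paths.

```python
# Minimal memory-to-memory throughput probe in the spirit of one nuttcp
# stream: a sender pushes a fixed payload over TCP while a receiver
# thread counts bytes. Localhost only; the USC/Clemson tests ran real
# nuttcp pairs between dedicated hosts.
import socket
import threading
import time

def run_probe(total_bytes=50 * 1024 * 1024, chunk=64 * 1024):
    server = socket.socket()
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]
    received = {"n": 0}

    def sink():
        # Receiver side: drain the connection and count bytes until EOF.
        conn, _ = server.accept()
        while True:
            data = conn.recv(chunk)
            if not data:
                break
            received["n"] += len(data)
        conn.close()

    t = threading.Thread(target=sink)
    t.start()

    # Sender side: push total_bytes of zero-filled payload.
    payload = b"\0" * chunk
    client = socket.create_connection(("127.0.0.1", port))
    start = time.monotonic()
    sent = 0
    while sent < total_bytes:
        client.sendall(payload)
        sent += len(payload)
    client.close()
    t.join()
    elapsed = time.monotonic() - start
    server.close()

    gbps = received["n"] * 8 / elapsed / 1e9
    return received["n"], gbps
```

`run_probe()` returns the bytes received and the achieved rate in Gbit/s; the aggregate tests above run one nuttcp pair per node and sum the per-pair rates.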

Slide 19: Linux Network Configuration
Bandwidth-delay product: 72.5 ms x 10 Gbit/s = 725,000,000 bits, about 90 MBytes
net.core.rmem_max =
net.core.wmem_max =
net.ipv4.tcp_rmem =
net.ipv4.tcp_wmem =
net.ipv4.tcp_congestion_control = yeah
jumbo frames enabled (MTU 9000)
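The bandwidth-delay product above can be checked with a couple of lines; the function name is just for illustration.

```python
# Bandwidth-delay product: the socket buffer must hold a full round
# trip's worth of data to keep the pipe busy. For the 72.5 ms
# USC-Clemson path at 10 Gbit/s this comes to roughly 90 MB, which is
# the target the kernel buffer limits are sized against.
def bdp_bytes(rtt_seconds, rate_bits_per_second):
    """Bytes in flight needed to fill the path."""
    return round(rtt_seconds * rate_bits_per_second / 8)

bdp = bdp_bytes(0.0725, 10e9)  # 90,625,000 bytes, about 90 MB
```

In practice the computed BDP is rounded up and used for `net.core.rmem_max`/`wmem_max` and the max field of `net.ipv4.tcp_rmem`/`tcp_wmem`.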

Slide 20: nuttcp Bandwidth Test
Peak transfer of 72 Gb/s with 9 nodes

Slide 21: Contact Information
- Charlie McMahon, Tulane University
- Jan Cheetham, University of Wisconsin-Madison
- Chris Rapier, Pittsburgh Supercomputing Center
- Paul Gessler, University of Idaho