Mitigating Inter-Job Interference Using Adaptive Flow-Aware Routing

Presentation transcript:

Mitigating Inter-Job Interference Using Adaptive Flow-Aware Routing. Staci A. Smith, Clara E. Cromey, David K. Lowenthal (The University of Arizona); Jens Domke (Tokyo Institute of Technology); Nikhil Jain, Jayaraman J. Thiagarajan, Abhinav Bhatele (Lawrence Livermore National Laboratory)

Inter-job network interference on production systems: Dedicated nodes give each job isolated compute resources, but the shared network leads to inter-job network contention. Torus-based systems can have up to 2x performance degradation [Bhatele 13]. Evidence on dragonfly-based systems indicates variability as well [Chunduri 17]. Figure credit: Bhatele et al., "There goes the neighborhood", SC'13.

Our contributions: (1) Measured inter-job interference on dragonfly and fat-tree clusters: there is more than 2x performance degradation with both interconnects. (2) Performed analysis of interference on the fat-tree cluster: performance degradation is caused by a few network hotspots on fat-tree. (3) Developed a routing-based mitigation strategy on fat-tree clusters: the strategy, Adaptive Flow-Aware Routing or AFAR, achieves up to 46% runtime improvement on benchmarks run under contention.

Background: Routing in modern systems. Dragonfly: routed adaptively; the path between each pair of nodes is non-deterministic (multipath routing), which attempts to avoid congestion. Fat-tree: typically routed statically; the path between each pair of nodes is deterministic (single-path routing) and oblivious to congestion. Adaptive routing has not been used on fat-trees until very recently (new Sierra and Summit systems [Vazhkudai et al. SC'18]).
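As a rough illustration of the difference, the sketch below contrasts a static up-link choice with a load-aware one at a single switch. The destination-mod-k rule is one common deterministic fat-tree scheme and is used here only as an assumption; the slides do not say which static scheme Cab uses. The function names and load values are hypothetical.

```python
def static_uplink(dest_node: int, num_uplinks: int) -> int:
    """Deterministic choice: a given destination always maps to the same
    up-link, however busy that link is (assumed destination-mod-k rule)."""
    return dest_node % num_uplinks


def adaptive_uplink(link_loads: list[float]) -> int:
    """Congestion-aware choice: pick the currently least-loaded up-link."""
    return min(range(len(link_loads)), key=lambda i: link_loads[i])


# Hypothetical loads (arbitrary units) on the four up-links of one switch.
loads = [9.0, 1.0, 4.0, 7.0]
print(static_uplink(dest_node=12, num_uplinks=4))  # 0: the busiest link here
print(adaptive_uplink(loads))                      # 1: the least-loaded link
```

The static rule ignores `loads` entirely, which is what "oblivious to congestion" means in practice.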

Inter-job interference experiments. System: Cab, a 1296-node InfiniBand-based fat-tree cluster at LLNL. Benchmarks: bisection bandwidth, nearest neighbors, random pairs, FFT proxy. Methodology: benchmarks spend 70-75% of their time in computation; each job ran (1) in isolation and (2) with competition.

Interference results: Performance degrades under competition (with respect to isolated performance). Median degradation is usually 30-50% for applications sensitive to contention. Degradation varies significantly across different placements of a given job. Why?
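The quantities on this slide can be computed as sketched below, assuming degradation is defined as the relative increase in runtime over the isolated run (the slides do not state the formula). The runtimes are made-up values for illustration, not measurements from Cab.

```python
from statistics import median


def degradation_pct(isolated_s: float, contended_s: float) -> float:
    """Percent slowdown of a contended run relative to the isolated run."""
    return 100.0 * (contended_s - isolated_s) / isolated_s


# Hypothetical runtimes (seconds): one isolated run, five contended
# runs of the same job under different node placements.
isolated = 100.0
contended_runs = [110.0, 130.0, 145.0, 150.0, 210.0]

per_placement = [degradation_pct(isolated, t) for t in contended_runs]
print(per_placement)          # [10.0, 30.0, 45.0, 50.0, 110.0]
print(median(per_placement))  # 45.0
```

Reporting the median alongside the spread matters here because, as the slide notes, a single placement can be far worse than the typical case.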

Varying distribution of traffic across Cab [figure: system link loads for three placements]. System link loads I: minor slowdown. System link loads II: moderate slowdown. System link loads III: significant slowdown.

Correlating performance to traffic: Per-process performance correlates to the amount of interfering traffic on links. Degradation increases starting at 60 GB of traffic on a link (3.9 GB/s on average, 78% of the advertised maximum).
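As a quick consistency check on these figures, assuming the "78%" refers to the 3.9 GB/s average, the implied advertised per-link maximum is:

```latex
\[
B_{\max} = \frac{3.9\ \text{GB/s}}{0.78} = 5\ \text{GB/s}
\]
```

If the 60 GB threshold and the 3.9 GB/s average describe the same measurement interval, which the slide does not state, that interval would be roughly $60 / 3.9 \approx 15.4$ seconds.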

Correlating performance to traffic: Since most links are below 60 GB, can we reduce traffic on the others to improve performance? In the "System link loads III" placement, only a few links are too heavily loaded.

AFAR: Adaptive Flow-Aware Routing. Idea: periodically re-route to alleviate hotspots. Given the traffic for each pair of nodes in the system and the current routing:
(1) Calculate the current load on all links in the system.
(2) Find the link with maximum load.
(3) If the maximum is too high, re-route one flow crossing that link to a less utilized link.
(4) Repeat from (1), using the new routing.
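The loop above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the data layout (each flow has a list of candidate paths, each path a tuple of link names), the rule for picking which flow to move (the single move that gives the lowest resulting peak load), and all names are assumptions made for the example.

```python
from collections import defaultdict


def link_loads(traffic, paths, routing):
    """Step 1: sum each flow's traffic over the links of its current path."""
    loads = defaultdict(float)
    for flow, volume in traffic.items():
        for link in paths[flow][routing[flow]]:
            loads[link] += volume
    return loads


def afar_sketch(traffic, paths, routing, threshold, max_iters=100):
    """Greedy re-routing loop; returns a new routing and leaves the input intact."""
    routing = dict(routing)
    for _ in range(max_iters):
        loads = link_loads(traffic, paths, routing)
        if not loads:
            break
        # Step 2: find the most heavily loaded link.
        hot_link, hot_load = max(loads.items(), key=lambda kv: kv[1])
        if hot_load <= threshold:
            break  # no link is "too high"
        # Step 3 (assumed selection rule): among flows crossing the hot link,
        # try each alternative path and keep the move with the lowest peak.
        best = None  # (resulting peak load, flow, alternative path index)
        for flow in traffic:
            if hot_link not in paths[flow][routing[flow]]:
                continue
            for alt in range(len(paths[flow])):
                if alt == routing[flow]:
                    continue
                trial = {**routing, flow: alt}
                peak = max(link_loads(traffic, paths, trial).values())
                if peak < hot_load and (best is None or peak < best[0]):
                    best = (peak, flow, alt)
        if best is None:
            break  # no single move lowers the peak
        _, flow, alt = best
        routing[flow] = alt  # Step 4: repeat with the new routing
    return routing


# Two hypothetical flows that both start on up-link "up0".
traffic = {"A->C": 40.0, "B->D": 40.0}
paths = {"A->C": [("up0",), ("up1",)], "B->D": [("up0",), ("up1",)]}
initial = {"A->C": 0, "B->D": 0}

new_routing = afar_sketch(traffic, paths, initial, threshold=60.0)
print(new_routing)  # {'A->C': 1, 'B->D': 0}
print(dict(link_loads(traffic, paths, new_routing)))  # {'up1': 40.0, 'up0': 40.0}
```

Moving one flow per iteration keeps each change to the routing small, at the cost of recomputing link loads on every pass.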

AFAR example: Two jobs (blue and orange) are scheduled on the system. Node-to-node traffic is known. The current routing tables are used to calculate which links carry which flows. Using all flows, find the link with maximum load (circled in the figure).

AFAR example: Choose one of the flows on that link and re-route it to a less utilized link.

AFAR example: Calculate the new flows and repeat. In this example, all links now have at most one flow.

AFAR prototype in OpenSM: OpenSM is an open-source InfiniBand subnet manager that handles computing and distributing routing tables. Our prototype uses the OpenSM file routing engine*; AFAR provides routing tables to OpenSM using a shared file and a signal. * Since publication, we have implemented the algorithm directly in OpenSM.
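The hand-off pattern described here (write a file, then signal the consumer) can be sketched as follows. This is an illustration of the general pattern only: the slides do not name the signal or show the file format, so `SIGHUP`, the file name, and the placeholder contents are assumptions, and the helper names are hypothetical.

```python
import os
import signal
import tempfile


def publish_tables(table_text: str, path: str) -> None:
    """Write the file atomically so a reader never sees a half-written table."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        f.write(table_text)
    os.replace(tmp_path, path)  # atomic rename within one filesystem


def notify(pid: int, sig: int = signal.SIGHUP) -> None:
    """Ask the consumer process to re-read the file (signal choice is assumed)."""
    os.kill(pid, sig)


# Demonstration in a scratch directory; no signal is sent here.
with tempfile.TemporaryDirectory() as scratch:
    target = os.path.join(scratch, "routing_tables.txt")  # hypothetical name
    publish_tables("placeholder table contents\n", target)
    with open(target) as f:
        written = f.read()
print(written)
```

Writing to a temporary file and renaming it avoids the consumer reading a partial table if the signal arrives while the producer is still writing.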

Tests on Cab. Methodology: (1) For each workload, run AFAR offline to generate new routing tables (in practice, it reaches the threshold in 10-20 iterations). (2) Run the workload with default fat-tree routing (baseline) and with AFAR routing. (3) Evaluate per-job performance with each routing.

Results of applying AFAR to our workloads: AFAR achieves significant improvement when degradation is the worst (maximum improvement: 46%) and good median improvement across all experiments. Median runtime improvement across all experiments: 25% for bisection, 13% for nearest-neighbors. Results are normalized to isolated performance.

Related Work. Routing to improve system performance: Scheduling-Aware Routing [Domke 16] performs system-wide re-routing but is not flow-aware. SDN in InfiniBand fat-tree networks [Lee 16] requires an extension to current InfiniBand networks, and its results are simulated. Mellanox adaptive routing on fat-tree shows positive preliminary results [Vazhkudai 18]; we have not yet evaluated AFAR in comparison. Performance variability on dragonfly: various sources of variability on dragonfly [Chunduri 17]; variability of communication benchmarks on dragonfly [Groves 17].

Conclusion: Network interference on current systems can cause 2x performance degradation. AFAR targets network hotspots at runtime to significantly reduce interference. In tests on a fat-tree system, our prototype achieves up to 46% runtime improvement and 13%-25% median improvement for different job types.

Acknowledgements and contact information: Thanks to Livermore Computing at LLNL for making our experiments possible. Questions? Contact email: smiths949@cs.arizona.edu. Code repository: https://bitbucket.org/stacismith/sc18-adaptive-flow-aware-routing

This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344 (LLNL-PRES-761428). This document was prepared as an account of work sponsored by an agency of the United States government. Neither the United States government nor Lawrence Livermore National Security, LLC, nor any of their employees makes any warranty, expressed or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States government or Lawrence Livermore National Security, LLC. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States government or Lawrence Livermore National Security, LLC, and shall not be used for advertising or product endorsement purposes.