Discussion: How to Address Tools Scalability
Allen D. Malony
Department of Computer and Information Science
TAU Performance Research Laboratory
University of Oregon

Scale and Scaling
- What is meant by scale?
  - Processors: execution concurrency / parallelism
  - Memory: memory behavior, problem size
  - Network: concurrent communications
  - File system: parallel file operations / data size
- Scaling in the physical size / concurrency of the system
- What else?
  - Program: code size / interacting modules
  - Power: electrical power consumption
  - Performance: potential computational power
  - Dimension: Terascale … Petascale … and beyond

Tools Scalability
- Types of tools
  - Performance: analytical / simulation / empirical
  - Debugging: detect / correct concurrency errors
  - Programming: parallel languages / computation
  - Compiling: parallel code / libraries
  - Scheduling: system allocation and launching
- What does it mean for a tool to be scalable?
  - Tool dependent (different problems and scaling aspects)
- What changes about the tool?
  - Naturally scalable vs. change in function / operation
  - Is a paradigm shift required?
- To what extent is portability important?
- What tools would you say are scalable? How? Why?

Focus – Parallel Performance Tools/Technology
- Tools for performance problem solving
  - Empirical-based performance optimization process
  - Performance technology concerns
- [Diagram: the empirical optimization cycle of performance observation, experimentation, diagnosis, and tuning, linked by characterization, properties, and hypotheses; supported by performance technology for instrumentation, measurement, analysis, and visualization, plus experiment management and performance data storage]

Large-Scale Performance Problem Solving
- How does our view of this process change when we consider very large-scale parallel systems?
- What are the significant issues that will affect the technology used to support the process?
  - Parallel performance observation is required
  - In general, there is concern about intrusion
    - seen as a tradeoff with performance diagnosis accuracy
  - Scaling complicates observation and analysis
  - The nature of application development may change
- What will enhance productive application development?
- Paradigm shift in performance process and technology?

Instrumentation and Scaling
- Make events visible to the measurement system
- Direct instrumentation (code instrumentation, see the sketch below)
  - Static instrumentation modifies the code prior to execution
    - does not get removed (will always be executed)
    - source instrumentation may alter optimization
  - Dynamic instrumentation modifies the code at runtime
    - can be inserted and deleted at runtime
    - incurs runtime cost
- Indirect instrumentation generates events outside of the code
- Does scale affect the number of events?
- Runtime instrumentation is more difficult with scale
  - affected by increased parallelism
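As a concrete illustration of direct, static (source-level) instrumentation, the sketch below inserts a scoped timer into application code before compilation; once inserted it is always executed and always costs something, which is exactly the scaling concern above. The ScopedTimer class and its reporting are illustrative assumptions, not any particular tool's API.

```cpp
// Minimal sketch of static source instrumentation: a timer object is
// placed in the code before compilation and accumulates per-event time.
#include <chrono>
#include <cstdio>
#include <map>
#include <string>
#include <utility>

static std::map<std::string, double> g_event_time;   // per-event totals (seconds)

class ScopedTimer {                                    // hypothetical, not a real tool's API
 public:
  explicit ScopedTimer(std::string name)
      : name_(std::move(name)), start_(std::chrono::steady_clock::now()) {}
  ~ScopedTimer() {
    auto end = std::chrono::steady_clock::now();
    g_event_time[name_] += std::chrono::duration<double>(end - start_).count();
  }
 private:
  std::string name_;
  std::chrono::steady_clock::time_point start_;
};

void solve_step() {
  ScopedTimer t("solve_step");   // instrumentation point: always executed
  // ... application work ...
}

int main() {
  for (int i = 0; i < 100; ++i) solve_step();
  for (const auto& [name, secs] : g_event_time)
    std::printf("%s: %.6f s\n", name.c_str(), secs);
}
```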

Measurement and Scaling
- What makes performance measurement not scalable?
  - More parallelism means more performance data overall
    - performance data specific to each thread of execution
    - possible increase in the number of interactions between threads
  - Harder to manage the data (memory, transfer, storage)
  - Issues of performance intrusion
- Performance data size (see the back-of-envelope sketch below)
  - number of events generated x metrics per event
- Are there really more events? Which are important?
  - control the number of events generated
  - control what is measured (to a point)
- Need for performance data versus the cost of obtaining it
- Portability!
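To make the data-size concern concrete, here is a back-of-envelope sketch. Every number in it (thread count, event count, metric count, trace record rate and size) is an assumed, illustrative value, not a measurement; it simply shows why per-thread profiles stay manageable while full traces grow to terabytes at 16k threads.

```cpp
// Rough data-volume estimate: profiles grow with events x metrics x threads,
// traces grow with the number of event *instances* per thread over time.
#include <cstdio>

int main() {
  const double threads        = 16384;   // e.g., 16k threads of execution (assumed)
  const double profile_events = 500;     // instrumented routines / regions (assumed)
  const double metrics        = 4;       // time + 3 hardware counters (assumed)
  const double bytes_per_val  = 8;

  const double profile_bytes = threads * profile_events * metrics * bytes_per_val;

  const double trace_rate   = 1.0e4;     // event records per thread per second (assumed)
  const double record_bytes = 24;        // timestamp + ids + metric (assumed)
  const double run_seconds  = 600;
  const double trace_bytes  = threads * trace_rate * record_bytes * run_seconds;

  std::printf("profile: %.2f MB   trace: %.2f TB\n",
              profile_bytes / 1e6, trace_bytes / 1e12);
}
```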

Measurement and Scaling (continued)
- Consider "traditional" measurement methods
  - Profiling: summary statistics calculated during execution
  - Tracing: time-stamped sequence of execution events
  - Statistical sampling: indirect triggers, PC + metrics
  - Monitoring: access to performance data at runtime
- How does the performance data grow?
  - How does the per-thread profile / trace size grow?
  - Consider communication
- Strategies for scaling
  - Control performance data production and volume
  - Change in measurement type or approach
  - Event and/or measurement control
    - filtering, throttling, and sampling (see the throttling sketch below)
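One common form of event control is throttling: once an event has fired very many times with a tiny per-call cost, stop measuring it, since it adds overhead but little information. The sketch below shows the idea; the names and thresholds are illustrative assumptions, not any specific tool's defaults.

```cpp
// Throttling sketch: disable measurement of events that are frequent and cheap.
#include <cstdint>
#include <cstdio>
#include <string>
#include <unordered_map>

struct EventState {
  std::uint64_t calls = 0;
  double        total_usec = 0.0;
  bool          throttled = false;
};

static std::unordered_map<std::string, EventState> g_events;

constexpr std::uint64_t kCallLimit    = 100000;  // assumed throttle threshold
constexpr double        kMinPerCallUs = 10.0;    // assumed "too cheap to keep measuring"

bool should_measure(const std::string& name) {
  EventState& e = g_events[name];
  if (e.throttled) return false;
  if (e.calls > kCallLimit && (e.total_usec / e.calls) < kMinPerCallUs)
    e.throttled = true;                          // event disabled from here on
  return !e.throttled;
}

void record(const std::string& name, double usec) {
  EventState& e = g_events[name];
  ++e.calls;
  e.total_usec += usec;
}

int main() {
  for (int i = 0; i < 200000; ++i)
    if (should_measure("tiny_kernel"))
      record("tiny_kernel", 2.0);                // pretend each call took 2 usec
  std::printf("tiny_kernel throttled: %s\n",
              g_events["tiny_kernel"].throttled ? "yes" : "no");
}
```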

Concern for Performance Measurement Intrusion
- Performance measurement can affect the execution
  - Perturbation of "actual" performance behavior
  - Minor intrusion can lead to major execution effects
  - Problems exist even with a small degree of parallelism
- Intrusion is an accepted consequence of standard practice
  - Consider the intrusion (perturbation) caused by trace buffer overflow
- Scale exacerbates the problem … or does it?
  - Traditional measurement techniques tend to be localized
  - Suggests scale may not compound local intrusion globally
  - Measuring parallel interactions likely will be affected
- Use accepted measurement techniques intelligently (see the overhead sketch below)
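A simple way to reason about intrusion is to quantify the probe cost itself: time many calls of an empty function with and without a per-call timing probe, attribute the difference to measurement overhead, and relate it to the event counts a real run produces. The sketch below is a minimal, assumed version of that experiment, not a prescribed methodology.

```cpp
// Estimate per-event measurement overhead by differencing probed and
// unprobed runs of an empty function.
#include <chrono>
#include <cstdio>

volatile int sink = 0;
void empty() { sink = sink + 1; }        // stand-in for an instrumented no-op

double time_calls(long n, bool probed) {
  using clock = std::chrono::steady_clock;
  auto start = clock::now();
  for (long i = 0; i < n; ++i) {
    if (probed) {
      auto t0 = clock::now();            // the "probe": one extra timestamp pair
      empty();
      auto t1 = clock::now();
      sink = sink + static_cast<int>((t1 - t0).count() & 1);
    } else {
      empty();
    }
  }
  return std::chrono::duration<double>(clock::now() - start).count();
}

int main() {
  const long n = 10'000'000;
  double base   = time_calls(n, false);
  double probed = time_calls(n, true);
  std::printf("per-event overhead ~ %.1f ns\n", (probed - base) / n * 1e9);
}
```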

Analysis and Visualization Scalability
- How to understand all the performance data collected?
- Objectives
  - Meaningful performance results in meaningful forms
  - Want tools to be reasonably fast and responsive
  - Integrated, interoperable, portable, …
- What does "scalability" mean here?
  - Performance data size
    - large data size should not impact analysis tool use
    - data complexity should not overwhelm interpretation
  - Results presentation should be understandable
  - Tool integration and usability

Analysis and Visualization Scalability (continued)
- Online analysis and visualization
  - Potential interference with execution
  - Single-experiment analysis versus multiple experiments
- Strategies
  - Statistical analysis
    - data dimension reduction, clustering, correlation, … (see the k-means sketch below)
  - Scalable and semantic presentation methods
    - statistical, 3D
    - relate metrics to the physical domain
  - Parallelization of analysis algorithms (e.g., trace analysis)
  - Increase system resources for analysis / visualization tools
  - Integration with performance modeling
  - Integration with the parallel programming environment
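Clustering is one of the dimension-reduction strategies listed above. The sketch below applies plain k-means to per-thread profile vectors so that thousands of threads collapse into a few behavioral groups whose representatives can be inspected in detail. It is a minimal illustration under stated assumptions, not the algorithm any particular tool ships.

```cpp
// k-means over per-thread profiles: each thread is a vector of per-event
// times; clustering groups threads with similar performance behavior.
// Assumes threads.size() >= k and all profiles have the same length.
#include <cstddef>
#include <vector>

using Profile = std::vector<double>;   // one value per instrumented event

static double dist2(const Profile& a, const Profile& b) {
  double d = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) d += (a[i] - b[i]) * (a[i] - b[i]);
  return d;
}

std::vector<int> kmeans(const std::vector<Profile>& threads, int k, int iters) {
  std::vector<Profile> centers(threads.begin(), threads.begin() + k);  // naive seeding
  std::vector<int> label(threads.size(), 0);
  for (int it = 0; it < iters; ++it) {
    // Assignment step: each thread joins its nearest cluster center.
    for (std::size_t t = 0; t < threads.size(); ++t) {
      double best = dist2(threads[t], centers[0]);
      label[t] = 0;
      for (int c = 1; c < k; ++c) {
        double d = dist2(threads[t], centers[c]);
        if (d < best) { best = d; label[t] = c; }
      }
    }
    // Update step: recompute each center as the mean of its members.
    std::vector<Profile> sum(k, Profile(threads[0].size(), 0.0));
    std::vector<int> count(k, 0);
    for (std::size_t t = 0; t < threads.size(); ++t) {
      for (std::size_t i = 0; i < threads[t].size(); ++i)
        sum[label[t]][i] += threads[t][i];
      ++count[label[t]];
    }
    for (int c = 0; c < k; ++c) {
      if (count[c] == 0) continue;            // keep the old center if a cluster empties
      for (double& v : sum[c]) v /= count[c];
      centers[c] = sum[c];
    }
  }
  return label;                               // cluster id per thread
}
```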

Role of Intelligence and Specificity
- How to make the process more effective (productive)?
- Scale forces performance observation to be intelligent
  - Standard approaches deliver a lot of data with little value
  - What are the important performance events and data?
    - tied to application structure and computational mode
  - Tools have poor support for application-specific aspects
- The process and tools can be more application-aware
  - Will allow scalability issues to be addressed in context
  - More control and precision of performance observation
  - More guided performance experimentation / exploration
  - Better integration with application development

Role of Automation and Knowledge Discovery
- Even with intelligent and application-specific tools, the decisions of what to analyze may become intractable
- Scale forces the process to become more automated
  - Performance extrapolation must be part of the process
  - Build autonomic capabilities into the tools
  - Support broader experimentation methods and refinement
  - Access and correlate data from several sources
  - Automate performance data analysis / mining / learning
  - Include predictive features and experiment refinement
  - Knowledge-driven adaptation and optimization guidance
- Address scale issues through increased expertise

ParaProf – Histogram View (Miranda)
[Screenshots: histogram views for 8k-processor and 16k-processor runs]

ParaProf – 3D Full Profile (Miranda)
[Screenshot: 3D full-profile view of a 16k-processor run]

ParaProf – 3D Scatterplot (Miranda)
- Each point is a "thread" of execution
- A total of four metrics shown in relation
- ParaVis 3D profile visualization library
  - JOGL

Hierarchical and K-means Clustering (sPPM)
[Screenshots: hierarchical and k-means clustering results for the sPPM benchmark]

Vampir Next Generation (VNG) Architecture
- Classic analysis: monolithic, sequential
- VNG: a parallel analysis server (one master, workers 1..m) reads the merged traces (Trace 1..N) from the file system; a visualization client connects over the internet; the monitor system collects event streams from the running parallel program (a simplified sketch follows)
- [Screenshot: timeline display of a 768-process run with 16 visible traces, segment indicator and thumbnail, showing process activity, parallel I/O, and message passing]
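VNG addresses trace-analysis scalability by moving the analysis itself into a parallel server. Purely as an illustration of that master/worker idea, the sketch below does the same split in shared memory with threads: each worker summarizes a slice of the merged trace and the partial results are combined. The record layout and the summary computed here are assumptions for illustration, not VNG's actual protocol or data model.

```cpp
// Shared-memory stand-in for a master/worker trace-analysis server.
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

struct TraceRecord { std::uint64_t timestamp; int event_id; double value; };

int main() {
  std::vector<TraceRecord> trace;        // merged trace, loaded elsewhere
  const unsigned num_workers =
      std::max(1u, std::thread::hardware_concurrency());

  std::map<int, double> totals;          // event_id -> accumulated value
  std::mutex totals_mutex;

  const std::size_t chunk = (trace.size() + num_workers - 1) / num_workers;
  std::vector<std::thread> workers;
  for (unsigned w = 0; w < num_workers; ++w) {
    workers.emplace_back([&, w] {
      std::map<int, double> local;       // worker-local partial summary
      const std::size_t begin = w * chunk;
      const std::size_t end   = std::min(trace.size(), begin + chunk);
      for (std::size_t i = begin; i < end; ++i)
        local[trace[i].event_id] += trace[i].value;
      std::lock_guard<std::mutex> lock(totals_mutex);
      for (const auto& [id, v] : local) totals[id] += v;   // merge step
    });
  }
  for (auto& t : workers) t.join();

  for (const auto& [id, v] : totals)
    std::printf("event %d: total %.3f\n", id, v);
}
```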