Performance Engineering Technology for Complex Scientific Component Software
Allen D. Malony, Sameer Shende
Department of Computer and Information Science, Computational Science Institute, University of Oregon

Argonne CCA Meeting, June 24, 2002

Outline
- Complexity and performance technology
- Developing performance interfaces for CCA
- Performance knowledge repository
- Performance observation
- TAU performance system
- Applications
- Implementation
- Concluding remarks

Problem Statement
- How do we create robust and ubiquitous performance technology for the analysis and tuning of component software in the presence of (evolving) complexity challenges?
- How do we apply performance technology effectively for the variety and diversity of performance problems that arise in the context of CCA components?

Extended Component Design
- PKC: Performance Knowledge Component
- POC: Performance Observability Component
(figure: a generic component extended with PKC and POC)

Performance Knowledge
- Describe and store a component's "known" performance
  - benchmark characterizations in a performance database
  - empirical or analytical performance models
  - saved information about component performance
- Use for performance-guided selection and deployment
- Use for runtime adaptation
- Representation must be in common forms, with standard means for accessing the performance information
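To make the idea concrete, here is a minimal sketch of a performance-knowledge record holding benchmark points and an analytical model, used for performance-guided selection among component variants. The names (`PerfKnowledge`, `predict`, `select_variant`) are illustrative assumptions, not part of any CCA specification.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PerfKnowledge:
    """Hypothetical performance-knowledge record for one component variant."""
    component: str
    benchmarks: Dict[int, float]        # measured {problem size: seconds}
    model: Callable[[int], float]       # analytical model: size -> seconds

    def predict(self, n: int) -> float:
        # Prefer a stored benchmark point; fall back to the analytical model.
        return self.benchmarks.get(n, self.model(n))


def select_variant(variants: List[PerfKnowledge], n: int) -> PerfKnowledge:
    """Performance-guided selection: pick the variant predicted fastest."""
    return min(variants, key=lambda k: k.predict(n))


solver_a = PerfKnowledge("SolverA", {1000: 0.8}, lambda n: 1e-6 * n)
solver_b = PerfKnowledge("SolverB", {1000: 0.5}, lambda n: 5e-7 * n)
best = select_variant([solver_a, solver_b], 1000)   # picks SolverB here
```

The same record could drive runtime adaptation by re-invoking `select_variant` as observed problem sizes change.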

Performance Knowledge Repository and Component
- Component performance repository
  - implemented in the component architecture framework
  - similar to the CCA component repository [Alexandria]
  - accessed by the component infrastructure
- View performance knowledge as a component (PKC)
  - PKC ports give access to performance knowledge
    - to other components
    - back to the original component
- Static/dynamic component control and composition
- Component composition performance knowledge

Performance Observation
- The ability to observe execution performance is important
  - empirically derived performance knowledge does not require measurement integration in the component
  - monitoring during execution to make dynamic decisions
- Measurement integration is key
- Performance observation integration
  - component integration: core and variant
  - runtime measurement and data collection
  - on-line and off-line performance analysis

Performance Observation Component (POC)
- Performance observation in a performance-engineered component model
- Functional extension of the original component design
  - includes new component methods and ports for other components to access measured performance data
  - allows the original component to access its own performance data
- Encapsulated as a tightly-coupled, co-resident performance observation object
- The POC "provides" port allows use of optimized interfaces to access "internal" performance observations
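A minimal sketch of the POC idea, assuming hypothetical names: a tightly-coupled observation object records per-method timings, and its `get_profile` method stands in for the "provides" port through which other components query measured data. This is illustrative, not a CCA or TAU API.

```python
import time


class PerfObserver:
    """Hypothetical co-resident performance observation object."""

    def __init__(self):
        self.events = {}  # method name -> (call count, total seconds)

    def record(self, name, seconds):
        count, total = self.events.get(name, (0, 0.0))
        self.events[name] = (count + 1, total + seconds)

    def get_profile(self):
        # The performance-query "provides" port: expose measured data.
        return dict(self.events)


class ObservedComponent:
    """A component functionally extended with performance observation."""

    def __init__(self):
        self.poc = PerfObserver()   # tightly coupled, co-resident

    def solve(self, n):
        t0 = time.perf_counter()
        result = sum(i * i for i in range(n))   # stand-in for real work
        self.poc.record("solve", time.perf_counter() - t0)
        return result


comp = ObservedComponent()
comp.solve(10)
comp.solve(10)
profile = comp.poc.get_profile()    # {"solve": (2, <elapsed seconds>)}
```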

Component Composition Performance Engineering
- Performance of component-based scientific applications depends on the interplay of component functions and the computational resources available
- Management of component compositions throughout execution is critical to successful deployment and use
- Identify key technological capabilities needed to support the performance engineering of component compositions
- Two model concepts:
  - performance awareness
  - performance attention

Performance Awareness of Component Ensembles
- Composition performance knowledge and observation
- Composition performance knowledge
  - can come from empirical and analytical evaluation
  - can utilize information provided at the component level
  - can be stored in repositories for future review
- Extends the notion of component observation to ensemble-level performance monitoring
  - associate monitoring components with hierarchical component groupings
  - build upon component-level observation support
  - monitoring components act as performance integrators and routers
  - use component framework mechanisms
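The "monitoring components as performance integrators" idea can be sketched as a monitor that aggregates profiles from a hierarchical grouping of components. `Monitor`, `Leaf`, and `profile` are hypothetical names for illustration only.

```python
class Monitor:
    """Hypothetical monitoring component: integrates child profiles."""

    def __init__(self, children):
        self.children = children    # leaf components or nested Monitors

    def profile(self):
        # Route and merge per-event times up the component hierarchy.
        merged = {}
        for child in self.children:
            for name, secs in child.profile().items():
                merged[name] = merged.get(name, 0.0) + secs
        return merged


class Leaf:
    """Stand-in for a component with component-level observation support."""

    def __init__(self, data):
        self.data = data            # event name -> seconds

    def profile(self):
        return self.data


group = Monitor([
    Leaf({"solve": 1.0}),
    Monitor([Leaf({"solve": 0.5, "io": 0.2})]),   # nested grouping
])
totals = group.profile()   # ensemble-wide view: solve=1.5, io=0.2
```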

Performance-Engineered Component
- Four parts:
  - performance knowledge: characterization, model
  - performance observation: measurement, analysis
  - performance query
  - performance control
- Extend component design for performance engineering
- Keep consistent with the CCA model

TAU Performance System Framework
- Tuning and Analysis Utilities
- Performance system framework for scalable parallel and distributed high-performance computing
- Targets a general complex system computation model
  - nodes / contexts / threads
  - multi-level: system / software / parallelism
  - measurement and analysis abstraction
- Integrated toolkit for performance instrumentation, measurement, analysis, and visualization
- Portable, configurable performance profiling/tracing facility
- Open software approach
- University of Oregon, LANL, FZJ Germany

General Complex System Computation Model
- Node: physically distinct shared-memory machine
  - message-passing node interconnection network
- Context: distinct virtual memory space within a node
- Thread: execution threads (user/system) within a context
(figure: physical view vs. model view — nodes holding SMP threads and per-context VM spaces, connected by an inter-node message-passing interconnection network)
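The node/context/thread model above can be sketched as a profile store keyed by the (node, context, thread) triple, with roll-ups across the hierarchy. This is an illustrative sketch of the data model, not the TAU implementation; `ProfileStore` and its methods are assumed names.

```python
class ProfileStore:
    """Hypothetical profile store keyed by (node, context, thread)."""

    def __init__(self):
        self.data = {}  # (node, context, thread) -> {event: seconds}

    def add(self, node, context, thread, event, seconds):
        events = self.data.setdefault((node, context, thread), {})
        events[event] = events.get(event, 0.0) + seconds

    def per_node(self, node):
        # Roll up all contexts and threads on one node.
        return sum(
            sum(events.values())
            for (n, _, _), events in self.data.items()
            if n == node
        )


store = ProfileStore()
store.add(0, 0, 0, "MPI_Send", 0.2)   # node 0, context 0, thread 0
store.add(0, 0, 1, "compute", 0.8)    # second thread, same context
store.add(1, 0, 0, "compute", 0.9)    # different node
```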

TAU Performance System Architecture
(figure: architecture diagram, including the EPILOG and Paraver trace formats)

TAU Status
- Instrumentation supported:
  - source, preprocessor, compiler, MPI, runtime, virtual machine
- Languages supported:
  - C++, C, F90, Java, Python
  - HPF, ZPL, HPC++, pC++, ...
- Packages supported:
  - PAPI [UTK], PCL [FZJ] (hardware performance counter access)
  - Opari, PDT [UO, LANL, FZJ], DyninstAPI [U. Maryland] (instrumentation)
  - EXPERT, EPILOG [FZJ], Vampir [Pallas], Paraver [CEPBA] (visualization)
- Platforms supported:
  - IBM SP, SGI Origin, Sun, HP Superdome, Compaq ES
  - Linux clusters (IA-32, IA-64, PowerPC, Alpha), Apple, Windows
  - Hitachi SR8000, NEC SX, Cray T3E, ...
- Compiler suites supported:
  - GNU, KAI (KCC, KAP/Pro), Intel, SGI, IBM, Compaq, HP, Fujitsu, Hitachi, Sun, Apple, Microsoft, NEC, Cray, PGI, Absoft, ...
- Thread libraries supported:
  - Pthreads, SGI sproc, OpenMP, Windows, Java, SMARTS

Program Database Toolkit
(figure: PDT architecture — C/C++ and Fortran 77/90 parsers emit IL, IL analyzers build program database files accessed through the DUCTAPE library; tools such as PDBhtml, SILOON, CHASM, and TAU_instr use it for program documentation, application component glue, C++/F90 interoperability, and automatic source instrumentation)

Program Database Toolkit (PDT)
- Program code analysis framework for developing source-based tools for C99, C++, and F90
- High-level interface to source code information
- Widely portable:
  - IBM, SGI, Compaq, HP, Sun, Linux clusters, Windows, Apple, Hitachi, Cray T3E, ...
- Integrated toolkit for source code parsing, database creation, and database query
  - commercial-grade front-end parsers (EDG for C99/C++, Mutek for F90)
  - Intel/KAI C++ headers for the standard C++ library distributed with PDT
  - portable IL analyzer, database format, and access API
  - open software approach for tool development
- Targets and integrates multiple source languages
- Used in CCA for automated generation of SIDL
- Used in TAU to build automated performance instrumentation tools (tau_instrumentor)
- Can be used to generate code for performance ports in CCA
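PDT itself parses C/C++/F90; as a language-neutral illustration of the same automatic source instrumentation idea, the sketch below uses Python's `ast` module to insert a probe call at the top of every function definition, analogous to what tau_instrumentor does for C++ source. The `record_entry` helper is a made-up stand-in for a measurement runtime.

```python
import ast

# Source to instrument (stand-in for an application source file).
SOURCE = """
def kernel(n):
    return sum(range(n))
"""


class Instrument(ast.NodeTransformer):
    """Insert a record_entry(<name>) probe as each function's first statement."""

    def visit_FunctionDef(self, node):
        probe = ast.parse(f"record_entry({node.name!r})").body[0]
        node.body.insert(0, probe)
        return ast.fix_missing_locations(node)


calls = []


def record_entry(name):
    # Minimal measurement runtime: just log function entries.
    calls.append(name)


tree = Instrument().visit(ast.parse(SOURCE))
ast.fix_missing_locations(tree)

# Compile and run the instrumented source with the runtime linked in.
namespace = {"record_entry": record_entry}
exec(compile(tree, "<instrumented>", "exec"), namespace)
result = namespace["kernel"](5)   # triggers the inserted probe, returns 10
```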

Performance Database Framework
(figure: raw performance data in an XML profile representation, with a PerfDML data description and PerfDML translators feeding a multiple-experiment performance database (PostgreSQL ORDB), queried by performance analysis programs and a performance analysis and query toolkit)
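A minimal sketch of the framework's front end: profile data carried as XML and translated into queryable records. The element and attribute names below are invented for illustration and do not reproduce the actual PerfDML schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML profile representation (schema invented for this sketch).
PROFILE_XML = """
<profile experiment="run1">
  <event name="main" calls="1" exclusive="0.4" inclusive="2.0"/>
  <event name="solve" calls="10" exclusive="1.6" inclusive="1.6"/>
</profile>
"""


def load_profile(text):
    """Translate an XML profile into records ready for database insertion."""
    root = ET.fromstring(text)
    return [
        {
            "name": e.get("name"),
            "calls": int(e.get("calls")),
            "exclusive": float(e.get("exclusive")),
            "inclusive": float(e.get("inclusive")),
        }
        for e in root.iter("event")
    ]


rows = load_profile(PROFILE_XML)   # two records: main, solve
```

Records in this shape map directly onto rows of a relational table, which is the role the PostgreSQL ORDB plays in the figure.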

Integrated Performance Evaluation Environment
(figure)

Applications: VTF (ASCI ASAP, Caltech)
- C++, C, F90, Python
- PDT, MPI

Applications: SAMRAI (LLNL)
- C++
- PDT, MPI
- SAMRAI timers (groups)

Applications: Uintah (U. Utah ASCI Level 1 Center)
- C++
- Mapping performance data; EXPARE experiment system
- MPI, sproc

Applications: Uintah (U. Utah)
- TAU uses SCIRun [U. Utah] for visualization of performance data (online/offline)

Applications: Uintah (contd.)
- Scalability analysis

Implementation
- We need the CCA Forum to help:
  - standardize the component performance knowledge repository specification to facilitate sharing
  - define protocols for accessing performance data
  - define the interface for performance ports
  - support this effort
- Prototype implementation using TAU
- Identify target CCA projects

Concluding Remarks
- Complex component systems pose challenging performance analysis problems that require robust methodologies and tools
- New performance problems will arise
  - instrumentation and measurement
  - data analysis and presentation
  - diagnosis and tuning
- Performance-engineered components
  - performance knowledge, observation, query, and control

References
- A. Malony and S. Shende, "Performance Technology for Complex Parallel and Distributed Systems," Proc. 3rd Workshop on Parallel and Distributed Systems (DAPSYS), Aug.
- S. Shende, A. Malony, and R. Ansell-Bell, "Instrumentation and Measurement Strategies for Flexible and Portable Empirical Performance Evaluation," Proc. Int'l. Conf. on Parallel and Distributed Processing Techniques and Applications (PDPTA), CSREA, July.
- S. Shende, "The Role of Instrumentation and Mapping in Performance Measurement," Ph.D. Dissertation, Univ. of Oregon, Aug.
- J. de St. Germain, A. Morris, S. Parker, A. Malony, and S. Shende, "Integrating Performance Analysis in the Uintah Software Development Cycle," ISHPC 2002, Nara, Japan, May 2002.

Support Acknowledgement
- TAU and PDT support:
  - Department of Energy (DOE)
    - DOE 2000 ACTS contract
    - DOE MICS contract
    - DOE ASCI Level 3 (LANL, LLNL)
    - U. of Utah DOE ASCI Level 1 subcontract
  - DARPA
  - NSF National Young Investigator (NYI) award