Lawrence Livermore National Laboratory ROSE Compiler Project Computational Exascale Workshop December 2010 Dan Quinlan Chunhua Liao, Justin Too, Robb Matzke,

Slides:



Advertisements
Similar presentations
Conclusion Kenneth Moreland Sandia National Laboratories Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company,
Advertisements

XtreemOS IP project is funded by the European Commission under contract IST-FP XtreemOS: Building and Promoting a Linux-based Operating System.
ETCCDI approaches to development of the software
SCARF Duncan Tooke RAL HPCSG. Overview What is SCARF? Hardware & OS Management Software Users Future.
Lawrence Livermore National Laboratory SciDAC Reaction Theory Year-5-End plans LLNL-PRES Lawrence Livermore National Laboratory, P. O. Box 808,
Discussion of Infrastructure Clouds A NERSC Magellan Perspective Lavanya Ramakrishnan Lawrence Berkeley National Lab.
Lawrence Livermore National Laboratory Denise Sumikawa CIAC Program Leader LLNL-PRES Lawrence Livermore National Laboratory, P. O. Box 808, Livermore,
1 Slides presented by Hank Childs at the VACET/SDM workshop at the SDM Center All-Hands Meeting. November 26, 2007 Snoqualmie, Wa Work performed under.
Lawrence Livermore National Laboratory ROSE Slides August 2010 Dan Quinlan Center for Applied Scientific Computing Lawrence Livermore National Laboratory.
Matt Wolfe LC Development Environment Group Lawrence Livermore National Laboratory Lawrence Livermore National Laboratory, P. O. Box 808, Livermore, CA.
0 Arun Rodrigues, Scott Hemmert, Dave Resnick: Sandia National Lab (ABQ) Keren Bergman: Columbia University Bruce Jacob: U. Maryland John Shalf, Paul Hargrove:
1 Lawrence Livermore National Laboratory By Chunhua (Leo) Liao, Stephen Guzik, Dan Quinlan A node-level programming model framework for exascale computing*
University of Houston So What’s Exascale Again?. University of Houston The Architects Did Their Best… Scale of parallelism Multiple kinds of parallelism.
PyMercury: Interactive Python for the Mercury Particle Transport Code Forrest Iandola*, Matthew O’Brien, and Richard Procassini *University of Illinois,
Engineering Problem Solving With C++ An Object Based Approach Fundamental Concepts Chapter 1 Engineering Problem Solving.
Bronis R. de Supinski Center for Applied Scientific Computing Lawrence Livermore National Laboratory June 2, 2005 The Most Needed Feature(s) for OpenMP.
1 Engineering Problem Solving With C++ An Object Based Approach Fundamental Concepts Chapter 1 Engineering Problem Solving.
VisIt Software Engineering Infrastructure and Release Process LLNL-PRES Lawrence Livermore National Laboratory, P. O. Box 808, Livermore,
BERAC Charge A recognized strength of the Office of Science, and BER is no exception, is the development of tools and technologies that enable science.
1 Aug 7, 2004 GPU Req GPU Requirements for Large Scale Scientific Applications “Begin with the end in mind…” Dr. Mark Seager Asst DH for Advanced Technology.
Real-Time and Multimedia Systems Laboratory Carnegie Mellon System Integration Raj Rajkumar Professor, ECE and CS Director, Real-Time and Multimedia Systems.
1 Dan Quinlan, Markus Schordan, Qing Yi Center for Applied Scientific Computing Lawrence Livermore National Laboratory Semantic-Driven Parallelization.
LLNL-PRES This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
Planning for SATE V Paul E. Black National Institute of Standards and Technology
1 CASC ROSE Compiler Infrastructure Source-to-Source Analysis and Optimization Dan Quinlan Rich Vuduc, Qing Yi, Markus Schordan Center for Applied Scientific.
Dax: Rethinking Visualization Frameworks for Extreme-Scale Computing DOECGF 2011 April 28, 2011 Kenneth Moreland Sandia National Laboratories SAND P.
SAINT2002 Towards Next Generation January 31, 2002 Ly Sauer Sandia National Laboratories Sandia is a multiprogram laboratory operated by Sandia Corporation,
Lawrence Livermore National Laboratory Manycore Optimizations: A Compiler and Language Independent ManyCore Runtime System ROSE Team Center for Applied.
LLNL-PRES-XXXXXX This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
Tools and Utilities for parallel and serial codes in ENEA-GRID environment CRESCO Project: Salvatore Raia SubProject I.2 C.R. ENEA-Portici. 11/12/2007.
LLNL-PRES-XXXXXX This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
Lawrence Livermore National Laboratory A system for strong local account management. SLAM David Frye Lawrence Livermore National Laboratory, P. O. Box.
Computing careers in the real world Or “I have my degree, now what?”
Advantages Cloud Computing. customers only pay for the access and interfaces that they need. The customer buys only the services they need Cost Advantages.
Computational Science & Engineering Department CSE The Software Engineering Group 1 Software Engineering Tools for Fortran Software Development Chris Greenough.
After step 2, processors know who owns the data in their assumed partitions— now the assumed partition defines the rendezvous points Scalable Conceptual.
4.2.1 Programming Models Technology drivers – Node count, scale of parallelism within the node – Heterogeneity – Complex memory hierarchies – Failure rates.
Accelerating Scientific Exploration Using Workflow Automation Systems Terence Critchlow (LLNL) Ilkay Altintas (SDSC) Scott Klasky(ORNL) Mladen Vouk (NCSU)
Grid MP at ISIS Tom Griffin, ISIS Facility. Introduction About ISIS Why Grid MP? About Grid MP Examples The future.
Martin Schulz Center for Applied Scientific Computing Lawrence Livermore National Laboratory Lawrence Livermore National Laboratory, P. O. Box 808, Livermore,
A CASE Tool For Robot Behavior Development The KSE CASE Tool - Liveness Formula Editor, text editor ‐ Liveness2IAC transformation tool ‐ Graphical Statechart.
EG280 Computer Science for Engineers Fundamental Concepts Chapter 1.
SAS ‘05 Reducing Software Security Risk through an Integrated Approach David P. Gilliam, John D. Powell Jet Propulsion Laboratory, California Institute.
School of Computer Science & Information Technology G6DICP Introduction to Computer Programming Milena Radenkovic.
1 Optimizing compiler tools and building blocks project Alexander Drozdov, PhD Sergey Novikov, PhD.
1Managed by UT-Battelle for the U.S. Department of Energy Discussion questions from you 1.Do we care about exaflops machines? 2.How are we going to compare.
Lawrence Livermore National Laboratory Pianola: A script-based I/O benchmark Lawrence Livermore National Laboratory, P. O. Box 808, Livermore, CA
Livermore Solar Electrical Generation System. Background Procurement Timeframe Location Buffer Zone Land Restrictions Consortium Lawrence Berkeley National.
ASC Tri-Lab Code Development Tools Workshop Thursday, July 29, 2010 Lawrence Livermore National Laboratory, P. O. Box 808, Livermore, CA This work.
DIRECTION RECHERCHE & INNOVATION Centre de Recherche et Innovation Gaz et Energies Nouvelles Guillain Chapelon 12 september 2013.
Scott Kohn with Tammy Dahlgren, Tom Epperly, and Gary Kumfert Center for Applied Scientific Computing Lawrence Livermore National Laboratory October 2,
LLNL-PRES-?????? This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
Martin Schulz Center for Applied Scientific Computing Lawrence Livermore National Laboratory ASC STAT Team: Greg Lee, Dong Ahn (LLNL), Dane Gardner (LANL)
Lawrence Livermore National Laboratory Centralized Desktop Management at LLNL A Major Paradigm Shift CDM David Frye This work performed under the auspices.
LLNL-PRES-xxxxxx This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
Lawrence Livermore National Laboratory S&T Principal Directorate - Computation Directorate Tools and Scalable Application Preparation Project Computation.
Lawrence Livermore National Laboratory BRdeS-1 Science & Technology Principal Directorate - Computation Directorate How to Stop Worrying and Learn to Love.
Site Report DOECGF April 26, 2011 W. Alan Scott Sandia National Laboratories Sandia National Laboratories is a multi-program laboratory managed and operated.
LLNL-PRES This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344.
Bronis R. de Supinski and Jeffrey S. Vetter Center for Applied Scientific Computing August 15, 2000 Umpire: Making MPI Programs Safe.
Code Motion for MPI Performance Optimization The most common optimization in MPI applications is to post MPI communication earlier so that the communication.
LLNL-PRES-XXXXXX This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344.
Other Tools HPC Code Development Tools July 29, 2010 Sue Kelly Sandia is a multiprogram laboratory operated by Sandia Corporation, a.
SAMCAHNG Yun Goo Kim I. Formal Model Based Development & Safety Analysis II. UML (Model) Based Safety RMS S/W Development February KIM, YUN GOO.
National Aeronautics and Space Administration Jet Propulsion Laboratory March 17, 2009 Workflow Orchestration: Conducting Science Efficiently on the Grid.
Design and Planning Tools John Grosh Lawrence Livermore National Laboratory April 2016.
Engineering Problem Solving With C An Object Based Approach
Tools and Services Workshop
Transitioning VisIt to CMake
14th International IEEE eScience Conference
Presentation transcript:

Lawrence Livermore National Laboratory ROSE Compiler Project Computational Exascale Workshop December 2010 Dan Quinlan Chunhua Liao, Justin Too, Robb Matzke, Peter Pirkelbauer Center for Applied Scientific Computing Lawrence Livermore National Laboratory Lawrence Livermore National Laboratory, P. O. Box 808, Livermore, CA Operated by Lawrence Livermore National Security, LLC, or the U.S. Department of Energy, National Nuclear Security Administration under Contract DE-AC52-07NA27344

2 Science & Technology: Computation Directorate ROSE supports custom software analysis tools  Software analysis exposes behavioral properties of large scale software systems  ROSE is an open compiler infrastructure for custom domain-specific analysis Source code: C, C++, Fortran 2003, OpenMP, UPC, PHP Binary executables (Windows and Linux) (x86, Power-PC, ARM) Program analysis uses/generates different graphs Graphs for real software have millions of nodes Graphs are for use in automated analysis (not users) int aFunction(int a, int b) { int c=b; return a; } main() { int a,b,c,d,e; int i=4; for (i=0;i<10;i++) { int j=55; c=i+j; c=aFunction(i,c); a=aFunction(a+1,b); } #pragma SliceTarget a; return 0; } Data-dependency System-dependency Sliced-system-dependency Control-dependency Control-Flow Source Code or Binary Executable

3 ROSE is a software analysis and source-to-source transformation infrastructure for building tools Science & Technology: Computation Directorate Source Code or Binary Executable Transformed Source Code ROSE IR Analyses/ Transformation/ Optimizations System-dependency Sliced-system- dependency Control- Flow Control dependency Control flow Unparser ROSE Frontend ROSE-based tool

4 GOAL: Position ROSE as a Strategic Part of DOE Research  Supports the languages used within DOE (Fortran/C/C++)  The only compiler project within DOE for DOE apps  ROSE targets all of the real-world applications across all of DOE  Used by many national labs: ANL, LANL, LBL, ORNL, PNL, SNL, and LLNL  Used worldwide for external research projects  Supports many other research projects within DOE  Open source (top download on SciDAC web site)  BSD licensed  Fully automated testing and release process Science & Technology: Computation Directorate

5 Programming Model Research  OpenMP 3.0 compiler implemented using ROSE Autoparallelization using OpenMP  UPC source-to-source support within ROSE  Custom analysis and transformation MPI (code motion optimizations) Basis of DARPA PACE project at Rice Co-Array Fortran PGAS programming model (Rice) C++ optimization of array language abstractions E.g. Intel’s Threaded Building Blocks  Platform for imbedded domain specific language research Language extension/restrictions  Exascale specific uses: Analysis and transformation for active power control Introduction of custom instructions (Co-Design) Data layout transformations (Keasler, McGraw) Science & Technology: Computation Directorate

6 ROSE: Open Source OpenMP 3.0 compiler  OpenMP compiler built using ROSE compiler infrastructure  Research platform for multi-core and parallel programming model research  Unique open source platform for OpenMP research  Demonstrated competitive performance with commercial OpenMP compilers (as shown below on OpenMP benchmarks: ROSE-GOMP and ROSE-Omni) Platform: Dell Precision T5400, 3.16GHz quad-core Xeon X5460 dual processor, 8GB Benchmarks: NAS parallel benchmark suite v 2.3, Barcelona OpenMP task suite v 1.0 Compilers: ROSE, Omni 1.6, GCC 4.4.1, Mercurium compiler with Nanos runtime. Intel compiler speedup

7 ROSE: Automating Multi-core use for MPI applications  MPI applications have to use multi-core nodes in current and future architectures  Hybrid MPI and Multi-core programming models square the complexity of using either one separately  ROSE’s new Autopar tool automatically inserts multicore optimizations into existing MPI applications, automatically detects existing parallelism.  Ongoing work on large scale LLNL applications  Demonstrated performance improvements on application kernels (8x speedup)  Uniquely applicable to loops, standard C++ abstractions, and user-defined types Platform: Dell Precision T5400, 3.16GHz quad-core Xeon X5460 dual processor, 8GB Compiler: ROSE OpenMP translator + Omni 1.6 Runtime + GCC 4.1.2

8 ROSE: Autotuning Performance Optimization Research  ROSE has rich support for custom autotuning (plus autotuning documentation)  Specific focus on whole application autotuning (ongoing research with PERI)  Applicable to wide range of Multi-core architectures  Demonstrated superior performance using autotuning (4.14x speedup)  A platform for autotuning research on programming models (results below show the autotuning of parameters to the OpenMP run-time system to control the depth of nested parallelism) Platform: Xeon X GHz, 2x4=8 cores with HyperThreading (8x2=16 threads), 12GB Compiler: ROSE-OpenMP + libgomp of GCC (modified) Search space: threads [1,16,2], queue length multiplier: 1, 2, 4,.., 64 Best sequential (n=14): s Best performance: 8 threads, 2*16 = 32 queue len, 31.8 s ( 4.14x)

9 ROSE: Understanding Software for Exascale Co-Design  ROSE communicates subtle software design to architecture design groups  Software characterization for exascale software/architecture co-design ROSE measures amount of parallelism in DOE applications Classifies types of memory usage to inform hardware designers Many other custom requests from exascale design teams for co-design  Instrumentation of applications to assess potential hardware exascale features  ROSE will generate skeleton apps to drive exascale hardware simulators main() { int a,b,c,d,e; int i=4; for (i=0;i<10;i++) { int j=55; c=i+j; c=aFunction(i,c); a=aFunction(a+1,b); } a; return 0; } Data-dependency System-dependency Sliced-system-dependency Control-Flow Original Source Code main() { int a,b,c,d,e; int i=4; for (i=0;i<10;i++) { int j=55; c=i+j; c=aFunction(i,c); a=aFunction(a+1,b); } #pragma SliceTarget a; return 0; } Exascale Design Data for Architecture Team Modified (Instrumented) Source Code ROSE Compiler Analysis and Exascale Co-Design Transformations Dynamic and Static Analysis

10 Science & Technology: Computation Directorate We have developed numerous collaborations through the ROSE compiler toolset DOE Laboratories: LLNL (A Div, B Div, Cyber-Security) ANL, LANL, LBL, ORNL, PNNL DOE Research Programs: PERI (SLAC, Fortran/C/C++ Optimization, UT, ANL, ISI, LBL) Collaborations: DOD (Air Force) NIST(SAMATE) CERT (CMU) GrammaTech IBM (Program Verification Group) Absint (developers of PAG) London Imperial College Texas A&M Rice University Vienna University of Technology University of Tennessee Cornell University Indiana University University of California at Berkeley University of Bergen University of Maryland Friedrich-Alexander-University Erlangen-Nuremberg University of Texas at Austin UCSD UC Davis ROSE used for graduate courses at seven different universities (compiler construction to cyber security) ROSE being used to build products at five different companies (BSD license) ROSE released at Top Download on SciDAC web site… ROSE is winner of 2009 R&D100 award

11 Economics of compilers…  Vendor compiler teams serve their masters Compiler groups are subsidized (~75% from hardware) Hardware groups rule, but are often ignored… Marketing Department (competitive vendors)  Make the new chips look good  Direct competition against other vendors  There are few non-vendor associated compiler teams  The market for compilers is not really HPC…  A one off machine is not worth changing the compiler to support…  So the compiler you will get is the substantially the same vendor compiler that exists today (vendor compilers change slowly)  Free compilers are the norm for a lot of HPC work…(GNU) Science & Technology: Computation Directorate