1 ROSE Compiler Infrastructure: Source-to-Source Analysis and Optimization
Dan Quinlan
Rich Vuduc, Qing Yi, Markus Schordan
Center for Applied Scientific Computing (CASC), Lawrence Livermore National Laboratory
This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract W-7405-Eng-48.

2 Overview
- ROSE Compiler Infrastructure
- Research objectives
  - General optimization of existing applications
  - Optimization of high-level abstractions
    - Telescoping Languages
    - Plus/Minus Languages
    - Many other names for this
  - Empirical optimization
- Targets a non-compiler audience
- Emphasis on whole-program capabilities
- Open source (EDG part distributed as binary)
- Conclusions

3 Motivation for Compiler-Based Tools
Current status:
- DOE generates huge amounts of software.
- ROSE provides a mechanism to automatically read, analyze, and fully rewrite ASC-scale software in C and C++ (and eventually F90, we hope, as part of a collaboration with Rice).
The ROSE project's focus IS on optimization, but many other tools could be built. Simple tools can discover only superficial things about software; to really know what is going on in an application, you need a compiler infrastructure.

4 ROSE Source-to-Source Approach
Pipeline of a translator built using ROSE:
- C++ source code
- EDG C++ front-end (EDG AST)
- SAGE C++ AST (built with ROSETTA, the C++ AST restructuring tool)
- AST analysis and transformation
- ROSE unparser
- Optimized C++ source code
- Vendor's compiler, producing the executable
Recognition of high-level abstractions by construction of a hierarchy of ASTs (ROSETTA C++ high-level AST restructuring tool).
The ROSE translator acts just like the vendor compiler: it replaces the compiler in the application's Makefile.
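Concretely, "replaces the compiler in the application's Makefile" means only the compiler variable changes; a minimal sketch (the translator name here is hypothetical, not a real ROSE tool name):

```makefile
# Before: the vendor's compiler builds the application directly
#   CXX = g++
# After: a ROSE-built translator accepts the same command line,
# rewrites the source, and invokes the vendor compiler itself
CXX = myRoseTranslator
```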

5 ROSE Project
- Software analysis and optimization for scientific applications
- Tool for building source-to-source translators
- Support for C and C++; F90 in development
- Loop optimizations
- Lab and academic use
- Software engineering
- Performance analysis
- Domain-specific analyses and optimizations
- Development of new optimization approaches
- Optimization of object-oriented abstractions

6 Program Analysis and Optimization
Program analysis (most is Qing Yi's work, with contributions from students):
- Call graph
  - Resolution of function pointers
  - Resolution of virtual functions
  - Resolution of pointers to virtual functions
- Dependence (procedural)
- Control flow (working on the inter-procedural case)
- Slicing (inter-procedural)
- Partial Redundancy Elimination (PRE)
- Connection to OpenAnalysis (work with ANL)
Optimizations:
- Loop optimization (Qing Yi)
  - Loop fusion, fission, blocking, unrolling, array copy, etc.
  - Inlining and outlining
- Annotation-based optimizations
- Custom optimizations
  - Define your own optimization (high level or low level)

7 Automated Recognition (of Library Abstractions and Other Things)
Even performance information is in the AST (HPCToolkit, Rice).

8 ROSE Whole-Application Analysis
Three separate source files (ASTs) are merged; merged ASTs save space and permit analysis of whole ASC-scale applications.
- Supports whole-program analysis (an alternative to the SQLite interface)
- Shares AST nodes
- Preserves simplicity
- Preserves all analysis information
- Simple tools work on whole ASC applications
- Supports hundreds of source files
- Supports million-line applications

9 Large-Scale Application Support
- Call graph analysis
- Scaling up existing analyses

10 Automated Generation of Symbolic Equations for Building Application Models

11 Unparsed Example
Preserves formatting, comments, and preprocessor control structure.
[Figure: original input C++ source code shown alongside the unparsed output C++ source code]

12 Interactions with Others
DOE laboratories:
- LLNL (A-Div (Kull), B-Div (IRS), Mark Graff, TSTT, Overture, Babel)
- ANL (Paul Hovland)
- ORNL
DOE research programs:
- PERC (SLAC, TSTT, C/C++ optimization, UT, ANL, Dyninst binary rewriting)
Collaborations:
- IBM Haifa (Shmuel Ur)
- Texas A&M (Lawrence Rauchwerger, Bjarne Stroustrup)
- Rice University (Ken Kennedy, John Mellor-Crummey)
- Vienna University of Technology (Markus Schordan)
- University of Tennessee (Jack Dongarra's group)
- Cornell University (Sally McKee, Brian White)
- Indiana University (Andrew Lumsdaine, Jeremiah Willcock)
- University of California at Berkeley (UPC, Kathy Yelick)
- University of Oslo (Hans, Andreas, Are)
- University of Maryland (Jeff Hollingsworth, Chadd Williams)
- Friedrich-Alexander-University Erlangen-Nuremberg (Markus Kowarschik, Nils Thurey)
- University of Texas at Austin (Calvin Lin)
- UCSD (Scott Baden)
- Imperial College London (Olav Beckman, Paul Kelly)
- UC Davis (Su, Bishop)