HPC User Forum: Back-End Compiler Technology Panel

HPC User Forum: Back-End Compiler Technology Panel Luiz DeRose Programming Environment Director Cray Inc. ldr@cray.com

Q: Are compiler code generation techniques going to transition along with the hardware transition from multi-core to many-core and hybrid systems, and at what speed?

- Multi-core to many-core:
  - The hard part is determining if the code can go 2-wide; from there, going n-wide is straightforward.
  - The next release of the Cray Compiler will support Automatic Parallelism for "n-Cores".
- Hybrid systems will take more work and more time.
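A hedged sketch of the 2-wide versus n-wide point, assuming a plain Fortran loop with provably independent iterations (an illustration only, not actual Cray Compiler output):

    ! Each iteration reads and writes only its own elements, so a compiler
    ! that can prove a 2-wide split is safe can just as easily split the
    ! iteration space across n cores without any user directives.
    program axpy_demo
      implicit none
      integer, parameter :: n = 1000000
      real(8) :: x(n), y(n), a
      integer :: i

      a = 2.5d0
      x = 1.0d0
      y = 0.5d0

      do i = 1, n
         y(i) = y(i) + a * x(i)
      end do

      print *, 'y(1) =', y(1)
    end program axpy_demo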

Q: What information do you need from a compiler intermediate format to efficiently utilize multi-core, many-core, and hybrid systems that is not available from traditional languages like C, C++, or F90? Are you looking at directive-based or library-based approaches, or is there another approach that you like?

- The parallelism concepts should be fully integrated into the IR:
  - UPC and CAF need to be optimized like any ordinary references.
  - OpenMP needs to be treated as part of the language, not as an add-on.
- Our approach is:
  - Fully automatic utilization of parallel resources, whatever and wherever they are.
  - Use of language extensions (e.g., UPC) to guide this parallelization.
  - Use of directives as necessary for tuning, parallel placement, and OpenMP (see the sketch below).
  - Use of user-called libraries as a last resort.
- But fully automatic is our principal approach, without the need for hand-inserted directives or library calls.
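A minimal sketch of that layering, assuming a reduction loop that is already a candidate for automatic parallelization and using a standard OpenMP directive purely as a tuning knob (the directive and clauses are standard OpenMP, not Cray-specific extensions):

    ! The reduction loop could be parallelized automatically; the directive
    ! is added only to state the reduction and scheduling explicitly.
    program tuning_demo
      implicit none
      integer, parameter :: n = 4000000
      real(8) :: a(n), s
      integer :: i

      do i = 1, n
         a(i) = sin(real(i, 8))
      end do

      s = 0.0d0
      !$omp parallel do reduction(+:s) schedule(static)
      do i = 1, n
         s = s + a(i)
      end do
      !$omp end parallel do

      print *, 'sum =', s
    end program tuning_demo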

Q: Is embedded global memory addressing (like Co-Array Fortran) going to be widely available and supported even on distributed memory systems?

- YES! (after all, we invented it!)
- UPC and Co-Array Fortran are already fully optimized and integrated into the Cray Compiler; no preprocessor is involved.
- The Cray X86 Compiler already supports the proposed Fortran 2008 co-arrays.
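To make the co-array model concrete, here is a minimal Fortran 2008 coarray sketch using only standard syntax (illustrative code, not taken from the slides): every image holds its own copy of val, and image 1 reads a neighbour's copy after a synchronization point.

    program coarray_demo
      implicit none
      integer :: val[*]          ! one copy of val on every image
      integer :: me, np

      me = this_image()
      np = num_images()
      val = 100 * me

      sync all                   ! make every image's assignment visible

      if (me == 1 .and. np > 1) then
         print *, 'image 1 sees val on image 2 =', val[2]
      end if
    end program coarray_demo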

Q: What kind of hybrid systems or processor extensions are going to be supported by your compiler's code generation suite?

- Yes.

Q: What new run-time libraries will be available to utilize multi-core, many-core, and hybrid systems, and will they work seamlessly through dynamic linking?

- OpenMP supports nested parallelism and, depending on the application, scales nicely (see the sketch below).
- Automatic Parallelism in the Cray X86 Compiler interacts with the OpenMP runtime library.
- Dynamic linking tends to have a negative impact on performance, sometimes very significantly.
- However, DSO support is on our road map, so it will be available for those users who are willing to take the performance hit.
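A hedged illustration of nested OpenMP parallelism using only standard OpenMP API calls (nothing Cray-specific is assumed): an outer team of two threads each spawns an inner team of two.

    program nested_demo
      use omp_lib
      implicit none

      call omp_set_max_active_levels(2)   ! allow two nested parallel levels

      !$omp parallel num_threads(2)
        !$omp parallel num_threads(2)
          print *, 'outer thread', omp_get_ancestor_thread_num(1), &
                   'inner thread', omp_get_thread_num()
        !$omp end parallel
      !$omp end parallel
    end program nested_demo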