Chapel: The Cascade High Productivity Language
Ting Yang
Department of Computer Science, University of Massachusetts Amherst



Slide 2: Context
- HPCS = High Productivity Computing Systems, a DARPA program with four goals: programmability, performance, portability, robustness. Cray, IBM, and Sun were the HPCS vendors.
- Cascade = Cray's HPCS project: a system-wide consideration of productivity impacts, spanning processors, memory, network, and OS as well as runtime, compilers, and languages.
- Chapel = the Cascade High-Productivity Language.

Slide 3: Introduction - Why Chapel
The fragmented model (MPI, SHMEM, UPC) makes programmers write code on a processor-by-processor basis:
- breaks up data structures and control flow,
- mixes the algorithm with per-processor management details: virtual processor topology, communication details, choice of data structures and memory layout,
- fails to support composition of parallelism.
The result lacks productivity, flexibility, and portability, and is difficult to understand and maintain.

Slide 4: Introduction - The Global-View Model
The global-view model (HPF, OpenMP, ZPL, NESL) is natural and intuitive: programmers need not decompose data and control flow themselves; decomposition is done by the compiler and runtime, with users providing only high-level guidance. Its drawbacks:
- lacks abstractions such as sets, hash tables, and graphs,
- performance is not as good as MPI,
- difficult to compile.

Slide 5: Introduction - Chapel
- Chapel: the Cascade High-Productivity Language. Builds on HPF and ZPL; strictly typed.
- Overall goals: simplify the creation of parallel programs, support high-performance production-grade codes, provide more generality.
- Motivating language technologies: multithreaded parallel programming, locality-aware programming, object-oriented programming, generic programming and type inference.

Slide 6: Outline
- Introduction
- Multithreaded Parallel Programming: data parallelism, task parallelism
- Locality-Aware Programming: data distribution, computation distribution
- Other Features
- Summary

Slide 7: Multithreaded Parallel Programming
- Provides a global view of computation and of data structures.
- Supports composition of parallelism.
- Abstractions for both data and task parallelism:
  - data parallelism: domains, arrays, graphs
  - task parallelism: cobegins, atomic blocks, sync variables
- Virtualization of threads, which are mapped onto locales.

Slide 8: Data Parallelism: Domains
A domain is a first-class index set, the fundamental concept of data parallelism (a generalization of ZPL's region):
- specifies the size and shape of "arrays",
- supports sequential and parallel iteration,
- is potentially decomposed across locales,
- has an index type, index(domain).
Important kinds of domains:
- Arithmetic: indices are Cartesian tuples; used for arrays, including multidimensional arrays; can be strided and arbitrarily sparse.
- Infinite: indices are hash keys; used for maps, hash tables, associative arrays.
- Opaque: indices are anonymous; used for sets, trees, graphs.
- Others: enumerated domains.

Slide 9: Domain Declaration
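The code from this slide was not preserved in the transcript. A minimal sketch of arithmetic domain declarations, written in the talk-era Chapel notation used on the other slides (the names m, n, D, DInner, and Rows are illustrative; D and DInner reappear on the "Domain Uses" slide below):

    var m: integer = 4;
    var n: integer = 8;

    // a 2D arithmetic domain: the index set {1..m} x {1..n}
    var D: domain(2) = [1..m, 1..n];

    // an interior subdomain of D, one index smaller on each side
    var DInner: domain(2) = [2..m-1, 2..n-1];

    // a 1D domain over the row range
    var Rows: domain(1) = [1..m];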

Slide 10: More Domain Declarations
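This slide's code is also lost. Given slide 8's mention of strided, sparse, and enumerated domains, a hedged sketch of what these richer declarations might look like (syntax follows the early Chapel papers; the sparse and enum details in particular are assumptions):

    // a strided domain: every second index in each dimension
    var DStrided: domain(2) = [1..m by 2, 1..n by 2];

    // a sparse subdomain of D: only explicitly added indices exist
    var DSparse: sparse subdomain(D);

    // an enumerated domain over a user-defined enumeration
    enum color { red, green, blue };
    var Colors: domain(color);
    var Intensity: [Colors] float;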

Slide 11: Domain Uses
- Declaring arrays:
    var A, B: [D] float;
- Sub-array references:
    A(DInner) = B(DInner);
- Sequential iteration:
    for (i,j) in DInner { ... A(i,j) ... }
  or:
    for ij in DInner { ... A(ij) ... }
- Parallel iteration:
    forall (i,j) in DInner { ... A(i,j) ... }
  or:
    forall ij in DInner { ... A(ij) ... }
- Array re-allocation (arrays declared over D are re-allocated to match):
    D = [1..2*m, 1..2*n];

Slide 12: Infinite Domains

    var People: domain(string);
    var Age: [People] integer;
    var Birthdate: [People] string;

    Age("john") = 60;
    Birthdate("john") = "12/11/1946";

    forall person in People {
      if (Birthdate(person) == today) {
        Age(person) += 1;
      }
    }

Slide 13: Opaque Domains

    var Vertices: domain(opaque);

    for i in (1..5) {
      Vertices.newIndex();
    }

    var AV, BV: [Vertices] float;

Slide 14: Building a Tree

    var Vertices: domain(opaque);
    var left, right: [Vertices] index(Vertices);
    var root: index(Vertices);

    root = Vertices.newIndex();
    left(root) = Vertices.newIndex();
    right(root) = Vertices.newIndex();
    left(right(root)) = Vertices.newIndex();

Slide 15: The Domain/Index Hierarchy
- Every domain has an index type, index(D).
- This eliminates most runtime bounds checks.
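To make the bullet concrete, a small illustrative sketch (not from the slides): loop indices drawn from a domain carry its index type, so the compiler knows they are in bounds:

    var D: domain(2) = [1..m, 1..n];
    var A: [D] float;

    // ij has type index(D): it provably lies within D,
    // so A(ij) needs no runtime bounds check
    forall ij in D {
      A(ij) = 0.0;
    }

    // the index type can also be named explicitly
    var corner: index(D) = (1, 1);
    A(corner) = 1.0;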

Slide 16: Task Parallelism
- cobegin: the enclosed statements may run in parallel:

    cobegin {
      ComputeTaskA(...);
      ComputeTaskB(...);
    }

- atomic blocks:

    atomic {
      newnode.next = insertpt;
      newnode.prev = insertpt.prev;
      insertpt.prev.next = newnode;
      insertpt.prev = newnode;
    }

- sync and single-assignment variables synchronize tasks (a sketch follows this slide).
- Parallelism composes: a task launched by one cobegin may itself contain another:

    ComputeTaskA() {
      cobegin {
        ComputeTaskC(...);
        ComputeTaskD(...);
      }
      ComputeTaskE(...);
    }
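The slide names sync variables without showing one. A minimal sketch, assuming the usual full/empty semantics (a write fills the variable; a read blocks until it is full, then empties it):

    var result: sync integer;

    cobegin {
      // producer: the write marks result "full"
      result = 42;

      // consumer: the read blocks until result is full
      writeln(result);
    }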

Slide 17: Outline
- Introduction
- Multithreaded Parallel Programming: data parallelism, task parallelism
- Locality-Aware Programming: data distribution, computation distribution
- Other Features
- Summary

Slide 18: Locality-Aware Programming
- A locale is the machine unit of storage and processing.
- The number of locales is specified on the command line:
    ./myProgram -nl 8
- Chapel provides a built-in array of locales:
    const Locales: [1..numLocales] locale;
- Users may define their own locale arrays:
    var CompGrid: [1..GridRows, 1..GridCols] locale = ...;
    var TaskALocs: [1..numTaskALocs] locale = ...;
    var TaskBLocs: [1..numTaskBLocs] locale = ...;

Slide 19: Data Distribution
- Domains can be distributed across locales:
    var D: domain(2) distributed(block(2) to CompGrid) = ...;
- A distribution specifies the mapping of indices to locales and the per-locale storage layout of domain indices and array elements.
- Distributions are implemented as a class hierarchy; Chapel provides a group of standard distributions, and users may also write their own ???
- Reductions and scans (parallel prefix) are supported, including user-defined operations (a sketch follows).
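A hedged sketch of reduce and scan over a distributed array (the 1D distributed-domain declaration is an assumption extrapolated from the 2D example above; + stands in for any built-in or user-defined operator):

    const D: domain(1) distributed(block) = [1..n];
    var A: [D] float;

    // reduction: collapse all elements of A into a single value
    var total: float = + reduce A;

    // scan (parallel prefix): B(i) = A(1) + A(2) + ... + A(i)
    var B: [D] float = + scan A;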

Slide 20: Computation Distribution
- The "on" keyword associates tasks with a locale or set of locales.
- "on" can also be used in a data-driven manner, placing computation where the data lives.
A sketch of both uses follows.
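This slide's examples did not survive the transcript; a sketch of both uses, reusing the Locales array from slide 18 (computeTaskA and computeTaskB are hypothetical task functions):

    // explicit placement: run each task on a chosen locale
    cobegin {
      on Locales(1) do computeTaskA();
      on Locales(2) do computeTaskB();
    }

    // data-driven placement: execute each update on the locale
    // that owns the array element A(i)
    forall i in D {
      on A(i) do A(i) = 2.0 * A(i);
    }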

Slide 21: Outline
- Introduction
- Multithreaded Parallel Programming: data parallelism, task parallelism
- Locality-Aware Programming: data distribution, computation distribution
- Other Features
- Summary

Slide 22: Other Features
- Object-oriented interface: OO style is optional; overloading; advanced language features are expressed as classes.
- Generics and type inference: type variables and parameters, similar to class templates in C++ (a sketch follows).
- Sequences ("seq") and iterators; the "ordered" keyword suppresses parallelism.
- Modules, for namespace management.
- Parallel garbage collection ???
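As an illustration of the generics bullet (not from the slides), a hedged sketch of a type-inferred generic function; proc is the modern Chapel keyword, and the talk-era keyword may have differed:

    // a generic function: argument types are inferred per call site
    proc add(x, y) {
      return x + y;
    }

    var i = add(1, 2);          // instantiated for integers
    var s = add("ab", "cd");    // instantiated for strings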

Slide 23: Outline
- Introduction
- Multithreaded Parallel Programming: data parallelism, task parallelism
- Locality-Aware Programming: data distribution, computation distribution
- Other Features
- Chapel Status

Slide 24: Chapel Status
- A first sequential prototype runs on a single locale; it is not finished yet.
- It can currently run simple programs: domains of up to two dimensions, partial type inference.
- Mapping: threads → locales → processors.
- A full prototype is expected in one or two years.