University of Minnesota Comments on Co-Array Fortran Robert W. Numrich Minnesota Supercomputing Institute University of Minnesota, Minneapolis.



2 Philosophy Behind the CAF Model
- A minimum number of new features that look and feel like Fortran.
- The rules for co-dimensions are the same as those for normal dimensions, with a few exceptions.
- The CAF model is purely local:
  - The compiler performs normal optimization between synchronization points.
  - The compiler is not required, not expected, and in fact deliberately prevented from performing global optimization.
- The programmer is responsible for explicit data distribution, explicit communication, and explicit synchronization.
- The programmer is responsible for memory consistency.

3 The Essential Features of CAF
- Co-array syntax
  - Co-dimensions (how images are related to each other)
- Data communication (how data moves between images)
- Synchronization
  - Full barrier
  - Pair-wise handshake
- Dynamic memory management
  - Allocatable co-arrays
  - Allocatable/pointer components of co-array derived types
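A minimal sketch that exercises each of these features together; the ring-neighbor exchange and all names are illustrative, not from the slides:

```fortran
program essentials
  implicit none
  integer :: a(10)[*]            ! co-array: every image holds its own a(10)
  integer :: me, right

  me    = this_image()
  right = merge(1, me + 1, me == num_images())   ! neighbor in a ring

  a = me                         ! purely local computation
  sync all                       ! full barrier across all images
  a(1)[right] = me               ! one-sided put: data moves between images
  sync all                       ! make the put visible before anyone reads a(1)
end program essentials
```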

4 Recommendations
- Delete all of Section 8.5; substitute SYNC[([myPal])].
- Delete everything on I/O except:
  - Only image 1 reads stdin.
  - Records from different images are not mixed to stdout.
- Delete all the material on collectives; substitute the functions glbSum(x), glbMin(x), glbMax(x).
  - The argument is not necessarily a co-array.
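As source text, the recommended replacements would look roughly like this; this is the slide's proposal, not standard Fortran syntax:

```fortran
! Proposed forms (not standard Fortran):
sync                    ! full barrier across all images
sync (myPal)            ! pair-wise handshake with image myPal
total = glbSum(x)       ! proposed intrinsic reductions;
small = glbMin(x)       ! the argument x need not be a co-array
large = glbMax(x)
```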

5 Ragged Arrays
1. The most important and most powerful feature of the CAF model.
   - Allows each image to have data structures with different sizes and different locations on each image.
   - No other model can handle this; Fortran with CAF extensions handles it in a very natural way.
2. Allocatable/pointer components of co-array derived types:

     type x
       real, allocatable, target :: z(:)
       real, pointer :: ptr(:)
     end type x
     type(x) :: y[*]

     allocate (y%z(someRule(this_image())))
     y%ptr => y%z
     sync all
     y[p]%ptr = …      ! remote definition through the component

3. The most difficult feature to implement on crummy hardware and/or crummy operating systems.
   - Good systems should be rewarded for getting this right.
   - We should not allow bad systems to drag everything down to the lowest common denominator.

6 Memory Consistency
- The programmer, not the compiler, is responsible for maintaining memory consistency.
- The rules must be very simple and very clear:
  - With just SYNC, it is very simple.
  - The addition of NOTIFY/QUERY makes it not so simple.
- Segment boundary statements:
  - The compiler may optimize between segment boundaries but not across segment boundaries.
  - The processor must make co-arrays "visible" to all images across segment boundaries.
  - The programmer must make sure that one and only one image defines a co-array variable "at the same time".
  - The programmer must make sure that no image tries to reference a co-array variable "at the same time" as another image is trying to define it.
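The define/reference discipline above, sketched as two segments separated by a boundary (the value 42 is illustrative):

```fortran
program segments
  implicit none
  integer :: n[*]

  ! Segment 1: exactly one image defines the co-array variable.
  if (this_image() == 1) n = 42

  sync all     ! segment boundary: the compiler may not move the
               ! definition across it, and n becomes visible everywhere

  ! Segment 2: other images may now reference it without a race.
  if (this_image() /= 1) n = n[1]
end program segments
```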

7 Input/Output
What is the minimum needed?
- stdin/stdout are special cases:
  - Always connected to all images.
  - Only image 1 can read stdin.
  - The system must not mix records to stdout from different images.
- Shared files:
  - Allow each image to open the same unit.
  - Direct access only; allowing sequential access would require changes to BACKSPACE, REWIND, etc.
  - open(unit=u,access= ,…) becomes open(unit=u,access=connectList,…)
- Do we need teams?
- Do we need to sync?
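A sketch of the shared-file idea above: every image connects the same file to the same unit, and direct access gives each image its own record so nothing is mixed. Whether several images may connect one file at all is exactly what the proposal would have to legalize; the file name and record layout here are illustrative:

```fortran
program shared_file
  implicit none
  integer :: me
  me = this_image()
  ! Each image opens the same file on the same unit, direct access only.
  open (unit=10, file='out.dat', access='direct', recl=8, &
        form='unformatted')
  write (10, rec=me) me   ! one record per image, so records cannot mix
  close (10)
end program shared_file
```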

8 Collectives
- Avoid language bloat:
  - Not everything in the MPI library needs to be reproduced in CAF as intrinsic procedures.
  - What do we really need and want? What should the interface be?
- CAF is intended as a low-level language:
  - Collectives are easy to write in CAF for any specific procedure.
  - Throwing a long list of new intrinsic procedures over the wall may discourage vendors from adopting CAF.
- If anything, supply the intrinsic functions glbSum(x), glbMin(x), glbMax(x); the argument x need not be a co-array.
- Propose a supporting library for CAF.
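To illustrate the claim that collectives are easy to write in CAF itself, here is a naive glbSum: image 1 gathers, then every image reads the result. Only the internal buffer is a co-array; the argument x is not. All names are illustrative:

```fortran
function glbSum(x) result(s)
  implicit none
  real, intent(in) :: x
  real :: s
  real, save :: buf[*]          ! communication buffer; x itself is not a co-array
  integer :: p

  buf = x
  sync all
  if (this_image() == 1) then
    s = 0.0
    do p = 1, num_images()
      s = s + buf[p]            ! image 1 gathers every contribution
    end do
    buf = s                     ! stage the result for broadcast
  end if
  sync all
  s = buf[1]                    ! every image reads the broadcast result
end function glbSum
```

A real implementation would use a tree reduction, but the point of the slide stands: a dozen lines of plain CAF suffice for any specific collective.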

9 VOLATILE Co-arrays?
What do we want the VOLATILE attribute to mean for co-arrays?
- Inhibit optimization?
- Always read from memory?
- Always flush to memory?
Can VOLATILE make spin-loops work without the need for an artificial sync-memory() from the programmer?
- Statements with co-arrays are segment boundary statements.
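The spin-loop in question looks like the following sketch. Whether this works without an explicit memory fence is exactly what the VOLATILE question asks; this is not guaranteed-correct code under any standard:

```fortran
program spin
  implicit none
  logical, volatile :: ready[*]

  ready = .false.
  sync all
  if (this_image() == 1) then
    ready[2] = .true.           ! raise the flag on image 2
  else if (this_image() == 2) then
    ! Without VOLATILE the compiler may hoist this read out of the
    ! loop; without a fence, data the flag guards may not yet be
    ! visible even when the flag itself is.
    do while (.not. ready)
    end do
  end if
end program spin
```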

10 SYNC/BARRIER
One new statement, SYNC:
- Implies a full BARRIER for all images.
- No arguments.
No SYNC_IMAGES:
- Synchronization between subsets of images can be done with NOTIFY/QUERY.

11 NOTIFY/QUERY
Why do we want NOTIFY/QUERY?
- Split-phase sync
- Subset sync
- Master-slave work distribution
Open questions:
- Should they match in pairs?
- Should we expose which notify matches which query?
Maybe we really want EVENTS?
- EVENT_POST(tag)
- EVENT_WAIT(tag)
- EVENT_CLEAR(tag)
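The master-slave case above might look like this with the proposed statements; this is a sketch of proposed syntax, not standard Fortran, and work, next_task, and process are illustrative names:

```fortran
! Master-slave work distribution with the proposed NOTIFY/QUERY.
if (this_image() == 1) then
  do worker = 2, num_images()
    work[worker] = next_task()  ! deliver a task to one image...
    notify (worker)             ! ...then signal just that image
  end do
else
  query (1)                     ! block until the master's notify arrives;
  call process(work)            ! no full barrier is needed
end if
```

With events instead, the notify would become EVENT_POST(tag) on the master and the query an EVENT_WAIT(tag) on the worker, which sidesteps the question of which notify matches which query.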