PP POMPA status Xavier Lapillonne.

Presentation transcript:

PP POMPA status Xavier Lapillonne

Performance On Massively Parallel Architectures
Main outcomes:
- Performance-portable version of the dycore -> the STELLA DSL
- GPU-capable version of the COSMO model
- Possibility to run the model in single precision

Main task in 2017: merge all developments into the official version 5.X (official branch at 5.0+, POMPA branch based on 4.19).

C++/STELLA dynamical core
- Integrated: works in GPU/CPU mode; will be distributed with version 5.05
- Full developer documentation distributed to the SCA

Integration status (OpenACC + communication)
Components and status:
- radiation, surface, shallow convection, SSO: merged
- turbulence, microphysics: ongoing
- Tiedtke convection, FLake: to do
- sea ice, Bechtold: not considered
Other items: assimilation, organize code, communication (GLC)
Release of the GPU version: Q1 2018?

Integrating GPU-enabled code (SCA view)
- In principle no problem: OpenACC statements are directives and are simply ignored by a non-GPU compiler
- Readability of the code is not affected (at least once you get accustomed to it)
- Problematic are the parts where the code has to be modified (ifdef or a different implementation); a small sketch follows below
- The C++ dycore: need to gain experience; currently installed on a workstation by P. Spoerri
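
As an illustration of these two points (a minimal sketch only; COSMO itself is Fortran, where the directives are !$acc comments, but the principle is the same in C++): the #pragma acc line is ignored by a compiler without OpenACC support, while genuinely different CPU/GPU code paths need an explicit guard such as the standard _OPENACC macro.

#include <cstdio>
#include <vector>

// Hypothetical example: relax a field toward a reference value.
// The OpenACC pragma is a plain directive: a compiler without
// OpenACC support ignores it and emits an ordinary serial loop.
void relax(std::vector<double>& field, double ref, double alpha) {
    double* f = field.data();
    const int n = static_cast<int>(field.size());
    #pragma acc parallel loop copy(f[0:n])
    for (int i = 0; i < n; ++i) {
        f[i] += alpha * (ref - f[i]);
    }
}

int main() {
    std::vector<double> t(1000, 280.0);
    relax(t, 288.0, 0.1);

    // The problematic case: code that must differ between CPU and GPU
    // builds needs an explicit guard (the _OPENACC macro is defined by
    // OpenACC-enabled compilers) or a separate implementation.
#ifdef _OPENACC
    std::printf("built with OpenACC support\n");
#else
    std::printf("built without OpenACC support\n");
#endif
    return 0;
}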

Testing: 3 ways
- On a Linux workstation
- On the Cray test cluster at DWD
- On CSCS machines, using GitHub/Jenkins and the automated test suite
GitHub pull request (ok if red)

COMET experience with GPU
- Large GPU system: 112 K80 GPUs
- Experience with COSMO-ME (ensemble) shows significant speedup

What does a 3x speedup mean? (figure: comparison of the GPU system and the CPU system)

Related projects: ENIAC
- 3 years, funded by the PASC initiative (3 postdocs), PI U. Lohmann
- Collaboration: ETH, CSCS, MPI-M, MeteoSwiss, DWD
- Adapt the ICON model to run on GPU and many-core architectures (Xeon Phi)
- Achieve a high degree of performance portability by:
  - using the source-to-source translation tool CLAW
  - exploring alternative approaches such as the GridTools domain-specific language (DSL)
- Prepare the ICON model for actual use cases in weather and climate on emerging HPC systems

Related projects: PASCHA
- 3 years, funded by the PASC initiative (2 postdocs), PI T. Hoefler
- Focus on the COSMO model
- Replace the STELLA library with the GridTools library
- Port to Xeon Phi (GridTools backend and code adaptation)
- Improve the strong scalability of COSMO by introducing functional parallelism
- Acceptance of the DSL is low: explore a new approach (next slide)

Prototype high-level DSL
Example: gtclang, Coriolis force
- No loops
- No data structures
- No halo updates

function avg {
  offset off;
  storage in;
  avg = 0.5 * ( in(off) + in() );
};

function coriolis_force {
  storage fc, in;
  coriolis_force = fc() * in();
};

operator coriolis {
  storage u_tend, u, v_tend, v, fc;
  vertical_region ( k_start, k_end ) {
    u_tend += avg(j-1, coriolis_force(fc, avg(i+1, v)));
    v_tend -= avg(i-1, coriolis_force(fc, avg(j+1, u)));
  }
};

High-level DSLs
- Acceptance of STELLA / GridTools is low (C++, boilerplate, code safety, manual optimization, ...)
- A high-level DSL can leverage the power of GridTools while providing a more acceptable and safer frontend

IMGW: port of the EULAG dycore to GridTools
- Hard work, not enough examples/documentation available
- Difficult to debug, lack of user-friendly tools
- Difficult to diagnose the actual state of the computation
But:
- It is worth the effort if we can get performance-portable code
- It could pay off economically (a cheaper machine in the end)

HPC priority project for the COSMO consortium
The POMPA project will finish at the end of 2017.
Possible content:
- Achieve performance portability: x86 multi-core CPU, NVIDIA GPU, Intel Xeon Phi
- Increase strong scalability through task parallelism (e.g. different components running in parallel; see the sketch after this list)
- Apply and evaluate different programming models: DSLs, directives, source-to-source compilers
- More intensive use of libraries
- Address low precision (single precision)
Questions:
- Only consider ICON? (what about COSMO?)
- How to achieve broader engagement within COSMO compared to the POMPA priority project?
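
To illustrate the task-parallelism idea above, here is a minimal sketch (the component names and the use of std::async are illustrative assumptions, not the COSMO implementation): two independent components are launched concurrently and synchronized before their results are combined.

#include <cstddef>
#include <future>
#include <vector>

// Hypothetical stand-ins for two independent model components.
std::vector<double> compute_radiation(std::size_t n) {
    return std::vector<double>(n, 1.0);   // placeholder work
}
std::vector<double> compute_microphysics(std::size_t n) {
    return std::vector<double>(n, 2.0);   // placeholder work
}

int main() {
    const std::size_t n = 100000;

    // Launch the two components as concurrent tasks instead of
    // calling them one after the other (functional/task parallelism).
    auto rad = std::async(std::launch::async, compute_radiation, n);
    auto mic = std::async(std::launch::async, compute_microphysics, n);

    // Synchronize before combining the results.
    const std::vector<double> rad_t = rad.get();
    const std::vector<double> mic_t = mic.get();

    std::vector<double> total(n);
    for (std::size_t i = 0; i < n; ++i) {
        total[i] = rad_t[i] + mic_t[i];
    }
    return 0;
}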

Thanks for your attention