A Domain Decomposition Parallel Implementation of an Elasto-viscoplastic Fast Fourier Transform Micromechanical Solver with Spectral Database Constitutive Representation

Adnan Eghtesad, Timothy Barrett, Kai Germaschewski, Ricardo A. Lebensohn, Rodney J. McCabe, and Marko Knezevic
Department of Mechanical Engineering, University of New Hampshire, Durham, NH 03824, USA
Department of Physics, University of New Hampshire, Durham, NH 03824, USA
Materials Science and Technology Division, Los Alamos National Laboratory, Los Alamos, NM 87544, USA

What is crystal plasticity?
A multiscale plasticity approach that explicitly models discrete grains and slip systems to capture the microstructural anisotropy of a material.
[Figure labels: slip systems, grain, polycrystalline microstructure RVE, macroscale homogenized model]

Crystal plasticity constitutive framework

Computational crystal plasticity
Crystal plasticity simulations are more accurate, predictive, and robust than macroscale plasticity, but they are slow! They need to be accelerated to make further research in this field efficient.

Fast CP simulations
* Parallel run on a cluster of CPUs (domain decomposition)
* Texture compaction
* Parallel run on multiple GPUs
* Non-iterative spectral solvers (pre-computed databases)

High Performance Computing (HPC)
NASA Advanced Supercomputing (NAS) cluster with 29,368 computing cores
https://www.hec.nasa.gov/news/features/2011/sbus_042111.html

Tools for parallel computing on CPUs
* OpenMP
* MPI
* Hybrid OpenMP-MPI

OpenMP (shared memory parallel programming)
* Supports multi-threaded parallel programming on shared memory
* Uses directives around loops to parallelize them

MPI (Message Passing Interface)
* Facilitates communication among the nodes of a cluster connected through a high-speed network
* Parallelization with MPI is done through domain decomposition

EVPFFT (Elasto-viscoplastic Fast Fourier Transform)

EVPFFT domain decomposition

Profile the code (performance profiler)

Up to 95x speedup on only 64 CPUs!
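
The two ideas named in the slides, splitting the grid among processes (domain decomposition) and placing an OpenMP directive around a loop, can be illustrated with a minimal C sketch. Everything in it is an assumption made for illustration and is not taken from the EVPFFT code: the names `slab_t`, `slab_bounds`, and `slab_sum` are hypothetical, the grid is assumed to be split into slabs of z-planes, and the field is assumed to be stored with z as the slowest index. No MPI calls are made; the sketch only computes which planes each rank would own. In a real MPI code each rank would hold just its own slab rather than indexing into one global array.

```c
#include <assert.h>

/* Hypothetical type: half-open range [start, end) of z-planes owned by one rank. */
typedef struct {
    int start;
    int end;
} slab_t;

/* Split nz planes across nranks as evenly as possible.
 * The first (nz % nranks) ranks receive one extra plane. */
slab_t slab_bounds(int nz, int nranks, int rank)
{
    assert(nz >= 0 && nranks > 0 && rank >= 0 && rank < nranks);
    int base = nz / nranks;
    int extra = nz % nranks;
    slab_t s;
    s.start = rank * base + (rank < extra ? rank : extra);
    s.end = s.start + base + (rank < extra ? 1 : 0);
    return s;
}

/* Sum the field values inside one slab.
 * Assumed layout: index = (z * ny + y) * nx + x, so a slab is contiguous.
 * The OpenMP directive shares the loop iterations among threads; if the
 * compiler is run without OpenMP support the pragma is ignored and the
 * loop runs serially with the same result. */
double slab_sum(const double *field, int nx, int ny, slab_t s)
{
    double total = 0.0;
    long first = (long)s.start * nx * ny;
    long last = (long)s.end * nx * ny;

    #pragma omp parallel for reduction(+:total)
    for (long i = first; i < last; ++i) {
        total += field[i];
    }
    return total;
}
```

For example, with 10 planes and 4 ranks, `slab_bounds` assigns planes [0, 3), [3, 6), [6, 8), and [8, 10). In a hybrid OpenMP-MPI setup, each MPI rank would call something like `slab_sum` on its own slab and the partial sums would then be combined across ranks.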