Excursions into Parallel Programming

Michael Chen
TJHSST Computer Systems Lab, 2007-2008
November 1, 2007

Abstract: With more and more computationally intense problems appearing throughout math, science, and technology, the demand for processing power in computers keeps growing. MPI (Message Passing Interface) is one of the most important and effective ways to program in parallel, and despite being difficult to use at times, it has remained the de facto standard. Technology keeps advancing, though, and new MPI libraries have been created to keep up with the capabilities of supercomputers. Efficiency has become the keyword, both in choosing how many computers to use and in managing the latency of message passing. The purpose of this project is to explore some of these optimization methods in specific cases, such as the Game of Life problem.

Introduction:
- MPI offers high computational power and is used in every field that requires intense calculation (AI, molecular biology, ecosystem modeling)
- The library has been expanded to accommodate supercomputers
- Efficiency is becoming an issue of latency vs. processing power: where is the balance point?

[Figure: running a simulation of the Game of Life with 9 processors]

Sample Code: portion of the move() method from the Game of Life. Each process holds one block of the grid; move() sends its local array to the four neighboring processes in the sqrt(size) x sqrt(size) process grid, then copies the ghost cells it receives back into its own array.

    /* Portion of move() from the full program (which includes <mpi.h> and
       <math.h>). Globals assumed from that program: arr and rec (arrsize x
       arrsize int grids), xs and ys (this process's coordinates in the
       process grid), xa and ya (origin of the local block), xymax (local
       block width), amount (ghost-cell depth), rank, size, tag, status. */
    void move() {
        int side = (int)sqrt((double)size);   /* processes per grid row */
        int x, y;

        /* Send the whole local array to each neighbor that exists. */
        if (xs > 0)      MPI_Send(arr, arrsize*arrsize, MPI_INT, rank-1,    tag, MPI_COMM_WORLD);
        if (xs < side-1) MPI_Send(arr, arrsize*arrsize, MPI_INT, rank+1,    tag, MPI_COMM_WORLD);
        if (ys > 0)      MPI_Send(arr, arrsize*arrsize, MPI_INT, rank-side, tag, MPI_COMM_WORLD);
        if (ys < side-1) MPI_Send(arr, arrsize*arrsize, MPI_INT, rank+side, tag, MPI_COMM_WORLD);

        /* After sending, every process waits for its neighbors. Each receive
           is guarded by the same test as its matching send, so processes on
           the edge of the grid do not wait for neighbors that do not exist. */
        if (xs > 0) {
            MPI_Recv(rec, arrsize*arrsize, MPI_INT, rank-1, tag, MPI_COMM_WORLD, &status);
            for (x = amount; x >= 1; x--)
                for (y = ya; y < ya+xymax; y++)
                    arr[xa-x][y] = rec[xa-x][y];
        }
        if (xs < side-1) {
            MPI_Recv(rec, arrsize*arrsize, MPI_INT, rank+1, tag, MPI_COMM_WORLD, &status);
            for (x = amount; x >= 1; x--)
                for (y = ya; y < ya+xymax; y++)
                    arr[xa+xymax+x-1][y] = rec[xa+xymax+x-1][y];
        }
        if (ys > 0) {
            MPI_Recv(rec, arrsize*arrsize, MPI_INT, rank-side, tag, MPI_COMM_WORLD, &status);
            for (y = amount; y >= 1; y--)
                for (x = xa; x < xa+xymax; x++)
                    arr[x][ya-y] = rec[x][ya-y];
        }
        if (ys < side-1) {
            MPI_Recv(rec, arrsize*arrsize, MPI_INT, rank+side, tag, MPI_COMM_WORLD, &status);
            for (y = amount; y >= 1; y--)
                for (x = xa; x < xa+xymax; x++)
                    arr[x][ya+xymax+y-1] = rec[x][ya+xymax+y-1];
        }
    }

Procedures and Methodology:
- MPI can be used from C, C++, and Fortran
- It is flexible and not restricted to particular capabilities
- Start with non-MPI code, then convert it (a minimal serial kernel is sketched below)
- Work proceeds case by case; there is no general testing program
- Optimizing the code depends on a given computer's specific latency and processing power. Depending on which dominates, both the best number of computers and the amount of message passing in the code change.

Results and Conclusions:
- Many real-life applications: blocking vs. non-blocking checkpointing, supercomputing (a non-blocking communication sketch appears below)
- With the Game of Life, more factors have to be considered than just the number of computers: how often to pass information, and how much information to pass
- Moving the program from the theoretical to the practical
- Latency algorithms for testing are possible (see the ping-pong sketch below)

[Figure: diagram of latency vs. processing power]
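The methodology's "start with non-MPI code, then convert" step is easiest to see against the serial kernel itself. The project's serial version is not shown in the slides, so the following is only a minimal sketch of a plain C Game of Life update; the grid size N and the names grid, next, and life_step are illustrative, not from the original program.

    #define N 512   /* illustrative grid size, not from the original program */

    /* One serial generation: count each interior cell's eight live neighbors
       and apply the standard rules (birth on 3, survival on 2 or 3). */
    void life_step(int grid[N][N], int next[N][N]) {
        for (int x = 1; x < N-1; x++)
            for (int y = 1; y < N-1; y++) {
                int n = 0;
                for (int dx = -1; dx <= 1; dx++)
                    for (int dy = -1; dy <= 1; dy++)
                        if (dx != 0 || dy != 0)
                            n += grid[x+dx][y+dy];
                next[x][y] = (n == 3) || (grid[x][y] && n == 2);
            }
    }

Converting this to MPI amounts to giving each process one block of the grid and exchanging the block borders every generation, which is exactly the job of the move() method shown above.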
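The conclusions contrast blocking and non-blocking approaches (there in the context of checkpointing). The same contrast applies to the communication in move(), which uses blocking MPI_Send/MPI_Recv and therefore serializes the four exchanges. The sketch below, assuming the same globals as move() plus one receive buffer per neighbor, posts every transfer with MPI_Isend/MPI_Irecv and waits for all of them at once; it illustrates the general technique and is not code from the original project.

    /* Non-blocking variant of the halo exchange in move(): post all sends and
       receives up front, then wait for them together so the four exchanges
       can overlap. Assumes the globals used by move() (arr, arrsize, xs, ys,
       rank, size, tag). */
    void move_nonblocking(int *recbuf[4]) {
        int side      = (int)sqrt((double)size);
        int nbr[4]    = { rank - 1, rank + 1, rank - side, rank + side };
        int exists[4] = { xs > 0, xs < side - 1, ys > 0, ys < side - 1 };
        MPI_Request req[8];
        int nreq = 0;

        for (int i = 0; i < 4; i++) {
            if (!exists[i]) continue;
            MPI_Isend(arr, arrsize*arrsize, MPI_INT, nbr[i], tag,
                      MPI_COMM_WORLD, &req[nreq++]);
            MPI_Irecv(recbuf[i], arrsize*arrsize, MPI_INT, nbr[i], tag,
                      MPI_COMM_WORLD, &req[nreq++]);
        }
        MPI_Waitall(nreq, req, MPI_STATUSES_IGNORE);
        /* ...then copy ghost cells out of each recbuf[i], as move() does
           with rec... */
    }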
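The claim that latency algorithms for testing are possible is usually realized as a ping-pong benchmark: one rank bounces a small message off another and times the round trips. The following self-contained sketch (not from the original project) estimates the one-way latency that the project weighs against processing power.

    #include <mpi.h>
    #include <stdio.h>

    /* Ping-pong latency test: rank 0 sends one int to rank 1, which echoes
       it back; half the average round-trip time approximates the one-way
       message latency. Run with at least two processes. */
    int main(int argc, char **argv) {
        int rank, msg = 0;
        const int trials = 1000;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double t0 = MPI_Wtime();
        for (int i = 0; i < trials; i++) {
            if (rank == 0) {
                MPI_Send(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Send(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
            }
        }
        if (rank == 0)
            printf("approximate one-way latency: %g s\n",
                   (MPI_Wtime() - t0) / (2.0 * trials));

        MPI_Finalize();
        return 0;
    }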