Lecture 1 TTH 03:30PM-04:45PM Dr. Jianjun Hu CSCE569 Parallel Computing University of South Carolina Department of Computer Science and Engineering

CSCE569 Course Information Meet time: TTH 03:30PM-04:45PM, Swearingen 2A21. 4 homeworks; use the CSE turn-in system to submit your homework (deadline policy applies). 1 midterm exam (conceptual understanding). 1 final project (deliverable to your future employer!): teamwork; implementation project or research project. TA: no TA.

CSCE569 Course Information Textbook and references: Parallel Programming: For Multicore and Cluster Systems, by Thomas Rauber and Gudula Rünger. Publisher: Springer; 1st edition (March 10, 2010). Good reference book: Parallel Programming in C with MPI and OpenMP by Michael J. Quinn. Most important information source: the slides. Grading policy: 4 homeworks, 1 midterm, 1 final project, in-class participation.

About Your Instructor Dr. Jianjun Hu. Office hours: TTH 2:30-3:20PM, or drop by any time. Office: A66 SWNG. Background: mechanical engineering/CAD; machine learning/computational intelligence/genetic algorithms/genetic programming (PhD); bioinformatics and genomics (postdoc). Multi-disciplinary, just like parallel computing applications.

Outline Motivation Modern scientific method Evolution of supercomputing Modern parallel computers Seeking concurrency Data clustering case study Programming parallel computers

Why Are You Here? Solve BIG problems. Use supercomputers. Write parallel programs.

Why Faster Computers? Solve compute-intensive problems faster. Make infeasible problems feasible. Reduce design time. Solve larger problems in the same amount of time. Improve answers' precision. Gain competitive advantage.

Why Parallel Computing? The massively parallel architecture of GPUs, coming from their graphics heritage, is now delivering transformative results for scientists and researchers all over the world. For some of the world's most challenging problems in medical research, drug discovery, weather modeling, and seismic exploration, computation is the ultimate tool. Without it, research would still be confined to trial-and-error physical experiments and observation.

What Problems Need Parallel Computing?

Parallel Computing in the Real World: engineering, science, business, games, cloud computing.

What This Course Can Do for You: understand parallel computer architectures; develop parallel programs for both clusters and shared-memory multi-core systems with MPI/OpenMP; know the basics of CUDA programming; learn to do performance analysis of parallel programs.

Definitions. Parallel computing: using a parallel computer to solve single problems faster. Parallel computer: a multiple-processor/multi-core system supporting parallel programming. Parallel programming: programming in a language that supports concurrency explicitly.

Classical Science: Nature, Observation, Theory, Physical Experimentation.

Modern Scientific Method: Nature, Observation, Theory, Physical Experimentation, Numerical Simulation.

Evolution of Supercomputing. World War II: hand-computed artillery tables, the need to speed up computations, ENIAC. Cold War: nuclear weapon design, intelligence gathering, code-breaking.

Supercomputer: a general-purpose computer that solves individual problems at high speed compared with contemporary systems; typically costs $10 million or more; traditionally found in government labs.

Commercial Supercomputing: started in capital-intensive industries (petroleum exploration, automobile manufacturing); other companies followed suit (pharmaceutical design, consumer products).

CPUs 1 Million Times Faster: faster clock speeds; greater system concurrency (multiple functional units, concurrent instruction execution, speculative instruction execution).

Systems 1 Billion Times Faster: processors are 1 million times faster; combine thousands of processors. A parallel computer has multiple processors and supports parallel programming. Parallel computing = using a parallel computer to execute a program faster.

Beowulf Concept: NASA (Sterling and Becker); commodity processors; commodity interconnect; Linux operating system; Message Passing Interface (MPI) library; high performance per dollar for certain applications.

Computing speed of supercomputers

Projected Computing speed of supercomputers

Top 10 Supercomputers GPU

What you can use (hardware): multicore chips (2011: mostly 2 cores and 4 cores, but doubling; cores = processors); servers (often 2 or 4 multicore chips sharing memory); clusters (from several to tens of servers, and many more, not sharing memory); the supercomputer at USC CEC.

Supercomputers at USC CEC: 76 compute nodes w/ dual 3.4 GHz CPUs; 64 nodes: dual CPU.

Supercomputers at USC CEC: SGI Altix 4700, a shared-memory system. Hardware: 128 x Itanium 1.6 GHz / 8 MB cache, 256 GB RAM, 8 TB storage, NUMAlink interconnect fabric. Software: SUSE 10 w/ SGI ProPack, Intel C/C++ and Fortran compilers, VASP, PBSPro scheduling software, Message Passing Toolkit, Intel Math Kernel Library, GNU Scientific Library, Boost library.

Some historical machines

Earth Simulator was #1

Some interesting hardware: Nvidia GPUs; the Cell processor; SiCortex ("Teraflops from Milliwatts").

GPU-based supercomputing+CUDA

Topic 1: Hardware, the architecture of parallel computing systems.

Topic 2: Programming/Software. Common parallel computing methods: PBS, a job scheduling system; MPI, the Message Passing Interface, a low-level "lowest common denominator" interface that the world has stuck with for nearly 20 years (it can get performance, but can be a hindrance as well); Pthreads for multi-core, shared-memory parallel programming (see the sketch below); CUDA for GPU programming; MapReduce for Google-style high-performance computing.
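
To make the shared-memory style above concrete, here is a minimal Pthreads sketch (not course-provided code; the thread count, array size, and the summing work are arbitrary choices for illustration):

#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 4            /* arbitrary choice for illustration */
#define N 1000

static double data[N];
static double partial[NUM_THREADS];

/* Each thread sums one contiguous chunk of the shared array. */
static void *worker(void *arg) {
    long id = (long)arg;
    long chunk = N / NUM_THREADS;
    double sum = 0.0;
    for (long i = id * chunk; i < (id + 1) * chunk; i++)
        sum += data[i];
    partial[id] = sum;           /* each thread writes only its own slot: no race */
    return NULL;
}

int main(void) {
    pthread_t threads[NUM_THREADS];
    for (long i = 0; i < N; i++) data[i] = 1.0;
    for (long t = 0; t < NUM_THREADS; t++)
        pthread_create(&threads[t], NULL, worker, (void *)t);
    double total = 0.0;
    for (long t = 0; t < NUM_THREADS; t++) {
        pthread_join(threads[t], NULL);   /* wait, then combine partial sums */
        total += partial[t];
    }
    printf("sum = %f\n", total);
    return 0;
}

Because each thread writes only its own slot of the partial-sum array, no locking is needed; compile with a command such as gcc -pthread.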

Why MPI? MPI = "Message Passing Interface": a standard specification for message-passing libraries. Libraries are available on virtually all parallel computers, and free libraries are also available for networks of workstations or commodity clusters.
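
As an illustration of the message-passing style (a sketch, not the textbook's code), a minimal MPI program in which every process reports to process 0 might look like this:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, size;
    MPI_Init(&argc, &argv);                 /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */

    if (rank != 0) {
        /* every non-root process sends its rank to process 0 */
        MPI_Send(&rank, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    } else {
        printf("Hello from process %d of %d\n", rank, size);
        for (int src = 1; src < size; src++) {
            int value;
            MPI_Recv(&value, 1, MPI_INT, src, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("received %d from process %d\n", value, src);
        }
    }
    MPI_Finalize();
    return 0;
}

On most systems this would be compiled with mpicc and launched with something like mpirun -np 4 ./hello, though exact commands vary by cluster.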

Why OpenMP? OpenMP is an application programming interface (API) for shared-memory systems. It supports higher-performance parallel programming of symmetric multiprocessors.
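
A minimal OpenMP sketch of the directive style (the array size and loop bodies are placeholders chosen for illustration):

#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void) {
    static double a[N];
    double sum = 0.0;

    /* the directive parallelizes the loop across the available threads */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = 0.5 * i;

    /* a reduction safely combines per-thread partial sums */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += a[i] * a[i];

    printf("threads available: %d, sum = %f\n", omp_get_max_threads(), sum);
    return 0;
}

Compiled with an OpenMP-aware compiler flag (e.g., gcc -fopenmp), the same source runs serially or in parallel depending on how many threads the runtime provides.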

Topic 3: Performance. Single-processor speeds are, for now, no longer growing, but Moore's law still allows for more real estate per core (transistor counts double nearly every two years). People want performance, but it is hard to get; slowdowns are often seen before speedups. Flops = floating-point operations per second; gigaflops (10^9), teraflops (10^12), petaflops (10^15).

Summary (1/2). High-performance computing: U.S. government; capital-intensive industries; many companies and research labs. Parallel computers: commercial systems; commodity-based systems.

Summary (2/2). The power of CPUs keeps growing exponentially, while parallel programming environments change very slowly. Two standards have emerged: the MPI library, for processes that do not share memory, and OpenMP directives, for processes that do share memory.

Places to Look Best current news: Huge Conference: Top500.org