CPS 258, Fall 2004 Introduction to Computational Science.



Introduction to Computational Science
  An introduction to a wide variety of methods in computational science to facilitate interdisciplinary collaborative research
  Cover leading techniques and potential applications to research
  Introduce PRACTICAL computational methods
  Bring together graduate students from different disciplines in science, engineering and the basic sciences
  Underline the "common-ground" of computational methods

Administrivia
  Lead Instructor: Nikos Pitsianis
  Prerequisites: Programming experience, calculus, numerical linear algebra or equivalent
  Schedule: Tu-Th, 2:50 PM-4:05 PM, LSRC A156
  Grading:
    – 30% Class Participation
    – 30% Homework Assignments
    – 40% Final Project
  Credit: 3 hours
  Office Hours: to be announced
  Course Admin: Mindy Quigley

Instructors
  Nikos Pitsianis and Rachael Brady
  Tod Laursen
  Bill Rankin
  …

Syllabus
  High performance computer architectures
  Linear Algebra
  Visualization
  Spatial & Time Integration
  Finite Elements, Applied to PDEs and ODEs
  Fast Transforms
  Introduction to MPI programming
  Schroedinger’s Equation
  Molecular Dynamics
  Stochastic Optimization and Integration

Abstraction and Portability vs Performance
  High level programming
  Easy maintenance
  Flexible and reusable code
  Portable to other architectures

Questions
  Can my program be faster, more accurate, or more stable?
    – Computational complexity
    – Algorithm choice
    – Implementation
  How can I make it better?
  Is it worth the effort?

Parallel Architectures

Parallelism Levels
  Job
  Program
  Instruction
  Bit

Parallel Architectures
  Pipelining
  Multiple execution units
    – Superscalar
    – VLIW
  Multiple processors

Pipelining Example

  for i = 1:n
    z(i) = x(i) + y(i);
  end

  Stage       Load unit      ALU                         Store
  Prologue    load x(i)
              load y(i)
  Loop body   load x(i+1)    add z(i),x(i),y(i)
              load y(i+1)                                store z(i)
  Epilogue                   add z(i+1),x(i+1),y(i+1)
                                                         store z(i+1)
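The prologue / loop body / epilogue structure of the pipelined loop can be sketched in Python. This is only an illustration of the scheduling idea, not how a compiler or the hardware does it; the function name `add_pipelined` and the temporaries `xi`, `yi`, `x_next`, `y_next` are hypothetical stand-ins for registers.

```python
def add_pipelined(x, y):
    """Compute z[i] = x[i] + y[i] with the loads for iteration i+1
    issued before the add and store of iteration i (software pipelining)."""
    n = len(x)
    z = [0] * n
    if n == 0:
        return z

    # Prologue: load the operands of the first iteration.
    xi, yi = x[0], y[0]

    # Loop body: load operands for iteration i+1, then add and store
    # the result of iteration i.
    for i in range(n - 1):
        x_next, y_next = x[i + 1], y[i + 1]  # load x(i+1), load y(i+1)
        z[i] = xi + yi                       # add z(i),x(i),y(i); store z(i)
        xi, yi = x_next, y_next

    # Epilogue: finish the last iteration; no further loads are needed.
    z[n - 1] = xi + yi
    return z
```

In real hardware the load, add, and store in the loop body would proceed in the same cycle on separate units; sequential Python can only show the reordering of the operations.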

Generic Computer
  CPU
  Memory
  Bus

Memory Organization
  Distributed memory
  Shared memory

Shared Memory

Distributed Memory

Interleaved Memory
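As a rough illustration, one common scheme (low-order interleaving) spreads consecutive word addresses across memory banks so that sequential accesses hit different banks. The slide does not say which scheme it has in mind, so the function name `bank_and_offset` and the choice of 4 banks below are assumptions made for the sketch.

```python
def bank_and_offset(address, num_banks):
    """Map a word address to (bank, offset within that bank)
    under low-order interleaving."""
    return address % num_banks, address // num_banks


# Eight consecutive addresses spread over 4 hypothetical banks.
layout = [bank_and_offset(a, 4) for a in range(8)]
```

With this mapping, addresses 0 to 3 land in banks 0 to 3 at offset 0, and addresses 4 to 7 wrap around to the same banks at offset 1.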

Network Topologies
  Ring
  Torus
  Tree
  Star
  Hypercube
  Cross-bar
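Two of these topologies have simple neighbor rules that are easy to sketch. The snippet below assumes nodes are numbered from 0; the function names are hypothetical and chosen only for this illustration.

```python
def ring_neighbors(node, num_nodes):
    """Left and right neighbors of a node in a ring of num_nodes nodes."""
    return (node - 1) % num_nodes, (node + 1) % num_nodes


def hypercube_neighbors(node, dimension):
    """Neighbors of a node in a hypercube with 2**dimension nodes.
    Two nodes are linked when their binary labels differ in exactly one bit."""
    return [node ^ (1 << bit) for bit in range(dimension)]
```

For example, `ring_neighbors(0, 8)` gives `(7, 1)`, and `hypercube_neighbors(5, 3)` gives `[4, 7, 1]`, since 5 is `101` in binary and flipping each bit in turn yields `100`, `111`, and `001`.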

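Flynn’s Taxonomy
  SISD: single instruction, single data
  SIMD: single instruction, multiple data
  MISD: multiple instruction, single data
  MIMD: multiple instruction, multiple data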

Instruction Processing Stages
  Fetch
  Decode
  Execute
  Post
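How these four stages overlap in a pipeline can be sketched with a small schedule generator. This assumes an idealized pipeline with one instruction entering per cycle and no stalls; the function name `pipeline_schedule` is hypothetical.

```python
STAGES = ["Fetch", "Decode", "Execute", "Post"]


def pipeline_schedule(num_instructions, stages=STAGES):
    """Return one dict per clock cycle mapping stage name to the index
    of the instruction occupying that stage (ideal pipeline, no stalls)."""
    if num_instructions == 0:
        return []
    total_cycles = num_instructions + len(stages) - 1
    schedule = []
    for cycle in range(total_cycles):
        active = {}
        for depth, name in enumerate(stages):
            instr = cycle - depth  # instruction that entered `depth` cycles ago
            if 0 <= instr < num_instructions:
                active[name] = instr
        schedule.append(active)
    return schedule
```

Under these assumptions, n instructions through k stages take n + k - 1 cycles, compared with n * k cycles if each instruction had to finish before the next one started.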

Vector Architectures
  Single Instruction Multiple Data
  Exploit uniformity of operations
  Multiple execution units
  Pipelining
  Hardware assisted loops
  Vectorizing compilers
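A vectorizing compiler typically splits a long loop into chunks that fit the vector registers (strip-mining), so that one instruction operates on a whole chunk. The sketch below mimics that with list slices; the vector length of 4 and the name `vector_add` are assumptions for illustration, not properties of any particular machine.

```python
VECTOR_LENGTH = 4  # hypothetical number of elements per vector register


def vector_add(x, y, vl=VECTOR_LENGTH):
    """Compute z = x + y in strips of at most vl elements.
    Each strip stands in for one vector load/add/store sequence."""
    n = len(x)
    z = [0] * n
    for start in range(0, n, vl):
        stop = min(start + vl, n)  # the last strip may be shorter
        z[start:stop] = [a + b for a, b in zip(x[start:stop], y[start:stop])]
    return z
```

For 10 elements and a vector length of 4, the loop runs three times (strips of 4, 4, and 2 elements) instead of ten scalar iterations.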