Course Description: Parallel Computer Architecture

Presentation transcript:

Course Description: Parallel Computer Architecture 12/8/2018 \course\eleg652-04F\Topic0a.ppt

Reading List: Slides: Topic1x. Henn&Patt: Chapter 1. CullerSingh98: Chapter 1. Other assigned readings from homework and classes.

Why Study Parallel Architecture? Role of a computer architect: to design and engineer the various levels of a computer system to maximize performance and programmability within the limits of technology and cost. Parallelism: provides an alternative to a faster clock for performance; applies at all levels of system design; is a fascinating perspective from which to view architecture; is increasingly central in information processing.

Inevitability of Parallel Computing: application demands; technology trends; architecture trends; economics.

Application Trends: Demand for cycles fuels advances in hardware, and vice versa. Range of performance demands. Goal of applications in using parallel machines: speedup. Productivity requirement.
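The slide names speedup as the goal without defining it. A minimal sketch, assuming the usual fixed-problem-size definition (time on one processor divided by time on p processors); the function names and the sample timings are hypothetical and do not come from the slides:

```python
def speedup(time_one_processor: float, time_p_processors: float) -> float:
    """Fixed-problem-size speedup: T(1) / T(p)."""
    if time_one_processor <= 0 or time_p_processors <= 0:
        raise ValueError("run times must be positive")
    return time_one_processor / time_p_processors


def efficiency(time_one_processor: float, time_p_processors: float, p: int) -> float:
    """Parallel efficiency: speedup divided by the processor count."""
    if p <= 0:
        raise ValueError("processor count must be positive")
    return speedup(time_one_processor, time_p_processors) / p


if __name__ == "__main__":
    # Hypothetical measurement: 120 s sequentially, 20 s on 8 processors.
    print(speedup(120.0, 20.0))
    print(efficiency(120.0, 20.0, 8))
```

Efficiency is included only because it makes the gap between ideal and observed speedup visible; the slide itself mentions speedup alone.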

Summary of Application Trends: The transition to parallel computing has occurred for scientific and engineering computing and is in rapid progress in commercial computing. Desktops also use multithreaded programs, which are a lot like parallel programs. Demand for improving throughput on sequential workloads. Demand for productivity.

Technology: A Closer Look: The basic advance is decreasing feature size (λ). Clock rate improves roughly in proportion to the improvement in λ. Number of transistors improves like λ² (or faster). Performance > 100x per decade: clock rate 10x, the rest from transistor count. How to use more transistors? Parallelism in processing; locality in data access. Both need resources, so there is a tradeoff. (Figure: processor, cache ($), interconnect.)
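The arithmetic on this slide can be sketched as follows, assuming the usual first-order scaling rule: shrinking the feature size by a linear factor raises the clock rate by about that factor and the transistor count on a fixed die area by about its square. The function names and the `shrink` parameter are hypothetical, introduced only for illustration:

```python
def scale_with_feature_size(shrink: float) -> tuple[float, float]:
    """Return (clock_gain, transistor_gain) for a linear shrink factor.

    shrink = old feature size / new feature size (assumed > 1).
    First-order assumption: clock ~ shrink, transistors ~ shrink squared.
    """
    if shrink <= 0:
        raise ValueError("shrink factor must be positive")
    return shrink, shrink ** 2


def gain_from_transistors(total_gain: float, clock_gain: float) -> float:
    """Part of a total performance gain not explained by clock rate."""
    return total_gain / clock_gain


if __name__ == "__main__":
    # Halving the feature size under the assumed scaling rule.
    print(scale_with_feature_size(2.0))
    # Slide figures: 100x per decade overall, 10x of it from clock rate.
    print(gain_from_transistors(100.0, 10.0))
```

With the slide's figures, the factor left for the transistor budget (parallelism and locality) is the same size as the clock contribution.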

Clock Frequency Growth Rate: 30% per year.

Transistor Count Growth Rate: 1 billion transistors on chip in early 2000's A.D. Transistor count grows much faster than clock rate: 40% per year, an order of magnitude more contribution in 2 decades.
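To compare the two growth rates quoted on these slides (30% per year for clock frequency, 40% per year for transistor count), a small compound-growth sketch may help. The helper name `compound_growth` is hypothetical, and the result depends entirely on the rounded rates from the slides:

```python
def compound_growth(annual_rate: float, years: float) -> float:
    """Total growth factor after `years` at a fixed annual rate (0.30 = 30%)."""
    return (1.0 + annual_rate) ** years


if __name__ == "__main__":
    for years in (10, 20):
        clock = compound_growth(0.30, years)        # clock frequency
        transistors = compound_growth(0.40, years)  # transistor count
        print(f"{years} years: clock x{clock:.1f}, "
              f"transistors x{transistors:.1f}, "
              f"ratio {transistors / clock:.2f}")
```

Because the rates compound, a small difference in the annual percentage widens steadily over time, which is the point the slide makes qualitatively.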

Similar Story for Storage: Divergence between memory capacity and speed is more pronounced. Larger memories are slower. Need deeper cache hierarchies. Parallelism and locality within memory systems. Disks too: parallel disks plus caching.

Moore’s Law and Headcount: Along with the number of transistors, the effort and headcount required to design a microprocessor have grown exponentially.

Architectural Trends: Architecture: performance and capability. Tradeoff between parallelism and locality. Current microprocessor: 1/3 compute, 1/3 cache, 1/3 off-chip connect. Understanding microprocessor architectural trends. Four generations of architectural history: tube, transistor, IC, VLSI.

Technology Progress Overview: Processor speed improvement: 2x per year (since 85), 100x in last decade. DRAM memory capacity: 2x in 2 years (since 96), 64x in last decade. Disk capacity: 2x per year (since 97), 250x in last decade.
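The slide quotes each trend both as a doubling period and as a per-decade factor. A short sketch for converting between the two forms, so the figures can be compared on one scale; the function names are hypothetical, and the stated periods and decade factors cover different time windows, so they need not agree exactly:

```python
import math


def doubling_time_years(factor: float, span_years: float = 10.0) -> float:
    """Doubling time implied by growing `factor` times over `span_years`."""
    if factor <= 1.0:
        raise ValueError("factor must be greater than 1")
    return span_years * math.log(2.0) / math.log(factor)


def factor_over_span(doubling_time: float, span_years: float = 10.0) -> float:
    """Growth factor over `span_years` for a given doubling time in years."""
    return 2.0 ** (span_years / doubling_time)


if __name__ == "__main__":
    # Per-decade factors taken from the slide.
    for name, factor in (("processor", 100.0), ("DRAM", 64.0), ("disk", 250.0)):
        print(f"{name}: doubling every {doubling_time_years(factor):.2f} years")
    # Stated doubling period for DRAM (2x in 2 years), projected over a decade.
    print(factor_over_span(2.0))
```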

(Figure: Motorola’s PowerPC 604 and Pentium.)



Summary: Parallel Architecture? Increasingly attractive: economics, technology, architecture, application. Parallelism exploited at many levels. Same story from memory system perspective. A wide range of parallel architectures makes sense.


The Earth Simulator: Machine in Japan. Max 40 TFLOPS. No. 1 in TOP500 list. General purpose. Parallel vector processors. 400 M$ (development).


HPC Architecture: Vector processor (CRAY-1) ⇒ 1976~. Parallel processors (CM-1) ⇒ 1985~. MPU cluster, grid (ASCI-RED) ⇒ 1997~. Massively parallel processors (DARPA-HPCS machines, GRAPE-DR, BlueGene/L, BG/C64) ⇒ 2008~2010.

Cluster computers of commodity MPUs ⇒ 1997~. ASCI Project: ASCI-Q, 20 TFLOPS (2003), 8,192 CPUs; ASCI-Purple, 100 TFLOPS (2005), 12,544 CPUs. OLNL project (2004). Limitations of current clusters: low CPU utilization due to high latency in the interconnect; no automatic parallelization; limits on size and power (ASCI-Purple, 12,544 CPUs: 3 MW). (Figure: ASCI-Q, 20 TFLOPS.)

New generation parallel systems ⇒ 2008~. IBM BlueGene/L Project (360 TFLOPS, 2005): high-density parallel processor (65,536 CPU chips in 64 racks, 131,072 processors). IBM BlueGene/C64 Project (1.1 PFLOPS, 2007?). HPCS Project: IBM PERCS, Cray Cascade, SUN Hero project. (Figure: IBM Blue Gene/L.)
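The peak figures quoted for these machines can be put on a per-processor basis with a few lines of arithmetic. This is a rough sketch only: it uses the peak TFLOPS and processor counts exactly as quoted on the slides, assumes the 3 MW figure is the power draw of ASCI-Purple, and the function names are hypothetical:

```python
# (peak FLOPS, processor count) as quoted on the slides.
MACHINES = {
    "ASCI-Q": (20e12, 8_192),
    "ASCI-Purple": (100e12, 12_544),
    "BlueGene/L": (360e12, 131_072),
}


def gflops_per_processor(peak_flops: float, processors: int) -> float:
    """Peak GFLOPS available per processor."""
    return peak_flops / processors / 1e9


def mflops_per_watt(peak_flops: float, watts: float) -> float:
    """Peak MFLOPS delivered per watt of power."""
    return peak_flops / watts / 1e6


if __name__ == "__main__":
    for name, (peak, procs) in MACHINES.items():
        print(f"{name}: {gflops_per_processor(peak, procs):.2f} GFLOPS per processor")
    # Assumption: 3 MW is the power draw of ASCI-Purple at 100 TFLOPS peak.
    print(f"ASCI-Purple: {mflops_per_watt(100e12, 3e6):.1f} MFLOPS per watt")
```

Read this way, BlueGene/L reaches its total through processor count rather than per-processor speed, which matches the "high-density parallel processor" description.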