Published by Tyrone Dennis. Modified over 6 years ago.
1
Course Description: Parallel Computer Architecture
12/8/2018 \course\eleg652-04F\Topic0a.ppt
2
Reading List
Slides: Topic1x
Henn&Patt: Chapter 1
CullerSingh98: Chapter 1
Other assigned readings from homework and classes
3
Why Study Parallel Architecture?
Role of a computer architect: To design and engineer the various levels of a computer system to maximize performance and programmability within limits of technology and cost.
Parallelism:
  Provides alternative to faster clock for performance
  Applies at all levels of system design
  Is a fascinating perspective from which to view architecture
  Is increasingly central in information processing
4
Inevitability of Parallel Computing
Application demands
Technology Trends
Architecture Trends
Economics
5
Application Trends
Demand for cycles fuels advances in hardware, and vice-versa
Range of performance demands
Goal of applications in using parallel machines: Speedup
Productivity requirement
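Speedup, the goal named on this slide, is commonly defined for a fixed problem as the run time on one processor divided by the run time on p processors. The following is a minimal Python sketch of that definition; the function names and the sample timings are illustrative assumptions and do not come from the slides.

```python
def speedup(time_one_proc: float, time_p_procs: float) -> float:
    """Fixed-problem speedup: T(1) / T(p)."""
    if time_one_proc <= 0 or time_p_procs <= 0:
        raise ValueError("times must be positive")
    return time_one_proc / time_p_procs


def efficiency(time_one_proc: float, time_p_procs: float, p: int) -> float:
    """Speedup per processor; 1.0 would mean perfect linear scaling."""
    if p < 1:
        raise ValueError("p must be at least 1")
    return speedup(time_one_proc, time_p_procs) / p


if __name__ == "__main__":
    # Hypothetical timings: 120 s on 1 processor, 20 s on 8 processors.
    s = speedup(120.0, 20.0)
    e = efficiency(120.0, 20.0, 8)
    print(f"speedup = {s:.2f}, efficiency = {e:.2f}")
```

Efficiency is included only as a convenience: it shows how far a measured speedup falls short of the processor count.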
6
Summary of Application Trends
Transition to parallel computing has occurred for scientific and engineering computing
In rapid progress in commercial computing
Desktop also uses multithreaded programs, which are a lot like parallel programs
Demand for improving throughput on sequential workloads
Demand on productivity
7
Technology: A Closer Look
Basic advance is decreasing feature size (λ)
  Clock rate improves roughly proportional to improvement in λ
  Number of transistors improves like λ² (or faster)
Performance > 100x per decade; clock rate 10x, rest transistor count
How to use more transistors?
  Parallelism in processing
  Locality in data access
  Both need resources, so tradeoff
[Figure: Proc, $, Interconnect]
8
Clock Frequency Growth Rate
30% per year
9
Transistor Count Growth Rate
1 billion transistors on chip in early 2000’s
Transistor count grows much faster than clock rate: 40% per year, order of magnitude more contribution in 2 decades
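The annual rates quoted for clock frequency (30%) and transistor count (40%) can be turned into multi-year factors by compounding. The sketch below assumes the rates stay constant over the whole period, which is a simplification; the helper name growth_factor is hypothetical.

```python
def growth_factor(annual_rate: float, years: int) -> float:
    """Total multiplicative growth after compounding annual_rate for `years` years."""
    return (1.0 + annual_rate) ** years


if __name__ == "__main__":
    # Rates as quoted on the slides; the constant-rate assumption is ours.
    for label, rate in (("clock rate", 0.30), ("transistor count", 0.40)):
        for years in (10, 20):
            factor = growth_factor(rate, years)
            print(f"{label}: {factor:.1f}x after {years} years")
```

At 30% per year the clock compounds to roughly 14x per decade, the same order as the 10x per decade quoted for clock rate earlier in the deck. Because both quantities compound, the gap between them widens every year, which is the point of the comparison.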
10
Similar Story for Storage
Divergence between memory capacity and speed more pronounced
Larger memories are slower
Need deeper cache hierarchies
Parallelism and locality within memory systems
Disks too: Parallel disks plus caching
11
Moore’s Law and Headcount
Along with the number of transistors, the effort and headcount required to design a microprocessor has grown exponentially
12
Architectural Trends
Architecture: performance and capability
Tradeoff between parallelism and locality
Current microprocessor: 1/3 compute, 1/3 cache, 1/3 off-chip connect
Understanding microprocessor architectural trends
Four generations of architectural history: tube, transistor, IC, VLSI
13
Technology Progress Overview
Processor speed improvement: 2x per year (since 85). 100x in last decade.
DRAM Memory Capacity: 2x in 2 years (since 96). 64x in last decade.
DISK capacity: 2x per year (since 97) x in last decade.
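A doubling period and a per-decade factor are two ways of stating the same growth rate, so one can be converted into the other. The sketch below shows the conversion in both directions; the function names are hypothetical, and the 10-year default span is an assumption matching the "last decade" wording.

```python
import math


def decade_factor(doubling_years: float, span_years: float = 10.0) -> float:
    """Growth over span_years if the quantity doubles every doubling_years."""
    return 2.0 ** (span_years / doubling_years)


def doubling_time(factor: float, span_years: float = 10.0) -> float:
    """Doubling period in years implied by growing `factor`-fold in span_years."""
    return span_years / math.log2(factor)


if __name__ == "__main__":
    print(f"2x every 2 years -> {decade_factor(2.0):.0f}x per decade")
    print(f"2x every year    -> {decade_factor(1.0):.0f}x per decade")
    print(f"100x per decade  -> doubles every {doubling_time(100.0):.2f} years")
    print(f"64x per decade   -> doubles every {doubling_time(64.0):.2f} years")
```

Read this way, 100x per decade corresponds to doubling about every 1.5 years and 64x per decade to doubling about every 1.7 years, so the per-period and per-decade figures on the slide are best treated as rounded, separately stated estimates rather than exact equivalents.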
14
Motorola’s PowerPC 604; Pentium
17
Summary: Parallel Architecture?
Increasingly attractive
  Economics, technology, architecture, application
Parallelism exploited at many levels
Same story from memory system perspective
Wide range of parallel architectures make sense
23
The Earth Simulator Machine in Japan
Max 40 TFLOPS
No.1 in TOP500 list
General purpose
Parallel vector processors
400 M$ (development)
25
HPC Architecture
Vector Processor ⇒ 1976~ (CRAY-1)
Parallel Processors ⇒ 1985~ (CM-1)
MPU Cluster, Grid ⇒ ~ (ASCI-RED)
Massively PP ⇒ ~2010 (DARPA-HPCS machines, GRAPE-DR, BlueGene/L, BG/C64)
26
Cluster computer of commodity MPU ⇒ 1997~
ASCI Project
  ASCI-Q 20TFLOPS (2003), 8,192 CPUs
  ASCI-Purple 100TFLOPS (2005), 12,544 CPUs
OLNL project (2004)
Limitation of current cluster
  Low utilization of CPU due to high latency in interconnection
  No automatic parallelization
  Limitation by size and power: ASCI-Purple (12,544 CPUs) 3MW
[Figure: ASCI-Q 20TFLOPS]
27
New generation parallel systems ⇒ 2008~
IBM BlueGene/L Project (360TFLOPS, 2005)
  High density parallel processor (65,536 CPU chips in 64 racks, 131,072 processors)
IBM BlueGene/C64 Project (1.1 PFlops, 2007 ?)
HPCS Project
  IBM PERCS
  Cray Cascade
  SUN Hero project
[Figure: IBM Blue Gene/L]
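Dividing each system's quoted TFLOPS by its processor count gives a rough per-processor figure that makes the design points easier to compare. This is a back-of-the-envelope sketch using only the numbers on the slides; it assumes the TFLOPS figures and processor counts refer to the same configuration, which the slides do not state.

```python
# (system, quoted TFLOPS, processor count) as listed on the slides.
SYSTEMS = [
    ("ASCI-Q", 20.0, 8_192),
    ("ASCI-Purple", 100.0, 12_544),
    ("IBM BlueGene/L", 360.0, 131_072),
]


def gflops_per_processor(tflops: float, processors: int) -> float:
    """Quoted system throughput divided evenly across processors, in GFLOPS."""
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return tflops * 1000.0 / processors


if __name__ == "__main__":
    for name, tflops, procs in SYSTEMS:
        per_proc = gflops_per_processor(tflops, procs)
        print(f"{name}: {per_proc:.2f} GFLOPS per processor")
```

On these figures BlueGene/L comes out at about 2.7 GFLOPS per processor, close to ASCI-Q at about 2.4, so its much higher total comes from processor count rather than from faster individual processors.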