1
Computer Engineering of Wave Machines for Seismic Modeling and Seismic Migration
R. Phillip Bording
February 17, 2004
Husky Energy Chair in Oil and Gas Research
Memorial University of Newfoundland
2
Session 1: History of Design
Tycho Brahe
Napier
Charles Babbage: mechanical design
John Atanasoff: storage (spinning capacitor)
Konrad Zuse: floating point
Mauchly and Eckert
von Neumann
Harvard: one memory for code, one memory for data
Princeton: one memory for code and data
3
Session 2: Current Design Issues
Scaling laws
Moore’s Law
Transistors: VLSI
Memory: technology
Division of design
The memory challenge
The processor challenge
The ILLIAC
PEPE
IBM: 360/44, 360/95
Array processors: the software of array processor calls
Programming models: vectors, shared memory, distributed memory
4
M U N - February 17, 2005 - Phil Bording
Lambda Rules
5
Division of Design
[Figure: ALU and memory built by separate companies, Company A and Company B, joined by a weak link, contrasted with ALU and memory built by one company.]
6
Moore’s Laws
Every 18 months the density of transistors on a VLSI chip doubles.
The investment ($) doubles with every new VLSI plant.
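As a quick check of the 18-month doubling rule, the sketch below computes the density growth factor over a given number of years. The function name `density_growth` and its parameters are illustrative choices, not part of the talk.

```python
def density_growth(years, doubling_months=18.0):
    """Growth factor in transistor density after `years`,
    assuming one doubling every `doubling_months` months."""
    doublings = years * 12.0 / doubling_months
    return 2.0 ** doublings


if __name__ == "__main__":
    for years in (1.5, 3, 10):
        print(f"{years:>4} years: x{density_growth(years):.1f}")
```

Under this rule, ten years corresponds to a factor of roughly 100.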
7
Illiac
8 X 8 processors
Nearest-neighbor connections
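Nearest-neighbor connectivity on an 8 X 8 array can be sketched as index arithmetic. The code below is a minimal illustration; the row-major numbering, the function name `neighbors`, and the optional wraparound at the edges are assumptions, since the slide only states the grid size and the connection type.

```python
def neighbors(pe, rows=8, cols=8, wrap=True):
    """Return the (north, south, west, east) neighbors of processor `pe`
    in a row-major rows x cols grid. With wrap=False, a neighbor that
    would fall off the edge is reported as None."""
    r, c = divmod(pe, cols)
    result = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if wrap:
            # Assumed wraparound: edges connect to the opposite side.
            nr %= rows
            nc %= cols
        elif not (0 <= nr < rows and 0 <= nc < cols):
            result.append(None)
            continue
        result.append(nr * cols + nc)
    return tuple(result)
```

For an interior processor such as number 27 (row 3, column 3), the four neighbors are the same with or without wraparound; only edge processors differ.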
8
Parallel Ensemble Processing Elements - PEPE
Radar processing computer
Associative computing
[Figure: processing elements P0, Pn-3, Pn-2, Pn-1, Pn, with data inputs and data outputs.]
9
IBM Machines
Early 1960s
7094: 36-bit arithmetic
1600 and 1400 processors: completely different
Middle 1960s
New machine: IBM 360
36-bit words, but memory parity was added: 8-bit byte + 1 parity bit
Uniform business machine architectures
32- and 64-bit floating point
No industry standard for the floating-point format
10
Array Processors
IBM and CDC designed DMA (Direct Memory Access) processors
Frees the main processor to compute
Allows separate simple processors to do the I/O
The idea translated into attached processors for arithmetic processing
11
Array Processors
Arrays of data are moved to a local very-high-speed memory (fast registers)
Arithmetic is performed by special instructions passed to the array processor
[Figure: CPU connected to the array processor.]
12
Software Design Issues
Vector programming
Cache programming
Message passing programming
NUMA programming
Grid programming
All of these memory operations have a fixed cost
Code performance improvements are dominated by fixed costs
13
Hardware Design Issues
10 years equals a 100-fold speedup
Memory latency: the cost of getting the first word is a constant
Wires have failed to scale
Bigger cache memories are slower
Code performance improvements are dominated by fixed costs
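One way to read "dominated by fixed costs" is as a bound on overall speedup: if part of the run time is a fixed cost that does not scale, speeding up the rest gives diminishing returns. The sketch below is a simple model under that assumption; `fixed` and `scalable` are hypothetical time units, not measurements from the talk.

```python
def overall_speedup(fixed, scalable, factor):
    """Speedup of total run time when only the scalable part
    gets `factor` times faster and the fixed cost stays constant."""
    before = fixed + scalable
    after = fixed + scalable / factor
    return before / after


if __name__ == "__main__":
    fixed, scalable = 10.0, 90.0  # hypothetical time units
    for factor in (2, 10, 100, 1000):
        print(factor, round(overall_speedup(fixed, scalable, factor), 2))
    # The speedup can never exceed (fixed + scalable) / fixed.
    print("limit:", (fixed + scalable) / fixed)
```

In this model a 100-fold faster processor yields less than a 10-fold faster program, because the fixed 10% of the run time is untouched.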
14
Linear Address Space
[Figure: a linear address space running up to the max address, with an address pointer into it.]
Latency is the time to access the first word
Bandwidth is the rate of accessing successive words
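The two definitions combine into a simple cost model for reading a block of successive words: pay the latency once to reach the first word, then stream the rest at the bandwidth. The sketch below assumes that linear model; the function names and any parameter values are hypothetical.

```python
def transfer_time(n_words, latency, bandwidth):
    """Time to read n_words successive words: the latency to reach the
    first word, then the remaining words at `bandwidth` words per unit time."""
    if n_words <= 0:
        return 0.0
    return latency + (n_words - 1) / bandwidth


def effective_bandwidth(n_words, latency, bandwidth):
    """Words delivered per unit time once the latency is included."""
    return n_words / transfer_time(n_words, latency, bandwidth)
```

With a latency of 100 time units and a bandwidth of one word per unit, a single-word read achieves 1% of the peak rate, and a read needs on the order of a hundred words before the effective rate reaches half of peak.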
15
von Neumann (Princeton) Architecture
[Figure: a memory holding data and instructions, an address pointer, an arithmetic logic unit (ALU), and a program counter advanced by Pc = Pc + 1.]
Featuring deterministic execution
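A toy interpreter makes the picture concrete: one memory holds both instructions and data, and the program counter advances with Pc = Pc + 1 after each fetch. The four-instruction set (`LOAD`, `ADD`, `STORE`, `HALT`) and the tuple encoding are invented for this sketch.

```python
def run(memory):
    """Execute a program stored in `memory`, a single list that holds
    both instructions (tuples) and data (integers)."""
    pc = 0    # program counter
    acc = 0   # accumulator inside the ALU
    while True:
        op, addr = memory[pc]   # fetch from the shared memory
        pc = pc + 1             # Pc = Pc + 1
        if op == "LOAD":
            acc = memory[addr]
        elif op == "ADD":
            acc = acc + memory[addr]
        elif op == "STORE":
            memory[addr] = acc
        elif op == "HALT":
            return memory
        else:
            raise ValueError(f"unknown instruction {op!r}")


program = [
    ("LOAD", 4),    # address 0: acc = memory[4]
    ("ADD", 5),     # address 1: acc = acc + memory[5]
    ("STORE", 6),   # address 2: memory[6] = acc
    ("HALT", 0),    # address 3: stop
    2, 3, 0,        # addresses 4..6: data
]
```

Calling `run(list(program))` leaves the sum of the two data words in address 6. Every run takes the same steps in the same order, which is the deterministic behavior the slide refers to.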
16
Cache Memory Architecture
Main memory is large and slow.
Cache is much smaller and much faster.
Control logic keeps the main memory coherent.
[Figure: address pointer, cache memory, control logic, and main memory.]
Featuring non-deterministic execution
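Why a cache makes timing depend on access history can be seen with a small simulation. The sketch below models a direct-mapped cache that only counts hits and misses; the class name, the line count, and the one-word lines are simplifying assumptions, not the design shown in the talk.

```python
class DirectMappedCache:
    """Toy direct-mapped cache with one word per line."""

    def __init__(self, n_lines=4):
        self.n_lines = n_lines
        self.tags = [None] * n_lines   # which address each line holds
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Return True on a hit, False on a miss (and fill the line)."""
        line = address % self.n_lines
        if self.tags[line] == address:
            self.hits += 1
            return True
        self.tags[line] = address
        self.misses += 1
        return False


if __name__ == "__main__":
    cache = DirectMappedCache()
    # The same address can hit or miss depending on what ran before it.
    for address in (0, 1, 0, 4, 0):
        print(address, "hit" if cache.access(address) else "miss")
```

In the sequence above, address 0 hits on its second access but misses on its third, because address 4 maps to the same line and evicts it in between.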
17
Cache Memory - Three Levels Architecture
[Figure: three cache levels and their control logic between the address pointer and main memory, 2 gigahertz clock. Labels: L1 cache memory, 32 kilobytes; L2 cache memory, 128 kilobytes; L3 cache memory, 16 megabytes; main memory, multi-gigabytes, large and slow. Relative access costs shown: 2X, 8X, 16X, 160X.]
Featuring really non-deterministic execution
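The relative costs on the slide can be turned into an expected access cost once hit rates are chosen. The sketch below assumes the costs 2X, 8X, 16X, and 160X belong to L1, L2, L3, and main memory in that order, and the hit rates are purely hypothetical; neither assumption is confirmed by the talk.

```python
def expected_cost(levels, memory_cost):
    """Expected cost of one access. `levels` is a list of
    (cost, hit_rate) pairs ordered from L1 outward; an access that
    misses every level pays `memory_cost`."""
    total = 0.0
    p_reach = 1.0   # probability that the access gets this far
    for cost, hit_rate in levels:
        total += p_reach * hit_rate * cost
        p_reach *= 1.0 - hit_rate
    return total + p_reach * memory_cost


# Assumed mapping of the slide's costs to levels; hit rates are made up.
levels = [(2, 0.90), (8, 0.50), (16, 0.50)]
```

With these made-up hit rates, `expected_cost(levels, 160)` is about 6.6, and the 2.5% of accesses that reach main memory contribute more than half of it.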
18
Programming Models for Parallel Computing
19
Distributed Computing Message Passing Interface
[Figure: four separate program address spaces, each running up to its own max address, with multiple address pointers.]
20
Distributed Computing with Message Passing
Program address spaces
Messages left and right
Multiple address pointers
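The left-and-right message pattern can be sketched without an MPI installation by simulating each process as a separate list. This is plain Python, not the MPI API; the function name `exchange_boundaries` and the use of None at the outer ends are assumptions for illustration.

```python
def exchange_boundaries(local_arrays):
    """Simulate one round of left/right messages. Each 'process' owns one
    list; it sends its first value to the left neighbor and its last
    value to the right neighbor. Returns, per process, the pair
    (value received from the left, value received from the right)."""
    n = len(local_arrays)
    received = []
    for rank in range(n):
        from_left = local_arrays[rank - 1][-1] if rank > 0 else None
        from_right = local_arrays[rank + 1][0] if rank < n - 1 else None
        received.append((from_left, from_right))
    return received


# Four separate address spaces, one per simulated process.
spaces = [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]
```

Each process sees only its own list plus the two boundary values it receives, which is the essential difference from a shared address space.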
22
Multi-Threading OpenMP Programming Model
[Figure: a global program address space split into four local ranges, with boundary labels n-1, n, 2n, 3n; an address and cache bus with conflict resolution; multiple address pointers.]
23
Uniqueness of Store Multi-Threading
[Figure: one program address space with multiple address pointers.]
Duplicate pointers to the same location: conflict on storing a result
So who is managing the multiple pointers?
It is the programmer’s responsibility.
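One common way for the programmer to guarantee unique stores is to give each thread a disjoint index range, so no two pointers ever target the same location. The sketch below uses Python's `threading` module as a stand-in for an OpenMP-style loop; the helper names and the block partitioning are assumptions.

```python
import threading


def block_range(thread_id, n_threads, n_items):
    """Half-open index range [start, stop) owned by one thread."""
    chunk = (n_items + n_threads - 1) // n_threads
    start = min(thread_id * chunk, n_items)
    return start, min(start + chunk, n_items)


def parallel_square(values, n_threads=4):
    """Square every value; each thread stores only into its own range."""
    result = [None] * len(values)

    def work(thread_id):
        start, stop = block_range(thread_id, n_threads, len(values))
        for i in range(start, stop):
            result[i] = values[i] * values[i]   # unique store location

    threads = [threading.Thread(target=work, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result
```

Because the ranges never overlap, the result does not depend on how the threads are scheduled and no lock is needed.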
24
Multiple Bank Memory Systems
[Figure: four memory banks; the bank for starting address N is N mod 4.]
Vector programming model
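With four banks and the bank chosen as the address mod 4, the stride of a vector access decides how evenly the banks are used. The sketch below counts accesses per bank; the function name `bank_usage` and the vector lengths are illustrative.

```python
from collections import Counter


def bank_usage(start, stride, length, n_banks=4):
    """Count how many of `length` strided accesses land in each bank,
    where the bank of an address is address mod n_banks."""
    counts = Counter((start + i * stride) % n_banks for i in range(length))
    return [counts.get(b, 0) for b in range(n_banks)]
```

A stride of 1 spreads eight accesses evenly over the four banks, while a stride of 4 sends all eight to the same bank.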