Published by Jack Robbins. Modified over 9 years ago.
1
Parallelism: A Serious Goal or a Silly Mantra (some half-thought-out ideas)
2
Random thoughts on Parallelism
Why the sudden preoccupation with parallelism?
The Silliness (or what I call Meganonsense)
–Break the problem, use half the energy
–1000 mickey mouse cores
–Hardware is sequential
–Server throughput (how many pins?)
–What about GPUs and Data Base?
Current bugs to exploiting parallelism (or are they?)
–Dark silicon
–Amdahl’s Law
–The Cloud
The answer
–The fundamental concept vis-à-vis parallelism
–What it means re: the transformation hierarchy
4
It starts with the raw material (Moore’s Law)
The first microprocessor (Intel 4004), 1971
–2300 transistors
–106 kHz
The Pentium chip, 1992
–3.1 million transistors
–66 MHz
Today
–More than one billion transistors
–Frequencies in excess of 5 GHz
Tomorrow?
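As a rough check on the growth rate behind these figures, the sketch below computes the doubling period implied by the two data points on the slide (4004 and Pentium). The function name is made up for illustration; only the transistor counts and years come from the slide.

```python
import math

def doubling_period_years(count_start, year_start, count_end, year_end):
    """Years per doubling implied by two (year, transistor count) points."""
    doublings = math.log2(count_end / count_start)
    return (year_end - year_start) / doublings

# Figures as given on the slide: Intel 4004 (1971) and Pentium (1992).
period = doubling_period_years(2300, 1971, 3.1e6, 1992)
print(f"transistor count doubled roughly every {period:.2f} years")
```

With the slide's numbers the period comes out at about two years, which is the usual statement of Moore's Law.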
5
And what we have done with this raw material
6
Too many people do not realize: Parallelism did not start with Multi-core
Pipelining
Out-of-order Execution
Multiple operations in a single microinstruction
VLIW (horizontal microcode exposed to the software)
8
One thousand mickey mouse cores
Why not a million? Why not ten million?
Let’s start with 16
–What if we could replace 4 with one more powerful core?
…and we learned:
–One more powerful core is not enough
–Sometimes we need several
–Morphcore was born
–BUT not all morphcore (fixed function vs flexibility)
9
The Asymmetric Chip Multiprocessor (ACMP)
[Figure: three chip layouts]
–“Tile-Large” Approach: large cores only
–“Niagara” Approach: Niagara-like cores only
–ACMP Approach: a large core together with Niagara-like cores
10
Large core vs. Small Core
Large Core
–Out-of-order
–Wide fetch, e.g. 4-wide
–Deeper pipeline
–Aggressive branch predictor (e.g. hybrid)
–Many functional units
–Trace cache
–Memory dependence speculation
Small Core
–In-order
–Narrow fetch, e.g. 2-wide
–Shallow pipeline
–Simple branch predictor (e.g. Gshare)
–Few functional units
11
Throughput vs. Serial Performance
12
Server throughput
The Good News: Not a software problem
–Each core runs its own problem
The Bad News: How many pins?
–Memory bandwidth
More Bad News: How much energy?
–Each core runs its own problem
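The pin question can be made concrete with a back-of-the-envelope bound: adding cores stops helping once off-chip memory bandwidth is saturated. The sketch below is a simple min-of-two-limits model; the function name and every number in it are hypothetical, chosen only to show the shape of the argument.

```python
def server_throughput(cores, per_core_rps, bytes_per_request, mem_bandwidth_bytes_per_s):
    """Requests per second, limited by either compute or off-chip bandwidth."""
    compute_bound = cores * per_core_rps
    bandwidth_bound = mem_bandwidth_bytes_per_s / bytes_per_request
    return min(compute_bound, bandwidth_bound)

# Hypothetical workload: 1,000 requests/s per core, 1 MB of off-chip
# traffic per request, 50 GB/s of memory bandwidth through the pins.
for cores in (10, 50, 1000):
    rps = server_throughput(cores, 1_000, 1_000_000, 50_000_000_000)
    print(cores, "cores ->", int(rps), "requests/s")
```

Under these made-up numbers the chip saturates at 50 cores; the other 950 of a thousand-core design would contribute nothing to throughput.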
13
What about GPUs and Data Base
In theory, absolutely!
GPUs (SMT + SIMD + Predication)
–Provided there are no conditional branches (Divergence)
–Provided memory accesses line up nicely (Coalescing)
Data Bases
–Provided there are no critical sections
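The two GPU provisos can be illustrated with a toy model of one lockstep SIMD group. This is a sketch, not a description of any particular GPU: the group size of 32 lanes and the 128-byte memory segment are assumptions, and both function names are invented for illustration.

```python
def simd_branch_passes(conditions):
    """Passes one lockstep SIMD group needs for an if/else under predication.

    If every lane agrees, only one side of the branch is executed.
    If the lanes disagree (divergence), both sides are executed in turn,
    each with the non-participating lanes masked off.
    """
    runs_if_side = any(conditions)
    runs_else_side = not all(conditions)
    return int(runs_if_side) + int(runs_else_side)

def memory_transactions(addresses, segment_bytes=128):
    """Distinct aligned memory segments touched by one group's loads.

    segment_bytes is an assumed transaction size, not a figure from the talk.
    """
    return len({addr // segment_bytes for addr in addresses})

LANES = 32  # assumed group width

uniform = [True] * LANES                        # every lane takes the same path
divergent = [lane % 2 == 0 for lane in range(LANES)]
print(simd_branch_passes(uniform), simd_branch_passes(divergent))

coalesced = [4 * lane for lane in range(LANES)]     # adjacent 4-byte words
strided = [128 * lane for lane in range(LANES)]     # one segment per lane
print(memory_transactions(coalesced), memory_transactions(strided))
```

In this model a divergent branch doubles the work for the group, and the strided access pattern needs 32 memory transactions where the coalesced one needs a single transaction.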
15
Dark Silicon
Too many transistors: we cannot power them all
–All those cores powered down
–All that parallelism wasted
Not really: The Refrigerator! (aka: Accelerators)
–Fork (in parallel)
–Although not all at the same time!
16
Amdahl’s Law
The serial bottleneck always limits performance
Heterogeneous cores AND control over them can minimize the effect
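A small numerical sketch of both claims follows. The first function is the standard form of Amdahl's Law. The second models a heterogeneous chip and rests on assumptions that are not in the slides: a large core built from `large_core_cost` small-core equivalents is taken to run serial code `sqrt(large_core_cost)` times faster than a small core, the serial part runs on the large core, and the parallel part uses every core. The function and parameter names are hypothetical.

```python
import math

def amdahl_speedup(parallel_fraction, n_cores):
    """Classic Amdahl's Law: speedup over one core with n identical cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)

def asymmetric_speedup(parallel_fraction, budget, large_core_cost):
    """Speedup for one large core plus (budget - large_core_cost) small cores.

    ASSUMPTION: large-core performance grows as the square root of the
    area spent on it. This is a modelling choice, not a measured value.
    """
    large_perf = math.sqrt(large_core_cost)
    small_cores = budget - large_core_cost
    serial_time = (1.0 - parallel_fraction) / large_perf
    parallel_time = parallel_fraction / (large_perf + small_cores)
    return 1.0 / (serial_time + parallel_time)

# 90% parallel code, area budget of 16 small cores.
symmetric = amdahl_speedup(0.9, 16)            # 16 small cores: about 6.4x
asymmetric = asymmetric_speedup(0.9, 16, 4)    # 1 large + 12 small: about 8.75x
limit = amdahl_speedup(0.9, 10**9)             # approaches 10x, never exceeds it
print(f"{symmetric:.2f} {asymmetric:.2f} {limit:.2f}")
```

Under these assumptions, trading 4 of the 16 small cores for one larger core raises the speedup because the serial 10% runs faster, while no number of small cores alone can push past 10x.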
17
The Cloud
It is behind the curtain: how do we manage it?
Answer: the on-chip run-time system
Answer: Pragmas beyond the Cloud
19
The fundamental concept: Synchronization
20
Problem
Algorithm
Program
ISA (Instruction Set Arch)
Microarchitecture
Circuits
Electrons
21
At every layer we synchronize
Algorithm: task dependencies
ISA: sequential control flow (implicit)
Microarchitecture: ready bits
Circuit: clock cycle (implicit)
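To make the microarchitecture entry concrete, here is a toy simulation of synchronization by ready bits: an instruction fires only when all of its source values are marked ready, and its result then wakes up its consumers. It is a sketch under simplifying assumptions that are not from the talk: every destination is written exactly once (as if registers were already renamed), every operation takes one cycle, and there is no limit on functional units. The function name and the example program are hypothetical.

```python
def run_dataflow(instructions, initially_ready):
    """Return the cycle in which each instruction fires.

    instructions: list of (destination, [sources]) in program order.
    initially_ready: names of values that are available before cycle 1.
    """
    ready = set(initially_ready)          # one "ready bit" per value name
    pending = dict(enumerate(instructions))
    fire_cycle = {}
    cycle = 0
    while pending:
        cycle += 1
        # Select everything whose sources were ready at the start of the cycle.
        fired = [index for index, (dest, sources) in pending.items()
                 if all(src in ready for src in sources)]
        if not fired:
            raise ValueError("deadlock: some source value is never produced")
        # Results become visible to consumers in the next cycle.
        for index in fired:
            dest, _ = pending.pop(index)
            fire_cycle[index] = cycle
            ready.add(dest)
    return fire_cycle

program = [
    ("t1", ["a", "b"]),     # 0
    ("t2", ["c", "d"]),     # 1
    ("t3", ["t1", "t2"]),   # 2: must wait for instructions 0 and 1
    ("t4", ["a", "c"]),     # 3: later in program order, but independent
]
schedule = run_dataflow(program, initially_ready=["a", "b", "c", "d"])
print(schedule)
```

Instructions 0, 1 and 3 fire together in cycle 1 and instruction 2 fires in cycle 2. The program was written sequentially; the ready bits alone decide what may run in parallel.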
22
Who understands this?
Should this be part of students’ parallelism education?
Where should it come in the curriculum?
Can students even understand these different layers?
23
Parallel to Sequential to Parallel
Guri says: think sequential, execute parallel
–i.e. don’t throw away 60 years of computing experience
–The original HPS model of out-of-order execution
–Synchronization is obvious: restricted data flow
At the higher level, parallel at larger granularity
–Pragmas in Java? Who would have thought!
–Dave Kuck’s CEDAR project, vintage 1985
–Synchronization is necessary: coarse-grain data flow
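The coarse-grain version of the same idea can be sketched with tasks instead of instructions: the programmer lists the tasks in sequential order and names what each one consumes, and a run-time layer executes independent tasks in parallel. The sketch uses Python's standard thread pool purely as a stand-in; it is not the pragma mechanism or the CEDAR design mentioned on the slide, and `run_task_graph` and the example tasks are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def run_task_graph(tasks, max_workers=4):
    """Run tasks in parallel while honouring their data dependencies.

    tasks: dict mapping a task name to (function, [names of producer tasks]).
    Tasks must be listed with producers before consumers, i.e. in the
    order a sequential program would naturally write them.
    """
    futures = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for name, (fn, deps) in tasks.items():
            # Raises KeyError if a consumer is listed before its producer.
            inputs = [futures[dep] for dep in deps]

            def job(fn=fn, inputs=inputs):
                # Synchronization point: block until every input is produced.
                return fn(*(future.result() for future in inputs))

            futures[name] = pool.submit(job)
        return {name: future.result() for name, future in futures.items()}

tasks = {
    "load_a": (lambda: [1, 2, 3], []),
    "load_b": (lambda: [10, 20, 30], []),
    "sum_a": (sum, ["load_a"]),
    "sum_b": (sum, ["load_b"]),
    "total": (lambda x, y: x + y, ["sum_a", "sum_b"]),
}
results = run_task_graph(tasks)
print(results["total"])
```

The two load/sum chains are free to run concurrently, and `total` waits on both. Because the pool starts jobs in submission order and producers are always submitted first, a waiting job only ever waits on work that has already started, so the graph cannot deadlock even with a single worker.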
24
Can we do more?
The run-time system: part of the chip design
–The chip knows the chip resources
–On-chip monitoring can supply information
–The run-time system can direct the use of those resources
The Cloud: the other extreme, and today’s be-all
–How do we harness its capability?
–What is needed from the hierarchy to make it work?
25
My message
Parallelism is a serious goal IF we want to solve the most challenging problems (cure cancer, predict tsunamis)
Telling people to think parallel is nice, but often silly
Examining the transformation hierarchy and seeing where we can gain leverage seems to me a sounder approach