Published by Megan Mitchell. Modified over 9 years ago.
1
Yale Patt
The University of Texas at Austin
World University Presidents’ Symposium, University of Belgrade
April 4, 2009
Future Microprocessors: What must we do differently to effectively utilize multi-core and many-core chips?
…and what are the implications re: education?
2
What I want to try to do today:
First, explain multi-core chips
  – How we got to where we are
  – What we have to do differently moving forward
Second, discuss what that means with respect to education
3
Problem
Algorithm
Program
ISA (Instruction Set Arch)
Microarchitecture
Circuits
Electronic Devices
6
How many electronic devices?
The first microprocessor (Intel 4004), 1971
  – 2300 transistors
  – 108 kHz
The Pentium chip, 1992
  – 3.1 million transistors
  – 66 MHz
Today
  – More than one billion transistors
  – Frequencies in excess of 5 GHz
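As a rough check on the growth these figures imply, the sketch below computes the average doubling period of the transistor count between the 4004 and "today". It assumes that "today" means 2009, the year of the talk, and treats "more than one billion" as exactly one billion. Both are assumptions made for illustration, not figures from the slide.

```python
import math

# Transistor counts are from the slide. The 2009 end year and the exact
# one-billion count are assumptions made for this illustration.
START_YEAR, START_TRANSISTORS = 1971, 2_300
END_YEAR, END_TRANSISTORS = 2009, 1_000_000_000


def doubling_period_years(n0, n1, years):
    """Average number of years per doubling when a count grows from n0 to n1."""
    doublings = math.log2(n1 / n0)
    return years / doublings


period = doubling_period_years(
    START_TRANSISTORS, END_TRANSISTORS, END_YEAR - START_YEAR
)
print(f"Average doubling period: {period:.2f} years")
```

Under these assumptions the count doubles roughly every two years.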
7
In the next few years:
Process technology: 50 billion transistors
  – Gelsinger says we can go down to 10 nanometers (I like to say 100 angstroms just to keep us focused)
Dreamers will use whatever we come up with
How will we design the microarchitecture?
  – How will we harness 50 billion transistors?
8
How will we use 50 billion transistors? How have we used the transistors up to now?
9
How have we used the available transistors?
10
Intel Pentium M
11
Intel Core 2 Duo Penryn, 2007 45nm, 3MB L2
12
Why Multi-core chips?
It is easier than designing a much better uni-core
It is embarrassing to continue making L2 bigger
It is the next obvious step
It is not the Holy Grail
13
that is…
In the beginning: a better and better uniprocessor
  – improving performance on the hard problems
More recently: a uniprocessor with a bigger L2 cache
  – forsaking further improvement on the “hard” problems
  – poorly utilizing the chip area
  – and blaming the processor for not delivering performance
Today: dual core, quad core
Tomorrow: ???
14
The Good News: Lots of cores on the chip
The Bad News: Not much benefit.
15
In my opinion the reason is:
Our inability to effectively exploit:
  – The transformation hierarchy
  – Parallel programming
16
Problem
Algorithm
Program
ISA (Instruction Set Arch)
Microarchitecture
Circuits
Electrons
17
Up to now
Maintain the artificial walls between the layers
Keep the abstraction layers secure
  – Makes for a better comfort zone
(Mostly) Improving the Microarchitecture
  – Pipelining, Branch Prediction, Speculative Execution, Out-of-order Execution, Caches, Trace Cache
Today, we have too many transistors
  – BANDWIDTH and POWER are blocking improvement
  – We MUST change the paradigm
18
…and that means (I think) breaking the layers:
Compiler, Microarchitecture
  – Multiple levels of cache
  – Block-structured ISA
  – Part by compiler, part by uarch
  – Fast track, slow track
Algorithm, Compiler, Microarchitecture
  – X + superscalar – the Refrigerator
  – Niagara X / Pentium Y
Microarchitecture, Circuits
  – Verification Hooks
  – Internal fault tolerance
19
IF we break the layers:
50 billion transistors naturally leads to
  – A large number of simple processors, AND
  – A few very heavyweight processors, AND
  – Enough “accelerators” for handling lots of special tasks
Heavyweight processors mean we can exploit ILP
  – i.e., Improve performance on hard, sequential problems
  – The death of ILP has been greatly exaggerated
  – There is still plenty of head room yet to be pursued
We need software that can utilize both
We need multiple interfaces
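One way to see why a heavyweight processor still matters alongside many simple cores is a simple speedup model in the style of Amdahl's law. Amdahl's law is not named on the slide; it is brought in here only as an illustration. The 10% sequential fraction, the 64 simple cores, and the 2x heavyweight speedup are all hypothetical numbers.

```python
def speedup(serial_fraction, n_simple_cores, heavy_core_speedup=1.0):
    """Amdahl-style speedup relative to a single simple core.

    The sequential part runs on one core that is heavy_core_speedup times
    faster than a simple core; the parallel part is spread evenly over
    n_simple_cores simple cores.
    """
    parallel_fraction = 1.0 - serial_fraction
    time = serial_fraction / heavy_core_speedup + parallel_fraction / n_simple_cores
    return 1.0 / time


# Hypothetical workload: 10% of the work is inherently sequential.
only_simple = speedup(0.10, n_simple_cores=64)
with_heavy = speedup(0.10, n_simple_cores=64, heavy_core_speedup=2.0)

print(f"64 simple cores:                 {only_simple:.1f}x")
print(f"64 simple cores + 1 heavyweight: {with_heavy:.1f}x")
```

In this toy model the sequential part dominates the run time once the parallel part is spread out, so speeding up the sequential part alone nearly doubles the overall speedup.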
20
that is:
IF we are willing to continue to pursue ILP
IF we are willing to break the layers
IF we are willing to embrace parallel programming
21
The Microprocessor of 2019
It WILL BE a Multi-core chip
  – But it will be Pentium X / Niagara Y
  – It will tackle off-chip bandwidth (better L2 miss handling)
  – It will tackle power consumption (ON/OFF switches)
  – It will tackle soft errors (internal fault tolerance)
  – It will tackle security
And it WILL CONTAIN a heavyweight ILP processor
  – With the levels of transformation integrated
22
The Heavyweight ILP Processor:
Compiler/Microarchitecture Symbiosis
  – Multiple levels of cache
  – Fast track / Slow track
  – Part by compiler, part by microarchitecture
  – Block-structured ISA
Better Branch Prediction (e.g., indirect jumps)
Ample sprinkling of Refrigerators
SSMT (Also known as helper threads)
Power Awareness (more than ON/OFF switches)
Verification hooks (CAD a first class citizen)
Internal Fault tolerance (for soft errors)
Better security
23
Unfortunately:
We train computer people to work within their layer
Too few understand anything outside their layer
and, as to multiple cores:
People think sequential
24
At least two problems
25
Conventional Wisdom Problem 1: “Abstraction” is Misunderstood
Taxi to the airport
The Scheme Chip (Deeper understanding)
Sorting (choices)
Microsoft developers (Deeper understanding)
Wireless networks (Layers revisited)
26
Conventional Wisdom Problem 2: Thinking in Parallel is Hard
Perhaps: Thinking is Hard
How do we get people to believe: Thinking in parallel is natural
27
Too many computer professionals don’t get it.
We can exploit all these transistors
  – IF we can understand each other’s layer
Thousands of cores, hundreds of accelerators
  – Ability to power on/off under program control
Algorithms, Compiler, Microarchitecture, Circuits all talking to each other …
Harnessing 50 billion transistor chips
We have an Education Problem
We have an Education Opportunity
28
What does this mean for most countries?
One does NOT have to produce chips to be a major force in the multi-core, many-core era
One DOES have to understand multiple layers
  – which does not require huge capital investment
  – BUT does require a very large teacher investment
One DOES have to embrace parallel thinking
29
How does one do it?
FIRST, Do not accept the premise: Parallel programming is hard
SECOND, Do not accept the premise: It is okay to know only one layer
30
Parallel Programming is Hard?
What if we start teaching parallel thinking in the first course to freshmen?
For example:
  – Factorial
  – Parallel search
  – Streaming
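To make the first two examples concrete, here is a minimal sketch of how a first course might express factorial and search as independent pieces of work that are combined at the end. The function names, the four-worker default, and the chunking scheme are assumptions for illustration and are not from the talk. The sketch uses threads to show the decomposition only; in CPython, threads do not speed up CPU-bound work like this.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def chunk_product(bounds):
    """Multiply the integers in the half-open range [lo, hi)."""
    lo, hi = bounds
    return math.prod(range(lo, hi))


def parallel_factorial(n, workers=4):
    """Compute n! by multiplying independent sub-ranges, then combining them."""
    if n < 2:
        return 1
    step = max(1, n // workers)
    # Split 1..n into consecutive chunks: [1, 1+step), [1+step, 1+2*step), ...
    chunks = [(lo, min(lo + step, n + 1)) for lo in range(1, n + 1, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(chunk_product, chunks))
    return math.prod(partials)


def parallel_search(items, target, workers=4):
    """Return the first index of target in items, or -1 if it is absent.

    Each worker scans its own slice; the results are combined at the end.
    """
    if not items:
        return -1
    step = max(1, len(items) // workers)

    def search_slice(start):
        for i in range(start, min(start + step, len(items))):
            if items[i] == target:
                return i
        return -1

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(search_slice, range(0, len(items), step))
        hits = [i for i in results if i != -1]
    return min(hits) if hits else -1


print(parallel_factorial(10))                   # 3628800
print(parallel_search([7, 3, 9, 4, 1, 8], 4))   # 3
```

Both functions follow the same pattern: split the problem, work on the pieces independently, then combine the partial results.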
31
How does one do it?
FIRST, Do not accept the premise: Parallel programming is hard
SECOND, Do not accept the premise: It is okay to know only one layer
32
Students can understand more than one layer
What if we get rid of “top-down” FIRST
  – Students do not get it – they have no underpinnings
  – Objects are too high a level of abstraction
  – So, students end up memorizing
  – Memorizing isn’t learning (and certainly not understanding)
What if we START with “motivated” bottom up
  – Students build on what they already know
  – Memorizing is replaced with real learning
  – Continually raise the level of abstraction
The student sees the layers from the start
  – The student makes the connections
  – The student understands what is going on
33
If students understand, they can fix their own bugs
…and, there is no substitute for
Design it wrong, Debug it yourself, Fix it yourself, AND see the working result.
34
And, while I am at it:
Students in every country are smart
  – No country has an advantage in that domain
Don’t be afraid to work them very, very hard
  – Students can master very difficult material
  – And, students will not complain if they are learning
  – But they will complain if they are being fed “tedium.”
That means teachers who know the difference
35
Again: what does this mean for most countries?
One does NOT have to produce chips to be a major force in the multi-core, many-core era
One DOES have to understand multiple layers
  – which does not require huge capital investment
  – BUT does require a very large teacher investment
One DOES have to embrace parallel thinking
36
The answer to both is education.
37
Thank you!