Published by Shavonne Waters. Modified over 9 years ago.
1 Computer Architecture Research Overview
Rajeev Balasubramonian
School of Computing, University of Utah
http://www.cs.utah.edu/~rajeev
2 What is Computer Architecture?
3 If the Intel Pentium4 has a faster clock speed than the IBM Power4, does it execute your programs faster?
4 What is Computer Architecture?
If the Intel Pentium4 has a faster clock speed than the IBM Power4, does it execute your programs faster?
[Figure: timelines for Case 1 and Case 2, marking clock ticks and completing instructions against time]
5 What is Computer Architecture?
To a large extent, computer architecture determines:
- the number of instructions used to execute a program
- the time each instruction takes to execute
- the idle cycles when no work gets done
- the number of instructions that can execute in parallel
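The factors above combine into the usual execution-time relation: time = instruction count × average cycles per instruction ÷ clock frequency. A minimal sketch follows, assuming two hypothetical processors whose names and numbers are invented for illustration (they are not measurements of the Pentium4 or Power4). It shows how a slower clock can still finish a program sooner when fewer cycles are wasted per instruction.

```python
from dataclasses import dataclass


@dataclass
class Processor:
    """Hypothetical processor description (all values are illustrative)."""
    name: str
    clock_ghz: float         # clock frequency in GHz
    cycles_per_instr: float  # average CPI, including idle (stall) cycles


def execution_time_s(proc: Processor, instruction_count: int) -> float:
    """Execution time = instructions * CPI / clock frequency."""
    cycles = instruction_count * proc.cycles_per_instr
    return cycles / (proc.clock_ghz * 1e9)


# Assumed numbers: the fast-clock design stalls more and overlaps less work.
fast_clock = Processor("fast-clock", clock_ghz=3.0, cycles_per_instr=2.0)
slow_clock = Processor("slow-clock", clock_ghz=1.5, cycles_per_instr=0.8)
program = 1_000_000_000  # instructions executed by the program

for p in (fast_clock, slow_clock):
    print(f"{p.name}: {execution_time_s(p, program):.3f} s")
```

With these assumed values the slow-clock design finishes first, which is the point of the clock-speed question on the earlier slide.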
6 A Typical Microprocessor
[Block diagram: Branch Predictor, Decode & Rename, Issue Logic, ALU, Register File, L1 Instr Cache, L1 Data Cache, L2 Cache]
7 Architecture Trends in the 90s
- Performance was the ultimate metric
- Transistors were a limiting factor
- As on-chip transistors became available in the 90s, more functionality and more complex circuitry were added to boost performance; most of the low-hanging fruit has now been picked
8 Hitting the Wall
We have now hit the following walls:
- Single core performance
- Memory
- Complexity
- Power, temperature
9 Hitting the Power Wall
Power is as important a metric today as performance
[Figure from Shekhar Borkar, MICRO’99]
10 The Advent of Multi-Core Chips
- In the past, performance magically increased by 50% every year
- In the future, this improvement will be only ~20% every year … unless … the application is multi-threaded!
[Figure: multi-core chip layout with cores and cache banks]
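To see how much the drop from 50% to roughly 20% per year matters, the two rates can be compounded over several years. This is a small arithmetic sketch; the function name is made up for illustration and the rates are simply the ones quoted on the slide.

```python
def speedup_after(years: int, annual_gain: float) -> float:
    """Cumulative speedup after compounding a fixed annual gain."""
    return (1.0 + annual_gain) ** years


for years in (1, 5, 10):
    past = speedup_after(years, 0.50)    # 50% per year
    future = speedup_after(years, 0.20)  # ~20% per year
    print(f"{years:2d} years: {past:6.1f}x vs {future:4.1f}x")
```

Over a decade the gap is roughly 58x versus 6x, so software that cannot use multiple threads falls behind by about an order of magnitude.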
11 Upcoming Architecture Challenges
- Improving single core performance
- Functionalities in multi-core chips
- Simplifying the programmer’s task
- Efficient interconnects
- Power and temperature-efficient designs
- Designs tolerant of errors
For publications, see http://www.cs.utah.edu/~rajeev/research.html
12 Interconnects as a Bottleneck
- In the past, on-chip data transmission on wires cost almost nothing
- Interconnect speed and power have been improving, but not at the same rate as transistor speeds
- Hence, relative to computation, communication is much more expensive
- In the near future, it will take 100 cycles to travel across the chip
- 50% of chip power can be attributed to interconnects
13 Interconnects in Multi-Core Chips
[Figure: CPU 1, CPU 2, and CPU 3 with L1 caches, connected to an L2 cache and L2 control blocks]
14 Not all Wires are Created Equal
                      B-Wires   L-Wires   W-Wires   PW-Wires
Relative latency      1x        0.5x      1.6x      3.2x
Relative area         1x        4x        0.5x      0.5x
Dynamic power (W/m)   2.65      1.46      2.9       0.87
Static power (W/m)    1.02      0.57      1.16      0.31
15 Data Transfers have Varying Needs
Example of a cache coherence transaction:
Read exclusive request for a shared block
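One way to read the wire table together with the varying needs of data transfers is as a selection problem: latency-critical messages want the fastest wires, while the rest can trade latency for lower power. The sketch below encodes the table values from the slide and applies a simple policy. The policy itself, the function name `pick_wire`, and the area-budget parameter are assumptions for illustration, not the mapping used in the original work.

```python
# Wire properties copied from the "Not all Wires are Created Equal" table.
WIRES = {
    "B":  {"latency": 1.0, "area": 1.0, "dynamic_w_per_m": 2.65, "static_w_per_m": 1.02},
    "L":  {"latency": 0.5, "area": 4.0, "dynamic_w_per_m": 1.46, "static_w_per_m": 0.57},
    "W":  {"latency": 1.6, "area": 0.5, "dynamic_w_per_m": 2.90, "static_w_per_m": 1.16},
    "PW": {"latency": 3.2, "area": 0.5, "dynamic_w_per_m": 0.87, "static_w_per_m": 0.31},
}


def pick_wire(latency_critical: bool, max_relative_area: float = 4.0) -> str:
    """Hypothetical policy: lowest latency for critical messages,
    lowest total power per meter for everything else."""
    candidates = {n: w for n, w in WIRES.items() if w["area"] <= max_relative_area}
    if latency_critical:
        return min(candidates, key=lambda n: candidates[n]["latency"])
    return min(
        candidates,
        key=lambda n: candidates[n]["dynamic_w_per_m"] + candidates[n]["static_w_per_m"],
    )


print(pick_wire(latency_critical=True))    # fastest wire in the table
print(pick_wire(latency_critical=False))   # lowest-power wire in the table
print(pick_wire(latency_critical=True, max_relative_area=1.0))
```

Under this policy a critical message would travel on L-Wires and a non-critical one on PW-Wires; tightening the area budget to 1x pushes critical traffic back onto B-Wires.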
16 Other Interconnect Choices
- Optical interconnects: speed of light, but a cost in converting between optical and electrical domains
- 3D chips: reduced communication distances, low cost for vertical signal transmission, but an increase in power density
17 3D Layouts
[Figure: (a) Arch-1 (cache-on-cluster), (b) Arch-2 (cluster-on-cluster), (c) Arch-3 (staggered). Legend: cluster, cache bank, intra-die horizontal wire, inter-die vertical wire. Dies are labeled Die 0 and Die 1.]
18 Upcoming Architecture Challenges
- Improving single core performance
- Functionalities in multi-core chips
- Simplifying the programmer’s task
- Efficient interconnects
- Power and temperature-efficient designs
- Designs tolerant of errors
Clustered architectures:
- relatively low complexity
- scalable solution
- easily handles multiple threads
19 Upcoming Architecture Challenges
- Improving single core performance
- Functionalities in multi-core chips
- Simplifying the programmer’s task
- Efficient interconnects
- Power and temperature-efficient designs
- Designs tolerant of errors
Functionalities:
- Heterogeneous perf/power
- Cores that execute the OS
- Cores that verify results
20 Upcoming Architecture Challenges
- Improving single core performance
- Functionalities in multi-core chips
- Simplifying the programmer’s task
- Efficient interconnects
- Power and temperature-efficient designs
- Designs tolerant of errors
Hardware to support transactional memory
21 Upcoming Architecture Challenges
- Improving single core performance
- Functionalities in multi-core chips
- Simplifying the programmer’s task
- Efficient interconnects
- Power and temperature-efficient designs
- Designs tolerant of errors
Sources of errors:
- Faults are caused by high-energy particles that deposit enough charge to toggle bits
- Variations in conditions may cause a circuit to not produce its result in time
22 Research Methodologies
It’s all about the simulators!
- SimpleScalar, Wattch, and HotSpot: about 10,000 lines of C code that model the flow of instructions through a modern processor
- Inputs: a configuration file that specifies processor parameters, and a benchmark program (say, gzip)
- Outputs: how long the program runs on the simulated processor (SimpleScalar), how much power is consumed (Wattch), and the peak temperature (HotSpot)
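The input/output shape of such a simulator can be illustrated with a toy model: a configuration plus an instruction trace goes in, and a cycle count, runtime, and power estimate come out. The sketch below is far simpler than the real tools and does not use their code or interfaces. The `Config` fields, the trace format, and the timing and energy model are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Config:
    """Hypothetical processor parameters (stand-in for a config file)."""
    issue_width: int = 4              # instructions issued per cycle at best
    miss_penalty_cycles: int = 20     # stall cycles per load that misses in L1
    clock_ghz: float = 2.0            # clock frequency in GHz
    energy_per_instr_nj: float = 0.5  # assumed flat energy cost per instruction


def simulate(cfg: Config, trace: list[tuple[str, bool]]) -> dict[str, float]:
    """Toy timing model: trace entries are (opcode, l1_hit).

    Cycles = ceil(instructions / issue width) + stall cycles for load misses.
    """
    issue_cycles = math.ceil(len(trace) / cfg.issue_width)
    stall_cycles = sum(
        cfg.miss_penalty_cycles for op, hit in trace if op == "load" and not hit
    )
    cycles = issue_cycles + stall_cycles
    runtime_ns = cycles / cfg.clock_ghz             # cycles / (cycles per ns)
    energy_nj = len(trace) * cfg.energy_per_instr_nj
    avg_power_w = energy_nj / runtime_ns if runtime_ns else 0.0  # nJ/ns = W
    return {"cycles": cycles, "runtime_ns": runtime_ns, "avg_power_w": avg_power_w}


# A made-up eight-instruction "benchmark": six ALU ops and two missing loads.
trace = [("add", True)] * 6 + [("load", False)] * 2
print(simulate(Config(), trace))
```

Changing a field in `Config` and rerunning mirrors the workflow described above: vary a processor parameter, rerun the benchmark, and compare runtime and power.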
23 Evaluating a New Idea
- Lots of reading (it’s better than waiting for divine inspiration)
- Identify bottlenecks, identify problems, develop an idea, repeatedly question that idea
- Understand the simulator
- Engineer a solution, modify simulator code (perhaps writing fewer than 1000 lines of C code)
- Analyze data (things never work the first time), engineer/optimize/debug your solution
- Write papers
- Implement in silicon?
24 To Learn More…
- CS/EE 3810: Computer Organization
- CS/EE 6810: Computer Architecture
- CS/EE 7810: Advanced Computer Architecture
- CS/EE 7820: Parallel Computer Architecture
- CS 7937 / 7940: Architecture Reading Seminar