Multi-core processors, 12/1/09

Presentation transcript:

1 Multi-core processors 12/1/09

2 Multiprocessors inside a single chip
It is now possible to implement multiple processors (cores) inside a single chip.
–The number of cores in multicore processors is expected to double every two years.
–Manycore processors will become the norm.
Such processors typically share some of the caches and the external memory interfaces.
Keeping the cores on one chip eliminates the latencies associated with chip-to-chip communication.

3 Applications
In some multi-core applications the processors are running the same code.
–SIMD
Multithreading
–The state of a thread must be saved when a processor switches to a new thread.
–Fast switching requires a separate copy of the registers for each thread.
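The point about separate register copies can be illustrated with a minimal sketch. The names `ThreadContext`, `Core`, and `NUM_REGS` are hypothetical and do not come from the slides; the sketch only assumes that each hardware thread owns a private program counter and register set, so that a switch selects a different context instead of copying registers.

```python
from dataclasses import dataclass, field

NUM_REGS = 8  # assumed register-file size, chosen only for this sketch


@dataclass
class ThreadContext:
    """Per-thread state: a program counter and a private copy of the registers."""
    pc: int = 0
    regs: list = field(default_factory=lambda: [0] * NUM_REGS)


class Core:
    """A core that holds one context per hardware thread (hypothetical model)."""

    def __init__(self, num_threads):
        self.contexts = [ThreadContext() for _ in range(num_threads)]
        self.active = 0

    def switch_to(self, thread_id):
        # Nothing is copied: every thread keeps its own register set,
        # so a switch only changes which context is active.
        self.active = thread_id

    def write_reg(self, index, value):
        self.contexts[self.active].regs[index] = value

    def read_reg(self, index):
        return self.contexts[self.active].regs[index]
```

Writing a register in one thread, switching away, and switching back leaves the value intact, because the other thread only ever touched its own copy.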

4 Multithreading
Fine-grained multithreading
Coarse-grained multithreading
Simultaneous multithreading (SMT)

5 Fine-grained multithreading
Switches between threads on each clock.
Interleaved execution of threads.
Must be able to switch threads very quickly.
–On each clock
Stalled threads are skipped.
Primary disadvantage is that execution of any particular thread is slowed, since it is interleaved with other threads.
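A rough sketch of this policy, assuming simple round-robin selection: one thread is chosen per clock and stalled threads are skipped. The function name `fine_grained_schedule` and the `stalled` input format are assumptions made for illustration, not part of the slides.

```python
def fine_grained_schedule(num_threads, stalled, num_cycles):
    """Return the thread issued in each cycle (None if every thread is stalled).

    stalled: dict mapping a cycle number to the set of thread ids
             that are stalled in that cycle (assumed input format).
    """
    schedule = []
    next_thread = 0
    for cycle in range(num_cycles):
        blocked = stalled.get(cycle, set())
        chosen = None
        # Round-robin search that starts at the next thread in line
        # and skips any thread that is stalled this cycle.
        for offset in range(num_threads):
            candidate = (next_thread + offset) % num_threads
            if candidate not in blocked:
                chosen = candidate
                break
        schedule.append(chosen)
        if chosen is not None:
            next_thread = (chosen + 1) % num_threads
    return schedule
```

With three threads and no stalls, each thread issues only every third cycle, which is the per-thread slowdown named as the primary disadvantage.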

6 Coarse-grained multithreading
Invented as an alternative to fine-grained multithreading.
Switches threads only on a costly stall.
–Such as a level 2 cache miss
Relieves the need to switch threads on each clock.
Less likely to slow execution of an individual thread.
Disadvantages
–Throughput suffers because of the inability to switch threads on shorter stalls.
–The pipeline must be emptied (or frozen) and started up again when a thread is switched.
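The switch-on-costly-stall policy can be sketched as follows. This is a simplified model under stated assumptions: each thread is a list of `"op"` and `"miss"` tokens, a `"miss"` stands for a costly stall such as a level 2 cache miss, the miss is assumed to be resolved by the time the thread runs again, and `switch_penalty` is a hypothetical number of bubble cycles for emptying and restarting the pipeline.

```python
def coarse_grained_schedule(threads, switch_penalty=2):
    """Return a timeline of thread ids, with None marking pipeline bubbles.

    threads: list of token lists, each token being "op" or "miss".
    """
    positions = [0] * len(threads)
    timeline = []
    current = 0
    remaining = sum(len(t) for t in threads)
    while remaining > 0:
        if positions[current] >= len(threads[current]):
            # The current thread has finished: move on to the next one.
            current = (current + 1) % len(threads)
            continue
        token = threads[current][positions[current]]
        positions[current] += 1
        remaining -= 1
        timeline.append(current)
        if token == "miss" and remaining > 0:
            # Costly stall: pay the pipeline restart cost, then switch threads.
            timeline.extend([None] * switch_penalty)
            current = (current + 1) % len(threads)
    return timeline
```

A thread keeps the pipeline until it misses, so an individual thread is not slowed by interleaving, but every switch costs `switch_penalty` idle cycles.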

7 Simultaneous multithreading
A variation of multithreading.
Uses a multiple-issue, dynamically scheduled processor to exploit thread-level parallelism and instruction-level parallelism at the same time.
Motivated by the fact that modern multiple-issue processors often have more functional unit parallelism than a single thread can use.
Register renaming and dynamic scheduling enable multiple instructions from independent threads to be issued without regard to the dependencies among them.
Dependencies are resolved by the dynamic scheduling capability.
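As a minimal sketch of the motivation, the function below fills the issue slots of one cycle from several threads. The name `smt_issue`, the per-thread ready counts, and the round-robin slot assignment are assumptions made for illustration; real SMT processors use their own fetch and issue policies.

```python
def smt_issue(ready, issue_width):
    """Return the thread id that occupies each filled issue slot in one cycle.

    ready: list with the number of ready instructions per thread (assumed input).
    issue_width: number of instructions the processor can issue per cycle.
    """
    slots = []
    remaining = list(ready)
    # Hand out slots round-robin until the slots or the ready instructions run out.
    while len(slots) < issue_width and any(remaining):
        for tid, count in enumerate(remaining):
            if count > 0 and len(slots) < issue_width:
                slots.append(tid)
                remaining[tid] -= 1
    return slots
```

With a 4-wide processor, a single thread that has only one ready instruction leaves three slots empty, while two threads with three ready instructions each fill all four slots in the same cycle.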

8 Example: Four threads executing independently on a superscalar processor with no multithreading support.

9 Example: The same four threads executing together on a processor for the three types of multithreading.
[Figure legend: Thread A, Thread B]