Computer Architecture Lecture 4 17th May, 2006

Abhinav Agarwal, Veeramani V.

Recap
- Simple pipeline: hazards and solutions
  - Data hazards
    - Static compiler techniques: load delay slot, etc.
    - Hardware solutions: data forwarding, out-of-order execution, register renaming
  - Control hazards
    - Static compiler techniques
    - Hardware speculation through branch predictors
  - Structural hazards
    - Increase hardware resources
- Superscalar out-of-order execution
- Memory organisation
May 17, 2006, EE Summer Camp '06
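As an illustration of hardware speculation through branch predictors, here is a minimal sketch of a 2-bit saturating-counter predictor. The slides do not say which predictor was covered, so the scheme, the class name `TwoBitPredictor`, and the loop-branch pattern are all assumptions chosen for illustration.

```python
class TwoBitPredictor:
    """Single 2-bit saturating counter (illustrative, not from the slides)."""

    def __init__(self):
        # States 0 and 1 predict "not taken"; states 2 and 3 predict "taken".
        self.counter = 1  # assumed start state: weakly not taken

    def predict(self):
        return self.counter >= 2

    def update(self, taken):
        # Move one step towards the observed outcome, saturating at 0 and 3.
        if taken:
            self.counter = min(3, self.counter + 1)
        else:
            self.counter = max(0, self.counter - 1)


def accuracy(outcomes):
    """Fraction of branch outcomes predicted correctly."""
    predictor = TwoBitPredictor()
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct / len(outcomes)


# Hypothetical loop branch: taken 9 times, then falls through, run twice.
loop_branch = ([True] * 9 + [False]) * 2
print(accuracy(loop_branch))  # 0.85
```

The counter needs two consecutive wrong guesses to flip its prediction, so the single not-taken outcome at each loop exit costs one misprediction rather than two.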

Memory Organization in processors
- Caches inside the chip
- Faster: ‘closer’ to the processor
- SRAM cells
- They contain recently used data
- They contain data in ‘blocks’

Rationale behind caches
- Principle of spatial locality
- Principle of temporal locality
- Replacement policy (LRU, LFU, etc.)
- Principle of inclusivity
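The locality principles and LRU replacement can be tried out with a small simulation. This is a sketch only: it assumes a fully associative cache, and the function name `simulate_lru`, the block size, and the address traces are hypothetical values picked for the example.

```python
from collections import OrderedDict


def simulate_lru(addresses, num_blocks=4, block_size=16):
    """Return (hits, misses) for a fully associative cache with LRU replacement."""
    cache = OrderedDict()  # block number -> None, least recently used first
    hits = misses = 0
    for addr in addresses:
        block = addr // block_size  # neighbouring addresses share a block
        if block in cache:
            hits += 1
            cache.move_to_end(block)  # mark as most recently used
        else:
            misses += 1
            if len(cache) >= num_blocks:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = None
    return hits, misses


# Spatial locality: 16 sequential word accesses, 4 bytes apart.
print(simulate_lru(list(range(0, 64, 4))))  # (12, 4)

# Temporal locality: the same two addresses are reused.
print(simulate_lru([0, 64, 0, 64]))  # (2, 2)
```

In the sequential trace only the first access to each 16-byte block misses; the other three accesses hit because the whole block was brought in together.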

Outline
- Instruction Level Parallelism
- Thread-level Parallelism
  - Fine-grain multithreading
  - Simultaneous multithreading
    - Sharable resources and non-sharable resources
- Chip multiprocessor
- Some design issues

Instruction Level Parallelism
- Overlap execution of many instructions
- ILP techniques try to reduce the impact of data and control dependencies
- Issue independent instructions out of order
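A rough sketch of how independent instructions can be overlapped: each instruction issues as soon as its source registers are ready. The model is an assumption made for illustration (unlimited issue width, one-cycle latency, and register renaming so that only true read-after-write dependencies matter); `issue_cycles` and the sample program are hypothetical.

```python
def issue_cycles(instructions):
    """Earliest issue cycle for each (dest, src1, src2, ...) instruction.

    Assumes unlimited issue width, 1-cycle latency, and renaming, so only
    read-after-write dependencies delay an instruction.
    """
    ready = {}   # register -> cycle in which its newest value is available
    cycles = []
    for dest, *srcs in instructions:
        # Wait for the latest producer of every source register.
        cycle = max((ready.get(src, 0) for src in srcs), default=0)
        cycles.append(cycle)
        ready[dest] = cycle + 1
    return cycles


program = [
    ("r1", "r2", "r3"),  # r1 = r2 op r3
    ("r4", "r1", "r5"),  # depends on r1
    ("r6", "r7", "r8"),  # independent of the two above
    ("r9", "r4", "r6"),  # depends on r4 and r6
]
print(issue_cycles(program))  # [0, 1, 0, 2]
```

The third instruction issues in cycle 0 alongside the first even though it comes later in program order, so four instructions need three cycles instead of four.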

Thread Level Parallelism
- Two different threads offer more independent instructions than a single thread
- Better utilization of functional units
- Multi-threaded performance improves drastically

A simple pipeline
source: EV8 DEC Alpha Processor, (c) Intel

Superscalar pipeline

Speculative execution
source: EV8 DEC Alpha Processor, (c) Intel

Fine Grained Multithreading

Simultaneous Multithreading
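The difference between the two multithreading styles can be sketched with a toy issue-slot count. Everything here is an assumption for illustration: the per-cycle "ready instruction" counts are made up, instructions that fail to issue are not carried over to the next cycle, and the names `fine_grained_slots` and `smt_slots` are hypothetical.

```python
def fine_grained_slots(ready, width):
    """Issue slots used when one thread, chosen round-robin, owns each cycle."""
    num_threads = len(ready)
    num_cycles = len(ready[0])
    return sum(min(ready[cycle % num_threads][cycle], width)
               for cycle in range(num_cycles))


def smt_slots(ready, width):
    """Issue slots used when all threads share the slots of every cycle."""
    num_cycles = len(ready[0])
    return sum(min(sum(thread[cycle] for thread in ready), width)
               for cycle in range(num_cycles))


# ready[t][c] = instructions thread t could issue in cycle c (made-up numbers).
ready = [
    [2, 1, 3, 1],
    [1, 2, 0, 2],
]
print(fine_grained_slots(ready, width=4))  # 9 of 16 slots
print(smt_slots(ready, width=4))           # 12 of 16 slots
```

In this toy model fine-grained multithreading fills slots from only one thread per cycle, while SMT can fill leftover slots in the same cycle with instructions from the other thread.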

Out of Order Execution

SMT pipeline

Resources: replication required
- Program counters
- Register maps

Replication not required
- Register file (rename space)
- Instruction queue
- Branch predictor
- First and second level caches
- etc.
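The split between replicated and shared state can be pictured as a data structure. This is a sketch under assumptions: `ThreadContext` and `SMTCore` are hypothetical names, only the program counter, register map, and rename register file are modelled, and physical registers are never returned to the free list.

```python
from dataclasses import dataclass, field


@dataclass
class ThreadContext:
    """State replicated per hardware thread."""
    pc: int = 0
    reg_map: dict = field(default_factory=dict)  # architectural -> physical


class SMTCore:
    """Per-thread contexts on top of one shared physical register file."""

    def __init__(self, num_threads, num_physical_regs):
        self.threads = [ThreadContext() for _ in range(num_threads)]
        self.free_regs = list(range(num_physical_regs))  # shared rename space

    def rename_dest(self, thread_id, arch_reg):
        # Take a register from the shared pool and record it in the
        # private map of the thread that writes it.
        phys = self.free_regs.pop(0)
        self.threads[thread_id].reg_map[arch_reg] = phys
        return phys


core = SMTCore(num_threads=2, num_physical_regs=8)
core.rename_dest(0, "r1")
core.rename_dest(1, "r1")
print(core.threads[0].reg_map, core.threads[1].reg_map)  # {'r1': 0} {'r1': 1}
```

Both threads write their own architectural `r1`, yet the separate register maps send them to different physical registers drawn from the same pool.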

Chip multiprocessor
- Number of transistors going up
- Have more than one core on the chip
- The cores still share the caches

Some design issues
- Trade-off in choosing the cache size
  - Power and performance
- Super-pipelining trade-off
  - Higher clock frequency versus larger speculation penalty, plus power
- Power consumption
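The super-pipelining trade-off can be put into numbers with the usual model: average time per instruction = CPI / clock rate, where mispredicted branches add stall cycles to the CPI. The figures below (clock rates, penalties, branch fraction, misprediction rate) are assumed values for illustration, not measurements from the lecture.

```python
def time_per_instruction_ns(clock_ghz, mispredict_penalty_cycles,
                            base_cpi=1.0, branch_fraction=0.2,
                            mispredict_rate=0.1):
    """Average time per instruction in nanoseconds.

    CPI = base CPI + (branch fraction * misprediction rate * penalty).
    """
    stall_cpi = branch_fraction * mispredict_rate * mispredict_penalty_cycles
    return (base_cpi + stall_cpi) / clock_ghz


# Assumed designs: a shallow pipeline and a deeper one at twice the clock rate.
shallow = time_per_instruction_ns(clock_ghz=1.0, mispredict_penalty_cycles=5)
deep = time_per_instruction_ns(clock_ghz=2.0, mispredict_penalty_cycles=20)

print(round(shallow, 3))         # 1.1
print(round(deep, 3))            # 0.7
print(round(shallow / deep, 2))  # 1.57
```

With these assumed numbers, doubling the clock frequency yields only about a 1.57x speedup because the longer pipeline pays a larger penalty on every misprediction.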

Novel techniques for reducing power
- Clock gating
- Run non-critical elements at a slower clock
- Reduce voltage swings (voltage of operation)
- Sleep mode / standby mode
- Dynamic voltage and frequency scaling (DVFS)
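Why these techniques help follows from the standard CMOS dynamic power relation P = a * C * V^2 * f (activity factor, switched capacitance, supply voltage, clock frequency). The sketch below applies it with made-up component values; the function name and the numbers are assumptions for illustration.

```python
def dynamic_power(capacitance_f, voltage_v, frequency_hz, activity=1.0):
    """Dynamic power in watts: activity * C * V^2 * f."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


# Assumed baseline: 1 nF switched capacitance, 1.2 V, 2 GHz.
baseline = dynamic_power(1e-9, 1.2, 2e9)

# DVFS: scale voltage and frequency down together by 20%.
scaled = dynamic_power(1e-9, 1.2 * 0.8, 2e9 * 0.8)

# Clock gating: half of the logic stops switching (activity factor 0.5).
gated = dynamic_power(1e-9, 1.2, 2e9, activity=0.5)

print(round(scaled / baseline, 3))  # 0.512
print(round(gated / baseline, 3))   # 0.5
```

Because voltage enters squared, lowering voltage and frequency together by 20% cuts dynamic power to about half, whereas lowering frequency alone would save only 20%.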
