Published by Laurens Maes. Modified over 5 years ago.
Resource Replication
6 integer units
4 FP units
8 sets of architectural registers
Renaming registers (Int/FP)
HW context (PC, return stack, etc.)
Ports in the I-cache
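As a rough illustration of the list above, the following sketch models a core with one hardware context per thread alongside a common pool of functional units. The class names, the 32-entry register file, and the choice to model the integer and FP units as shared are assumptions made for the example; the slide itself only gives the counts.

```python
from dataclasses import dataclass, field


@dataclass
class HardwareContext:
    """State kept once per hardware thread (hypothetical model)."""
    pc: int = 0
    return_stack: list = field(default_factory=list)
    # 32 architectural registers per set is an assumption, not from the slide.
    arch_regs: list = field(default_factory=lambda: [0] * 32)


@dataclass
class SmtCore:
    """One pool of functional units plus one HardwareContext per thread."""
    num_threads: int = 8   # 8 sets of architectural registers
    int_units: int = 6     # 6 integer units
    fp_units: int = 4      # 4 FP units
    contexts: list = field(default_factory=list, init=False)

    def __post_init__(self):
        # Replicate the per-thread state; the functional units are not copied.
        self.contexts = [HardwareContext() for _ in range(self.num_threads)]


core = SmtCore()
core.contexts[0].pc = 0x1000
core.contexts[1].pc = 0x2000
```

Each thread advances its own PC and register set, while every thread draws on the same `int_units` and `fp_units`.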
Replication (Contd)
Per-thread mechanisms for:
  Pipeline flushing
  Instruction retirement
  Trapping
  Precise interrupts
Thread identifier in BTB, TLB
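To show why a thread identifier in the TLB matters, here is a minimal sketch of a lookup structure whose entries are tagged with a thread ID. The class `TaggedTlb`, its method names, and the page numbers are hypothetical; real hardware uses a fixed-size associative array rather than a dictionary.

```python
class TaggedTlb:
    """Toy TLB whose entries are tagged with a hardware thread ID."""

    def __init__(self):
        # Key: (thread_id, virtual_page) -> value: physical_page
        self._entries = {}

    def insert(self, thread_id, virtual_page, physical_page):
        self._entries[(thread_id, virtual_page)] = physical_page

    def lookup(self, thread_id, virtual_page):
        # A hit requires both the virtual page and the thread tag to match.
        return self._entries.get((thread_id, virtual_page))

    def flush_thread(self, thread_id):
        # Per-thread flush: entries of other threads are left untouched.
        self._entries = {
            key: page for key, page in self._entries.items()
            if key[0] != thread_id
        }


tlb = TaggedTlb()
tlb.insert(0, 0x40, 0x1A0)   # thread 0 maps virtual page 0x40
tlb.insert(1, 0x40, 0x2B0)   # thread 1 maps the same virtual page elsewhere
```

With the tag in place, two threads can use the same virtual page number without seeing each other's translation, and flushing one thread leaves the other thread's entries valid.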
Inter-thread Interference
Increases with the number of threads: 1.4% (2 threads), 4.8% (4), 5.3% (8)
Does not hurt much: 0.1% performance degradation
Why?
  L1 misses covered by L2 misses
  Out-of-order execution, write buffer, multithreading
Memory Requirement Increases with Number of Threads
Memory requirement doubles as the number of threads goes from 1 to 8
Mostly for L1
Bank conflict
Multiple threads
Long L1 cache line
Longer cache line has better locality
Overall performance degrades by 3.4%
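The bank-conflict point can be illustrated with a small sketch that counts how many same-cycle accesses land in an already occupied L1 bank. The 64-byte line, the 8 banks, the line-interleaved mapping, and the function names are all assumptions chosen for the example; the slides do not give the cache geometry.

```python
from collections import Counter


def bank_of(address, line_bytes=64, num_banks=8):
    """Map an address to a bank by interleaving cache lines across banks."""
    return (address // line_bytes) % num_banks


def count_bank_conflicts(addresses, line_bytes=64, num_banks=8):
    """Count same-cycle accesses that must wait because their bank is busy."""
    per_bank = Counter(bank_of(a, line_bytes, num_banks) for a in addresses)
    # In each bank the first access proceeds; every further access conflicts.
    return sum(count - 1 for count in per_bank.values())


# Four accesses issued in the same cycle, e.g. by different threads.
same_cycle = [0x0000, 0x0200, 0x0040, 0x0400]
conflicts = count_bank_conflicts(same_cycle)
```

In this model, more threads issuing in the same cycle means more addresses per call, and a longer line means more addresses share each line, so both tend to raise the conflict count.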