Published by Randell Charles. Modified over 9 years ago.
1
A few issues on the design of future multicores
André Seznec, IRISA/INRIA
2
André Seznec, CAPS project-team, Irisa-Inria
Single-chip uniprocessor: the end of the road
- (Very) wide-issue superscalar processors are not cost-effective:
  - More than quadratic complexity on many key components:
    - Register file
    - Bypass network
    - Issue logic
  - Limited performance return
- Failure of EV8 = end of very wide-issue superscalar processors
3
Hardware thread parallelism
- High-end single-chip components:
  - Chip multiprocessors: IBM Power 5, dual-core Intel Pentium 4, dual-core Athlon 64
  - Many CMP SoCs for embedded markets
  - Cell
- (Simultaneous) multithreading: Pentium 4, Power 5
4
Thread parallelism
- Expressed by the application developer:
  - Depends on the application itself
  - Depends on the programming language or paradigm
  - Depends on the programmer
- Discovered by the compiler: automatic (static) parallelization
- Exploited by the runtime: task scheduling
- Dynamically discovered/exploited by hardware or software: speculative hardware/software threading
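As an illustration of the first and third items (parallelism expressed by the developer, then exploited by a runtime scheduler), here is a minimal Python sketch. The function names `partial_sum` and `parallel_sum_of_squares`, the chunk size and the worker count are all hypothetical choices, not taken from the slides. In CPython, threads show the structure of the approach rather than deliver a speedup on this CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    """Independent task: the developer guarantees chunks do not interact."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(values, workers=4, chunk_size=1000):
    """The developer expresses the parallelism (one task per chunk);
    the executor's scheduler decides which worker runs which task."""
    chunks = [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(10_000))))
```

The granularity (here `chunk_size`) is the developer's decision: too small and scheduling overhead dominates, too large and some workers stay idle.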
5
Direction of (single-chip) architecture: betting on the success of parallelism
- (Future) applications are intrinsically parallel: as many simple cores as possible
- (Future) applications are moderately parallel: a few complex, state-of-the-art superscalar cores
- SSC: Sea of Simple Cores
- FCC: Few Complex Cores
6
SSC: Sea of Simple Cores
7
FCC: Few Complex Cores
[Figure: several 4-way out-of-order (O-O-O) superscalar cores sharing an L3 cache]
8
Common architectural design issues
9
Instruction Set Architecture
- Single ISA?
  - Extension of "conventional" multiprocessors
  - Shared or distributed memory?
- Heterogeneous ISAs:
  - À la Cell? (master processor + slave processors) x N
  - À la SoC? Specialized coprocessors
- Radically new architecture? Which one?
10
Hardware accelerators?
- SIMD extensions: seem to be accepted; they shift the burden to application developers and compilers
- Reconfigurable datapaths: popular when the application is well defined and intrinsically parallel
- Vector extensions: might be the right move when targeting essentially scientific computing
11
On-chip memory/processors/memory bandwidth
- The uniprocessor credo was: "Use the remaining silicon for caches"
- New issue: an extra processor or more cache?
  - Extra processing power = increased memory bandwidth demand, increased power consumption, more temperature hot spots
  - Extra cache = decreased (external) memory demand
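The processor-versus-cache trade-off can be made concrete with a rough model. The sketch below is not from the slides: it assumes the common square-root rule of thumb (miss rate falls roughly as the inverse square root of cache size), and every number in it (base miss rate, access rate, line size) is a hypothetical placeholder.

```python
import math


def offchip_bandwidth(cores, cache_mb,
                      base_miss_rate=0.02,    # assumed miss rate with base_cache_mb
                      base_cache_mb=1.0,      # assumed reference cache size
                      accesses_per_sec=1e9,   # assumed per-core memory access rate
                      line_bytes=64):         # assumed cache line size
    """Estimated external memory traffic in bytes per second."""
    # Rule-of-thumb assumption: miss rate ~ 1 / sqrt(cache size).
    miss_rate = base_miss_rate * math.sqrt(base_cache_mb / cache_mb)
    return cores * accesses_per_sec * miss_rate * line_bytes


if __name__ == "__main__":
    for cores, cache_mb in [(4, 4.0), (8, 4.0), (4, 16.0)]:
        gb_s = offchip_bandwidth(cores, cache_mb) / 1e9
        print(f"{cores} cores, {cache_mb:.0f} MB cache: {gb_s:.2f} GB/s")
```

Under these assumptions, doubling the core count doubles external traffic, while it takes four times the cache to halve it, which is why the choice is not obvious.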
12
Memory hierarchy organization?
13
Flat: sharing a big L2/L3 cache?
[Figure: an array of processors (μP), each with a private cache ($), all sharing one L3 cache]
14
Flat: communication issues? Through the big cache
[Figure: the same array of processors (μP) with private caches ($), communicating through the shared L3 cache]
15
Flat: communication issues? Grid-like?
[Figure: the same array of processors (μP) with private caches ($) and a shared L3 cache, with grid-like interconnection]
16
Hierarchical organization?
[Figure: groups of processors (μP) with private caches ($) share L2 caches, which in turn share an L3 cache]
17
Hierarchical organization?
- Arbitration at all levels
- Coherency at all levels
- Interleaving at all levels
- Bandwidth dimensioning
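To illustrate what interleaving means at one level of the hierarchy, the following sketch maps addresses to cache banks using the low-order bits of the line address. The line size and bank count are assumptions chosen for the example, and `bank_of` is a hypothetical helper; real designs often hash more address bits to avoid the conflict shown at the end.

```python
LINE_BYTES = 64  # assumed cache line size
NUM_BANKS = 8    # assumed number of banks


def bank_of(address, line_bytes=LINE_BYTES, num_banks=NUM_BANKS):
    """Bank selected by the low-order bits of the cache line address."""
    return (address // line_bytes) % num_banks


if __name__ == "__main__":
    # Consecutive lines spread over all banks, so sequential streams
    # from several cores can be served in parallel.
    print([bank_of(i * LINE_BYTES) for i in range(8)])
    # A stride of NUM_BANKS lines always hits the same bank: every access
    # then competes for one arbiter, which is where bandwidth dimensioning matters.
    print({bank_of(i * NUM_BANKS * LINE_BYTES) for i in range(16)})
```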
18
NoC structure
- Very dependent on the memory hierarchy organization!
- + sharing coprocessors/hardware accelerators
- + I/O buses (processors?)
- + memory interface
- + network interface
19
Example
[Figure: processors (μP) with private caches ($) grouped under L2 caches, a shared L3 cache, a memory interface and I/O]
20
Multithreading?
- An extra level of thread parallelism!
- Might be an interesting alternative to prefetching on massively parallel applications
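Why multithreading can stand in for prefetching is easiest to see with a simple latency-hiding model. This sketch is an idealization, not the author's analysis: each thread is assumed to compute for `compute_cycles`, then stall for `miss_latency` cycles on a cache miss, with free thread switching. Both parameter names and the numbers are hypothetical.

```python
def core_utilization(threads, compute_cycles, miss_latency):
    """Fraction of cycles doing useful work when other threads
    run while one thread waits on memory (idealized model)."""
    busy = threads * compute_cycles
    period = compute_cycles + miss_latency
    return min(1.0, busy / period)


def threads_to_hide(compute_cycles, miss_latency):
    """Smallest thread count that keeps the core fully busy."""
    period = compute_cycles + miss_latency
    return -(-period // compute_cycles)  # ceiling division


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 10):
        print(n, core_utilization(n, compute_cycles=20, miss_latency=180))
    print("threads needed:", threads_to_hide(20, 180))
```

The model only helps when the application supplies enough threads, which matches the slide's restriction to massively parallel applications.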
21
Power and thermal issues
- Voltage/frequency scaling to adapt to the workload?
- Adapting the workload to the available power?
- Adapting/dimensioning the architecture to the power budget
- Activity migration for managing temperatures?
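The appeal of voltage/frequency scaling follows from the usual first-order model of dynamic power, P ≈ C·V²·f. The sketch below applies that model under two stated simplifications that are not from the slides: supply voltage is assumed to scale linearly with frequency, and static (leakage) power is ignored.

```python
def dynamic_power(capacitance, voltage, frequency):
    """First-order dynamic power model: P = C * V^2 * f."""
    return capacitance * voltage ** 2 * frequency


def power_ratio(scale):
    """Power relative to nominal when V and f are both multiplied by scale."""
    return dynamic_power(1.0, scale, scale) / dynamic_power(1.0, 1.0, 1.0)


def energy_ratio(scale):
    """Energy for a fixed amount of work: run time grows by 1 / scale."""
    return power_ratio(scale) / scale


if __name__ == "__main__":
    for s in (1.0, 0.8, 0.5):
        print(f"scale {s}: power x{power_ratio(s):.3f}, energy x{energy_ratio(s):.3f}")
```

Under these assumptions power falls with the cube of the scaling factor, so two cores at reduced frequency can fit in the power budget of one core at full speed, provided the workload is parallel.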
22
General issues for software/compiler
- Parallelism detection and partitioning: find the correct granularity
- Mastering memory bandwidth
- Non-uniform memory latency
- Optimizing sequential code portions
23
SSC design specificities
24
Basic core granularity
- RISC cores
- VLIW cores
- In-order superscalar cores
25
Homogeneous vs. heterogeneous ISAs
- Core specialization: RISC + VLIW or DSP slaves?
- Master core + a set of special-purpose cores?
26
Sharing issue
- Simple cores: a lot of duplication, and many resources unused at any given time
- Adjacent cores can share:
  - Caches
  - Functional units: FP, mult/div, multimedia
  - Hardware accelerators
27
An example of sharing
[Figure: two clusters of simple cores (μP), each cluster with a shared FP unit, data L1 cache (DL1 $), instruction fetch unit and instruction L1 cache (IL1 $); a hardware accelerator and an L2 cache are shared between the clusters]
28
Multithreading/prefetching
- Multithreading: is the extra complexity worth it for simple cores?
- Prefetching: is it worth it? Sharing prefetch engines?
29
Vision of an SSC (my own vision)
30
SSC: the basic brick
[Figure: four clusters of simple cores (μP), each cluster with an FP unit, a data cache (D $) and an instruction cache (I $), sharing one L2 cache]
31
[Figure: full SSC chip. Four basic bricks, each an L2 cache shared by four clusters of cores (μP) with FP unit, D $ and I $, around a shared L3 cache, with a memory interface, a network interface and a system interface]
32
FCC design specificities
33
Only limited available thread parallelism?
- Focus on uniprocessor architecture:
  - Find the correct tradeoff between complexity and performance
  - Power and temperature issues
- Vector extensions?
  - Contiguous vectors (à la SSE)?
  - Strided vectors in L2 caches (Tarantula-like)?
34
Performance enablers
- SMT for parallel workloads?
- Helper threads? Run-ahead threads
- Speculative multithreading hardware support
35
Intermediate design?
- SSCs:
  - Shine on massively parallel applications
  - Poor/limited performance on sequential sections
- FCCs:
  - Moderate performance on parallel applications
  - Good performance on sequential sections
36
Amdahl's law
- Mix of FCC and SSC
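Amdahl's law makes the case for the mix: the sequential fraction runs on one core, so its speed bounds the whole application. The sketch below compares the three designs under purely hypothetical numbers (a complex core assumed to be 2x faster than a simple core at 8x the area, and a chip budget of 64 simple cores). None of these figures come from the slides.

```python
def speedup(parallel_fraction, seq_perf, par_throughput):
    """Amdahl's law, relative to one simple core of performance 1.0.

    seq_perf: performance of the core running the sequential part.
    par_throughput: aggregate throughput available to the parallel part.
    """
    sequential_time = (1.0 - parallel_fraction) / seq_perf
    parallel_time = parallel_fraction / par_throughput
    return 1.0 / (sequential_time + parallel_time)


COMPLEX_PERF = 2.0  # assumed: complex core is 2x faster on sequential code
COMPLEX_COST = 8    # assumed: complex core costs the area of 8 simple cores
BUDGET = 64         # assumed: chip area, in simple-core units


def ssc(p):
    """Sea of Simple Cores: every unit of area is a simple core."""
    return speedup(p, 1.0, BUDGET)


def fcc(p):
    """Few Complex Cores: the whole budget spent on complex cores."""
    cores = BUDGET // COMPLEX_COST
    return speedup(p, COMPLEX_PERF, cores * COMPLEX_PERF)


def mixed(p):
    """One complex core, simple cores in the remaining area."""
    simple_cores = BUDGET - COMPLEX_COST
    return speedup(p, COMPLEX_PERF, simple_cores + COMPLEX_PERF)


if __name__ == "__main__":
    for p in (0.5, 0.9, 0.99, 1.0):
        print(f"p={p}: SSC {ssc(p):.1f}  FCC {fcc(p):.1f}  mixed {mixed(p):.1f}")
```

With these assumed numbers the mixed design beats both pure designs at a 90% parallel fraction, and the pure SSC only wins when the application is almost entirely parallel.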
37
The basic brick
[Figure: an L2 cache shared by two clusters of simple cores (μP, each cluster with FP unit, D $ and I $) and one "ultimate" out-of-order superscalar core]
38
[Figure: full mixed chip. Four basic bricks, each with an L2 cache, simple-core clusters (D $, I $) and one "ultimate" out-of-order (Ult. O-O-O) core, around a shared L3 cache, with a memory interface, a network interface and a system interface]
39
Conclusion
- The era of the uniprocessor has come to an end
- No clear trend for what comes next
- Might be time for more architectural diversity