Published by Darin Nasby. Modified over 9 years ago.
1
System Simulation of 1000-Core Heterogeneous SoCs. Shivani Raghav, Embedded System Laboratory (ESL), Ecole Polytechnique Federale de Lausanne (EPFL)
2
ESL Work on Energy-Aware Datacenter Design: System Simulation for Many-Core
3
Emerging Data-Intensive Workloads: Cloud Servers; Molecular Dynamics; Monte Carlo Simulations; Gene Sequencing; Online Gaming Services; Financial Simulations; Medical Imaging
4
Demand for Hardware Acceleration: Tile-based Manycores: Intel SCC, TILE64 (integrated); GPU Clusters (off-chip accelerators); Hybrid Cores: AMD Fusion (on-chip)
5
Urgent Need for Simulation of Heterogeneous SoCs. Simulation is needed for: Thermal & Power Evaluations; Benchmarking; Profiling; Debugging; Design Space Exploration; Early Software Development
6
How to Design a Fast and Scalable Many-Core Simulator? Parallel Target; Parallel Simulator; Parallel Host
7
Simulating Parallel Target on Parallel Host is an Old Technology… (figure labels: FPGA; GPGPU; Flexus; RAMP; Opportunity; WWT II; Graphite; Cotson, OVPSim; Large Parallel Systems)
8
Target Architecture: Data-Parallel Coprocessors; Simple In-order Cores; 1000s of cores in a tile network; Fine-grain parallelism (figure labels: Core; Caches; Memory; Switch)
9
Solution – Accelerating Simulation using GPGPUs. Target Architecture and Host Platform: A Perfect Match
10
Outline Problem Overview Simulation of Heterogeneous SoCs Solution SIMinG-1k (GPU accelerated simulator) Evaluation Summary
11
Outline Problem Overview Simulation of Heterogeneous SoCs Solution SIMinG-1k: A GPU accelerated simulator Evaluation Summary
12
Overall Simulation Framework (figure labels: Host Platform; Sequential Code; Data Parallel Code; Simulator; Target Architecture; General Purpose CPU; General Purpose CPU; Many-Core Accelerator; Application)
13
SIMinG-1k - Features: Instruction Accurate; Inexpensive and Easily Available; Fast Development Cycle; Equation Performance Model; Portability (Target Independent); Interpretation-based core simulation
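To make "interpretation-based" and "instruction accurate" concrete, here is a minimal sketch of an interpreter loop that advances many simulated cores in lockstep. It is not SIMinG-1k's code: the toy four-instruction ISA, the `Core` class, and the `step` and `run` helpers are hypothetical names chosen for illustration, and the mapping of one simulated core to one GPU thread is an assumption.

```python
from dataclasses import dataclass, field

# Hypothetical toy ISA (not the ARM ISA used in the talk).
# Each instruction is a tuple (opcode, a, b, c).

@dataclass
class Core:
    regs: list = field(default_factory=lambda: [0] * 4)
    pc: int = 0
    halted: bool = False
    retired: int = 0  # an instruction-accurate model counts instructions, not cycles


def step(core, program):
    """Fetch, decode, and execute one instruction on one simulated core."""
    if core.halted:
        return
    op, a, b, c = program[core.pc]
    next_pc = core.pc + 1
    if op == "ADD":        # regs[a] = regs[b] + regs[c]
        core.regs[a] = core.regs[b] + core.regs[c]
    elif op == "ADDI":     # regs[a] = regs[b] + immediate c
        core.regs[a] = core.regs[b] + c
    elif op == "BNEZ":     # branch by offset b if regs[a] != 0
        if core.regs[a] != 0:
            next_pc = core.pc + b
    elif op == "HALT":
        core.halted = True
    else:
        raise ValueError(f"unknown opcode {op!r}")
    core.pc = next_pc
    core.retired += 1


def run(cores, program):
    """Advance all cores in lockstep until every core has halted."""
    while not all(core.halted for core in cores):
        # Assumption: on a GPU host, each loop iteration would be one thread.
        for core in cores:
            step(core, program)


# Each core sums n + (n-1) + ... + 1, where n is preloaded into r0.
PROGRAM = [
    ("ADD", 1, 1, 0),    # 0: r1 += r0
    ("ADDI", 0, 0, -1),  # 1: r0 -= 1
    ("BNEZ", 0, -2, 0),  # 2: loop back to 0 while r0 != 0
    ("HALT", 0, 0, 0),   # 3: stop this core
]

cores = [Core() for _ in range(4)]
for i, core in enumerate(cores):
    core.regs[0] = i + 1
run(cores, PROGRAM)
print([core.regs[1] for core in cores])   # [1, 3, 6, 10]
print([core.retired for core in cores])   # [4, 7, 10, 13]
```

Because the cores run different trip counts, they finish at different times. That is the kind of per-core control-flow variation that the next slide lists as a challenge on a SIMT host.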
14
Challenges of using a GPU as a host: SIMT (single instruction, multiple threads): divergent code is a problem; Synchronization outside a thread block; Slow CPU-GPU communication; Global memory is slow and limited
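Why divergent code hurts on a SIMT host can be illustrated with a deliberately simplified cost model. The sketch assumes that a group of lockstep threads needs one serialized pass per distinct opcode its threads want to execute in a step. Real GPUs diverge on control-flow paths rather than on simulated opcodes, so this is only an approximation; `warp_passes` and `simt_efficiency` are hypothetical helper names.

```python
def warp_passes(opcodes_per_step):
    """Count serialized passes for one group of lockstep threads.

    opcodes_per_step: a list of steps; each step lists the opcode that
    each thread (one simulated core per thread) wants to execute.
    Assumption: one pass per distinct opcode in a step.
    """
    return sum(len(set(step)) for step in opcodes_per_step)


def simt_efficiency(opcodes_per_step):
    """Ratio of ideal passes (one per step) to passes actually needed."""
    passes = warp_passes(opcodes_per_step)
    if passes == 0:
        return 1.0  # nothing to execute, nothing wasted
    return len(opcodes_per_step) / passes


# All simulated cores execute the same instruction stream: no divergence.
uniform = [["ADD"] * 4, ["LOAD"] * 4]

# Cores disagree in the first step: three different opcodes, three passes.
divergent = [["ADD", "LOAD", "ADD", "BRANCH"], ["LOAD"] * 4]

print(simt_efficiency(uniform))    # 1.0
print(simt_efficiency(divergent))  # 0.5
```

Under this model, data-parallel target code, where all cores tend to execute the same instruction, keeps the efficiency close to 1.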
15
Outline Problem Overview Simulation of Heterogeneous SoCs Solution SIMinG-1k (GPU accelerated simulator) Evaluation Summary
16
Results – Architecture 1. MIPS: millions of simulated instructions per second of host wall-clock time. Target: single tile of the accelerator; ARM ISA; Instruction Scratchpad; Data Scratchpad
17
Speed Up – Architecture 1 Speedup compared to simulation on OVPSim (thousands of ARM cores)
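The two metrics used in the evaluation slides, simulated MIPS and speedup over a reference simulator, can be written down as follows. This is a sketch of the standard definitions. The function names are hypothetical and the numbers in the usage lines are made up for illustration, not results from the talk.

```python
def simulated_mips(instructions, wall_seconds):
    """Millions of simulated instructions per second of host wall-clock time."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return instructions / wall_seconds / 1e6


def speedup(reference_seconds, accelerated_seconds):
    """How many times faster the accelerated run is than the reference run."""
    if accelerated_seconds <= 0:
        raise ValueError("accelerated_seconds must be positive")
    return reference_seconds / accelerated_seconds


# Illustrative, made-up numbers: 2 billion simulated instructions in 4 s.
print(simulated_mips(2_000_000_000, 4.0))  # 500.0

# Illustrative, made-up numbers: reference simulator 120 s, GPU simulator 4 s.
print(speedup(120.0, 4.0))                 # 30.0
```

For a many-core target, the instruction count is the total over all simulated cores, so MIPS can keep rising with core count as long as the host has spare parallelism.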
18
Results – Architecture 2. Target: single tile of the data-parallel accelerator (cores, caches, on-chip interconnect) (figure labels: Core; Caches; Memory; Switch)
19
Speed Up – Architecture 2 Speedup compared to serial simulation on QEMU
20
Outline Problem Overview Simulation of Heterogeneous SoCs Solution SIMinG-1k (GPU accelerated simulator) Evaluation Summary
21
Conclusion. Challenge: Fast and parallel simulator for heterogeneous SoCs. Solution: Parallelize 1000-core simulation using GPUs. Design: Full-system simulation using QEMU and SIMinG-1k. Results: High scalability and speedup up to 4096 cores. Future Work: Extend the simulator for thermal and power evaluations; complete simulation of cloud data centers
22
Thanks! Questions?