SCHOOL OF ELECTRICAL AND COMPUTER ENGINEERING | SCHOOL OF COMPUTER SCIENCE | GEORGIA INSTITUTE OF TECHNOLOGY MANIFOLD What can Manifold Enable? Manifold.

Presentation transcript:

Slide 1: What can Manifold Enable?
Manifold enables cross-disciplinary evaluations:
- Applications, power, thermal, and cooling
- Multi-scale simulation: cycle-level to functional
- Tradeoff studies
[Figure labels: Performance, Reliability, Energy/Power; Large Graphs]

Slide 2: Some Example Simulators
- Power capping studies
- Reliability studies
- Workload-cooling interaction

Slide 3: Power Capping: Simulation Model
[Figure label: Power Targets]
- Controller gain is adjusted every 5 ms.
- Each core has its own power budget; the model has two out-of-order (OOO) and two in-order (IO) cores.

Slide 4: Power Capping Controller
- A high fixed-gain controller over-reacts to high-power cores, whereas a low fixed-gain controller is slow to react to low-power cores.
[Figure label: New set point]
N. Almoosa, W. Song, Y. Wardi, and S. Yalamanchili, "A Power Capping Controller for Multicore Processors," American Control Conf., June
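The fixed-gain behavior described on this slide can be illustrated with a toy model. The sketch below is not the controller from the cited paper and is not Manifold code; it assumes a hypothetical linear plant (power proportional to frequency) and a simple integral update, and every constant in it is made up for illustration. Under those assumptions the error shrinks by a factor of (1 - gain * k) per control period, so one gain cannot suit cores with very different power slopes k.

```python
# Toy model: all names and constants are illustrative assumptions.
# Plant: core power is proportional to frequency, P = k * f.
# Controller: f <- f + gain * (target - P), applied once per control period.

def simulate(k, gain, target, f0=1.0, steps=20):
    """Return the power trace of one core under a fixed-gain integral controller."""
    f = f0
    trace = []
    for _ in range(steps):
        power = k * f                  # power observed in this control period
        trace.append(power)
        f += gain * (target - power)   # fixed-gain correction toward the budget
    return trace

def overshoots(trace, target):
    """True if the trace ever crosses to the other side of the target."""
    start_above = trace[0] > target
    if start_above:
        return any(p < target for p in trace)
    return any(p > target for p in trace)

HIGH_POWER_K = 20.0  # hypothetical W/GHz slope of a high-power core
LOW_POWER_K = 2.0    # hypothetical W/GHz slope of a low-power core
TARGET = 10.0        # hypothetical per-core power budget in watts

# High gain on the high-power core: error factor is 1 - 0.08 * 20 = -0.6,
# so the power swings past the budget on every step before settling.
hp_high = simulate(HIGH_POWER_K, gain=0.08, target=TARGET)

# Low gain on the low-power core: error factor is 1 - 0.01 * 2 = 0.98,
# so the power creeps toward the budget very slowly.
lp_low = simulate(LOW_POWER_K, gain=0.01, target=TARGET)

print("high-power core, high gain:", [round(p, 2) for p in hp_high[:5]])
print("low-power core, low gain:  ", [round(p, 2) for p in lp_low[:5]])
```

Adjusting the gain at run time, as the simulation model does every 5 ms, is one way to avoid committing to a single value of this factor for all cores.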

Slide 5: Throughput Regulation: Adaptive
- A high fixed-gain controller over-reacts to high-power cores, whereas a low fixed-gain controller is slow to react to low-power cores.
N. Almoosa, W. Song, Y. Wardi, and S. Yalamanchili, "Throughput Regulation on Multicore Processors via IPA," 2012 IEEE 51st Annual Conference on Decision and Control (CDC)

Slide 6: Adaptation to Aging and Reliability
[Figure: 64-core asymmetric processor floor plan]
[Figure: Failure probability comparison between per-core race-to-idle executions (left) and continuous low-voltage executions (right)]
Transient race-to-idle executions vs. continuous executions
LVF: Low Voltage Frequency
HVF: High Voltage Frequency
NVF: Nominal Voltage Frequency

Slide 7: Workload-Cooling Interaction
[Figure: floorplan of 16 symmetric cores, each with FE, SCH, DL1, INT, and FPU blocks; dimension labels 2.1mm x 2.1mm and 8.4mm x 8.4mm; stack labels CORE DIE, MICROFLUIDICS, SRAM]
Nehalem-like, OoO cores; 3GHz, 1.0V, max temp 100 °C
DL1: 128KB, 4096 sets, 64B
IL1: 32KB, 256 sets, 32B, 4 cycles
L2 & Network Cache Layer: L2 (per core): 2MB, 4096 sets, 128B, 35 cycles
DRAM: 1GB, 50ns access time (for performance model)
Ambient: Temperature: 300K
Thermal Grids: 50x50
Sampling Period: 1us
Steady-State Analysis

Coolant/Configuration         A         B         C
Flow rate (ml/min)            7         42        84
Top Heat Coeff (W/um^2-K)     2.05e-8   5.71e-8   8.01e-8
Bot. Heat Coeff (W/um^2-K)    1.69e-8   4.72e-8   6.63e-8
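As a quick check on the coolant table, the sketch below reads the flow rates as 7, 42, and 84 ml/min for configurations A, B, and C (the reading that matches the 6X and 12X flow-rate increases quoted on the following slide) and converts the heat transfer coefficients from W/um^2-K to W/m^2-K. The dictionary layout and function names are assumptions made for this example only.

```python
# Values transcribed from the coolant/configuration table on the slide.
# The data structure and helper names are illustrative, not part of Manifold.
configs = {
    "A": {"flow_ml_min": 7,  "top_h": 2.05e-8, "bot_h": 1.69e-8},
    "B": {"flow_ml_min": 42, "top_h": 5.71e-8, "bot_h": 4.72e-8},
    "C": {"flow_ml_min": 84, "top_h": 8.01e-8, "bot_h": 6.63e-8},
}

UM2_PER_M2 = 1e12  # one square metre is 1e12 square micrometres

def flow_ratio(name, base="A"):
    """Flow rate of a configuration relative to the base configuration."""
    return configs[name]["flow_ml_min"] / configs[base]["flow_ml_min"]

def to_si(h_per_um2):
    """Convert a heat transfer coefficient from W/um^2-K to W/m^2-K."""
    return h_per_um2 * UM2_PER_M2

for name, cfg in configs.items():
    print(
        f"{name}: {flow_ratio(name):.0f}x flow, "
        f"top h = {to_si(cfg['top_h']):.0f} W/m^2-K, "
        f"bottom h = {to_si(cfg['bot_h']):.0f} W/m^2-K"
    )
```

The coefficients grow more slowly than the flow rate: going from A to C multiplies the flow by 12 but the top coefficient by a little under 4.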

Slide 8: Impact of Flow Rate & Workload on Energy Efficiency
- Memory-bound applications benefit more than computation-bound applications.
- Overall energy improvement:
  4.9%-17.1% over a 12X increase in flow rate
  4.0%-14.1% over a 6X increase in flow rate
- Does not include pumping power.

Slide 9: 3D Stacked ICs Structure Model
[Figure panels: 3D stacked ICs structure; Simplified structure; Conduction FE model and temperature results]
Effective heat transfer coefficient is obtained by FE model on the left:
h_eff = 562.4 W/m^2*K
Z. Wan et al., IEEE Therminic 2013, Berlin, September 2013 (accepted)
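To give a feel for what an effective heat transfer coefficient of this size means, the sketch below applies Newton's law of cooling, q = h * A * dT. Only the value of h_eff comes from the slide. The surface area and the temperature difference are hypothetical numbers chosen for illustration, and the slide does not say that the FE model uses this relation in this form.

```python
# Newton's law of cooling: q = h * A * dT.
# Only H_EFF is taken from the slide; the area and temperature difference
# below are hypothetical values chosen for this example.

H_EFF = 562.4  # W/m^2-K, effective heat transfer coefficient from the slide

def heat_removed_watts(h, area_m2, delta_t_kelvin):
    """Heat flow in watts across a surface with coefficient h."""
    return h * area_m2 * delta_t_kelvin

# Hypothetical cooled surface: 8.4 mm x 8.4 mm.
area = 8.4e-3 * 8.4e-3  # m^2

# Hypothetical surface-to-coolant temperature difference.
delta_t = 50.0  # K

q = heat_removed_watts(H_EFF, area, delta_t)
print(f"heat removed: {q:.2f} W")
```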

Slide 10: Case Study with Different Microgap Configurations
Microgap configurations:
- Configuration 1: One microgap
- Configuration 2: Two microgaps
Pump power: 0.03 W
[Figure: Temperature results for one microgap, logic tier at bottom and memory tier on the top]

Results for different cases (L: logic tier, M: memory tier)
Configuration   Microgaps   Top   Bottom   T_max,logic (℃)   T_max,memory (℃)
Case 1          1           M     L
Case 2          1           L     M
Case 3          2           M     L
Case 4          2           L     M

Slide 11: Summary
The goal is not to provide a simulator, but to provide:
- A composable simulation infrastructure for constructing multicore simulators
- A base library of components to build useful simulators
[Figure labels: Novel Cooling Technology; Thermal Field Modeling; Power Distr. Network; Power Management; μarchitecture; Algorithms; Microarchitecture and Workload Execution; Power Dissipation; Thermal Coupling and Cooling; Degradation and Recovery]