My major mantra: We Must Break the Layers. Algorithm Program ISA (Instruction Set Arch)‏ Microarchitecture Circuits Problem Electrons.

Slides:



Advertisements
Similar presentations
The Interaction of Simultaneous Multithreading processors and the Memory Hierarchy: some early observations James Bulpin Computer Laboratory University.
Advertisements

The Implications of Multi-core. What I want to do today Given that everyone is heralding Multi-core –Is it really the Holy Grail? –Will it cure cancer?
Data Marshaling for Multi-Core Architectures M. Aater Suleman Onur Mutlu Jose A. Joao Khubaib Yale N. Patt.
Avenues for Research The Microarchitecture of Future Microprocessors.
Microprocessor Performance, Phase II Yale Patt The University of Texas at Austin STAMATIS Symposium TU-Delft September 28, 2007.
Microarchitecture is Dead AND We must have a multicore interface that does not require programmers to understand what is going on underneath.
Microprocessor Microarchitecture Multithreading Lynn Choi School of Electrical Engineering.
Yale Patt The University of Texas at Austin World University Presidents’ Symposium University of Belgrade April 4, 2009 Future Microprocessors: What must.
Yale Patt The University of Texas at Austin Universidade de Brasilia Brasilia, DF -- Brazil August 12, 2009 Future Microprocessors: Multi-core, Multi-nonsense,
CS2100 Computer Organisation Introduction to Computer Organisation (AY2014/2015)
Some Opportunities and Obstacles in Cross-Layer and Cross-Component (Power) Management Onur Mutlu NSF CPOM Workshop, 2/10/2012.
Understanding a Problem in Multicore and How to Solve It
Computer Design – Introduction 1 MAMAS – Computer Architecture Dr. Lihu Rappoport Some of the slides were taken from Avi Mendelson, Randi Katz,
Instruction Level Parallelism (ILP) Colin Stevens.
Multithreading and Dataflow Architectures CPSC 321 Andreas Klappenecker.
Current Trends in CMP/CMT Processors Wei Hsu 7/26/2006.
Dr. Gheith Abandah, Chair Computer Engineering Department The University of Jordan 20/4/20091.
Mechanisms: Run-time and Compile-time. Outline Agents of Evolution Run-time Branch prediction Trace Cache SMT, SSMT The memory problem (L2 misses) Compile-time.
BLDC MOTOR SPEED CONTROL USING EMBEDDED PROCESSOR
Single-Chip Multi-Processors (CMP) PRADEEP DANDAMUDI 1 ELEC , Fall 08.
Utility-Based Acceleration of Multithreaded Applications on Asymmetric CMPs José A. Joao * M. Aater Suleman * Onur Mutlu ‡ Yale N. Patt * * HPS Research.
MorphCore: An Energy-Efficient Architecture for High-Performance ILP and High-Throughput TLP Khubaib * M. Aater Suleman *+ Milad Hashemi * Chris Wilkerson.
CCSE251 Introduction to Computer Organization
Intro to Digital Technology HARDWARE CONCEPTS. IT-IDT-4 Identify, describe, evaluate, select, and use appropriate technology. IT-IDT-5 Understand, communicate,
CBSSS 2002: DeHon Architecture as Interface André DeHon Friday, June 21, 2002.
Compiler BE Panel IDC HPC User Forum April 2009 Don Kretsch Director, Sun Developer Tools Sun Microsystems.
Multi-Core Architectures
1 Multi-core processors 12/1/09. 2 Multiprocessors inside a single chip It is now possible to implement multiple processors (cores) inside a single chip.
Lecture 1: Performance EEN 312: Processors: Hardware, Software, and Interfacing Department of Electrical and Computer Engineering Spring 2013, Dr. Rozier.
1 Computer Architecture Research Overview Rajeev Balasubramonian School of Computing, University of Utah
Yale Patt The University of Texas at Austin University of California, Irvine March 2, 2012 High Performance in the Multi-core Era: The role of the Transformation.
C OMPUTER O RGANIZATION AND D ESIGN The Hardware/Software Interface 5 th Edition Chapter 1 Computer Abstractions and Technology Sections 1.5 – 1.11.
Nicolas Tjioe CSE 520 Wednesday 11/12/2008 Hyper-Threading in NetBurst Microarchitecture David Koufaty Deborah T. Marr Intel Published by the IEEE Computer.
CAPS project-team Compilation et Architectures pour Processeurs Superscalaires et Spécialisés.
Performance of mathematical software Agner Fog Technical University of Denmark
Parallelism: A Serious Goal or a Silly Mantra (some half-thought-out ideas)
Outline  Over view  Design  Performance  Advantages and disadvantages  Examples  Conclusion  Bibliography.
Chapter 1 Computer Abstractions and Technology. Chapter 1 — Computer Abstractions and Technology — 2 The Computer Revolution Progress in computer technology.
A few issues on the design of future multicores André Seznec IRISA/INRIA.
Department of Industrial Engineering Sharif University of Technology Session# 6.
Lecture 2: Computer Architecture: A Science ofTradeoffs.
Lev Finkelstein ISCA/Thermal Workshop 6/ Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)
Processor Level Parallelism. Improving the Pipeline Pipelined processor – Ideal speedup = num stages – Branches / conflicts mean limited returns after.
Computer Architecture: Multithreading (I) Prof. Onur Mutlu Carnegie Mellon University.
DR. SIMING LIU SPRING 2016 COMPUTER SCIENCE AND ENGINEERING UNIVERSITY OF NEVADA, RENO CS 219 Computer Organization.
Multi-core, Mega-nonsense. Will multicore cure cancer? Given that multicore is a reality –…and we have quickly jumped from one core to 2 to 4 to 8 –It.
Computer Architecture: Multi-Core Processors: Why? Prof. Onur Mutlu Carnegie Mellon University.
“Processors” issues for LQCD January 2009 André Seznec IRISA/INRIA.
15-740/ Computer Architecture Lecture 2: ISA, Tradeoffs, Performance Prof. Onur Mutlu Carnegie Mellon University.
MAHARANA PRATAP COLLEGE OF TECHNOLOGY SEMINAR ON- COMPUTER PROCESSOR SUBJECT CODE: CS-307 Branch-CSE Sem- 3 rd SUBMITTED TO SUBMITTED BY.
Spring 2003CSE P5481 WaveScalar and the WaveCache Steven Swanson Ken Michelson Mark Oskin Tom Anderson Susan Eggers University of Washington.
Fall 2012 Parallel Computer Architecture Lecture 4: Multi-Core Processors Prof. Onur Mutlu Carnegie Mellon University 9/14/2012.
Lecture 5 Approaches to Concurrency: The Multiprocessor
15-740/ Computer Architecture Lecture 3: Performance
Lynn Choi School of Electrical Engineering
Multi-core processors
CS161 – Design and Architecture of Computer Systems
5.2 Eleven Advanced Optimizations of Cache Performance
CS 286 Computer Organization and Architecture
CS703 - Advanced Operating Systems
Hyperthreading Technology
Computer Architecture: Multithreading (I)
The University of Texas at Austin
Introduction, Focus, Overview
Hardware Components & Software Concepts
The University of Texas at Austin
Computer Architecture: A Science of Tradeoffs
Introduction, Focus, Overview
Operating System Overview
Spring’19 Prof. Eric Rotenberg
Presentation transcript:

My major mantra: We Must Break the Layers

Algorithm Program ISA (Instruction Set Arch)‏ Microarchitecture Circuits Problem Electrons

IF we break the layers: Compiler, Microarchitecture –Multiple levels of cache –Block-structured ISA –Part by compiler, part by uarch –Fast track, slow track Algorithm, Compiler, Microarchitecture –X + superscalar – the Refrigerator –Niagara X / Pentium Y Microarchitecture, Circuits –Verification Hooks –Internal fault tolerance

…and More intelligently design our core (ILP is not dead ) Provide more than one interface Re-visit the Run-time system

Morphcore High performance out-of-order for serial code SMT in-order when throughput dominates Mickey mouse when neither is present Who decides which mode to operate in?

More than one interface Those who need the utmost in performance –and are willing to trade ease of use for extra performance –For them, we need to provide a more powerful engine (Heavyweight cores for ILP, Accelerators for specific tasks ) Those who just need to get their job done –and are content with trading performance for ease of use –For them, we need a software layer to bridge the multi-core That means –Chip designers understanding what algorithms need –Algorithms folks understanding what hardware provides –Knowledgeable Software folks bridging the two interfaces

The Run-time System: What? Part of the Operating System? Part of the Chip Design?

The Run-time System: Why? MorphCore (Khubaib, IEEE/ACM Micro 2012) Adaptive Processors, (Miftakhudinov, Micro 2012) Shared resources (Ebrahimi, IEEE/ACM ISCA 2012) Which thread to speed up (Joao, ISCA 2013)

The Run-time System: How?

Algorithm Program ISA (Instruction Set Arch)‏ Microarchitecture Circuits Problem Electrons

The Run-time System: How? Programmers insert pragmas in their code Compilers use it to insert procedure calls Serious on-chip monitoring provides inputs Run-time system decides how to utilize on-chip resources