Dataflow Model of Computation (From Dataflow to Multithreading)

Slides:



Advertisements
Similar presentations
Multiprocessors— Large vs. Small Scale Multiprocessors— Large vs. Small Scale.
Advertisements

Instruction-Level Parallel Processors {Objective: executing two or more instructions in parallel} 4.1 Evolution and overview of ILP-processors 4.2 Dependencies.
Multithreading processors Adapted from Bhuyan, Patterson, Eggers, probably others.
Microprocessor Microarchitecture Multithreading Lynn Choi School of Electrical Engineering.
ΜP rocessor Architectures To : Eng. Ahmad Hassan By: Group 18.
DATAFLOW ARHITEKTURE. Dataflow Processors - Motivation In basic processor pipelining hazards limit performance –Structural hazards –Data hazards due to.
CPEG F-Topic-3-II1 Topic 3 -- II: System Software Fundamentals: Multithreaded Execution Models, Virtual Machines and Memory Models Guang R. Gao.
University of Michigan Electrical Engineering and Computer Science 1 Reducing Control Power in CGRAs with Token Flow Hyunchul Park, Yongjun Park, and Scott.
CSCI 8150 Advanced Computer Architecture Hwang, Chapter 2 Program and Network Properties 2.3 Program Flow Mechanisms.
Midterm Wednesday Chapter 1-3: Number /character representation and conversion Number arithmetic Combinational logic elements and design (DeMorgan’s Law)
Multithreading and Dataflow Architectures CPSC 321 Andreas Klappenecker.
1 Multi Threaded Architectures Sima, Fountain and Kacsuk Chapter 16 CSE462.
Multithreaded ASC Kevin Schaffer and Robert A. Walker ASC Processor Group Computer Science Department Kent State University.
ELEC Fall 05 1 Very- Long Instruction Word (VLIW) Computer Architecture Fan Wang Department of Electrical and Computer Engineering Auburn.
6/26/2015\Seminar\Spain From EARTH to HTMT: The Evolution of a Multithreaded Architecture Model Guang R. Gao Computer Architecture & Parallel Systems.
Topic ? Course Overview. Guidelines Questions are rated by stars –One Star Question  Easy. Small definition, examples or generic formulas –Two Stars.
The Vector-Thread Architecture Ronny Krashinsky, Chris Batten, Krste Asanović Computer Architecture Group MIT Laboratory for Computer Science
High Performance Architectures Dataflow Part 3. 2 Dataflow Processors Recall from Basic Processor Pipelining: Hazards limit performance  Structural hazards.
Very Long Instruction Word (VLIW) Architecture. VLIW Machine It consists of many functional units connected to a large central register file Each functional.
Computer Architecture Dataflow Machines. Data Flow Conventional programming models are control driven Instruction sequence is precisely specified Sequence.
1 Chapter 1 Parallel Machines and Computations (Fundamentals of Parallel Processing) Dr. Ranette Halverson.
1 Multithreaded Architectures Lecture 3 of 4 Supercomputing ’93 Tutorial Friday, November 19, 1993 Portland, Oregon Rishiyur S. Nikhil Digital Equipment.
PARALLEL APPLICATIONS EE 524/CS 561 Kishore Dhaveji 01/09/2000.
Spring 2003CSE P5481 Midterm Philosophy What the exam looks like. Definitions, comparisons, advantages & disadvantages what is it? how does it work? why.
1 Chapter 2 Dataflow Processors. 2 Dataflow processors Recall from basic processor pipelining: Hazards limit performance. – Structural hazards – Data.
Different parallel processing architectures Module 2.
Chapter 7 Dataflow Architecture. 7.1 Dataflow Models  A dataflow program: the sequence of operations is not specified, but depends upon the need and.
Anshul Kumar, CSE IITD Other Architectures & Examples Multithreaded architectures Dataflow architectures Multiprocessor examples 1 st May, 2006.
Pipelining and Parallelism Mark Staveley
Computer Architecture: Multithreading (I) Prof. Onur Mutlu Carnegie Mellon University.
Computer Architecture: Out-of-Order Execution II
C H A P T E R E L E V E N Concurrent Programming Programming Languages – Principles and Paradigms by Allen Tucker, Robert Noonan.
December 5, 2006http://csg.csail.mit.edu/6.827/L22-1 Dynamic Dataflow Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of.
Advanced Computer Architecture pg 1 Embedded Computer Architecture 5SAI0 Chip Multi-Processors (ch 8) Henk Corporaal
Computer Architecture Dataflow Machines Sunset over Lifou, New Caledonia.
RISC / CISC Architecture by Derek Ng. Overview CISC Architecture RISC Architecture  Pipelining RISC vs CISC.
Autumn 2006CSE P548 - Dataflow Machines1 Von Neumann Execution Model Fetch: send PC to memory transfer instruction from memory to CPU increment PC Decode.
Computer Architecture: Dataflow (Part I) Prof. Onur Mutlu Carnegie Mellon University.
Processor Level Parallelism 1
COMP 740: Computer Architecture and Implementation
Electrical and Computer Engineering
Dataflow Machines CMPE 511
Advanced Topics in Concurrency and Reactive Programming: Asynchronous Programming Majeed Kassis.
Prof. Onur Mutlu Carnegie Mellon University
Prof. Onur Mutlu Carnegie Mellon University
Simultaneous Multithreading
Fall 2012 Parallel Computer Architecture Lecture 22: Dataflow I
CIS-550 Advanced Computer Architecture Lecture 10: Precise Exceptions
Scalable Processor Design
Embedded Computer Architecture 5SAI0 Chip Multi-Processors (ch 8)
/ Computer Architecture and Design
Prof. Onur Mutlu Carnegie Mellon University Spring 2014, 2/21/2014
Computer Architecture: Multithreading (I)
Topic A Dataflow Model of Computation
Toward a Unified HPC and Big Data Runtime
Levels of Parallelism within a Single Processor
Hardware Multithreading
Parallel Architectures
Samira Khan University of Virginia Sep 5, 2018
Mattan Erez The University of Texas at Austin
Fine-grained vs Coarse-grained multithreading
Topic B (Cont’d) Dataflow Model of Computation
/ Computer Architecture and Design
Embedded Computer Architecture 5SAI0 Chip Multi-Processors (ch 8)
The Vector-Thread Architecture
Levels of Parallelism within a Single Processor
Design of Digital Circuits Lecture 19a: VLIW
Samira Khan University of Virginia Jan 23, 2019
Prof. Onur Mutlu Carnegie Mellon University
Prof. Onur Mutlu Carnegie Mellon University
Presentation transcript:

Dataflow Model of Computation (From Dataflow to Multithreading) Guang R. Gao ACM Fellow and IEEE Fellow Endowed Distinguished Professor Electrical & Computer Engineering University of Delaware ggao@capsl.udel.edu 652-11S-Topic-Gao-Dataflow

Fine-Grain non-preemptive thread- CPU Memory Executor Locus A Single Thread Coarse-Grain thread- The family home model Unit Executor Locus CPU Memory Fine-Grain non-preemptive thread- The “hotel” model Thread Unit A Pool Thread Coarse-Grain vs. Fine-Grain Multithreading [Gao: invited talk at Fran Allen’s Retirement Workshop, 07/2002] 652-11S-Topic-Gao-Dataflow

Evolution of Multithreaded Execution and Architecture Models CHoPP’77 CHoPP’87 Non-dataflow based MASA Halstead 1986 Alwife Agarwal 1989-96 Eldorado HEP B. Smith 1978 Tera B. Smith 1990- CDC 6600 1964 CASCADE Flynn’s Processor 1969 Cosmic Cube Seiltz 1985 J-Machine Dally 1988-93 M-Machine Dally 1994-98 Others: Multiscalar (1994), SMT (1995), etc. Monsoon Papadopoulos & Culler 1988 Dataflow model inspired P-RISC Nikhil & Arvind 1989 *T/Start-NG MIT/Motorola 1991- MIT TTDA Arvind 1980 Cilk Leiserson LAU Syre 1976 Iannuci’s 1988-92 TAM Culler 1990 Manchester Gurd & Watson 1982 SIGMA-I Shimada 1988 EM-5/4/X RWC-1 1992-97 Static Dataflow Dennis 1972 MIT Arg-Fetching Dataflow DennisGao 1987-88 MDFA Gao 1989-93 MTA HumTheobald Gao 94 EARTH CARE PACT95’, ISCA96, Theobald99 Marquez04 652-11S-Topic-Gao-Dataflow

The Von Neumann-type Processing begin for i = 1 … … endfor end Compiler Sequential Machine Representation Source Code CPU Load Processor 652-11S-Topic-Gao-Dataflow

A Multithreaded Architecture To Other PE’s One PE 652-11S-Topic-Gao-Dataflow

McGill Data Flow Architecture Model (MDFA) 652-11S-Topic-Gao-Dataflow

652-11S-Topic-Gao-Dataflow n1 n1 store store store fetch fetch fetch n2 fetch n3 n2 n3 Argument –flow Principle Argument –fetching Principle 652-11S-Topic-Gao-Dataflow

A Dataflow Program Tuple Program Tuple = { P-Code . S-Code } S-Code P-Code 2 3 n1 a b n2 c d N1: x = a + b; N2: y = c – d; N3: z = x * y; IPU ISU 652-11S-Topic-Gao-Dataflow

The McGill Dataflow Architecture Model Pipelined Instruction Processing Unit (PIPU) Fire Done Dataflow Instruction Scheduling Unit (DISU) Enable Memory & Controller Signal Processing 652-11S-Topic-Gao-Dataflow

The McGill Dataflow Architecture Model Pipelined Instruction Processing Unit (PIPU) Important Features Fire Done Dataflow Instruction Scheduling Unit (DISU) Pipeline can be kept fully utilized provided that the program has sufficient parallelism = PC Enabled Instructions Waiting Instructions 652-11S-Topic-Gao-Dataflow

The Scheduling Memory (Enable) Dataflow Instruction Scheduling Unit (DISU) C O N T R L E 1 Signal Processing Fire Done Count Signal(s) 1 Enabled Instructions Waiting Instructions 652-11S-Topic-Gao-Dataflow

Advantages of the McGill Dataflow Architecture Model Eliminate unnecessary token copying and transmission overhead Instruction scheduling is separated from the main datapath of the processor (e.g. asynchronous, decoupled) 652-11S-Topic-Gao-Dataflow

Von Neumann Threads as Macro Dataflow Nodes 1 2 3 k A sequence of instructions is “packed” into a macro-dataflow node Synchronization is done at the macro-node level 652-11S-Topic-Gao-Dataflow

652-11S-Topic-Gao-Dataflow Hybrid Evaluation Von Neumann Style Instruction Execution” on the McGill Dataflow Architecture Group a “sequence” of dataflow instruction into a “thread” or a macro dataflow node. Data-driven synchronization among threads. “Von Neumann style sequencing” within a thread. Advantage: Preserves the parallelism among threads but avoids unnecessary fine-grain synchronization between instructions within a sequential thread. 652-11S-Topic-Gao-Dataflow

652-11S-Topic-Gao-Dataflow What Do We Get? A hybrid architecture model without sacrificing the advantage of fine-grain parallelism! (latency-hiding, pipelining support) 652-11S-Topic-Gao-Dataflow

A Realization of the Hybrid Evaluation Shortcut Pipelined Instruction Processing Unit (PIPU) Von Neumann bit 1 2 k Fire Done Dataflow Instruction Scheduling Unit (DISU) 652-11S-Topic-Gao-Dataflow

Case Studies – Dataflow Model Insired Multithreading McGill Dataflow Model (1988 - 1993) EARTH Model (1993 – mid 2000s ) The UHPC/Runnemede Model (2010 - ) 5/14/2019 421-10-F/Topic-3-II-FineGrain-Cases