Dynamic Dataflow
Arvind
Computer Science & Artificial Intelligence Lab
Massachusetts Institute of Technology
March 4, 2008
http://csg.csail.mit.edu/arvind

DF3-2: Static Dataflow: Problems/Limitations

Mismatch between the model and the implementation:
- The model requires unbounded FIFO token queues per arc, but the architecture provides storage for one token per arc.
- The architecture does not ensure FIFO order in the reuse of an operand slot.
- The merge operator has a unique firing rule.

The static model does not support:
- Function calls
- Data structures

There is no easy solution in the static framework; dynamic dataflow provided a framework for solutions.

DF3-3: Dynamic Dataflow Architectures

- Allocate instruction templates, i.e., a frame, dynamically to support each loop iteration and procedure call; termination detection is needed to deallocate frames.
- The code can be shared if we separate the code and the operand storage.
- A token carries a frame pointer and an instruction pointer.
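The token format sketched above can be made concrete; the following Python sketch is illustrative only, and the field names (fp, ip, port, value) are assumptions rather than the encoding of any particular machine.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Token:
    """A dynamic-dataflow token: a tag plus a data value.

    The tag is the pair <fp, ip>: fp names the operand-storage frame
    allocated for a particular loop iteration or procedure call, while
    ip names an instruction in the (shared) code. The port says whether
    the value is the left or right operand of that instruction.
    """
    fp: int        # frame pointer: which activation's operand storage
    ip: int        # instruction pointer into the shared code block
    port: str      # 'L' or 'R'
    value: object  # the data carried by the token

    @property
    def tag(self):
        # Tokens belong to the same instruction activation exactly
        # when their <fp, ip> tags are equal.
        return (self.fp, self.ip)

# The same instruction (ip=3) can be active in two different frames
# at once; the frame pointers keep the activations apart.
t1 = Token(fp=100, ip=3, port='L', value=7)
t2 = Token(fp=200, ip=3, port='L', value=8)
assert t1.tag != t2.tag
```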

DF3-4: A Frame in Dynamic Dataflow

[Figure: a program whose instructions carry destination lists (e.g., 3L, 4L) and the frame that holds their operands.]
Need to provide storage for only one operand per operator.

DF3-5: Data Structures in Dataflow

- Data structures reside in a structure store; tokens carry pointers.
- I-structures: write once, read multiple times, or allocate, write, read, ..., read, deallocate.
- No problem if a reader arrives before the writer at the memory location.
[Figure: processors issue I-fetch a and I-store a, v operations against a shared structure memory.]

DF3-6: I-Structure Storage: Split-phase Operations & Presence Bits

- An I-fetch token carries the address to be read and a forwarding address (the tag fp.ip of the destination instruction); the operation is split-phase.
- Need to deal with multiple deferred reads at the same location.
- Other operations: fetch/store, take/put, clear.
[Figure: I-structure memory locations with presence bits; deferred I-fetch requests are queued at an empty location and serviced when the value is written.]
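Below is a minimal sketch of how presence bits and deferred reads might be managed, assuming a three-state encoding (empty, deferred, full) per location; the class, method, and callback names are illustrative, not a hardware interface.

```python
class IStructure:
    """I-structure memory: every location carries presence bits.

    States: 'empty' (not yet written), 'deferred' (readers waiting),
    'full' (value written). I-fetch is split-phase: the request names a
    forwarding tag, and the reply token is sent whenever the value
    becomes available, so a read may safely arrive before the write.
    """
    def __init__(self, size, send):
        self.state = ['empty'] * size   # presence bits, one per location
        self.cell = [None] * size       # a value, or a list of deferred tags
        self.send = send                # send(tag, value): reply token to the network

    def i_fetch(self, addr, tag):
        if self.state[addr] == 'full':
            self.send(tag, self.cell[addr])     # value already present
        elif self.state[addr] == 'empty':
            self.state[addr] = 'deferred'
            self.cell[addr] = [tag]             # first deferred reader
        else:                                   # 'deferred'
            self.cell[addr].append(tag)         # queue another deferred reader

    def i_store(self, addr, value):
        assert self.state[addr] != 'full', "second write to an I-structure location"
        deferred = self.cell[addr] if self.state[addr] == 'deferred' else []
        self.state[addr], self.cell[addr] = 'full', value
        for tag in deferred:                    # satisfy all waiting readers
            self.send(tag, value)

# A reader arriving before the writer is simply deferred, not an error.
replies = []
mem = IStructure(8, send=lambda tag, v: replies.append((tag, v)))
mem.i_fetch(3, tag=('fp1', 'ip5'))   # deferred: location 3 is still empty
mem.i_store(3, 42)                   # the write releases the deferred read
assert replies == [(('fp1', 'ip5'), 42)]
```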

DF3-7: Dynamic Dataflow Graphs

DF3-8: Firing Rules

Input tokens must have identical tags. A tag consists of a context (fp), an instruction number (ip), and a port.
[Figure: a two-input + operator fires when tokens with identical tags are present on both its L and R ports.]
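The matching step can be sketched as wait/match against an operand slot keyed by the <fp, ip> tag: the first token waits, and the arrival of its partner with an identical tag enables the firing. The dictionary-based store and the helper names below are assumptions made for illustration.

```python
# Waiting-matching sketch: one operand slot per two-input instruction,
# keyed by the tag <fp, ip>.
waiting = {}   # (fp, ip) -> (port, value) of the token that arrived first

def arrive(fp, ip, port, value, fire):
    """Process one incoming token; call 'fire' once both operands are here."""
    tag = (fp, ip)
    if tag not in waiting:
        waiting[tag] = (port, value)             # first operand waits in its slot
        return
    other_port, other_value = waiting.pop(tag)   # partner found: the slot is freed
    left, right = (value, other_value) if port == 'L' else (other_value, value)
    fire(fp, ip, left, right)

# The '+' at ip=8 fires only when L and R tokens with the same frame
# pointer have both arrived; tokens from other frames do not interfere.
results = []
plus = lambda fp, ip, l, r: results.append((fp, ip, l + r))
arrive(fp=100, ip=8, port='L', value=3, fire=plus)   # waits
arrive(fp=200, ip=8, port='L', value=5, fire=plus)   # different frame: also waits
arrive(fp=100, ip=8, port='R', value=4, fire=plus)   # matches the first token
assert results == [(100, 8, 7)]
```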

DF3-9: Well Behaved Graphs

[Figure: a graph with inputs s1 ... sn and outputs t1 ... tm, all in the same context c.]
1. One-in-one-out property.
2. Self cleaning: initially the graph contains no tokens; when all the output tokens have been produced, no tokens remain in the graph.

DF3-10: Consequences of Tagging

- No need for a merge operator in well-behaved dataflow graphs.
- Even if f(x1) completes after g(x2), the output tokens will carry the "correct" tag.
[Figure: a conditional routing x1 to f and x2 to g through switches.]

DF3-11: Making Graphs Well Behaved

Sometimes extra inputs and outputs have to be provided to make graphs well behaved, e.g., a constant such as 3 is introduced into the graph by a trigger input.

DF3-12: Making Conditionals Well Behaved

All unconnected outputs of switches are collected to generate a signal.
[Figure: a conditional on inputs x, y with predicate p; the unconnected switch outputs feed a synchronization gate S that produces the signal.]

DF3-13: I-Structure Operations

I-store takes an address and a value and produces a signal; I-store won't be well behaved without a signal output.
[Figure: the I-store operator writing value v into I-structure storage and emitting a signal.]

DF3-14: Well Behaved Blocks

All the signals in a block or a procedure are collected via a tree of synchronizing gates to generate a single signal.
[Figure: a tree of synchronizing gates (S) combining the signal inputs into one signal.]

DF3-15: Procedure Linkage Operators

[Figure: the caller performs "get frame" and "extract tag"; change Tag 1 ... change Tag n send arguments a1 ... an into the graph for f, change Tag 0 passes the return tag, and the callee's final change Tag re-tags the result back into the caller's frame. Tokens in frame 0 (caller) and frame 1 (callee) coexist.]
Like standard call/return, but caller and callee can be active simultaneously.
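One way to picture the linkage is as re-tagging, sketched below under simplifying assumptions: obtain a fresh frame for the callee, re-tag each argument into that frame, and pass along a return tag so the result can be re-tagged back into the caller's frame. The slot numbering and all helper names are hypothetical, not the actual tagged-token encoding.

```python
import itertools

_fresh_fp = itertools.count(1000)   # stand-in for dynamic frame allocation

def call(caller_fp, return_ip, callee_entry_ip, args, emit):
    """Emit the tokens that link a caller to a callee.

    emit(fp, ip, port, value) stands for injecting a token into the
    machine. The callee receives the return tag plus one argument per
    entry slot; everything it receives is tagged with its own frame.
    """
    callee_fp = next(_fresh_fp)                   # "get frame"
    return_tag = (caller_fp, return_ip)           # "extract tag": where the result goes
    emit(callee_fp, callee_entry_ip, 'L', return_tag)    # return continuation
    for i, a in enumerate(args, start=1):
        emit(callee_fp, callee_entry_ip + i, 'L', a)     # argument a_i, re-tagged
    return callee_fp

def return_result(return_tag, value, emit):
    """The callee re-tags its result into the caller's frame."""
    caller_fp, return_ip = return_tag
    emit(caller_fp, return_ip, 'L', value)

# Caller frame 7 and the freshly allocated callee frame coexist, so
# caller and callee can be active at the same time.
tokens = []
emit = lambda fp, ip, port, v: tokens.append((fp, ip, port, v))
call(caller_fp=7, return_ip=42, callee_entry_ip=0, args=['a1', 'a2'], emit=emit)
return_result((7, 42), 'result', emit)
```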

DF3-16: Unbounded Loop Schema

[Figure: each iteration obtains a fresh context ("get context") and change-tag operators re-tag the circulating tokens into it; when the predicate p becomes false, results are re-tagged with the parent context and the iteration's context is released ("release context").]

DF3-17: Reusing Contexts

Can we tie the allocation of the context in iteration i to the release of the context in iteration i-k?
- Start with k contexts, c0, c1, ..., c(k-1).
- Bound the loop so that no more than k iterations can execute concurrently.
- At the end of iteration i, use the context from iteration (i+1) mod k.
- Release the contexts when the loop terminates.
[Figure: a ring of k contexts c0, c1, ..., c(k-1).]

DF3-18: Bounded Loops

- The parent allocates k frames and stores the linking information and the loop constants.
- The names of the previous frame and the parent frame are stored in all frames, but the next-frame slot of the last frame is left empty.
- When an iteration begins, it empties the next slot; when it completes, it fills the next slot of the previous frame.
[Figure: frames 0, 1, ..., k-1, each holding next, previous, and parent pointers plus the loop constants.]
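A sequential sketch of this bookkeeping follows, under one plausible reading of the protocol (a full next slot in frame j means that frame (j+1) mod k is free); the function and variable names are illustrative.

```python
def k_bounded_loop(k, body, inputs):
    """Sequential sketch of the next/previous bookkeeping of a k-bounded loop.

    Assumed reading of the protocol: frame j's 'next' slot stands for
    "frame (j+1) mod k is free". Starting the next iteration takes
    (empties) the slot of the current frame; completing an iteration
    puts (fills) the slot of its previous frame. In a real machine the
    take would block until the put from iteration i-k arrives, which is
    what bounds the loop to k concurrent iterations.
    """
    # The parent pre-fills every next slot except the last frame's,
    # since frame 0 is occupied by iteration 0 from the start.
    next_full = [j != k - 1 for j in range(k)]
    results, frame = [], 0                    # iteration 0 runs in frame 0
    for i, x in enumerate(inputs):
        results.append(body(x))               # run the loop body in 'frame'
        next_full[(frame - 1) % k] = True     # completion: fill previous frame's slot
        if i + 1 < len(inputs):
            assert next_full[frame], "frame for iteration i+1 not yet free"
            next_full[frame] = False          # take the slot: iteration i+1 begins
            frame = (frame + 1) % k
    return results

print(k_bounded_loop(3, lambda x: x * x, range(6)))   # [0, 1, 4, 9, 16, 25]
```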

DF3-19: Bounded Loop Schema

[Figure: the loop schema revised for bounding: "get context" is replaced by get "next" followed by extract "context", and "release context" is replaced by put "previous"; otherwise the schema is the same as the unbounded one.]

DF3-20: Next: Id/pH to Dataflow Graphs