1 Multithreaded Architectures Lecture 3 of 4 Supercomputing ’93 Tutorial Friday, November 19, 1993 Portland, Oregon Rishiyur S. Nikhil Digital Equipment.

Slides:



Advertisements
Similar presentations
Network II.5 simulator ..
Advertisements

Parallel Processing & Parallel Algorithm May 8, 2003 B4 Yuuki Horita.
Programming Languages and Paradigms
DATAFLOW ARHITEKTURE. Dataflow Processors - Motivation In basic processor pipelining hazards limit performance –Structural hazards –Data hazards due to.
Computer Organization. This module surveys the physical resources of a computer system. –Basic components CPUMemoryBus I/O devices –CPU structure Registers.
1 Lecture-2 CSIT-120 Spring 2001 Revision of Lecture-1 Introducing Computer Architecture The FOUR Main Elements Fetch-Execute Cycle A Look Under the Hood.
What's inside a router? We have yet to consider the switching function of a router - the actual transfer of datagrams from a router's incoming links to.
University of Michigan Electrical Engineering and Computer Science 1 Reducing Control Power in CGRAs with Token Flow Hyunchul Park, Yongjun Park, and Scott.
Contiki A Lightweight and Flexible Operating System for Tiny Networked Sensors Presented by: Jeremy Schiff.
Representing programs Goals. Representing programs Primary goals –analysis is easy and effective just a few cases to handle directly link related things.
CHAPTER 4 COMPUTER SYSTEM – Von Neumann Model
CSCI 8150 Advanced Computer Architecture Hwang, Chapter 2 Program and Network Properties 2.3 Program Flow Mechanisms.
(Page 554 – 564) Ping Perez CS 147 Summer 2001 Alternative Parallel Architectures  Dataflow  Systolic arrays  Neural networks.
1 Lecture-2 CS-120 Fall 2000 Revision of Lecture-1 Introducing Computer Architecture The FOUR Main Elements Fetch-Execute Cycle A Look Under the Hood.
Basic Computer Organization, CPU L1 Prof. Sin-Min Lee Department of Computer Science.
5 th Biennial Ptolemy Miniconference Berkeley, CA, May 9, 2003 The Component Interaction Domain: Modeling Event-Driven and Demand- Driven Applications.
ECE669 L23: Parallel Compilation April 29, 2004 ECE 669 Parallel Computer Architecture Lecture 23 Parallel Compilation.
Router Architectures An overview of router architectures.
1 DATAFLOW ARCHITECTURES JURIJ SILC, BORUT ROBIC, THEO UNGERER.
CMPE 511 DATA FLOW MACHINES Mustafa Emre ERKOÇ 11/12/2003.
CS321 Functional Programming 2 © JAS Implementation using the Data Flow Approach In a conventional control flow system a program is a set of operations.
Voicu Groza, 2008 SITE, HARDWARE/SOFTWARE CODESIGN OF EMBEDDED SYSTEMS Hardware/Software Codesign of Embedded Systems Voicu Groza SITE Hall, Room.
ECE 259 / CPS 221 Advanced Computer Architecture II (Parallel Computer Architecture) Novel Architectures Copyright 2004 Daniel J. Sorin Duke University.
High Performance Architectures Dataflow Part 3. 2 Dataflow Processors Recall from Basic Processor Pipelining: Hazards limit performance  Structural hazards.
Computer Architecture Dataflow Machines. Data Flow Conventional programming models are control driven Instruction sequence is precisely specified Sequence.
Implementation of a Stored Program Computer ITCS 3181 Logic and Computer Systems 2014 B. Wilkinson Slides2.ppt Modification date: Oct 16,
A Graph Based Algorithm for Data Path Optimization in Custom Processors J. Trajkovic, M. Reshadi, B. Gorjiara, D. Gajski Center for Embedded Computer Systems.
C. André, J. Boucaron, A. Coadou, J. DeAntoni,
Computer Organization and Architecture Tutorial 1 Kenneth Lee.
1 Chapter 2 Dataflow Processors. 2 Dataflow processors Recall from basic processor pipelining: Hazards limit performance. – Structural hazards – Data.
March 4, 2008 DF5-1http://csg.csail.mit.edu/arvind/ The Monsoon Project Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of.
Lecture 11: 10/1/2002CS170 Fall CS170 Computer Organization and Architecture I Ayman Abdel-Hamid Department of Computer Science Old Dominion University.
Different parallel processing architectures Module 2.
Chapter 7 Dataflow Architecture. 7.1 Dataflow Models  A dataflow program: the sequence of operations is not specified, but depends upon the need and.
Represents different voltage levels High: 5 Volts Low: 0 Volts At this raw level a digital computer is instructed to carry out instructions.
1.4 Representation of data in computer systems Instructions.
Computer Architecture: Out-of-Order Execution II
1 November 11, 2015 A Massively Parallel, Hybrid Dataflow/von Neumann Architecture Yoav Etsion November 11, 2015.
March 4, 2008http://csg.csail.mit.edu/arvindDF3-1 Dynamic Dataflow Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.
December 5, 2006http://csg.csail.mit.edu/6.827/L22-1 Dynamic Dataflow Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of.
18-WAN Technologies and Dynamic routing Dr. John P. Abraham Professor UTPA.
Computer Architecture Dataflow Machines Sunset over Lifou, New Caledonia.
1 3 Computing System Fundamentals 3.2 Computer Architecture.
Autumn 2006CSE P548 - Dataflow Machines1 Von Neumann Execution Model Fetch: send PC to memory transfer instruction from memory to CPU increment PC Decode.
Computer Operation. Binary Codes CPU operates in binary codes Representation of values in binary codes Instructions to CPU in binary codes Addresses in.
LECTURE 3 Translation. PROCESS MEMORY There are four general areas of memory in a process. The text area contains the instructions for the application.
Computer Architecture: Dataflow (Part I) Prof. Onur Mutlu Carnegie Mellon University.
Chapter 3 Part 3 Switching and Bridging
Dataflow Machines CMPE 511
Ch 13 WAN Technologies and Routing
Gunjeet Kaur Dronacharya Group of institutions
Prof. Onur Mutlu Carnegie Mellon University
Fall 2012 Parallel Computer Architecture Lecture 22: Dataflow I
18-WAN Technologies and Dynamic routing
About Instructions Based in part on material from Chapters 4 & 5 in Computer Architecture by Nicholas Carter PHY 201 (Blum)
CS 31006: Computer Networks – The Routers
Computer Organization & Compilation Process
Computer Organization “Central” Processing Unit (CPU)
Dataflow Models.
Packet Switch Architectures
Topic B (Cont’d) Dataflow Model of Computation
Chapter 3 Part 3 Switching and Bridging
Deadlock Detection for Distributed Process Networks
Dataflow Model of Computation (From Dataflow to Multithreading)
Computer Organization & Compilation Process
Samira Khan University of Virginia Jan 23, 2019
Advanced Computer Architecture Dataflow Processing
Packet Switch Architectures
Prof. Onur Mutlu Carnegie Mellon University
Prof. Onur Mutlu Carnegie Mellon University
Presentation transcript:

1 Multithreaded Architectures Lecture 3 of 4 Supercomputing ’93 Tutorial Friday, November 19, 1993 Portland, Oregon Rishiyur S. Nikhil Digital Equipment Corporation Cambridge Research Laboratory Copyright (c) 1992, 1993 Digital Equipment Corporation nikhil−su93−3

Supercomputing '93 Multithreaded Architectures Tutorial, Lecture 1 2 Overview of Lecture 3 Multithreading: a Dataflow Story Dataflow graphs as a machine language MIT Tagged TokenManchester Dataflow Architecture Dataflow ETL Sigma−1 Explicit Token Store Machines MIT/Motorola ETL EM−4 Monsoon P−RISC: "RISC−ified" dataflow MIT/Motorola *T nikhil−94

Supercomputing '93 Multithreaded Architectures Tutorial, Lecture 1 3 Dataflow Graphs as a Machine Language Code: Dataflow Graph (interconnection of instructions) Conceptually, "tokens" carrying values flow along the edges of the graph. Memories: Two major components Heap, for data structures (e.g., I−structure memory) "Stack" (contexts, frames,...) Values on tokens may be memory addresses Each instruction: Waits for tokens on all inputs (and possibly a memory synchronization condition) Consumes input tokens Computes output values based on input values Possibly reads/writes memory Produces tokens on outputs No further restriction on instruction ordering A common misconception: "Dataflow graphs and machines cannot express side effects, and they only implement functional languages" nikhil−95

Supercomputing '93 Multithreaded Architectures Tutorial, Lecture 1 4 Encoding Dataflow Graphs and Tokens ConceptualEncoding of graph Program memory: Op− op1op2 + op3op4 Encoding of token: A "packet" containing: code 109op1 113op op3 159op4 Destination(s) 120L 120R 141, 159L..., 120R Destination instruction address, Left/Right port 6.847Value Re−entrancy ("dynamic" dataflow): Each invocation of a function or loop iteration gets its own, unique, "Context" Tokens destined for same instruction in different invocations are distinguished by a context identifier 120R Destination instruction address, Left/Right port CtxtContext Identifier 6.847Value nikhil−96

Supercomputing '93 Multithreaded Architectures Tutorial, Lecture 1 5 MIT Tagged Token Dataflow Architecture Designed by Arvind et. al., MIT, early 1980’s Simulated; never built Global view: Processor Nodes (including local program and "stack" memory) Resource Interconnection Network Manager Nodes "I−Structure" Memory Nodes (global heap data memory) Resource Manager Nodes responsible for Function allocation (allocation of context identifiers) Heap allocation etc. Stack memory and heap memory: globally addressed nikhil−97

Supercomputing '93 Multithreaded Architectures Tutorial, Lecture 1 6 MIT Tagged Token Dataflow Architecture Processor Wait−Match Waiting token Unit memory (associative) InstructionProgram Fetch memory Token Queue Execute Op Form Tokens Output To networkFrom network Wait−Match Unit: Tokens for unary ops go straight through Tokens for binary ops: try to match incoming token and a waiting token with same instruction address and context id Success:Both tokens forwarded Fail:Incoming token −−> Waiting Token Mem, Bubble (no−op) forwarded nikhil−98

Supercomputing '93 Multithreaded Architectures Tutorial, Lecture 1 7 MIT Tagged Token Dataflow Architecture Processor Operation 120R,c, Waiting token Wait−Match Unit 120,c, 6.001,6.847 Instruction Fetch Token141,159L,c, +,6.001,6.847 Queue Execute Op 141,159L,c, Form Tokens 141,c, L,c, Output memory (associative) 120L,c, Program memory , 159L To networkFrom network Output Unit routes tokens: Back to local Token Queue To another Processor To heap memory based on the addresses on the token Tokens from network are placed in Token Queue nikhil−99