Samira Khan University of Virginia Jan 23, 2019


ADVANCED COMPUTER ARCHITECTURE
Fundamental Concepts: Computing Models
Samira Khan, University of Virginia, Jan 23, 2019
The content and concept of this course are adapted from CMU ECE 740

AGENDA
- Review from last lecture
- Fundamental concepts
  - Computing models
  - Data flow architecture

THE VON NEUMANN MODEL/ARCHITECTURE
Also called the stored program computer (instructions in memory). Two key properties:
- Stored program
  - Instructions are stored in a linear memory array
  - Memory is unified between instructions and data
  - The interpretation of a stored value depends on the control signals
- Sequential instruction processing
  - One instruction is processed (fetched, executed, and completed) at a time
  - The program counter (instruction pointer) identifies the current instruction
  - The program counter is advanced sequentially except for control transfer instructions
When is a value interpreted as an instruction?
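The two properties above can be made concrete with a small sketch. The four-instruction ISA below (OP_HALT, OP_LOADI, OP_ADD, OP_JNZ), the two-word instruction format, and the function name vn_run are all invented for illustration and do not come from the lecture. In this sketch, a memory word is treated as an instruction only when the program counter points at it during fetch; the same array also holds the data that OP_ADD reads.

```c
#include <assert.h>

/* Hypothetical 4-instruction ISA, invented for this sketch. */
enum { OP_HALT = 0, OP_LOADI = 1, OP_ADD = 2, OP_JNZ = 3 };

#define MEM_WORDS 64

/* Runs a program held in unified memory and returns the accumulator.
   Each instruction occupies two words: opcode, then operand. */
int vn_run(const int mem[MEM_WORDS], int max_steps) {
    int pc = 0;  /* program counter: identifies the current instruction */
    int acc = 0; /* single accumulator register */
    for (int step = 0; step < max_steps; step++) {
        assert(pc >= 0 && pc + 1 < MEM_WORDS);
        int opcode = mem[pc];      /* fetch: this word is read as an instruction */
        int operand = mem[pc + 1];
        pc += 2;                   /* sequential advance */
        switch (opcode) {
        case OP_LOADI:             /* acc = immediate */
            acc = operand;
            break;
        case OP_ADD:               /* acc += mem[operand]: same memory, read as data */
            assert(operand >= 0 && operand < MEM_WORDS);
            acc += mem[operand];
            break;
        case OP_JNZ:               /* control transfer: the only non-sequential PC update */
            if (acc != 0) pc = operand;
            break;
        case OP_HALT:
        default:
            return acc;
        }
    }
    return acc;
}
```

For example, the program {OP_LOADI, 5, OP_ADD, 10, OP_HALT, 0} with the value 7 stored at address 10 leaves 12 in the accumulator.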

THE DATA FLOW MODEL (OF A COMPUTER)
- Von Neumann model: an instruction is fetched and executed in control flow order
  - As specified by the instruction pointer
  - Sequential unless there is an explicit control flow instruction
- Dataflow model: an instruction is fetched and executed in data flow order
  - i.e., when its operands are ready
  - i.e., there is no instruction pointer
  - Instruction ordering is specified by data flow dependence
    - Each instruction specifies "who" should receive the result
    - An instruction can "fire" whenever all operands are received
  - Potentially many instructions can execute at the same time
    - Inherently more parallel

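VON NEUMANN VS DATAFLOW
Consider a Von Neumann program:
  v <= a + b;
  w <= b * 2;
  x <= v - w;
  y <= v + w;
  z <= x * y;
- What is the significance of the program order?
- What is the significance of the storage locations?
- Which model is more natural to you as a programmer?
[Figure: the sequential program next to its dataflow graph, in which inputs a and b feed a + node and a *2 node, their results feed a - node and a + node, and a final * node produces z]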

MORE ON DATA FLOW
- In a data flow machine, a program consists of data flow nodes
- A data flow node fires (is fetched and executed) when all its inputs are ready, i.e., when all inputs have tokens
- Data flow node and its ISA representation

DATA FLOW NODES

An Example

What does this model perform?
[Figure: dataflow graph revealed step by step, with nodes val = a ^ b; val != 0; val &= val - 1; dist = 0; dist++]

Hamming Distance

int hamming_distance(unsigned a, unsigned b) {
    int dist = 0;
    unsigned val = a ^ b;

    // Count the number of bits set
    while (val != 0) {
        // A bit is set, so increment the count and clear the bit
        dist++;
        val &= val - 1;
    }

    // Return the number of differing bits
    return dist;
}

Hamming Distance
The number of positions at which the corresponding symbols are different.
The Hamming distance between:
- "karolin" and "kathrin" is 3
- 1011101 and 1001001 is 2
- 2173896 and 2233796 is 3
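The definition above applies to any equal-length symbol sequences, not only to machine words. A minimal sketch for C strings follows; the function name hamming_distance_str and the choice to return -1 for unequal lengths are assumptions made for this example, not part of the lecture.

```c
#include <assert.h>
#include <string.h>

/* Counts the positions at which two equal-length strings differ.
   Returns -1 when the lengths differ (a convention chosen for this sketch). */
int hamming_distance_str(const char *s, const char *t) {
    assert(s != NULL && t != NULL);
    size_t n = strlen(s);
    if (strlen(t) != n) {
        return -1;
    }
    int dist = 0;
    for (size_t i = 0; i < n; i++) {
        if (s[i] != t[i]) {
            dist++; /* symbols differ at position i */
        }
    }
    return dist;
}
```

Calling it on the three pairs from the slide, written as strings, gives 3, 2, and 3.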

RICHARD HAMMING
- Best known for the Hamming code
- Won the Turing Award in 1968
- Was part of the Manhattan Project
- Worked at Bell Labs for 30 years
- "You and Your Research" is mainly his advice to other researchers
  - He gave the talk many times during his lifetime
  - http://www.cs.virginia.edu/~robins/YouAndYourResearch.html

HOW TO BUILD A DATAFLOW MACHINE?

Monsoon Dataflow Processor 1990

Review Set 2
Due Jan 30. Choose 2 from a set of four:
- Dennis and Misunas, "A Preliminary Architecture for a Basic Data Flow Processor," ISCA 1974.
- Arvind and Nikhil, "Executing a Program on the MIT Tagged-Token Dataflow Architecture," IEEE TC 1990.
- H. T. Kung, "Why Systolic Architectures?," IEEE Computer 1982.
- Annaratone et al., "Warp Architecture and Implementation," ISCA 1986.

Dataflow Graphs
{x = a + b; y = b * 7 in (x-y) * (x+y)}
[Figure: dataflow graph with inputs a and b and five numbered nodes: 1 is +, 2 is *7, 3 is -, 4 is +, 5 is *; x is the output of node 1 and y is the output of node 2]
- Values in dataflow graphs are represented as tokens
  - token < ip , p , v >: instruction ptr, port, data (for example, ip = 3, p = L)
- An operator executes when all its input tokens are present; copies of the result token are distributed to the destination operators
- No separate control flow

Control Flow vs. Data Flow

Static Dataflow
- Allows only one instance of a node to be enabled for firing
- A dataflow node is fired only when all of the tokens are available on its input arcs and no tokens exist on any of its output arcs
Dennis and Misunas, "A Preliminary Architecture for a Basic Data Flow Processor," ISCA 1974.

Static Dataflow Machine: Instruction Templates
[Figure: the dataflow graph with nodes 1 (+), 2 (*7), 3 (-), 4 (+), 5 (*), next to its instruction templates]

Node  Opcode  Operand 1  Operand 2  Destination 1  Destination 2
1     +                             3L             4L
2     *                             3R             4R
3     -                             5L
4     +                             5R
5     *                             out

Each operand slot has presence bits.
Each arc in the graph has an operand slot in the program.
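A minimal sketch of how such templates could be executed, assuming a single pass over the graph for {x = a + b; y = b * 7 in (x-y) * (x+y)}. The Template struct, the OUT and NONE markers, the 0-based node numbering, and the choice to preload the constant 7 into node 2's second operand slot are assumptions made for this example, not details taken from the Dennis and Misunas design.

```c
#include <assert.h>
#include <stdbool.h>

enum { N_NODES = 5, OUT = -1, NONE = -2 };
enum { LEFT = 0, RIGHT = 1 };

/* One instruction template: opcode, two operand slots with presence bits,
   and up to two destinations (node index and port). */
typedef struct {
    char op;
    int  operand[2];
    bool present[2];
    int  dest[2];
    int  dest_port[2];
} Template;

static int apply(char op, int l, int r) {
    switch (op) {
    case '+': return l + r;
    case '-': return l - r;
    default:  return l * r;
    }
}

/* Deliver a token into an operand slot; the slot must be empty because
   the static model stores at most one token per arc. */
static void send(Template t[], int node, int port, int v) {
    assert(!t[node].present[port]);
    t[node].operand[port] = v;
    t[node].present[port] = true;
}

/* Static firing rule: all inputs present and all destination slots empty. */
static bool can_fire(const Template t[], int i) {
    if (!(t[i].present[LEFT] && t[i].present[RIGHT])) return false;
    for (int d = 0; d < 2; d++) {
        int n = t[i].dest[d];
        if (n >= 0 && t[n].present[t[i].dest_port[d]]) return false;
    }
    return true;
}

int static_dataflow_eval(int a, int b) {
    /* Slide nodes 1..5 are indices 0..4 here. */
    Template t[N_NODES] = {
        {'+', {0, 0}, {false, false}, {2, 3},      {LEFT, LEFT}},
        {'*', {0, 7}, {false, true},  {2, 3},      {RIGHT, RIGHT}}, /* constant 7 preloaded */
        {'-', {0, 0}, {false, false}, {4, NONE},   {LEFT, LEFT}},
        {'+', {0, 0}, {false, false}, {4, NONE},   {RIGHT, RIGHT}},
        {'*', {0, 0}, {false, false}, {OUT, NONE}, {LEFT, LEFT}},
    };
    int result = 0;
    send(t, 0, LEFT, a);
    send(t, 0, RIGHT, b);
    send(t, 1, LEFT, b);

    bool fired = true;
    while (fired) { /* keep scanning until no node is enabled */
        fired = false;
        for (int i = 0; i < N_NODES; i++) {
            if (!can_fire(t, i)) continue;
            int v = apply(t[i].op, t[i].operand[LEFT], t[i].operand[RIGHT]);
            t[i].present[LEFT] = t[i].present[RIGHT] = false; /* consume inputs */
            for (int d = 0; d < 2; d++) {
                if (t[i].dest[d] == OUT) result = v;
                else if (t[i].dest[d] != NONE) send(t, t[i].dest[d], t[i].dest_port[d], v);
            }
            fired = true;
        }
    }
    return result;
}
```

No program counter appears anywhere: the order of execution falls out of the presence bits alone. The sketch evaluates the graph once; clearing the preloaded constant's presence bit on firing is a simplification that would need revisiting for repeated use.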

Static Dataflow Machine (Dennis+, ISCA 1974)
[Figure: a processor with a Receive unit, a store of instruction templates (fields: Op, dest1, dest2, p1, src1, p2, src2), several functional units (FU), and a Send unit that emits tokens <s1, p1, v1>, <s2, p2, v2>]
- Many such processors can be connected together
- Programs can be statically divided among the processors

Static Data Flow Machines
- Mismatch between the model and the implementation
  - The model requires unbounded FIFO token queues per arc, but the architecture provides storage for one token per arc
  - The architecture does not ensure FIFO order in the reuse of an operand slot
- The static model does not support
  - Reentrant code
    - Function calls
    - Loops
  - Data structures

Problems with Re-entrancy
- Assume this was in a loop
- Or in a function
- And operations took variable time to execute
- How do you ensure the tokens that match are of the same invocation?

Dynamic Dataflow Architectures
- Allocate instruction templates, i.e., a frame, dynamically to support each loop iteration and procedure call
  - Termination detection is needed to deallocate frames
- The code can be shared if we separate the code and the operand storage
- A token: <fp, ip, port, data>, where fp is the frame pointer and ip is the instruction pointer

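A Frame in Dynamic Dataflow
[Figure: the dataflow graph with nodes 1 (+), 2 (*7), 3 (-), 4 (+), 5 (*), the shared program, and one frame]

Program (shared code):
1  +  3L, 4L
2  *  3R, 4R
3  -  5L
4  +  5R
5  *  out

Frame: one slot per instruction (1 to 5), addressed by tokens of the form <fp, ip, p, v>. In the figure, slot 3 is marked L with the value 7.
Need to provide storage for only one operand per operator.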
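The matching step can be sketched as follows, assuming the same five-instruction graph and two invocations that are active at once. The Instr, Frame, and Queue types, the unary flag with a literal operand, and the fixed frame count are assumptions made for this example; they loosely follow the frame idea on the slide and are not the Monsoon implementation. The first operand to arrive for an instruction waits in that instruction's frame slot; the second one finds it, fires the instruction, and forms result tokens that carry the same frame pointer.

```c
#include <assert.h>
#include <stdbool.h>

enum { N_INSTR = 5, N_FRAMES = 2, QUEUE_CAP = 64, OUT = -1, NONE = -2 };
enum { LEFT = 0, RIGHT = 1 };

typedef struct { int fp, ip, port, data; } Token; /* <fp, ip, port, data> */

typedef struct {
    char op;
    bool unary;      /* true: the right operand is the literal below */
    int  literal;
    int  dest[2];
    int  dest_port[2];
} Instr;

/* Code is stored once and shared by every invocation (indices 0..4). */
static const Instr code[N_INSTR] = {
    {'+', false, 0, {2, 3},      {LEFT, LEFT}},
    {'*', true,  7, {2, 3},      {RIGHT, RIGHT}},
    {'-', false, 0, {4, NONE},   {LEFT, LEFT}},
    {'+', false, 0, {4, NONE},   {RIGHT, RIGHT}},
    {'*', false, 0, {OUT, NONE}, {LEFT, LEFT}},
};

/* Operand storage is per invocation: one waiting slot per instruction. */
typedef struct { int value[N_INSTR]; bool present[N_INSTR]; } Frame;

typedef struct { Token buf[QUEUE_CAP]; int head, tail; } Queue;

static void push(Queue *q, Token t) {
    assert(q->tail < QUEUE_CAP);
    q->buf[q->tail++] = t;
}

static int apply(char op, int l, int r) {
    switch (op) {
    case '+': return l + r;
    case '-': return l - r;
    default:  return l * r;
    }
}

/* Evaluates (x-y)*(x+y) with x = a+b, y = b*7 for N_FRAMES invocations. */
void dynamic_dataflow_eval(const int a[N_FRAMES], const int b[N_FRAMES],
                           int out[N_FRAMES]) {
    Frame frames[N_FRAMES] = {0};
    Queue q = {.head = 0, .tail = 0};
    for (int f = 0; f < N_FRAMES; f++) { /* inject inputs, tagged by frame */
        push(&q, (Token){f, 0, LEFT, a[f]});
        push(&q, (Token){f, 0, RIGHT, b[f]});
        push(&q, (Token){f, 1, LEFT, b[f]});
    }
    while (q.head < q.tail) {
        Token t = q.buf[q.head++];
        const Instr *in = &code[t.ip];
        Frame *fr = &frames[t.fp];
        int l, r;
        if (in->unary) {
            l = t.data;
            r = in->literal;
        } else if (!fr->present[t.ip]) {
            fr->value[t.ip] = t.data;   /* first operand: wait in the frame */
            fr->present[t.ip] = true;
            continue;
        } else {
            int partner = fr->value[t.ip]; /* second operand: match and fire */
            fr->present[t.ip] = false;
            l = (t.port == LEFT) ? t.data : partner;
            r = (t.port == LEFT) ? partner : t.data;
        }
        int v = apply(in->op, l, r);
        for (int d = 0; d < 2; d++) {
            if (in->dest[d] == OUT) out[t.fp] = v;
            else if (in->dest[d] != NONE)
                push(&q, (Token){t.fp, in->dest[d], in->dest_port[d], v});
        }
    }
}
```

Because every token carries its frame pointer, tokens from the two invocations can sit in the same queue without being matched against each other, which is the re-entrancy problem the static model could not handle.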

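Monsoon Processor (ISCA 1990)
[Figure: Monsoon pipeline with the components Instruction Fetch (reads op, r, d1, d2 from Code at address ip), Operand Fetch (accesses Frames at address fp+r), ALU, Form Token, Token Queue, and Network connections]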

Concept of Tagging Each invocation receives a separate tag

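Procedure Linkage Operators
[Figure: linkage graph for a call to f with arguments a1 ... an, using the operators get frame, extract tag, change Tag 0, change Tag 1, ..., change Tag n, and Fork; tokens in frame 0 are passed to the graph for f as tokens in frame 1, and results return through change Tag operators]
Like standard call/return, but caller and callee can be active simultaneously.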
