CPRE 583 Reconfigurable Computing
Lecture 8: Fri 10/30/2009 (System Architectures)
Instructor: Dr. Phillip Jones
Reconfigurable Computing Laboratory, Iowa State University, Ames, Iowa, USA

Overview
  Class Projects
  Common System Architectures

Project Grading Breakdown
  60% Final Project Demo
  30% Final Project Report
    30% of your project report grade will come from your 5 project updates (due Fridays at midnight)
  10% Final Project Presentation

Project Update
  The current state of your project write-up
    Even in the early stages of the project you should be able to write a rough draft of the Introduction and Motivation section
  The current state of your Final Presentation
  What things are working and not working
  What roadblocks you are running into

What you should learn
  Introduction to common System Architectures

Outline
  System Architectures
    Why are they useful?
    Examples

References
  [1] Scott Hauck, Andre DeHon, Reconfigurable Computing (2008), Chapter 5: Compute Models and System Architectures

System Architectures
  Compute Models: Help express the parallelism of an application
  System Architecture: How to organize application implementation

Efficient Application Implementation
  Compute model and system architecture should work together
  Both are a function of
    The nature of the application
      Required resources
      Required performance
    The nature of the target platform
      Resources available

Efficient Application Implementation (Image Processing)
  Platform 1 (Vector Processor)
    Compute Model: Data Parallel
    System Architecture: Vector
  Platform 2 (FPGA)
    Compute Model: Data Flow
    System Architecture: Streaming Data Flow
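
The contrast between the two platforms can be sketched in software. The example below is only an illustration, not taken from the lecture: the pixel operation (scale, add an offset, clamp to 255) and all function names are assumptions. The first function processes the whole image one operation at a time, in the data parallel / vector style; the second chains per-pixel operators so each pixel flows through the pipeline, in the streaming data flow style.

```python
def brighten_vector(pixels, gain, offset):
    """Data parallel style: apply one operation across the whole vector at a time."""
    scaled = [p * gain for p in pixels]            # one "vector multiply"
    return [min(255, s + offset) for s in scaled]  # one "vector add" with clamp


def scale(stream, gain):
    """Streaming operator: consumes one pixel, emits one scaled pixel."""
    for p in stream:
        yield p * gain


def add_offset(stream, offset):
    """Streaming operator: consumes one value, emits one clamped result."""
    for s in stream:
        yield min(255, s + offset)


def brighten_stream(pixels, gain, offset):
    """Streaming data flow style: operators are chained, pixels flow through."""
    return add_offset(scale(iter(pixels), gain), offset)


if __name__ == "__main__":
    image_row = [10, 100, 200]  # hypothetical pixel values
    print(brighten_vector(image_row, 2, 5))
    print(list(brighten_stream(image_row, 2, 5)))
```

Both forms compute the same result; what differs is how the work is organized, which is the point of matching the compute model and system architecture to the platform.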

Implementing Streaming Dataflow
  Data presence
    variable-length connections between operators
    data rates vary between operator implementations
    data rates varying between operators
  Datapath sharing
    not enough spatial resources to host entire graph
    balanced use of resources (e.g. operators)
    cyclic dependencies impacting efficiency
  Interconnect sharing
    Interconnects are becoming difficult to route
    Links between operators infrequently used
    High variability in operator data rates
  Streaming coprocessor
    Extreme resource constraints

Data Presence
  Figure: two multipliers (X) feed an adder (+). Each operator output carries a data_ready signal and is buffered by a FIFO, with stall signals for flow control.
  Flow control: term typically used in networking
  Increases flexibility in how the application can be implemented
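
The data presence scheme above can be sketched as a small cycle-by-cycle simulation. This is a minimal sketch under assumptions, not the lecture's implementation: the `Fifo` class, the `simulate` function, the FIFO depth, and the per-multiplier production periods are all hypothetical, and both input streams are assumed to have the same length. The adder fires only when both FIFOs report data_ready, and a multiplier holds its result when its FIFO asserts stall.

```python
from collections import deque


class Fifo:
    """Bounded FIFO: data_ready when non-empty, stall when full."""

    def __init__(self, depth):
        self.depth = depth
        self.items = deque()

    def data_ready(self):
        return len(self.items) > 0

    def stall(self):
        return len(self.items) >= self.depth

    def push(self, value):
        assert not self.stall(), "producer must respect stall"
        self.items.append(value)

    def pop(self):
        assert self.data_ready(), "consumer must check data_ready"
        return self.items.popleft()


def simulate(pairs_a, pairs_b, period_a=1, period_b=3, depth=2):
    """Two multipliers with different data rates feed one adder through FIFOs.

    Assumes len(pairs_a) == len(pairs_b). Returns (sums, stalled_attempts).
    """
    in_a, in_b = deque(pairs_a), deque(pairs_b)
    fifo_a, fifo_b = Fifo(depth), Fifo(depth)
    sums, stalled, cycle = [], 0, 0
    total = len(pairs_a)
    while len(sums) < total:
        # Adder: consume only when data is present on both inputs.
        if fifo_a.data_ready() and fifo_b.data_ready():
            sums.append(fifo_a.pop() + fifo_b.pop())
        # Multipliers: each tries to produce every `period` cycles.
        for src, fifo, period in ((in_a, fifo_a, period_a),
                                  (in_b, fifo_b, period_b)):
            if src and cycle % period == 0:
                if fifo.stall():
                    stalled += 1          # back-pressure: hold the operands
                else:
                    x, y = src.popleft()
                    fifo.push(x * y)
        cycle += 1
    return sums, stalled


if __name__ == "__main__":
    a = [(1, 2), (3, 4), (5, 6), (7, 8)]
    b = [(1, 1), (2, 2), (3, 3), (4, 4)]
    print(simulate(a, b))
```

Because the fast multiplier is throttled by stall rather than by a fixed schedule, the operators can run at different rates without losing or misaligning data.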

Datapath Sharing
  Platform may only have one multiplier
  Figure: the two multipliers (X) feeding the adder (+) are reduced to a single multiplier, with registers (REG) and an FSM added.
  Important to keep track of where data is coming from!
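
One way to picture datapath sharing is an FSM that reuses a single multiplier on successive cycles and parks each product in a register until the adder needs it. The sketch below assumes the computation is a*b + c*d; the state names, the `SharedMultiplier` class, and the function name are hypothetical and only illustrate the idea.

```python
class SharedMultiplier:
    """The platform's single multiplier; counts how often it is used."""

    def __init__(self):
        self.uses = 0

    def multiply(self, x, y):
        self.uses += 1
        return x * y


def shared_multiplier_sum(a, b, c, d):
    """Compute a*b + c*d with one multiplier, sequenced by a small FSM.

    Returns (result, cycles, multiplier_uses).
    """
    mult = SharedMultiplier()
    reg0 = reg1 = None      # REG: hold each product until the adder runs
    result = None
    state, cycles = "MUL0", 0
    while state != "DONE":
        cycles += 1
        if state == "MUL0":
            reg0 = mult.multiply(a, b)   # FSM routes (a, b) to the multiplier
            state = "MUL1"
        elif state == "MUL1":
            reg1 = mult.multiply(c, d)   # same multiplier, different operands
            state = "ADD"
        elif state == "ADD":
            result = reg0 + reg1         # adder reads both registers
            state = "DONE"
    return result, cycles, mult.uses


if __name__ == "__main__":
    print(shared_multiplier_sum(2, 3, 4, 5))
```

The FSM state is what keeps track of where each value is coming from: it decides which operands reach the multiplier and which register captures the product.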

Interconnect sharing
  Need more efficient use of interconnect
  Figure: two multipliers (X) and an adder (+), with an FSM added.
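
Interconnect sharing can be sketched as two producers taking turns on one link to the adder, with an FSM deciding who drives the link each cycle. The round-robin policy, the names below, and the assumption that both streams have the same length are illustrative choices, not details from the slides.

```python
from collections import deque


def shared_link_adder(stream_a, stream_b):
    """Deliver two product streams to an adder over one shared link.

    Assumes both streams have the same length. Returns (sums, transfers).
    """
    queues = {"A": deque(stream_a), "B": deque(stream_b)}
    latched = {"A": None, "B": None}   # adder-side input registers
    grant = "A"                        # FSM state: who owns the link this cycle
    sums, transfers = [], 0
    while queues["A"] or queues["B"]:
        # Only the granted source may drive the shared link.
        if queues[grant]:
            latched[grant] = queues[grant].popleft()
            transfers += 1
        # Adder fires once both of its inputs have arrived.
        if latched["A"] is not None and latched["B"] is not None:
            sums.append(latched["A"] + latched["B"])
            latched = {"A": None, "B": None}
        # Round-robin arbitration.
        grant = "B" if grant == "A" else "A"
    return sums, transfers


if __name__ == "__main__":
    print(shared_link_adder([1, 2, 3], [10, 20, 30]))
```

One link now carries every transfer, so each sum takes two link cycles instead of one; that trade is worthwhile when dedicated links would be hard to route or would sit idle most of the time.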

Streaming coprocessor

Sequential Control
  Typically thought of in the context of sequential programming on a processor (e.g. C, Java programming)
  Key to organizing synchronization and control over highly parallel operations
  Time multiplexing resources: when a task is too large for the computing fabric
  Increasing datapath utilization

Sequential Control
  Figure: a multiplier (X) and an adder (+) with inputs A, B, C, computing A*x^2 + B*x + C.

Finite State Machine with Datapath (FSMD)
  Figure: multipliers (X) and an adder (+) computing A*x^2 + B*x + C, with an FSM controlling the datapath.
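
An FSMD for the polynomial A*x^2 + B*x + C can be sketched as a state machine that steers one multiplier and one adder through a fixed sequence. The sketch below is an assumption-laden illustration: it uses the Horner form (A*x + B)*x + C, which may differ from the datapath drawn on the slide, and the state names and function name are hypothetical.

```python
def fsmd_quadratic(a, b, c, x):
    """Evaluate a*x^2 + b*x + c with one multiplier and one adder.

    The FSM sequences the datapath as (a*x + b)*x + c (Horner form).
    Returns (result, cycles).
    """
    acc = 0                 # accumulator register in the datapath
    state, cycles = "MUL_AX", 0
    while state != "DONE":
        cycles += 1
        if state == "MUL_AX":
            acc = a * x     # multiplier: a * x
            state = "ADD_B"
        elif state == "ADD_B":
            acc = acc + b   # adder: a*x + b
            state = "MUL_X"
        elif state == "MUL_X":
            acc = acc * x   # multiplier reused: (a*x + b) * x
            state = "ADD_C"
        elif state == "ADD_C":
            acc = acc + c   # adder reused: ... + c
            state = "DONE"
    return acc, cycles


if __name__ == "__main__":
    print(fsmd_quadratic(2, 3, 4, 5))
```

The datapath holds only two operators; the FSM supplies the sequencing that a fully spatial implementation would instead spend extra operators on.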

Sequential Control: Types
  Finite State Machine with Datapath (FSMD)
  Very Long Instruction Word (VLIW) datapath control
  Processor
  Instruction augmentation
  Phased reconfiguration manager
  Worker farm

Very Long Instruction Word (VLIW) Datapath Control

Processor

Instruction Augmentation

Phased Configuration Manager

Worker Farm

Bulk Synchronous Parallelism

Data Parallel
  Single Program Multiple Data
  Single Instruction Multiple Data (SIMD)
  Vector
  Vector Coprocessor

Data Parallel

Cellular Automata

Multi-threaded

Next Lecture
  Evolvable Hardware (Chapter 33)

Slides in Progress
  Need to revise this lecture with figures and useful animations
  Add some non-FPGA systems; maybe not, since GARP and PipeRench were discussed in the last lecture. Perhaps just mention them again.
  Main reason other architectures are not used is economies of scale. Lots of FPGAs are manufactured, thus lowering cost and enabling the use of state-of-the-art fab technology (given high performance
