1 - CPRE 583 (Reconfigurable Computing): Compute Models Iowa State University (Ames) CPRE 583 Reconfigurable Computing Lecture 20: Wed 11/2/2011 (Compute Models) Instructor: Dr. Phillip Jones Reconfigurable Computing Laboratory Iowa State University Ames, Iowa, USA
2 - Announcements/Reminders
MP3: Due 11/4
–IT should have resolved the issue that was causing problems running MP3 on some of the linux-X and research-X remote machines
Weekly Project Updates due: Fridays (midnight)
3 - Project Grading Breakdown
50% Final Project Demo
30% Final Project Report
–20% of your project report grade will come from your 5-6 project updates (due Fridays at midnight)
20% Final Project Presentation
4 - Project Ideas: Relevant Conferences
FPL, FPT, FCCM, FPGA, DAC, ICCAD, Reconfig, RTSS, RTAS, ISCA, Micro, Supercomputing, HPCA, IPDPS
5 - Projects: Target Timeline
Teams Formed and Topic: Mon 10/10
–Project idea in PowerPoint, 3-5 slides
  Motivation (why is this interesting, useful)
  What will be the end result
  High-level picture of final product
–Project team list: Name, Responsibility
High-level Plan/Proposal: Fri 10/14
–PowerPoint, 5-10 slides (presentation to class Wed 10/19)
  System block diagrams
  High-level algorithms (if any)
  Concerns
  –Implementation
  –Conceptual
  Related research papers (if any)
6 - Projects: Target Timeline
Work on projects: 10/ /9
–Weekly update reports
  More information on updates will be given
Presentations: Finals week
–Present / Demo what is done at this point
–15-20 minutes (depends on number of projects)
Final write-up and Software/Hardware turned in: Day of final (TBD)
7 - Initial Project Proposal Slides (5-10 slides)
Project team list: Name, Responsibility (who is project leader)
–Team size: 3-4 (5 case-by-case)
Project idea
Motivation (why is this interesting, useful)
What will be the end result
High-level picture of final product
High-level Plan
–Break project into milestones
  Provide initial schedule: I would initially schedule aggressively to have the project complete by Thanksgiving. Issues will pop up and cause the schedule to slip.
–System block diagrams
–High-level algorithms (if any)
–Concerns
  Implementation
  Conceptual
Research papers related to your project idea
8 - Weekly Project Updates
The current state of your project write-up
–Even in the early stages of the project you should be able to write a rough draft of the Introduction and Motivation sections
The current state of your Final Presentation
–Your Initial Project Proposal presentation (due Wed 10/19) should make a good starting point for your Final Presentation
What things are working and not working
What roadblocks you are running into
9 - Common Questions
10 - Compute Models Overview
11 - What you should learn
Introduction to Compute Models
13 - Outline
Design patterns (previous lecture)
–Why are they useful?
–Examples
Compute models (Abstraction)
–Why are they useful?
–Examples
System Architectures (Implementation)
–Why are they useful?
–Examples
14 - References
[1] Reconfigurable Computing (2008), Scott Hauck, Andre DeHon
–Chapter 5: Compute Models and System Architectures
[2] Design Patterns for Reconfigurable Computing, Andre DeHon (FCCM 2004)
[3] Type Architectures, Shared Memory, and the Corollary of Modest Potential, Lawrence Snyder, Annual Review of Computer Science (1986)
15 - Building Applications
Problem -> Compute Model + Architecture -> Application
Questions to answer
–How to think about composing the application?
–How will the compute model lead to a naturally efficient architecture?
–How does the compute model support composition?
–How to conceptualize parallelism?
–How to trade off area and time?
–How to reason about correctness?
–How to adapt to technology trends (e.g. larger/faster chips)?
–How does the compute model provide determinacy?
–How to avoid deadlocks?
–What can be computed?
–How to optimize a design, or validate application properties?
16 - Compute Models
Compute Models [1]: High-level models of the flow of computation.
Useful for:
–Capturing parallelism
–Reasoning about correctness
–Decomposition
–Guiding designs by providing constraints on what is allowed during a computation
  Communication links
  How synchronization is performed
  How data is transferred
17 - Two High-level Families
Data Flow:
–Single-rate Synchronous Data Flow
–Synchronous Data Flow
–Dynamic Streaming Data Flow
–Dynamic Streaming Data Flow with Peeks
–Streaming Data Flow with Allocation
Sequential Control:
–Finite Automata (i.e. Finite State Machine)
–Sequential Controller with Allocation
–Data Centric
–Data Parallel
25 - Data Flow
Graph of operators that data (tokens) flows through
Composition of functions
Captures:
–Parallelism
–Dependences
–Communication
[Figure: data flow graph with two X operators and one + operator]
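A minimal sketch of the token-firing idea, assuming the figure shows two multiply operators feeding one add operator. The `Operator` class, the `run` scheduler, and the firing rule (fire only when every input link holds a token) are illustrative choices, not taken from the lecture.

```python
from collections import deque


class Operator:
    """A dataflow operator that fires when every input link holds a token."""

    def __init__(self, name, func, n_inputs):
        self.name = name
        self.func = func
        self.inputs = [deque() for _ in range(n_inputs)]  # one FIFO per input link
        self.consumers = []  # (operator, input port) pairs fed by this operator

    def connect(self, consumer, port):
        self.consumers.append((consumer, port))

    def ready(self):
        # An empty deque is falsy, so this checks that no input link is empty.
        return all(self.inputs)

    def fire(self, results):
        args = [link.popleft() for link in self.inputs]
        value = self.func(*args)
        if self.consumers:
            for op, port in self.consumers:
                op.inputs[port].append(value)
        else:
            results.append(value)  # no consumers: this is a graph output


def run(operators):
    """Fire ready operators until no operator can fire."""
    results = []
    fired = True
    while fired:
        fired = False
        for op in operators:
            if op.ready():
                op.fire(results)
                fired = True
    return results


# Hypothetical graph: (a * b) + (c * d)
mul_a = Operator("mul_a", lambda x, y: x * y, 2)
mul_b = Operator("mul_b", lambda x, y: x * y, 2)
add = Operator("add", lambda x, y: x + y, 2)
mul_a.connect(add, 0)
mul_b.connect(add, 1)

for a, b, c, d in [(1, 2, 3, 4), (5, 6, 7, 8)]:
    mul_a.inputs[0].append(a)
    mul_a.inputs[1].append(b)
    mul_b.inputs[0].append(c)
    mul_b.inputs[1].append(d)

outputs = run([mul_a, mul_b, add])
print(outputs)  # [14, 86]
```

The two multiplies have no dependence on each other, so a hardware implementation could fire them in parallel; the links into the add operator are the only communication and the only ordering constraint.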
26 - Single-rate Synchronous Data Flow
One token rate for the entire graph
–For example, all operations take one token on a given link before producing an output token
–Same power as a Finite State Machine
[Figure: graph with operators labeled update, F, and copy, and a token labeled 1]
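One way to see the Finite State Machine equivalence is to treat a token on a feedback link as the machine's state. The sketch below assumes that reading; the function name `single_rate_loop` and the parity example are hypothetical, chosen only for illustration.

```python
def single_rate_loop(inputs, update, initial_state):
    """Single-rate loop: each firing consumes one input token and one state
    token, and produces one output token and one new state token."""
    state = initial_state  # initial token placed on the feedback link
    outputs = []
    for token in inputs:
        state = update(state, token)  # one token in per link, one token out
        outputs.append(state)         # the result is copied out and fed back
    return outputs


# Running parity of a bit stream: a two-state machine expressed as a loop.
parity = single_rate_loop([1, 0, 1, 1], lambda s, t: s ^ t, 0)
print(parity)  # [1, 1, 0, 1]
```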
27 - Synchronous Data Flow
Each link can have a different constant token input and output rate
Same power as the single-rate version, but easier to describe for some applications
Automated ways to detect/determine:
–Deadlock
–Buffer sizes
[Figure: graph with operators update, F, and copy, with link rates labeled 1, 1, and 10]
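A common starting point for such automated analysis is to solve the balance equations: for every link, (firings of producer) x (tokens produced per firing) must equal (firings of consumer) x (tokens consumed per firing). The sketch below computes the smallest integer firing counts for a connected graph. The function name, the edge format, and the example rates are assumptions made for illustration and are not read off the slide's figure.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(actors, edges):
    """edges: list of (producer, tokens_produced, consumer, tokens_consumed).

    Returns the smallest positive integer firing count per actor that
    returns every link to its starting token count, or raises ValueError
    if the rates are inconsistent or the graph is not connected.
    """
    rate = {actors[0]: Fraction(1)}
    changed = True
    while changed:
        changed = False
        for src, produced, dst, consumed in edges:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * produced / consumed
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * consumed / produced
                changed = True
            elif src in rate and dst in rate:
                if rate[src] * produced != rate[dst] * consumed:
                    raise ValueError("inconsistent rates: buffers would grow without bound")
    if len(rate) != len(actors):
        raise ValueError("graph is not connected")
    # Scale the fractional rates up to the smallest integer solution.
    scale = reduce(lambda a, b: a * b // gcd(a, b),
                   (r.denominator for r in rate.values()), 1)
    return {a: int(rate[a] * scale) for a in actors}


# Hypothetical rates: F needs 10 tokens from update before it fires once.
reps = repetition_vector(
    ["update", "F", "copy"],
    [("update", 1, "F", 10), ("F", 1, "copy", 1)],
)
print(reps)  # {'update': 10, 'F': 1, 'copy': 1}
```

With the firing counts known, a scheduler can simulate one full iteration to bound the buffer size on each link and to check whether the iteration can complete without deadlock.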
28 - Dynamic Streaming Data Flow
Token rates dependent on data
Just need to add two structures
–Switch
–Select
[Figure: Switch with inputs S and in and outputs out0 and out1; Select with inputs S, in0, and in1 and output out]
29 - Dynamic Streaming Data Flow
Token rates dependent on data
Just need to add two structures
–Switch, Select
More
–Powerful
–Difficult to detect deadlocks
Still deterministic
[Figure: Switch and Select operators with control input S around operators F0 and F1; tokens labeled x and y]
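A minimal sketch of how Switch and Select can express a data-dependent conditional, assuming the figure shows a Switch steering tokens to F0 or F1 and a Select collecting the results. The function names and the example functions are hypothetical.

```python
from collections import deque


def switch(control, data):
    """Route each data token to out0 or out1 according to its control token."""
    out0, out1 = deque(), deque()
    for c, d in zip(control, data):
        (out1 if c else out0).append(d)
    return out0, out1


def select(control, in0, in1):
    """For each control token, consume one token from the chosen input link."""
    return [(in1 if c else in0).popleft() for c in control]


def conditional(control, data, f0, f1):
    """Apply f0 where control is 0 and f1 where control is 1."""
    out0, out1 = switch(control, data)
    return select(control, deque(map(f0, out0)), deque(map(f1, out1)))


result = conditional([0, 1, 1, 0], [1, 2, 3, 4],
                     lambda x: x + 100, lambda x: x * 2)
print(result)  # [101, 4, 6, 104]
```

The number of tokens on each branch now depends on the control values, which is why static rate analysis no longer applies. The output is still deterministic because Select consumes tokens in the order the control stream dictates, regardless of which branch finishes first.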
33 - Dynamic Streaming Data Flow with Peeks
Allow operator to fire before all inputs have arrived
–Example where this is useful is the merge operation
Now execution can be nondeterministic
–Answer depends on input arrival times
[Figure: Merge operator with tokens A and B]
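The dependence on arrival times can be illustrated with a small simulation. This sketch models arrival times explicitly as numbers attached to each token; that modeling choice, the `merge` signature, and the token names are assumptions for illustration.

```python
def merge(stream_a, stream_b):
    """Forward tokens in order of arrival.

    Each stream is a list of (arrival_time, token) pairs. The merge fires
    as soon as either input has a token, so it never waits for both.
    Ties go to stream_a because sorted() is stable.
    """
    arrivals = sorted(stream_a + stream_b, key=lambda pair: pair[0])
    return [token for _, token in arrivals]


# Same tokens on the same links, different arrival times.
run1 = merge([(0, "A1"), (3, "A2")], [(1, "B1")])
run2 = merge([(0, "A1"), (1, "A2")], [(2, "B1")])
print(run1)  # ['A1', 'B1', 'A2']
print(run2)  # ['A1', 'A2', 'B1']
```

Both runs carry identical token values on each input link, yet the output sequences differ, which is the nondeterminism the slide describes.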
34 - Streaming Data Flow with Allocation
Removes the need for static links and operators. That is, the data flow graph can change over time
More power: Turing complete
More difficult to analyze
Could be useful for some applications
–Telecom applications. For example, if a channel carries voice versus data, the resources needed may vary greatly
Can take advantage of platforms that allow runtime reconfiguration
35 - Sequential Control
Sequence of subroutines
–Programming languages (C, Java)
–Hardware control logic (Finite State Machines)
Transform global data state
36 - Finite Automata (i.e. Finite State Machine)
Finite state
Can verify state reachability in polynomial time
[Figure: state diagram with states S1, S2, S3]
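Reachability on an explicit state graph can be checked with a breadth-first search, which visits each state and transition at most once. The sketch below assumes the machine is given as a dictionary mapping each state to its outgoing transitions; the state S4 and the input symbols are hypothetical additions to make the example interesting.

```python
from collections import deque


def reachable_states(transitions, start):
    """Return the set of states reachable from start.

    transitions maps state -> {input symbol: next state}.
    Runs in time linear in the number of states plus transitions.
    """
    seen = {start}
    frontier = deque([start])
    while frontier:
        state = frontier.popleft()
        for nxt in transitions.get(state, {}).values():
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


fsm = {
    "S1": {"a": "S2"},
    "S2": {"b": "S3"},
    "S3": {"a": "S1"},
    "S4": {"a": "S1"},  # no transition leads here, so S4 is unreachable
}
print(sorted(reachable_states(fsm, "S1")))  # ['S1', 'S2', 'S3']
```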
38 - Sequential Controller with Allocation
Adds ability to allocate memory. Equivalent to adding new states
Model becomes Turing complete
[Figure: state diagram with states S1, S2, S3, S4, SN]
39 - Data Parallel
Multiple instances of an operation type acting on separate pieces of data. For example: Single Instruction Multiple Data (SIMD)
–Identical match test on all items in a database
–Inverting the color of all pixels in an image
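Both examples apply one operation independently to every data item. A minimal sketch follows; the function names and the 8-bit pixel range are assumptions. Plain Python evaluates these comprehensions sequentially, but since no element depends on another, each application could run on its own hardware instance.

```python
def invert(image, max_value=255):
    """Invert every pixel; each pixel is processed independently."""
    return [[max_value - pixel for pixel in row] for row in image]


def match_all(records, key):
    """Apply the same equality test to every record."""
    return [record == key for record in records]


inverted = invert([[0, 128], [255, 64]])
matches = match_all(["cat", "dog", "cat"], "cat")
print(inverted)  # [[255, 127], [0, 191]]
print(matches)   # [True, False, True]
```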
40 - Data Centric
Similar to data flow, but the state contained in the objects of the graph is the focus, not the tokens flowing through the graph
–Network flow example
[Figure: Source1, Source2, and Source3 connected through a Switch to Dest1 and Dest2; annotations: flow rate, buffer overflow]
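A rough sketch of the network flow example, where the quantity of interest is the switch's buffer occupancy and not the individual packets. The fixed per-tick rates, the single shared buffer, and the function name are simplifying assumptions made for illustration.

```python
def first_overflow_tick(source_rates, drain_rate, capacity, ticks):
    """Track the switch buffer occupancy over time.

    Returns the first tick at which occupancy exceeds capacity,
    or None if no overflow occurs within the given number of ticks.
    """
    occupancy = 0  # the state held in the switch object
    for tick in range(ticks):
        occupancy += sum(source_rates)              # packets arriving this tick
        occupancy = max(0, occupancy - drain_rate)  # packets forwarded this tick
        if occupancy > capacity:
            return tick
    return None


# Three sources send 6 packets per tick in total; the switch forwards 4.
print(first_overflow_tick([3, 2, 1], drain_rate=4, capacity=5, ticks=10))  # 2
print(first_overflow_tick([3, 2, 1], drain_rate=6, capacity=5, ticks=10))  # None
```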
41 - Multi-threaded
Multi-threaded: a compute model made up of multiple sequential controllers that have communication channels between them
Very general, but often too much power and flexibility. No guidance for:
–Ensuring determinism
–Dividing application into threads
–Avoiding deadlock
–Synchronizing threads
The models discussed can be defined in terms of a multi-threaded compute model
42 - Multi-threaded (Illustration)
43 - Streaming Data Flow as Multithreaded
Thread: an operator that performs transforms on data as it flows through the graph
Thread synchronization: tokens sent between operators
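A minimal sketch of this mapping using Python's standard `threading` and `queue` modules: each operator runs as its own thread, and FIFO queues play the role of links. The `DONE` sentinel used to shut the pipeline down and the helper names are assumptions added for illustration.

```python
import queue
import threading

DONE = object()  # sentinel token marking the end of the stream


def operator(func, inbox, outbox):
    """One thread per operator: block until a token arrives, transform it, send it on."""
    while True:
        token = inbox.get()  # synchronization happens here, on token arrival
        if token is DONE:
            outbox.put(DONE)
            return
        outbox.put(func(token))


def run_pipeline(funcs, tokens):
    """Chain the operators with FIFO queues and stream the tokens through."""
    links = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [
        threading.Thread(target=operator, args=(f, links[i], links[i + 1]))
        for i, f in enumerate(funcs)
    ]
    for t in threads:
        t.start()
    for token in tokens:
        links[0].put(token)
    links[0].put(DONE)

    results = []
    while True:
        token = links[-1].get()
        if token is DONE:
            break
        results.append(token)
    for t in threads:
        t.join()
    return results


print(run_pipeline([lambda x: x * 2, lambda x: x + 1], [1, 2, 3]))  # [3, 5, 7]
```

Restricting the threads to communicate only through single-producer, single-consumer FIFO links is what keeps the result independent of thread scheduling.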
44 - Data Parallel as Multithreaded
Thread: a data item
Thread synchronization: data updated with each sequential instruction
45 - Caution with Multithreaded Model
Use when a stricter compute model does not give enough expressiveness.
Define restrictions to limit the amount of expressive power that can be used
–Define synchronization policy
–How to reason about deadlocking
46 - Other Models
“A Framework for Comparing Models of Computation” [1998]
–E. Lee, A. Sangiovanni-Vincentelli
–Transactions on Computer-Aided Design of Integrated Circuits and Systems
“Concurrent Models of Computation for Embedded Software” [2005]
–E. Lee, S. Neuendorffer
–IEEE Proceedings – Computers and Digital Techniques
47 - Next Lecture
System Architectures
48 - Questions/Comments/Concerns
Write down
–Main point of lecture
–One thing that’s still not quite clear
OR
–If everything is clear, then give an example of how to apply something from lecture
49 - Lecture Notes
Add CSP/Multithread as root of a simple tree
15+5 (late start) minutes of time left
Think of one to two in-class exercises (10 min)
–Data flow graph optimization algorithm?
–Deadlock detection on a small model?
Give some examples of where a given compute model would map to a given application.
–Systolic array (implement) or Dataflow compute model)
–String matching (FSM) (MISD)
New image for MP3, too dark of a color