Gedae, Inc. www.gedae.com Implementing Modal Software in Data Flow for Heterogeneous Architectures James Steed, Kerry Barnes, William Lundgren Gedae, Inc.

Slides:



Advertisements
Similar presentations
Network II.5 simulator ..
Advertisements

ECE Synthesis & Verification - Lecture 2 1 ECE 667 Spring 2011 ECE 667 Spring 2011 Synthesis and Verification of Digital Circuits High-Level (Architectural)
Implementation of 2-D FFT on the Cell Broadband Engine Architecture William Lundgren Gedae), Kerry Barnes (Gedae), James Steed (Gedae)
High Level Languages: A Comparison By Joel Best. 2 Sources The Challenges of Synthesizing Hardware from C-Like Languages  by Stephen A. Edwards High-Level.
MODERN OPERATING SYSTEMS Third Edition ANDREW S. TANENBAUM Chapter 3 Memory Management Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall,
Vector Processing. Vector Processors Combine vector operands (inputs) element by element to produce an output vector. Typical array-oriented operations.
SCORE - Stream Computations Organized for Reconfigurable Execution Eylon Caspi, Michael Chu, Randy Huang, Joseph Yeh, Yury Markovskiy Andre DeHon, John.
Behavioral Design Outline –Design Specification –Behavioral Design –Behavioral Specification –Hardware Description Languages –Behavioral Simulation –Behavioral.
1 Concurrent and Distributed Systems Introduction 8 lectures on concurrency control in centralised systems - interaction of components in main memory -
Mahapatra-Texas A&M-Fall'001 cosynthesis Introduction to cosynthesis Rabi Mahapatra CPSC498.
Introduction to Operating Systems – Windows process and thread management In this lecture we will cover Threads and processes in Windows Thread priority.
I/O Hardware n Incredible variety of I/O devices n Common concepts: – Port – connection point to the computer – Bus (daisy chain or shared direct access)
Configurable System-on-Chip: Xilinx EDK
VHDL Coding Exercise 4: FIR Filter. Where to start? AlgorithmArchitecture RTL- Block diagram VHDL-Code Designspace Exploration Feedback Optimization.
(Page 554 – 564) Ping Perez CS 147 Summer 2001 Alternative Parallel Architectures  Dataflow  Systolic arrays  Neural networks.
Figure 1.1 Interaction between applications and the operating system.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 8: February 11, 2009 Dataflow.
Programming Logic and Design, Introductory, Fourth Edition1 Understanding Computer Components and Operations (continued) A program must be free of syntax.
0 HPEC 2010 Automated Software Cache Management.
What do operating systems do? manage processes manage memory and computer resources provide security features execute user programs make solving user.
Maria-Cristina Marinescu Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology A Synthesis Algorithm for Modular Design of.
The 6713 DSP Starter Kit (DSK) is a low-cost platform which lets customers evaluate and develop applications for the Texas Instruments C67X DSP family.
Impulse Embedded Processing Video Lab Generate FPGA hardware Generate hardware interfaces HDL files HDL files FPGA bitmap FPGA bitmap C language software.
Trigger design engineering tools. Data flow analysis Data flow analysis through the entire Trigger Processor allow us to refine the optimal architecture.
Silberschatz, Galvin and Gagne ©2013 Operating System Concepts – 9 th Edition Chapter 8: Main Memory.
I/O Systems I/O Hardware Application I/O Interface
Eric Keller, Evan Green Princeton University PRESTO /22/08 Virtualizing the Data Plane Through Source Code Merging.
COMPUTER SCIENCE &ENGINEERING Compiled code acceleration on FPGAs W. Najjar, B.Buyukkurt, Z.Guo, J. Villareal, J. Cortes, A. Mitra Computer Science & Engineering.
Gedae Portability: From Simulation to DSPs to the Cell Broadband Engine James Steed, William Lundgren, Kerry Barnes Gedae, Inc
GBT Interface Card for a Linux Computer Carson Teale 1.
FPGA FPGA2  A heterogeneous network of workstations (NOW)  FPGAs are expensive, available on some hosts but not others  NOW provide coarse- grained.
Programming Examples that Expose Efficiency Issues for the Cell Broadband Engine Architecture William Lundgren Gedae), Rick Pancoast.
Chapter 2: Computer-System Structures 2.1 Computer System Operation 2.5 Hardware Protection 2.6 Network Structure.
CIS250 OPERATING SYSTEMS Memory Management Since we share memory, we need to manage it Memory manager only sees the address A program counter value indicates.
6 Memory Management and Processor Management Management of Resources Measure of Effectiveness – On most modern computers, the operating system serves.
© 2004 Mercury Computer Systems, Inc. FPGAs & Software Components Graham Bardouleau & Jim Kulp Mercury Computer Systems, Inc. High Performance Embedded.
Replay Compilation: Improving Debuggability of a Just-in Time Complier Presenter: Jun Tao.
ECE 448: Lab 6 DSP and FPGA Embedded Resources (Digital Downconverter)
Memory. Chapter 8: Memory Management Background Swapping Contiguous Memory Allocation Paging Structure of the Page Table Segmentation.
The Main Injector Beam Position Monitor Front-End Software Luciano Piccoli, Stephen Foulkes, Margaret Votava and Charles Briegel Fermi National Accelerator.
Interrupt driven I/O. MIPS RISC Exception Mechanism The processor operates in The processor operates in user mode user mode kernel mode kernel mode Access.
Main Memory. Chapter 8: Memory Management Background Swapping Contiguous Memory Allocation Paging Structure of the Page Table Segmentation Example: The.
Introduction to VHDL Simulation … Synthesis …. The digital design process… Initial specification Block diagram Final product Circuit equations Logic design.
Silberschatz, Galvin and Gagne  Operating System Concepts Chapter 13: I/O Systems I/O Hardware Application I/O Interface Kernel I/O Subsystem.
ECE-C662 Lecture 2 Prawat Nagvajara
Gedae, Inc. Gedae: Auto Coding to a Virtual Machine Authors: William I. Lundgren, Kerry B. Barnes, James W. Steed HPEC 2004.
Part A Final Dor Obstbaum Kami Elbaz Advisor: Moshe Porian August 2012 FPGA S ETTING U SING F LASH.
04/26/20031 ECE 551: Digital System Design & Synthesis Lecture Set : Introduction to VHDL 12.2: VHDL versus Verilog (Separate File)
The Instruction Set Architecture. Hardware – Software boundary Java Program C Program Ada Program Compiler Instruction Set Architecture Microcode Hardware.
Interrupt driven I/O Computer Organization and Assembly Language: Module 12.
Teaching Digital Logic courses with Altera Technology
KERRY BARNES WILLIAM LUNDGREN JAMES STEED
Gedae Portability: From Simulation to DSPs to the Cell Broadband Engine James Steed, William Lundgren, Kerry Barnes Gedae, Inc
Silberschatz, Galvin and Gagne ©2009 Operating System Concepts – 8 th Edition, Chapter 8: Main Memory.
CoDeveloper Overview Updated February 19, Introducing CoDeveloper™  Targeting hardware/software programmable platforms  Target platforms feature.
1 Introduction to Engineering Spring 2007 Lecture 18: Digital Tools 2.
Gedae, Inc. Implementing Modal Software in Data Flow for Heterogeneous Architectures James Steed, Kerry Barnes, William Lundgren Gedae, Inc.
Module 12: I/O Systems I/O hardware Application I/O Interface
PROGRAMMABLE LOGIC CONTROLLERS SINGLE CHIP COMPUTER
Current Generation Hypervisor Type 1 Type 2.
Texas Instruments TDA2x and Vision SDK
CS1251 Computer Architecture
Introduction to cosynthesis Rabi Mahapatra CSCE617
Lesson 4 Synchronous Design Architectures: Data Path and High-level Synthesis (part two) Sept EE37E Adv. Digital Electronics.
ECE-C662 Introduction to Behavioral Synthesis Knapp Text Ch
Operating System Concepts
THE ECE 554 XILINX DESIGN PROCESS
THE ECE 554 XILINX DESIGN PROCESS
Module 12: I/O Systems I/O hardwared Application I/O Interface
Presentation transcript:

Gedae, Inc. Implementing Modal Software in Data Flow for Heterogeneous Architectures James Steed, Kerry Barnes, William Lundgren Gedae, Inc. HPEC 2004

Gedae, Inc. Core Gedae Data Flow Gedae’s Core Data Flow Relationships Any application control can be implemented but –Complex modal software requires lots of logic –Done in an ad hoc manner that isn’t reusable static dynamic nondet Number of Tokens Produced/Consumed Restrictions on Execution Preplanned Determined at Runtime Full Inputs/Empty Outputs None

Gedae, Inc. Stream Segmentation Infinite streams can be broken into finite length segments Segments are processed independently Primitives add segment begin and end markers to a data stream Each marker causes side effects downstream segmenter in c out b efgh 1110 cefg abcd 0110 Segment begin Segment end

Gedae, Inc. Using Segmentation to Control Modes Segment markers cause old mode to end and new one to reset Exclusivity allows memory sharing between modes SearchTrackSearch Reset SearchReset TrackReset Search End SearchEnd Track

Gedae, Inc. Reset and EndOfSegment Methods Primitive code is grouped into methods When methods are executed: –Start: Beginning of execution –Reset: Beginning of each segment (start mode) –Apply: When queues are ready for execution (execute mode) –EndOfSegment: End of each segment (end mode) –Terminate: End of execution Start Reset Apply EndOf Segment Terminate Segment End Segment Begin Queues Ready App Started Queues Ready Run Terminate

Gedae, Inc. Sharing Resources Between Modes Exclusivity: Only one output is actively producing a segment at any given time Subgraphs controlled by a family of exclusive outputs can share resources –Schedule memory –Queue memory –State variables exc_branch in c [0]out [1]out efgh 1110 abcd 0010 ab c d efg h 5 segments

Gedae, Inc. Sharing State Information Between Modes Moving static variable out of filterS subgraph causes it to be persistent between segment boundaries No transients due to clearing of static variables at segment boundaries Internal StateExternal State

Gedae, Inc. Moving to Heterogeneity Gedae relies on –C cross compilers and –Optimized vector libraries to run on DSPs. How do we support firmware targets like FPGAS? Specify Implementation Gedae Implements Build Functionality Prototype, Simulate Link Compile Component Library: Generated C-code, Optimized Vector Routines, Run User Gedae Vendor Key Runtime Kernel

Gedae, Inc. Single Sample Language: Gedae-RTL Single sample extension to Gedae graph language Based on the theory of register transfer languages Registers store information, delayed by a clock rate out(i) = K*(in(i) - out(i-1)) + out(i-1)

Gedae, Inc. Gedae-RTL’s Seven Functions Register R(in,out,c) –Copy in to out delayed by clock rate c. Assignment A(E,out) –Evaluate the expression E and assign its value to out. Decimate D(in,c) –Tie clock rate c to signal in. Clock C(in,c) –Get clock rate c tied to in. Memory M(in,n,s) –Allocate buffer in with n elements of size s. Read MR(a,out) –Read the element at address a and put the value in out. Write MW(in,a) –Write the value in to address a.

Gedae, Inc. Language Independence Language Support Package (LSP) allows exportation of target- specific code Export Ansi-C to simulate functionality Export VHDL for FPGAs Export C enhanced for the AltiVec architecture Hardware Description Implementation Settings Transformations Internal Implementation LSP/Code Generator Code Gedae-RTL Graph

Gedae, Inc. Example: 5-Point FIR Filter Pipeline of N R() registers Set of N A(a*b,out) Network of N-1 A(a+b,out) out(i) = C0*in(i) + C1*in(i-1) + C2*in(i-2)...

Gedae, Inc. Gedae Implements the Heterogeneous System Pin connections to camera and VGA provide real time I/O Host updates threshold in non-real time Gedae-RTL graphs on the FPGA perform Sobel in real time Gedae graphs on the DSP calculate new thresholds