QUIZ: What does ICAP stand for? What is its main use? Why is Partition Pin preferred over Bus Macro?

Presentation transcript:

PARTIAL RECONFIGURATION DESIGN FLOW 2

Recap – Complete Architecture Overview 3

Design Flows for PR 4
1. Module-based PR:
- Implement each reconfigurable module as an individual project.
- Constrain each PR module to be placed in a given partition.
- The full bitstream is loaded initially; the partial bitstream of a complete PR module is loaded on demand.
- Supported by PlanAhead. Will be covered in detail.
2. Difference-based PR:
- Implement each reconfigurable module as an individual project.
- Constrain each PR module to be placed in a given partition.
- Compute the difference between the bitstreams of the reconfigurable modules to obtain the differential partial bitstream.
- The full bitstream is loaded initially; the differential partial bitstream of a PR module is loaded on demand.
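The "compute the difference" step of difference-based PR can be illustrated with a minimal sketch. It assumes the configuration data of both modules has already been extracted as raw bytes of equal length and that a frame is a fixed number of bytes; real bitstreams also carry headers and configuration commands, which this toy model ignores. The names `split_frames` and `differing_frames` and the default frame size are assumptions for illustration, not part of any vendor tool.

```python
FRAME_BYTES = 404  # assumed frame size (101 words x 4 bytes); adjust per device


def split_frames(data, frame_bytes=FRAME_BYTES):
    """Split raw configuration data into fixed-size frames."""
    if len(data) % frame_bytes:
        raise ValueError("configuration data is not a whole number of frames")
    return [data[i:i + frame_bytes] for i in range(0, len(data), frame_bytes)]


def differing_frames(config_a, config_b, frame_bytes=FRAME_BYTES):
    """Return {frame_index: frame_from_b} for every frame that differs.

    Only these frames would need to be written to move from module A to B.
    """
    frames_a = split_frames(config_a, frame_bytes)
    frames_b = split_frames(config_b, frame_bytes)
    if len(frames_a) != len(frames_b):
        raise ValueError("both configurations must cover the same region")
    return {
        index: frame_b
        for index, (frame_a, frame_b) in enumerate(zip(frames_a, frames_b))
        if frame_a != frame_b
    }
```

The fewer frames differ between two modules, the smaller the differential partial bitstream becomes.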

PR Specific Design Flow Comparison 5
[Figure: design flow without PR and with PR]

Example PR Design 6
System-level block diagram (implemented on Zynq Z202 SoC).
The design consists of a static part and a reconfigurable part. The reconfigurable "Filter Engine" is replaced with either the Sobel or the Sepia filter during run-time partial reconfiguration.
[Figure: original, Sobel-processed, and Sepia-processed images]
Ref: Application Note: Zynq-7000 All Programmable SoC

Design Flow 7
Non-PR-specific design flow:
- Vivado HLS: converts high-level code to RTL code.
- Xilinx Synthesis Tool: converts RTL code to a netlist.
PR-specific design flow:
- PlanAhead tool, used to:
1. Define the reconfigurable partitions
2. Floorplan the design
3. Add reconfigurable modules
4. Run the implementation tools to generate the full and partial bitstreams
Ref: Application Note: Zynq-7000 All Programmable SoC

PR SPECIFIC DESIGN FLOW (USING PLANAHEAD) 8

Setting Partition: Partitioning Style 9
A partition defines the smallest atomic area to which a module can be assigned.
Different partitioning styles are possible, but not all are supported by commercial vendors:
- Island style
- Slot based
- Grid based
[Figure: island, slot-based, and grid-based partitioning]

Setting Partition: Placement Flexibility 10
Partitioning style affects placement and flexibility:
- Island style: suffers from fragmentation. This is the style offered by the current vendors, Xilinx and Altera.
- Slot style: also suffers from fragmentation, but to a lesser extent. Some academic tools, such as ReCoBus, have explored this style.
- Grid style: reduced fragmentation, but difficult to support.
To enhance flexibility, the PR module must be placed and routed in every region in which it may need to be configured. This puts additional stress on bitstream size.
[Figure: island, slot-based, and grid-based partitioning]
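The fragmentation difference between island and slot styles can be made concrete with a toy one-dimensional model. This is only a sketch under simplifying assumptions: area is a single number, an island holds exactly one module, and a slot-based module occupies a whole number of equally sized slots. The function names and the example sizes are made up for illustration.

```python
import math


def island_waste(module_size, island_size):
    """Unused area when one module occupies a whole island."""
    if module_size > island_size:
        raise ValueError("module does not fit in the island")
    return island_size - module_size


def slot_waste(module_size, slot_size):
    """Unused area when a module occupies the minimum number of slots."""
    slots_needed = math.ceil(module_size / slot_size)
    return slots_needed * slot_size - module_size


# Made-up numbers: a 300-unit module in a 1000-unit island
# versus the same module spread over 250-unit slots.
print(island_waste(300, 1000))  # area lost in the island
print(slot_waste(300, 250))     # area lost in the last, partly used slot
```

In this model the slot style wastes at most one slot's worth of area per module, which matches the slide's claim that it fragments less than the island style.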

Setting Partition 11
In the Netlist view of the synthesized design, select FILTER ENGINE to set the partition.
The partition type should be set to Reconfigurable Partition.
Ref: Application Note: Zynq-7000 All Programmable SoC

Adding Default Netlist 12
Select the Sobel filter netlist for the reconfigurable partition.
Ref: Application Note: Zynq-7000 All Programmable SoC

Adding Reconfigurable Modules 13
Add the Sepia filter netlist for the reconfigurable partition.
Ref: Application Note: Zynq-7000 All Programmable SoC

Floor Plan the PR Region 14
The PlanAhead tool requires the user to select the PR region manually, taking into account the resources required by the most complex reconfigurable module.
Ref: Application Note: Zynq-7000 All Programmable SoC
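Sizing the region for "the most complex reconfigurable module" amounts to taking, for each resource type, the maximum demand over all modules that will share the partition. A minimal sketch follows; the resource names and utilization numbers are made up for illustration and are not taken from the application note.

```python
def region_requirements(modules):
    """Return the per-resource maximum over all reconfigurable modules."""
    needs = {}
    for usage in modules.values():
        for resource, count in usage.items():
            needs[resource] = max(needs.get(resource, 0), count)
    return needs


# Hypothetical utilization figures for the two filters.
modules = {
    "sobel": {"LUT": 1200, "FF": 900, "BRAM": 4, "DSP": 0},
    "sepia": {"LUT": 800, "FF": 1100, "BRAM": 2, "DSP": 6},
}
print(region_requirements(modules))
```

Note that the result is not the footprint of any single module: here the LUT and BRAM counts come from one filter and the FF and DSP counts from the other.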

Floor Plan the PR Region: Resource Consideration 15
The column-wise layout of the different logic primitives must be considered when placing a module.
Depending on the type of logic primitives used by the module (SLICEX, SLICEM, etc.), relocation may or may not be possible.
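One way to picture the relocation constraint: a module placed over a run of columns can only move to another location whose columns have the same resource types in the same order. The sketch below models the device as a list of column types; the column names and the helper `can_relocate` are assumptions for illustration, and a real check would also have to consider clock regions and routing.

```python
def can_relocate(device_columns, src_start, dst_start, width):
    """True if the source and destination column ranges have the same layout."""
    lowest = min(src_start, dst_start)
    highest = max(src_start, dst_start) + width
    if lowest < 0 or highest > len(device_columns):
        return False  # one of the ranges falls outside the device
    source = device_columns[src_start:src_start + width]
    target = device_columns[dst_start:dst_start + width]
    return source == target


# Hypothetical column layout of a small device.
columns = ["CLB", "CLB", "BRAM", "CLB", "CLB", "BRAM", "DSP", "CLB"]
print(can_relocate(columns, 0, 3, 3))  # same CLB, CLB, BRAM pattern
print(can_relocate(columns, 0, 5, 3))  # target starts on a BRAM column
```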

Floor Plan Design Recommendations: Global Clocks 16
When possible, extend an RP range with frames in the same clock region rather than adding another clock region, to avoid clock starvation.

Floor Plan Design Recommendations: Fan Outs 17
Manually optimize the fan-out before the automatic placement and routing done in the implementation stage, for a better design.
[Figure label: Partition]

Implementation 18
The final placed and routed designs for the Sobel and Sepia filters.
[Figure: Sobel filter, Sepia filter]
Ref: Application Note: Zynq-7000 All Programmable SoC

Generating Bit Streams 19
This step generates the full and partial bitstreams for the Sepia and Sobel filters.
The full Sobel bitstream can be used as the initial bitstream.
The partial Sepia and Sobel bitstreams can be loaded into the FPGA via PCAP on demand.
Ref: Application Note: Zynq-7000 All Programmable SoC

Bit Streams - Review 20
Frame Address Register fields:
- Row address: 0 to 9
- Top/Bottom: selects the top or bottom half of the FPGA; together with the row address, it locates the tile
- Major address: columns, 0 onwards
- Minor address: frame number within the tile
- Block type: logic blocks, BRAMs, routing blocks
[Figure: Frame Address Register, frame composition]
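The Frame Address Register fields can be packed into and unpacked from a single word. The sketch below assumes the field positions documented for Xilinx 7-series devices (block type in bits 25:23, top/bottom in bit 22, row in bits 21:17, column in bits 16:7, minor in bits 6:0); other families use different widths, so the `FIELDS` table should be checked against the configuration user guide for the actual target. The function names are hypothetical.

```python
# (name, shift, width) per field; ASSUMED 7-series layout, verify per device.
FIELDS = (
    ("block_type", 23, 3),
    ("top_bottom", 22, 1),
    ("row", 17, 5),
    ("column", 7, 10),  # the "major address" on the slide
    ("minor", 0, 7),
)


def pack_far(**values):
    """Build a frame address word from named field values (default 0)."""
    word = 0
    for name, shift, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << shift
    return word


def unpack_far(word):
    """Split a frame address word back into its named fields."""
    return {
        name: (word >> shift) & ((1 << width) - 1)
        for name, shift, width in FIELDS
    }


far = pack_far(block_type=1, top_bottom=1, row=2, column=3, minor=5)
print(hex(far), unpack_far(far))
```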

Device Configuration Flow 21
1. The boot loader is loaded into on-chip RAM.
2. The full Sobel bitstream is loaded into the FPGA via the Processor Configuration Access Port (PCAP).
3. The user application loads the partial bitstreams into DDR memory upon start-up.
4. At this point, the application can use the partial bitstreams at any time to modify the pre-defined PL regions while the rest of the FPGA remains fully active and uninterrupted.
Ref: Application Note: Zynq-7000 All Programmable SoC
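Steps 2 to 4 of this flow can be modeled in software as a small manager that caches partial bitstreams in memory and writes one to the configuration port only when a different module is requested. This is a sketch, not the application note's code: `PartialReconfigManager` is a made-up class, and `write_pcap` is a hypothetical callable standing in for whatever PCAP driver interface the platform provides.

```python
class PartialReconfigManager:
    """Toy model: cache partial bitstreams, swap modules on demand."""

    def __init__(self, write_pcap, initial_module):
        # write_pcap: HYPOTHETICAL driver hook that receives raw bitstream bytes.
        self._write_pcap = write_pcap
        self._cache = {}              # stands in for DDR memory (step 3)
        self.active = initial_module  # module in the full bitstream (step 2)

    def preload(self, name, bitstream):
        """Keep a partial bitstream in memory for later use."""
        self._cache[name] = bytes(bitstream)

    def activate(self, name):
        """Reconfigure the PR region; return False if nothing had to be done."""
        if name == self.active:
            return False
        if name not in self._cache:
            raise KeyError(f"partial bitstream {name!r} was not preloaded")
        self._write_pcap(self._cache[name])  # step 4: rewrite only the PR region
        self.active = name
        return True
```

Keeping the bitstreams in memory means a later swap costs only the transfer to the configuration port, not a read from slower storage.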

RESOURCE UTILIZATION & TIMING 22
Ref: Application Note: Zynq-7000 All Programmable SoC

Power Consideration 23
PR itself requires power. Power during PR is spent in:
1. Configuration data access
2. Actual configuration of FPGA resources
Bonamy, R., et al. "Power Consumption Models for the Use of Dynamic and Partial Reconfiguration." Microprocessors and Microsystems (2014).
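A first-order estimate of the cost of one reconfiguration follows from the two contributions listed above. The sketch assumes constant power draw for data access and for configuration over the whole transfer, and a constant configuration throughput; it is a back-of-the-envelope model, not the model from the cited paper, and all parameter names are illustrative.

```python
def reconfig_time_s(bitstream_bytes, throughput_bytes_per_s):
    """Time to push a partial bitstream through the configuration port."""
    return bitstream_bytes / throughput_bytes_per_s


def reconfig_energy_j(bitstream_bytes, throughput_bytes_per_s,
                      p_access_w, p_config_w):
    """Energy = (data-access power + configuration power) x transfer time."""
    duration = reconfig_time_s(bitstream_bytes, throughput_bytes_per_s)
    return (p_access_w + p_config_w) * duration
```

Under this model both time and energy scale linearly with partial bitstream size, which is one reason to keep the reconfigurable region small.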

Case Study: Fault Tolerance, Self-Healing Architecture 24
Fault-tolerant processor:
- The IF, MAC, and ALU are the PRMs.
- Different configurations are available for each module.
- The focus is on the self-healing feature more than on performance itself.
Psarakis, Mihalis, and Andreas Apostolakis. "Fault tolerant FPGA processor based on runtime reconfigurable modules."

Case Study 25
Reconfigurable crypto processor:
- The processor can choose from different crypto algorithms.
- Major area savings.
- Some power savings too.
Hori, Yohei, Toshihiro Katashita, and Kazukuni Kobara. "Energy and area saving effect of Dynamic Partial Reconfiguration on a 28-nm process FPGA."

Case Study: Fast Start Up 26
Fast start-up is a two-step configuration.
It is useful in time-critical systems to initiate a swift system start-up. Example: automotive safety.
Meyer, Joachim, et al. "Fast start-up for spartan-6 fpgas using dynamic partial reconfiguration."

Challenges of Partial Reconfiguration 27
- Complicated design flow.
- Tool support: slot and grid styles are not supported; placement involves manual steps; retargeting to a different device requires manual assistance.
- Security issues: although an encryption option is provided, security issues persist.
- Decreased performance compared with full configuration: Xilinx reports a 10% degradation in clock frequency when using PR.
Xilinx PR implementation flow:
1. HDL Design Description
2. HDL Synthesis
3. Set Design Constraints
4. Placement Analysis
5. Implement Static Design and PR Modules
6. Merge Final Bitstreams
[Figure label: Manual steps]

Our Project 28
- Run-time reconfigurable motion estimation.
- Motion estimation (block matching) techniques are used in video stabilization.
- Switch between two algorithms (Full Search and Diamond Search) depending on external inputs such as video quality.
- Achieve a trade-off between speed and accuracy based on those external inputs.
- Evaluate metrics such as area savings, power savings, and reconfiguration time.
- PR tools are not yet mature, so implementing the motion estimation algorithms using PR will be challenging. As a backup plan, we will implement an "Algorithmic approach to partial bit stream relocation".
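For readers unfamiliar with block matching, the Full Search variant can be sketched in software as follows. This is a plain reference model using the sum of absolute differences (SAD) as the matching cost, which is an assumption since the slides do not name the cost function; it says nothing about how the hardware module would be structured, and Diamond Search (which visits far fewer candidates) is not shown.

```python
def sad(ref, cur, rx, ry, cx, cy, n):
    """Sum of absolute differences between two n x n blocks."""
    return sum(
        abs(ref[ry + j][rx + i] - cur[cy + j][cx + i])
        for j in range(n)
        for i in range(n)
    )


def full_search(ref, cur, cx, cy, n, search):
    """Exhaustively test every displacement within +/- search pixels.

    Returns (cost, dx, dy) for the best match of the n x n block of `cur`
    at (cx, cy) inside the reference frame `ref`, or None if no candidate
    lies inside the frame.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue  # candidate block falls outside the reference frame
            cost = sad(ref, cur, rx, ry, cx, cy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

Full Search evaluates (2 * search + 1)^2 candidates per block, which is the accuracy-for-speed trade-off that motivates switching to Diamond Search at run time.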

Thank you. Questions? 29