FPGA Partial Reconfiguration
Presented by: Abelardo Jara-Berrocal
HCS Research Laboratory, College of Engineering, University of Florida
April 10th, 2009

2 Outline
- Introduction
- Partial Reconfiguration (PR) Overview
- Proposed Design Methodologies
- Framework analysis
- F4: Virtual Architecture for Partial Reconfiguration and Design Automation for PR Design

3 Introduction – Fully reconfigurable systems
Limitations of full reconfiguration:
1. Device too small for complex designs (the required design with Modules A, B and C doesn't fit)
2. Big full bitstreams (long reconfiguration time)
3. Complete system operation is halted prior to reconfiguration
[Figure labels: General purpose I/O, System controller, FPGA, Configuration lines, Shared memory, Battery, Bitstreams storage, External I/O, Design station, Required design; Config 1, Config 2 and Config 3, each holding a subset of Modules A, B and C (enabled/disabled); Config 1 Request, Config 2 Request]

4 Introduction – Modular Reconfiguration
Types of Modular Dynamic Reconfiguration:
- Static Partial Reconfiguration: Reconfiguring a portion of the device (changing the functionality) while the device is inactive, without affecting other areas of the device
- Dynamic Partial Reconfiguration (DPR, also PDR): Reconfiguring a portion of the device while the remaining design is still active and operating, without affecting the remaining portion of the device
- Virtex 4 and Virtex 5 devices support DPR
[Figure labels: Reconfigurable region 1, Reconfigurable region 2]

5 Partial Reconfiguration
Partial Reconfiguration is useful for systems with multiple functions that can time-share the same FPGA resources.
TERMINOLOGY
- Reconfigurable Region (PRR)
- Reconfigurable Module (PRM)
- Static Logic
- Bus Macro
- Partial Bitstream
- Merged Bitstream

6 Introduction – A sample PR architecture
1. System controller does not need to be placed in an external device
2. Access to fast Internal Configuration Access Port (ICAP: 32 bits, 100 MHz)
3. Smaller partial bitstreams
4. No need to halt the complete system when reconfiguring a module
5. Time multiplexing of FPGA resources: load and unload HW modules on demand
[Figure labels: FPGA with a static area (Controller (Microblaze), ICAP, Flash controller) and a reconfigurable area (Modules A, B and C, enabled/disabled); Bitstreams storage, Battery, External I/O, JTAG, Base system configuration, Module A request]

7 Medium for Partial Reconfiguration
- External: JTAG, UART (RS232)
- Internal: ICAP
ICAP (Internal Configuration Access Port)
- Self-reconfiguration controlled by a soft processor
  o Internal read and write access to configuration logic
- Faster
- HWICAP (provided by Xilinx)
  o Wraps the ICAP with additional logic to read and write frames to BRAM
  o Slave to PLB (Processor Local Bus)
  o 100 MHz, 32 bits
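
The 32-bit, 100 MHz figures quoted for the ICAP give an upper bound on configuration throughput, and from that a lower bound on reconfiguration time. The sketch below works through that arithmetic. It assumes one full word is written on every clock cycle and ignores processor, bus and flash overheads, so real times will be longer; the 100 kB bitstream size is a hypothetical example value, not a measurement from the slides.

```python
# Idealized ICAP timing estimate.
# The 32-bit width and 100 MHz clock come from the slide; everything else
# (function names, the example bitstream size) is illustrative only.

ICAP_WIDTH_BITS = 32
ICAP_CLOCK_HZ = 100_000_000


def icap_peak_bytes_per_second(width_bits=ICAP_WIDTH_BITS, clock_hz=ICAP_CLOCK_HZ):
    """Peak throughput if one full word is written on every clock cycle."""
    return (width_bits // 8) * clock_hz


def min_reconfig_time_s(bitstream_bytes, width_bits=ICAP_WIDTH_BITS, clock_hz=ICAP_CLOCK_HZ):
    """Lower bound on the time needed to push a bitstream through the ICAP."""
    return bitstream_bytes / icap_peak_bytes_per_second(width_bits, clock_hz)


if __name__ == "__main__":
    partial_bitstream_bytes = 100 * 1024  # hypothetical 100 kB partial bitstream
    rate = icap_peak_bytes_per_second()
    t = min_reconfig_time_s(partial_bitstream_bytes)
    print(f"peak rate: {rate / 1e6:.0f} MB/s")
    print(f"minimum reconfiguration time: {t * 1e3:.3f} ms")
```

Because the bound scales linearly with bitstream size, halving a partial bitstream halves the best-case reconfiguration time, which is why bitstream size is tracked as a metric later in the deck.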

8 Additional considerations
General benefits from PDR
- Saves space on the FPGA
- Less time to change only a part of the design
- Reduction of power dissipation by storing functionality in external memory
- Smaller FPGAs can be used to run an application
- Architecture adaptation
Architecture adaptability
- Main advantage: the system can modify its internal modules based on two schemes
  o Data-Driven: characteristics of the input data change at runtime
    (artificial intelligence, evolutionary architectures, adaptive signal processing)
  o Situation-Driven: the system loads/unloads modules to adapt to environment conditions
    (adaptive fault tolerance, intelligent management of system resources)

9 Bus Macros
- Bus Macros: means of communication between PRMs and the static design
- All connections between PRMs and the static design must pass through a bus macro, with the exception of a clock signal
Types of Bus Macros
- Tri-state buffer (TBUF) based bus macros
- Slice-based (or LUT-based) bus macros
Advantage of slice-based bus macros
- No signal lines should cross the border in partial reconfiguration
- TBUFs: will ignore the boundaries
- Slice-based: signals do not cross boundaries

10 LUT-based Slice Macros

11 Introduction – Current PR Design Flow
Steps
- Partition the system into modules
- Define static modules and reconfigurable modules
- Decide the number of PR regions (PRRs)
- Decide PRR sizes, shapes and locations
- Map modules to PRRs
- Define PRR interfaces, instantiate slice macros for PRR interfaces
Many manual steps
- Design partitioning
- Number of PRRs
- PRR sizes, shapes and locations
- Mapping PRMs to PRRs
- Type and placement of PRR interfaces
[Figure labels: (1) Design partitioning: Modules A, B and C split into static modules and Reconfigurable Modules (PRMs); (2) Design floorplanning and budgeting: FPGA with a static region (static modules; Controller (Microblaze), ICAP, Flash controller), PRR 1 (Modules: A and B) and PRR 2 (Modules: C); # of PRRs?]

12 Introduction – Early Access PR Design Flow
- Introduced by Xilinx in FPL'06
- Major improvements:
  o Automatic implementation scripts
  o Rectangular regions (not full column reconfiguration)
  o Static nets can cross reconfigurable regions
  o Slice macros replace bus macros
- Partitioning and floorplanning steps are manually executed
  o Design guidelines for these steps are not provided
- Potential for development of automatic CAD tools
[Figure labels: Reconfigurable design specifications; Design partitioning; Design floorplanning and budgeting; (manual); Placement and PRRs constraints; Xilinx PR Implementation Flow; (automatic); Full Initial Bitstream; PRM Bitstreams]

13 Introduction – Current PR design tools limitations
- PR design is a very specialized task
- Only a physical level of support is provided
  o Architectural knowledge of the target device is a must
  o Not very flexible, many design constraints
- Partitioning and floorplanning steps are manually executed
  o No performance-sensitive design guidelines are provided
  o No automatic heuristics-based design flow is available either
- Lack of abstraction from low-level details

14 PR Overview – Taxonomy of PR systems design flows
PR Designs: Multipurpose or Special purpose
Special purpose
- Highly specialized systems design
- All PRMs that will exist on the system are known at design time
- Each PRR is independently optimized (size, shape, location, interface) based on the PRMs that will be mapped to it
- Output is:
  1) Floorplan defining a static region and a set of optimized PRRs
  2) The set of PRMs that can be placed in each PRR (PRMs to PRRs mapping)
Multipurpose
- Not optimized for a specific application
- PRMs required by the application are not known when designing the base system
- Goal is to design a flexible and reusable base design that can be used for several different PR systems
- Base system designer defines a set of PRRs with fixed shapes, sizes, locations and interfaces
- Generated floorplan is used as input template for the PRMs implementation

15 PRR Geometries
PR system design flows require:
- Proper metrics for PRR performance analysis
- Design guidelines for efficient PRR floorplanning
Study of the effects of varying PRR shape on
- Maximum Clock Frequency
- Partial Bitstream Size
Five separate test cores:
- Beamforming (DSP/slice)
- CFAR (slice/memory)
- AES (register)
Performed on V4SX55 thus far
Aspect ratio = PRR Height / PRR Width
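
To make the aspect ratio definition concrete, the sketch below enumerates rectangular PRR shapes that provide at least a required number of slices and reports height / width for each. It is an illustration under stated assumptions, not the procedure used in the study: the PRR is measured in CLBs, each CLB is taken to hold 4 slices (as on Virtex-4), only slices are counted (DSP48 and RAMB16 columns are ignored), and the 1000-slice requirement and candidate heights are hypothetical.

```python
import math

SLICES_PER_CLB = 4  # assumption: Virtex-4 style CLB with 4 slices


def candidate_prr_shapes(required_slices, heights_clb):
    """For each candidate height (in CLBs), find the narrowest rectangle
    that still provides `required_slices`, and report its aspect ratio."""
    clbs_needed = math.ceil(required_slices / SLICES_PER_CLB)
    shapes = []
    for height in heights_clb:
        width = math.ceil(clbs_needed / height)
        shapes.append({
            "height": height,
            "width": width,
            "aspect_ratio": height / width,  # PRR Height / PRR Width
            "slices": height * width * SLICES_PER_CLB,
        })
    return shapes


if __name__ == "__main__":
    # Hypothetical module needing 1000 slices, tried at three heights.
    for shape in candidate_prr_shapes(1000, [16, 32, 64]):
        print(shape)
```

Each candidate covers the same logic with a different height-to-width trade-off, which is the variable the study sweeps when comparing clock frequency and partial bitstream size.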

16 Framework analysis – Beamforming (~125 MHz, 40%)
- 5022 slices
- 16 DSP48s
- 17 RAMB16s
- Baseline, non-PR performance = 1614 kB, MHz
[Plots: clock frequency (MHz) and bitstream size (kB) versus aspect ratio]

17 Framework analysis – CFAR (~100 MHz, 16%)
- 2610 slices
- 2 DSP48s
- 34 RAMB16s
- Baseline, non-PR performance = 1001 kB, MHz
[Plots: clock frequency (MHz) and bitstream size (kB) versus aspect ratio]

18 Framework analysis – AES (~80 MHz, 13.75%)
- 3634 slices
- 3943 registers
- 4 RAMB16s
- Baseline, non-PR performance = 1393 kB, MHz
[Plots: clock frequency (MHz) and bitstream size (kB) versus aspect ratio]
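
The resource counts on the three framework-analysis slides can illustrate why time-sharing a region saves area. The sketch below assumes, purely for illustration, that all three test cores were mapped to a single PRR: the region then needs only the per-resource maximum, whereas a fully static design needs the sum. The slides do not say the cores were combined this way, and the AES register count is left out because registers live inside slices.

```python
# Resource counts taken from the framework-analysis slides.
# AES lists registers rather than DSP48s, so it has no DSP48 entry here.
TEST_CORES = {
    "Beamforming": {"slices": 5022, "DSP48": 16, "RAMB16": 17},
    "CFAR": {"slices": 2610, "DSP48": 2, "RAMB16": 34},
    "AES": {"slices": 3634, "RAMB16": 4},
}


def shared_prr_requirement(prms):
    """Resources one PRR must offer to host any single PRM at a time."""
    requirement = {}
    for resources in prms.values():
        for kind, count in resources.items():
            requirement[kind] = max(requirement.get(kind, 0), count)
    return requirement


def static_requirement(prms):
    """Resources needed if every module is resident at the same time."""
    requirement = {}
    for resources in prms.values():
        for kind, count in resources.items():
            requirement[kind] = requirement.get(kind, 0) + count
    return requirement


if __name__ == "__main__":
    print("time-shared PRR:", shared_prr_requirement(TEST_CORES))
    print("all resident:   ", static_requirement(TEST_CORES))
```

Under this hypothetical mapping the shared region needs 5022 slices instead of 11266, at the cost of reconfiguring whenever a different core is requested.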

F4: Virtual Architecture and Design Automation for Partial Reconfiguration
CHREC Students: Abelardo Jara, Shaon Yousuf, Rohit Kumar, Terence Frederick
UF ECE Faculty: Dr. Ann Gordon-Ross, Dr. Alan D. George

20 Approach
Task 1: VA for PR Adaptive Embedded Systems
- SCORES Inter-module Communication Architecture
- VAPRES Multipurpose Base Embedded Platform
- Initial research on fast algorithms for online PRM placement and scheduling
Task 2: PR Design Flow Automation
- Framework to model and design PR systems
- Identification of points in the Xilinx PR Design Flow amenable to automation
- Software tools (C/C++ programs/scripts) for automatable steps
Task 3: Bitstream Relocation
- Port Bit Reloc to Microblaze
- Context save and restore for PRMs
PR for Application Designers

21 Background – VA for Adaptive PR Embedded Systems
- Multi-purpose base system platform to build runtime-adaptive HW processing embedded systems
  o Architectural support for on-demand HW module loading/unloading
- HW modules can offer better performance than SW modules
  o Exploit increased parallelism
  o Main bottleneck: inter-module communication flows through a centralized controller
  o Can be alleviated by adding a custom inter-module communication architecture
- VA benefits:
  o Adaptive base system platform: response to environmental changes, HW/SW partitioned applications
  o Time-shared virtual resources enable larger available area for system operations
  o Improved system resource utilization
- Case study application: PR for Mobile Agents
  o e.g. Geographical area divided into 4 regions (one processing node per region)
  o Adaptive embedded system at each processing node
[Figure labels: VAPRES node with SCORES, Controller and peripherals, External memory, Type A module, Type B module, Free slot; Type A target, Type B target, Target A, Target B]

22 VAPRES (Virtual Architecture for Partially Reconfigurable Adaptive Embedded Systems)
VAPRES Architectural Components
- Partially Reconfigurable Regions (PRRs)
  o Independently clocked using BUFRs
  o PR modules (PRMs) can span multiple PRRs
- Controlling agent (Microblaze):
  o Dynamic module placement and scheduling
  o Module control and context save/restore
  o Partial reconfiguration through ICAP
  o Communication with other VAPRES nodes
VAPRES Motivations/Benefits
- Embedded base architecture for multi-purpose PR systems
- Facilitates dynamic HW module placement and scheduling
- Provides dynamic module frequency scaling
- Computing power can be distributed among VAPRES-based nodes
[Figure labels: Microblaze, PRR1, PRR2, PRR3, PRR4, PRM A, Network-on-chip (SCORES), Fast Simplex Link (FSL), PLB Bus, ICAP, Flash controller, UART, USB, BUFR, Switch, Shared memory, Interface, Network (other VAPRES nodes)]
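
The controlling agent's dynamic module placement can be pictured as slot allocation over the PRRs. The sketch below is one simple policy, first-fit, and is not the VAPRES algorithm, which the slides do not describe. It assumes four PRRs as in the figure, and that a PRM spanning several PRRs needs adjacent ones; the function names and the module names A, B and C are hypothetical.

```python
def place_prm(slots, name, span=1):
    """First-fit placement: claim the first run of `span` adjacent free PRRs.

    `slots` is a list with one entry per PRR (None means free).
    Returns the index of the first PRR used, or None if nothing fits.
    """
    for start in range(len(slots) - span + 1):
        if all(slot is None for slot in slots[start:start + span]):
            for i in range(start, start + span):
                slots[i] = name
            return start
    return None


def remove_prm(slots, name):
    """Free every PRR currently occupied by the named PRM."""
    for i, slot in enumerate(slots):
        if slot == name:
            slots[i] = None


if __name__ == "__main__":
    prrs = [None] * 4                 # PRR1..PRR4, all free
    print(place_prm(prrs, "A"))       # single-PRR module
    print(place_prm(prrs, "B", 2))    # module spanning two PRRs
    print(place_prm(prrs, "C", 2))    # no two adjacent PRRs left
    remove_prm(prrs, "A")
    print(place_prm(prrs, "C", 2))    # two PRRs are free, but not adjacent
    print(prrs)
```

The last placement fails even though two PRRs are free, which shows the fragmentation problem that an online placement and scheduling algorithm has to manage.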

23 Background – Current Application PR Design Flow
- Manual steps
  o Partition the application into modules
  o Define static modules and partially reconfigurable modules (PRMs)
  o Determine the number of PR regions (PRRs)
  o Determine PRR sizes, shapes, and locations (resource allocation)
  o Map PRMs to PRRs
  o Define PRR interfaces and instantiate slice macros for PRR interfaces
- Automatable points and optimization problems (design-time)
  o Design partitioning
  o Number of PRRs
  o PRR sizes, shapes, and locations
  o Mapping PRMs to PRRs
  o Type and placement of PRR interfaces
  o Reconfiguration schedule
- Potential for automation through C/C++ programs or scripts
- PR is a very powerful feature of Xilinx FPGAs, but requires specialized skills
[Figure labels: (1) Design partitioning: Modules A, B and C split into static modules and Reconfigurable Modules (PRMs); (2) Design floorplanning and budgeting: FPGA with a static region (static modules; Central Controlling Agent, ICAP, Mem controller), PRR 1 (Modules: A and B) and PRR 2 (Modules: C); # of PRRs?]
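
The PRM-to-PRR mapping and the reconfiguration schedule interact: a request triggers a reconfiguration only when its PRR currently holds a different module. The sketch below counts reconfigurations for a request sequence under a given mapping. The mapping of A and B to PRR 1 and C to PRR 2 follows the figure; the request sequence, the single-PRR alternative and the function name are hypothetical, and the slides do not prescribe this as the cost model.

```python
def count_reconfigurations(mapping, requests):
    """Count how many partial reconfigurations a request sequence causes.

    `mapping` assigns each PRM to the PRR it is implemented for.
    A request reconfigures its PRR unless that PRM is already loaded there.
    """
    loaded = {}  # PRR name -> PRM currently configured in it
    count = 0
    for prm in requests:
        prr = mapping[prm]
        if loaded.get(prr) != prm:
            loaded[prr] = prm
            count += 1
    return count


if __name__ == "__main__":
    requests = ["A", "C", "B", "C", "A"]  # hypothetical request sequence

    two_prrs = {"A": "PRR 1", "B": "PRR 1", "C": "PRR 2"}  # as in the figure
    one_prr = {"A": "PRR 1", "B": "PRR 1", "C": "PRR 1"}   # hypothetical alternative

    print(count_reconfigurations(two_prrs, requests))
    print(count_reconfigurations(one_prr, requests))
```

Comparing the two counts for the same requests is a small instance of the design-time optimization problems listed above: more PRRs cost area but can avoid reloading a module that is requested repeatedly.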

24 Questions