
Hardware/Software Communication Middleware for Data Adaptable Embedded Systems Sachidanand Mahadevan, Vijay Shankar Gopinath, Roman Lysecky, Jonathan Sprinkle,


1 Hardware/Software Communication Middleware for Data Adaptable Embedded Systems. Sachidanand Mahadevan, Vijay Shankar Gopinath, Roman Lysecky, Jonathan Sprinkle, Jerzy Rozenblit, Michael W. Marcellin. Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ. This work was supported in part by the National Science Foundation under Grant CNS-0915010.

2 Introduction & Motivation: Data Adaptable Approach. Increasingly complex applications demand complex algorithms that are compute intensive and highly configurable. Example: JPEG2000 image compression. It provides significant advantages in quality and compression over the JPEG standard, and supports configurability at each processing stage (e.g. color transform, wavelet, block encoding, code stream). This results in high computational demands and a larger design space.

3 Introduction & Motivation: Traditional SW & HW Solutions. [Figure: three system architectures. Software Only (single/multicore µP); Hardware Accelerated (µP with I$ and D$ plus a dedicated JPEG2000 coprocessor HW IP); Reconfigurable (µP with I$ and D$, an FPGA coprocessor supporting dynamic reconfiguration, and bitstream memory).]
Goals                        Software Only   Hardware Accelerated   Reconfigurable
Configurability/Flexibility  Yes             No                     Yes
Performance                  No              Yes

4 Introduction & Motivation: Related Work. Data/input-driven reconfigurable systems. Adaptive Programming [Santambrogio et al., ICCAD 2007]: uses a programming technique that eliminates the need to recompute results if the inputs have not changed; can be applied both to software execution and to reconfigurable hardware. Function-level Data Adaptation [Sima & Bertels, ISDPS 2009]: determines whether executing in HW will be beneficial given the overhead of HW communication; performed at the function level by monitoring/profiling input parameters. [Figure: reconfigurable system with µP, I$, D$, an FPGA coprocessor supporting dynamic reconfiguration, and bitstream memory.]

5 Introduction & Motivation: Data-Adaptable Reconfigurable Embedded Systems (DARES). Reconfigurable systems for highly configurable, compute-intensive applications. Can be reconfigured at runtime for immediate application needs. How and when to reconfigure is specific to the application and the data input. Goal: reconfigure hardware tasks within the FPGA based upon the current data profile. [Figure: a µP and a reconfigurable FPGA turn an input stream (...10110000) into an output stream (10011000...) using Task A (512x512), Task B (5/3), and Task C (Casual). A new input stream (...000010100) arrives with a new data profile: 14 bits/channel; Task A (1024x1024, 4:4:2); Task B (Wavelet 5/3); Task C (Error Resilient). Also shown: Task A (1024x1024) and the available HW task implementations (Tasks A, B, C, D).]
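The goal on this slide (pick hardware task variants from the current data profile) can be sketched as a lookup followed by a diff against what is already loaded. This is only an illustration under stated assumptions: the `DataProfile` fields, the `HW_TASK_LIBRARY` table, and the bitstream file names are hypothetical and do not come from the DARES toolchain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProfile:
    """Hypothetical summary of the incoming data stream."""
    tile_size: int      # tile edge in pixels, e.g. 512 or 1024
    wavelet: str        # e.g. "5/3"
    coding_mode: str    # e.g. "default" or "error_resilient"


# Hypothetical library of pre-built hardware task variants.
# Keys are (task, parameter value); values stand in for bitstream names.
HW_TASK_LIBRARY = {
    ("A", 512): "taskA_512.bit",
    ("A", 1024): "taskA_1024.bit",
    ("B", "5/3"): "taskB_53.bit",
    ("C", "default"): "taskC_default.bit",
    ("C", "error_resilient"): "taskC_error_resilient.bit",
}


def select_tasks(profile):
    """Return the variant each task needs for the given profile."""
    return {
        "A": HW_TASK_LIBRARY[("A", profile.tile_size)],
        "B": HW_TASK_LIBRARY[("B", profile.wavelet)],
        "C": HW_TASK_LIBRARY[("C", profile.coding_mode)],
    }


def plan_reconfiguration(loaded, profile):
    """Return only the tasks whose loaded variant differs from the needed one."""
    needed = select_tasks(profile)
    return {task: v for task, v in needed.items() if loaded.get(task) != v}


# Start with the 512x512 profile, then switch to the new profile.
loaded = select_tasks(DataProfile(512, "5/3", "default"))
plan = plan_reconfiguration(loaded, DataProfile(1024, "5/3", "error_resilient"))
```

In this sketch Task B is left untouched because both profiles use the 5/3 wavelet, so only Tasks A and C would be reconfigured.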

6 DARES Approach: Project Goals. Develop a model-driven framework for specifying application tasks, processing requirements, data configurability, and target data profiles for hardware support. Develop runtime middleware and a communication framework for runtime communication and system reconfiguration. Develop an automated tool flow supporting the proposed methodology. [Figure: DARES system diagram with input stream, output stream, µP, reconfigurable FPGA holding Tasks A, B, and C, and the set of HW task implementations.]

7 DARES Approach: Design Methodology and Toolchain Overview. (1) Modeling Framework; (2) SW Task Generation/Compilation; (3) HW Coprocessor Generation/Synthesis; (4) HW/SW Communication Framework; (5) Final Software/Hardware Implementation. [Figure: toolchain flow with these elements: Application and Data Profile Model; HW/SW Codesign (Model Interpreter); Init. Code; Software Threads; Communication Middleware; Software Compiler (gcc); Software Binary; Software Model for HW Tasks; ImpulseC CoDeveloper; Hardware Tasks; HW/SW Comm. Framework; Xilinx ISE; Hardware Task Bitstreams. Numbers (1) to (5) mark the steps.]

8 DARES Approach: Design Methodology and Toolchain Overview. Same step list and toolchain figure as the previous slide, with only step (4), HW/SW Communication Framework, marked in the figure.

9 DARES Approach: Hardware/Software Communication Framework. Hardware coprocessors are integrated with the hardware/software communication framework. It supports seamless communication between software and hardware tasks in conjunction with the communication middleware. Efficient communication mechanisms are supported between adjacent and non-adjacent hardware tasks. [Figure: a User IP (HW Task) attached to the System Bus (PLB) through a memory-mapped FIFO Bus Interface, with FIFO In (Bus2FIFO) and FIFO Out (FIFO2Bus). FIFO signals: Fpin_wren, Fpin_wdata, Fpin_full, Fpout_wren, Fpout_wdata, Fpout_full.]
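The FIFO signals named on this slide (write enable, write data, full) suggest a simple handshake: a word is accepted only when write enable is asserted and the FIFO is not full. The behavioral model below is a sketch under that assumption. The class `HwFifo`, its depth, and the `read` method are hypothetical and are not taken from the DARES hardware.

```python
from collections import deque


class HwFifo:
    """Behavioral model of a bounded FIFO with a 'full' flag (assumed semantics)."""

    def __init__(self, depth):
        self.depth = depth
        self._words = deque()

    @property
    def full(self):
        # Mirrors a *_full status signal: no more words can be accepted.
        return len(self._words) >= self.depth

    def write(self, wren, wdata):
        """Accept wdata only if write enable is set and the FIFO is not full.

        Returns True when the word was stored, False when it was dropped.
        """
        if wren and not self.full:
            self._words.append(wdata)
            return True
        return False

    def read(self):
        """Remove and return the oldest word, or None if the FIFO is empty."""
        return self._words.popleft() if self._words else None


def send_block(fifo, words):
    """Producer side: push words until the FIFO refuses one; return the count sent."""
    sent = 0
    for word in words:
        if not fifo.write(True, word):
            break
        sent += 1
    return sent


fifo = HwFifo(depth=4)
accepted = send_block(fifo, [10, 20, 30, 40, 50])  # fifth word does not fit
first = fifo.read()                                # frees one slot
```

A real driver would poll a memory-mapped status register instead of a Python property, but the producer loop would follow the same pattern: check for space, write, and retry the remainder later.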

10 DARES Approach: Hardware/Software Communication Methods. Software to Software (SW Buffer); Software to Software (HW FIFO). [Figure: one µP, memory, and task diagram per method, plus the memory-mapped FIFO Bus Interface (System Bus (PLB), User IP (HW Task), FIFO In, Bus2FIFO, FIFO Out, FIFO2Bus, Fpin_* and Fpout_* signals).]

11 DARES Approach: Hardware/Software Communication Methods. Software to Hardware; Hardware to Hardware (Adjacent). [Figure: one µP, memory, and task diagram per method, plus the memory-mapped FIFO Bus Interface (System Bus (PLB), User IP (HW Task), FIFO In, Bus2FIFO, FIFO Out, FIFO2Bus, Fpin_* and Fpout_* signals).]

12 DARES Approach: Hardware/Software Communication Methods. Hardware to Hardware (Non-Adjacent); Hardware to Software. [Figure: one µP, memory, and task diagram per method, plus the memory-mapped FIFO Bus Interface (System Bus (PLB), User IP (HW Task), FIFO In, Bus2FIFO, FIFO Out, FIFO2Bus, Fpin_* and Fpout_* signals).]
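The communication-methods slides name six ways for a producer task to reach a consumer task. One way to picture how middleware might choose among them is a small dispatch function keyed on where each task runs. This is a sketch only: the function `select_method`, the `Domain` enum, and the `adjacent` and `use_hw_fifo` flags are hypothetical, and the slides do not say how adjacency is determined.

```python
from enum import Enum


class Domain(Enum):
    SW = "software"
    HW = "hardware"


def select_method(producer, consumer, adjacent=False, use_hw_fifo=False):
    """Map a producer/consumer pair to one of the six named methods.

    adjacent    -- assumed flag: the two HW tasks can be linked directly
    use_hw_fifo -- assumed flag: route SW-to-SW traffic through a HW FIFO
    """
    if producer is Domain.SW and consumer is Domain.SW:
        if use_hw_fifo:
            return "Software to Software (HW FIFO)"
        return "Software to Software (SW Buffer)"
    if producer is Domain.SW and consumer is Domain.HW:
        return "Software to Hardware"
    if producer is Domain.HW and consumer is Domain.HW:
        if adjacent:
            return "Hardware to Hardware (Adjacent)"
        return "Hardware to Hardware (Non-Adjacent)"
    return "Hardware to Software"


# Example: a four-stage pipeline where the middle two stages run in hardware.
placement = [Domain.SW, Domain.HW, Domain.HW, Domain.SW]
links = [
    select_method(src, dst, adjacent=True)
    for src, dst in zip(placement, placement[1:])
]
```

For this placement, `links` holds a software-to-hardware hop, an adjacent hardware-to-hardware hop, and a hardware-to-software hop.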

13 DARES Approach: Case Study, JPEG (not JPEG2000): Experimental Setup. Consider a JPEG image compression application. Generated software and hardware implementations of the JPEG encoding tasks using the DARES toolchain: discrete cosine transform (dct), quantization (qnt), zig-zag ordering (zz), and run-length encoding (rle). Independently verified the software and hardware-accelerated implementations. Evaluated system performance for all combinations of hardware coprocessors available within the system. Platform: Virtex-5 FX FPGA (ML507 board). Two µP:FPGA frequency configurations: 4:1 (400 MHz PowerPC processor with 100 MHz PLB bus) and 2:1 (250 MHz PowerPC processor with 125 MHz PLB bus). Three SW-to-HW communication options: no DMA, DMA (1-block transfers), DMA (4-block transfers). [Figure: µP and reconfigurable FPGA containing dct, qnt, zz, and rle.]
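To get a feel for the size of this experiment space, the sketch below enumerates every subset of the four tasks that could be placed in hardware and crosses it with the two frequency configurations and three communication options. It assumes a full cross product and counts the empty subset as the software-only baseline; the slides do not state that every one of these points was actually measured.

```python
from itertools import combinations, product

TASKS = ("dct", "qnt", "zz", "rle")

# (processor MHz, PLB bus MHz) for each µP:FPGA ratio, as listed on the slide.
FREQ_CONFIGS = {"4:1": (400, 100), "2:1": (250, 125)}

DMA_OPTIONS = ("no DMA", "DMA (1 block)", "DMA (4 blocks)")


def hw_subsets(tasks):
    """Yield every subset of tasks that could be implemented in hardware."""
    for size in range(len(tasks) + 1):
        for subset in combinations(tasks, size):
            yield subset


# Assumed full cross product of HW task subsets, ratios, and DMA options.
experiments = [
    {"hw_tasks": hw, "ratio": ratio, "dma": dma}
    for hw, ratio, dma in product(hw_subsets(TASKS), FREQ_CONFIGS, DMA_OPTIONS)
]

subset_count = len(list(hw_subsets(TASKS)))   # 2**4 subsets
experiment_count = len(experiments)           # subsets * ratios * DMA options
```

Under these assumptions there are 16 hardware/software partitions and 96 configuration points in total. For the software-only partition the DMA option makes no difference, so the number of distinct measurements would be smaller.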

14 DARES Approach: Case Study, JPEG (not JPEG2000): Experimental Results, Area. The HW/SW Communication Framework (HWCF) requires 23% to 35% of the hardware task's area and 19% of total system area.
Area      HWCF   DCT    QNT    ZZ     RLE    Total
LUTs      843    4811   2603   3037   4850   19,693
FFs       806    1895   2145   2247   2226   13,886
% HWCF           24%    35%    31%    23%    19%
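The slide does not say how the "% HWCF" row is derived. One reading that comes close to the reported figures is to add LUTs and FFs together and divide the framework's area by each task's area, with the total row counting one framework instance per task. The sketch below checks that reading; the formula is an assumption, and it matches the reported percentages only to within about one percentage point.

```python
# Values read from the area table on this slide.
HWCF = {"luts": 843, "ffs": 806}
TASK_AREA = {
    "DCT": {"luts": 4811, "ffs": 1895},
    "QNT": {"luts": 2603, "ffs": 2145},
    "ZZ": {"luts": 3037, "ffs": 2247},
    "RLE": {"luts": 4850, "ffs": 2226},
}
TOTAL_AREA = {"luts": 19693, "ffs": 13886}
REPORTED_PCT = {"DCT": 24, "QNT": 35, "ZZ": 31, "RLE": 23, "Total": 19}


def cells(resources):
    """Assumed area metric: LUTs and FFs simply added together."""
    return resources["luts"] + resources["ffs"]


def hwcf_percentages():
    """Framework area as a percentage of each task and of the whole system."""
    pct = {
        name: 100.0 * cells(HWCF) / cells(area)
        for name, area in TASK_AREA.items()
    }
    # Assumption: one framework instance per hardware task in the full system.
    pct["Total"] = 100.0 * len(TASK_AREA) * cells(HWCF) / cells(TOTAL_AREA)
    return pct


computed = hwcf_percentages()
largest_gap = max(abs(computed[k] - REPORTED_PCT[k]) for k in REPORTED_PCT)
```

The remaining gap may come from rounding or from a different area metric in the original analysis.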

15 DARES Approach: Case Study, JPEG (not JPEG2000): Experimental Results, Performance (4:1 µP:FPGA Configuration). Achieves a performance improvement of 1.9X for a single hardware task and up to 6.3X with all tasks executing in hardware, compared to software-only implementations. Non-adjacent communication between HW tasks using FIFO2Bus (not shown) is 0.05% to 0.24% slower, a marginal overhead due to bus transaction wait times.

16 DARES Approach: Case Study, JPEG (not JPEG2000): Experimental Results, Performance (2:1 µP:FPGA Configuration). Overall performance is slower than in the 4:1 configuration: the reduced software performance is not outweighed by faster communication. 4.6X speedup with all tasks implemented in HW (compared to 5X for 4:1).

17 Conclusions and Future Work. Conclusions: The Hardware/Software Communication Framework provides a seamless communication method between software and hardware tasks, and is highly configurable, providing several methods for communication. The DARES approach using the communication framework achieves a greater than 6X runtime performance improvement. Area requirements for the communication framework are moderate: 19% of total system requirements. Future Work: Develop fully automated tools for DARES approaches, including synthesis-in-the-loop hardware coprocessor performance/area/power estimation. Develop hardware/software codesign methods for determining the optimal system implementation subject to area (physical and memory) constraints. Integrate methods for generating partial bitstreams supporting dynamic reconfiguration. Develop a runtime manager for reconfiguring the HW/SW implementation at runtime given the input data profile. Extend the communication framework to support multiple input/output FIFOs. Apply DARES approaches to relevant highly configurable, compute-intensive applications, e.g. JPEG2000.

18 Questions? Thank you!

