Jian Huang, Matthew Parris, Jooheung Lee, and Ronald F. DeMara

Slides:

Advertisements

Similar presentations

Commercial FPGAs: Altera Stratix Family Dr. Philip Brisk Department of Computer Science and Engineering University of California, Riverside CS 223.

Advertisements

A HIGH-PERFORMANCE IPV6 LOOKUP ENGINE ON FPGA Author : Thilan Ganegedara, Viktor Prasanna Publisher : FPL 2013.

Distributed Arithmetic

Zhiguo Ge, Weng-Fai Wong, and Hock-Beng Lim Proceedings of the Design, Automation, and Test in Europe Conference, 2007 (DATE’07) April /4/17.

1 SECURE-PARTIAL RECONFIGURATION OF FPGAs MSc.Fisnik KRAJA Computer Engineering Department, Faculty Of Information Technology, Polytechnic University of.

INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS, ICT '09. TAREK OUNI WALID AYEDI MOHAMED ABID NATIONAL ENGINEERING SCHOOL OF SFAX New Low Complexity.

Committee Members: Annie S. Wu, Jooheung Lee, and Ronald F. DeMara Committee Members: Annie S. Wu, Jooheung Lee, and Ronald F. DeMara Optimizing Dynamic.

A Survey of Logic Block Architectures For Digital Signal Processing Applications.

A Matlab Playground for JPEG Andy Pekarske Nikolay Kolev.

SOAR Processing with Field Programmable Gate Arrays Presented by Matthew Scarpino Hoplite Systems LLC.

1 Student: Khinich Fanny Instructor: Fiksman Evgeny המעבדה למערכות ספרתיות מהירות High Speed Digital Systems Laboratory הטכניון - מכון טכנולוגי לישראל.

ENGIN112 L38: Programmable Logic December 5, 2003 ENGIN 112 Intro to Electrical and Computer Engineering Lecture 38 Programmable Logic.

Department of Electrical and Computer Engineering Texas A&M University College Station, TX Abstract 4-Level Elevator Controller Lessons Learned.

1 Performed by: Lin Ilia Khinich Fanny Instructor: Fiksman Eugene המעבדה למערכות ספרתיות מהירות High Speed Digital Systems Laboratory הטכניון - מכון טכנולוגי.

Dynamically Parameterized Architectures for Power Aware Video Coding: Motion Estimation and DCT Wayne Burleson Prashant Jain

V The DARPA Dynamic Programming Benchmark on a Reconfigurable Computer Justification High performance computing benchmarking Compare and improve the performance.

Data Partitioning for Reconfigurable Architectures with Distributed Block RAM Wenrui Gong Gang Wang Ryan Kastner Department of Electrical and Computer.

Storage Assignment during High-level Synthesis for Configurable Architectures Wenrui Gong Gang Wang Ryan Kastner Department of Electrical and Computer.

HW/SW Co-Synthesis of Dynamically Reconfigurable Embedded Systems HW/SW Partitioning and Scheduling Algorithms.

Juanjo Noguera Xilinx Research Labs Dublin, Ireland Ahmed Al-Wattar Irwin O. Irwin O. Kennedy Alcatel-Lucent Dublin, Ireland.

Networking Virtualization Using FPGAs Russell Tessier, Deepak Unnikrishnan, Dong Yin, and Lixin Gao Reconfigurable Computing Group Department of Electrical.

03/12/20101 Analysis of FPGA based Kalman Filter Architectures Arvind Sudarsanam Dissertation Defense 12 March 2010.

Platform-based Design for MPEG-4 Video Encoder Presenter: Yu-Han Chen.

1 Miodrag Bolic ARCHITECTURES FOR EFFICIENT IMPLEMENTATION OF PARTICLE FILTERS Department of Electrical and Computer Engineering Stony Brook University.

H.264 Deblocking Filter Irfan Ullah Department of Information and Communication Engineering Myongji university, Yongin, South Korea Copyright © solarlits.com.

ISE. Tatjana Petrovic 249/982/22 ISE software tools ISE is Xilinx software design tools that concentrate on delivering you the most productivity available.

An automatic tool flow for the combined implementation of multi-mode circuits Brahim Al Farisi, Karel Bruneel, João Cardoso, Dirk Stroobandt.

Topic Tree for Mini Literary Survey Video. Tree Papers 3-D Frame Frequency Multiplier 26.Haijin Fan, Fan Yang, Qingmin Liao, "Frame Frequency Multiplier.

Computer Architecture and Organization Introduction.

Efficient FPGA Implementation of QR

Heng Tan Ronald Demara A Device-Controlled Dynamic Configuration Framework Supporting Heterogeneous Resource Management.

J. Greg Nash ICNC 2014 High-Throughput Programmable Systolic Array FFT Architecture and FPGA Implementations J. Greg.

Reconfigurable Computing Using Content Addressable Memory (CAM) for Improved Performance and Resource Usage Group Members: Anderson Raid Marie Beltrao.

Radix-2 2 Based Low Power Reconfigurable FFT Processor Presented by Cheng-Chien Wu, Master Student of CSIE,CCU 1 Author: Gin-Der Wu and Yi-Ming Liu Department.

Design Space Exploration for Application Specific FPGAs in System-on-a-Chip Designs Mark Hammerquist, Roman Lysecky Department of Electrical and Computer.

1 Reconfigurable Acceleration of Microphone Array Algorithms for Speech Enhancement Ka Fai Cedric Yiu, Yao Lu, Xiaoxiang Shi The Hong Kong Polytechnic.

An FPGA Implementation of the Ewald Direct Space and Lennard-Jones Compute Engines By: David Chui Supervisor: Professor P. Chow.

Design of an 8-bit Carry-Skip Adder Using Reversible Gates Vinothini Velusamy, Advisor: Prof. Xingguo Xiong Department of Electrical Engineering, University.

A Configurable High-Throughput Linear Sorter System Jorge Ortiz Information and Telecommunication Technology Center 2335 Irving Hill Road Lawrence, KS.

Task Graph Scheduling for RTR Paper Review By Gregor Scott.

Advances in digital image compression techniques Guojun Lu, Computer Communications, Vol. 16, No. 4, Apr, 1993, pp

A Physical Resource Management Approach to Minimizing FPGA Partial Reconfiguration Overhead Heng Tan and Ronald F. DeMara University of Central Florida.

Fuzzy sliding mode controller for DC motor Advisor : Ying Shieh Kung Student: Bui Thi Hai Linh Southern Taiwan University Seminar class

Processor Architecture

-BY KUSHAL KUNIGAL UNDER GUIDANCE OF DR. K.R.RAO. SPRING 2011, ELECTRICAL ENGINEERING DEPARTMENT, UNIVERSITY OF TEXAS AT ARLINGTON FPGA Implementation.

Research on TCAM-based OpenFlow Switch Author: Fei Long, Zhigang Sun, Ziwen Zhang, Hui Chen, Longgen Liao Conference: 2012 International Conference on.

COMPARATIVE STUDY OF HEVC and H.264 INTRA FRAME CODING AND JPEG2000 BY Under the Guidance of Harshdeep Brahmasury Jain Dr. K. R. RAO ID MS Electrical.

Hierarchical Systolic Array Design for Full-Search Block Matching Motion Estimation Noam Gur Arie,August 2005.

Dr D. Greer, Queens University Belfast ) Software Engineering Chapter 7 Software Architectural Design Learning Outcomes Understand.

Presenter: Yi-Ting Chung Fast and Scalable Hybrid Functional Verification and Debug with Dynamically Reconfigurable Co- simulation.

Automated Software Generation and Hardware Coprocessor Synthesis for Data Adaptable Reconfigurable Systems Andrew Milakovich, Vijay Shankar Gopinath, Roman.

Optimizing Interconnection Complexity for Realizing Fixed Permutation in Data and Signal Processing Algorithms Ren Chen, Viktor K. Prasanna Ming Hsieh.

Dynamic and On-Line Design Space Exploration for Reconfigurable Architecture Fakhreddine Ghaffari, Michael Auguin, Mohamed Abid Nice Sophia Antipolis University.

Fang Fang James C. Hoe Markus Püschel Smarahara Misra

Author: Yun R. Qu, Shijie Zhou, and Viktor K. Prasanna Publisher:

Backprojection Project Update January 2002

A Partial Reconfiguration Controller for Altera Stratix V FPGAs

Floating-Point FPGA (FPFPGA)

Topics SRAM-based FPGA fabrics: Xilinx. Altera..

EEE2135 Digital Logic Design Chapter 1. Introduction

Direct Attached Storage and Introduction to SCSI

Centar ( Global Signal Processing Expo

FIGURE 7.1 Conventional and array logic diagrams for OR gate

ENEE 631 Project Video Codec and Shot Segmentation

Hybrid Architecture of DCT/DFT/Wavelet Transform (HWT)

Scalable Memory-Less Architecture for String Matching With FPGAs

Dynamic Partial Reconfiguration of FPGA

Course Outline for Computer Architecture

Power-efficient range-match-based packet classification on FPGA

Presentation transcript:

Jian Huang, Matthew Parris, Jooheung Lee, and Ronald F. DeMara Scalable FPGA Architecture for DCT Computation Using Dynamic Partial Reconfiguration Jian Huang, Matthew Parris, Jooheung Lee, and Ronald F. DeMara School of Electrical Engineering and Computer Science, University of Central Florida Experimental Results Architecture Abstract DCT: Reconfigurations of PEs: The PEs can be removed or added using dynamic partial reconfiguration to achieve zonal coding for the DCT coefficients The internal logic of PEs can also be changed through dynamic partial reconfiguration to reduce the precision of resultant DCT coefficients. Advantages: The DCT computations are not interrupted when switching from different zones or different precisions. The unused PEs can be utilized by reconfiguration for other functions, such as motion estimation computation. In this paper, we propose FPGA-based scalable architecture for DCT computation using dynamic partial reconfiguration. Our architecture can achieve quality scalability using dynamic partial reconfiguration. This is important for some critical applications that need continuous hardware servicing. Our scalable architecture has two features. First, the architecture can perform DCT computations for eight different zones, i.e., from 1×1 DCT to 8×8 DCT. Second, the architecture can change the configuration of processing elements to trade off the precisions of DCT coefficients with computational complexity. Using dynamic partial reconfiguration with 2.1 MB bitstreams, 16 distinct hardware architectures can be implemented. We show the experimental results and comparisons between different configurations using both partial reconfiguration and non-partial reconfiguration process. Introduction Our scalable 2D-DCT architecture is based on distributed arithmetic (DA). DA architecture is suitable for FPGA implementation due to its ROM-based computations of inner products. Our scalable architecture is working in two different modes to reduce the computational complexity and the power consumption. First, it can achieve quality scalability by performing 2D-DCT operations for different zones, i.e., from 1×1 to 8×8, as shown in Fig. 1. Only the DCT coefficients in the shaded area are computed. Our scalable architecture can be adjusted through dynamic partial reconfiguration to perform different types of DCT zonal coding. Second, our scalable architecture can be reconfigured to reduce the precision of DCT coefficients especially when quantization parameter increases to achieve high compression ratio. Third, our scalable architecture can reconfigure the unused DCT computation module for other functions, such as motion estimation computation. Fig. 1. Different Zones for DCT Computation Precision comparisons Top level architecture of scalable DCT Schematic diagram of PE The controller is used to generate the address and control signals for data fetching and data assigning. The shared memory, data mapping, and butterfly addition/subtraction are combined together. Mapping block is to read the data from the memory or write the data to the memory according to the address and control signals generated by the controller module. Addition/subtraction performs butterfly addition or subtraction to generate the input data for the PEs. Using the Xilinx Early Access Partial Reconfiguration (EAPR) design flow, the scalable architecture is implemented on the Xilinx Virtex-4 SX35 Video Starter Kit. The scalable architecture previously described naturally allows each PE to reside within a separate reconfiguration area for modification of its configuration without disturbing the remaining portion of the FPGA. Power and throughput comparisons Conclusion In this paper, we have presented our exploration of scalable architecture of DCT computations using FPGA dynamic partial reconfiguration. We used distributed arithmetic based architecture for DCT computations. Using dynamic partial reconfiguration, the processing elements of the DCT architecture can be changed on the fly including the number of the PEs and the internal logic of the PEs. The FPGA does not need to be stopped while changing the configuration, which is important for many image/video applications. We provided detailed implementation results and comparisons for different configurations of PEs using both partial reconfiguration process and non-partial reconfiguration process. Location of 8 PEs on V4SX35