Streaming Integral Image Generation on FPGA. Michael DeBole. Acknowledgements: K. Irick. The Pennsylvania State University, Department of Computer Science & Engineering.

Presentation transcript:

1 Streaming Integral Image Generation on FPGA
Michael DeBole
Acknowledgements: K. Irick
The Pennsylvania State University
Department of Computer Science & Engineering
Microsystems Design Lab

2 Ubiquitous Distributed Systems?
Traditional System: Relies on a Central Point for Computation and Video Analysis
Limitations:
- Large Area
- High Power and Energy Requirements
- Sub-Optimal Performance
- Costly
Equip Each Camera with a Local Processing Element!
Distributed System with Pre-Processing Done at Camera Nodes!
Advantages of the Distributed System:
- Low Area
- High Performance
- Cost Efficient

3 Integral Image Computation

4 Storage Requirements
Indices: i, j
Example: 512 x 512 Image, 8-Bit Pixels (Grayscale)
32-Bit Words =
Max Bits =
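The storage figures for the 512 x 512 example can be worked out directly from the image size and pixel depth. The sketch below is illustrative only: the function name and its return layout are assumptions, and the numbers it produces are computed here rather than copied from the slide. It assumes the worst case, in which every pixel holds the maximum 8-bit value.

```python
def integral_image_storage(rows, cols, pixel_bits=8, word_bits=32):
    """Worst-case storage figures for an integral image (hypothetical helper)."""
    max_pixel = (1 << pixel_bits) - 1          # 255 for 8-bit grayscale
    max_sum = rows * cols * max_pixel          # largest value, at the last pixel
    max_bits = max_sum.bit_length()            # bits needed to hold that value
    fixed_word_total = rows * cols * word_bits # total bits if every entry is a full word
    return max_sum, max_bits, fixed_word_total


max_sum, max_bits, fixed_word_total = integral_image_storage(512, 512)
print(max_sum, max_bits, fixed_word_total)
```

Under these assumptions the largest sum is 66,846,720, which fits in 26 bits, while storing every entry as a 32-bit word takes 8,388,608 bits for the whole image.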

5 Streaming Computation
Raster Scan
Goals:
- Minimal Internal Storage
- Small Latency
- Pixel Rate Frequency (~27 MHz)
Components:
- Accumulator
- Single Adder
- RAM (# of Entries = Num of Rows); # of Bits Equals Bits Needed for Last Sum
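A software model can make the accumulator, adder, and RAM roles concrete. This is a minimal sketch, not the presented hardware: the function name is hypothetical, and it assumes a row-major raster scan with a buffer holding one entry per position along the scan line (the slide sizes the RAM by the number of rows, which would correspond to this if the scan runs along columns or the image is square).

```python
def streaming_integral_image(pixels, width):
    """Compute integral-image values one pixel at a time in raster order.

    pixels: flat iterable of pixel values, row by row.
    width:  number of pixels per scan line (assumed known up front).
    """
    line_buf = [0] * width   # models the RAM: integral values of the previous line
    row_sum = 0              # models the accumulator: running sum of the current line
    out = []
    for idx, p in enumerate(pixels):
        col = idx % width
        if col == 0:
            row_sum = 0                      # new scan line: clear the accumulator
        row_sum += p                         # accumulate along the current line
        value = line_buf[col] + row_sum      # adder: value above plus current line sum
        line_buf[col] = value                # overwrite for use by the next line
        out.append(value)
    return out


print(streaming_integral_image([1, 2, 3, 4], width=2))  # [1, 3, 4, 10]
```

Each output depends only on the incoming pixel, the accumulator, and one buffer entry, so no full frame needs to be stored and one result is produced per pixel.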

6 Dynamic Memory Storage
Based on Current Position (i,j)
Need to determine the number of bits needed to store the current sum
Recall:
Tricks:
- Images: 256 < M,N < 1024
- i and j require 10 bits
- Slight Overestimate
- 10-Bit Address Lookup
- Dual Port Memory (1024 entries x ~4 bits)
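One way to read the lookup trick is as a small table of bit lengths indexed by the 10-bit coordinate, with the two lookups (one for i, one for j) added to the 8-bit pixel depth. The sketch below is an interpretation under that assumption, not the presented design: the table contents, the names `BITLEN_LUT`, `bits_estimate`, and `bits_exact`, and the zero-based coordinates are all hypothetical.

```python
PIXEL_BITS = 8  # 8-bit grayscale pixels, as in the storage example

# Models the 1024-entry memory: entry k holds the bit length of (k + 1),
# the count of rows or columns covered up to zero-based coordinate k.
# The largest entry is 11, so each entry fits in 4 bits.
BITLEN_LUT = [(k + 1).bit_length() for k in range(1024)]


def bits_estimate(i, j):
    """Upper bound on bits for the sum at (i, j), using two table lookups."""
    return PIXEL_BITS + BITLEN_LUT[i] + BITLEN_LUT[j]


def bits_exact(i, j):
    """Exact worst-case bit count for the sum at (i, j), for comparison."""
    return (255 * (i + 1) * (j + 1)).bit_length()


for i, j in [(0, 0), (255, 255), (511, 511), (1023, 1023)]:
    print((i, j), bits_estimate(i, j), bits_exact(i, j))
```

Because the estimate adds bit lengths instead of multiplying the coordinates, it never undershoots and, in this model, exceeds the exact requirement by at most 2 bits, which is consistent with the "slight overestimate" noted above. A dual-port memory would allow both lookups in the same cycle.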

7 Integral Image Architecture

8 System Configuration

9 Current System Setup
Xilinx Tools: Xilinx ISE, XPS, SDK
ML410 System Development Board
Virtex4-FX60 device:
- 2 Embedded PPC Cores
- Slices: 25,280
- DSP48s: 128
- BlockRams: 232

10 Integral Image System Status:
- ML410 Base System With Ethernet
- Host Ethernet Application
- Integral Image Hardware (w/ Support Logic)
- Integral Image Hardware w/ PLBstreamer
- Map Blob/Filtering Application to FPGA
Legend: Complete, To Be Done

11 Ethernet Application

12 Integral Image Simulation Results
1 Image = 5 ms
Real Time = 33 ms

13 Hardware Implementation

Integral Image Hardware: Device Utilization Summary
Logic Utilization | Used | Available | Utilization
Slices            |      |           | < 1%
Slice Flip-Flops  |      |           | < 1%
LUTs              |      |           | < 1%

Integral Image Streamer Hardware: Device Utilization Summary
Logic Utilization | Used | Available | Utilization
Slices            |      |           | %
Slice Flip-Flops  |      |           | %
LUTs              |      |           | %

14 Conclusions
- Real-Time Streaming Integral Imaging Hardware
- Minimal Resources, Application-Specific Memory Utilization
To Do: Map Application to FPGA

15 Thank You! Questions?