1 Streaming Integral Image Generation on FPGA. Michael DeBole. Acknowledgements: K. Irick. The Pennsylvania State University, Department of Computer Science & Engineering, Microsystems Design Lab
2 Ubiquitous Distributed Systems? Traditional System: Relies on a Central Point for Computation and Video Analysis. Limitations: Large Area; High Power and Energy Requirements; Sub-Optimal Performance; Costly. Proposal: Equip Each Camera with a Local Processing Element, Giving a Distributed System with Pre-Processing Done at the Camera Nodes. Advantages of the Distributed System: Low Area; High Performance; Cost Efficient.
3 Integral Image Computation
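For reference, the standard definition of an integral image is that the value at position (i, j) equals the sum of all pixels at or above row i and at or to the left of column j. The sketch below is a plain software illustration of that definition, not the hardware design presented in these slides; the names `integral_image` and `box_sum` are hypothetical and chosen only for this example.

```python
def integral_image(img):
    """Return the integral image of a 2-D list of pixel values.

    ii[i][j] is the sum of img[r][c] for all r <= i and c <= j.
    """
    rows, cols = len(img), len(img[0])
    ii = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            above = ii[i - 1][j] if i > 0 else 0
            left = ii[i][j - 1] if j > 0 else 0
            diag = ii[i - 1][j - 1] if i > 0 and j > 0 else 0
            # Inclusion-exclusion: the diagonal block is counted twice.
            ii[i][j] = img[i][j] + above + left - diag
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of pixels in the inclusive rectangle, using at most four lookups."""
    total = ii[bottom][right]
    if top > 0:
        total -= ii[top - 1][right]
    if left > 0:
        total -= ii[bottom][left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1][left - 1]
    return total
```

Once the integral image is available, the sum over any rectangle costs a constant number of lookups, which is why it is a common pre-processing step for filtering applications.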
4 Storage Requirements i j Example: 512 x 512 Image 8-Bit Pixels (Grayscale) 32-Bit Words = Max Bits =
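The arithmetic for the stated example can be worked through as follows. This is a sketch computed from the example's parameters (512 x 512, 8-bit pixels), assuming the worst case in which every pixel has the maximum value 255; the function names are hypothetical, and the results are not quoted from the slide.

```python
def max_sum_bits(rows, cols, pixel_bits=8):
    """Bits needed to hold the largest possible integral image value."""
    max_pixel = (1 << pixel_bits) - 1      # 255 for 8-bit grayscale
    worst_case = max_pixel * rows * cols   # every pixel at maximum value
    return worst_case.bit_length()


def storage_bits(rows, cols, word_bits):
    """Total bits to store one value of word_bits per pixel position."""
    return rows * cols * word_bits


bits_needed = max_sum_bits(512, 512)               # 26
full_words = storage_bits(512, 512, 32)            # 8,388,608 bits
tight_words = storage_bits(512, 512, bits_needed)  # 6,815,744 bits
```

Under these assumptions the bottom-right sum fits in 26 bits, so a fixed 32-bit word leaves 6 bits unused at every position, and sums near the top-left corner need far fewer bits still.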
5 Streaming Computation (Raster Scan). Goals: Minimal Internal Storage; Small Latency; Pixel Rate Frequency (~27 MHz). Components: Accumulator; Single Adder; RAM (# of Entries = Num of Rows), # of Bits Equals Bits Needed for Last Sum.
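A software model may help show how an accumulator and a small RAM are enough to produce the integral image as pixels arrive in raster order. The sketch below is an illustration under stated assumptions, not the slide's circuit: it assumes row-major raster order and sizes the buffer by the scan-line length (an assumption about scan orientation), and the names `stream_integral_image`, `line_buffer`, and `row_acc` are hypothetical.

```python
def stream_integral_image(pixels, width):
    """Yield integral image values for pixels arriving in raster order.

    pixels: iterable of pixel values, row by row.
    width:  number of pixels per scan line.
    """
    line_buffer = [0] * width  # integral values of the previous line (the RAM)
    row_acc = 0                # running sum of the current line (the accumulator)
    for n, p in enumerate(pixels):
        col = n % width
        if col == 0:
            row_acc = 0        # start of a new scan line
        row_acc += p
        # Current value = value directly above + running sum of this line.
        value = line_buffer[col] + row_acc
        line_buffer[col] = value
        yield value
```

Each output depends only on the incoming pixel, the accumulator, and one buffer entry, so one result is produced per pixel with storage for a single line rather than the whole image. The model performs two additions per pixel for clarity; how the hardware schedules them is not shown here.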
6 Dynamic Memory Storage. Based on Current Position (i,j), Need to Determine the Number of Bits Needed to Store the Current Sum. Tricks: Images 256 < M,N < 1024, so i and j Each Require 10 Bits; Slight Overestimate; 10-Bit Address Lookup; Dual Port Memory (1024 Entries x ~4 Bits).
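One plausible reading of the lookup trick is sketched below. The slide does not state what the 1024-entry table holds, so this is an assumption: each entry stores the bit length of its 10-bit address (a value from 0 to 10, which fits in 4 bits), and the two ports are read with i and j at the same time. The estimate 8 + lut[i] + lut[j] follows from the sum at (i, j) being at most 255 * i * j, where i and j are taken here as 1-based row and column counts. All names are hypothetical.

```python
PIXEL_BITS = 8   # 8-bit grayscale pixels
ADDR_BITS = 10   # i and j each fit in 10 bits

# Assumed table contents: bit length of each 10-bit address (0..10).
BIT_LENGTH_LUT = [n.bit_length() for n in range(1 << ADDR_BITS)]


def estimated_sum_bits(i, j):
    """Upper bound on the bits needed for the sum covering i rows, j columns."""
    return PIXEL_BITS + BIT_LENGTH_LUT[i] + BIT_LENGTH_LUT[j]


def exact_sum_bits(i, j):
    """Exact bits needed for the worst-case sum 255 * i * j."""
    max_pixel = (1 << PIXEL_BITS) - 1
    return (max_pixel * i * j).bit_length()
```

Under this reading the estimate never falls below the exact requirement and exceeds it by at most 2 bits, which matches the slide's note of a slight overestimate while replacing a multiplication and logarithm with two table reads and an addition.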
7 Integral Image Architecture
8 System Configuration
9 Current System Setup. Xilinx Tools: Xilinx ISE, XPS, SDK. ML410 System Development Board: Virtex4-FX60 Device; 2 Embedded PPC Cores; Slices: 25,280; DSP48s: 128; BlockRams: 232.
10 Integral Image System Status. Items: ML410 Base System With Ethernet; Host Ethernet Application; Integral Image Hardware (w/ Support Logic); Integral Image Hardware w/ PLBstreamer; Map Blob/Filtering Application to FPGA. Legend: Complete; To Be Done.
11 Ethernet Application
12 Integral Image Simulation Results: 1 Image = 5 ms; Realtime = 33 ms.
13 Hardware Implementation. Integral Image Hardware, Device Utilization Summary (Columns: Logic Utilization, Used, Available, Utilization): Slices: < 1%; Slice Flip-Flops: < 1%; LUTs: < 1%. Integral Image Streamer Hardware, Device Utilization Summary (Columns: Logic Utilization, Used, Available, Utilization): Slices: %; Slice Flip-Flops: %; LUTs: %.
14 Conclusions: Real-Time Streaming Integral Imaging Hardware; Minimal Resources, Application Specific Memory Utilization. To Do: Map Application to FPGA.
15 Thank You! Questions?