Real-time Object Image Tracking Based on Block-Matching Algorithm
ECE 734
Hsiang-Kuo Tang, Tai-Hsuan Wu, Ying-Tien Lin
Outline
- Introduction
- Motion Tracking Theories
  - Differential Motion Analysis (DMA) Method
  - Block-Matching Algorithm (BMA)
- Implementation Issues
  - Methodology & Optimizations
  - Different Approaches: C++, PLX, ET44M210
- Demonstration
Introduction
Motivation: Motion tracking has many commercial applications
- Robotic vision
- Electronic pets
- Traffic monitoring
- More...
Objective: Efficient implementation on a portable embedded system
- Simple but powerful algorithms
- Smart optimizations by developers
Object-tracking algorithm
Differential motion analysis (DMA) method
- Compute the SAD (sum of absolute differences) of consecutive frames
- Set a threshold on the SAD to detect motion
(Figure callout: "The moving object is here!")
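The DMA step can be sketched in C++ as follows. This is a minimal illustration, not the authors' code: the `Frame` and `Box` types, the block-wise granularity, and the function names are assumptions introduced here. It computes the SAD of co-located blocks in two consecutive luma frames and returns the bounding box of the blocks whose SAD exceeds the threshold.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Luma (Y) frame stored row by row; a hypothetical container for this sketch.
struct Frame {
    int width;
    int height;
    std::vector<std::uint8_t> y;
};

// Bounding box of the detected motion; x1 and y1 are exclusive.
struct Box {
    int x0, y0, x1, y1;
    bool found;
};

// SAD between co-located blocks of two frames, clipped at the frame border.
std::uint32_t blockSad(const Frame& a, const Frame& b, int bx, int by, int bs) {
    std::uint32_t sad = 0;
    for (int r = by; r < std::min(by + bs, a.height); ++r) {
        for (int c = bx; c < std::min(bx + bs, a.width); ++c) {
            int d = int(a.y[r * a.width + c]) - int(b.y[r * b.width + c]);
            sad += static_cast<std::uint32_t>(std::abs(d));
        }
    }
    return sad;
}

// Mark every block whose SAD exceeds the threshold and return the
// bounding box that encloses all marked blocks.
Box movingArea(const Frame& prev, const Frame& curr, int bs,
               std::uint32_t threshold) {
    assert(prev.width == curr.width && prev.height == curr.height);
    Box box{curr.width, curr.height, 0, 0, false};
    for (int by = 0; by < curr.height; by += bs) {
        for (int bx = 0; bx < curr.width; bx += bs) {
            if (blockSad(prev, curr, bx, by, bs) > threshold) {
                box.x0 = std::min(box.x0, bx);
                box.y0 = std::min(box.y0, by);
                box.x1 = std::max(box.x1, std::min(bx + bs, curr.width));
                box.y1 = std::max(box.y1, std::min(by + bs, curr.height));
                box.found = true;
            }
        }
    }
    return box;
}
```

The returned box is aligned to the block grid, so it is usually larger than the object itself.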
Object-tracking algorithm
Disadvantages of the DMA method
- The detected area may include background that the object has just covered or uncovered
- The size of the tracking area is not the same as the size of the tracked object!
Object-tracking algorithm
Solution: Block-Matching Algorithm (BMA)
- Use motion vectors to compensate for the redundant part of the tracking area
- Use motion estimation to adjust the size of the tracking area
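A minimal full-search block-matching sketch in C++ is shown here for reference. The slides do not give the search parameters, so the function name, the `MotionVector` struct, and the convention that the vector points from the block's position in the previous frame to its best match in the current frame are all assumptions of this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <vector>

// Displacement of a block between two frames plus the SAD of the best match.
struct MotionVector {
    int dx;
    int dy;
    std::uint32_t sad;
};

// Full search: take the bs x bs block at (bx, by) in the previous frame and
// test every displacement within +/- range in the current frame, keeping the
// one with the smallest SAD. Candidates outside the frame are skipped.
MotionVector fullSearch(const std::vector<std::uint8_t>& prev,
                        const std::vector<std::uint8_t>& curr,
                        int width, int height,
                        int bx, int by, int bs, int range) {
    assert(prev.size() == curr.size());
    assert(static_cast<int>(curr.size()) == width * height);
    assert(bx >= 0 && by >= 0 && bx + bs <= width && by + bs <= height);

    MotionVector best{0, 0, std::numeric_limits<std::uint32_t>::max()};
    for (int dy = -range; dy <= range; ++dy) {
        for (int dx = -range; dx <= range; ++dx) {
            const int cx = bx + dx;
            const int cy = by + dy;
            if (cx < 0 || cy < 0 || cx + bs > width || cy + bs > height) {
                continue;
            }
            std::uint32_t sad = 0;
            for (int r = 0; r < bs; ++r) {
                for (int c = 0; c < bs; ++c) {
                    int d = int(prev[(by + r) * width + bx + c]) -
                            int(curr[(cy + r) * width + cx + c]);
                    sad += static_cast<std::uint32_t>(std::abs(d));
                }
            }
            if (sad < best.sad) {
                best = {dx, dy, sad};
            }
        }
    }
    return best;
}
```

Shifting the tracking area by the returned `(dx, dy)` is one way to follow the object from frame to frame. The cost grows with the square of `range`, which is why a sub-sampled search is attractive on a small microcontroller.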
Implementation Methodology
Implementation Methodology & Optimization
1. Capture images from the I/O device and convert RGB to YUV values
   Optimization: pre-compute YUV values and save them in ROM
2. Compute SAD values between adjacent frames
   Optimization: parallel processing as much as possible
3. Compute motion estimation and compensate the tracking area
   Optimization: replace full search with 41SWS/BPD (FS-like sub-sampling)
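The pre-computation idea in step 1 can be illustrated with lookup tables for the luma (Y) component. This is a sketch under stated assumptions: the slides do not say which conversion coefficients were used, so the integer BT.601 approximation Y = (77 R + 150 G + 29 B) / 256 and the names `LumaTables`, `buildLumaTables`, and `toLuma` are choices made for this example. On a microcontroller the tables would be constant data placed in ROM; here they are built at run time.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical lookup tables holding the pre-multiplied luma terms.
struct LumaTables {
    std::array<std::uint16_t, 256> r;
    std::array<std::uint16_t, 256> g;
    std::array<std::uint16_t, 256> b;
};

// Fill the tables once; the weights 77, 150, 29 sum to 256 (assumed BT.601
// approximation, not taken from the slides).
LumaTables buildLumaTables() {
    LumaTables t{};
    for (int v = 0; v < 256; ++v) {
        t.r[v] = static_cast<std::uint16_t>(77 * v);
        t.g[v] = static_cast<std::uint16_t>(150 * v);
        t.b[v] = static_cast<std::uint16_t>(29 * v);
    }
    return t;
}

// Per-pixel conversion: three table reads, two adds, and one shift,
// with no multiplications at run time.
std::uint8_t toLuma(const LumaTables& t, std::uint8_t r, std::uint8_t g,
                    std::uint8_t b) {
    std::uint32_t sum = std::uint32_t(t.r[r]) + t.g[g] + t.b[b];
    assert(sum <= 255u * 256u);  // weights sum to 256, so Y fits in 8 bits
    return static_cast<std::uint8_t>(sum >> 8);
}
```

Only Y is computed here because the SAD step works on brightness; U and V tables could be built the same way if they were needed.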
Implementation approaches
- Simulation in a C++ program: evaluate the whole algorithm
- Simulation in PLX: implement some optimizations
- Realization on the ET44M210 microcontroller: find the performance bottlenecks
Implementation approaches - PLX
Optimizations: Parallel Processing
Absolute value calculation: 4 ops per register
    abs8 macro Rd, Rs1, Rs2          // used in SAD, MAD operations
        psub.1.u Rtmp1, Rs1, Rs2
        psub.1.u Rtmp2, Rs2, Rs1
        padd.1.u Rd, Rtmp1, Rtmp2
    endm
Load & store operation alignment: 4 ops per register
    mix.4.r Rtmp6, RGB2, RGB1        // fit 4 RGB values in 1 register
    mix.4.r Rtmp7, RGB4, RGB3
    store.8 Rtmp6, PLCD, 0           // plot them on the LCD screen
    store.8 Rtmp7, PLCD, 8
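The abs8 macro can be emulated in portable C++ to show why two subtractions and one addition yield an absolute difference. The sketch assumes that the `.u` suffix means unsigned saturating arithmetic on 1-byte subwords and that a register holds four bytes. Both are inferred from the slide rather than stated on it, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Emulates a packed unsigned saturating subtract on four 8-bit lanes:
// each lane computes max(a - b, 0).
std::uint32_t psubSatU8(std::uint32_t a, std::uint32_t b) {
    std::uint32_t out = 0;
    for (int lane = 0; lane < 4; ++lane) {
        std::uint32_t x = (a >> (8 * lane)) & 0xFFu;
        std::uint32_t y = (b >> (8 * lane)) & 0xFFu;
        std::uint32_t d = x > y ? x - y : 0u;
        out |= d << (8 * lane);
    }
    return out;
}

// Per-lane |a - b|: in every lane one of the two saturated differences is
// zero, so adding them gives the absolute difference.
std::uint32_t abs8(std::uint32_t a, std::uint32_t b) {
    std::uint32_t d1 = psubSatU8(a, b);
    std::uint32_t d2 = psubSatU8(b, a);
    assert((d1 & d2) == 0u);  // no lane is non-zero in both, so no carry
    return d1 + d2;
}
```

For example, `abs8(0x10FF0005, 0x2000FF05)` returns `0x10FFFF00`: the four byte pairs (0x10, 0x20), (0xFF, 0x00), (0x00, 0xFF), and (0x05, 0x05) give 0x10, 0xFF, 0xFF, and 0x00.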
Implementation approaches - PLX
Results: DMA/BMA between two frames
Implementation Approaches - ET44M210
Evaluation of ET44M210
Table (Type vs. Inst / frame): Grab image, Convert Y, Calculate SAD, Find MVs, Summation; frames/sec: 11.04
When running at full speed (48 MHz), 11 frames per second can be achieved. However, because of the USB module, the ET44M210 can only run at 24 MHz. Many instructions are also spent handling USB transmission, so the average performance drops to 0.9 frames per second.
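A rough reading of these figures, assuming the frame rate scales linearly with the clock and that 11.04 frames/sec corresponds to 48 MHz (an estimate for illustration, not a measurement from the slides):

```latex
\begin{align*}
\text{cycles per frame} &\approx \frac{48 \times 10^{6}}{11.04} \approx 4.35 \times 10^{6} \\
\text{frame rate at 24 MHz} &\approx \frac{24 \times 10^{6}}{4.35 \times 10^{6}} \approx 5.5\ \text{frames/sec} \\
\text{cycles per frame at 0.9 frames/sec} &\approx \frac{24 \times 10^{6}}{0.9} \approx 26.7 \times 10^{6}
\end{align*}
```

Under this assumption, halving the clock accounts for only part of the drop. The remaining gap, roughly 22 million cycles per frame, would be the USB handling overhead.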
Let's give a brief demonstration of motion tracking on the ET44M210 chip...
41SWS/BPD