Published by Scarlett Grant. Modified over 9 years ago.
1
ECE 562 Computer Architecture and Design
Project: Improving Feature Extraction Using SIFT on GPU
Rodrigo Savage, Wo-Tak Wu
2
Overview
- Application: object tracking in real time
- Challenges: static scene, moving objects, occlusion, collision, disappearance, rotation, scaling
- Divide and conquer: feature extraction and tracking
- Focus: feature extraction, using SIFT
- Goal: improve an existing implementation on the GPU
3
Scale Invariant Feature Transform (SIFT)
- Input: image
- Output: keypoints
4
GPU Implementation
- Selected the GPU implementation by Sinha et al. at UNC Chapel Hill
- Open-source SiftGPU available (latest V4.00, Sept. 2012)
- SIFT is well suited to a GPU implementation
  - Tens of thousands of threads handle subsets of the data without communicating with each other
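The last point above, many threads each handling their own subset of the data, can be sketched on the CPU. The following is a minimal illustration in plain Python of CUDA-style index arithmetic, not code from SiftGPU. The function names and the 16x16 block size are assumptions chosen for the example.

```python
def launch_grid(width, height, block_w=16, block_h=16):
    """Blocks needed along x and y to cover a width x height image (ceiling division)."""
    return ((width + block_w - 1) // block_w,
            (height + block_h - 1) // block_h)


def thread_pixel(block_idx, thread_idx, block_dim):
    """Pixel owned by one thread: block index * block size + thread index."""
    bx, by = block_idx
    tx, ty = thread_idx
    bw, bh = block_dim
    return (bx * bw + tx, by * bh + ty)


def covered_pixels(width, height, block_w=16, block_h=16):
    """List the pixel each in-bounds thread would process.

    Every thread derives its pixel from its own indices alone, so no
    thread needs to communicate with any other (hypothetical helper).
    """
    gx, gy = launch_grid(width, height, block_w, block_h)
    pixels = []
    for by in range(gy):
        for bx in range(gx):
            for ty in range(block_h):
                for tx in range(block_w):
                    x, y = thread_pixel((bx, by), (tx, ty), (block_w, block_h))
                    # Threads that fall past the image edge do no work.
                    if x < width and y < height:
                        pixels.append((x, y))
    return pixels
```

For a 640x480 image with 16x16 blocks, `launch_grid(640, 480)` gives a 40x30 grid, that is, one thread per pixel. When the image size is not a multiple of the block size, the bounds check keeps the extra threads idle.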
5
Attempts to Speed Up
- Tackled the 2 most time-consuming processing steps
  - Blurring images with a Gaussian low-pass filter
    - Changed the pixel data access pattern
    - Used different data partitioning schemes
  - Keypoint descriptor (128-element vector) calculations
    - Optimized code in the kernel
      - Used the usual optimization techniques
      - Changed GPU memory usage
    - Thread management
      - Experimented with kernel parameters
      - Maximized usage of available threads
- Result: reduced descriptor compute time from 73 ms to 22 ms (about 70% less, roughly a 3.3x speedup)
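For readers unfamiliar with the first of these steps, the Gaussian low-pass filter is commonly computed as two 1D passes (rows, then columns) rather than one 2D pass. The following is a minimal CPU reference sketch in plain Python, not the SiftGPU kernel. The function names, the 3-sigma kernel radius, and the clamp-to-edge border handling are assumptions made for the example.

```python
import math


def gaussian_kernel(sigma, radius=None):
    """Normalized 1D Gaussian weights; the radius defaults to 3 sigma (an assumption)."""
    if radius is None:
        radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve_rows(image, kernel):
    """Filter every row with the 1D kernel, clamping reads at the image edge."""
    radius = len(kernel) // 2
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        row = image[y]
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                sx = min(max(x + k - radius, 0), width - 1)  # clamp to edge
                acc += w * row[sx]
            out[y][x] = acc
    return out


def transpose(image):
    """Swap rows and columns so the row filter can be reused for columns."""
    return [list(col) for col in zip(*image)]


def gaussian_blur(image, sigma):
    """Separable blur: horizontal pass, then vertical pass."""
    kernel = gaussian_kernel(sigma)
    horizontal = convolve_rows(image, kernel)
    return transpose(convolve_rows(transpose(horizontal), kernel))
```

Each output pixel depends only on a small neighborhood of input pixels, which is why the access pattern and the way the image is partitioned among threads matter so much for performance on a GPU.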
6
Conclusion
- The existing implementation is already pretty good
- Hard to take full advantage of the architecture; need a good understanding of
  - Memory architecture
  - Thread usage
- The CUDA C/C++ compiler (nvcc) optimizes code in different ways; need to experiment to gain performance
- Hard to debug code running on the GPU
- Visual Profiler can provide valuable insights into code behavior
7
Backup Slides
8
References
- SiftGPU, available at http://cs.unc.edu/~ccwu/siftgpu/
- D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, November 2004.
- Sudipta N. Sinha et al., “GPU-based Video Feature Tracking And Matching,” Technical Report TR 06-012, Department of Computer Science, UNC Chapel Hill, May 2006.
NVIDIA GeForce GT 640M LE
- CUDA cores: 384
- Total available graphics memory: 4095 MB
9
Test image with keypoints
12
Algorithm