Scale Invariant Feature Transform (SIFT) JOJO 2011.7.14
Outline: (1) Background; (2) Universal SIFT (2D), covering Keypoints Detection, Descriptor Construction and Match Method; (3) N-SIFT (3D); (4) Conclusion
Background: SIFT was proposed by David G. Lowe in 2004. It is widely applied in 2D image matching and recognition. It was extended to N-dimensional SIFT (N-SIFT) by Warren Cheung in 2007 and applied to medical image processing.
Universal SIFT (2D): invariant to scale, rotation and illumination, and adaptable to complex geometric transforms. Add an intermediate variable.
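2D-SIFT (Keypoints Detection) Scale space of an image: $L(x, y, \sigma) = G(x, y, \sigma) * I(x, y)$, where $I$ is the image, $G$ is a Gaussian kernel and $*$ denotes convolution. $\sigma$ varies to produce the scale space. Keypoints are detected in multi-scale space.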
2D-SIFT (Keypoints Detection) Scale Space of An Image
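2D-SIFT (Keypoints Detection) Lindeberg found that the scale-normalized LoG (Laplacian of Gaussian) scale space is invariant to scale. DoG (Difference of Gaussians): $D(x, y, \sigma) = (G(x, y, k\sigma) - G(x, y, \sigma)) * I(x, y) = L(x, y, k\sigma) - L(x, y, \sigma)$, where $L = G * I$ is the Gaussian-blurred image. The DoG is related to the normalized LoG by $G(x, y, k\sigma) - G(x, y, \sigma) \approx (k - 1)\sigma^2 \nabla^2 G$. Keypoints are detected in DoG space.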
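The scale-space and DoG construction can be sketched in plain Python as follows. This is a minimal illustration, not the presenter's implementation: the function names, the starting scale `sigma0=1.6`, the factor `k=sqrt(2)`, the 3-sigma kernel truncation and the clamped border handling are all assumptions chosen for the example.

```python
import math

def gaussian_kernel(sigma):
    """1D Gaussian kernel, truncated at 3*sigma (assumption) and normalized to sum to 1."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the border."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [
        [sum(kernel[k + radius] * row[min(max(x + k, 0), width - 1)]
             for k in range(-radius, radius + 1))
         for x in range(width)]
        for row in image
    ]

def transpose(image):
    return [list(col) for col in zip(*image)]

def gaussian_blur(image, sigma):
    """Separable 2D Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))

def dog_stack(image, sigma0=1.6, k=2 ** 0.5, levels=4):
    """DoG layers D_i = L(k^(i+1) * sigma0) - L(k^i * sigma0) for one octave."""
    blurred = [gaussian_blur(image, sigma0 * k ** i) for i in range(levels)]
    return [
        [[b - a for a, b in zip(row_lo, row_hi)] for row_lo, row_hi in zip(lo, hi)]
        for lo, hi in zip(blurred, blurred[1:])
    ]
```

With `levels` blurred images the octave yields `levels - 1` DoG layers; a constant image gives an all-zero DoG stack because each kernel sums to 1.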
2D-SIFT (Keypoints Detection) DoG Scale Space
2D-SIFT (Keypoints Detection) Build 3D scale space: build the first octave, then downsample to obtain the next octave.
2D-SIFT (Keypoints Detection) Local extrema detection: compare each point with its 26 adjoining points in the 3D DoG space $D(x, y, \sigma)$. If the point is a local extremum, it is a candidate keypoint.
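The 26-neighbor comparison can be sketched like this. The layout `dog[scale][row][col]`, the function names and the use of strict comparisons (ties are not counted as extrema) are assumptions made for the example.

```python
def is_local_extremum(dog, s, y, x):
    """True if dog[s][y][x] is strictly above or strictly below all 26 neighbors."""
    center = dog[s][y][x]
    neighbors = [
        dog[s + ds][y + dy][x + dx]
        for ds in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (ds, dy, dx) != (0, 0, 0)
    ]
    return center > max(neighbors) or center < min(neighbors)

def candidate_keypoints(dog):
    """Scan the interior samples of a DoG stack indexed as dog[scale][row][col]."""
    found = []
    for s in range(1, len(dog) - 1):
        for y in range(1, len(dog[s]) - 1):
            for x in range(1, len(dog[s][y]) - 1):
                if is_local_extremum(dog, s, y, x):
                    found.append((s, y, x))
    return found
```

The first and last DoG layers are skipped because they have no neighbor layer on one side, so at least three layers are needed to find any candidate.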
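2D-SIFT (Keypoints Detection) Accurate keypoint localization: fit a 3D quadratic to $D$ around the sample point $\mathbf{x}$ and compute the offset of its extremum, $\hat{\mathbf{x}} = -\left(\frac{\partial^2 D}{\partial \mathbf{x}^2}\right)^{-1} \frac{\partial D}{\partial \mathbf{x}}$. Is $\hat{\mathbf{x}}$ larger than 0.5 in any dimension? Yes: change the sample point $\mathbf{x}$ and repeat. No: output the refined location $\mathbf{x} + \hat{\mathbf{x}}$.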
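One way to compute the sub-sample offset is to estimate the gradient and Hessian of the DoG by finite differences and solve the resulting 3×3 system. The sketch below assumes a `dog[scale][row][col]` layout; `refine_offset`, `solve3` and `det3` are hypothetical helper names, and the re-sampling loop for offsets above 0.5 is left out.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(m, rhs):
    """Solve m @ v = rhs by Cramer's rule; raises ZeroDivisionError if m is singular."""
    det = det3(m)
    solution = []
    for col in range(3):
        replaced = [[rhs[r] if c == col else m[r][c] for c in range(3)] for r in range(3)]
        solution.append(det3(replaced) / det)
    return solution

def refine_offset(dog, s, y, x):
    """Offset [ds, dy, dx] of the fitted extremum relative to sample (s, y, x)."""
    def d(a, b, c):
        return dog[s + a][y + b][x + c]

    center = d(0, 0, 0)
    # Gradient by central differences, ordered (scale, row, col).
    grad = [(d(1, 0, 0) - d(-1, 0, 0)) / 2,
            (d(0, 1, 0) - d(0, -1, 0)) / 2,
            (d(0, 0, 1) - d(0, 0, -1)) / 2]
    # Second derivatives by finite differences.
    hss = d(1, 0, 0) - 2 * center + d(-1, 0, 0)
    hyy = d(0, 1, 0) - 2 * center + d(0, -1, 0)
    hxx = d(0, 0, 1) - 2 * center + d(0, 0, -1)
    hsy = (d(1, 1, 0) - d(1, -1, 0) - d(-1, 1, 0) + d(-1, -1, 0)) / 4
    hsx = (d(1, 0, 1) - d(1, 0, -1) - d(-1, 0, 1) + d(-1, 0, -1)) / 4
    hyx = (d(0, 1, 1) - d(0, 1, -1) - d(0, -1, 1) + d(0, -1, -1)) / 4
    hessian = [[hss, hsy, hsx], [hsy, hyy, hyx], [hsx, hyx, hxx]]
    # Offset = -H^{-1} * gradient.
    return solve3(hessian, [-g for g in grad])
```

A caller would check whether any component of the returned offset exceeds 0.5 in absolute value and, if so, move to the neighboring sample and call `refine_offset` again.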
2D-SIFT (Keypoints Detection) Eliminate unstable keypoints: (1) keypoints with low contrast, $|D(\mathbf{x})| < 0.03$; (2) edge keypoints, identified using the determinant (and trace) of the 2×2 Hessian of $D$.
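The two rejection tests can be sketched as below. The contrast threshold 0.03 comes from the slide and presumes pixel values scaled to [0, 1]. The edge test follows the trace/determinant ratio from Lowe's paper with `r = 10`; that criterion and its value are not stated on the slide, so treat them as assumptions. Function names are hypothetical.

```python
def passes_contrast(value, threshold=0.03):
    """Keep a keypoint only if |D| at the keypoint reaches the contrast threshold."""
    return abs(value) >= threshold

def passes_edge_test(dog_layer, y, x, r=10.0):
    """Keep a keypoint only if Tr(H)^2 / Det(H) < (r + 1)^2 / r for the 2x2 Hessian H."""
    center = dog_layer[y][x]
    dxx = dog_layer[y][x + 1] - 2 * center + dog_layer[y][x - 1]
    dyy = dog_layer[y + 1][x] - 2 * center + dog_layer[y - 1][x]
    dxy = (dog_layer[y + 1][x + 1] - dog_layer[y + 1][x - 1]
           - dog_layer[y - 1][x + 1] + dog_layer[y - 1][x - 1]) / 4
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        # Curvatures of opposite sign or a flat direction: not a stable blob.
        return False
    return trace * trace / det < (r + 1) ** 2 / r
```

An isolated blob has similar curvature in both directions and passes, while a straight ridge has near-zero curvature along the edge, so its determinant is close to zero and it is rejected.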
Example (original image, 233×189 pixels): extrema in DoG space: 832. After discarding low-contrast points: 832 -> 729. After discarding edge points: 729 -> 536.
2D-SIFT (Descriptor Construction) (1) Achieve invariance to rotation: determine the main direction of a keypoint, then apply main direction correction.
2D-SIFT (Descriptor Construction) Main direction of a keypoint: divide the circular neighborhood of the keypoint into several regions and calculate the sum of gradient magnitudes in each region. Parabolic interpolation then localizes the main direction accurately.
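A minimal sketch of the main-direction estimate, treating the regions as angular bins of a gradient-orientation histogram. The 36-bin layout is taken from Lowe's paper rather than from the slide, the Gaussian weighting of samples is omitted, and the function names are hypothetical.

```python
def orientation_histogram(samples, bins=36):
    """Accumulate gradient magnitudes into angular bins.

    samples: iterable of (magnitude, angle_in_degrees) pairs from the neighborhood.
    """
    hist = [0.0] * bins
    width = 360.0 / bins
    for magnitude, angle in samples:
        hist[int((angle % 360.0) / width) % bins] += magnitude
    return hist

def main_direction(hist):
    """Peak of the histogram in degrees, refined by fitting a parabola to 3 bins."""
    bins = len(hist)
    width = 360.0 / bins
    peak = max(range(bins), key=lambda i: hist[i])
    left, mid, right = hist[(peak - 1) % bins], hist[peak], hist[(peak + 1) % bins]
    denom = left - 2 * mid + right
    # Vertex of the parabola through the three bin values, in units of bins.
    shift = 0.0 if denom == 0 else 0.5 * (left - right) / denom
    return ((peak + 0.5 + shift) * width) % 360.0
```

When the two bins next to the peak are equal, the estimate stays at the center of the peak bin; otherwise it shifts toward the heavier neighbor.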
2D-SIFT (Descriptor Construction) Main direction correction: the neighborhood is rotated to align with the main direction.
2D-SIFT (Descriptor Construction) (2) Construct descriptor: divide the 16×16 neighborhood, aligned with the main direction, into 4×4 blocks of 4×4 pixels each.
2D-SIFT (Descriptor Construction) (2) Construct descriptor: sum the gradient magnitude in 8 directions for each block, giving a vector of 8×4×4 = 128 elements, and normalize the vector. (Figure label: main direction.)
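The block-histogram construction can be sketched as follows. The inputs are assumed to be 16×16 grids of gradient magnitudes and of gradient angles already measured relative to the main direction. Gaussian weighting and interpolation between bins, which practical implementations often add, are omitted, and `build_descriptor` is a hypothetical name.

```python
import math

def build_descriptor(magnitudes, angles):
    """Build a 128-element descriptor from 16x16 gradient data.

    magnitudes, angles: 16x16 nested lists; angles are in degrees,
    measured relative to the keypoint's main direction.
    """
    vec = [0.0] * 128
    for y in range(16):
        for x in range(16):
            block = (y // 4) * 4 + (x // 4)                 # which of the 4x4 blocks
            direction = int((angles[y][x] % 360.0) / 45.0) % 8  # which of 8 directions
            vec[block * 8 + direction] += magnitudes[y][x]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec
```

Each block contributes 8 consecutive entries, so entries `8*b` to `8*b + 7` hold the direction histogram of block `b`.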
2D-SIFT pipeline: Input image. Keypoints detection: detect extrema in DoG space; accurate keypoint localization; eliminate low-contrast points and edge points. Descriptor construction: determine main direction; main direction correction; get descriptor. Output: feature vectors.
2D-SIFT (Match Method) Image1 and Image2 each go through SIFT extraction to produce a feature set. The two feature sets are matched into a set of corresponding points using the NNDR (Nearest Neighbor Distance Ratio) principle.
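NNDR matching accepts a match only when the nearest neighbor is clearly closer than the second nearest. The sketch below uses brute-force Euclidean distances; the ratio threshold 0.8 is the value suggested in Lowe's paper, not one given on the slide, and `nndr_matches` is a hypothetical name.

```python
import math

def nndr_matches(features_a, features_b, ratio=0.8):
    """Return (index_in_a, index_in_b) pairs that pass the distance-ratio test."""
    matches = []
    for i, fa in enumerate(features_a):
        # Distances from this feature to every feature of the other image, ascending.
        dists = sorted((math.dist(fa, fb), j) for j, fb in enumerate(features_b))
        if len(dists) < 2:
            continue  # the ratio needs a second-nearest neighbor
        (d1, j1), (d2, _) = dists[0], dists[1]
        if d1 < ratio * d2:
            matches.append((i, j1))
    return matches
```

A feature that is equally close to two candidates is rejected as ambiguous, which is the point of the ratio test.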
N-SIFT (3D)
N-SIFT (3D) Descriptor: the 16×16×16 neighborhood is divided into 4×4×4 blocks. Histogram per block: 8×8 bins. Vector length: 8×8×4×4×4 = 4,096.
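The descriptor length follows directly from the block and histogram sizes: 8 × 8 × 4 × 4 × 4 evaluates to 4,096. The small helper below generalizes the count, assuming 4 blocks per axis and 8 bins per orientation angle with n - 1 angles in n dimensions. That generalization matches the 2D and 3D figures on the slides but is otherwise an assumption, and `descriptor_length` is a hypothetical name.

```python
def descriptor_length(n_dims, blocks_per_axis=4, bins_per_angle=8):
    """Number of descriptor elements for an n-dimensional SIFT-style descriptor."""
    blocks = blocks_per_axis ** n_dims               # 4*4 in 2D, 4*4*4 in 3D
    histogram_bins = bins_per_angle ** (n_dims - 1)  # 8 in 2D, 8*8 in 3D
    return blocks * histogram_bins

print(descriptor_length(2))  # 2D SIFT
print(descriptor_length(3))  # 3D N-SIFT
```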
N-SIFT (3D) Differences from 2D SIFT: no accurate keypoint localization; no main direction correction.
Conclusion: Invariant to scale: keypoints are detected in DoG scale space. Invariant to rotation: main direction correction. Invariant to illumination: linear changes are handled by normalizing the feature vector; non-linear changes are handled by clipping vector elements above 0.2 to 0.2. Adaptation to complex geometric transforms: blocked histograms.
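The illumination handling can be sketched as a normalize, clip, renormalize sequence. The final renormalization follows Lowe's paper and is not stated on the slide, so treat it as an assumption; the function names are hypothetical.

```python
import math

def normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def illumination_normalize(vec, clip=0.2):
    """Normalize, clip large elements to `clip`, then normalize again."""
    clipped = [min(v, clip) for v in normalize(vec)]
    return normalize(clipped)
```

Multiplying every gradient magnitude by a constant (a linear contrast change) leaves the result unchanged, and clipping limits how much a few very large gradients can dominate the vector.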
Thank you!