

Slide 1: Invariant Local Feature for Image Matching
Presented by Wyman, advised by Michael R. Lyu
11th December, 2006

Slide 2: Outline
- Introduction
- Shape-SIFT (SSIFT)
- Image Retrieval System
- Conclusion and Future Work

Slide 3: Introduction
Image matching using invariant local features can be applied to many computer vision applications:
- Panorama stitching
- Image retrieval
- Object recognition
[Figure: a query image fed into an image retrieval system, which returns near-duplicate images]

Slide 4: Contributions
1. Proposed the Shape-SIFT (SSIFT) descriptor
   - Designed to be invariant to changes in background and object color (which commonly occur in object class recognition)
2. Built an image retrieval system
   - Proposed two new components specifically for image near-duplicate detection: a new matching strategy and a new verification process
   - Fast and accurate

Slide 5: Outline
- Introduction
- Shape-SIFT (SSIFT)
- Image Retrieval System
- Conclusion and Future Work

Slide 6: Background and Object Color Invariance
- We have verified that SIFT outperforms GLOH and PCA-SIFT in terms of recall rate
- However, SIFT is not robust to changes in background and object color
- SIFT is a histogram of gradient orientations of pixels within a 16 x 16 local window
- Gradient orientations may flip in direction when the colors of a feature change
  - E.g. entries in the 0° bin move to the 180° bin, causing a large difference during matching
[Figure: two colorings (A and B) of the same common shape]

Slide 7: Background and Object Color Invariance
- Orientation flipping happens when the background and object change in color or luminance
- More precisely, orientation flipping occurs along contours where the color on either or both sides changes
- The contour of an object is an important clue for recognizing it; correctly describing features near the contour is therefore key to successful object recognition
[Figure: examples of a background color change and an object coloring change]

Slide 8: Our Feature Descriptor
Orientation Assignment
- For each sample L(x,y) in the Gaussian-smoothed image L, the orientation θ(x,y) is computed from the horizontal pixel intensity difference, L(x+1,y) - L(x-1,y), and the vertical pixel intensity difference, L(x,y+1) - L(x,y-1)
- cf. the SIFT orientation assignment: here we do not care which side has the higher intensity
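
The orientation assignment above can be sketched as follows. This is an illustrative reconstruction (the slide's formula is an image): the key point is folding the gradient angle into [0°, 180°) so that it no longer depends on which side of an edge is brighter.

```python
import math

def ssift_orientation(L, x, y):
    """Gradient orientation folded into [0, 180), as in the SSIFT
    orientation assignment described on the slide: opposite gradient
    directions map to the same angle, so flipping which side of an
    edge is brighter does not change the orientation.
    `L` is a 2-D list of grayscale intensities (illustrative sketch)."""
    dx = L[y][x + 1] - L[y][x - 1]   # horizontal intensity difference
    dy = L[y + 1][x] - L[y - 1][x]   # vertical intensity difference
    return math.degrees(math.atan2(dy, dx)) % 180.0
```

Flipping an edge's contrast (bright-to-dark vs dark-to-bright) rotates the raw gradient by 180°, which the modulo maps back to the same bin.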

Slide 9: Canonical Orientation Determination
Keypoint Descriptor
- To achieve rotation invariance, the coordinates of the descriptor are rotated according to the dominant orientation
- However, we have limited the orientation assignment to [0°, 180°], so the dominant orientation also lies within [0°, 180°]
- This causes ambiguity when rotating the descriptor coordinates: should we rotate by θ or by 180° + θ?
[Figure: the two candidate canonical orientations, θ and 180° + θ]

Slide 10: Canonical Orientation Determination
We propose two approaches to tackle this problem:
- Duplication
  - Generate two descriptors, one with canonical orientation θ and the other with 180° + θ
- By the distribution of the peak orientation
  - Determine the dominant orientation using an orientation-dependent statistic of the feature
  - Insignificant adverse effect on feature matching
  - Over 75% of features' orientations can be determined
[Figure: example weighting matrix used by the statistic]

Slide 11: SSIFT-64 and SSIFT-128
- The SSIFT-64 descriptor is computed by concatenating 8 four-bin orientation histograms over the feature region
- Problem: while SSIFT-64 is more invariant to background and object color change, it is inevitably less distinctive
- Solution:
  - Append 8 more four-bin orientation histograms to SSIFT-64 to produce SSIFT-128
  - Each of these histograms contains north, east, south and west orientation bins, covering the full 360° range of orientations
  - We adopt a matching mechanism such that the descriptor's performance under background and object color change is retained, while recall and precision improve in other cases

Slide 12: Matching Mechanism for SSIFT-128
- Find the best match, match64, using the first 64 elements of SSIFT-128 (i.e. SSIFT-64), and calculate its distance ratio dr64
- Then refine the match using the last 64 elements as well, and calculate the distance ratio dr128
- If dr64 > dr128, best match = match64; otherwise, best match = match128
- Experiments show that the overall performance of SSIFT-128 is better than that of SSIFT-64
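
The two-stage mechanism above can be sketched as below. One assumption is hedged: the slide does not define "distance ratio", so this sketch uses the usual nearest-to-second-nearest distance ratio; the keep/refine rule follows the slide verbatim.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def ratio_and_best(query, candidates):
    """Return (distance ratio, index of best candidate), where the
    ratio is nearest distance / second-nearest distance (assumed)."""
    d = sorted((dist(query, c), i) for i, c in enumerate(candidates))
    return d[0][0] / d[1][0], d[0][1]

def match_ssift128(query, candidates):
    """Two-stage SSIFT-128 matching sketch: match on the first 64
    dimensions (SSIFT-64 part), refine with all 128, and apply the
    slide's rule: keep match64 when dr64 > dr128, else match128."""
    dr64, m64 = ratio_and_best(query[:64], [c[:64] for c in candidates])
    dr128, m128 = ratio_and_best(query, candidates)
    return m64 if dr64 > dr128 else m128
```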

Slide 13: Performance Evaluation
[Figure: performance evaluation results]

Slide 14: Performance Evaluation
[Figure: performance evaluation results]

Slide 15: Performance Evaluation
[Figures: results under viewpoint change, rotation and scale change, and illumination change]

Slide 16: Outline
- Introduction
- Shape-SIFT (SSIFT)
- Image Retrieval System
- Conclusion and Future Work

Slide 17: Image Retrieval System
Designed for image near-duplicate (IND) detection:
- To detect illegal copies of copyrighted images on the Web
- To remove meaningless duplicates from the results returned by a content-based image retrieval system (CBIRS)
- To evaluate the similarity of pages on the Web
Integrates a number of state-of-the-art techniques:
- Detector: DoG pyramid from SIFT
- Descriptor: SIFT descriptor
- Matching: Exact Euclidean Locality Sensitive Hashing (E2LSH)
Proposes two new steps to improve recall and precision:
- A new matching strategy: the K-NNRatio matching strategy
- A new verification process: the orientation verification process

Slide 18: Two Main Phases
Database Construction:
- Image representation
- Index construction
- Keypoint and image lookup tables
Database Query:
- Matching strategies
- Verification processes
- Image voting
[Figure: keypoint lookup table and image lookup table]

Slide 19: Database Construction - Image Representation
1. Extract SIFT features
- SIFT features are invariant to common image manipulations, including changes in: illumination, contrast, coloring, saturation, resizing, cropping, framing, affine warping, and JPEG compression
- Problem: the huge number of SIFT features in the database makes search slow
[Figure: database images mapped to image representations via feature extraction]

Slide 20: Database Construction - Index Construction
2. Build LSH index
- Build an LSH index for fast feature retrieval
- The hashing scheme is locality-sensitive: the probability that two points share the same hash value decreases with the distance between them
- The E2LSH package allows us to find the K nearest neighbors (K-NN) of a query point with a certain probability
[Figure: SIFT features extracted from database images and inserted into the LSH index]
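
The locality-sensitive property can be illustrated with one hash function of the kind E2LSH is built from (a p-stable / Gaussian projection hash for Euclidean distance). This is a minimal sketch, not the E2LSH API: the real package concatenates many such functions into multiple hash tables.

```python
import random, math

def make_e2lsh_hash(dim, w=4.0, seed=0):
    """One p-stable LSH function for Euclidean distance, of the kind
    E2LSH uses: h(v) = floor((a . v + b) / w), with `a` a random
    Gaussian direction and `b` a random offset in [0, w). Nearby
    points are likely to land in the same bucket; distant points are
    not. (Illustrative sketch only.)"""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(dim)]  # random projection direction
    b = rng.uniform(0, w)                      # random bucket offset
    def h(v):
        return math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)
    return h
```

In practice one would index descriptors by the concatenation of several such hashes and probe the matching buckets at query time.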

Slide 21: Database Construction
E2LSH's scalability problem:
- All data points and data structures are stored in main memory
- Limitation: the database size must be smaller than the amount of free main memory
Solutions:
- Offline query applications: divide the dataset into several parts, build a hash table for one part at a time and run all queries against it, then combine the results from all parts to obtain the final results
- Online query applications: increase the swap space of the machine so that more memory is available, or load parts of the hash table dynamically

Slide 22: Database Query
1. Extract SIFT features from the query image
2. Query the database for candidate matches
Matching strategies:
- Threshold-based: set a fixed threshold on distance
- K-NN: set a fixed threshold on distance and take the top K (fixed) nearest neighbors
- K-NNRatio: assign weights to the K nearest neighbors depending on distance ratio and position; more flexible

Slide 23: Database Query - 1. Matching Strategies
K-NNRatio
- Weights are assigned based on two principles:
  - If a keypoint is near the query point, it is probably the correct match, so its weight should be high
  - If a keypoint has a large distance ratio to its next nearest neighbor, it is relatively more likely to be the correct match, so its weight should be high
- The weight of a keypoint is given by a formula (shown on the slide) in which parameters b and c control the influence of the two terms
- b = c = 0 is a special case in which the K-NNRatio matching strategy reduces to the K-NN matching strategy
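
The slide's exact weight formula is an image and does not survive in this transcript, so the sketch below is a hypothetical weighting consistent with the two stated principles, and with the stated special case that b = c = 0 reduces to plain K-NN (all weights equal).

```python
def knn_ratio_weights(distances, b=0.2, c=4.0):
    """Hypothetical K-NNRatio weighting (NOT the slide's formula,
    which is not reproduced in the transcript). For sorted neighbor
    distances, each weight decays with the neighbor's rank (term
    controlled by b) and grows with the distance ratio to the next
    neighbor (term controlled by c). With b = c = 0 every weight is
    1.0, i.e. the strategy degenerates to plain K-NN."""
    weights = []
    for i, d in enumerate(distances):
        rank = i + 1
        # ratio of the next-nearest distance to this one (>= 1 when sorted)
        if i + 1 < len(distances) and d > 0:
            ratio = distances[i + 1] / d
        else:
            ratio = 1.0
        weights.append((ratio ** c) / (rank ** b))
    return weights
```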

Slide 24: Database Query - 1. Matching Strategies
Some advantages of K-NNRatio over K-NN:
- The nearest neighbor does not always gain a high weight; e.g. its weight will not be high if it is far from the query point
- A keypoint with a larger rank (i.e. k(K_A)) can still gain a high weight if it is far from the remaining neighbors
- The K nearest neighbors do not all get the same weight; if there are only two possible matches, ideally only two keypoints will get high weights
- This relieves the adverse effect of the fixed value of K and the fixed threshold

Slide 25: Database Query - 2. Verification Processes
- Each query keypoint matches a number of keypoints in the database
- Each matched database keypoint corresponds to one image
- Group the matched keypoints by their corresponding images
- False matches can be discovered by checking the inter-relationships between keypoints in the same group; however, not all false matches can be removed
- Others have suggested affine geometric verification using RANSAC
- We further propose orientation verification
[Figure: matches grouped under Image 1, Image 2 and Image 3, with weights 0.75, 0.56, 0.80, 0.30, 0.20]

Slide 26: Database Query - Orientation Verification
For each group:
- Observation: if an image is rotated, all keypoints rotate accordingly, so the change in orientation is more or less the same for all keypoints
- For each candidate match, find the difference between the canonical orientation of the query keypoint and that of the matched database keypoint
- Map the difference to one of 36 bins covering 360°
- Slide a moving window of width 3 bins over the 36 bins and find the window containing the maximum number of supports
- Remove candidate matches in all bins not covered by that window
[Figure: simplified version with 8 bins and a moving window]

Slide 27: Database Query - Affine Geometric Verification
- The affine transformation between two images can be modeled as Ax = b, where x is the homogeneous coordinates of a keypoint in the query image, b is the homogeneous coordinates of the matched keypoint from the database, and A is the transformation matrix
- We can compute A from 3 matching pairs (x0, b0), (x1, b1) and (x2, b2)
The RANSAC step:
1. Randomly pick 3 matching pairs
2. Compute A
3. Count the number of matches whose L2 distance between Ax and b is smaller than a threshold (these agree with the transformation)
4. Loop to find the transformation with the greatest number of supports (= matches)
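
The RANSAC loop above can be sketched as follows; the 2x3 affine matrix is solved from the 3 sampled pairs by Cramer's rule, and degenerate (collinear) samples are skipped. Iteration count and inlier threshold are illustrative choices, not values from the slides.

```python
import random, math

def affine_from_3(pairs):
    """Solve the 2x3 affine matrix A mapping (x, y, 1) -> (x', y')
    from exactly 3 correspondences, via Cramer's rule. Returns None
    for a degenerate (collinear) sample."""
    (p0, q0), (p1, q1), (p2, q2) = pairs
    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
              - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
              + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    M = [[p0[0], p0[1], 1], [p1[0], p1[1], 1], [p2[0], p2[1], 1]]
    d = det3(M)
    if abs(d) < 1e-9:
        return None
    rows = []
    for k in (0, 1):  # solve separately for the x'-row and the y'-row of A
        rhs = [q0[k], q1[k], q2[k]]
        row = []
        for col in range(3):
            Mc = [list(r) for r in M]
            for i in range(3):
                Mc[i][col] = rhs[i]
            row.append(det3(Mc) / d)
        rows.append(row)
    return rows

def ransac_affine(matches, iters=100, thresh=2.0, seed=0):
    """RANSAC sketch: repeatedly fit A to 3 random pairs and keep the
    A agreeing with the most matches (|Ax - b| below `thresh`)."""
    rng = random.Random(seed)
    best_A, best_support = None, -1
    for _ in range(iters):
        A = affine_from_3(rng.sample(matches, 3))
        if A is None:
            continue
        support = 0
        for (x, y), (u, v) in matches:
            pu = A[0][0] * x + A[0][1] * y + A[0][2]
            pv = A[1][0] * x + A[1][1] * y + A[1][2]
            if math.hypot(pu - u, pv - v) < thresh:
                support += 1
        if support > best_support:
            best_A, best_support = A, support
    return best_A, best_support
```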

Slide 28: Database Query - Image Voting
- Rank the groups by their total number of supports
- Remove groups whose support is smaller than a threshold
- Return the top 10 groups (if any) to the user
[Figure: match weights summed into per-image supports, e.g. 0.75 + 0.56 + 0.80 = 2.11 for Image 1, versus 0.30 and 0.20 for the others]
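
The voting step is simple enough to sketch end to end; the threshold value here is an illustrative parameter, not one taken from the slides.

```python
from collections import defaultdict

def vote_images(matches, threshold=0.5, top=10):
    """Image-voting sketch: sum each image's match weights into a
    support score, drop images whose support is below `threshold`,
    and return up to `top` (image_id, support) pairs ranked by
    support. `matches` is an iterable of (image_id, weight)."""
    support = defaultdict(float)
    for image_id, weight in matches:
        support[image_id] += weight
    ranked = sorted(((s, i) for i, s in support.items() if s >= threshold),
                    reverse=True)
    return [(i, s) for s, i in ranked[:top]]
```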

Slide 29: Performance Evaluation
- We evaluate the performance of our system using a feature database containing 1 million keypoints from 1200 images
- The images in the database are transformed versions of the query images
[Figure: example images]

Slide 30: Performance Evaluation - Orientation Verification
Evaluation metric: a correct match is a match between a query image and any one of its transformed versions in the database
Verification schemes compared:
- Orientation verification
- Affine geometric verification
- Orientation verification + affine geometric verification
Result: +1% recall, +7% precision
[Figure/table: results per verification scheme]

Slide 31: Performance Evaluation - K-NNRatio
- Determine parameters a, b and c by trying different values
- We find that setting a = 4, b = 0.2 and c = 4 gives the best result
- Under this setting:

              # correct   # false   Recall   Precision
  K-NN          1011        301      84%       77%
  K-NNRatio     1040        177      87%       85%

- Net gain: +3% recall, +8% precision

Slide 32: Outline
- Introduction
- Shape-SIFT (SSIFT)
- Image Retrieval System
- Conclusion and Future Work

Slide 33: Conclusion and Future Work
We have introduced:
- SSIFT
- Our image retrieval system, including the orientation verification process and the K-NNRatio matching strategy (+4% recall and +15% precision in total)
Future work:
- Obtain Yan Ke's dataset to evaluate IND detection and compare our performance with theirs
- Remove duplicated images from the results of a CBIR system
- Incorporate global features

Slide 34: Short Conclusion
- We have presented a new SIFT-based feature descriptor that is also invariant to background and object color change
- Our evaluation shows that the descriptor performs similarly to SIFT on a previously proposed dataset and better on our own dataset

Slide 35: Introduction
Matching with invariant local features:
- Robust to occlusion and cluttered backgrounds (cf. global features)
Three phases:
- Detection: must be repeatable
- Description: must be distinctive and invariant
- Matching: must be accurate and fast

Slide 36: Our Feature Descriptor
Orientation Assignment
- The orientation θ(x,y) is an angle in the range 0° to 180°
- It is now invariant to background and object color: even when the colors are removed, leaving only a contour line, the orientations remain consistent (see Fig. 1)
- The 180° range of orientations is mapped to an 18-bin orientation histogram
- Each sample added to the histogram is weighted by its gradient magnitude m(x,y)
- The gradient magnitude m(x,y) is also affected by background and object color, but this can be relieved by thresholding the values (if m(x,y) > 0.2 then m(x,y) = 0.2) and renormalizing, which emphasizes the distribution of orientations
[Fig. 1: orientations along a contour remain consistent after colors are removed]
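
The histogram construction described above can be sketched as follows. The exact renormalization scheme is not spelled out on the slide; this sketch assumes normalization to unit sum.

```python
def orientation_histogram(samples, nbins=18, clamp=0.2):
    """Sketch of the slide's 18-bin orientation histogram: angles in
    [0°, 180°) are binned, each vote weighted by its gradient
    magnitude clamped at 0.2 (so strong color-dependent gradients do
    not dominate), then the histogram is renormalized to unit sum
    (assumed normalization). `samples` is (orientation°, magnitude)."""
    width = 180.0 / nbins
    hist = [0.0] * nbins
    for theta, mag in samples:
        hist[int((theta % 180.0) / width)] += min(mag, clamp)
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist
```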

Slide 37: Background and Object Color Invariance
- The contour of an object is an important clue for recognizing it; correctly describing features near the contour is therefore key to successful object recognition
- SIFT features are histograms of these orientations, and thus they are NOT invariant to changes in background and object color
[Figure: img1 vs img2 with recall-precision graph; SIFT achieves only a 10% recall rate]

Slide 38: Introduction
- Approaches based on feature detectors and descriptors dominate object class recognition
- We have verified that SIFT performs better than GLOH and PCA-SIFT in terms of recall rate†
- The success of SIFT indicates that image gradient orientations and magnitudes are distinctive and robust
- However, SIFT is not robust to changes in background and object color
† K. Mikolajczyk and C. Schmid. A Performance Evaluation of Local Descriptors. TPAMI 2005

Slide 39: Review of SIFT
Major stages in the computation of the SIFT descriptor†:
Feature detection:
1. Scale-space extrema detection
2. Keypoint localization
Feature description:
3. Orientation assignment
4. Keypoint descriptor
- Gradient magnitude and orientation are affected by background and object color
- Feature detection is also affected when the background color changes to a color similar to the foreground object
† David G. Lowe. Distinctive Image Features from Scale-Invariant Keypoints. IJCV 2004

Slide 40: Our Feature Descriptor
Feature Detection
- Our descriptor takes the output of the SIFT feature detection algorithm
- Other feature detectors, such as the Harris-Affine and Harris-Laplace detectors†, can also be used, as long as stable extrema across scale space are detected and accurately located
† K. Mikolajczyk et al. A Comparison of Affine Region Detectors. IJCV 2005

Slide 41: Canonical Orientation Determination
Objective: to achieve rotation invariance
Approach 1:
- Rotate the coordinates of the descriptor by θ and also by 180° + θ
- Describe the feature in both coordinate systems, generating two descriptors per keypoint
- Disadvantages: this doubles the number of features in the database, and the significant increase in database size lowers matching performance

Slide 42: Canonical Orientation Determination
Approach 2:
- Determine the dominant orientation using an orientation-dependent statistic of the feature
- The measure should be invariant to scale, background and object color change, viewpoint change and illumination change
- We propose to use the distribution of the most frequent orientation, similar to finding the first image moment (center of mass)
- First divide the feature region into two halves, say region A and region B (different halving and weighting schemes are possible)
- Then compute Mass_A and Mass_B as weighted sums of m_f(x,y) over each half (equations shown on the slide), where m_f(x,y) is the gradient magnitude of the most frequent orientation only

Slide 43: Distribution of Most Frequent Orientation
Approach 2 - the dominant orientation is determined by these rules:
- Rule 1: if Mass_A >> Mass_B then dominant = θ
- Rule 2: if Mass_B >> Mass_A then dominant = 180° + θ
- Rule 3: otherwise, fail to determine and fall back to Approach 1
Halving and weighting schemes - Right-Left halving:
- Regions A and B are divided vertically in the feature region
- Each feature is 16 x 16 pixels and is divided into 4 x 4 subregions
- The weight w(x,y) applied to the magnitude m_f(x,y) of the most frequent orientation in each subregion is:

  2 1 1 2
  2 1 1 2
  2 1 1 2
  2 1 1 2

- Pixels near the dividing line may easily fall onto the other side and add to that side's mass; pixels in those subregions therefore get lower weight
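
The Right-Left rule can be sketched on the 4x4 subregion grid. Two assumptions are hedged here: region A is taken to be the left half (the slide does not label which half is which), and the threshold k anticipates the "Mass_A > k * Mass_B" formulation of the rules given on a later slide.

```python
def dominant_orientation(mf, w, k=1.2):
    """Right-Left halving sketch: `mf` is the 4x4 grid of
    most-frequent-orientation magnitudes per subregion, `w` the 4x4
    weighting matrix. Region A is assumed to be the left two columns.
    Returns 'theta' (Rule 1), 'theta+180' (Rule 2), or None when
    neither mass dominates by the factor k (Rule 3: fall back to the
    two-descriptor Approach 1)."""
    mass_a = sum(mf[r][c] * w[r][c] for r in range(4) for c in range(2))
    mass_b = sum(mf[r][c] * w[r][c] for r in range(4) for c in range(2, 4))
    if mass_a > k * mass_b:
        return "theta"
    if mass_b > k * mass_a:
        return "theta+180"
    return None
```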

Slide 44: Other Halving and Weighting Schemes
Diagonal (Right-Up) halving:
- Regions A and B are divided diagonally in the feature region, with region A in the upper-right position
- Weighting matrix:

  0 1 2 3
  1 0 1 2
  2 1 0 1
  3 2 1 0

Up-Down halving:
- Regions A and B are divided horizontally in the feature region
- Weighting matrix:

  2 2 2 2
  1 1 1 1
  1 1 1 1
  2 2 2 2

Different halving schemes have different performance in terms of filtering rate and adverse effect on feature matching.

Slide 45: Dominant Orientation Finder
Problems:
- If Mass_A and Mass_B are close to each other, a small change in the feature may lead to a different dominant orientation determination
- Descriptions of the same feature with different dominant orientations can be very different
- An incorrect dominant orientation leads to an incorrect match
Solution:
- The difference between Mass_A and Mass_B must be large enough before it is used to determine the dominant orientation
- If Mass_A > k * Mass_B then dominant = θ
- If Mass_B > k * Mass_A then dominant = 180° + θ
- k is determined empirically (~1.2 for the Right-Left halving scheme)

Slide 46: Dominant Orientation Finder
Determining k:
- By controlling the value of k, we control the number of features whose dominant orientation is determinable, N_determinable, and the number of features with unstable dominant orientations, N_unstable
- Small k: N_determinable and N_unstable both increase; fewer repeated features per keypoint, but the features' dominant orientations are less stable
- Large k: N_determinable and N_unstable both decrease; more repeated features, but higher stability

Slide 47: Determining the Best Scheme and k Empirically
Objective: find the best halving and weighting scheme and the best threshold k
Assessment:
- Filtering ratio = N_determinable / max N_determinable
- Matching performance = recall rate / max achievable recall rate (reflects N_unstable)
- Prefer a high filtering ratio and high matching performance (both attain 1 at best)
Experiment: assessed on more than 2000 features
Result:
- The Right-Left and Diagonal halving schemes perform best, with k = 1.1 to 1.3
- We choose the Diagonal halving scheme with k = 1.3 for our descriptor

