
1 CS 1699: Intro to Computer Vision Local Image Features: Extraction and Description Prof. Adriana Kovashka University of Pittsburgh September 17, 2015

2 Announcement What: CS Welcome Party When: Friday, September 18, 2015 at 11:00AM Where: SENSQ 5317 Why (1): To learn about the CS Department, the CS Major and other interesting / useful things Why (2): To meet other students and faculty Why (3): To have some pizza!

3 Plan for today Feature extraction / keypoint detection Feature description Homework due on Tuesday Today's office hours are moved to 1-2pm tomorrow

4 An image is a set of pixels… What we see vs. what a computer sees. Adapted from S. Narasimhan

5 Problems with pixel representation Not invariant to small changes – Translation – Illumination – etc. Some parts of an image are more important than others What do we want to represent?

6 Interest points Note: “interest points” = “keypoints”, also sometimes called “features” Many applications – tracking: which points are good to track? – recognition: find patches likely to tell us something about object category – 3D reconstruction: find correspondences across different views D. Hoiem

7 Human eye movements Yarbus eye tracking D. Hoiem

8 Interest points Suppose you have to click on some point, go away, and come back after I deform the image, then click on the same point again. Which points would you choose? (Figure: original vs. deformed image.) D. Hoiem

9 Choosing distinctive interest points If you wanted to meet a friend would you say a)“Let’s meet on campus.” b)“Let’s meet on Green street.” c)“Let’s meet at Green and Wright.” – Corner detection Or if you were in a secluded area: a)“Let’s meet in the Plains of Akbar.” b)“Let’s meet on the side of Mt. Doom.” c)“Let’s meet on top of Mt. Doom.” – Blob (valley/peak) detection D. Hoiem

10 Choosing interest points Where would you tell your friend to meet you? D. Hoiem

12 What points would you choose? K. Grauman

13 Overview of Keypoint Matching 1. Find a set of distinctive keypoints 2. Define a region around each keypoint 3. Extract and normalize the region content 4. Compute a local descriptor from the normalized region 5. Match local descriptors (query vs. database) K. Grauman, B. Leibe

14 Goals for Keypoints Detect points that are repeatable and distinctive Adapted from D. Hoiem

15 Key trade-offs Detection: more repeatable vs. more points (robust to occlusion; precise localization). Description: more distinctive vs. more flexible (minimize wrong matches; maximize correct matches; robust to expected variations). Adapted from D. Hoiem

16 Why extract features? Motivation: panorama stitching We have two images – how do we combine them? L. Lazebnik

17 Why extract features? Motivation: panorama stitching We have two images – how do we combine them? Step 1: extract features Step 2: match features L. Lazebnik

18 Why extract features? Motivation: panorama stitching We have two images – how do we combine them? Step 1: extract features Step 2: match features Step 3: align images L. Lazebnik

19 Goal: interest operator repeatability We want to detect (at least some of) the same points in both images; otherwise there is no chance to find true matches. Yet we have to be able to run the detection procedure independently per image. K. Grauman

20 Goal: descriptor distinctiveness We want to be able to reliably determine which point goes with which. Must provide some invariance to geometric and photometric differences between the two views. K. Grauman

21 Corners as distinctive interest points We should easily recognize the point by looking through a small window. Shifting a window in any direction should give a large change in intensity: "edge": no change along the edge direction; "corner": significant change in all directions; "flat" region: no change in any direction. A. Efros, D. Frolova, D. Simakov

26 Harris Detector: Mathematics Window-averaged squared change of intensity induced by shifting the image data by [u, v]: E(u,v) = \sum_{x,y} w(x,y)\,[I(x+u, y+v) - I(x,y)]^2, where I(x,y) is the intensity, I(x+u, y+v) the shifted intensity, and w(x,y) the window function: either 1 inside the window and 0 outside, or a Gaussian. D. Frolova, D. Simakov

27 Harris Detector: Mathematics The same window-averaged error E(u, v), visualized as a surface over the shift (u, v). D. Frolova, D. Simakov
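For concreteness, here is a minimal brute-force sketch of E(u, v) in Python/NumPy; the function name, the wrap-around shifting via np.roll, and the shift range are illustrative assumptions, not part of the slides:

```python
import numpy as np

def error_surface(patch, window, max_shift=3):
    """Brute-force E(u, v): window-weighted sum of squared differences
    between a patch and its copy shifted by (u, v).
    patch: 2-D grayscale array; window: same-sized mask (1/0 or Gaussian)."""
    E = np.zeros((2 * max_shift + 1, 2 * max_shift + 1))
    for u in range(-max_shift, max_shift + 1):        # horizontal shift
        for v in range(-max_shift, max_shift + 1):    # vertical shift
            # np.roll wraps around at the borders; good enough for a sketch
            shifted = np.roll(np.roll(patch, v, axis=0), u, axis=1)
            E[v + max_shift, u + max_shift] = np.sum(window * (shifted - patch) ** 2)
    return E
```

For a flat region E stays near zero for all shifts, for an edge it rises only across the edge direction, and for a corner it rises in every direction, matching the intuition on slide 21.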

28 Harris Detector: Mathematics Expanding I(x,y) in a Taylor series, we have, for small shifts [u, v], a quadratic approximation to the error surface between a patch and itself, shifted by [u, v]: E(u,v) \approx [u \; v]\, M\, [u \; v]^T, where M is a 2x2 matrix computed from image derivatives: M = \sum_{x,y} w(x,y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}. D. Frolova, D. Simakov

29 Harris Detector: Mathematics Notation: I_x = \partial I / \partial x, I_y = \partial I / \partial y (image derivatives). K. Grauman

30 Harris Detector [Harris88] Second moment matrix M. 1. Compute image derivatives I_x, I_y. 2. Compute squares of derivatives: I_x^2, I_y^2, I_x I_y. 3. Apply a Gaussian filter g(\sigma_I): g(I_x^2), g(I_y^2), g(I_x I_y). 4. Compute the cornerness function har, which is large only when both eigenvalues are strong. 5. Non-maximum suppression (optionally, blur first). D. Hoiem

31 What does this matrix reveal? First, consider an axis-aligned corner: then M is diagonal, M = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}. K. Grauman

32 What does this matrix reveal? For an axis-aligned corner, the dominant gradient directions align with the x or y axis. Look for locations where both λ's are large; if either λ is close to 0, then this is not corner-like. What if we have a corner that is not aligned with the image axes? K. Grauman

33 What does this matrix reveal? Since M is symmetric, we have M = R^{-1} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} R. The eigenvalues of M reveal the amount of intensity change in the two principal orthogonal gradient directions in the window. K. Grauman

34 Corner response function "flat" region: λ_1 and λ_2 are small; "edge": λ_1 >> λ_2 or λ_2 >> λ_1; "corner": λ_1 and λ_2 are large, λ_1 ~ λ_2. Adapted from A. Efros, D. Frolova, D. Simakov, K. Grauman

35 Harris Detector: Mathematics Measure of corner response: R = \det M - k\,(\mathrm{trace}\, M)^2, where \det M = \lambda_1 \lambda_2 and \mathrm{trace}\, M = \lambda_1 + \lambda_2 (k: empirical constant, k = 0.04-0.06). D. Frolova, D. Simakov

36 Harris Detector: Summary Compute image gradients I_x and I_y for all pixels. For each pixel, compute the second moment matrix M by looping over its neighbors (x, y), then compute the corner response R (k: empirical constant, k = 0.04-0.06). Find points with large corner response (R > threshold). Take the points of locally maximum R as the detected feature points (i.e., pixels where R is larger than at all 4 or 8 neighbors). D. Frolova, D. Simakov
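A compact NumPy/SciPy sketch of this pipeline; the Sobel derivatives, the Gaussian window sigma, and the relative threshold are illustrative choices, not the slides' reference implementation:

```python
import numpy as np
from scipy import ndimage

def harris_response(image, sigma=1.0, k=0.05):
    """Harris cornerness R for every pixel, following the slide's steps."""
    # 1. Image derivatives
    Ix = ndimage.sobel(image.astype(float), axis=1)
    Iy = ndimage.sobel(image.astype(float), axis=0)
    # 2. Squares/products of derivatives
    Ixx, Iyy, Ixy = Ix * Ix, Iy * Iy, Ix * Iy
    # 3. Gaussian windowing -> entries of the second moment matrix M
    Sxx = ndimage.gaussian_filter(Ixx, sigma)
    Syy = ndimage.gaussian_filter(Iyy, sigma)
    Sxy = ndimage.gaussian_filter(Ixy, sigma)
    # 4. Cornerness: R = det(M) - k * trace(M)^2
    det_M = Sxx * Syy - Sxy ** 2
    trace_M = Sxx + Syy
    return det_M - k * trace_M ** 2

def harris_corners(image, rel_threshold=0.01, k=0.05):
    """5. Threshold R and keep 3x3 local maxima (non-maximum suppression)."""
    R = harris_response(image, k=k)
    local_max = ndimage.maximum_filter(R, size=3)
    mask = (R == local_max) & (R > rel_threshold * R.max())
    return np.argwhere(mask)   # (row, col) coordinates of detected corners
```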

37 Example of Harris application K. Grauman

38 Example of Harris application: compute corner response at every pixel. K. Grauman

39 Harris Detector – Responses [ Harris88 ] Effect: A very precise corner detector. D. Hoiem

40 Harris Detector - Responses [ Harris88 ] D. Hoiem

41 Affine intensity change Only derivatives are used => invariance to intensity shift I → I + b. Intensity scaling I → a·I scales the response R, so which peaks pass a fixed threshold can change. (Figure: R vs. x with a threshold, before and after scaling.) Overall: partially invariant to affine intensity change I → a·I + b. L. Lazebnik

42 Image translation Derivatives and window function are shift-invariant Corner location is covariant w.r.t. translation L. Lazebnik

43 Image rotation Second moment ellipse rotates but its shape (i.e. eigenvalues) remains the same Corner location is covariant w.r.t. rotation L. Lazebnik

44 Scaling Invariant to image scale? (Figure: image vs. zoomed image.) A. Torralba

45 Scaling When the image is zoomed in, all points along the corner will be classified as edges; the corner is only detected at the original scale. Corner location is not covariant to scaling! L. Lazebnik

46 Scale Invariant Detection The problem: how do we choose corresponding circles independently in each image? Do objects in the image have a characteristic scale that we can identify? D. Frolova, D. Simakov

47 Scale Invariant Detection Solution: design a function on the region (circle) which is "scale invariant" (the same for corresponding regions, even if they are at different scales), and take a local maximum of this function. (Figure: f vs. region size for Image 1 and Image 2; the maxima occur at region sizes s_1 and s_2, here related by scale = 1/2.) A. Torralba

48 Scale Invariant Detection A "good" function for scale detection has one stable, sharp peak. (Figure: f vs. region size; flat or multi-peaked responses are bad, a single sharp peak is good.) For usual images, a good function is one which responds to contrast (sharp local intensity change). A. Torralba

49 Automatic Scale Selection How to find corresponding patch sizes? K. Grauman, B. Leibe

50 Automatic Scale Selection Function responses for increasing scale (scale signature) K. Grauman, B. Leibe

56 Recall: Edge detection Convolve the signal f with the derivative of a Gaussian; an edge corresponds to a maximum of the derivative. S. Seitz

57 Recall: Edge detection Convolve f with the second derivative of a Gaussian; an edge corresponds to a zero crossing of the 2nd derivative. S. Seitz

58 From edges to blobs Edge = ripple; blob = superposition of two ripples. Spatial selection: the magnitude of the Laplacian response will achieve a maximum at the center of the blob, provided the scale of the Laplacian is "matched" to the scale of the blob. L. Lazebnik

59 Blob detection in 2D Laplacian of Gaussian: circularly symmetric operator for blob detection in 2D, \nabla^2 g = \frac{\partial^2 g}{\partial x^2} + \frac{\partial^2 g}{\partial y^2}. K. Grauman
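A sketch of how one might build such a LoG kernel by hand; the kernel-size heuristic and the zero-mean correction are assumptions, and the constant normalization factor is dropped:

```python
import numpy as np

def log_kernel(sigma, size=None):
    """Laplacian-of-Gaussian kernel: a circularly symmetric blob detector."""
    if size is None:
        size = int(2 * np.ceil(3 * sigma) + 1)   # cover ~3 sigma each side
    r = np.arange(size) - size // 2
    x, y = np.meshgrid(r, r)
    r2 = x ** 2 + y ** 2
    # LoG(x, y) ∝ (r^2 - 2*sigma^2) / sigma^4 * exp(-r^2 / (2*sigma^2))
    kernel = (r2 - 2 * sigma ** 2) / sigma ** 4 * np.exp(-r2 / (2 * sigma ** 2))
    return kernel - kernel.mean()   # zero mean, so flat regions respond with 0
```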

60 What Is A Useful Signature Function? Laplacian of Gaussian = “blob” detector K. Grauman, B. Leibe

61 We can approximate the Laplacian with a difference of Gaussians, which is more efficient to implement: the (scale-normalized) Laplacian L = \sigma^2 \,(G_{xx}(x,y,\sigma) + G_{yy}(x,y,\sigma)) is approximated by the Difference of Gaussians DoG = G(x,y,k\sigma) - G(x,y,\sigma).
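A minimal sketch of the DoG approximation, assuming the commonly quoted scale ratio k = 1.6:

```python
import numpy as np
from scipy import ndimage

def difference_of_gaussians(image, sigma, k=1.6):
    """Approximate the Laplacian by blurring at two nearby scales
    and subtracting; cheaper than convolving with a LoG kernel."""
    g1 = ndimage.gaussian_filter(image.astype(float), sigma)
    g2 = ndimage.gaussian_filter(image.astype(float), k * sigma)
    return g2 - g1
```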

62 Gaussian pyramid example

63 DoG – Efficient Computation Computation in a Gaussian scale pyramid: the original image is repeatedly smoothed and downsampled (sampling with step 2 between octaves). K. Grauman, B. Leibe

64 Find local maxima in position-scale space of the Difference-of-Gaussian: each candidate is compared against its neighbors in x, y, and scale σ, producing a list of (x, y, s). K. Grauman, B. Leibe
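A brute-force sketch of this extremum search, assuming a precomputed stack of DoG levels and a hypothetical contrast threshold:

```python
import numpy as np

def scale_space_extrema(dog_stack, threshold=0.03):
    """Local extrema of DoG in position-scale space: a point must beat
    all 26 neighbors (8 in its level, 9 above, 9 below).
    dog_stack: array of shape (num_scales, H, W).
    Returns a list of (x, y, scale_index)."""
    keypoints = []
    S, H, W = dog_stack.shape
    for s in range(1, S - 1):
        for y in range(1, H - 1):
            for x in range(1, W - 1):
                val = dog_stack[s, y, x]
                if abs(val) < threshold:          # reject low-contrast points
                    continue
                cube = dog_stack[s-1:s+2, y-1:y+2, x-1:x+2]
                if val == cube.max() or val == cube.min():
                    keypoints.append((x, y, s))
    return keypoints
```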

65 Results: Difference-of-Gaussian K. Grauman, B. Leibe

66 Gaussian pyramid example

67 Local features: desired properties Repeatability – The same feature can be found in several images despite geometric and photometric transformations Saliency – Each feature has a distinctive description Compactness and efficiency – Many fewer features than image pixels Locality – A feature occupies a relatively small area of the image; robust to clutter and occlusion K. Grauman

68 Plan for today Feature extraction / keypoint detection – Corners – Blobs Feature description

69 Raw patches as local descriptors The simplest way to describe the neighborhood around an interest point is to write down the list of intensities to form a feature vector. But this is very sensitive to even small shifts, rotations. K. Grauman

70 Geometric transformations e.g. scale, translation, rotation K. Grauman

71 Photometric transformations T. Tuytelaars

72 Local Descriptors The ideal descriptor should be – Robust – Distinctive – Compact – Efficient Most available descriptors focus on edge/gradient information – Capture texture information – Color rarely used K. Grauman, B. Leibe

73 Local Descriptors: SIFT Descriptor [Lowe, ICCV 1999] Histogram of oriented gradients Captures important texture information Robust to small translations / affine deformations K. Grauman, B. Leibe

74 Basics of Lowe’s SIFT algorithm Run Difference-of-Gaussian keypoint detector – Find maxima in location/scale space Find all major orientations – Bin orientations – Weight by gradient magnitude – Weight by distance to center (Gaussian-weighted mean) – Return orientations within 0.8 of peak For each (x, y, scale, orientation), create descriptor: – Sample 16x16 gradient magnitude and relative orientation – Bin 4x4 samples into 4x4 histograms – Threshold values to max of 0.2, divide by L2 norm – Final descriptor: 4x4x8 normalized histograms Lowe IJCV 2004

75 Scale Invariant Feature Transform Basic idea: take a 16x16 square window around the detected feature, compute the gradient orientation for each pixel, and create a histogram over edge orientations weighted by magnitude (angle histogram over [0, 2π]). L. Zitnick, adapted from D. Lowe
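A sketch of such a magnitude-weighted orientation histogram; the bin count and the precomputed patch derivatives are assumptions:

```python
import numpy as np

def orientation_histogram(patch_dx, patch_dy, num_bins=8):
    """Histogram of gradient orientations over a patch, weighted by
    gradient magnitude; angles binned over [0, 2*pi)."""
    magnitude = np.sqrt(patch_dx ** 2 + patch_dy ** 2)
    angle = np.arctan2(patch_dy, patch_dx) % (2 * np.pi)
    hist, _ = np.histogram(angle, bins=num_bins,
                           range=(0, 2 * np.pi),
                           weights=magnitude)
    return hist
```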

76 SIFT descriptor Full version Divide the 16x16 window into a 4x4 grid of cells (2x2 case shown below) Compute an orientation histogram for each cell 16 cells * 8 orientations = 128 dimensional descriptor L. Zitnick, adapted from D. Lowe

77 SIFT descriptor Full version Divide the 16x16 window into a 4x4 grid of cells (2x2 case shown below). Compute an orientation histogram for each cell. 16 cells * 8 orientations = 128 dimensional descriptor. Threshold-normalize the descriptor: normalize to unit length, clip each entry d_i such that d_i ≤ 0.2, then renormalize. L. Zitnick, adapted from D. Lowe
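A sketch of this threshold-normalization step; the small epsilon guarding against division by zero is an added assumption:

```python
import numpy as np

def normalize_sift(descriptor, clip=0.2):
    """Normalize a 128-D SIFT vector to unit length, clip entries at 0.2
    (limits the influence of a few large gradients), then renormalize."""
    d = descriptor / (np.linalg.norm(descriptor) + 1e-12)
    d = np.minimum(d, clip)
    return d / (np.linalg.norm(d) + 1e-12)
```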

78 Making descriptor rotation invariant Rotate patch according to its dominant gradient orientation; this puts the patches into a canonical orientation. Image from Matthew Brown. K. Grauman

79 SIFT descriptor [Lowe 2004] Extraordinarily robust matching technique. Can handle changes in viewpoint (up to about 60 degrees of out-of-plane rotation) and significant changes in illumination (sometimes even day vs. night). Fast and efficient; can run in real time. Lots of code available: http://people.csail.mit.edu/albert/ladypack/wiki/index.php/Known_implementations_of_SIFT S. Seitz
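For example, a usage sketch with OpenCV, assuming OpenCV >= 4.4 (where SIFT ships in the main cv2 module); the image file names are hypothetical:

```python
import cv2

# Load two (hypothetical) grayscale images to match
img1 = cv2.imread('scene1.jpg', cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread('scene2.jpg', cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
# Detect keypoints and compute 128-D descriptors in one call
keypoints1, descriptors1 = sift.detectAndCompute(img1, None)
keypoints2, descriptors2 = sift.detectAndCompute(img2, None)
print(len(keypoints1), 'keypoints in image 1,', len(keypoints2), 'in image 2')
```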

80 Matching local features To generate candidate matches, find patches that have the most similar appearance (e.g., lowest SSD). Simplest approach: compare them all, take the closest (or closest k, or within a thresholded distance). K. Grauman

81 Ambiguous matches At what SSD value do we have a good match? To add robustness to matching, we can consider the ratio: distance to best match / distance to second-best match. If low, the first match looks good; if high, the match could be ambiguous. K. Grauman

82 Matching SIFT Descriptors Nearest neighbor (Euclidean distance); threshold the ratio of nearest to 2nd-nearest descriptor. Lowe IJCV 2004
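A NumPy sketch of nearest-neighbor matching with this ratio test; the function name is an assumption, and the 0.8 threshold follows Lowe's commonly cited value:

```python
import numpy as np

def match_ratio_test(desc1, desc2, ratio=0.8):
    """Match each descriptor in desc1 to its nearest neighbor in desc2,
    accepting only matches whose best distance is well below the
    second-best (the ratio test). desc1, desc2: (N, 128) arrays."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)   # Euclidean distances
        j, j2 = np.argsort(dists)[:2]               # best and second best
        if dists[j] < ratio * dists[j2]:
            matches.append((i, j))
    return matches
```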

83 SIFT Repeatability Lowe IJCV 2004

84 SIFT Repeatability Lowe IJCV 2004

85 Robust feature-based alignment L. Lazebnik

86 Robust feature-based alignment Extract features L. Lazebnik

87 Robust feature-based alignment Extract features Compute putative matches L. Lazebnik

88 Robust feature-based alignment Extract features Compute putative matches Loop: Hypothesize transformation T (small group of putative matches that are related by T) L. Lazebnik

89 Robust feature-based alignment Extract features Compute putative matches Loop: Hypothesize transformation T (small group of putative matches that are related by T) Verify transformation (search for other matches consistent with T) L. Lazebnik
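The hypothesize-and-verify loop above is essentially RANSAC. A minimal sketch for the simplest transformation, a pure translation; the function name, iteration count, and inlier tolerance are assumptions:

```python
import numpy as np

def ransac_translation(pts1, pts2, num_iters=500, tol=3.0):
    """RANSAC for a translation T: one putative match hypothesizes T,
    the remaining matches vote by counting inliers.
    pts1, pts2: (N, 2) arrays of putatively matched point coordinates."""
    best_t, best_inliers = None, 0
    n = len(pts1)
    rng = np.random.default_rng(0)
    for _ in range(num_iters):
        i = rng.integers(n)                # hypothesize T from one match
        t = pts2[i] - pts1[i]
        residuals = np.linalg.norm(pts1 + t - pts2, axis=1)
        inliers = np.sum(residuals < tol)  # verify: count consistent matches
        if inliers > best_inliers:
            best_t, best_inliers = t, inliers
    return best_t, best_inliers
```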


91 Applications of local invariant features Image alignment Indexing and retrieval Recognition Robot navigation 3D reconstruction Wide baseline stereo Panoramas Motion tracking … Adapted from K. Grauman and L. Lazebnik

92 Recognition of specific objects, scenes Rothganger et al. 2003 Lowe 2002 Schmid and Mohr 1997 Sivic and Zisserman, 2003 K. Grauman

93 Wide baseline stereo [Image from T. Tuytelaars ECCV 2006 tutorial]

94 Automatic mosaicing http://www.cs.ubc.ca/~mbrown/autostitch/autostitch.html K. Grauman

95 Summary Keypoint detection: repeatable and distinctive – Corners, blobs, stable regions – Laplacian of Gaussian, automatic scale selection Descriptors: robust and selective – Histograms for robustness to small shifts and translations (SIFT descriptor) Adapted from D. Hoiem, K. Grauman

