Lecture 36: Course review CS4670 / 5670: Computer Vision Noah Snavely.

Slides:

Advertisements

Similar presentations

Lecture 11: Two-view geometry

Advertisements

Computer vision: models, learning and inference Chapter 13 Image preprocessing and feature extraction.

TP14 - Local features: detection and description Computer Vision, FCUP, 2014 Miguel Coimbra Slides by Prof. Kristen Grauman.

MASKS © 2004 Invitation to 3D vision Lecture 7 Step-by-Step Model Buidling.

Lecture 31: Modern object recognition

Graph-Based Image Segmentation

Edge detection Goal: Identify sudden changes (discontinuities) in an image Intuitively, most semantic and shape information from the image can be encoded.

Mosaics con’t CSE 455, Winter 2010 February 10, 2010.

CS4670: Computer Vision Kavita Bala Lecture 10: Feature Matching and Transforms.

Lecture 6: Feature matching CS4670: Computer Vision Noah Snavely.

Lecture 8: Geometric transformations CS4670: Computer Vision Noah Snavely.

Lecture 2: Edge detection and resampling

Lecture 4: Feature matching

Lecture 7: Image Alignment and Panoramas CS6670: Computer Vision Noah Snavely What’s inside your fridge?

Lecture 21: Multiple-view geometry and structure from motion

Feature extraction: Corners and blobs

Lecture 9: Image alignment CS4670: Computer Vision Noah Snavely

Lecture 10: Cameras CS4670: Computer Vision Noah Snavely Source: S. Lazebnik.

Lecture 35: Course review CS4670: Computer Vision Noah Snavely.

Lecture 3a: Feature detection and matching CS6670: Computer Vision Noah Snavely.

Lecture 20: Two-view geometry CS6670: Computer Vision Noah Snavely.

Lecture 2: Image filtering

Automatic Image Alignment (feature-based) : Computational Photography Alexei Efros, CMU, Fall 2006 with a lot of slides stolen from Steve Seitz and.

CS4670: Computer Vision Kavita Bala Lecture 7: Harris Corner Detection.

CS4670: Computer Vision Kavita Bala Lecture 8: Scale invariance.

Lecture 6: Feature matching and alignment CS4670: Computer Vision Noah Snavely.

Automatic Camera Calibration

Image Stitching Ali Farhadi CSE 455

What we didn’t have time for CS664 Lecture 26 Thursday 12/02/04 Some slides c/o Dan Huttenlocher, Stefano Soatto, Sebastian Thrun.

1 Interest Operators Harris Corner Detector: the first and most basic interest operator Kadir Entropy Detector and its use in object recognition SIFT interest.

Lecture 06 06/12/2011 Shai Avidan הבהרה: החומר המחייב הוא החומר הנלמד בכיתה ולא זה המופיע / לא מופיע במצגת.

Lecture 4: Feature matching CS4670 / 5670: Computer Vision Noah Snavely.

Lecture 29: Face Detection Revisited CS4670 / 5670: Computer Vision Noah Snavely.

Lecture 31: Modern recognition CS4670 / 5670: Computer Vision Noah Snavely.

Why is computer vision difficult?

Lec 22: Stereo CS4670 / 5670: Computer Vision Kavita Bala.

Announcements Project 3 due Thursday by 11:59pm Demos on Friday; signup on CMS Prelim to be distributed in class Friday, due Wednesday by the beginning.

Lecture 7: Features Part 2 CS4670/5670: Computer Vision Noah Snavely.

Bahadir K. Gunturk1 Phase Correlation Bahadir K. Gunturk2 Phase Correlation Take cross correlation Take inverse Fourier transform  Location of the impulse.

Feature extraction: Corners and blobs. Why extract features? Motivation: panorama stitching We have two images – how do we combine them?

Final Review Course web page: vision.cis.udel.edu/~cv May 21, 2003  Lecture 37.

Lecture 15: Eigenfaces CS6670: Computer Vision Noah Snavely.

Lecture 15: Eigenfaces CS6670: Computer Vision Noah Snavely.

Local features: detection and description

Midterm Review. Tuesday, November 3 7:15 – 9:15 p.m. in room 113 Psychology Closed book One 8.5” x 11” sheet of notes on both sides allowed Bring a calculator.

Announcements Final is Thursday, March 18, 10:30-12:20 –MGH 287 Sample final out today.

Lecture 10: Harris Corner Detector CS4670/5670: Computer Vision Kavita Bala.

Invariant Local Features Image content is transformed into local feature coordinates that are invariant to translation, rotation, scale, and other imaging.

CS4670 / 5670: Computer Vision Kavita Bala Lec 27: Stereo.

TP12 - Local features: detection and description

Announcements Final is Thursday, March 20, 10:30-12:20pm

Lecture 4: Harris corner detection

Lecture 7: Image alignment

Local features: detection and description May 11th, 2017

Feature description and matching

Image Stitching Slides from Rick Szeliski, Steve Seitz, Derek Hoiem, Ira Kemelmacher, Ali Farhadi.

Lecture 31: Graph-Based Image Segmentation

Filtering Things to take away from this lecture An image as a function

Seam Carving Project 1a due at midnight tonight.

Announcements Final is Thursday, March 16, 10:30-12:20

Announcements Project 4 questions Evaluations.

CS4670: Intro to Computer Vision

Lecture VI: Corner and Blob Detection

Feature descriptors and matching

Lecture 5: Feature invariance

Filtering An image as a function Digital vs. continuous images

Lecture 5: Feature invariance

Image Stitching Linda Shapiro ECE/CSE 576.

Image Stitching Linda Shapiro ECE P 596.

Presentation transcript:

Lecture 36: Course review CS4670 / 5670: Computer Vision Noah Snavely

Announcements Project 5 due tonight, 11:59pm Final exam Monday, Dec 10, 9am – Open-note, closed-book Course evals –

Questions?

Neat Video

Topics – image processing Filtering Edge detection Image resampling / aliasing / interpolation Feature detection – Harris corners – SIFT – Invariant features Feature matching

Topics – 2D geometry Image transformations Image alignment / least squares RANSAC Panoramas

Topics – 3D geometry Cameras Perspective projection Single-view modeling Stereo Two-view geometry (F-matrices, E-matrices) Structure from motion Multi-view stereo

Topics – recognition Skin detection / probabilistic modeling Eigenfaces Viola-Jones face detection (cascades / adaboost) Bag-of-words models Segmentation / graph cuts

Topics – Light, reflectance, cameras Light, BRDFS Photometric stereo

Image Processing

Linear filtering One simple version: linear filtering (cross-correlation, convolution) – Replace each pixel by a linear combination of its neighbors The prescription for the linear combination is called the “kernel” (or “mask”, “filter”) kernel 8 Modified image data Source: L. Zhang Local image data

Convolution Same as cross-correlation, except that the kernel is “flipped” (horizontally and vertically) Convolution is commutative and associative This is called a convolution operation:

Gaussian Kernel Source: C. Rasmussen

The gradient points in the direction of most rapid increase in intensity Image gradient The gradient of an image: The edge strength is given by the gradient magnitude: The gradient direction is given by: how does this relate to the direction of the edge? Source: Steve Seitz

Finding edges gradient magnitude

thinning (non-maximum suppression) Finding edges

Image sub-sampling 1/4 (2x zoom) 1/8 (4x zoom) Why does this look so crufty? 1/2 Source: S. Seitz

Subsampling with Gaussian pre-filtering G 1/4G 1/8Gaussian 1/2 Solution: filter the image, then subsample Source: S. Seitz

Image interpolation “Ideal” reconstruction Nearest-neighbor interpolation Linear interpolation Gaussian reconstruction Source: B. Curless

Image interpolation Nearest-neighbor interpolationBilinear interpolationBicubic interpolation Original image: x 10

The surface E(u,v) is locally approximated by a quadratic form. The second moment matrix

The Harris operator min is a variant of the “Harris operator” for feature detection The trace is the sum of the diagonals, i.e., trace(H) = h 11 + h 22 Very similar to min but less expensive (no square root) Called the “Harris Corner Detector” or “Harris Operator” Lots of other detectors, this is one of the most popular

Laplacian of Gaussian “Blob” detector Find maxima and minima of LoG operator in space and scale * = maximum minima

Scale-space blob detector: Example

f1f1 f2f2 f2'f2' Feature distance How to define the difference between two features f 1, f 2 ? Better approach: ratio distance = ||f 1 - f 2 || / || f 1 - f 2 ’ || f 2 is best SSD match to f 1 in I 2 f 2 ’ is 2 nd best SSD match to f 1 in I 2 gives large values for ambiguous matches I1I1 I2I2

2D Geometry

Parametric (global) warping Transformation T is a coordinate-changing machine: p’ = T(p) What does it mean that T is global? – Is the same for any point p – can be described by just a few numbers (parameters) Let’s consider linear xforms (can be represented by a 2D matrix): T p = (x,y)p’ = (x’,y’)

2D image transformations These transformations are a nested set of groups Closed under composition and inverse is a member

Projective Transformations aka Homographies aka Planar Perspective Maps Called a homography (or planar perspective map)

Inverse Warping Get each pixel g(x’,y’) from its corresponding location (x,y) = T -1 (x,y) in f(x,y) f(x,y)g(x’,y’) xx’ T -1 (x,y) Requires taking the inverse of the transform y y’

Affine transformations

Matrix form 2n x 6 6 x 1 2n x 1

RANSAC General version: 1.Randomly choose s samples Typically s = minimum sample size that lets you fit a model 2.Fit a model (e.g., line) to those samples 3.Count the number of inliers that approximately fit the model 4.Repeat N times 5.Choose the model that has the largest set of inliers

Projecting images onto a common plane mosaic PP each image is warped with a homography Can’t create a 360 panorama this way…

3D Geometry

Pinhole camera Add a barrier to block off most of the rays – This reduces blurring – The opening known as the aperture – How does this transform the image?

Perspective Projection Projection is a matrix multiply using homogeneous coordinates: divide by third coordinate This is known as perspective projection The matrix is the projection matrix

Projection matrix (t in book’s notation) translationrotationprojection intrinsics

l Point and line duality – A line l is a homogeneous 3-vector – It is  to every point (ray) p on the line: l p=0 p1p1 p2p2 What is the intersection of two lines l 1 and l 2 ? p is  to l 1 and l 2  p = l 1  l 2 Points and lines are dual in projective space l1l1 l2l2 p What is the line l spanned by rays p 1 and p 2 ? l is  to p 1 and p 2  l = p 1  p 2 l can be interpreted as a plane normal

Vanishing points Properties – Any two parallel lines (in 3D) have the same vanishing point v – The ray from C through v is parallel to the lines – An image may have more than one vanishing point in fact, every image point is a potential vanishing point image plane camera center C line on ground plane vanishing point V line on ground plane

Measuring height Camera height

Your basic stereo algorithm For each epipolar line For each pixel in the left image compare with every pixel on same epipolar line in right image pick pixel with minimum match cost Improvement: match windows

Stereo as energy minimization Better objective function {{ match cost smoothness cost Want each pixel to find a good match in the other image Adjacent pixels should (usually) move about the same amount

Fundamental matrix This epipolar geometry of two views is described by a Very Special 3x3 matrix, called the fundamental matrix maps (homogeneous) points in image 1 to lines in image 2! The epipolar line (in image 2) of point p is: Epipolar constraint on corresponding points: epipolar plane epipolar line 0 (projection of ray) Image 1Image 2

Epipolar geometry demo

8-point algorithm In reality, instead of solving, we seek f to minimize, least eigenvector of.

Structure from motion Camera 1 Camera 2 Camera 3 R 1,t 1 R 2,t 2 R 3,t 3 X1X1 X4X4 X3X3 X2X2 X5X5 X6X6 X7X7 minimize f (R, T, P)f (R, T, P) p 1,1 p 1,2 p 1,3 non-linear least squares

Stereo: another view error depth

Recognition

Face detection Do these images contain faces? Where?

Skin classification techniques Skin classifier Given X = (R,G,B): how to determine if it is skin or not? Nearest neighbor –find labeled pixel closest to X –choose the label for that pixel Data modeling –fit a model (curve, surface, or volume) to each class Probabilistic data modeling –fit a probability model to each class

Dimensionality reduction The set of faces is a “subspace” of the set of images Suppose it is K dimensional We can find the best subspace using PCA This is like fitting a “hyper-plane” to the set of faces –spanned by vectors v 1, v 2,..., v K –any face

Eigenfaces PCA extracts the eigenvectors of A Gives a set of vectors v 1, v 2, v 3,... Each one of these vectors is a direction in face space –what do these look like?

Perceptual and Sensory Augmented Computing Visual Object Recognition Tutorial K. Grauman, B. Leibe Viola-Jones Face Detector: Summary Train with 5K positives, 350M negatives Real-time detector using 38 layer cascade 6061 features in final layer [Implementation available in OpenCV: 56 Faces Non-faces Train cascade of classifiers with AdaBoost Selected features, thresholds, and weights New image Apply to each subwindow

Perceptual and Sensory Augmented Computing Visual Object Recognition Tutorial K. Grauman, B. Leibe Viola-Jones Face Detector: Results 57 K. Grauman, B. Leibe First two features selected

Bag-of-words models ….. frequency codewords

Histogram of Oriented Gradients (HoG) [Dalal and Triggs, CVPR 2005] 10x10 cells 20x20 cells HoGify

Support Vector Machines (SVMs) Discriminative classifier based on optimal separating line (for 2D case) Maximize the margin between the positive and negative training examples [slide credit: Kristin Grauman]

Vision Contests PASCAL VOC Challenge 20 categories Annual classification, detection, segmentation, … challenges

Binary segmentation Suppose we want to segment an image into foreground and background

Binary segmentation as energy minimization Define a labeling L as an assignment of each pixel with a 0-1 label (background or foreground) Problem statement: find the labeling L that minimizes {{ match cost smoothness cost (“how similar is each labeled pixel to the foreground / background?”)

Segmentation by Graph Cuts Break Graph into Segments Delete links that cross between segments Easiest to break links that have low cost (similarity) –similar pixels should be in the same segments –dissimilar pixels should be in different segments w ABC

Cuts in a graph Link Cut set of links whose removal makes a graph disconnected cost of a cut: A B Find minimum cut gives you a segmentation

Cuts in a graph A B Normalized Cut a cut penalizes large segments fix by normalizing for size of segments volume(A) = sum of costs of all edges that touch A

Light, reflectance, cameras

Radiometry What determines the brightness of an image pixel? Light source properties Surface shape Surface reflectance properties Optics Sensor characteristics Slide by L. Fei-Fei Exposure

Classic reflection behavior ideal specular Lambertianrough specular from Steve Marschner

Photometric stereo N L1L1 L2L2 V L3L3 Can write this as a matrix equation:

Example

Questions? Good luck!