Manifold Learning in the Wild: A New Manifold Modeling and Learning Framework for Image Ensembles
Richard G. Baraniuk, Chinmay Hegde, Aswin C. Sankaranarayanan

Presentation transcript:

Manifold Learning in the Wild: A New Manifold Modeling and Learning Framework for Image Ensembles. Richard G. Baraniuk, Chinmay Hegde, Aswin C. Sankaranarayanan. Rice University

Sensor Data Deluge

Internet-Scale Databases. Tremendous size of the corpus of available data. –A Google Image Search for "Notre Dame Cathedral" yields 3M results → roughly 3 TB of data

Concise Models. Efficient processing / compression requires a concise representation. Our interest in this talk: collections of images

Concise Models. Our interest in this talk: collections of images parameterized by θ ∈ Θ –translations of an object: x-offset and y-offset –rotations of a 3D object: pitch, roll, yaw –wedgelets: orientation and offset. Such a collection is an image articulation manifold

Image Articulation Manifold. N-pixel images: x_θ ∈ R^N. K-dimensional articulation parameter space: θ ∈ Θ ⊂ R^K. Then M = {x_θ : θ ∈ Θ} is a K-dimensional manifold in the ambient space R^N. Very concise model –can be learned using nonlinear dimensionality reduction

Ex: Manifold Learning. LLE, ISOMAP, LE, HE, Diff. Geo., … Example: K = 1, rotation
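A minimal sketch (not from the slides) of what this looks like in practice: learning a synthetic K = 1 rotation manifold with off-the-shelf Isomap from scikit-learn. The template image and all parameters are illustrative.

```python
import numpy as np
from scipy.ndimage import rotate
from sklearn.manifold import Isomap

# Build a K=1 image articulation manifold: one template rotated through 360 degrees.
template = np.zeros((32, 32))
template[6:16, 14:18] = 1.0                      # small asymmetric bar so rotations are distinct

angles = np.linspace(0, 360, 180, endpoint=False)
images = np.stack([rotate(template, a, reshape=False, order=1).ravel()
                   for a in angles])             # each row is an N-pixel image (N = 1024)

# Nonlinear dimensionality reduction recovers the 1-D articulation (a closed loop in 2-D).
embedding = Isomap(n_neighbors=6, n_components=2).fit_transform(images)
print(embedding.shape)                           # (180, 2)
```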

Ex: Manifold Learning. K = 2: rotation and scale

Smooth IAMs. N-pixel images x_θ ∈ R^N with a low-dimensional articulation parameter space Θ. Local isometry: image distance is proportional to parameter-space distance. Linear tangent spaces are a close approximation locally.
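In symbols (same notation as above; the proportionality constant C is assumed), the local isometry property reads:

```latex
\| x_{\theta_1} - x_{\theta_2} \|_2 \;\approx\; C \, \| \theta_1 - \theta_2 \|_2
\qquad \text{for nearby } \theta_1, \theta_2 \in \Theta .
```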

Theory/Practice Disconnect: Smoothness. Practical image manifolds are not smooth! If images have sharp edges, then the manifold is everywhere non-differentiable [Donoho and Grimes]. Can we still use tangent approximations?

Failure of Tangent Plane Approximation. Ex: cross-fading when synthesizing / interpolating images that should lie on the manifold. (Figure: input image; geodesic path vs. linear path.)
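A tiny numpy sketch (illustrative, not from the slides) of the cross-fading effect: the Euclidean midpoint of two translated images is a two-ghost blend, while the true manifold midpoint is a half-translated image.

```python
import numpy as np

img = np.zeros((16, 16))
img[6:10, 2:6] = 1.0                                # a small square near the left edge

shift = 8
img_shifted = np.roll(img, shift, axis=1)           # the same square translated to the right

linear_mid = 0.5 * (img + img_shifted)              # Euclidean midpoint: a cross-fade with two ghosts
manifold_mid = np.roll(img, shift // 2, axis=1)     # midpoint along the translation manifold

print(np.abs(linear_mid - manifold_mid).max())      # 1.0: the linear path leaves the manifold entirely
```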

Theory/Practice Disconnect: Isometry. Ex: translation manifold; all blue images are equidistant from the red image. Local isometry is satisfied only when the sampling is dense.

Theory/Practice Disconnect: Nuisance Articulations. Unsupervised data invariably has additional, undesired articulations –illumination –background clutter, occlusions, … The image ensemble is no longer low-dimensional.

Image Representations. Conventional representation for an image: –a vector of pixels –inadequate!

Image Representations. Replace the vector of pixels with an abstract bag of features. –Ex: SIFT (Scale-Invariant Feature Transform) selects keypoint locations in an image and computes a keypoint descriptor for each keypoint. –Very popular in many vision problems.

SIFT Features. Features (including SIFT) are ubiquitous in fusion and processing applications (15k+ citations for the two SIFT papers): building 3D models, part-based object recognition, organizing internet-scale databases, image stitching. Figures courtesy Rob Fergus (NYU), the Phototourism website, Antonio Torralba (MIT), and Wei Lu.

Image Representations. Replace the vector of pixels with an abstract bag of features. –Ex: SIFT (Scale-Invariant Feature Transform) selects keypoint locations in an image and computes a keypoint descriptor for each keypoint. –Keypoint descriptors are local; it is easy to make them robust to nuisance imaging parameters.
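For concreteness, a minimal sketch of extracting such a bag of keypoints and descriptors with OpenCV's SIFT implementation (assumes an OpenCV build that ships SIFT; the file name is a placeholder).

```python
import cv2

img = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)     # placeholder file name

sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)

# Each image becomes an unordered "bag": a 2-D location plus a 128-D descriptor per keypoint.
locations = [kp.pt for kp in keypoints]                  # [(x, y), ...]
print(len(keypoints), descriptors.shape)                 # e.g. M keypoints, (M, 128) descriptors
```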

Loss of Geometrical Info. Bag-of-features representations hide potentially useful image geometry. Goal: make salient image geometrical information more explicit for exploitation. (Figure: image space vs. keypoint space.)

Key Idea. In many situations the keypoint space can be endowed with a rich low-dimensional structure. Mechanism: define kernels between keypoint locations and between keypoint descriptors.

Keypoint Kernel. In many situations the keypoint space can be endowed with a rich low-dimensional structure. Mechanism: define kernels between keypoint locations and between keypoint descriptors. The joint keypoint kernel between two images is built by combining these two kernels, as sketched below.
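A hedged reconstruction of the joint kernel, consistent with the surrounding text (the symbols ℓ, f, and M below are assumptions, not taken from the slides): a summation kernel over all keypoint pairs, with one factor comparing locations and one comparing descriptors.

```latex
K(I_1, I_2) \;=\; \sum_{i=1}^{M_1} \sum_{j=1}^{M_2}
  k_{\mathrm{loc}}\bigl(\ell^{(1)}_i,\, \ell^{(2)}_j\bigr)\,
  k_{\mathrm{desc}}\bigl(f^{(1)}_i,\, f^{(2)}_j\bigr)
```

Here image I_n has M_n keypoints with locations ℓ^(n)_i and descriptors f^(n)_i.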

Many Possible Kernels. Euclidean kernel, Gaussian kernel, polynomial kernel, pyramid match kernel [Grauman et al. '07], and many others.
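Standard textbook forms of the first three, for feature vectors u and v (the slide's exact definitions are not reproduced in the transcript; σ, c, and d are free parameters):

```latex
k_{\mathrm{Euc}}(u, v) = \langle u, v \rangle, \qquad
k_{\mathrm{Gauss}}(u, v) = \exp\!\Bigl(-\tfrac{\|u - v\|_2^2}{2\sigma^2}\Bigr), \qquad
k_{\mathrm{poly}}(u, v) = \bigl(\langle u, v \rangle + c\bigr)^{d}
```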

Keypoint Kernel. The joint keypoint kernel between two images, using the Euclidean/Gaussian (E/G) combination, takes the form sketched below.
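Again as an assumption rather than the authors' exact formula, one natural reading of the E/G combination is a Gaussian kernel on Euclidean distances for both locations and descriptors (bandwidths σ_ℓ and σ_f assumed):

```latex
K_{E/G}(I_1, I_2) \;=\; \sum_{i,j}
  \exp\!\Bigl(-\tfrac{\|\ell^{(1)}_i - \ell^{(2)}_j\|_2^2}{2\sigma_\ell^2}\Bigr)\,
  \exp\!\Bigl(-\tfrac{\|f^{(1)}_i - f^{(2)}_j\|_2^2}{2\sigma_f^2}\Bigr)
```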

From Kernel to Metric. Lemma: the E/G keypoint kernel is a Mercer kernel –enables algorithms such as SVM. Lemma: the E/G keypoint kernel induces a metric on the space of images –an alternative to the conventional L2 distance between images –the keypoint metric is robust to nuisance imaging parameters, occlusion, clutter, etc.
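The induced metric follows the standard construction for Mercer kernels: the distance between the images' feature-space embeddings, computable from kernel evaluations alone.

```latex
d(I_1, I_2) \;=\; \sqrt{\, K(I_1, I_1) + K(I_2, I_2) - 2\,K(I_1, I_2) \,}
```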

Keypoint Geometry. Theorem: under the metric induced by the kernel, certain ensembles of articulating images form smooth, isometric manifolds. The keypoint representation is compact, efficient, and robust to illumination variations, non-stationary backgrounds, clutter, and occlusions.

Keypoint Geometry. Theorem: under the metric induced by the kernel, certain ensembles of articulating images form smooth, isometric manifolds. In contrast, the conventional approach to image fusion via image articulation manifolds (IAMs) is fraught with non-differentiability (due to sharp image edges) –not smooth –not isometric.

Application: Manifold Learning. 2D translation. (Learned embeddings: IAM vs. KAM.)

Manifold Learning in the Wild Rice University’s Duncan Hall Lobby –158 images –360° panorama using handheld camera –Varying brightness, clutter

Manifold Learning in the Wild: Duncan Hall Lobby. Ground truth obtained using state-of-the-art structure-from-motion software. (Embeddings: ground truth vs. IAM vs. KAM.)

Manifold Learning in the Wild: Viewing Angle. 179 images. (Embeddings: IAM vs. KAM.)

Manifold Learning in the Wild Rice University’s Brochstein Pavilion –400 outdoor images of a building –occlusions, movement in foreground, varying background

Internet-Scale Imagery: Notre Dame Cathedral. –738 images –collected from Flickr –large variations in illumination (night/day/saturation), clutter (people, decorations), and camera parameters (focal length, field of view, …) –non-uniform sampling of the space.

Organization k-nearest neighbors

Organization: "geodesics". Examples: 3D rotation, "walk closer", "zoom out".
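A minimal sketch of this organization step (an assumed workflow, not the authors' code): build a k-nearest-neighbor graph under the keypoint metric and read off "geodesics" as shortest paths in that graph. The distance file and helper below are hypothetical.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path

def knn_graph(D, k=8):
    """Symmetric k-NN edge-weight matrix from a pairwise distance matrix D."""
    n = D.shape[0]
    W = np.full((n, n), np.inf)
    for i in range(n):
        nbrs = np.argsort(D[i])[1:k + 1]        # skip self (assumes D[i, i] = 0)
        W[i, nbrs] = D[i, nbrs]
    return np.minimum(W, W.T)                   # symmetrize by keeping either direction

# D[i, j] = keypoint-metric distance between images i and j, computed elsewhere.
D = np.load("keypoint_distances.npy")           # placeholder file name
W = knn_graph(D, k=8)
W[np.isinf(W)] = 0                              # csgraph convention: 0 means "no edge"

# Graph geodesics: shortest paths approximate distances along the image manifold,
# and the predecessor matrix recovers the image sequence along each path.
geo, predecessors = shortest_path(W, directed=False, return_predecessors=True)
```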

Summary. Challenges for manifold learning in the wild are both theoretical and practical. Need for novel image representations: –sparse features → robustness to outliers, nuisance articulations, etc. → learning in the wild on unsupervised imagery. Promise lies in fast methods that exploit only neighborhood properties –no complex optimization required.