University of Pennsylvania 1 GRASP CIS 580 Machine Perception Fall 2004 Jianbo Shi Object recognition

University of Pennsylvania 2 GRASP Problem Definition
1. An Image is a set of 2D geometric features, along with positions.
2. An Object is a set of 2D/3D geometric features, along with positions.
3. A pose positions the object relative to the image:
- 2D translation;
- 2D translation + rotation;
- 2D translation, rotation and scale;
- planar or 3D object positioned in 3D, with perspective or scaled orthographic projection.
4. The best pose places the object features nearest the image features.

University of Pennsylvania 3 GRASP Strategy
1. Build feature descriptions.
2. Search possible poses.
- Can search the space of poses,
- or search feature matches, which produce a pose.
3. Transform the model by the pose.
4. Compare the transformed model and the image.
5. Pick the best pose.

University of Pennsylvania 4 GRASP Overview
1. Feature selection (discussed over 2 lectures).
2. Pose error measurement.
3. Search methods for 2D objects.
4. Search methods for 3D objects.

University of Pennsylvania 5 GRASP Feature example: edges

University of Pennsylvania 6 GRASP Evaluate Pose
We look at this first, since it defines the problem. Again, there is no perfect measure: there are trade-offs between the veracity of the measure and computational considerations.

University of Pennsylvania 7 GRASP Chamfer Matching For every edge point in the transformed object, compute the distance to the nearest image edge point. Sum distances.
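A minimal sketch of this score, assuming the transformed model edge points and the image edge points are given as NumPy arrays of (x, y) coordinates; this is the brute-force version, and the distance-transform speedup appears later in the deck:

```python
import numpy as np

def chamfer_distance(model_edges, image_edges):
    """Brute-force chamfer score: for each transformed model edge point,
    find the distance to the nearest image edge point, then sum them."""
    # model_edges: (M, 2) transformed model edge points; image_edges: (N, 2)
    diffs = model_edges[:, None, :] - image_edges[None, :, :]   # (M, N, 2)
    dists = np.sqrt((diffs ** 2).sum(axis=2))                   # (M, N) Euclidean distances
    return dists.min(axis=1).sum()                              # nearest image point per model point
```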

University of Pennsylvania 8 GRASP Main Feature: Every model point matches an image point. An image point can match 0, 1, or more model points.

University of Pennsylvania 9 GRASP Variations
Sum a different function of the distance:
- f(d) = d^2,
- or use the Manhattan distance,
- f(d) = 1 if d < threshold, 0 otherwise. This is called bounded error.
Use the maximum distance instead of the sum.
- This is called the directed Hausdorff distance.
Use other features:
- Corners.
- Lines. Then the positions and angles of the lines must be similar. A model line may be a subset of an image line.
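Sketches of two of the variations above, under the same point-array assumption as the chamfer sketch: the directed Hausdorff distance (maximum instead of sum) and the bounded-error count (the 2-pixel threshold is illustrative):

```python
import numpy as np

def directed_hausdorff(model_edges, image_edges):
    """Directed Hausdorff distance: the largest distance from any transformed
    model edge point to its nearest image edge point."""
    dists = np.sqrt(((model_edges[:, None, :] - image_edges[None, :, :]) ** 2).sum(axis=2))
    return dists.min(axis=1).max()

def bounded_error_score(model_edges, image_edges, threshold=2.0):
    """Bounded error: count model points whose nearest image point lies
    within `threshold` pixels (here, a higher score is better)."""
    dists = np.sqrt(((model_edges[:, None, :] - image_edges[None, :, :]) ** 2).sum(axis=2))
    return int((dists.min(axis=1) < threshold).sum())
```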

University of Pennsylvania 10 GRASP Other comparisons
Enforce that each image feature can match only one model feature. Enforce continuity and ordering along curves. These are more complex to optimize.

University of Pennsylvania 11 GRASP Overview
1. Feature selection (discussed over 2 lectures).
2. Pose error measurement.
3. Search methods for 2D objects.
4. Search methods for 3D objects.

University of Pennsylvania 12 GRASP Pose Search
The simplest approach is to try every pose. Two problems: there are many poses, and each one is costly to evaluate. We can reduce the second problem with the distance transform:

University of Pennsylvania 13 GRASP Pose: Chamfer Matching with the Distance Transform
Example: each pixel stores the (Manhattan) distance to the nearest edge pixel.

University of Pennsylvania 14 GRASP Computing Distance Transform
It's only done once per problem, not once per pose. It is basically a shortest-path problem. A simple solution passes through the image once for each distance value:
- First pass: mark edge pixels 0.
- Second pass: mark 1 anything next to a 0, unless it's already marked.
- Etc.
Actually, a more clever method requires only 2 passes, as sketched below.
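A minimal sketch of the two-pass method for the Manhattan distance transform, assuming the input is a binary edge map:

```python
import numpy as np

def manhattan_distance_transform(edges):
    """Two-pass Manhattan (city-block) distance transform.
    edges: 2D boolean array, True at edge pixels."""
    h, w = edges.shape
    big = h + w  # larger than any possible city-block distance in the image
    dt = np.where(edges, 0, big).astype(np.int32)
    # Forward pass: propagate distances from the top-left neighbors.
    for y in range(h):
        for x in range(w):
            if y > 0:
                dt[y, x] = min(dt[y, x], dt[y - 1, x] + 1)
            if x > 0:
                dt[y, x] = min(dt[y, x], dt[y, x - 1] + 1)
    # Backward pass: propagate distances from the bottom-right neighbors.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if y < h - 1:
                dt[y, x] = min(dt[y, x], dt[y + 1, x] + 1)
            if x < w - 1:
                dt[y, x] = min(dt[y, x], dt[y, x + 1] + 1)
    return dt
```

With the transform precomputed, scoring one pose is just a lookup and sum of dt at the transformed model edge pixels.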

University of Pennsylvania 15 GRASP Pose: RANSAC
Match enough features in the model to features in the image to determine the pose. Examples:
- Match a point and determine translation.
- Match a corner and determine translation and rotation.
- Points and translation, rotation, scaling?
- Lines and rotation and translation?

University of Pennsylvania 16 GRASP

University of Pennsylvania 17 GRASP Pose: Generalized Hough Transform
Like the Hough Transform, but for general shapes. Example: match one model point to one image point; then, for every rotation of the object, its translation is determined.
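A rough sketch of that idea for point features, assuming rotation is discretized and translations are accumulated in bins; the bin size and the voting structure are illustrative, not part of the original slides:

```python
import numpy as np

def generalized_hough_pose(model_pts, image_pts, rotations, bin_size=4.0):
    """For each candidate rotation, every (model point, image point) pairing
    votes for the translation that would align them; the (rotation, translation)
    cell with the most votes is returned."""
    best_votes, best_pose = 0, None
    for theta in rotations:
        c, s = np.cos(theta), np.sin(theta)
        R = np.array([[c, -s], [s, c]])
        votes = {}
        for m in model_pts:
            rm = R @ m
            for p in image_pts:
                t = p - rm                                    # translation implied by this match
                cell = tuple(np.round(t / bin_size).astype(int))
                votes[cell] = votes.get(cell, 0) + 1
        cell, n = max(votes.items(), key=lambda kv: kv[1])
        if n > best_votes:
            best_votes, best_pose = n, (theta, np.array(cell) * bin_size)
    return best_pose, best_votes
```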

University of Pennsylvania 18 GRASP Overview
1. Feature selection (discussed over 2 lectures).
2. Pose error measurement.
3. Search methods for 2D objects.
4. Search methods for 3D objects.

University of Pennsylvania 19 GRASP Computing Pose: Points
1. Solve W = S*R.
- In Structure-from-Motion, we knew W.
- In Recognition, we know R.
2. This is just a set of linear equations.
- OK, maybe with some non-linear constraints.

University of Pennsylvania 20 GRASP Linear Pose: 2D Translation
We know x, y, u, v and need to find the translation. For one point, u1 - x1 = tx and v1 - y1 = ty. For more points we can solve a least squares problem.
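A minimal sketch of the least-squares solution, assuming matched model points (x, y) and image points (u, v) are stacked in arrays; for pure translation the optimum is just the mean offset:

```python
import numpy as np

def translation_from_matches(model_xy, image_uv):
    """Least-squares 2D translation: minimizes sum ||(x, y) + (tx, ty) - (u, v)||^2,
    whose closed-form solution is the mean of the per-point offsets."""
    return (image_uv - model_xy).mean(axis=0)   # returns (tx, ty)
```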

University of Pennsylvania 21 GRASP Linear Pose: 2D Rotation, Translation and Scale
Write the transform as u = a*x - b*y + tx, v = b*x + a*y + ty, with a = s*cos(theta), b = s*sin(theta). Notice a and b can take on any values. The equations are linear in a, b and the translation. Solve exactly with 2 points, or as an overconstrained system with more.
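A sketch of the corresponding least-squares solve for (a, b, tx, ty) under the parameterization above; with exactly 2 point matches the system is square and solved exactly, with more it is overconstrained:

```python
import numpy as np

def similarity_from_matches(model_xy, image_uv):
    """Solve u = a*x - b*y + tx, v = b*x + a*y + ty for (a, b, tx, ty)
    by linear least squares over all matched points."""
    A, rhs = [], []
    for (x, y), (u, v) in zip(model_xy, image_uv):
        A.append([x, -y, 1, 0]); rhs.append(u)   # row for the u equation
        A.append([y,  x, 0, 1]); rhs.append(v)   # row for the v equation
    a, b, tx, ty = np.linalg.lstsq(np.asarray(A, float), np.asarray(rhs, float), rcond=None)[0]
    return a, b, tx, ty   # scale = hypot(a, b), angle = atan2(b, a)
```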

University of Pennsylvania 22 GRASP Linear Pose: 3D Affine

University of Pennsylvania 23 GRASP Pose: Scaled Orthographic Projection of Planar Points
s_{1,3} and s_{2,3} disappear. The non-linear constraints disappear with them.

University of Pennsylvania 24 GRASP Non-linear Pose
A bit trickier. Some results:
- 2D rotation and translation: need 2 points.
- 3D scaled orthographic: need 3 points, which give 2 solutions.
- 3D perspective with a known camera: need 3 points; solve a 4th-degree polynomial, which gives 4 solutions.

University of Pennsylvania 25 GRASP Transforming the Object
We don't really want to know the pose; we want to know what the object looks like in that pose. We start with a few matched points, solve for the pose, and then project the rest of the points.

University of Pennsylvania 26 GRASP Transforming Object with Linear Combinations
No 3D model, but we've seen the object twice before. We see four points in a third image and need to fill in the locations of the other points. Just use the rank theorem.

University of Pennsylvania 27 GRASP Recap: Recognition with RANSAC
1. Find features in the model and the image.
- Such as corners.
2. Match enough to determine a pose.
- Such as 3 points for a planar object under scaled orthographic projection.
3. Determine the pose.
4. Project the rest of the object features into the image.
5. Look to see how many image features they match.
- Example: with bounded error, count how many object features project near an image feature.
6. Repeat steps 2-5 a bunch of times.
7. Pick the pose that matches the most features.
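A compact sketch of steps 2-7 for a 2D rotation+scale+translation pose, reusing the similarity_from_matches and bounded-error ideas sketched earlier; the random-correspondence sampling, iteration count, and threshold are illustrative choices, not prescribed by the slides:

```python
import numpy as np

def ransac_recognition(model_pts, image_pts, n_iters=500, threshold=3.0):
    """Repeat: guess a minimal set of matches, fit a pose, project the model,
    and count projected points that land near some image point (bounded error)."""
    rng = np.random.default_rng(0)
    best_pose, best_count = None, -1
    for _ in range(n_iters):
        # Step 2: randomly guess a correspondence of 2 model points to 2 image points.
        mi = rng.choice(len(model_pts), 2, replace=False)
        ii = rng.choice(len(image_pts), 2, replace=False)
        # Step 3: determine the pose from the guessed matches.
        a, b, tx, ty = similarity_from_matches(model_pts[mi], image_pts[ii])
        # Step 4: project all model points under that pose.
        proj = np.stack([a * model_pts[:, 0] - b * model_pts[:, 1] + tx,
                         b * model_pts[:, 0] + a * model_pts[:, 1] + ty], axis=1)
        # Step 5: bounded-error count of supported image features.
        d = np.sqrt(((proj[:, None, :] - image_pts[None, :, :]) ** 2).sum(axis=2))
        count = int((d.min(axis=1) < threshold).sum())
        if count > best_count:                     # Step 7: keep the best pose
            best_pose, best_count = (a, b, tx, ty), count
    return best_pose, best_count
```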

University of Pennsylvania 28 GRASP Recognizing 3D Objects
The previous approach will work, but it is slow. RANSAC considers n^3 m^3 possible matches; about m^3 of them are correct. Solutions:
- Grouping. Find features coming from a single object.
- Viewpoint invariance. Match to a small set of model features that could produce them.

University of Pennsylvania 29 GRASP Grouping: Continuity

University of Pennsylvania 30 GRASP Connectivity
Connected lines are likely to come from the boundary of the same object.
- The boundary of an object often produces connected contours.
- Different objects more rarely do, and only when overlapping.
Connected image lines match connected model lines.
- Disconnected model lines generally don't appear connected.

University of Pennsylvania 31 GRASP Other Viewpoint Invariants
- Parallelism
- Convexity
- Common region
- ...

University of Pennsylvania 32 GRASP Planar Invariants (under an affine map A and translation t)

University of Pennsylvania 33 GRASP
Points p1, p2, p3, p4 with p4 = p1 + a(p2 - p1) + b(p3 - p1).
A(p4) + t = A(p1 + a(p2 - p1) + b(p3 - p1)) + t
          = A(p1) + t + a(A(p2) + t - A(p1) - t) + b(A(p3) + t - A(p1) - t)

University of Pennsylvania 34 GRASP
Points p1, p2, p3, p4 with p4 = p1 + a(p2 - p1) + b(p3 - p1).
A(p4) + t = A(p1 + a(p2 - p1) + b(p3 - p1)) + t
          = A(p1) + t + a(A(p2) + t - A(p1) - t) + b(A(p3) + t - A(p1) - t)
So p4 is a linear combination of p1, p2, p3, and the transformed p4 is the same linear combination of the transformed p1, p2, p3.
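A quick numerical check of this invariant, with an arbitrary (randomly generated) affine map A and translation t; the specific coefficients a and b are just example values:

```python
import numpy as np

# If p4 = p1 + a(p2 - p1) + b(p3 - p1), then the affinely transformed p4 is the
# same combination of the transformed p1, p2, p3.
rng = np.random.default_rng(0)
p1, p2, p3 = rng.random((3, 2))
a, b = 0.3, 0.5
p4 = p1 + a * (p2 - p1) + b * (p3 - p1)

A = rng.random((2, 2))   # arbitrary affine matrix
t = rng.random(2)        # arbitrary translation
q1, q2, q3, q4 = (A @ p + t for p in (p1, p2, p3, p4))

# The same coefficients a, b reproduce the transformed fourth point.
assert np.allclose(q4, q1 + a * (q2 - q1) + b * (q3 - q1))
```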

University of Pennsylvania 35 GRASP What we didn't talk about
- Smooth 3D objects.
- Can we find the guaranteed optimal solution?
- Indexing with invariants.
- Error propagation.