CS 496: Computer Vision Thanks to Chris Bregler

CS 496: Computer Vision
Personnel
  – Instructor: Szymon Rusinkiewicz smr@cs.princeton.edu
  – TA: Wagner Corrêa wtcorrea@cs.princeton.edu
  – Email to both: cs496@princeton.edu
Course web page: http://www.cs.princeton.edu/courses/cs496/

What is Computer Vision?
Input: images or video
Output: description of the world
  – Many levels of description

Low-Level or “Early” Vision Considers local properties of an image “There’s an edge!”

Mid-Level Vision Grouping and segmentation “There’s an object and a background!”

High-Level Vision Recognition “It’s a chair!”

Big Question #1: Who Cares?
Applications of computer vision
  – In AI: vision serves as the “input stage”
  – In medicine: understanding human vision
  – In engineering: model extraction

Vision and Other Fields
[Diagram: Computer Vision and related fields: Artificial Intelligence, Cognitive Psychology, Signal Processing, Computer Graphics, Pattern Analysis, Metrology]

Big Question #2: Does It Work?
Situation much the same as AI:
  – Some fundamental algorithms
  – Large collection of hacks / heuristics
Vision is hard!
  – Especially at high level, physiology unknown
  – Requires integrating many different methods
  – Requires reasoning and understanding: “AI completeness”

Computer and Human Vision
Emulating effects of human vision
Understanding physiology of human vision

Image Formation
Human: lens forms image on retina, sensors (rods and cones) respond to light
Computer: lens system forms image, sensors (CCD, CMOS) respond to light

Low-Level Vision Hubel

Retinal ganglion cells
Lateral Geniculate Nucleus – function unknown (visual adaptation?)
Primary Visual Cortex
  – Simple cells: orientational sensitivity
  – Complex cells: directional sensitivity
Further processing
  – Temporal cortex: what is the object?
  – Parietal cortex: where is the object? How do I get it?

Low-Level Vision
Net effect: low-level human vision can be (partially) modeled as a set of multiresolution, oriented filters

Low-Level Depth Cues
Focus
Vergence
Stereo
Not as important as popularly believed

Low-Level Computer Vision
Filters and filter banks
  – Implemented via convolution
  – Detection of edges, corners, and other local features
  – Can include multiple orientations
  – Can include multiple scales: “filter pyramids”
Applications
  – First stage of segmentation
  – Texture recognition / classification
  – Texture synthesis
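A minimal sketch of the two ideas on this slide, filtering by convolution and moving to a coarser scale, written in plain Python for a toy image. The Sobel kernel, the 2x2 box-average downsampling, and the names `convolve2d`, `downsample`, and `SOBEL_X` are illustrative assumptions, not something the slides specify.

```python
def convolve2d(image, kernel):
    """'Valid' 2D convolution of a list-of-lists image with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    # Flipping the kernel in both axes is what makes this a convolution
    # rather than a correlation.
    flipped = [row[::-1] for row in kernel[::-1]]
    out = []
    for y in range(ih - kh + 1):
        row = []
        for x in range(iw - kw + 1):
            acc = 0.0
            for j in range(kh):
                for i in range(kw):
                    acc += image[y + j][x + i] * flipped[j][i]
            row.append(acc)
        out.append(row)
    return out


# Horizontal-derivative (Sobel) kernel: responds to vertical edges.
# Chosen here only as a familiar example of an oriented filter.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]


def downsample(image):
    """One coarser pyramid level: average each 2x2 block (crude box filter)."""
    return [[(image[y][x] + image[y][x + 1] +
              image[y + 1][x] + image[y + 1][x + 1]) / 4.0
             for x in range(0, len(image[0]) - 1, 2)]
            for y in range(0, len(image) - 1, 2)]


# Toy 6x6 image: dark left half, bright right half (one vertical edge).
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
edges = convolve2d(image, SOBEL_X)   # nonzero only near the edge
coarse = downsample(image)           # 3x3 version of the same image
```

Applying the same kernel to `coarse` as well as `image` gives responses at two scales, which is the basic idea behind a filter pyramid.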

Texture Analysis / Synthesis
[Figure labels: Original Image; Multiresolution Oriented Filter Bank; Image Pyramid]

Texture Analysis / Synthesis
[Figure labels: Original Texture; Synthesized Texture. Credit: Heeger and Bergen]

Low-Level Computer Vision
Optical flow
  – Detecting frame-to-frame motion
  – Local operator: looking for gradients
Applications
  – First stage of tracking
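One common way to turn "a local operator looking for gradients" into code is a Lucas-Kanade style least-squares estimate. The slides do not name a specific method, so treat the following as an assumed example. It solves the brightness-constancy equation Ix·u + Iy·v + It = 0 over a small window; the function name `lucas_kanade` and the toy frames are hypothetical.

```python
def lucas_kanade(frame1, frame2, y0, x0, half=1):
    """Estimate the flow (u, v) at pixel (y0, x0) from a small window.

    Solves the 2x2 normal equations of Ix*u + Iy*v = -It in the
    least-squares sense. Returns None when the system is singular
    (no usable gradient information in the window).
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(y0 - half, y0 + half + 1):
        for x in range(x0 - half, x0 + half + 1):
            ix = (frame1[y][x + 1] - frame1[y][x - 1]) / 2.0  # spatial gradient in x
            iy = (frame1[y + 1][x] - frame1[y - 1][x]) / 2.0  # spatial gradient in y
            it = frame2[y][x] - frame1[y][x]                  # temporal difference
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-9:
        return None
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


# Toy frames: brightness x*y in frame 1, and the same pattern shifted
# one pixel to the right in frame 2.
SIZE = 7
frame1 = [[x * y for x in range(SIZE)] for y in range(SIZE)]
frame2 = [[(x - 1) * y for x in range(SIZE)] for y in range(SIZE)]
flow = lucas_kanade(frame1, frame2, 3, 3)

# A uniform image has no gradients, so no flow can be recovered.
flat = [[5] * SIZE for _ in range(SIZE)]
no_flow = lucas_kanade(flat, flat, 3, 3)
```

The `None` case is the reason purely local flow estimates are usually only a first stage: regions without gradients in two directions need information from elsewhere in the image.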

Optical Flow
[Figure panels: Image #1; Optical Flow Field; Image #2]

Low-Level Computer Vision
Shape from X
  – Stereo
  – Motion
  – Shading
  – Texture foreshortening

3D Reconstruction
[Figure credits: Tomasi and Kanade; Debevec, Taylor, and Malik; Pighin et al.; Forsyth et al.]

Mid-Level Vision
Physiology unclear
Observations by Gestalt psychologists (Wertheimer)
  – Proximity
  – Similarity
  – Common fate
  – Common region
  – Parallelism
  – Closure
  – Symmetry
  – Continuity
  – Familiar configuration

Grouping Cues

Mid-Level Computer Vision
Techniques
  – Clustering based on similarity
  – Limited work on other principles
Applications
  – Segmentation / grouping
  – Tracking
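As an illustration of "clustering based on similarity", here is a small k-means sketch that groups pixels by color. k-means is an assumed choice, since the slide does not name an algorithm, and the pixel values, initial centers, and function names are made up for the example.

```python
def dist2(p, q):
    """Squared Euclidean distance between two color tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centers, iterations=10):
    """Plain k-means: alternate nearest-center assignment and mean update."""
    centers = list(initial_centers)
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its most similar (nearest) center.
        labels = [min(range(len(centers)), key=lambda c: dist2(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return labels, centers


# Toy RGB pixels: three reddish, two bluish (hypothetical data).
pixels = [(250, 10, 10), (240, 20, 15), (10, 10, 245),
          (20, 15, 250), (245, 5, 20)]
labels, centers = kmeans(pixels, [pixels[0], pixels[2]])
```

Mapping `labels` back to pixel positions gives a segmentation by color alone; it ignores the spatial grouping cues (proximity, continuity, and so on) from the Gestalt list.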

Snakes: Active Contours Contour Evolution for Segmenting an Artery

Histograms
[Figure credit: Birchfield]

Expectation Maximization (EM) Color Segmentation

Bayesian Methods
Prior probability
  – Expected distribution of models
Conditional probability P(A|B)
  – Probability of observation A given model B
Bayes’s Rule: P(B|A) = P(A|B) · P(B) / P(A)
  – Probability of model B given observation A
Thomas Bayes (c. 1702-1761)
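A short worked instance of Bayes's Rule as stated on the slide. The two models ("object" and "background"), the observation (a dark pixel), and all probabilities are hypothetical numbers chosen only to show the arithmetic.

```python
def posterior(prior, likelihood):
    """Apply Bayes's Rule to every candidate model.

    prior:      {model B: P(B)}
    likelihood: {model B: P(A|B)} for one fixed observation A
    returns:    {model B: P(B|A)}
    """
    # P(A) is obtained by summing P(A|B) * P(B) over all models.
    evidence = sum(prior[m] * likelihood[m] for m in prior)
    return {m: prior[m] * likelihood[m] / evidence for m in prior}


# Hypothetical numbers: objects are rare, but they are usually dark.
prior = {"object": 0.2, "background": 0.8}
likelihood_dark = {"object": 0.9, "background": 0.1}   # P(dark pixel | model)
post = posterior(prior, likelihood_dark)
```

With these numbers P(A) = 0.9 · 0.2 + 0.1 · 0.8 = 0.26, so P(object | dark) = 0.18 / 0.26, roughly 0.69: the observation outweighs the low prior.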

Bayesian Methods
[Figure: plot with axis label “# black pixels”]

High-Level Vision
Human mechanisms: ???

High-Level Vision
Computational mechanisms
  – Bayesian networks
  – Templates
  – Linear subspace methods
  – Kinematic models

Template-Based Methods
[Figure credit: Cootes et al.]

Linear Subspaces

Principal Components Analysis (PCA)
[Figure labels: Data; PCA; New Basis Vectors. Credit: Kirby et al.]
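The "data to new basis vectors" step can be sketched as follows: center the data, form the covariance matrix, and find its dominant eigenvector. Power iteration is used here only to keep the example dependency-free, which is an assumption of this sketch; in practice an eigen-decomposition or SVD routine would be the usual choice. The data points are invented.

```python
import math


def first_principal_component(data, iterations=100):
    """Return (mean, unit vector along the direction of greatest variance)."""
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in data]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeated multiplication by cov converges to the
    # eigenvector with the largest eigenvalue.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def project(point, mean, basis_vector):
    """Coordinate of a point along one basis vector (a 1D subspace)."""
    return sum((p - m) * b for p, m, b in zip(point, mean, basis_vector))


# Toy 2D data scattered near the line y = 2x (hypothetical values).
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]
mean, pc1 = first_principal_component(data)
coeff = project(data[0], mean, pc1)
```

Keeping only the coefficients along the first few basis vectors is what makes this a linear subspace method: each sample is described by a handful of numbers instead of its full dimensionality.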

Kinematic Models
Optical Flow / Feature tracking: no constraints
Layered Motion: rigid constraints
Articulated: kinematic chain constraints
Nonrigid: implicit / learned constraints

Real-world Applications
[Figure credit: Osuna et al.]

Course Outline
Image formation and capture
Filtering and feature detection
Optical flow and tracking
Projective geometry
Shape from X
Segmentation and clustering
Recognition
Applications: 3D scanning; image-based rendering

3D Scanning

Image-Based Modeling and Rendering Debevec et al. Manex

Course Mechanics
60%: 4 written / programming assignments
30%: Final group project
10%: In-class participation (includes attendance, project presentation, etc.)

Course Mechanics
Book: Computer Vision – A Modern Approach, David Forsyth and Jean Ponce
Papers
All online – available from class webpage
