What should be done at the Low Level?

What should be done at the Low Level? 16-721: Learning-Based Methods in Vision A. Efros, CMU, Spring 2009

Class Introductions Name: Research area / project / advisor What you want to learn in this class? When I am not working, I ______________ Favorite fruit:

Analysis Projects / Presentations Wed: Varun note-taker: Dan Next Wed: Dan note-taker: Edward Dan and Edward need to meet with me ASAP Varun needs to meet second time

Four Stages of Visual Perception Ceramic cup on a table David Marr, 1982 © Stephen E. Palmer, 2002

Four Stages of Visual Perception: The Retinal Image. An image (blowup) and the receptor output.

Four Stages of Visual Perception: from Retinal Image to Image-based Representation. Image-based processes (edges, lines, blobs, etc.) turn an image into the Primal Sketch (Marr), i.e. a line drawing.

Four Stages of Visual Perception: from Image-based Representation to Surface-based Representation. Surface-based processes (stereo, shading, motion, etc.) turn the Primal Sketch into the 2.5-D Sketch.

Koenderink’s trick

Four Stages of Visual Perception: from Surface-based Representation to Object-based Representation. Object-based processes (grouping, parsing, completion, etc.) turn the 2.5-D Sketch into the Volumetric Sketch.

Geons (Biederman '87)

Four Stages of Visual Perception: from Object-based Representation to Category-based Representation. Category-based processes (pattern recognition, spatial description) turn the Volumetric Sketch into a Basic-level Category. Category: cup; Color: light-gray; Size: 6”; Location: table.

We likely throw away a lot

line drawings are universal

However, things are not so simple… Problems with feed-forward model of processing…

two-tone images

Figure labels: “attached shadow” contour; hair (not shadow!); “cast shadow” contour; inferred external contours.

Cavanagh's argument: Finding 3D structure in two-tone images requires distinguishing cast shadows, attached shadows, and areas of low reflectivity. The images do not contain this information a priori (at the low level).

Feedforward vs. feedback models (diagram). Marr's model (circa 1980): reconstruction of shape from image features (stimulus, primal sketch, 2½D sketch, 3D model), followed by object recognition by matching 3D models. Cavanagh's model (circa 1990s): basic recognition with 2D primitives (stimulus, 2D shape, memory), with feedback to 3D shape.

A Classical View of Vision. High-level: Object and Scene Recognition. Mid-level: Figure/Ground Organization; Grouping / Segmentation. Low-level: pixels, features, edges, etc. In fact, here is a classical view of visual perception, in which figure/ground organization plays a very important role. In this linear architecture, first we have the image on the retina, then pixels in the image are grouped together into regions and contours. After grouping, figure/ground organization takes place, which assigns the ownership of contours and forms the perception of shape. Based on that, we have the recognition of objects and scenes at the end.

A Contemporary View of Vision. High-level: Object and Scene Recognition. Mid-level: Figure/Ground Organization; Grouping / Segmentation. Low-level: pixels, features, edges, etc. But where do we draw this line?

Question #1: What (if anything) should be done at the “Low-Level”? N.B. I have already told you everything that is known. From now on, there aren’t any answers. Only questions…

Who cares? Why not just use pixels? Pixel differences vs. Perceptual differences

Eye is not a photometer! "Every light is a shade, compared to the higher lights, till you come to the sun; and every shade is a light, compared to the deeper shades, till you come to the night." — John Ruskin, 1879

Cornsweet Illusion

Campbell-Robson contrast sensitivity curve (sine wave)

Metamers

Question #1: What (if anything) should be done at the “Low-Level”? i.e. What input stimulus should we be invariant to?

Invariant to brightness / color changes? Candidates: low-frequency changes; small brightness / color changes. But one can be too invariant.
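One common way to make brightness invariance concrete is to normalize each patch to zero mean and unit variance, which cancels any gain-and-offset change in intensity. The following is a minimal sketch of that idea, not something taken from the slides; the function name `normalize_patch`, the `eps` guard, and the toy values are all assumptions for illustration.

```python
import math

def normalize_patch(patch, eps=1e-8):
    """Zero-mean, unit-variance normalization of a flat list of intensities.

    eps is an illustrative guard against division by zero on flat patches.
    """
    n = len(patch)
    mean = sum(patch) / n
    var = sum((p - mean) ** 2 for p in patch) / n
    std = math.sqrt(var)
    return [(p - mean) / (std + eps) for p in patch]

# A toy patch and the same patch under a gain-and-offset brightness change.
patch = [10.0, 20.0, 30.0, 40.0]
brighter = [2.0 * p + 15.0 for p in patch]

a = normalize_patch(patch)
b = normalize_patch(brighter)

# Largest difference between the two normalized patches (tiny, due to eps).
max_diff = max(abs(x - y) for x, y in zip(a, b))
```

The same property is what makes it possible to be too invariant: a faint texture and a strong one of the same shape normalize to the same vector, and on a nearly flat patch the division by a small standard deviation amplifies noise.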

Invariant to edge contrast / reversal? I shouldn’t care what background I am on! But be careful of exaggerating noise.

Representation choices: raw pixels; gradients; gradient magnitude; thresholded gradients (edge + sign); thresholded gradient magnitude (edges).
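To see how these representations differ, it may help to compute them on a tiny image and on its contrast-reversed copy. The sketch below uses plain Python lists and forward differences; the helper names (`gradients`, `magnitude`, `threshold_signed`, `threshold_mag`) and the threshold of 0.5 are assumptions chosen for illustration, not the choices made in the lecture.

```python
import math

def gradients(img):
    """Forward-difference gradients (gx, gy); cells without a forward neighbor get 0."""
    h, w = len(img), len(img[0])
    gx = [[img[y][x + 1] - img[y][x] if x + 1 < w else 0.0 for x in range(w)]
          for y in range(h)]
    gy = [[img[y + 1][x] - img[y][x] if y + 1 < h else 0.0 for x in range(w)]
          for y in range(h)]
    return gx, gy

def magnitude(gx, gy):
    """Per-pixel gradient magnitude."""
    return [[math.hypot(a, b) for a, b in zip(rx, ry)] for rx, ry in zip(gx, gy)]

def threshold_signed(g, t):
    """Edge + sign: +1 or -1 where the gradient exceeds t in absolute value, else 0."""
    return [[1 if v > t else -1 if v < -t else 0 for v in row] for row in g]

def threshold_mag(mag, t):
    """Edges only: 1 where the gradient magnitude exceeds t, else 0."""
    return [[1 if v > t else 0 for v in row] for row in mag]

# Dark-to-bright vertical step edge, and the same image with contrast reversed.
img = [[0.0, 0.0, 1.0, 1.0] for _ in range(3)]
inv = [[1.0 - v for v in row] for row in img]

gx, gy = gradients(img)
gx_inv, gy_inv = gradients(inv)

signed = threshold_signed(gx, 0.5)          # keeps the edge polarity
signed_inv = threshold_signed(gx_inv, 0.5)  # polarity flips under reversal
edges = threshold_mag(magnitude(gx, gy), 0.5)
edges_inv = threshold_mag(magnitude(gx_inv, gy_inv), 0.5)  # same as edges
```

Raw pixels and signed gradients change under contrast reversal, while the thresholded magnitude does not, which is the sense in which it ignores what background the object is on.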

Typical filter bank

Pyramid (e.g. wavelet, steerable, etc.). Figure labels: filters, input image.
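The slide names wavelet and steerable pyramids; as a much simpler stand-in, the sketch below builds a Gaussian-style pyramid on a 1D signal by repeatedly blurring and keeping every second sample. The binomial kernel [1, 2, 1] / 4, the border handling, and the function names are assumptions for illustration only.

```python
def blur(signal):
    """Smooth with the binomial kernel [1, 2, 1] / 4, replicating border samples."""
    n = len(signal)
    out = []
    for i in range(n):
        left = signal[max(i - 1, 0)]
        right = signal[min(i + 1, n - 1)]
        out.append((left + 2.0 * signal[i] + right) / 4.0)
    return out

def gaussian_pyramid(signal, levels):
    """Return a list of progressively coarser copies of the signal."""
    pyramid = [list(signal)]
    for _ in range(levels - 1):
        smoothed = blur(pyramid[-1])
        pyramid.append(smoothed[::2])  # downsample: keep every second sample
    return pyramid

# A 1D step edge analysed at three scales.
signal = [0.0, 0.0, 0.0, 0.0, 8.0, 8.0, 8.0, 8.0]
pyr = gaussian_pyramid(signal, 3)
lengths = [len(level) for level in pyr]
```

Running the same small set of filters on every level is what lets a fixed-size filter respond to structure at several scales.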

What does it capture? v = F * Patch (where F is the filter matrix)
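Reading v = F * Patch as a matrix-vector product, each row of F is one filter flattened to a vector and each entry of v is that filter's response to the patch. The sketch below assumes this reading; the three 2x2 filters and the name `filter_responses` are made up for illustration and are not the filter bank shown in the slides.

```python
def filter_responses(F, patch):
    """Return v = F * patch: one dot product per filter (row of F)."""
    return [sum(f * p for f, p in zip(row, patch)) for row in F]

# A 2x2 patch flattened row-major:
# [top-left, top-right, bottom-left, bottom-right]
F = [
    [0.25, 0.25, 0.25, 0.25],  # mean brightness
    [-0.5, 0.5, -0.5, 0.5],    # left-to-right change (vertical edge)
    [-0.5, -0.5, 0.5, 0.5],    # top-to-bottom change (horizontal edge)
]

# Dark left column, bright right column: a vertical edge.
patch = [0.0, 1.0, 0.0, 1.0]
v = filter_responses(F, patch)
```

Here v comes out as [0.5, 1.0, 0.0]: average brightness 0.5, a full response from the vertical-edge filter, and none from the horizontal-edge filter. In this view the filter bank is a change of basis that describes the patch by its structure instead of by individual pixel values.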

Why these filters?

Learned filters

Spatial invariance: rotation, translation, scale. Yes, but not too much… In the brain: complex cells (partial invariance). In computer vision: histogram-binning methods (SIFT, GIST, Shape Context, etc.) or, equivalently, blurring (e.g. Geometric Blur); will discuss later.
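The histogram-binning idea can be sketched in a few lines: pool gradient orientations over a patch, weighted by magnitude, and discard where in the patch each gradient occurred. This is a toy illustration of the principle only, not SIFT, GIST, or Shape Context; the function names, the four bins, and the test images are assumptions.

```python
import math

def orientation_histogram(img, bins=4):
    """Magnitude-weighted histogram of gradient orientations over the whole patch."""
    h, w = len(img), len(img[0])
    hist = [0.0] * bins
    for y in range(h - 1):
        for x in range(w - 1):
            gx = img[y][x + 1] - img[y][x]
            gy = img[y + 1][x] - img[y][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue
            angle = math.atan2(gy, gx) % (2 * math.pi)  # in [0, 2*pi)
            hist[int(angle / (2 * math.pi) * bins) % bins] += mag
    return hist

def step_edge(width, height, edge_x):
    """Image that is 0 left of column edge_x and 1 from edge_x onward."""
    return [[1.0 if x >= edge_x else 0.0 for x in range(width)]
            for _ in range(height)]

img_a = step_edge(6, 4, 2)
img_b = step_edge(6, 4, 3)  # same edge, shifted one pixel to the right

a = orientation_histogram(img_a)
b = orientation_histogram(img_b)

# Raw pixels disagree, even though the histograms are identical.
pixel_diff = sum(abs(p - q) for ra, rb in zip(img_a, img_b) for p, q in zip(ra, rb))
```

A single histogram over the whole patch forgets position entirely, which is the "not too much" caveat: descriptors such as SIFT keep a coarse grid of such histograms so that rough spatial layout survives.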

Many lives of a boundary

Often context-dependent… (Figure panels: input, Canny, human.) Maybe low-level is never enough?