Michael Arbib & Laurent Itti: CS664 – USC, Spring 2002. Lecture 6: Object Recognition

Presentation transcript:

1 CS664, USC, Spring 2002. Lecture 6. Object Recognition. Reading Assignments: None

3 Four stages of representation (Marr, 1982)
1) pixel-based (light intensity)
2) primal sketch (discontinuities in intensity)
3) 2½-D sketch (oriented surfaces, relative depth between surfaces)
4) 3D model (shapes, spatial relationships, volumes)
Problem: computationally intractable!

5 Challenges of Object Recognition
The binding problem: binding different features (color, orientation, etc.) to yield a unitary percept (see next slide).
Bottom-up vs. top-down processing: how much is assumed top-down vs. extracted from the image?
Perception vs. recognition vs. categorization: seeing an object vs. seeing it as something.
Matching views of known objects to memory vs. matching a novel object to object categories in memory.
Viewpoint invariance: a major issue is to recognize objects irrespective of the viewpoint from which we see them.

6 [Figure for the binding problem; image not preserved in the transcript]

7 Viewpoint Invariance. Major problem for recognition. Biederman & Gerhardstein, 1994: We can recognize two views of an unfamiliar object as being the same object. Thus, viewpoint invariance cannot rely only on matching views to memory.

8 Models of Object Recognition. See Hummel, 1995, The Handbook of Brain Theory & Neural Networks. Direct Template Matching: Processing hierarchy yields activation of view-tuned units. A collection of view-tuned units is associated with one object. View-tuned units are built from V4-like units, using sets of weights which differ for each object. E.g., Poggio & Edelman, 1990; Riesenhuber & Poggio, 1999.
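
The idea of view-tuned units can be illustrated with a minimal sketch. It assumes each unit has a Gaussian tuning curve centered on one stored view and that an object's score is the strongest response among its units; the feature vectors, the names `view_tuned_responses` and `recognize`, and the value of `sigma` are hypothetical and are not taken from the cited models.

```python
import numpy as np

def view_tuned_responses(features, stored_views, sigma=1.0):
    """Gaussian response of each view-tuned unit to one feature vector."""
    # Squared distance between the input and each stored view (one per row).
    d2 = np.sum((stored_views - features) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def recognize(features, objects, sigma=1.0):
    """Return the object whose view-tuned units respond most strongly."""
    scores = {name: view_tuned_responses(features, views, sigma).max()
              for name, views in objects.items()}
    return max(scores, key=scores.get), scores

# Hypothetical 3-dimensional "V4-like" feature vectors: two stored views
# per object. The stored views play the role of per-object weights.
objects = {
    "cup":   np.array([[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]]),
    "chair": np.array([[0.0, 1.0, 0.0], [0.0, 0.8, 0.3]]),
}
label, scores = recognize(np.array([0.9, 0.1, 0.0]), objects)
print(label)
```

A view that falls between two stored views of the same object still activates both units to some degree, which is how a collection of view-tuned units can cover a range of viewpoints.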

9 Computational Model of Object Recognition (Riesenhuber and Poggio, 1999)

10 The model neurons are tuned for size and 3D orientation of the object.

11 Models of Object Recognition. Hierarchical Template Matching: Image passed through layers of units with progressively more complex features at progressively less specific locations. Hierarchical in that features at one stage are built from features at earlier stages. E.g., Fukushima & Miyake (1982)’s Neocognitron: Several processing layers, comprising simple (S) and complex (C) cells. S-cells in one layer respond to conjunctions of C-cells in previous layer. C-cells in one layer are excited by small neighborhoods of S-cells.
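
The S-cell / C-cell alternation can be sketched in a few lines. This is a simplified illustration, not Fukushima & Miyake's formulation: the S-layer is assumed to be a thresholded template match, the C-layer is assumed to take the maximum over a small neighborhood, and the template, threshold, and pool size are hypothetical.

```python
import numpy as np

def s_layer(image, template, threshold):
    """S-cells: respond where a local patch matches the template."""
    th, tw = template.shape
    H, W = image.shape
    out = np.zeros((H - th + 1, W - tw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + th, j:j + tw]
            # Respond only when the conjunction of inputs exceeds threshold.
            out[i, j] = max(np.sum(patch * template) - threshold, 0.0)
    return out

def c_layer(s_map, pool=2):
    """C-cells: excited by any active S-cell in a small neighborhood."""
    H, W = s_map.shape
    out = np.zeros((H // pool, W // pool))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = s_map[i * pool:(i + 1) * pool,
                              j * pool:(j + 1) * pool].max()
    return out

# A vertical bar at column 1, and the same bar shifted to column 0.
img_a = np.zeros((5, 5)); img_a[0:4, 1] = 1.0
img_b = np.zeros((5, 5)); img_b[0:4, 0] = 1.0
template = np.array([[1.0], [1.0]])  # two vertically adjacent pixels

s_a, s_b = s_layer(img_a, template, 1.0), s_layer(img_b, template, 1.0)
c_a, c_b = c_layer(s_a), c_layer(s_b)
```

In this example the S-layer maps differ for the two bar positions, while the C-layer maps are identical: the feature is kept but its location becomes less specific.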

12 Models of Object Recognition. Transform & Match: First take care of rotation, translation, scale, etc. invariances. Then recognize based on standardized pixel representation of objects. E.g., Olshausen et al., 1993, dynamic routing model. Template match: e.g., with an associative memory based on a Hopfield network.
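
The template-match step with a Hopfield-style associative memory might look like the sketch below. It assumes binary (+1/-1) patterns that have already been normalized, Hebbian outer-product weights, and asynchronous updates; the two 8-pixel patterns are hypothetical.

```python
import numpy as np

def train_hopfield(patterns):
    """Hebbian weights for +1/-1 patterns, with no self-connections."""
    n = patterns.shape[1]
    W = patterns.T @ patterns / n
    np.fill_diagonal(W, 0.0)
    return W

def recall(W, state, max_sweeps=10):
    """Update units one at a time until the state stops changing."""
    state = state.copy()
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            new = 1 if W[i] @ state >= 0 else -1
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:
            break
    return state

# Two hypothetical stored templates (orthogonal to each other).
patterns = np.array([
    [1,  1, 1,  1, -1, -1, -1, -1],
    [1, -1, 1, -1,  1, -1,  1, -1],
])
W = train_hopfield(patterns)

probe = patterns[0].copy()
probe[0] = -1                 # corrupt one pixel of the first template
restored = recall(W, probe)
```

Starting from the corrupted probe, the network settles back onto the first stored template, which is the sense in which the memory "matches" an input to a known object.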

13 Recognition by Components. Structural approach to object recognition: Biederman, 1987: Complex objects are composed of simpler pieces. We can recognize a novel/unfamiliar object by parsing it in terms of its component pieces, then comparing the assemblage of pieces to those of known objects.

14 Recognition by components (Biederman, 1987). GEONS: geometric elements of which all objects are composed (cylinders, cones, etc.). On the order of 30 different shapes. Skips the 2½-D sketch: geons are directly recognized from edges, based on their nonaccidental properties (i.e., 3D features that are usually preserved by the projective imaging process).

15 Basic Properties of GEONs
They are sufficiently different from each other to be easily discriminated.
They are view-invariant (look identical from most viewpoints).
They are robust to noise (can be identified even with parts of the image missing).

16 Support for RBC: We can recognize partially occluded objects easily if the occlusions do not obscure the set of geons which constitute the object.

17 Potential difficulties (Edelman, 1997)
A. Structural description not enough, also need metric info
B. Difficult to extract geons from real images
C. Ambiguity in the structural description: most often we have several candidates
D. For some objects, deriving a structural representation can be difficult

18 Geon Neurons in IT? These are preferred stimuli for some IT neurons.

20 Fusiform Face Area in Humans

21 Standard View on Visual Processing (Tjan, 1999)
Diagram axes: representation; visual processing.
Image specific: supports fine discrimination; noise tolerant.
Image invariant: supports generalization; noise sensitive.

22 Multiple memory/decision sites (Tjan, 1999)
Diagram labels: early visual processing; primary visual processing; Face; Place; Common objects; ? (e.g., Kanwisher et al.; Ishai et al.)

23 Tjan’s “Recognition by Anarchy”
Diagram labels: sensory memory; primary visual processing; memory sites making independent decisions “R1”, …, “Ri”, …, “Rn”, each with its own delay (1, …, i, …, n). The homunculus’ response is the first arriving response.
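
The decision rule on this slide (the overall response is whichever site's response arrives first) can be written as a short sketch. The site names, responses, and delay values below are hypothetical placeholders, and the function name `first_arriving_response` is introduced only for illustration.

```python
def first_arriving_response(site_outputs):
    """Pick the response with the smallest delay.

    site_outputs: list of (site_name, response, delay) tuples, one per
    independent memory/decision site.
    """
    return min(site_outputs, key=lambda s: s[2])

# Hypothetical outputs: each site decides independently, with its own delay.
sites = [
    ("Site 1", "e", 0.42),
    ("Site 2", "e", 0.31),
    ("Site 3", "c", 0.55),
]
winner = first_arriving_response(sites)
print(winner)
```

Note that no site's answer is weighed against the others: a slower site is ignored even when it disagrees with the winner.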

24 A toy visual system. Task: Identify letters from arbitrary positions & orientations (example letter: “e”).

25 Diagram labels: image; down-sampling; normalize position; normalize orientation; memory.
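
Two of the stages named in the diagram, position normalization and down-sampling, can be sketched as follows. The slides do not say how either stage is computed, so this sketch assumes centroid-based centering and block averaging; orientation normalization is left out, and the function names are hypothetical.

```python
import numpy as np

def normalize_position(image):
    """Shift the image so its intensity centroid sits at the array center.

    Assumes a non-empty image; np.roll wraps around at the borders.
    """
    rows, cols = np.indices(image.shape)
    total = image.sum()
    cy = (rows * image).sum() / total
    cx = (cols * image).sum() / total
    dy = int(round((image.shape[0] - 1) / 2 - cy))
    dx = int(round((image.shape[1] - 1) / 2 - cx))
    return np.roll(image, (dy, dx), axis=(0, 1))

def downsample(image, factor=2):
    """Average non-overlapping factor x factor blocks (edges are cropped)."""
    H2, W2 = image.shape[0] // factor, image.shape[1] // factor
    blocks = image[:H2 * factor, :W2 * factor].reshape(H2, factor, W2, factor)
    return blocks.mean(axis=(1, 3))

# The same 3x3 blob placed at two different positions in a 9x9 image.
img_a = np.zeros((9, 9)); img_a[0:3, 0:3] = 1.0
img_b = np.zeros((9, 9)); img_b[5:8, 4:7] = 1.0

norm_a, norm_b = normalize_position(img_a), normalize_position(img_b)
small = downsample(norm_a)
```

After normalization the two inputs are identical, so a memory that stores only the centered version can match either one.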

26 Diagram labels: image; down-sampling; normalize position; normalize orientation; memory; Site 1, Site 2, Site 3.

27 Study stimuli: 5 orientations × 20 positions at high SNR.
Test stimuli: 1) familiar (studied) views, 2) new positions, 3) new positions & orientations.
Signal-to-Noise Ratio {RMS Contrast}: 1800 {30%}, 1500 {25%}, 800 {20%}, 450 {15%}, 210 {10%}.

28 Processing speed for each recognition module depends on recognition difficulty by that module. Figure labels: raw image, norm. pos., norm. ori.; Site 3, Site 2, Site 1.

29 Figure: Proportion Correct vs. Contrast (%). Panels: Familiar views; Novel positions & orientations. Curve labels: raw image, norm. pos., norm. ori.; Site 3, Site 2, Site 1.

30 Figure: Proportion Correct vs. Contrast (%). Panels: Familiar views; Novel positions & orientations. Curve labels: raw image, norm. pos., norm. ori.; Site 3, Site 2, Site 1. Black curve: full model in which recognition is based on the fastest of the responses from the three stages.
