My Group’s Current Research on Image Understanding.

Presentation transcript:

My Group’s Current Research on Image Understanding

An image-understanding task

From low-level vision to high-level perception:
–Color, Shape, Texture
–Simple Segmentation
–Object recognition
–Pattern recognition
–Analogy-making
–“Meaning”

Between low-level vision and high-level perception: ??? (The “SEMANTIC GAP”)

HMAX model of visual cortex (Riesenhuber, Poggio, et al.)
Active Symbol Architecture for high-level perception (Hofstadter et al.)

The HMAX model for object recognition (Riesenhuber, Poggio, Serre, et al.)

Streetscenes “scene understanding” system (Bileschi, 2006): Recognition Phase
1. Densely tile the image with windows of different sizes.
2. HMAX features are computed in each window.
3. The features in each window are given as input to the trained support vector machine.
4. If the SVM returns a score above a learned threshold, then the object is said to be “detected”.
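The four recognition steps can be sketched as a sliding-window loop. This is a minimal illustration, not the Streetscenes code: `mean_intensity` and `linear_score` are hypothetical stand-ins for HMAX feature extraction and the trained SVM, and the 50% stride and the threshold value are assumptions.

```python
from typing import Callable, List, Sequence, Tuple

Window = Tuple[int, int, int]  # (top row, left column, side length)

def tile_windows(height: int, width: int, sizes: Sequence[int],
                 stride_frac: float = 0.5) -> List[Window]:
    """Step 1: densely tile the image with square windows of several sizes."""
    windows = []
    for size in sizes:
        stride = max(1, int(size * stride_frac))  # assumed overlap between windows
        for r in range(0, height - size + 1, stride):
            for c in range(0, width - size + 1, stride):
                windows.append((r, c, size))
    return windows

def detect(image: List[List[float]], sizes: Sequence[int],
           features: Callable, score: Callable, threshold: float):
    """Steps 2-4: compute features per window, score them, keep scores above threshold."""
    hits = []
    for r, c, size in tile_windows(len(image), len(image[0]), sizes):
        patch = [row[c:c + size] for row in image[r:r + size]]
        s = score(features(patch))
        if s > threshold:
            hits.append(((r, c, size), s))
    return hits

def mean_intensity(patch):
    """Hypothetical feature extractor (stand-in for HMAX features)."""
    n = sum(len(row) for row in patch)
    return [sum(sum(row) for row in patch) / n]

def linear_score(feats):
    """Hypothetical linear classifier (stand-in for a trained SVM)."""
    return 2.0 * feats[0] - 1.0

image = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
hits = detect(image, sizes=[2], features=mean_intensity,
              score=linear_score, threshold=0.5)
```

Every window at every size is scored, which is the exhaustive search the following slides criticize.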

Object detection (here, “car”) with HMAX model (Bileschi, 2006)

Some limitations of the Streetscenes approach to scene understanding
–Requires exhaustive search for object identification and localization
–Exhaustive search over:
  –Window size and location in the image
  –Object categories (e.g., car, pedestrian, tree, etc.)
–Exhaustive use of HMAX features in each window
–Does not recognize spatial and abstract relationships among objects for whole scene understanding
–Has no prior knowledge about object categories and their place in “conceptual space”
–HMAX model is completely feed-forward; no feedback to allow context to aid in scene understanding.

Goal of our project
Perform whole-scene interpretation without exhaustive search.
–Incorporate conceptual knowledge
–Allow feedforward and feedback modes to interact

A Simple Semantic Network (or “Ontology”): “Dog walking”
Nodes: Person, Dog, leash
Link labels: holds, attached to, action
Action: walking

But...

But...

“Dog walking”
Nodes: Person, Dog, Dog Group, leash
Link labels: holds, attached to, action
Actions: walking, running

Allowing “conceptual slippage”: “Dog walking”
Nodes: Person, Dog, Dog Group, leash
Link labels: holds, attached to, action
Actions: walking, running

But...

“Dog walking”
Nodes: Person, Dog, Dog Group, Cat, Iguana, leash, Tail
Link labels: holds, attached to, action
Actions: walking, running

But...

ttp://thedaemon.com/images/DARPA_Segue_Dog.jpg

alking_dog_from_car.jpg

sports.com/fun_pictures/dog_walking_helicopter.jpg

“Dog walking”
Nodes: Person, Dog, Dog Group, Cat, Iguana, Horse, leash, Tail, Car, Helicopter
Link labels: holds, attached to, action
Actions: walking, running, Biking, Driving, Segue-ing, Treadmill-ing

Active Symbol Architecture (Hofstadter et al., 1995)

Basis for
–Copycat (analogy-making), Hofstadter & Mitchell
–Tabletop (analogy-making), Hofstadter & French
–Metacat (analogy-making and self-awareness), Hofstadter & Marshall
and many others…

Active Symbol Architecture (Hofstadter et al., 1995)
–Semantic network
–Workspace
–Temperature
Perceptual agents (codelets) are “active symbols”

Petacat (descendant of Copycat, part of the PetaVision project)
Integration of Active Symbol Architecture and HMAX
Initial task: Decide if image is an instance of “taking a dog for a walk”, and if so, how good an instance it is.

Workspace

Semantic network Workspace

Semantic Network: “taking a dog for a walk”
Link types: Property links, Slip links
Node categories: Object, Action, Spatial Relation
Link labels: has location, has action, has component
Objects and places: person, dog, cat, horse, leash, rope, belt, string, road, beach, trail, sidewalk, grass, outdoors, indoors
Actions: walks, runs, drives, flies, swims, stands, sits
Spatial relations: is on, is touching, is in front of, is behind, is next to
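A semantic network with the two link types named on the slide (property links and slip links) could be represented roughly as follows. This is only a sketch: the class and method names are hypothetical, and the particular slip links added below (dog/cat, leash/rope, and so on) are illustrative guesses, since the transcript does not say which nodes are actually connected.

```python
from collections import defaultdict

class SemanticNetwork:
    """Labeled directed property links plus symmetric slip links (hypothetical API)."""

    def __init__(self):
        self.property_links = defaultdict(list)  # source -> [(label, target), ...]
        self.slip_links = defaultdict(set)       # node -> nodes it may slip to

    def add_property(self, source, label, target):
        self.property_links[source].append((label, target))

    def add_slip(self, a, b):
        # Slip links are treated as symmetric in this sketch.
        self.slip_links[a].add(b)
        self.slip_links[b].add(a)

    def can_slip(self, a, b):
        return a == b or b in self.slip_links.get(a, set())

    def targets(self, source, label):
        return [t for (l, t) in self.property_links.get(source, []) if l == label]

net = SemanticNetwork()
situation = "taking a dog for a walk"
for part in ("person", "dog", "leash"):
    net.add_property(situation, "has component", part)
net.add_property(situation, "has location", "outdoors")
net.add_property("person", "has action", "walks")

# Illustrative slip links only; the real network's links are not given in the slides.
for a, b in [("dog", "cat"), ("dog", "horse"), ("leash", "rope"),
             ("leash", "string"), ("walks", "runs")]:
    net.add_slip(a, b)
```

Keeping slip links separate from property links lets a matcher ask `can_slip("leash", "rope")` without changing what the prototype situation is defined to contain.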

Temperature
Measures how well organized the program’s “understanding” is as processing proceeds
–Little organization → high temperature
–Lots of organization → low temperature
Temperature feeds back to affect perceptual agents:
–High temperature → low confidence in decisions → decisions are made more randomly
–Low temperature → high confidence in decisions → decisions are made more deterministically
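One common way to get this behavior is a softmax (Boltzmann) choice rule, where temperature divides the scores before exponentiation. The sketch below uses that standard rule as a stand-in; the slides do not say which formula Petacat uses, and the function names and temperature scale here are assumptions.

```python
import math
import random

def choice_probabilities(scores, temperature):
    """Softmax over scores: near-uniform at high temperature, near-greedy at low."""
    t = max(temperature, 1e-6)          # guard against division by zero
    top = max(scores)                   # subtract the max for numerical stability
    weights = [math.exp((s - top) / t) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def choose(options, scores, temperature, rng=random):
    """Pick one option at random according to the temperature-adjusted probabilities."""
    probs = choice_probabilities(scores, temperature)
    return rng.choices(options, weights=probs, k=1)[0]

hot = choice_probabilities([0.8, 0.4], temperature=100.0)   # almost a coin flip
cold = choice_probabilities([0.8, 0.4], temperature=0.01)   # almost always the best option
```

With the same two scores, the high-temperature call spreads probability almost evenly, while the low-temperature call puts nearly all of it on the higher score.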

Input image
Weak segmentation
Location “heat map” (probability distribution over pixel locations)
Scale “heat map” (probability distribution over scales at each pixel location)
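Because both heat maps are probability distributions, a window to examine can be drawn by sampling a location first and then a scale at that location, instead of scanning every window. A minimal sketch, assuming the maps are stored as nested lists; the function names and the candidate scale values are hypothetical.

```python
import random

def sample_location(location_map, rng):
    """Draw a (row, col) cell with probability proportional to its heat."""
    cells = [(r, c) for r, row in enumerate(location_map) for c in range(len(row))]
    weights = [location_map[r][c] for r, c in cells]
    return rng.choices(cells, weights=weights, k=1)[0]

def sample_scale(scale_map, location, scales, rng):
    """Draw a scale using the per-location distribution over candidate scales."""
    r, c = location
    return rng.choices(scales, weights=scale_map[r][c], k=1)[0]

# Toy 2x3 maps: all location mass on cell (1, 2); that cell prefers the middle scale.
location_map = [[0.0, 0.0, 0.0],
                [0.0, 0.0, 1.0]]
scales = [16, 32, 64]  # assumed window sizes in pixels
uniform = [1 / 3, 1 / 3, 1 / 3]
scale_map = [[uniform, uniform, uniform],
             [uniform, uniform, [0.0, 1.0, 0.0]]]

rng = random.Random(0)
loc = sample_location(location_map, rng)
size = sample_scale(scale_map, loc, scales, rng)
```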

Scout codelets: Send C1 features in window to corresponding SVM. If positive result, post builder codelet with urgency equal to SVM’s confidence.
Example results: Dog? negative; Dog? negative; Sidewalk? positive: 0.4; Person? negative; Outdoors? positive: 0.7; Dog? positive: 0.8
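The scout step can be sketched as follows, using the example results from the slide. The `Coderack` queue with urgency-weighted random selection is borrowed from Copycat's terminology and is an assumption here, as are all class and function names; a negative SVM result is represented as `None`.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Codelet:
    kind: str        # "scout" or "builder"
    category: str    # e.g. "Dog"
    urgency: float

@dataclass
class Coderack:
    codelets: List[Codelet] = field(default_factory=list)

    def post(self, codelet: Codelet) -> None:
        self.codelets.append(codelet)

    def pop_by_urgency(self, rng: random.Random) -> Codelet:
        """Remove and return one codelet, chosen with probability proportional to urgency."""
        weights = [c.urgency for c in self.codelets]
        index = rng.choices(range(len(self.codelets)), weights=weights, k=1)[0]
        return self.codelets.pop(index)

def run_scout(category: str, confidence: Optional[float], coderack: Coderack) -> None:
    """On a positive SVM result, post a builder whose urgency equals the confidence."""
    if confidence is not None:
        coderack.post(Codelet("builder", category, confidence))

# Scout results as listed on the slide (None = negative).
results = [("Dog", None), ("Dog", None), ("Sidewalk", 0.4),
           ("Person", None), ("Outdoors", 0.7), ("Dog", 0.8)]
rack = Coderack()
for category, confidence in results:
    run_scout(category, confidence, rack)

posted = [(c.category, c.urgency) for c in rack.codelets]
next_codelet = rack.pop_by_urgency(random.Random(0))
```

Under this scheme the Dog builder (urgency 0.8) is the most likely to run next, but the Sidewalk builder (0.4) still has a chance.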

Builder codelets: Ask HMAX to compute C2 features using prototype shapes specific to the object class, and send them to corresponding SVM. If positive, decide to build structure with probability equal to SVM confidence. Break competing structures if necessary.
Scout results shown: Dog? negative; Dog? negative; Sidewalk? positive: 0.4; Person? negative; Outdoors? positive: 0.7; Dog? positive: 0.8
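A builder's decision rule might look like the sketch below. The names are hypothetical, the C2-based SVM confidence is passed in as a plain number (`None` for a negative result), and "competing" is assumed to mean "already occupies the same window", which the slides do not define.

```python
import random
from typing import List, Optional, Tuple

Window = Tuple[int, int, int]  # (row, col, size)

def run_builder(category: str, window: Window, confidence: Optional[float],
                workspace: List[dict], rng: random.Random) -> bool:
    """Build a structure with probability equal to the SVM confidence."""
    if confidence is None:               # negative C2 result: nothing to build
        return False
    if rng.random() >= confidence:       # probabilistic decision
        return False
    # Assumption: a structure competes if it occupies the same window.
    workspace[:] = [s for s in workspace if s["window"] != window]
    workspace.append({"category": category, "window": window, "strength": confidence})
    return True

workspace = [{"category": "Cat", "window": (10, 20, 32), "strength": 0.3}]
built = run_builder("Dog", (10, 20, 32), 1.0, workspace, random.Random(0))
skipped = run_builder("Person", (40, 40, 64), 0.0, workspace, random.Random(0))
```

A confidence of 1.0 always builds and replaces the competing Cat structure; a confidence of 0.0 never builds.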

Outdoors Dog

Semantic Network: “taking a dog for a walk” (network diagram shown again)

Object-specific heat maps are updated. As codelets build structure, heat maps are continually updated to reflect prior (learned) expectations about location and scale as a function of location and scale of “built” objects.
Figure labels: Dog (built) + Person heat map; Person?
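One simple way to realize such an update is to multiply the current heat map by a bump centered where the other object is expected to be, then renormalize. The Gaussian form, the `expected_offset` and `sigma` parameters (standing in for learned expectations), and the function name are all assumptions for illustration; the slides do not specify the update rule.

```python
import math

def update_heat_map(heat_map, built_location, expected_offset, sigma):
    """Reweight a location heat map around the expected position of a related object."""
    expected_r = built_location[0] + expected_offset[0]
    expected_c = built_location[1] + expected_offset[1]
    updated = [
        [p * math.exp(-((r - expected_r) ** 2 + (c - expected_c) ** 2) / (2 * sigma ** 2))
         for c, p in enumerate(row)]
        for r, row in enumerate(heat_map)
    ]
    total = sum(sum(row) for row in updated)
    return [[v / total for v in row] for row in updated]  # keep it a distribution

# Uniform 5x5 "person" heat map; a dog has been built at (3, 1).
person_map = [[1 / 25] * 5 for _ in range(5)]
# Hypothetical learned expectation: the person is one row up and two columns right.
person_map = update_heat_map(person_map, built_location=(3, 1),
                             expected_offset=(-1, 2), sigma=1.0)
peak = max(((r, c) for r in range(5) for c in range(5)),
           key=lambda rc: person_map[rc[0]][rc[1]])
```

Subsequent “Person?” scouts sampled from `person_map` would then concentrate near the built dog.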

Dog ? Dog Leash? Outdoors Leash? Sidewalk? Person?

Dog Outdoors Sidewalk Person Strength: 0.6

Dog Outdoors Sidewalk

Semantic Network: “taking a dog for a walk” (network diagram shown again)

Dog Outdoors Sidewalk Leash? Dog ? Sidewalk? Dog ? Rope?

Dog Outdoors Sidewalk Leash Dog (weak)

Dog Outdoors Sidewalk Leash Dog (weak) Dog (strong)

Dog Outdoors Sidewalk Leash Dog

Once objects begin to be built, relation and grouping codelets can run on them.
Workspace: Dog, Dog, Leash, Sidewalk, Outdoors
Relations: is next to (twice)
Groups: Dog group

How Petacat makes a final decision
“Situation” codelet is more likely to run when temperature is low.
Figure labels: Temperature; taking a dog for a walk; Dog, Dog, Dog group, Leash, Sidewalk, Outdoors; is next to (twice)

Situation codelet tries to match prototypical situation with existing workspace structures, possibly allowing slippages.
–If resulting temperature is low enough, classify scene as positive.
–If situation codelet fails enough times or does not run for a long time, program has increasing chance of ending with negative classification.
Figure labels (workspace): Dog, Dog, Dog group, Leash, Sidewalk, Outdoors; is next to
Figure labels (prototypical situation): taking a dog for a walk; person, leash, dog, outdoors; has component, has location, is in front of, is next to
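The matching step can be sketched as binding each role of the prototypical situation to a workspace structure, substituting a slip-linked concept when the exact one is missing. This is an illustrative sketch only: the function names, the slip links, and the temperature threshold are assumptions, and the real codelet also matches relations, which is omitted here.

```python
def match_situation(prototype_roles, workspace_objects, slip_links):
    """Return (bindings, slippages), or None if some role cannot be filled."""
    bindings, slippages = {}, []
    for role in prototype_roles:
        if role in workspace_objects:
            bindings[role] = role
            continue
        allowed = slip_links.get(role, set())
        substitute = next((o for o in sorted(workspace_objects) if o in allowed), None)
        if substitute is None:
            return None                      # the situation codelet fails
        bindings[role] = substitute
        slippages.append((role, substitute))
    return bindings, slippages

def classify_scene(temperature, threshold):
    """Positive classification only if the final temperature is low enough."""
    return temperature <= threshold

prototype = ["person", "dog", "leash", "outdoors"]
slips = {"leash": {"rope", "string"}, "dog": {"cat", "horse"}}  # hypothetical slip links

with_rope = match_situation(prototype, {"person", "dog", "rope", "outdoors"}, slips)
no_person = match_situation(prototype, {"dog", "leash", "sidewalk", "outdoors"}, slips)
```

In the first call the leash role slips to rope and the match succeeds; in the second no person has been built, so the match fails and the codelet would have to be tried again later.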

Temperature
Temperature at the end of the run gives a measure of how good an instance the picture is (e.g., of the “dog walking” situation).