The Problem with the Blank Slate Approach Derek Hoiem Beckman Institute, UIUC * unpublished *

Slides:

Advertisements

Similar presentations

SCIENCE PROCESS SKILLS

Advertisements

Applications of one-class classification

A General Framework for Mining Concept-Drifting Data Streams with Skewed Distributions Jing Gao Wei Fan Jiawei Han Philip S. Yu University of Illinois.

Learning Shared Body Plans Ian Endres University of Illinois work with Derek Hoiem, Vivek Srikumar and Ming-Wei Chang.

Image Retrieval: Current Techniques, Promising Directions, and Open Issues Yong Rui, Thomas Huang and Shih-Fu Chang Published in the Journal of Visual.

Carolina Galleguillos, Brian McFee, Serge Belongie, Gert Lanckriet Computer Science and Engineering Department Electrical and Computer Engineering Department.

An Overview of the Common Core State Standards for Mathematical Practice for use with the Common Core Essential Elements The present publication was developed.

Intelligent Systems Lab. Recognizing Human actions from Still Images with Latent Poses Authors: Weilong Yang, Yang Wang, and Greg Mori Simon Fraser University,

Hidden Variables, the EM Algorithm, and Mixtures of Gaussians Computer Vision CS 143, Brown James Hays 02/22/11 Many slides from Derek Hoiem.

Hidden Variables, the EM Algorithm, and Mixtures of Gaussians Computer Vision CS 543 / ECE 549 University of Illinois Derek Hoiem 03/15/12.

Object-centric spatial pooling for image classification Olga Russakovsky, Yuanqing Lin, Kai Yu, Li Fei-Fei ECCV 2012.

Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs Roozbeh Mottaghi 1, Sanja Fidler 2, Jian Yao 2, Raquel Urtasun 2, Devi Parikh 3 1 UCLA.

CPSC 425: Computer Vision (Jan-April 2007) David Lowe Prerequisites: 4 th year ability in CPSC Math 200 (Calculus III) Math 221 (Matrix Algebra: linear.

Recognition: A machine learning approach

Quantifying and Transferring Contextual Information in Object Detection Professor: S. J. Wang Student : Y. S. Wang 1.

1 Learning to Detect Objects in Images via a Sparse, Part-Based Representation S. Agarwal, A. Awan and D. Roth IEEE Transactions on Pattern Analysis and.

A Study of Approaches for Object Recognition

Recognition Of Textual Signs Final Project for “Probabilistic Graphics Models” Submitted by: Ezra Hoch, Golan Pundak, Yonatan Amit.

Spatial Pyramid Pooling in Deep Convolutional

Introduction to Machine Learning Approach Lecture 5.

Why You Should Make Smart Flashcards Mark Mitchell & Janina Jolley, 2014 Clarion University of Pennsylvania

Exercise Session 10 – Image Categorization

Face Alignment Using Cascaded Boosted Regression Active Shape Models

SLB /04/07 Thinking and Communicating “The Spiritual Life is Thinking!” (R.B. Thieme, Jr.)

Computer Vision CS 776 Spring 2014 Recognition Machine Learning Prof. Alex Berg.

Welcome to Common Core High School Mathematics Leadership

Marcin Marszałek, Ivan Laptev, Cordelia Schmid Computer Vision and Pattern Recognition, CVPR Actions in Context.

2 2  Background  Vision in Human Brain  Efficient Coding Theory  Motivation  Natural Pictures  Methodology  Statistical Characteristics  Models.

Day 3. Standards Reading: 1.0 Word Analysis, Fluency, and Systematic Vocabulary Development- Students apply their knowledge of word origins to determine.

Yao, B., and Fei-fei, L. IEEE Transactions on PAMI(2012)

Classifying Images with Visual/Textual Cues By Steven Kappes and Yan Cao.

Computer Science Department Pacific University Artificial Intelligence -- Computer Vision.

80 million tiny images: a large dataset for non-parametric object and scene recognition CS 4763 Multimedia Systems Spring 2008.

Representations for object class recognition David Lowe Department of Computer Science University of British Columbia Vancouver, Canada Sept. 21, 2006.

M150: Data, Computing and information Outline 1.Unit two. 2.What’s next. 3.Some questions. 4.Your questions. 1.

Chapter 13 Qualitative Data Analysis Linking Theory and Analysis Qualitative Data Processing Computer Programs for Qualitative Data The Qualitative Analysis.

UNBIASED LOOK AT DATASET BIAS Antonio Torralba Massachusetts Institute of Technology Alexei A. Efros Carnegie Mellon University CVPR 2011.

School of Engineering and Computer Science Victoria University of Wellington Copyright: Peter Andreae, VUW Image Recognition COMP # 18.

Probabilistic Latent Query Analysis for Combining Multiple Retrieval Sources Rong Yan Alexander G. Hauptmann School of Computer Science Carnegie Mellon.

Learning to Share Meaning in a Multi-Agent System (Part I) Ganesh Padmanabhan.

Assessment. Levels of Learning Bloom Argue Anderson and Krathwohl (2001)

Category Independent Region Proposals Ian Endres and Derek Hoiem University of Illinois at Urbana-Champaign.

Recognition Using Visual Phrases

GENDER AND AGE RECOGNITION FOR VIDEO ANALYTICS SOLUTION PRESENTED BY: SUBHASH REDDY JOLAPURAM.

Poselets: Body Part Detectors Trained Using 3D Human Pose Annotations ZUO ZHEN 27 SEP 2011.

Object Recognition as Ranking Holistic Figure-Ground Hypotheses Fuxin Li and Joao Carreira and Cristian Sminchisescu 1.

Object Recognition by Integrating Multiple Image Segmentations Caroline Pantofaru, Cordelia Schmid, Martial Hebert ECCV 2008 E.

Goggle Gist on the Google Phone A Content-based image retrieval system for the Google phone Manu Viswanathan Chin-Kai Chang Ji Hyun Moon.

Structuring Learning. Agenda Structuring learning. Structuring lab sessions. Engagement. Critical Thinking. Ideas for structuring learning. Activity.

Hidden Variables, the EM Algorithm, and Mixtures of Gaussians Computer Vision CS 543 / ECE 549 University of Illinois Derek Hoiem 02/22/11.

DISCRIMINATIVELY TRAINED DENSE SURFACE NORMAL ESTIMATION ANDREW SHARP.

Machine Learning Lecture 1: Intro + Decision Trees Moshe Koppel Slides adapted from Tom Mitchell and from Dan Roth.

Cell Segmentation in Microscopy Imagery Using a Bag of Local Bayesian Classifiers Zhaozheng Yin RI/CMU, Fall 2009.

Rich feature hierarchies for accurate object detection and semantic segmentation 2014 IEEE Conference on Computer Vision and Pattern Recognition Ross Girshick,

Another Example: Circle Detection

Mathematical Practice Standards

Recent developments in object detection

SHAHAB iCV Research Group.

Krishna Kumar Singh, Yong Jae Lee University of California, Davis

Nonparametric Semantic Segmentation

Accounting for the relative importance of objects in image retrieval

CS 1674: Intro to Computer Vision Scene Recognition

Recognizing and Learning Object categories

Brief Review of Recognition + Context

a chicken and egg problem…

Object Recognition Today we will move on to… April 12, 2018

Semantic Segmentation

THE ASSISTIVE SYSTEM SHIFALI KUMAR BISHWO GURUNG JAMES CHOU

Presentation transcript:

The Problem with the Blank Slate Approach Derek Hoiem Beckman Institute, UIUC * unpublished *

Scene Understanding: What We Have DOG GROUND TENT 7

Scene Understanding: What We Have (More Realistically) CAR GROUND 7

Scene Understanding: What We Want What/where: Large, furry dog lying in opening of tent, leashed to tent Why: The tent provides shelter from the bright sun. What if: What would happen if I tried to pet the dog?

The Blank Slate Approach to Recognition LAB Histogram Textons Bag of SIFT HOG x x xx x x x x x o o o o o = Object Definition ExamplesImage FeaturesClassifier + + _____ The Assumption: It is possible to learn to recognize an object from positive and negative examples (with the right features, classifier, etc)

Problems with the Blank Slate Approach 1.Is inefficient

Inefficient: Objects Defined in Isolation Person +/- Person Examples Feature Space Classifier = Patch  “Person”? Car +/- Car Examples Feature Space Classifier = Patch  “Car”? Road +/- Road Examples Feature Space Classifier = Patch  “Road”? Dog +/- Dog Examples Feature Space Classifier = Patch  “Dog”? Building +/- Building Examples Feature Space Classifier = Patch  “Building”?

Problems with the Blank Slate Approach 1.Is inefficient. 2.Does not allow the underlying concept to be learned.

translation Signifiers Saussure 1916 barks reproduces four-legged furry face-licker squirrel-chaser Signified (Mental Concept) recognition A View from Semiotics (study of signs) pet

Magritte’s Warning The Treachery of Images –Magritte 1929 Ceci aussi n’est pas une pipe.

Problems with the Blank Slate Approach 1.Is inefficient. 2.Does not allow the underlying concept to be learned. 3.Might not allow accurate naming.

A Flawed Argument for Blank Slate Approach People can identify objects in isolation But inference ≠ learning - it might not be possible to learn to identify objects from +/- examples

Appearances are Just Symptoms

Quantitative and Qualitative Gap Quantitative – CalTech 101: ~85% accuracy – CalTech 256: ~65% accuracy – MSRC 21: ~75% accuracy – PASCAL 2007: 0.1 – 0.45 average precision Qualitative

Moving Beyond the Blank Slate

Major Shifts in Approach 1.Progressive vision: from starting from scratch to accumulating knowledge

Major Shifts in Approach 1.Progressive vision: from starting from scratch to accumulating knowledge 2.From isolated to relational models of objects LAB Histogram Textons Bag of SIFT HOG x x xx x x x x x o o o o o + “dog” = barks 1-4 ft high four-legged furry face-licker squirrel-chaser pet has tail OLD NEW

Major Shifts in Approach 1.Progressive vision: from starting from scratch to accumulating knowledge 2.From isolated to relational models of objects 3.Beyond naming: infer properties and explain cause and predict effect What/where: Large, furry dog lying in opening of tent, leashed to tent Why: The tent provides shelter from the bright sun. What if: What would happen if I tried to pet the dog?

Existing Solutions are not Enough More data – May never have enough – No understanding Feature sharing or “deep learning” – May not be possible to discover underlying concepts from visual data – “Hidden” features cannot be communicated (or independently improved) – Still no understanding Compositional models – Only handles “part of” relation Context with graphical models – Hand-designed relational structure: brittle, not scalable

Key Step: Incorporate Background Knowledge LAB Histogram Textons Bag of SIFT HOG x x xx x x x x x o o o o o ExamplesImage FeaturesClassifier + + Background Things Attributes Relations Describe in Terms of Background Knowledge Bias Feature Relevance

Uses of Background Knowledge Background knowledge (things that can be identified and relational structures) can Serves as building blocks for new concepts Helps determine what is likely to be important Allows properties to be inferred through class hierarchies Early Category and Concept Development – Eds. Rakison and Oakes 2003

Uses of Background Knowledge: Examples Using background knowledge as building blocks Explaining away large “stuff”

Background Knowledge as Building Blocks … Object Appearance Model Boundary Appearance Model Surface Appearance Model … Independently Trained Appearance Models Input Image Object Concept Model Final Object Estimates Part of Looks like Near … Set of Possible Relations +/- Training Examples

Results on MSRC 21 Input ImageAppearance OnlyConceptual Model MethodAverage Pixel Accuracy Multisegs (baseline)72% Final Result (with knowledge)80% Best Published~75%

Most image area belongs to a few categories Plot here Percent of Fully Labeled Image Area

Many of these can be recognized with high precision Plot here Precision Recall for 60 Classes (total area = 0.95)

Being able to recognize the big stuff is useful 1.Recognize big stuff and ignore it – Efficient search strategies – Improve object discovery 2.Big stuff provides context

Explaining Away the Big Stuff Alpha mask = 1 - P(background) Images

Explaining Away the Big Stuff Images Alpha mask = 1 - P(background)

Knowledge-Based Regularization Add penalties for assigning weights to features or relations that are unlikely to be relevant, inferred by – Superclasses (soft inheritance) – Guidance from semantic webs, text mining, or instruction – Similar objects – Functional attributes

Conclusions The blank slate approach is not sufficient for scene understanding We need an approach that progressively builds on existing knowledge – Broader recognition (beyond nouns and verbs) – New representations to support relational models – New mechanisms for relational learning and knowledge transfer – New types of evaluation – Integration with linguists, psychologists, etc.