Are Categories Necessary in a Data-Rich World? Alexei (Alyosha) Efros, CMU. Joint work with Tomasz Malisiewicz.

Presentation transcript:

Acknowledgements
– Talks by Moshe Bar; writings of Shimon Edelman
– Murphy, "Big Book of Concepts"
– Weinberger, "Everything is Miscellaneous"
– Many great discussions with many colleagues, especially Tomasz Malisiewicz, James Hays, and Derek Hoiem

Unreasonable Effectiveness of Data
Parts of our world can be explained by elegant mathematics:
– physics, chemistry, astronomy, etc.
But much cannot:
– psychology, genetics, economics, etc.
Enter: The Magic of Big Data
– Great advances in several fields: e.g. speech recognition, machine translation, search engines [Halevy, Norvig, Pereira 2009]

Categorization vs. The Data
[Figure labels: Philosophy and Psychology; Language; Arts and recreation; Literature; Technology; Religion]
Weinberger, "Everything is Miscellaneous"

categorization is losing… vs.

What's in a name?

"…That which we call a rose / By any other name would smell as sweet."
[Figures: chair category (PASCAL VOC); train category (PASCAL VOC)]

Why Categorize?
1. Knowledge Transfer
2. Communication
[Figure labels: Tiger, cat, dog, Leopard]

Classical View of Categories
Dates back to Plato & Aristotle
1. Categories are defined by a list of properties shared by all elements in a category
2. Category membership is binary
3. Every member in the category is equal

Problems with Classical View
Humans don't do this!
– People don't rely on abstract definitions / lists of shared properties (Wittgenstein 1953, Rosch 1973)
  e.g. define the properties shared by all games
  e.g. are curtains furniture? Are olives fruit?
– Typicality
  e.g. Chicken -> bird, but bird -> eagle, pigeon, etc.
– Language-dependent
  e.g. the "Women, Fire, and Dangerous Things" category in an Australian Aboriginal language (Lakoff 1987)
– Doesn't work even in human-defined domains
  e.g. Is Pluto a planet?

Visual Problems with Categories
A lot of categories are functional
Categories are 3D, but images are 2D
World is too varied
[Figure labels: Chair, car, train]

Typical HOG car detector Felzenszwalb et al, PASCAL 2007

Why not? + submitted to CVPR 2011

Semantic -> Visual Categories
[Figure labels: Aspect ratio splits; parts; Poselets]
All use a priori domain information

Fundamental Problem with Categorization
Making decisions too early!
Like Amazon.com, can we just categorize at run-time, once we know the task?

On-the-fly Categorization?
1. Knowledge Transfer
2. Communication

Association instead of categorization
Ask not "what is this?", ask "what is this like?" – Moshe Bar
Exemplar Theory (Medin & Schaffer 1978, Nosofsky 1986, Kruschke 1992)
– categories represented in terms of remembered objects (exemplars)
– similarity is measured between input and all exemplars
– think non-parametric density estimation
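
To make the exemplar idea concrete, here is a minimal sketch, assuming objects are already described by small numeric feature vectors. The function names, the Euclidean distance, and the Gaussian kernel with its bandwidth are illustrative assumptions, not anything taken from the talk. The input is compared against every stored exemplar; the ranked list answers "what is this like?", and the kernel-weighted label votes play the role of a non-parametric density estimate.

```python
import math

def euclidean(a, b):
    """Plain Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def associate(query, exemplars, bandwidth=1.0):
    """Compare a query against ALL stored exemplars.

    exemplars: list of (feature_vector, label) pairs (hypothetical format).
    Returns (ranked, votes):
      ranked - list of (distance, exemplar_index), nearest first
      votes  - dict label -> sum of Gaussian kernel weights (Parzen-style)
    """
    ranked = sorted(
        (euclidean(query, feats), idx)
        for idx, (feats, _label) in enumerate(exemplars)
    )
    votes = {}
    for dist, idx in ranked:
        label = exemplars[idx][1]
        weight = math.exp(-0.5 * (dist / bandwidth) ** 2)
        votes[label] = votes.get(label, 0.0) + weight
    return ranked, votes

# Toy memory of three remembered objects.
memory = [((0.0, 0.0), "car"), ((0.1, 0.0), "car"), ((5.0, 5.0), "road")]
ranked, votes = associate((0.0, 0.1), memory)
```

Note that no category model is ever fitted: the labels are only read off the neighbours after the association has been made.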

What is this?
[Figure: Input Image labeled Car, Road, Building]
He 2004, Tu 2004, Shotton 2006, Galleguillos 2008, Fei-Fei 2009, Gould 2009, etc.

What is this like?
Malisiewicz & Efros, CVPR'08

What is the ultimate goal?
Understanding / Parsing Images
A "what is it like?" machine

Our Contributions
Posing Recognition as Association
– Use large number of object exemplars
Learning Object Similarity
– Different distance function per exemplar
Recognition-Based Object Segmentation
– Use multiple segmentation approach

Visual Associations
How are objects similar?
[Figure labels: Shape, Color]

Distance Similarity Functions
Positive Linear Combinations of Elementary Distances Computed Over 14 Features
[Figure: Distance Function for a Building exemplar e]

Learning Object Similarity
Learn a different distance function for each exemplar in training set
Formulation is similar to Frome et al. [1, 2]
[1] Andrea Frome, Yoram Singer, Jitendra Malik. "Image Retrieval and Recognition Using Local Distance Functions." In NIPS.
[2] Andrea Frome, Yoram Singer, Fei Sha, Jitendra Malik. "Learning Globally-Consistent Local Distance Functions for Shape-Based Image Retrieval and Classification." In ICCV, 2007.
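
A per-exemplar distance of this kind can be sketched as a non-negative weighted sum of elementary distances. In the sketch below, the function name, the two-feature toy vectors and the example weights are assumptions made for illustration (the slides mention 14 features but do not list them here).

```python
def combined_distance(weights, elementary_distances):
    """Positive linear combination of elementary distances for ONE exemplar.

    weights:              non-negative weights owned by the exemplar
    elementary_distances: per-feature distances between that exemplar
                          and a candidate object (same length as weights)
    """
    if len(weights) != len(elementary_distances):
        raise ValueError("weights and distances must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    return sum(w * d for w, d in zip(weights, elementary_distances))

# Hypothetical exemplars with their own weights over [shape, color].
# One exemplar may care mostly about shape, another mostly about color.
building_e_weights = [0.5, 2.0]
candidate_distances = [1.0, 0.25]   # elementary distances to one candidate
d = combined_distance(building_e_weights, candidate_distances)
```

Because each exemplar carries its own weight vector, the same candidate can be close to one exemplar and far from another even when the elementary distances are identical.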

Non-parametric density estimation
[Figure: exemplars from Class 1, Class 2 and Class 3 plotted against a Color Dimension and a Shape Dimension]

Learning Distance Functions
[Figure: Dshape and Dcolor axes around a Focal Exemplar; a Decision Boundary separates the similar side from the dissimilar side; remaining examples are marked Don't Care]
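
One way to picture learning such a boundary for a single focal exemplar is the sketch below. It is a generic margin-based update with projection onto non-negative weights, not the optimization used by the authors; the threshold of 1, the margin, the learning rate and all names are assumptions. "Don't care" examples are simply left out of both lists.

```python
def weighted_distance(weights, elementary):
    """Weighted sum of elementary distances to the focal exemplar."""
    return sum(w * d for w, d in zip(weights, elementary))

def learn_exemplar_weights(similar, dissimilar, lr=0.1, margin=0.5,
                           max_epochs=2000):
    """Learn non-negative weights for one focal exemplar.

    similar / dissimilar: lists of elementary-distance vectors.
    Goal (assumed convention): distance < 1 on the similar side and
    distance > 1 on the dissimilar side, each with some margin.
    """
    n = len((similar or dissimilar)[0])
    weights = [1.0 / n] * n
    data = [(d, -1.0) for d in similar] + [(d, 1.0) for d in dissimilar]
    for _ in range(max_epochs):
        updated = False
        for dists, y in data:
            score = weighted_distance(weights, dists) - 1.0
            if y * score < margin:  # wrong side of, or too close to, boundary
                # Step toward the correct side, then clip at zero so the
                # combination stays a positive linear combination.
                weights = [max(0.0, w + lr * y * x)
                           for w, x in zip(weights, dists)]
                updated = True
        if not updated:  # every example satisfies its margin
            break
    return weights

# Toy data over [Dshape, Dcolor]: similar objects match in shape only.
similar = [[0.1, 0.9], [0.2, 0.8]]
dissimilar = [[0.9, 0.1], [0.8, 0.2]]
w = learn_exemplar_weights(similar, dissimilar)
```

On this toy data the learned weights put most of their mass on the shape distance, which is the behaviour the figure describes: the focal exemplar decides for itself which features matter.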

Visualizing Distance Functions (Training Set)
[Figure columns: Query; Top Neighbors with Tex-Hist Dist; Top Neighbors with Learned Dist]

Visualizing Distance Functions (Training Set)

Labels Crossing Boundary

Object Segmentation via Recognition
Generate Multiple Segmentations (Hoiem 2005, Russell 2006, Malisiewicz 2007)
– Mean-Shift and Normalized Cuts
– Use pairs and triplets of adjacent segments
– Generate about 10,000 segments per image
Enhance training with bad segments
Apply learned distance functions to bottom-up segments
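
The "pairs and triplets of adjacent segments" step can be sketched as enumeration over a segment adjacency graph. The adjacency-dictionary input format and the function name below are assumptions for illustration; producing the base segments (Mean-Shift, Normalized Cuts) is outside this sketch.

```python
def merged_candidates(adjacency):
    """Enumerate single segments plus connected pairs and triplets.

    adjacency: dict mapping segment id -> set of neighbouring segment ids
               (assumed symmetric).
    Returns (singles, pairs, triplets) as collections of frozensets.
    """
    singles = [frozenset([s]) for s in adjacency]
    pairs = {frozenset([a, b]) for a in adjacency for b in adjacency[a]}
    triplets = set()
    for pair in pairs:
        for member in pair:
            for neighbour in adjacency[member]:
                if neighbour not in pair:
                    # Adjacent pair plus a segment touching one of its
                    # members: always a connected group of three.
                    triplets.add(pair | {neighbour})
    return singles, pairs, triplets

# Four segments in a row: 1 - 2 - 3 - 4
chain = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
singles, pairs, triplets = merged_candidates(chain)
```

Each candidate region is then a union of base segments, which is how a few hundred bottom-up segments can grow into thousands of hypotheses per image.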

Example Associations
[Figure: Bottom-Up Segments]

Quantitative Evaluation
Object hypothesis is correct if labels match and OS > 0.5
* We do not penalize for multiple correct overlapping associations
OS(A,B) = Overlap Score = intersection(A,B) / union(A,B)
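
The overlap score and the correctness test can be written down directly. This is a minimal sketch assuming each region is given as a collection of (row, col) pixel coordinates; the function names and that mask representation are assumptions.

```python
def overlap_score(region_a, region_b):
    """OS(A, B) = |A intersect B| / |A union B| for two pixel sets."""
    a, b = set(region_a), set(region_b)
    union = a | b
    if not union:          # two empty regions: define the score as 0
        return 0.0
    return len(a & b) / len(union)

def hypothesis_is_correct(pred_label, pred_region, true_label, true_region,
                          threshold=0.5):
    """Correct if the labels match and the overlap score exceeds 0.5."""
    return (pred_label == true_label
            and overlap_score(pred_region, true_region) > threshold)

# Two 2x2 squares sharing one column of pixels.
a = {(0, 0), (0, 1), (1, 0), (1, 1)}
b = {(0, 1), (1, 1), (0, 2), (1, 2)}
score = overlap_score(a, b)   # 2 shared pixels out of 6 in the union
```

In this example the two squares share 2 of 6 pixels, so the score is 1/3 and the hypothesis would be rejected even if the labels matched.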

Towards Image Parsing
Need for Context

Image Parsing with Context

Bush's Memex (1945)
Store publications, correspondence, personal work, on microfilm
Items retrieved rapidly using index codes
– Builds on "rapid selector"
Can annotate text with margin notes, comments
Can construct a trail through the material and save it
– Roots of hypertext
Acts as an external memory

Visual Memex, a proposal [Malisiewicz & Efros]
Nodes = instances
Edges = associations
Types of edges:
– visual similarity
– spatial, temporal co-occurrence
– geometric structure
– language
– geography
– ...
[Figure label: New object]
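
As a data structure, the proposal amounts to a graph with typed edges between object instances. The sketch below is one possible in-memory layout; the class, its method names and the edge-type strings are hypothetical and only mirror the node and edge vocabulary on the slide.

```python
from collections import defaultdict

class VisualMemexGraph:
    """Toy graph: nodes are object instances, edges are typed associations."""

    def __init__(self):
        self.nodes = {}                  # node id -> attribute dict
        self.edges = defaultdict(list)   # node id -> [(other id, kind, weight)]

    def add_instance(self, node_id, **attributes):
        self.nodes[node_id] = attributes

    def add_association(self, a, b, kind, weight=1.0):
        """Add an undirected edge of a given kind, e.g. 'visual_similarity'."""
        if a not in self.nodes or b not in self.nodes:
            raise KeyError("both instances must be added before linking them")
        self.edges[a].append((b, kind, weight))
        self.edges[b].append((a, kind, weight))

    def neighbours(self, node_id, kind=None):
        """Ids linked to node_id, optionally restricted to one edge kind."""
        return [other for other, k, _w in self.edges[node_id]
                if kind is None or k == kind]

memex = VisualMemexGraph()
memex.add_instance("car_17", image="street_003")
memex.add_instance("car_42", image="street_101")
memex.add_instance("road_5", image="street_003")
memex.add_association("car_17", "car_42", "visual_similarity", weight=0.8)
memex.add_association("car_17", "road_5", "spatial_cooccurrence")
```

A new object would be attached by adding it as a node and linking it to its most similar stored instances, after which context can be read off by following the co-occurrence edges of those neighbours.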

Torralba's Context Challenge

Slide by Antonio Torralba

Torralba's Context Challenge
Chance ~ 1/30000
Slide by Antonio Torralba

Our Challenge Setup
Malisiewicz & Efros, NIPS'09

3 models
Visual Memex: exemplars, non-parametric object-object relationships
– Recurse through the graph
Baseline: CoLA: categories, parametric object-object relationships
Reduced Memex: categories, non-parametric relationships

Qual. results

Quant. results

Next Step: top-down segmentation
[Figure: Visual Memex with nodes A, B, C]

Take Home Message
Categorization is not a goal in itself
– Rather, it is a means for transferring knowledge onto a new instance
Skipping explicit categorization might make things easier, not harder
– The "harder intermediate problem" syndrome
Keeping around all your data isn't so bad…
– you never know when you will need it

Questions?