Visual Recognition With Humans in the Loop
Steve Branson, Catherine Wah, Florian Schroff, Boris Babenko, Serge Belongie, Peter Welinder, Pietro Perona
ECCV 2010, Crete, Greece
What type of bird is this?
Field Guide: …?
What type of bird is this? Computer Vision: ?
What type of bird is this? Computer Vision: Bird?
What type of bird is this? Computer Vision: Chair? Bottle?
Parakeet Auklet. Field guides are difficult for average users; computer vision doesn't work perfectly (yet); research has focused mostly on basic-level categories.
Visual Recognition With Humans in the Loop. What kind of bird is this? Parakeet Auklet.
Levels of Categorization: Basic-Level Categories. Airplane? Chair? Bottle? … [Griffin et al. 07, Lazebnik et al. 06, Grauman et al. 06, Everingham et al. 06, Felzenszwalb et al. 08, Viola et al. 01, …]
Levels of Categorization: Subordinate Categories. American Goldfinch? Indigo Bunting? … [Belhumeur et al. 08, Nilsback et al. 08, …]
Levels of Categorization: Parts and Attributes. Yellow Belly? Blue Belly? … [Farhadi et al. 09, Lampert et al. 09, Kumar et al. 09]
Visual 20 Questions Game. Blue Belly? No. Cone-shaped Beak? Yes. Striped Wing? Yes. American Goldfinch? Yes. Hard classification problems can be turned into a sequence of easy ones.
Recognition With Humans in the Loop. Computer vision and user answers alternate (Cone-shaped Beak? Yes. American Goldfinch? Yes.). Computers reduce the number of required questions; humans drive up the accuracy of vision algorithms.
Research Agenda. As computer vision improves (2010, 2015, 2025), recognition moves from heavy reliance on human assistance (Blue belly? No. Cone-shaped beak? Yes. Striped wing? Yes. American Goldfinch? Yes.) to more automated (Striped wing? Yes. American Goldfinch? Yes.) to fully automatic (American Goldfinch? Yes.).
Field Guides: www.whatbird.com
Example Questions
Basic Algorithm. Input image → computer vision → ask the question with maximum expected information gain. Question 1: Is the belly black? A: No. Question 2: Is the bill hooked? A: Yes. …
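A minimal sketch of this interactive loop in Python (NumPy), assuming hypothetical helpers select_question, update_posterior, and ask_user and a vector of computer-vision estimates p(c | x); it illustrates the idea rather than reproducing the authors' code.

    import numpy as np

    def recognize(p_c_given_x, ask_user, select_question, update_posterior,
                  max_questions=20, threshold=0.95):
        """Interactive recognition: start from the vision estimate p(c|x),
        then repeatedly ask the most informative question and update."""
        posterior = np.asarray(p_c_given_x, dtype=float)
        posterior /= posterior.sum()
        asked = set()
        for _ in range(max_questions):
            if posterior.max() >= threshold:       # confident enough: stop early
                break
            q = select_question(posterior, asked)  # max expected information gain
            answer = ask_user(q)                   # e.g. ("yes", "definitely")
            posterior = update_posterior(posterior, q, answer)
            asked.add(q)
        return int(posterior.argmax()), posterior

Without computer vision (next slide), p_c_given_x is simply replaced by the class prior.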
Without Computer Vision. Input image → class prior → ask the question with maximum expected information gain. Question 1: Is the belly black? A: No. Question 2: Is the bill hooked? A: Yes. …
Basic Algorithm. Select the next question that maximizes expected information gain. This is easy to compute if we can estimate probabilities of the form p(c | x, U), where c is the object class, x is the image, and U is the sequence of user responses so far.
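A sketch of that selection step for binary yes/no questions, assuming a per-class answer model p_yes[q, c] = p(answer to question q is yes | class c); the array and function names are illustrative, not from the paper.

    import numpy as np

    def select_question(posterior, asked, p_yes):
        """Pick the unasked question with maximum expected information gain,
        i.e. the largest expected drop in entropy of the class posterior."""
        def entropy(p):
            p = p[p > 0]
            return -np.sum(p * np.log2(p))

        best_q, best_gain = None, -np.inf
        for q in range(p_yes.shape[0]):
            if q in asked:
                continue
            prob_yes = float(np.sum(posterior * p_yes[q]))   # p(yes | history)
            post_yes = posterior * p_yes[q]
            post_yes /= post_yes.sum() + 1e-12
            post_no = posterior * (1.0 - p_yes[q])
            post_no /= post_no.sum() + 1e-12
            expected = prob_yes * entropy(post_yes) + (1 - prob_yes) * entropy(post_no)
            gain = entropy(posterior) - expected
            if gain > best_gain:
                best_q, best_gain = q, gain
        return best_q

To use it with the loop above, bind the answer model first, e.g. functools.partial(select_question, p_yes=p_yes).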
Basic Algorithm. The class posterior combines a model of user responses, a computer-vision estimate, and a normalization factor.
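The slide labels the factors of an equation that does not survive in the transcript; a plausible reconstruction, assuming the Bayesian update described in the ECCV 2010 paper with U^t = (u_1, …, u_t) the responses so far and responses conditionally independent given the class, is:

    p(c \mid x, U^t)
      = \frac{p(c \mid x)\,\prod_{i=1}^{t} p(u_i \mid c)}
             {\sum_{c'} p(c' \mid x)\,\prod_{i=1}^{t} p(u_i \mid c')}

Here the product over p(u_i | c) is the model of user responses, p(c | x) is the computer-vision estimate, and the denominator is the normalization factor.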
Modeling User Responses. Assume user responses depend only on the true class, and estimate the per-class response distributions using Mechanical Turk. Example: "What is the color of the belly?" for Pine Grosbeak, with answers (grey, red, black, white, brown, blue) collected at three confidence levels: Definitely, Probably, Guessing.
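A minimal sketch of estimating such a response model from Mechanical Turk annotations, assuming each annotation is a (question, class, answer, confidence) record; the confidence weights and the smoothing constant are illustrative choices, not values from the paper.

    import numpy as np

    CONFIDENCE_WEIGHT = {"definitely": 1.0, "probably": 0.75, "guessing": 0.5}

    def estimate_response_model(records, num_questions, num_classes, num_answers,
                                smoothing=1.0):
        """Estimate p(answer | question, class) from MTurk records,
        where each record is (question_id, class_id, answer_id, confidence)."""
        counts = np.full((num_questions, num_classes, num_answers), smoothing)
        for q, c, a, conf in records:
            counts[q, c, a] += CONFIDENCE_WEIGHT.get(conf, 0.5)
        return counts / counts.sum(axis=2, keepdims=True)  # normalize over answers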
Incorporating Computer Vision. Use any recognition algorithm that can estimate p(c | x). We experimented with two simple methods: a 1-vs-all SVM and attribute-based classification [Lampert et al. 09, Farhadi et al. 09].
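One simple way to turn 1-vs-all SVM decision values into the estimate p(c | x) is a softmax over the per-class scores; the temperature parameter is an illustrative knob, not a value reported in the paper.

    import numpy as np

    def probabilities_from_svm_scores(scores, temperature=1.0):
        """Convert per-class 1-vs-all SVM decision values into p(c | x)
        with a numerically stable softmax."""
        z = np.asarray(scores, dtype=float) / temperature
        z -= z.max()                 # shift for numerical stability
        expz = np.exp(z)
        return expz / expz.sum()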
Incorporating Computer Vision. Multiple kernels: self-similarity, color histograms, color layout, bag of words, spatial pyramid, geometric blur, color SIFT, SIFT [Vedaldi et al. 08, Vedaldi et al. 09]. Used VLFeat and MKL code plus color features.
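As a simplified stand-in for the multiple-kernel learning actually used, a weighted sum of precomputed kernel matrices (one per feature channel) shows how the channels can be combined before training the SVM; uniform weights here replace the learned MKL weights.

    import numpy as np

    def combine_kernels(kernel_matrices, weights=None):
        """Combine precomputed kernel matrices into a single kernel."""
        kernels = [np.asarray(K, dtype=float) for K in kernel_matrices]
        if weights is None:
            weights = np.full(len(kernels), 1.0 / len(kernels))  # uniform stand-in
        return sum(w * K for w, K in zip(weights, kernels))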
Birds 200 Dataset. 200 classes, 6,000+ images, 288 binary attributes. Why birds? Examples: Black-footed Albatross, Groove-billed Ani, Parakeet Auklet, Field Sparrow, Vesper Sparrow, Arctic Tern, Forster's Tern, Common Tern, Baird's Sparrow, Henslow's Sparrow.
Results: Without Computer Vision. Comparing different user models.
Results: Without Computer Vision. If users' answers agree with the field guides (perfect users): 100% accuracy in ⌈log2(200)⌉ = 8 questions, since 2^8 = 256 ≥ 200.
Results: Without Computer Vision. Real users answer the questions; MTurkers don't always agree with the field guides…
Results: Without Computer Vision. A probabilistic user model tolerates imperfect user responses.
Results: With Computer Vision
Results: With Computer Vision. Users drive performance from 19% to 68%; computer vision alone achieves 19%.
Results: With Computer Vision. Computer vision reduces manual labor: 11.1 → 6.5 questions.
Examples: Different questions are asked with and without computer vision. Western Grebe. Without computer vision, Q#1: Is the shape perching-like? No (definitely). With computer vision, Q#1: Is the throat white? Yes (definitely).
Examples: User input helps correct computer vision. Computer vision confuses Common Yellowthroat with Magnolia Warbler; the answer "Is the breast pattern solid? No (definitely)" corrects the prediction.
Recognition is not always successful. Visually similar pairs such as Acadian Flycatcher vs. Least Flycatcher and Parakeet Auklet vs. Least Auklet can remain confused even with unlimited questions (e.g., Is the belly multi-colored? Yes (definitely)).
Summary. Recognition of fine-grained categories, more reliable than field guides. Computer vision reduces manual labor: 11.1 → 6.5 questions. Users drive up performance beyond the 19% achieved by computer vision alone.
Future Work. Extend to domains other than birds. Methodologies for generating questions. Improve computer vision.
Questions? Project page and datasets available at: http://vision.caltech.edu/visipedia/ and http://vision.ucsd.edu/project/visipedia/
Categories of Recognition (all hard for computers).
Basic-Level (easy for humans): Airplane? Chair? Bottle? …
Subordinate (hard for humans): American Goldfinch? Indigo Bunting? …
Parts & Attributes (easy for humans): Yellow Belly? Blue Belly? …
Related Work. 20 Questions Game [20q.net], oMoby [IQEngines.com], and many others: crowdsourcing, information theory, relevance feedback, active learning, expert systems, … Field guides [whatbird.com], the Botanist's Electronic Field Guide [Belhumeur et al. 08], Oxford Flowers [Nilsback et al. 08]. Attributes [Lampert et al. 09, Farhadi et al. 09, Kumar et al. 09].
Traditional Approach to Object Recognition Research. Move from easy problems to more difficult problems as computer vision improves; but the field is stuck on basic-level categories, and performance is still too low for practical application.
Missing Attributes: Indigo Bunting vs. Blue Grosbeak.
Overview. Solve difficult vision problems. Leverage existing computer vision. Make field guides more automated and reliable.
Research Agenda (plot: reliance on humans, measured in number of questions, against reliance on computers, over 2010, 2015, 2020, 2025). Field guides: no use of computer vision. This paper. As computer vision improves, fewer questions are required. When computer vision is solved: a fully automatic system.