Download presentation
Presentation is loading. Please wait.
1
Advanced Topics in Computer Vision
Devi Parikh Electrical and Computer Engineering
2
Plan for today Topic overview: Introductions Course overview:
What does the visual recognition problem entail? Why are these hard problems? What works today? Introductions Course overview: Logistics Requirements Please interrupt at any time with questions or comments
3
Computer Vision Automatic understanding of images and video
Computing properties of the 3D world from visual data (measurement) Algorithms and representations to allow a machine to recognize objects, people, scenes, and activities. (perception and interpretation) Algorithms to mine, search, and interact with visual data (search and organization) Kristen Grauman 3
4
What does recognition involve?
Fei-Fei Li
5
Detection: are there people?
6
Activity: What are they doing?
7
Object categorization
mountain tree building banner street lamp vendor people
8
Instance recognition Potala Palace A particular sign
9
Scene and context categorization
outdoor city …
10
Attribute recognition
gray made of fabric crowded flat
11
Object Categorization
Task Description “Given a small number of training images of a category, recognize a-priori unknown instances of that category and assign the correct category label.” Which categories are feasible visually? German shepherd animal dog living being “Fido” K. Grauman, B. Leibe
12
Visual Object Categories
Basic Level Categories in human categorization [Rosch 76, Lakoff 87] The highest level at which category members have similar perceived shape The highest level at which a single mental image reflects the entire category The level at which human subjects are usually fastest at identifying category members The first level named and understood by children The highest level at which a person uses similar motor actions for interaction with category members K. Grauman, B. Leibe
13
Visual Object Categories
Basic-level categories in humans seem to be defined predominantly visually. There is evidence that humans (usually) start with basic-level categorization before doing identification. Basic-level categorization is easier and faster for humans than object identification! How does this transfer to automatic classification algorithms? K. Grauman, B. Leibe
14
Visual Object Categories
… animal Abstract levels … … four-legged … Basic level dog cat cow German shepherd Doberman Individual level … “Fido” … K. Grauman, B. Leibe
15
How many object categories are there?
~10,000 to 30,000 Biederman 1987 Source: Fei-Fei Li, Rob Fergus, Antonio Torralba. 15
16
~10,000 to 30,000 16
17
Other Types of Categories
Functional Categories e.g. chairs = “something you can sit on” K. Grauman, B. Leibe
18
Why recognition? Recognition a fundamental part of perception
e.g., robots, autonomous agents Organize and give access to visual content Connect to information Detect trends and themes Where are we now? Kristen Grauman 18
19
Computer Vision … 45 years later
“spend the summer linking a camera to a computer and getting the computer to describe what it saw” - Marvin Minsky (1966), MIT … 45 years later
20
Computer Vision OR Vision is HARD!
21
We’ve come a long way…
22
We’ve come a long way…
23
We’ve come a long way…
24
Posing visual queries Yeh et al., MIT Belhumeur et al.
Kooaba, Bay & Quack et al. Kristen Grauman
25
Finding visually similar objects
Kristen Grauman 25
26
Discovering visual patterns
Sivic & Zisserman Objects Lee & Grauman Categories Wang et al. Actions Kristen Grauman
27
Auto-annotation Gammeter et al. T. Berg et al. Kristen Grauman
28
Exploring community photo collections
Snavely et al. Kristen Grauman Simon & Seitz 28
29
Autonomous agents able to detect objects
Kristen Grauman
30
We’ve come a long way… Fischler and Elschlager, 1973
31
We’ve come a long way…
32
We’ve come a long way… Dollar et al., BMVC 2009
33
Still a long way to go… Dollar et al., BMVC 2009
34
Dollar et al., BMVC 2009
35
Dollar et al., BMVC 2009
36
Challenges
37
Challenges: robustness
Illumination Object pose Clutter Intra-class appearance Occlusions Viewpoint Kristen Grauman
38
context and human experience
Challenges: context and human experience Context cues Kristen Grauman
39
context and human experience
Challenges: context and human experience Context cues Function Dynamics Kristen Grauman Video credit: J. Davis
40
Challenges: scale, efficiency
Half of the cerebral cortex in primates is devoted to processing visual information ~20 hours of video added to YouTube per minute ~5,000 new tagged photos added to Flickr per minute Thousands to millions of pixels in an image 30+ degrees of freedom in the pose of articulated objects (humans) 3,000-30,000 human recognizable object categories Kristen Grauman
41
Challenges: learning with minimal supervision
Less More Unlabeled, multiple objects Classes labeled, some clutter Cropped to object, parts and classes labeled Kristen Grauman
42
Slide from Pietro Perona, 2004 Object Recognition workshop
43
Slide from Pietro Perona, 2004 Object Recognition workshop
44
What kinds of things work best today?
Frontal face detection Reading license plates, zip codes, checks Recognizing flat, textured objects (like books, CD covers, posters) Fingerprint recognition Kristen Grauman
45
Inputs in 1963… L. G. Roberts, Machine Perception of Three Dimensional Solids, Ph.D. thesis, MIT Department of Electrical Engineering, 1963. Kristen Grauman
46
… and inputs today Personal photo albums Movies, news, sports
Surveillance and security Medical and scientific images Slide credit; L. Lazebnik 46
47
Understand and organize and index all this data!!
… and inputs today Images on the Web Movies, news, sports 350 mil. photos, 1 mil. added daily 1.6 bil. images indexed as of summer 2005 916,271 titles 10 mil. videos, 65,000 added daily Understand and organize and index all this data!! Satellite imagery City streets Slide credit; L. Lazebnik 47
48
Introductions What is your name?
Which program are you in? How far along? What is your research area and current project about? Take a minute to explain it to us In a way that we can all follow Have you taken a computer vision course before? Machine learning or pattern recognition? Do you know what classifiers are / do? What are you hoping to get out of this class?
49
This course ECE 6504 TR 5:00 pm to 6:15 pm McBryde Hall (MCB) 216
Office hours: by appointment ( ) Course webpage: (Google me My homepage Teaching) 49
50
This course Focus on current research in computer vision
High-level recognition problems, innovative applications. 50
51
Goals Understand current approaches Analyze and critique
Identify interesting research questions Present clearly and methodically 51
52
Expectations Discussions will center on recent papers in the field [15%] Paper reviews each class [25%] Can have 3 late days over the course of the semester Presentations (2-3 times) [25%] Papers and background reading Experiments Project [35%] 52
53
Prerequisites Ability to analyze high-level conference papers
Courses in computer vision and machine learning are a plus 53
54
Paper reviews For each class, review two of the assigned papers.
me by 9:00 pm the day before class (MW) Skip reviews the classes you are presenting. 54
55
Paper review guidelines
Less than a page (½ – ¾) Brief (2-3 sentences) summary Main contribution Strengths? Weaknesses? How convincing are the experiments? Suggestions to improve them? Extensions? Applications? Additional comments, unclear points Relationships observed between the papers we are reading 55
56
Paper presentation guidelines
Papers Experiments 56
57
Papers Read selected papers in topic area and background papers as necessary Well-organized talk, 45 minutes What to cover? Problem overview, motivation Algorithm explanation, technical details Any commonalities, important differences between techniques covered in the papers. See class webpage for more details. 57
58
Experiments Implement/download code for a main idea in the paper and show us toy examples: Experiment with different types of (mini) training/testing data sets Evaluate sensitivity to important parameter settings Show (on a small scale) an example to analyze a strength/weakness of the approach Share links to any tools or data. 58
59
Tips Look up papers and authors. Their webpage may have data, code, slides, videos, etc. Make sure talk flows well and makes sense as a whole. Cite ALL sources. Don’t forget the high-level picture. Give a very clear and well-organized and thought out talk. 59
60
Timetable for presenters
Meet me three days before your presentation to do a dry run Fridays and Mondays me to set up an appointment at least a couple of days ahead of time This is a hard deadline. 60
61
Projects Possibilities: Extend a technique studied in class
Analysis and empirical evaluation of an existing technique Comparison between two approaches Design and evaluate a novel approach Talk to me if want to work with a partner Talk to me if you need help with ideas 61
62
Project timeline Project proposals (1 page) [10%]
February 12th Mid-semester presentations (10 minutes) [20%] March 19th and 21st (after Spring break) Final presentations (20 minutes) [35%] April 25th to May 7th Project reports (8 pages) [35%] May 15th 62
63
Implementation Use any language / platform you like
No support for code / implementation issues will be provided 63
64
Miscellaneous Best presentation, best project and best discussion prizes! We will vote Dinner Feedback welcome and useful No laptops, phones, etc. in class please I will interrupt if something in your talk is not clear. 64
65
Tips Make sure you are saying everything we need to know to understand what you are saying. Make sure you know what you are talking about. Imagine your audience, better yet, imagine talking to a complete stranger. Make your talks visual (images, video, not lots of text). Look up examples. 65
66
Other courses Introduction to Machine Learning and Perception
Dhruv Batra Offered again in the Fall (?) Advanced Machine Learning Spring semesters (?) Computer Vision Fall semesters (?) 66
67
Coming up Read the class webpage
Schedule is up Tour of schedule Select 6 dates (topics) you would like to present me by Thursday Whoever signs up for Tuesday next week Need not do an experimental section for that day Meet me on Friday with at least rough outline of slides Meet me on Monday with the final version First-come-first-serve Overview of my research on Thursday
68
Questions? See you Thursday!
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.