Advanced Topics in Computer Vision

Devi Parikh, Electrical and Computer Engineering

Plan for today
Topic overview: What does the visual recognition problem entail? Why are these hard problems? What works today?
Introductions
Course overview: logistics, requirements
Please interrupt at any time with questions or comments.

Computer Vision
Automatic understanding of images and video
Computing properties of the 3D world from visual data (measurement)
Algorithms and representations to allow a machine to recognize objects, people, scenes, and activities (perception and interpretation)
Algorithms to mine, search, and interact with visual data (search and organization)
Kristen Grauman

What does recognition involve?
Fei-Fei Li

Detection: are there people?

Activity: What are they doing?

Object categorization
mountain, tree, building, banner, street lamp, vendor, people

Instance recognition
Potala Palace, a particular sign

Scene and context categorization
outdoor, city, …

Attribute recognition
gray, made of fabric, crowded, flat

Object Categorization
Task description: “Given a small number of training images of a category, recognize a-priori unknown instances of that category and assign the correct category label.”
Which categories are feasible visually?
German shepherd, animal, dog, living being, “Fido”
K. Grauman, B. Leibe

Visual Object Categories
Basic-level categories in human categorization [Rosch 76, Lakoff 87]:
The highest level at which category members have similar perceived shape
The highest level at which a single mental image reflects the entire category
The level at which human subjects are usually fastest at identifying category members
The first level named and understood by children
The highest level at which a person uses similar motor actions for interaction with category members
K. Grauman, B. Leibe

Basic-level categories in humans seem to be defined predominantly visually.
There is evidence that humans (usually) start with basic-level categorization before doing identification.
Basic-level categorization is easier and faster for humans than object identification!
How does this transfer to automatic classification algorithms?
K. Grauman, B. Leibe

Abstract levels: …, animal, …, four-legged, …
Basic level: dog, cat, cow
German shepherd, Doberman
Individual level: …, “Fido”, …
K. Grauman, B. Leibe
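The levels of abstraction above can be sketched as a small child-to-parent map. This is only an illustrative sketch, not something from the slides: the names `PARENT`, `BASIC_LEVEL`, `label_chain`, and `basic_level_of` are hypothetical, and placing “Fido” under German shepherd is an assumption based on the labels listed for the same example earlier.

```python
# Hypothetical encoding of the example hierarchy: each label maps to its parent.
# Assumption: "Fido" is a German shepherd (the slides list both labels together).
PARENT = {
    "Fido": "German shepherd",
    "German shepherd": "dog",
    "Doberman": "dog",
    "dog": "four-legged",
    "cat": "four-legged",
    "cow": "four-legged",
    "four-legged": "animal",
}

# Labels the slide marks as basic-level categories.
BASIC_LEVEL = {"dog", "cat", "cow"}


def label_chain(label):
    """Return all valid labels for an item, from most specific to most abstract."""
    chain = [label]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def basic_level_of(label):
    """Return the basic-level category at or above `label`, or None if there is none."""
    for name in label_chain(label):
        if name in BASIC_LEVEL:
            return name
    return None


print(label_chain("Fido"))
print(basic_level_of("Doberman"))
```

Every entry in the chain is a correct label for the same image, which is why a categorization task has to say which level it is asking for.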

How many object categories are there?
~10,000 to 30,000 (Biederman 1987)
Source: Fei-Fei Li, Rob Fergus, Antonio Torralba.

Other Types of Categories
Functional categories, e.g. chairs = “something you can sit on”
K. Grauman, B. Leibe

Why recognition?
Recognition is a fundamental part of perception (e.g., robots, autonomous agents)
Organize and give access to visual content
Connect to information
Detect trends and themes
Where are we now?
Kristen Grauman

Computer Vision
“spend the summer linking a camera to a computer and getting the computer to describe what it saw” - Marvin Minsky (1966), MIT
… 45 years later

Computer Vision OR Vision is HARD!

We’ve come a long way…

Posing visual queries
Yeh et al., MIT; Belhumeur et al.; Kooaba, Bay & Quack et al.
Kristen Grauman

Finding visually similar objects
Kristen Grauman

Discovering visual patterns
Objects: Sivic & Zisserman
Categories: Lee & Grauman
Actions: Wang et al.
Kristen Grauman

Auto-annotation
Gammeter et al.; T. Berg et al.
Kristen Grauman

Exploring community photo collections
Snavely et al.; Simon & Seitz
Kristen Grauman

Autonomous agents able to detect objects
Kristen Grauman

We’ve come a long way… Fischler and Elschlager, 1973

We’ve come a long way…

We’ve come a long way… Dollar et al., BMVC 2009

Still a long way to go… Dollar et al., BMVC 2009

Dollar et al., BMVC 2009

Challenges

Challenges: robustness
Illumination, object pose, clutter, intra-class appearance, occlusions, viewpoint
Kristen Grauman

Challenges: context and human experience
Context cues
Kristen Grauman

Challenges: context and human experience
Context cues, function, dynamics
Kristen Grauman
Video credit: J. Davis

Challenges: scale, efficiency
Half of the cerebral cortex in primates is devoted to processing visual information
~20 hours of video added to YouTube per minute
~5,000 new tagged photos added to Flickr per minute
Thousands to millions of pixels in an image
30+ degrees of freedom in the pose of articulated objects (humans)
3,000-30,000 human-recognizable object categories
Kristen Grauman
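To put the per-minute upload rates listed above into perspective, they can be converted into daily totals. This is a back-of-the-envelope sketch that assumes the quoted rates stay constant over a full day; the helper name `per_day` is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def per_day(rate_per_minute):
    """Convert a per-minute rate into a per-day total (assumes a constant rate)."""
    return rate_per_minute * MINUTES_PER_DAY


video_hours_per_day = per_day(20)   # ~20 hours of video added per minute
photos_per_day = per_day(5_000)     # ~5,000 tagged photos added per minute

print(video_hours_per_day)  # 28800 hours of new video per day
print(photos_per_day)       # 7200000 new tagged photos per day
```

At these rates a single day of uploads is already far more video than a person could watch in a lifetime of working hours, which is the efficiency point the slide is making.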

Challenges: learning with minimal supervision
From less to more supervision:
Unlabeled, multiple objects
Classes labeled, some clutter
Cropped to object, parts and classes labeled
Kristen Grauman

Slide from Pietro Perona, 2004 Object Recognition workshop

What kinds of things work best today?
Frontal face detection
Reading license plates, zip codes, checks
Recognizing flat, textured objects (like books, CD covers, posters)
Fingerprint recognition
Kristen Grauman

Inputs in 1963…
L. G. Roberts, Machine Perception of Three Dimensional Solids, Ph.D. thesis, MIT Department of Electrical Engineering, 1963.
Kristen Grauman

… and inputs today
Personal photo albums
Movies, news, sports
Surveillance and security
Medical and scientific images
Slide credit: L. Lazebnik

… and inputs today
Images on the Web; movies, news, sports; satellite imagery; city streets
350 mil. photos, 1 mil. added daily
1.6 bil. images indexed as of summer 2005
916,271 titles
10 mil. videos, 65,000 added daily
Understand and organize and index all this data!!
Slide credit: L. Lazebnik

Introductions
What is your name?
Which program are you in? How far along?
What is your research area and current project about? Take a minute to explain it to us in a way that we can all follow.
Have you taken a computer vision course before? Machine learning or pattern recognition?
Do you know what classifiers are / do?
What are you hoping to get out of this class?

This course
ECE 6504
TR 5:00 pm to 6:15 pm
McBryde Hall (MCB) 216
Office hours: by appointment
Course webpage: (Google me → My homepage → Teaching)

This course
Focus on current research in computer vision
High-level recognition problems, innovative applications.

Goals
Understand current approaches
Analyze and critique
Identify interesting research questions
Present clearly and methodically

Expectations
Discussions will center on recent papers in the field [15%]
Paper reviews each class [25%] (can have 3 late days over the course of the semester)
Presentations (2-3 times) [25%]: papers and background reading, experiments
Project [35%]

Prerequisites
Ability to analyze high-level conference papers
Courses in computer vision and machine learning are a plus

Paper reviews
For each class, review two of the assigned papers.
Email me by 9:00 pm the day before class (MW).
Skip reviews for the classes you are presenting.

Paper review guidelines
Less than a page (½ – ¾)
Brief (2-3 sentences) summary
Main contribution
Strengths? Weaknesses?
How convincing are the experiments? Suggestions to improve them?
Extensions? Applications?
Additional comments, unclear points
Relationships observed between the papers we are reading

Paper presentation guidelines
Papers
Experiments

Papers
Read selected papers in topic area and background papers as necessary
Well-organized talk, 45 minutes
What to cover?
Problem overview, motivation
Algorithm explanation, technical details
Any commonalities, important differences between techniques covered in the papers
See class webpage for more details.

Experiments
Implement/download code for a main idea in the paper and show us toy examples:
Experiment with different types of (mini) training/testing data sets
Evaluate sensitivity to important parameter settings
Show (on a small scale) an example to analyze a strength/weakness of the approach
Share links to any tools or data.

Tips
Look up papers and authors. Their webpage may have data, code, slides, videos, etc.
Make sure talk flows well and makes sense as a whole.
Cite ALL sources.
Don’t forget the high-level picture.
Give a very clear, well-organized, and thought-out talk.

Timetable for presenters
Meet me three days before your presentation to do a dry run
Fridays and Mondays
Email me to set up an appointment at least a couple of days ahead of time
This is a hard deadline.

Projects
Possibilities:
Extend a technique studied in class
Analysis and empirical evaluation of an existing technique
Comparison between two approaches
Design and evaluate a novel approach
Talk to me if you want to work with a partner
Talk to me if you need help with ideas

Project timeline
Project proposals (1 page) [10%]: February 12th
Mid-semester presentations (10 minutes) [20%]: March 19th and 21st (after Spring break)
Final presentations (20 minutes) [35%]: April 25th to May 7th
Project reports (8 pages) [35%]: May 15th

Implementation
Use any language / platform you like
No support for code / implementation issues will be provided

Miscellaneous
Best presentation, best project and best discussion prizes!
We will vote
Dinner
Feedback welcome and useful
No laptops, phones, etc. in class please
I will interrupt if something in your talk is not clear.

Tips
Make sure you are saying everything we need to know to understand what you are saying.
Make sure you know what you are talking about.
Imagine your audience; better yet, imagine talking to a complete stranger.
Make your talks visual (images, video, not lots of text).
Look up examples.

Other courses
Introduction to Machine Learning and Perception: Dhruv Batra; offered again in the Fall (?)
Advanced Machine Learning: Spring semesters (?)
Computer Vision: Fall semesters (?)

Coming up
Read the class webpage
Schedule is up
Tour of schedule
Select 6 dates (topics) you would like to present; email me by Thursday
Whoever signs up for Tuesday next week:
Need not do an experimental section for that day
Meet me on Friday with at least a rough outline of slides
Meet me on Monday with the final version
First come, first served
Overview of my research on Thursday

Questions? See you Thursday!
