16-721: Learning-based Methods in Vision Staff: Instructor: Alexei (Alyosha) Efros 4207 TA: Tomasz Malisiewicz Smith Hall.


1 16-721: Learning-based Methods in Vision Staff: Instructor: Alexei (Alyosha) Efros (efros@cs), 4207 NSH TA: Tomasz Malisiewicz (tomasz@cmu), Smith Hall 236 Web Page: http://www.cs.cmu.edu/~efros/courses/LBMV09/

2 Today Introduction Why This Course? Administrative stuff Overview of the course

3 A bit about me Alexei (Alyosha) Efros Relatively new faculty (RI/CSD) Ph.D 2003, from UC Berkeley (signed by Arnie!) Research Fellow, University of Oxford, ’03-’04 Teaching The plan is to have fun and learn cool things, both you and me! Social warning: I don’t see well Research Vision, Graphics, Data-driven “stuff”

4 PhD Thesis on Texture and Action Synthesis Antonio Criminisi’s son cannot walk but he can fly Smart Erase button in Microsoft Digital Image Pro:

5 Why this class? The Old Days™: 1. Graduate Computer Vision 2. Advanced Machine Perception

6 Why this class? The New and Improved Days: 1. Graduate Computer Vision 2. Advanced Machine Perception Physics-based Methods in Vision Geometry-based Methods in Vision Learning-based Methods in Vision

7 Describing Visual Scenes using Transformed Dirichlet Processes. E. Sudderth, A. Torralba, W. Freeman, and A. Willsky. NIPS, Dec. 2005. The Hip & Trendy Learning

8 Learning as Last Resort

9 from [Sinha and Adelson 1993] EXAMPLE: Recovering 3D geometry from single 2D projection Infinite number of possible solutions!
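A minimal sketch of why a single 2D projection admits infinitely many 3D explanations, assuming an idealized pinhole camera. The function names `project` and `points_along_ray`, the focal length, and the sample point are illustrative assumptions, not taken from the slides or from Sinha and Adelson. Every 3D point on the same ray through the camera center lands on the same image coordinates:

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (X, Y, Z) to image coordinates (u, v)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * x / z, focal_length * y / z)


def points_along_ray(point, scales):
    """Scale a 3D point along the ray from the camera center through it."""
    x, y, z = point
    return [(s * x, s * y, s * z) for s in scales]


# Hypothetical 3D point and several rescaled copies at different depths.
base = (1.0, 2.0, 4.0)
candidates = points_along_ray(base, [0.5, 1.0, 2.0, 10.0])

# All candidates project to the same 2D location, so depth is unrecoverable
# from this single image point without additional assumptions or prior data.
images = [project(p) for p in candidates]
for p, uv in zip(candidates, images):
    print(p, "->", uv)
```

The same ambiguity holds for any positive scale, which is one way to see the "infinite number of possible solutions" that the slide refers to.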

10 Learning-based Methods in Vision This class is about trying to solve problems that do not have a solution! Don’t tell your mathematician friends! This will be done using Data: E.g. what happened before is likely to happen again Google Intelligence (GI): The AI for the post-modern world! Note: this is not quite statistics Why is this even worthwhile? Even a decade ago at ICCV99 Faugeras claimed it wasn’t!

11 The Vision Story Begins… “What does it mean, to see? The plain man's answer (and Aristotle's, too) would be, to know what is where by looking.” -- David Marr, Vision (1982)

12 Vision: a split personality “What does it mean, to see? The plain man's answer (and Aristotle's, too) would be, to know what is where by looking. In other words, vision is the process of discovering from images what is present in the world, and where it is.” Answer #1: pixel of brightness 243 at position (124,54) …and depth .7 meters Answer #2: looks like bottom edge of whiteboard showing at the top of the image Which do we want? Is the difference just a matter of scale? depth map

13 Measurement vs. Perception

14 Brightness: Measurement vs. Perception

15 Proof!

16 Lengths: Measurement vs. Perception http://www.michaelbach.de/ot/sze_muelue/index.html Müller-Lyer Illusion

17 Vision as Measurement Device Real-time stereo on Mars Structure from Motion Physics-based Vision Virtualized Reality

18 …but why do Learning for Vision? “What if I don’t care about this wishy-washy human perception stuff? I just want to make my robot go!” Small Reason: For measurement, other sensors are often better (in DARPA Grand Challenge, vision was barely used!) For navigation, you still need to learn! Big Reason: The goals of computer vision (what + where) are in terms of what humans care about.

19 So what do humans care about? slide by Fei-Fei, Fergus & Torralba

20 Verification: is that a bus? slide by Fei-Fei, Fergus & Torralba

21 Detection: are there cars? slide by Fei-Fei, Fergus & Torralba

22 Identification: is that a picture of Mao? slide by Fei-Fei, Fergus & Torralba

23 Object categorization sky building flag wall banner bus cars bus face street lamp slide by Fei-Fei, Fergus & Torralba

24 Scene and context categorization outdoor city traffic … slide by Fei-Fei, Fergus & Torralba

25 Rough 3D layout, depth ordering slide by Fei-Fei, Fergus & Torralba

26 Challenges 1: view point variation Michelangelo 1475-1564 slide by Fei-Fei, Fergus & Torralba

27 Challenges 2: illumination slide credit: S. Ullman

28 Challenges 3: occlusion Magritte, 1957 slide by Fei-Fei, Fergus & Torralba

29 Challenges 4: scale slide by Fei-Fei, Fergus & Torralba

30 Challenges 5: deformation Xu, Beihong 1943 slide by Fei-Fei, Fergus & Torralba

31 Challenges 6: background clutter Klimt, 1913 slide by Fei-Fei, Fergus & Torralba

32 Challenges 7: object intra-class variation slide by Fei-Fei, Fergus & Torralba

33 Challenges 8: local ambiguity slide by Fei-Fei, Fergus & Torralba

34 Challenges 9: the world behind the image

35 In this course, we will: Take a few baby steps…

36 Goals Read some interesting papers together Learn something new: both you and me! Get up to speed on a big chunk of vision research understand 70% of CVPR papers! Use learning-based vision in your own work Try your hand in a large vision project Learn how to speak Learn how to think critically about papers

37 Course Organization Requirements: 1. Class Participation (33%) Keep annotated bibliography Post on the Class Blog before each class Ask questions / debate / fight / be involved! 2. Two Projects (66%) Analysis Project Implement and Evaluate paper and present it in class Must talk to me AT LEAST 2 weeks beforehand! Synthesis Project Can be done solo or in groups of 2 Regular meetings Must use lots of data

38 Class Participation Keep annotated bibliography of papers you read (always a good idea!). The format is up to you. At least, it needs to have: Summary of key points A few interesting insights, “aha moments”, keen observations, etc. Weaknesses of approach. Unanswered questions. Areas of further investigation, improvement. Before each class: Submit your summary for current paper(s) in hard copy (printout/xerox) Submit a comment on the Class Blog: ask a question, answer a question, post your thoughts, praise, criticism, start a discussion, etc.

39 Analysis Project 1. Pick a paper / set of papers from the list 2. Understand it as if you were the author Re-implement it If there is code, understand the code completely Run it on the same data (you can contact authors for data and even code sometimes) 3. Understand it better than the author Run it on LOTS of new data (e.g. LabelMe dataset, Flickr dataset, etc.) Figure out how it succeeds, how it fails, where it fails, and, most importantly, WHY it fails Look at which parts of the code do the real work, and which parts are just window-dressing Maybe suggest directions for improvement. 4. Prepare an amazing 1hr presentation Discuss with me twice – once when you start the project, and once 3 days before the presentation

40 Synthesis Project Can grow out of analysis project, or your own research But it needs to use large amounts of data! 1-2 people per project. Project proposals in a few weeks. Project presentations at the end of semester. Results presented as a CVPR-format paper. Hopefully, a few papers may be submitted to conferences.

41 End of Semester Awards We will vote for: Best Analysis Project Best Synthesis Project Best Blog Comment Prize: dinner in a French restaurant in Paris (transportation not included!) or some other worthy prizes

