CS 558 Computer Vision John Oliensis. Today’s class What is vision What is computer vision How we can solve vision problems –Important tools –Overall.


1 CS 558 Computer Vision John Oliensis

2 Today’s class What is vision What is computer vision How we can solve vision problems –Important tools –Overall approaches

3 Why is Vision Interesting? Psychology –~ 50% of cerebral cortex is for vision. –Vision is how we experience the world. Engineering –Want machines to interact with world. –Digital images are everywhere.

4 Vision is inferential

5

6 Inferring Surface “Lightness” How do we determine the “true” surface color at A and B? Discount slow changes from lighting, keep quick paint changes?
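
The "discount slow changes, keep quick changes" idea can be sketched in one dimension. This is only an illustrative Retinex-style sketch, not necessarily the method the lecture has in mind: the function name `recover_lightness`, the threshold value, and the toy data are all assumptions made for the example. It works on log brightness, where lighting times paint becomes a sum, drops small neighbour-to-neighbour steps as lighting drift, and re-integrates the large steps as paint changes.

```python
import math

def recover_lightness(luminance, threshold=0.05):
    """1D Retinex-style sketch: keep sharp log-brightness steps, drop slow drift."""
    logs = [math.log(v) for v in luminance]
    # Neighbour differences in the log domain (lighting * paint becomes a sum).
    steps = [b - a for a, b in zip(logs, logs[1:])]
    # Small steps are attributed to slowly varying lighting and discarded.
    kept = [s if abs(s) > threshold else 0.0 for s in steps]
    # Re-integrate the surviving steps; the result is known only up to a scale factor.
    out = [0.0]
    for s in kept:
        out.append(out[-1] + s)
    return [math.exp(v) for v in out]

# Hypothetical data: two paint patches (reflectance 0.2, then 0.8)
# under lighting that brightens by 2% per pixel.
reflectance = [0.2] * 5 + [0.8] * 5
lighting = [1.02 ** i for i in range(10)]
image = [r * l for r, l in zip(reflectance, lighting)]
estimate = recover_lightness(image)
```

In this toy case the estimate is flat within each patch and the ratio between the two patches comes out close to the true paint ratio of 4, while the raw image values drift with the lighting. A sharp shadow edge would fool this sketch, since it is also a quick change.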

7 Inferring Surface Color We perceive true surface color despite unknown or changing light!

8 Vision is Inferential (surface brightness) plaid-movie, haze movie

9 Vision is inferential: Shape from light

10 Shape from Motion

11 Vision is Inferential: Prior Knowledge

12 Computer Vision Inference → Computation Building machines that see Modeling biological perception

13 So what do humans care about? slide by Fei-Fei, Fergus & Torralba

14 Verification: is that a bus? slide by Fei-Fei, Fergus & Torralba

15 Detection: are there cars? slide by Fei-Fei, Fergus & Torralba

16 Identification: is that a picture of Mao? slide by Fei-Fei, Fergus & Torralba

17 Object categorization sky building flag wall banner bus cars bus face street lamp slide by Fei-Fei, Fergus & Torralba

18 Scene and context categorization outdoor city traffic … slide by Fei-Fei, Fergus & Torralba

19 Rough 3D layout, depth ordering slide by Fei-Fei, Fergus & Torralba

20 Challenges 1: viewpoint variation Michelangelo 1475-1564 slide by Fei-Fei, Fergus & Torralba

21 Challenges 2: illumination slide credit: S. Ullman

22 Challenges 3: occlusion Magritte, 1957 slide by Fei-Fei, Fergus & Torralba

23 Challenges 4: scale slide by Fei-Fei, Fergus & Torralba

24 Challenges 5: deformation Xu, Beihong 1943 slide by Fei-Fei, Fergus & Torralba

25 Challenges 6: background clutter Klimt, 1913 slide by Fei-Fei, Fergus & Torralba

26 Challenges 7: object intra-class variation slide by Fei-Fei, Fergus & Torralba

27 Challenges 8: local ambiguity slide by Fei-Fei, Fergus & Torralba

28 Summary: Same object can appear very different! How can you isolate what’s the same in these two pictures (the horse) given the huge differences?

29 Quick Tour of Computer Vision

30 Approach: local cues The entire image is too complex. Try to find distinctive small patches which may help to interpret it Example: brightness boundaries Maybe part of object’s outline? May help in inferring object shapes. Build larger interpretations from these small “clues”
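
A brightness boundary can be found with a very simple local test. The sketch below is one common way to do it (central-difference gradient plus a threshold) and is offered as an assumption, not as the detector used in the lecture; the function name `brightness_boundaries` and the toy image are made up for illustration.

```python
def brightness_boundaries(image, threshold=0.5):
    """Mark interior pixels whose brightness gradient magnitude exceeds a threshold."""
    rows, cols = len(image), len(image[0])
    edges = [[False] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Central differences approximate horizontal and vertical brightness change.
            gx = (image[r][c + 1] - image[r][c - 1]) / 2.0
            gy = (image[r + 1][c] - image[r - 1][c]) / 2.0
            edges[r][c] = (gx * gx + gy * gy) ** 0.5 > threshold
    return edges

# Toy 5x6 image: dark left half (0.0), bright right half (1.0).
image = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
edges = brightness_boundaries(image, threshold=0.25)
```

The test looks only at a small patch, so it fires equally on an object outline and on a highlight. Telling the two apart needs information from a larger region.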

31 Local cue: Brightness Boundary

32 Could this be part of the outline of something?

33 Local cue: Brightness Boundary Part of the leaf outline

34 Local cue: Brightness Boundary

35 Could this be part of the outline of something?

36 Local cue: Brightness Boundary

37 Not an outline, Just a highlight

38 Where’s the squirrel outline?

39 Integrating information over larger regions Finding outlines Finding regions that might correspond to objects

40 Boundary Detection http://www.robots.ox.ac.uk/~vdg/dynamics.html

41 Boundary Detection Finding the Corpus Callosum (G. Hamarneh, T. McInerney, D. Terzopoulos)

42 Segmentation (foreground versus background) (Sharon, Galun, Brandt, Basri)

43 Segmentation (foreground versus background) Different approach JO

44 Different approach

45 A Classical View of Vision Grouping / Segmentation Figure/Ground Organization Object and Scene Recognition pixels, boundaries, small windows… Low-level Mid-level High-level

46 A Contemporary View of Vision Figure/Ground Organization Grouping / Segmentation Object and Scene Recognition pixels, boundaries, small windows… Low-level Mid-level High-level But where do we draw this line?

47 Boundaries and regions → Shape Texture → appearance

48 Texture Repetition Synthesis Learn the statistics of a texture to recognize it Synthesize texture based on learned model Original

49 Repetition Synthesis Texture Textures over time (smoke, flame, waterfall...) Original

50 Tracking (JO+HZ)

51 Understanding Action Tracking face features → emotions Tracking pedestrians → surveillance

52 Tracking office workers

53 Stereo

54 Public Library, Stereoscopic Looking Room, Chicago, by Phillips, 1923 (Slide courtesy Steve Seitz)

55 Stereo Image 1 Image 2 Camera 1 Camera 2

56 Stereo http://www.magiceye.com/

57 Stereo http://www.magiceye.com/

58 Estimated Camera Motion Structure from Motion Motion and shape from movies

59 movie to shape Estimated 3D shape

60 Movie → shape Important for humans

61 Motion – Application Inserting virtual objects into video (www.realviz.com)

62 Motion Application Aligning virtual & real objects despite camera motion Visually guided surgery

63 Recognition (despite appearance change) Lighting affects appearance

64

65

66 Classification (Funkhouser, Min, Kazhdan, Chen, Halderman, Dobkin, Jacobs)

67 Viola and Jones: Real time Face Detection

68

69 Approaches to Vision

70 Approach 1: Toy Models + Algorithms 1) Start with simple idealized model of world, images Find good algorithms 2) Experiment on real world. 3) Update model, algorithms Real Problem is going beyond idealizations!

71 Example: 3D shape from shading (JO)

72 How does Shading determine Shape? Bright Dark Shading (image brightness) indicates how much light falls on each surface patch → gives surface patch orientation → overall shape

73 Very Idealized! Uniformly bright surface (no paint!) (else brightness doesn’t indicate orientation) Other idealizations as well –no shadows –smooth surface –no objects in front of others –no glossiness or mirror reflections –known light source –light from one direction only
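
The link between shading and orientation can be made concrete under the idealizations listed above. The sketch below assumes a matte (Lambertian) surface, for which brightness is the paint reflectance times the cosine of the angle between the surface normal and the light direction. The Lambertian model and the function names are assumptions for illustration; the slides do not name a specific reflectance model.

```python
import math

def lambertian_brightness(normal, light, albedo=1.0):
    """Brightness of a matte patch: albedo times cosine of the angle to the light."""
    dot = sum(n * l for n, l in zip(normal, light))
    return albedo * max(0.0, dot)

def angle_to_light(brightness, albedo=1.0):
    """Invert the model: angle in degrees between the patch normal and the light."""
    return math.degrees(math.acos(min(1.0, brightness / albedo)))

light = (0.0, 0.0, 1.0)    # known, single, distant light source (unit vector)
facing = (0.0, 0.0, 1.0)   # patch facing the light head-on
tilt = math.radians(60)
tilted = (math.sin(tilt), 0.0, math.cos(tilt))  # patch tilted 60 degrees away

bright = lambertian_brightness(facing, light)
dark = lambertian_brightness(tilted, light)
recovered = angle_to_light(dark)
```

Brightness alone fixes only the angle to the light, not the direction of tilt, so a single patch is still ambiguous. Assumptions such as a smooth surface are what allow neighbouring patches to be combined into an overall shape. With unknown paint (`albedo`), even the angle cannot be recovered, which is why the uniform-surface idealization matters.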

74 Approach 2: Psychology/Neuroscience Derive insights from human/animal vision Example: processing at multiple scales True for people; useful for computers
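
Processing at multiple scales is often implemented as an image pyramid: blur, subsample, repeat. The following is a minimal one-dimensional sketch of that idea; the `[1, 2, 1] / 4` blur kernel, the function names, and the toy signal are assumptions chosen for brevity.

```python
def downsample(signal):
    """Blur with a [1, 2, 1] / 4 kernel, then keep every second sample."""
    n = len(signal)
    blurred = []
    for i in range(n):
        left = signal[max(i - 1, 0)]        # replicate the border sample
        right = signal[min(i + 1, n - 1)]
        blurred.append((left + 2 * signal[i] + right) / 4.0)
    return blurred[::2]

def pyramid(signal, levels):
    """Return the signal at successively coarser scales, finest first."""
    scales = [list(signal)]
    for _ in range(levels - 1):
        scales.append(downsample(scales[-1]))
    return scales

# Fine texture (alternating values) riding on a coarse dark-to-bright step.
signal = [0, 1, 0, 1, 0, 1, 0, 1, 4, 5, 4, 5, 4, 5, 4, 5]
scales = pyramid(signal, levels=3)
```

At the coarser levels the fine alternation is averaged away and only the large dark-to-bright step remains, which is roughly what viewing a picture from far away does.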

75 (Try squinching your eyes from far)

76

77

78 Approach 3: Engineering Limited goals, application-oriented. Exploit domain constraints! Problem: May not generalize to other tasks

79 Example: Image Mosaics + + … += Goal: Stitch together images into composite image Composite has to look real, taken from one place: may have to warp original images
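
For images taken from one place, the warp between two views can be modelled as a 3x3 projective transformation (a homography). The sketch below only shows how a single pixel coordinate is mapped; the matrix values are hypothetical, and estimating the matrix from matched points is a separate problem not shown here.

```python
def warp_point(H, x, y):
    """Map pixel (x, y) through a 3x3 homography H given as nested lists."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    # Divide by the third homogeneous coordinate to get back to pixel coordinates.
    return xh / w, yh / w

# Hypothetical homography: shift right by 100 pixels with a mild perspective term.
H = [[1.0, 0.0, 100.0],
     [0.0, 1.0, 0.0],
     [0.001, 0.0, 1.0]]
corner = warp_point(H, 0.0, 0.0)
far = warp_point(H, 200.0, 50.0)
```

A full mosaic would apply such a mapping to every pixel of each input image and then blend the overlapping regions.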

80 Approach 4: Bayesian inference + Learning Given the image, what 3D scene produced it? Impossible! Image is 2D, has too little information about scene since it’s 3D. Bayesian solution: Learn: accumulate experience about what types of 3D scenes and images are likely to occur. Use this experience to help in interpreting new images. (i.e., tune algorithm based on experience).

81 Approach 4: Bayesian inference + Learning Usually based on probabilities –How likely is this object to appear? –How likely is it that this image patch shows the object? Finding the probability for all possibilities often very hard, can lead to huge computations.
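
The two questions above correspond to the prior and the likelihood in Bayes' rule, and their normalized product is the probability of each interpretation given the image. A minimal sketch over two hypotheses follows; all the numbers are made up for illustration, and `posterior` is a hypothetical helper name.

```python
def posterior(prior, likelihood):
    """Bayes' rule over a finite set of scene hypotheses."""
    unnormalized = {s: prior[s] * likelihood[s] for s in prior}
    total = sum(unnormalized.values())
    return {s: p / total for s, p in unnormalized.items()}

# Hypothetical numbers: a blue image patch could show sky or water.
prior = {"sky": 0.7, "water": 0.3}        # how likely is each to appear?
likelihood = {"sky": 0.5, "water": 0.9}   # how likely is a blue patch, given each?
post = posterior(prior, likelihood)
best = max(post, key=post.get)
```

Here the patch fits "water" better, yet "sky" still wins because it is more common in the assumed experience. With two hypotheses the sum is trivial. With all possible 3D scenes the same sum is what leads to the huge computations.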

82 Recognize objects (Bayesian learning) Recognize parts (eyes, nose,…) and their spatial arrangement. Learning: Automatically tune algorithm from its success on trial runs

83 Approach 4A: Learning from millions of pictures

84 Theory of Vision David Marr (1980s) –Visual understanding is a computation –It proceeds in well defined stages Primal Sketch 2½D Sketch 3D Representations –Wrong in details Gestalt, Gibson ecological theory, geons… Now: no general theory of vision

85 The State of Computer Vision Technology –Applications Surveillance Road monitoring Computer-driven cars Football Movies Medicine Face Recognition/Biometrics Space HCI (Human Computer Interface); sign language recognition Remote Sensing –Successful companies Largest ~100-200 million in revenues. In-house applications.

86 The State of Computer Vision Science –More progress in engineering –Interesting theory for specific problems (e.g., estimating 3D shape of objects from images) –Beginnings of progress on “intelligent” vision (i.e., recognizing objects)

87 The State of Computer Vision Sociology –Engineers (dominant group) –Applied math –Computer science –Visual Psychology, neuroscience

88 Related Fields Learning (can computers teach themselves to see?) + Artificial Intelligence (AI) Graphics. “Vision is inverse graphics” Visual perception + Neuroscience Math (e.g., geometry, statistics/probability) + Physics Operations research, optimization

89 History (very rough) “Those who cannot remember the past are condemned to repeat it” 1985-1990 –Toy models/algorithms (line drawings of blocks) –AI recognition systems –Segmentation: break images up into regions that could be objects –Low-level vision: detecting brightness boundaries, estimating 3D shapes of objects –Neural nets. David Marr. 1990s –Estimating camera motion from movies; projective geometry –Model-based recognition: use specific object models to find them in images –Represent 2D shapes by their “skeletons” –Tracking –Classifying pixels from appearance (blue → sky or water, green → leaf, …) 2000s –Learning: internet-scale data –More reliable appearance descriptors → better recognition of objects –More math: graph theory, Monte Carlo, level sets –Robust statistics: recovering from mistakes of low-level modules

90 Tools Needed for Course Math –Linear Algebra (to be taught) –Signal Processing (to be taught) –Calculus –Some geometry –Probability Computer Science –Algorithms –Programming (MATLAB)

