Tracking through Optical Snow


Tracking through Optical Snow
Michael Langer, School of Computer Science, McGill U.
Richard Mann, School of Computer Science, U. Waterloo

Optical snow (e.g., falling snow)
Today I'm going to tell you about a new category of motion that Richard Mann and I have been trying to understand. We call this motion optical snow. An example is what you see during a snowfall. Take the ideal case in which every snowflake falls with the same 3D velocity. Even in this case, dense motion parallax occurs: snowflakes that are closer to the eye move with a faster image speed than snowflakes that are farther from the eye.

Optical snow
Here we show a synthetic image sequence. The scene is a set of randomly placed spheres in a view volume, all falling downwards. This is a very complex motion: many depth discontinuities are present, as well as a wide range of image speeds due to motion parallax. Despite all the depth discontinuities, the sequence produces a rich motion percept. Note that the percept has a richly layered structure; any model based on only two layers is not going to do well here.

Optical snow: moving observer in a 3D cluttered scene
You might be skeptical and say that falling snow doesn't come up very often in nature and is therefore not so interesting to study. We argue differently. Optical snow arises whenever an observer moves relative to a cluttered 3D scene such as a tree or bush. Walking through a forest or any 3D cluttered scene gives you dense motion parallax, that is, optical snow: points that are near to you move with a different image speed than points that are farther away. Whether it is a rigid scene moving with respect to a fixed observer, as in falling snow, or a static scene seen by a moving observer, the motion is the same.

Optical snow
We are all familiar with this type of motion. We have a rich motion percept, even though the image sequence itself is enormously complex. I am certainly not claiming that we have an accurate percept of the depths of the objects. Rather, we perceive the basic statistical properties of the motion: namely, that this is a dense 3D scene, the direction of motion, and perhaps the range of image speeds. We are trying to understand how this is possible. What sort of computations are necessary or sufficient to achieve these percepts? What sort of computational problem is the visual system solving, and how does it solve it?

Related work
- Computation of image flow: "The Fox and the Forest" (Steve Zucker, 1980s)
- Psychophysics of heading: "3D cloud of dots" (Bill Warren, 1980s-90s)
- Ecological optics (J. J. Gibson): monkeys in a forest, cats in the tall grass. But how?
We are certainly not the first to point out that 3D cluttered scenes are ecologically interesting and important. Helmholtz discussed 3D cluttered scenes, and more recently Bill Warren and others have pointed out that cluttered scenes such as grasslands and forests do not produce smooth motion fields, and have used such scenes ("3D cloud of dots") in studies of heading. We want to emphasize the ecological importance of optical snow: among the animals that inhabit such scenes are the ones most commonly studied in visual neuroscience, namely rabbit, cat, and monkey. Today I am going to go over some of the basic first steps in answering this question.

Goal of this talk
How can we model and compute image velocities in a 3D cluttered scene?

Overview of Talk
1. Fourier analysis of optical snow (Langer & Mann, ICCV '01)
2. Generalized optical snow
3. Biologically motivated computational model (sketch only)

Fourier model of image translation (Fahle & Poggio '81, Watson & Ahumada '85)
v_x f_x + v_y f_y + f_t = 0
We first consider an observation made by Watson and Ahumada back in 1985. If an image is translating with some uniform velocity (v_x, v_y), then the power spectrum of the motion has a very simple property. Here's the idea: each spatial frequency component of the image translates with this velocity as well, and any spatial frequency component translating with velocity (v_x, v_y) induces a temporal frequency. The relation between the spatial frequencies and the temporal frequency is linear: v_x f_x + v_y f_y + f_t = 0. It follows that if you take the 3D Fourier transform of a translating image, all the power lies on a plane in the frequency domain. This is not an obvious result, but it is true.
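This motion-plane property is easy to check numerically. The sketch below is my illustration, not code from the paper (all names are made up): it synthesizes a random texture translating at (v_x, v_y) via Fourier phase shifts, then measures how much of the 3D power lies on the predicted plane. The plane test is done modulo one cycle per frame to allow for DFT wraparound.

```python
import numpy as np

def translating_sequence(n=64, frames=64, vx=1.0, vy=1.0, seed=0):
    """Random texture translating at (vx, vy) pixels/frame, generated
    by Fourier-domain phase shifts so the translation is exact."""
    rng = np.random.default_rng(seed)
    F = np.fft.fft2(rng.standard_normal((n, n)))
    fy = np.fft.fftfreq(n)[:, None]              # cycles/pixel
    fx = np.fft.fftfreq(n)[None, :]
    seq = np.empty((frames, n, n))
    for t in range(frames):
        shift = np.exp(-2j * np.pi * (vx * fx + vy * fy) * t)
        seq[t] = np.fft.ifft2(F * shift).real
    return seq

vx, vy = 1.0, 1.0
seq = translating_sequence(vx=vx, vy=vy)
P = np.abs(np.fft.fftn(seq))**2                  # 3D power spectrum
ft = np.fft.fftfreq(seq.shape[0])[:, None, None]
fy = np.fft.fftfreq(seq.shape[1])[None, :, None]
fx = np.fft.fftfreq(seq.shape[2])[None, None, :]
# Distance from the plane v_x f_x + v_y f_y + f_t = 0, taken modulo the
# 1 cycle/frame periodicity of the DFT (temporal aliasing wraps around).
d = (vx * fx + vy * fy + ft) % 1.0
on_plane = np.minimum(d, 1.0 - d) < 1e-9
print("fraction of power on the motion plane:", P[on_plane].sum() / P.sum())
```

With the integer velocities chosen here the printed fraction is 1.0: every frequency sample with nonzero power sits exactly on the motion plane.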

Optical snow
(v_x, v_y) = (α τ_x, α τ_y)
Now let's relate this to optical snow. The key property of optical snow is that rather than a single velocity vector (v_x, v_y), we have a family of velocity vectors, all of which have the same direction. Mathematically, instead of one velocity (v_x, v_y), we have a set of velocities (α τ_x, α τ_y), scaled by speeds α: all motion is in the same direction (τ_x, τ_y), but there are many speeds. In the case of vertically falling snow, the direction of motion τ is (0, 1).

Optical snow
(v_x, v_y) = (α τ_x, α τ_y)
In the case of an observer moving laterally through a 3D cluttered scene, the direction of motion is horizontal. The speeds vary from point to point, but all the velocity directions are horizontal.

Optical snow
(v_x, v_y) = (α τ_x, α τ_y)
In velocity space (v_x, v_y), the velocities of optical snow form a line through the origin in the direction (τ_x, τ_y).

Fourier model of optical snow: "bowtie"
α τ_x f_x + α τ_y f_y + f_t = 0
When the optical snow is in a general direction (τ_x, τ_y) and there is a range of speeds, we can directly apply Watson and Ahumada's motion plane. But instead of one motion plane, we now have a family of planes: each speed α gives rise to its own plane. Interestingly, all these planes intersect in a common axis, and this axis lies in the (f_x, f_y) plane. So I am claiming that if you look at the 3D power spectrum of optical snow, you don't get a single plane, but you don't get junk either: you get a bowtie pattern.
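Why the planes share a common axis can be seen in two lines; the following derivation is my reconstruction of the slide's claim:

```latex
% One motion plane per speed \alpha:
\alpha\,(\tau_x f_x + \tau_y f_y) + f_t = 0 .
% A frequency (f_x, f_y, f_t) lying on the planes of two different
% speeds \alpha_1 \neq \alpha_2 must satisfy, subtracting the two
% plane equations,
(\alpha_1 - \alpha_2)(\tau_x f_x + \tau_y f_y) = 0
\;\Rightarrow\; \tau_x f_x + \tau_y f_y = 0
\;\Rightarrow\; f_t = 0 .
% So every plane contains the line { \tau_x f_x + \tau_y f_y = 0,
% f_t = 0 }; the pencil of planes through this axis, swept over a
% bounded range of speeds \alpha, is the "bowtie".
```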

Example of bowtie in power spectrum (bush sequence)
This is not just a mathematical theory. Here is the bowtie that you get from the bush sequence I showed you a few slides back. Because the motion is horizontal, we know the bowtie axis and can project the 3D power spectrum along it in order to see the bowtie. If we project the power spectrum along other axes, the bowtie pattern disappears; that is, we need to view the power spectrum from the correct direction in order to see the bowtie.
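As a sketch of how such a plot can be produced, assuming (as in the bush sequence) that the motion is horizontal so the bowtie axis is the f_y axis; the function name is illustrative:

```python
import numpy as np

def bowtie_projection(seq):
    """Project the 3D power spectrum of seq (axes t, y, x) along f_y.

    For horizontal image motion the bowtie axis is the f_y axis, so
    this projection views the wedge of motion planes edge-on, in the
    (f_x, f_t) plane."""
    P = np.abs(np.fft.fftn(seq))**2
    proj = P.sum(axis=1)             # collapse the f_y axis
    return np.fft.fftshift(proj)     # put (f_x, f_t) = (0, 0) at the center

# Projecting along any other direction in the (f_x, f_y) plane would
# require first resampling the spectrum so the candidate axis aligns
# with a grid axis; only the projection along the true bowtie axis
# shows a sharp wedge.
```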

Overview of Talk
1. Fourier analysis of optical snow (Langer & Mann, ICCV '01)
2. Generalized optical snow
3. Biologically motivated computational model (sketch only)

Moving observer in 3D cluttered scene (translation + rotation)
What I will do today is generalize the model of optical snow that we presented in earlier papers to the case in which the observer makes a general eye rotation while moving laterally through the scene. If the observer moves laterally, the translation direction is constant over the image; the eye rotation is illustrated in the figure shown here. A good approximation of the effect of an eye rotation is that it adds a constant velocity, which we call (ω_x, ω_y), to each point in the image. Two assumptions are being made here: first, the rotation is a combination of a pan and a tilt, so the camera does not roll; second, the field of view is relatively small, so that second-order effects of the rotation can be ignored.

Moving observer in 3D cluttered scene (Longuet-Higgins and Prazdny 1980)
The velocity field is the sum of two fields:
- translation of camera: depends on 3D scene geometry (depth)
- rotation of camera: independent of 3D scene geometry
The first problem is the one we saw earlier: how can a moving observer judge its heading in a 3D cluttered scene? As Longuet-Higgins and Prazdny and others showed years ago, the instantaneous motion field seen by a moving observer in a static 3D scene is the sum of a translational field and a rotational field, due to the camera's translation and rotation respectively. The model makes no assumption about smoothness of the scene geometry, so it holds fine for 3D cluttered scenes. I'm not going to drag you through the equations; the important point is that the translational field depends on depth while the rotational field is independent of depth.
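For reference, the equations being skipped here are standard. In one common convention (signs vary across texts), the image velocity (u, v) at image point (x, y), with focal length 1, depth Z, camera translation (T_x, T_y, T_z), and rotation (Ω_x, Ω_y, Ω_z), is:

```latex
\begin{align*}
u &= \frac{-T_x + x\,T_z}{Z} + \Omega_x x y - \Omega_y (1 + x^2) + \Omega_z y,\\
v &= \frac{-T_y + y\,T_z}{Z} + \Omega_x (1 + y^2) - \Omega_y x y - \Omega_z x.
\end{align*}
% Only the translational terms depend on depth Z. With pan and tilt
% only (\Omega_z = 0) and a small field of view (|x|, |y| \ll 1), the
% rotational part reduces to the constant (-\Omega_y, \Omega_x), i.e.
% the (\omega_x, \omega_y) of the generalized model below.
```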

Moving observer in 3D cluttered scene: rotation (pan + tilt) + translation (lateral)
For lateral translation, the translational component of the field is parallel optical snow in a fixed image direction, while the pan and tilt add the constant velocity (ω_x, ω_y) to every image point.

Vertical translation + pan to left
Here is an example of what we mean. Take the vertically falling snow sequence from earlier and add to it a camera rotation, a pan to the left. The result is a rather odd motion field: the sum of vertical parallel snow plus a constant horizontal drift due to the camera pan.

Tracking through optical snow (vertical translation + pan to left)
This is the situation named in the talk's title: when the camera pans or tilts, for example to track an object, the resulting motion field is the parallel snow plus the constant drift contributed by the rotation.

Generalized optical snow
(v_x, v_y) = (α τ_x, α τ_y) + (ω_x, ω_y), where the first term is due to translation and the second to rotation.
In the model of optical snow that I've presented up to now, all the object surfaces move in the same image direction, but with a range of speeds. Let's now generalize this. We do so by adding a constant (ω_x, ω_y) to each velocity vector. Mathematically, what we are doing is taking the line of velocities in the direction (τ_x, τ_y) and shifting this line off the origin in velocity space. But what does this mean? Let me look at two problems in which this model is relevant.

Fourier model of generalized optical snow: "tilted bowtie"
(α τ_x + ω_x) f_x + (α τ_y + ω_y) f_y + f_t = 0
Each speed α again gives rise to its own motion plane, and the planes again intersect in a common axis. But the constant rotation term tilts this axis out of the (f_x, f_y) plane, so the power spectrum of generalized optical snow forms a tilted bowtie.
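Grouping the terms in α makes the tilt explicit; again this is my reconstruction of the slide's claim:

```latex
% Group the motion-plane equation by powers of \alpha:
\alpha\,(\tau_x f_x + \tau_y f_y) + (\omega_x f_x + \omega_y f_y + f_t) = 0 .
% As before, a frequency lying on the planes of two different speeds
% \alpha_1 \neq \alpha_2 must satisfy both
\tau_x f_x + \tau_y f_y = 0
\quad\text{and}\quad
f_t = -(\omega_x f_x + \omega_y f_y) ,
% so the planes still share a common axis through the origin, but the
% rotation term tilts that axis out of the f_t = 0 plane: a tilted
% bowtie.
```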

Tilted bowtie in power spectrum (vertical translation + pan to left)
Since we generated the sequence, we know the axis of the bowtie. Here is the 3D power spectrum projected along the bowtie axis. You will notice some funny aliasing effects in this plot, namely the wraparound at the boundary of the frequency domain. These aliasing effects are due to the high precision of the rendering. For real images, motion blur occurs at the boundaries of the spheres and these aliasing effects are not present; they were not present in the power spectrum of the bush sequence that I showed earlier.

Overview of Talk
1. Fourier analysis of optical snow
2. Generalized optical snow
3. Biologically motivated computational model (sketch only)

Oriented, directionally tuned cells in V1
Neuroscientists tell us that each complex cell in V1 is sensitive to a particular region of the visual field and to a particular combination of spatial and temporal frequencies; that is, complex cells are orientation and direction tuned. The blue sphere shown here marks the region of the 3D frequency domain (f_x, f_y, f_t) in which one V1 cell has its peak sensitivity.

Oriented, directionally tuned cells in V1
Now consider a family of these complex cells and suppose that together they cover the 3D frequency domain. The tiling I've shown here is just a cartoon.

Pure image translation (v_x, v_y)
If we take the case of pure image translation, we get a single motion plane. On the right, I've marked in red those cells that have their peak response on this motion plane. Heeger and Simoncelli and others have used this idea to propose a model of detecting pure image translation. The idea is to have a higher-level cell, such as an MT cell, look for a particular pattern of responses in V1 that indicates a particular motion plane. Heeger and Simoncelli say that for a given motion plane you get a distributed code over the cells that overlap the motion plane, and from this distributed code you can build template cells. If you are familiar with this already, great; if not, the point to take away is that since pure translational motion gives rise to a plane in the frequency domain, it also gives rise to a characteristic distribution of complex-cell responses. (See Heeger '87, Yuille and Grzywacz '90, Simoncelli and Heeger '97.)
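As a concrete stand-in for one such V1 unit, here is a minimal motion-energy computation in the spirit of the models cited: a quadrature pair of 3D Gabor filters tuned to one spatiotemporal frequency band. The function, its parameters, and the Gaussian envelope width are my illustration, not the cited models' actual implementations.

```python
import numpy as np

def gabor_energy(seq, f0, sigma=4.0):
    """Energy response of one frequency-tuned unit to a sequence.

    seq   : array with axes (t, y, x)
    f0    : peak frequency (f_t, f_y, f_x) in cycles/sample
    sigma : width of the Gaussian envelope, in samples

    The unit is a quadrature pair of 3D Gabor filters centered on the
    sequence; |<gabor, seq>|^2 sums the squared even- and odd-phase
    responses, giving a phase-invariant, complex-cell-like output.
    (A full model would convolve the filter over space and time.)"""
    T, Y, X = seq.shape
    t = np.arange(T) - T // 2
    y = np.arange(Y) - Y // 2
    x = np.arange(X) - X // 2
    tt, yy, xx = np.meshgrid(t, y, x, indexing="ij")
    envelope = np.exp(-(tt**2 + yy**2 + xx**2) / (2 * sigma**2))
    carrier = np.exp(2j * np.pi * (f0[0] * tt + f0[1] * yy + f0[2] * xx))
    gabor = envelope * carrier
    return np.abs(np.vdot(gabor, seq))**2

# Tiling the frequency domain with a grid of f0 values and reading off
# which units respond strongly gives the distributed code: power on a
# motion plane (or, later, on a bowtie) lights up the corresponding
# subset of units.
```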

Generalized optical snow (Langer and Mann, in preparation)
The same holds for non-parallel snow. When you add an arbitrary pan or tilt of the camera, for example in tracking an object, the bowtie may tilt out of the (f_x, f_y) plane. In this case, the distributed code follows the bowtie; a crude sketch of the distributed code is shown on the right. Let me be clear about what we are and are not claiming here. We are not claiming that there are template cells in MT that are sensitive to particular distributed codes; I personally have no idea what is going on in MT, as I am not a neuroscientist. What I am claiming is that if you believe the textbook description of complex cells in V1, which the neuroscientists tell us, then non-parallel optical snow will give a distributed code over these cells. How this distributed code is processed by higher levels of the brain is an open problem.

Summary
- Goal: how to model and compute image velocities in a 3D cluttered scene?
- Generalized optical snow: lateral motion + pan and tilt → tilted bowtie in the frequency domain.
- Many algorithms are possible for fitting the bowtie (one illustrative sketch follows).
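As one illustrative example of such an algorithm (my sketch, not the authors' method): for pure optical snow, the direction τ can be recovered by brute force, since at the true direction the power-weighted distribution of implied speeds α = -f_t / (τ · (f_x, f_y)) is the most concentrated.

```python
import numpy as np

def fit_snow_direction(seq, n_angles=90, n_bins=64, alpha_max=4.0):
    """Brute-force estimate of the optical-snow direction tau.

    For each candidate angle theta, each frequency sample (f_x, f_y, f_t)
    implies a speed alpha = -f_t / (tau . (f_x, f_y)); at the true
    direction the power-weighted histogram of implied speeds is the
    most concentrated (lowest entropy)."""
    P = np.abs(np.fft.fftn(seq))**2
    T, Y, X = seq.shape
    ft = np.fft.fftfreq(T)[:, None, None]
    fy = np.fft.fftfreq(Y)[None, :, None]
    fx = np.fft.fftfreq(X)[None, None, :]
    best_entropy, best_theta = np.inf, None
    for theta in np.linspace(0.0, np.pi, n_angles, endpoint=False):
        tau_x, tau_y = np.cos(theta), np.sin(theta)
        fpar = tau_x * fx + tau_y * fy              # tau . (f_x, f_y)
        valid = np.abs(fpar) > 1e-3                 # avoid dividing by ~0
        alpha = -ft / np.where(valid, fpar, 1.0)
        mask = np.broadcast_to(valid, P.shape)
        a, w = alpha[mask], P[mask]
        keep = np.abs(a) < alpha_max                # bound the histogram
        hist, _ = np.histogram(a[keep], bins=n_bins, weights=w[keep])
        if hist.sum() == 0:
            continue
        p = hist / hist.sum()
        entropy = -(p[p > 0] * np.log(p[p > 0])).sum()
        if entropy < best_entropy:
            best_entropy, best_theta = entropy, theta
    return best_theta   # direction of tau, modulo pi
```

Fitting the tilted bowtie of generalized optical snow would additionally search over the constant term (ω_x, ω_y); the same concentration score applies with α = -(ω_x f_x + ω_y f_y + f_t) / (τ · f).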

Computational models of heading
- Longuet-Higgins and Prazdny 1980
- Rieger and Lawton 1994
- Heeger and Jepson 1992
- Hildreth 1992
- Lappe and Rauschecker 1993
- Royden 1997, ...
These models assume that "the image velocity field" can be pre-computed. But this assumption is problematic in a 3D cluttered scene.