Optical Snow and the Aperture Problem

Richard Mann School of Computer Science University of Waterloo Michael Langer School of Computer Science McGill University

Optical flow
The study of ecological optics was pioneered by Gibson half a century ago. One of the most interesting problems that Gibson addressed is how a moving observer can judge the direction of heading and the 3D scene structure from the motion field. Since Gibson posed the problem, optical flow has been studied by many, in both the psychophysics and the computer vision communities. Optical flow is defined in terms of a 2D motion field: at each point on the retina, there is a well-defined image velocity.
J.J. Gibson, The Senses Considered As Perceptual Systems, 1966

Layered motion e.g. occlusions, transparency
A more general situation for studying optical flow occurs when there are multiple objects in the scene. One common way to study this problem is to treat the image motion as a small set of layers. The idea is to have one layer per object. Layered models can be used when there are multiple opaque objects in the scene, and when there are transparent objects. The definition of transparency can be quite general here, and can include lighting variations and mirror reflections off glass.

Motion beyond layers e.g. falling snow
Recently, we have been looking at a third natural category of motion that is quite different from optical flow and from layered motion. An example of this new type of motion is what you see during a snowfall. Take the ideal case in which each of the snowflakes is falling at the same 3D velocity, as illustrated in this figure. In this case, dense motion parallax occurs. Snowflakes that are closer to the eye move with a faster image speed than flakes that are farther from the eye. We call this motion “optical snow”.

“Optical snow”
I want to emphasize that optical snow is different from optical flow and from layered motion. Optical snow gives a rich motion percept, but the percept is not that of translation, nor that of two layers. Rather, we have a rich set of depths and of depth discontinuities. It just doesn’t make sense to think of layers here, because the number of layers would be too large.

“Optical Snow” Lateral egomotion in a 3D cluttered scene
You might be skeptical and say that falling snow doesn’t come up very often in nature and that it is not a natural motion category to study. We argue differently. This sort of motion arises whenever an observer moves relative to a cluttered 3D scene. A natural example is a person moving past a bush. Again you have dense motion parallax, with discontinuities in depth occurring nearly everywhere. Parallax occurs because points that are near to you move with a different image speed than points that are farther away. This happens all the time.

Optical snow
I want to emphasize that optical snow is different from optical flow and from layered motion. Optical snow gives a rich motion percept, but the percept is not that of translation, nor that of two layers. Rather, we have a rich set of discontinuities and a large number of layers.

Overview of Talk
background:
- Fourier analysis of optical snow
- how to estimate direction of optical snow? (Langer and Mann, ICCV ’01)
new stuff:
- aperture problem

Fourier analysis of image translation
(Watson & Ahumada ’85)
[Figure axes: f_x, f_y, f_t]
Now let’s get technical for a few slides. First, I would like to review an interesting result that is due to Watson and Ahumada back in 1985. This concerns Fourier analysis of image motion. They observed that if you have an image which is translating with some uniform velocity $(v_x, v_y)$, then each of the spatial frequency components will translate with this velocity as well. Any spatial frequency component that is translating with velocity $(v_x, v_y)$ will induce a temporal frequency. The relation between spatial and temporal frequency is linear, and is shown by the equation in red. This is the equation of a plane. We call this a motion plane. It follows that if you take the 3D Fourier transform of a translating image, then all the power will lie on a plane in the frequency domain. This is not an obvious result, but it’s true.
If an image patch is translating with velocity $(v_x, v_y)$, then all power lies on a plane:
$v_x f_x + v_y f_y + f_t = 0$
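The motion plane property can be checked numerically. What follows is only a minimal sketch, not code from the original work: it assumes a random texture translating by whole pixels with circular wrap-around (so the sequence is exactly periodic), and the function names `translating_sequence` and `fraction_of_power_on_plane` are made up for this illustration. Frequencies are in cycles per sample, so the plane equation is tested modulo 1 to account for aliasing.

```python
import numpy as np

def translating_sequence(size=32, frames=32, vx=1, vy=2, seed=0):
    """Random texture translating by (vx, vy) pixels per frame (circular shift)."""
    rng = np.random.default_rng(seed)
    frame0 = rng.standard_normal((size, size))
    # I(x, y, t) = I0(x - vx*t, y - vy*t); axis 0 is y, axis 1 is x.
    return np.stack([np.roll(frame0, shift=(vy * t, vx * t), axis=(0, 1))
                     for t in range(frames)])

def fraction_of_power_on_plane(seq, vx, vy):
    """Fraction of 3D spectral power lying on vx*fx + vy*fy + ft = 0."""
    T, H, W = seq.shape
    power = np.abs(np.fft.fftn(seq)) ** 2
    ft = np.fft.fftfreq(T)[:, None, None]
    fy = np.fft.fftfreq(H)[None, :, None]
    fx = np.fft.fftfreq(W)[None, None, :]
    residual = vx * fx + vy * fy + ft
    # Sampled frequencies wrap around, so compare modulo 1 cycle per sample.
    residual = residual - np.round(residual)
    on_plane = np.abs(residual) < 1e-9
    return power[on_plane].sum() / power.sum()

if __name__ == "__main__":
    seq = translating_sequence(vx=1, vy=2)
    print("true velocity :", fraction_of_power_on_plane(seq, 1, 2))
    print("wrong velocity:", fraction_of_power_on_plane(seq, 2, 1))
```

For the true velocity the fraction should be very close to 1; for a wrong velocity only the frequencies where the two planes happen to intersect contribute, so the fraction is small.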

Optical Snow
Now let’s relate this to optical snow. The key property of optical snow is that rather than having a single velocity vector, as in the case of the pure translation discussed in the previous slide, we now have a family of velocity vectors, all of which have the same direction. Mathematically, instead of having a single velocity $(v_x, v_y)$ we now have a set of velocities $(\alpha v_x, \alpha v_y)$, one for each speed $\alpha$. Each snowflake produces a different image speed depending on its depth.
Image velocities are $(\alpha v_x, \alpha v_y)$
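One way to see where the scale factor $\alpha$ could come from is the following worked relation. It is only a sketch under an assumed pinhole camera model, which the slides do not state: the focal length $f$, the depth $Z$ and the 3D velocity $(V_X, V_Y, 0)$ parallel to the image plane are symbols introduced here for illustration.

```latex
% Assumed pinhole projection of a point (X, Y, Z):
%   x = f X / Z,   y = f Y / Z
% For a 3D velocity (V_X, V_Y, 0), the depth Z stays constant, so
\[
  \left(\frac{dx}{dt}, \frac{dy}{dt}\right)
  = \frac{f}{Z}\,(V_X, V_Y)
  = \alpha\,(v_x, v_y),
  \qquad \alpha = \frac{f}{Z}, \quad (v_x, v_y) = (V_X, V_Y).
\]
% All image velocities share one direction; the speed falls off as 1/Z,
% so nearer snowflakes (smaller Z) move faster in the image.
```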

Fourier analysis of optical snow
[Figure: “bowtie”; axis f_x]
We can directly apply Watson and Ahumada’s motion plane here. Instead of having one motion plane, we now have a family of planes. Each speed gives rise to its own plane. I don’t expect you to think about the math. What I would like you to understand is the picture. I am claiming that if you have an image sequence in which all the velocities are in the same direction but there is a range of speeds, and you take the 3D Fourier transform and look at the 3D power spectrum, then you will find that the power makes a bowtie pattern.
$\alpha v_x f_x + \alpha v_y f_y + f_t = 0$
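For anyone who does want the math, here is a short worked version of why the family of planes looks like a bowtie. The symbols $f_\parallel$, $\alpha_{\min}$ and $\alpha_{\max}$ are notation introduced here for illustration and do not appear on the slides.

```latex
% Family of motion planes, one per speed alpha:
\[
  \alpha\,(v_x f_x + v_y f_y) + f_t = 0,
  \qquad \alpha \in [\alpha_{\min}, \alpha_{\max}].
\]
% Every plane contains the line
\[
  f_t = 0, \qquad v_x f_x + v_y f_y = 0,
\]
% which lies in the (f_x, f_y) plane, perpendicular to the motion
% direction. This common line is the axis of the bowtie.
% Let f_par be the spatial frequency component along the motion direction:
\[
  f_\parallel = \frac{v_x f_x + v_y f_y}{\sqrt{v_x^2 + v_y^2}},
  \qquad
  f_t = -\alpha\,\sqrt{v_x^2 + v_y^2}\; f_\parallel .
\]
% Viewed along the axis, each plane projects to a line through the origin
% of the (f_par, f_t) plane. A range of speeds gives a fan of such lines
% with slopes between -alpha_max |v| and -alpha_min |v|: the bowtie.
```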

Bowtie of falling spheres
[Figure: power spectrum of the falling sphere sequence; angle label Θ]
Let’s look at an example. Take the falling sphere sequence. If we look at the power spectrum along the axis of the bowtie, then we get the plot shown on the upper right. The bowtie is beautifully visible. Notice that there are aliasing effects as well, as the power spectrum wraps around at the boundaries of the spatial frequency domain. This aliasing effect is not present in the real images we looked at because real images have blur.

Bowtie of bush
[Figure axes: f_t, f_Θ]
Here is the bowtie for the bush sequence. Again, when we project the 3D power spectrum in the direction of the axis of the bowtie, the bowtie is clearly visible.

Q: How to compute motion direction? A: rotate a wedge and measure power
In the previous two examples, I showed you the bowtie, but I knew in advance what the direction of the bowtie was, because I was the one who made the image sequence! But how can we estimate the direction of the bowtie (or equivalently, the direction of the motion) automatically from an image sequence? The way we do this is to use a trick. On the left we have a bowtie. On the right, we have a wedge-shaped region in the frequency domain. This wedge is defined in advance. The idea is to consider how much power falls in the wedge as a function of the orientation of the wedge. Suppose we take the wedge on the right and rotate it around the f_t axis. When the wedge is aligned with the bowtie (as is the case shown), there will be little or no power in the wedge. As the wedge rotates through other angles, it will intersect more and more of the power within the bowtie. So we can estimate the direction of motion by looking at the amount of power in the wedge as a function of the orientation.
Minimum of power in wedge occurs when wedge is aligned with the bowtie.
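The rotate-and-measure idea can be tried on a toy sequence. This is a hedged sketch, not the implementation from Langer and Mann: the synthetic "snow" is an assumed sum of band-limited random textures shifting at speeds of 1, 2 and 3 pixels per frame (additive layers, no occlusion), the wedge is assumed to be the region $|f_t| > s_{\max}\,|f \cdot n|$ where $n$ is perpendicular to the candidate bowtie axis, and `max_speed`, the 15 degree search grid and all function names are choices made for this illustration.

```python
import numpy as np

def lowpass_texture(size, cutoff, rng):
    """Random texture limited to spatial frequencies |f| <= cutoff."""
    f = np.fft.fftfreq(size)
    fy, fx = np.meshgrid(f, f, indexing="ij")
    spectrum = np.fft.fft2(rng.standard_normal((size, size)))
    spectrum[np.hypot(fx, fy) > cutoff] = 0.0
    return np.fft.ifft2(spectrum).real

def snow_sequence(size=32, frames=32, direction=(0, 1), speeds=(1, 2, 3), seed=0):
    """Sum of textures moving along `direction` (dx, dy) at several speeds."""
    rng = np.random.default_rng(seed)
    dx, dy = direction
    seq = np.zeros((frames, size, size))
    for s in speeds:
        # cutoff 0.125 keeps |f_t| <= 3 * 0.125 < 0.5, so no temporal aliasing
        layer = lowpass_texture(size, cutoff=0.125, rng=rng)
        for t in range(frames):
            seq[t] += np.roll(layer, shift=(s * dy * t, s * dx * t), axis=(0, 1))
    return seq

def wedge_power(power, axis_deg, max_speed):
    """Power outside a bowtie whose axis is at axis_deg in the (fx, fy) plane."""
    T, H, W = power.shape
    ft = np.fft.fftfreq(T)[:, None, None]
    fy = np.fft.fftfreq(H)[None, :, None]
    fx = np.fft.fftfreq(W)[None, None, :]
    phi = np.deg2rad(axis_deg)
    # Frequency component perpendicular to the candidate bowtie axis
    f_normal = -np.sin(phi) * fx + np.cos(phi) * fy
    in_wedge = np.abs(ft) > max_speed * np.abs(f_normal) + 1e-9
    return power[in_wedge].sum()

def estimate_direction(seq, max_speed, step_deg=15):
    """Return motion direction in degrees (mod 180), plus the power curve."""
    power = np.abs(np.fft.fftn(seq)) ** 2
    angles = np.arange(0, 180, step_deg)
    powers = np.array([wedge_power(power, a, max_speed) for a in angles])
    axis = angles[np.argmin(powers)]
    # The motion direction is perpendicular to the minimum-power orientation
    return (axis + 90) % 180, angles, powers

if __name__ == "__main__":
    seq = snow_sequence(direction=(0, 1))       # vertical motion
    direction, angles, powers = estimate_direction(seq, max_speed=3)
    print("estimated motion direction (deg, mod 180):", direction)
```

The estimate is only defined modulo 180 degrees, because the power spectrum cannot distinguish upward from downward motion. The sketch also assumes the maximum speed is known in advance; a real sequence would need that bound chosen or estimated separately.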

Computing the direction of motion
[Figure labels: minimum of power; motion direction]
The main observation is that the true motion direction is perpendicular to the minimum in power. You can see this in the plot on the left, which shows the amount of power in the wedge as a function of the orientation of the wedge, as we spin it around the f_t axis. The minimum of power in the wedge occurs at a certain angle. The motion direction is 90 degrees away from that angle. This is data for the falling sphere sequence. Notice that, in this example, the motion direction corresponds to the maximum of power in the wedge. One might ask: does the motion direction always correspond to the maximum of power in the wedge? It turns out that the answer is no. Here is where the aperture problem comes in.
The motion direction is perpendicular to the direction of minimum of power.

Aperture Problem
The aperture problem is best illustrated with an example. Instead of the falling sphere sequence, consider a set of falling cylinders such as shown on the right. The cylinders are falling vertically. Perceptually, however, they appear to be falling in a diagonal direction, which is called the normal direction. The reason for this is that our visual system assumes that the motion is perpendicular to the axis of the cylinders. Because this is essentially a 1D image, there is no information to say that the cylinders are falling vertically.
Vertically falling cylinders appear to move in the “normal” direction.

Aperture Problem
[Figure labels: “normal” direction (max of power); true motion]
If we take the Fourier transform of the image sequence and plot the power in the wedge as a function of orientation, we find that the maximum in power occurs in the normal velocity direction, which is diagonal, not at the true motion direction, which is vertical. If you look at the curve of power on the left, there is no way to automatically identify the true motion direction. This is what you would expect. There is no information in the image to tell you the true motion direction. This is how the aperture problem arises in optical snow.
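The ambiguity can be made concrete with a small calculation. This is a minimal sketch under stated assumptions: the cylinders are idealized as a 1D pattern whose long axis lies at 45 degrees (the slides do not give the angle), and `normal_velocity` is a hypothetical helper written for this illustration. Any velocity that differs from the true one by a slide along the cylinder axis produces the same normal velocity, and therefore the same image sequence.

```python
import numpy as np

def normal_velocity(velocity, axis_deg):
    """Component of `velocity` perpendicular to a 1D pattern.

    axis_deg is the orientation of the pattern's long axis, in degrees,
    measured in the same (x, y) coordinates as `velocity`.
    """
    phi = np.deg2rad(axis_deg)
    normal = np.array([-np.sin(phi), np.cos(phi)])  # unit vector across the pattern
    return np.dot(velocity, normal) * normal

if __name__ == "__main__":
    true_velocity = (0.0, 1.0)                      # vertical fall
    print(normal_velocity(true_velocity, 45.0))     # diagonal normal velocity
    # Adding any motion along the cylinder axis (1, 1) changes nothing:
    print(normal_velocity((1.0, 2.0), 45.0))
```

Both calls give the same diagonal vector, which is why a method that sees only the power spectrum of a 1D pattern cannot recover the vertical component of the motion.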

Aperture problem?
[Figure labels: falling ellipsoids; same power but random phase]
One of the limitations of the Fourier method I have presented is that it does not consider local image features such as edges, T-junctions, and X-junctions. Look at the sequence on the left. The motion appears vertical, presumably because the boundaries of each object are visible. The image sequence on the right has the same power spectrum as the one on the left, and so it is susceptible to the same aperture problem that I described in the previous slide. If you continue to look at the sequence on the right for ten seconds or more, you will notice a small bias to see the motion as downwards to the right, rather than purely down. This is the aperture problem coming into play. The visual system gets confused between the direction of motion and the direction of spatial structure. In future work, we will look at how local image features can be used to resolve this aperture problem.

Summary
- Optical snow: a new motion category
- Fourier-based method for detecting direction of motion
- Analysis of aperture problem
