Visual Perception of 3D Shape
Roland W. Fleming, Max Planck Institute for Biological Cybernetics
Manish Singh, Rutgers University – New Brunswick

The problem of 3D perception
Bishop Berkeley (1685-1753): "It is I think agreed by all that distance of itself, and immediately, cannot be seen. For distance being a line directed end-wise to the eye, it projects only one point in the fund of the eye, which point remains invariably the same whether the distance be longer or shorter."
[Figure labels: P1, P2, P]

The problem of 3D perception
The optics of the eye project the 3D world onto a 2D image plane on the retina. What we as behaving organisms care about is the 3D structure of the world. Unfortunately, the projection from 3D to 2D is not invertible.
[Figure labels: Image [2D], World [3D]]
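A minimal sketch of why the projection cannot be inverted, assuming an idealised pinhole camera (the function name `project` and the focal length are illustrative choices, not part of the slides). Two points at different distances along one line of sight land on the same image point.

```python
import numpy as np

def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (x, y, z), z > 0, onto the image plane."""
    x, y, z = point
    return np.array([focal_length * x / z, focal_length * y / z])

# Two points on the same line of sight, at different distances from the eye.
direction = np.array([0.3, -0.2, 1.0])
p1 = 2.0 * direction
p2 = 7.5 * direction

image_p1 = project(p1)
image_p2 = project(p2)

# Distance along the ray is lost: both points project to the same location.
same_image_point = np.allclose(image_p1, image_p2)
print(image_p1, image_p2, same_image_point)
```

Any depth along the ray gives the same image coordinates, so the image alone cannot determine which 3D point produced it.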

The problem of 3D perception
Multiple surfaces are consistent with any given image, so 3D shape perception is fundamentally ambiguous.
It is an inference from incomplete information.

Ambiguities in 3D Perception
Necker Cube: 2 dominant interpretations

Ambiguities in 3D Perception
2 dominant interpretations
Only a handful of legal interpretations are generally experienced. Why?
Note that neither of these two interpretations is a correct perspective projection!

Philosophical Schools
Constructivism (e.g. Helmholtz, Gregory, Rock)
– vision is ill-posed: sensory data are impoverished
– the world we see is a construction
– perception is a process of inductive inference
– extra-retinal information and assumptions about the world play a central role
Direct Perception (e.g. Gibson)
– “ambient optic array” contains sufficient information to support action
– we perceive the world directly, through active interaction
– the relevant information is global and comparative

Philosophical Schools
Gestalt Perception (e.g. Koffka, Metzger, Kohler)
– vision is all about structure
– the interpretation that we experience is determined by the interaction of simple rules describing the organization of the interpretation
– the simplest interpretation is favoured: Prägnanz

Explaining the Necker Cube
2 dominant interpretations
– Constructivism: the percepts are the most probable interpretations.
– Direct Perception: the relevant image information specifies these interpretations, but such ambiguous images are rarely encountered in the real world, and we normally resolve the ambiguity through interaction.
– Gestalt: the percepts are the simplest, ‘most orderly’ interpretations.

Perception Pipeline
[Diagram labels: image; cues (shading, texture); priors; shape estimate]
Priors:
– “Surfaces are generally smooth”
– “Texture tends to be isotropic”
– “Light usually comes from above”

Generic Viewpoint Assumption Koenderink & van Doorn (1979). Binford (1981). Freeman (1994).

Image-based material editing
Khan, Reinhard, Fleming & Bülthoff (2006). Transactions on Graphics: Proceedings of SIGGRAPH 06. © ACM SIGGRAPH.
[Figure labels: transparency, re-textured]
– Given a single photograph as input, modify the material appearance of an object.
– A physically correct solution is not possible: aim for a ‘perceptually correct’ solution.
– Exploit assumptions of human vision to develop heuristics.

Crude Shape Reconstruction
Light from the side: shadows and intensity gradients lead to substantial distortions of the face.
[Figure labels: original, reconstructed depths]

Importance of viewpoint
Substantial errors in depth reconstruction are not visible in the transformed image.
[Figure labels: transformed image, correct viewpoint]

Importance of viewpoint

Seen from Above

Hollow Mask Illusion
Convexity and familiarity combine to yield a strong sense that the mask is convex, even when it is concave. But note that the apparent lighting and shape are different.
[Figure labels: convex, concave, transition]

Bas-Relief Ambiguity
Scenes related to one another by an affine transformation are indistinguishable from one another.
Belhumeur, Kriegman & Yuille (1997)

Bas-Relief Ambiguity
Belhumeur, Kriegman & Yuille (1997) showed that shape-from-shading information is fundamentally ambiguous. For direct illumination, scenes that are related to one another by an affine transformation (scaling + shearing of depth, with a compensating change in lighting and albedo) yield pixel-for-pixel identical images.
Despite this, we rarely experience any ambiguity in the perception of shaded objects. Everyday perception gives us the impression that we see objects in a correct and stable way. But do we?
Koenderink and colleagues have shown that perceived shape varies considerably from day to day, with the percepts typically related to one another by an affine transformation.
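The following is a hedged numerical sketch of the ambiguity, not the authors' code. It assumes a Lambertian surface, a single distant light source, and no cast shadows or interreflections. The Gaussian bump, the parameter names `lam`, `mu`, `nu`, and the helper `lambertian_image` are all illustrative. The depth map is scaled and sheared, the light and albedo are changed to compensate, and the two rendered images are compared.

```python
import numpy as np

def lambertian_image(fx, fy, albedo, light):
    """Render albedo * max(0, n . light) for a height field with gradients fx, fy."""
    normals = np.stack([-fx, -fy, np.ones_like(fx)], axis=-1)
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True)
    return albedo * np.clip(normals @ light, 0.0, None)

# Height field z = f(x, y): a Gaussian bump with analytic gradients.
x, y = np.meshgrid(np.linspace(-1, 1, 64), np.linspace(-1, 1, 64))
f = np.exp(-(x**2 + y**2))
fx, fy = -2 * x * f, -2 * y * f
albedo = np.full_like(f, 0.8)
light = np.array([0.3, 0.2, 1.0])  # direction scaled by source strength

# Scaling + shearing of depth: f_bar = lam * f + mu * x + nu * y.
lam, mu, nu = 0.5, 0.2, -0.1
fx_bar, fy_bar = lam * fx + mu, lam * fy + nu

# Compensating changes: light -> G @ light, albedo rescaled per pixel.
G = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [mu, nu, lam]])
light_bar = G @ light
norm = np.sqrt(fx**2 + fy**2 + 1.0)
norm_bar = np.sqrt(fx_bar**2 + fy_bar**2 + 1.0)
albedo_bar = albedo * norm_bar / (lam * norm)

image = lambertian_image(fx, fy, albedo, light)
image_bar = lambertian_image(fx_bar, fy_bar, albedo_bar, light_bar)
max_difference = np.abs(image - image_bar).max()
print(max_difference)
```

The flattened, sheared surface produces the same image as the original up to floating-point error, even though its depth relief is half as large.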

Light from Above
In the absence of other information to indicate shape or lighting direction, the brain assumes light comes from above.
[Figure labels: “light” from below, “light” from above]

Linear Perspective

Bounding Contours © Dejan Todorović, 2009. Adapted and used with permission

Bounding Contours

Structure from Motion
Individual frames carry a relatively weak sense of 3D shape. It is only through optic flow (motion) that the shape is revealed.

Shape from Texture
The pattern of compressions and rarefactions across the image indicates something about the 3D shape.

Shape from Texture
Isotropic compression of textures due to distance

Shape from Texture
Anisotropic compression of textures due to slant

Anisotropic compression specifies surface orientation up to a 180° ambiguity in the surface tilt. This means we can experience perceptual flips (bistability) when there are no other cues to specify convexity vs. concavity.
Under orthographic projection, there is no isotropic compression and no convergence, so we can see the red line as lying either on a ridge or in a valley.
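A small sketch of the 180° tilt ambiguity, assuming orthographic projection along the line of sight and a circular texture element on a planar patch (the function `projected_element` is a hypothetical helper). The projected ellipse is described by a 2x2 shape matrix, which is identical for tilt and tilt + 180°.

```python
import numpy as np

def projected_element(slant, tilt):
    """Shape matrix of a unit circular texture element on a plane with the
    given slant and tilt, after orthographic projection onto the image."""
    # Image-plane component of the unit surface normal.
    n_xy = np.sin(slant) * np.array([np.cos(tilt), np.sin(tilt)])
    return np.eye(2) - np.outer(n_xy, n_xy)

slant = np.deg2rad(60.0)
tilt = np.deg2rad(25.0)

shape_a = projected_element(slant, tilt)
shape_b = projected_element(slant, tilt + np.pi)  # tilt flipped by 180 degrees

# Squared axis lengths of the projected ellipse (ascending order).
eigenvalues = np.linalg.eigvalsh(shape_a)
aspect_ratio = np.sqrt(eigenvalues[0] / eigenvalues[1])

print(np.allclose(shape_a, shape_b), aspect_ratio, np.cos(slant))
```

The aspect ratio of the ellipse equals the cosine of the slant, so compression recovers slant, but the two opposite tilts give the same image.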

Under perspective projection, isotropic compression (scale gradient) and convergence cues resolve the ambiguity. We experience the red line as lying on a ridge, not in a valley.
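A rough illustration of how a scale gradient can separate the two tilts, assuming a pinhole camera and a planar patch (the function `image_scale`, the mean depth, and the units are illustrative assumptions). Under perspective, image size falls off as 1 / depth, so the sign of the scale gradient shows which side of the surface is farther away.

```python
import numpy as np

def image_scale(position, slant, tilt, focal_length=1.0, mean_depth=5.0):
    """Isotropic image scale of a small texture element on a slanted plane.

    position: signed distance along the horizontal axis (hypothetical units).
    Under perspective projection, image size shrinks as 1 / depth.
    """
    depth = mean_depth + np.tan(slant) * np.cos(tilt) * position
    return focal_length / depth

slant = np.deg2rad(45.0)
positions = np.array([-1.0, 1.0])

# Tilt 0: the surface recedes to the right. Tilt 180 degrees: it recedes to the left.
scales_tilt_0 = image_scale(positions, slant, 0.0)
scales_tilt_180 = image_scale(positions, slant, np.pi)

gradient_tilt_0 = scales_tilt_0[1] - scales_tilt_0[0]
gradient_tilt_180 = scales_tilt_180[1] - scales_tilt_180[0]
print(gradient_tilt_0, gradient_tilt_180)
```

The two tilts give scale gradients of opposite sign, which is information that orthographic projection does not provide.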

Assumptions in Shape from Texture
– Homogeneous: the statistics of the texture are uniform from location to location. This is necessary to ensure that changes in the statistics of the texture observed in the image are due solely to the process of projection into the image plane and are not intrinsic to the texture itself.
– Isotropic: the texture does not have a dominant local orientation. This is necessary to ensure that anisotropic compressions are aligned with the depth gradient of the surface.

Illusory distortions of shape
Inspired by Todd & Thaler, VSS 05

Illusory distortions of shape

Interaction of light with surface

Matte Glossy Mirrored

Ambiguity between illumination and Shape

Classical Shape from Shading
Visual system estimates surface orientation from image intensity
[Figure labels: reflectance map, image]

Classical Shape from Shading
Problem: image intensity is ambiguous
– Image intensity is a scalar but surface orientation is a vector (two degrees of freedom)
– Recovering orientation from intensity is under-constrained
– A large amount of computer vision research proposes ways to reduce this ambiguity
[Figure label: reflectance map]
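A minimal sketch of the under-constraint, assuming a Lambertian reflectance map (intensity = normal · light). This model is an assumption made for illustration, since the slides do not commit to one. Every surface normal on a cone around the light direction produces the same intensity.

```python
import numpy as np

light = np.array([0.0, 0.0, 1.0])  # assumed unit light direction
angle = np.deg2rad(40.0)           # angle between surface normal and light

# Eight different unit normals, all at the same angle to the light.
azimuths = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
normals = np.stack([np.sin(angle) * np.cos(azimuths),
                    np.sin(angle) * np.sin(azimuths),
                    np.full_like(azimuths, np.cos(angle))], axis=-1)

# Lambertian shading: one scalar per normal.
intensities = normals @ light
print(intensities)
```

One measured intensity therefore leaves a one-parameter family of orientations, even when the light direction and reflectance are known.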

Classical Shape from Shading
Visual system estimates surface orientation from image intensity
Problem: reflectance map is unknown
– Circular logic: estimating the reflectance map requires knowing the geometry.
– Under typical viewing conditions, it is unclear how well subjects can estimate the reflectance map.
[Figure label: reflectance map]

Classical Shape from Shading
Visual system estimates surface orientation from image intensity
Problem: predicting human perception
– There is no principled way of predicting when human shape perception should succeed or fail.
– Successes are attributed to correct estimation of the reflectance map, errors to incorrect estimates of the reflectance map. But why and when should this occur?
[Figure label: reflectance map]

Alternative approach
– Use image measurements other than intensity
– Use the kinds of image measurements the visual system employs at the front end
[Figure labels: reflectance map, image]

Mirrors
– No stereopsis
– No diffuse shading
– No texture
Nothing but a distorted reflection of the world surrounding the object! Yet we perceive the 3D shape. How?
Fleming, Torralba & Adelson (2004). Journal of Vision.

Curvatures determine distortions
[Figure label: highly curved]

Curvatures determine distortions
Anisotropies in surface curvature lead to powerful distortions of the reflected world.
[Figure label: slightly curved]

Eigenvectors of Hessian matrix Intrinsic principal curvatures
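As a hedged illustration of the Hessian analysis named on this slide, the sketch below estimates the Hessian of a depth map by finite differences and takes its eigen-decomposition. The helper `hessian_eigen` and the test surface are illustrative assumptions. The Hessian eigenvalues match the principal curvatures exactly only where the depth gradient vanishes, and are an approximation elsewhere.

```python
import numpy as np

def hessian_eigen(depth, spacing, row, col):
    """Eigen-decomposition of the depth map's Hessian at pixel (row, col)."""
    dz_dy, dz_dx = np.gradient(depth, spacing)        # axis 0 is y, axis 1 is x
    d2z_dxdy, d2z_dx2 = np.gradient(dz_dx, spacing)
    d2z_dy2, _ = np.gradient(dz_dy, spacing)
    hessian = np.array([[d2z_dx2[row, col], d2z_dxdy[row, col]],
                        [d2z_dxdy[row, col], d2z_dy2[row, col]]])
    # eigh returns eigenvalues in ascending order, eigenvectors as columns.
    return np.linalg.eigh(hessian)

# Test surface: highly curved along x, slightly curved along y.
xs = np.linspace(-1.0, 1.0, 41)
spacing = xs[1] - xs[0]
x, y = np.meshgrid(xs, xs)
depth = 0.5 * (2.0 * x**2 + 0.2 * y**2)

eigenvalues, eigenvectors = hessian_eigen(depth, spacing, 20, 20)
print(eigenvalues)         # smallest and largest second derivative
print(eigenvectors[:, 1])  # direction of strongest curvature
```

At the centre of this surface the strongest curvature lies along x and the weakest along y, which is the kind of anisotropy the surrounding slides relate to image distortions.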

image depths

Population codes

Orientation fields Ground truth

3D shape appears to be conveyed by the continuously varying patterns of orientation across the image of a surface
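One simple way to measure local image orientation is the gradient structure tensor, sketched below. This is a swapped-in textbook technique used for illustration, not the filter-based population code shown in the slides, and the function `dominant_orientation` is a hypothetical helper. It returns the direction of strongest intensity change in a patch and a coherence value between 0 and 1.

```python
import numpy as np

def dominant_orientation(patch):
    """Dominant gradient orientation (radians, in [0, pi)) and coherence of a patch."""
    gy, gx = np.gradient(patch)  # axis 0 is y, axis 1 is x
    jxx, jyy, jxy = (gx * gx).mean(), (gy * gy).mean(), (gx * gy).mean()
    # Orientation of strongest intensity change; stripes run perpendicular to it.
    theta = 0.5 * np.arctan2(2.0 * jxy, jxx - jyy)
    energy = jxx + jyy
    coherence = np.hypot(jxx - jyy, 2.0 * jxy) / energy if energy > 0 else 0.0
    return theta % np.pi, coherence

# Test pattern: a sinusoidal grating whose intensity varies along 30 degrees.
x, y = np.meshgrid(np.arange(64), np.arange(64))
phi = np.deg2rad(30.0)
grating = np.sin(0.2 * (x * np.cos(phi) + y * np.sin(phi)))

theta, coherence = dominant_orientation(grating)
print(np.rad2deg(theta), coherence)
```

Applying the same measurement in a sliding window would give one orientation per image location, which is one possible reading of an "orientation field".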

Beyond specularity Specular reflection Diffuse reflection

Orientations in shading

Orientation fields in shading

Reflectance as Illumination Mirrors in an increasingly blurry world

highly curved

slightly curved Anisotropies in surface curvature lead to anisotropies in the image.

Texture vs. Reflectance

“Shape from Smear”

Higher level shape properties
Neither object is physically unstable (falling over).
But: one “affords being toppled” more than the other.

Perceived Shape is Multi-Scale Coarse Mid Fine

Perceived Shape is Multi-Scale
Mesh Saliency
Lee, C. H., Varshney, A. & Jacobs, D. W., Mesh saliency, in SIGGRAPH '05: ACM SIGGRAPH 2005 Papers, pp. 659-666 (New York, NY, USA: ACM, 2005). © ACM SIGGRAPH 2005, All rights reserved.

Perceived Shape is Multi-Scale
[Figure labels: coarse spatial scale, fine spatial scale]
Applications:
– Level of Detail
– Hiding Watermarks
– Viewpoint selection
Lee, C. H., Varshney, A. & Jacobs, D. W., Mesh saliency (2005). © ACM SIGGRAPH 2005, All rights reserved.

Conclusions
– There are many different cues to 3D shape, which the human visual system can draw on under typical viewing conditions.
– Most cues are ambiguous or unreliable if considered in isolation.
– The secret of conveying shape effectively is to provide multiple cues.
– Orientation fields may be an important common language in human shape processing. There are probably many other applications in CG that can exploit this.
– Many of the assumptions made by human vision can be exploited in computer graphics applications.
– Richer, more perceptual representations of geometry are an exciting challenge for the future.
