Visual Recognition Lecture 12

Recognition by Parts
“The whole is equal to the sum of its parts” (Euclid)

Main approaches to recognition:
Pattern recognition
Invariants
Alignment
Part decomposition
Functional description

Recognize!

“One of the most interesting aspects of the world is that it can be considered to be made up of patterns. A pattern is essentially an arrangement. It is characterized by the order of the elements of which it is made rather than by the intrinsic nature of the elements” (Norbert Wiener)

Nonsense Object
The description reflects the workings of a representational system
Segmentation occurs at regions of deep concavity
Parts are described with common volumetric terms
The manner of segmentation and analysis into components does not depend on our familiarity with the object

Issues
Why parts? Why partition the shape?
How does the visual system decompose shapes into parts?
Are parts chosen arbitrarily by the visual system?
How are the 3D parts of an object inferred from the 2D projection delivered by the eye?
Etc.

Between Speech and Object Recognition (OR)
The number of object categories rivals the number of words that can be identified from speech
Speech perception proceeds by identification of primitive elements: phonemes
A small set of primitives (44 in English), each with a handful of attributes
The representational power derives from combinations of the primitives

OR – The Visual Domain
Primitives: a modest number of simple geometric components
Generally convex and volumetric (cylinders, blocks, cones, etc.)
Segmentation at regions of sharp concavity
Primitives derive from combinations of a few qualitative characteristics of the edges in the 2D image (straight vs. curved, symmetry, etc.)
These particular properties of edges are invariant over changes in orientation and can be determined from just a few points on each edge
Tolerance for variations of viewpoint, occlusion, and noise
The representational power derives from the enormous number of combinations

Count vs. Mass Noun Objects
Categorization of isolated (unanticipated) objects
Modeling is limited to concrete entities with specified boundaries
Mass nouns (water, sand) do not have a simple volumetric description and are identified differently, primarily through surface characteristics (texture, color)

Unexpected Object Recognition
Is possible (not an obvious conclusion)
Can be done rapidly
When viewed from novel orientations
Under a moderate level of visual noise
When partially occluded
When it is a new exemplar of a category

Resulting Constraints
Access to the mental representation should not depend on absolute judgments of quantitative detail
The information that is the basis of recognition should be relatively invariant with respect to orientation and modest degradation
Partial matches should be computable

RBC: Recognition-By-Components
The contribution: a proposal for a particular vocabulary of components derived from perceptual mechanisms, and an account of how an arrangement of these components can access a representation of an object in memory

Issues
Stages up to and including the identification of components are assumed to be bottom-up
It is likely that top-down routes (e.g. from expectancy, object familiarity, scene constraints) will be observed at a number of the stages (e.g. segmentation, component definition, matching); these are omitted in the interests of simplicity
Matching of the components occurs in parallel
Partial matches are possible (the degree of match is proportional to the similarity in the components between image and representation)
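The idea of a graded partial match can be sketched in a few lines. This is only an illustration, not the RBC matching procedure itself: it assumes each object is reduced to a bag of geon labels, ignores the relations among geons, and scores a model by the fraction of its components found in the image. The function names, geon labels, and model entries are all hypothetical.

```python
from collections import Counter

def match_degree(image_geons, model_geons):
    """Fraction of the model's components that also appear in the image."""
    image_counts = Counter(image_geons)
    model_counts = Counter(model_geons)
    # Multiset intersection: each model geon can be matched at most once.
    shared = sum((image_counts & model_counts).values())
    total = sum(model_counts.values())
    return shared / total if total else 0.0

def rank_models(image_geons, models):
    """Score every stored model against the image, best match first."""
    scores = {name: match_degree(image_geons, geons) for name, geons in models.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical model database (labels are illustrative only).
MODELS = {
    "mug": ["cylinder", "curved_handle"],
    "table": ["block", "cylinder", "cylinder", "cylinder", "cylinder"],
}

# A partially occluded mug: only the body is visible.
ranking = rank_models(["cylinder"], MODELS)
```

With only the cylinder visible, the mug scores 0.5 and the table 0.2, so an occluded view still yields a usable ranking. A fuller sketch would also have to compare the arrangement of the components, since the same geons in different arrangements give different objects.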

Geons - Units of Representation
Segmentation into separate regions at points of deep concavity (particularly at cusps, where there are discontinuities in curvature)
Transversality: paired concavities arise whenever convex volumes are joined
Each segmented region is approximated by one of a possible set of simple components: geons (geometrical ions)
Geons can be modeled by generalized cones: the volume swept out by a cross section moving along an axis
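A generalized cone can be sketched as a cross section that is copied along an axis while a sweeping rule changes its size. The code below is a minimal illustration under simplifying assumptions: the axis is a straight segment along z, the cross section is a list of 2D points, and the function and parameter names are hypothetical.

```python
import math

def circle(n_points=8, radius=1.0):
    """Sample a circular cross section as a list of (x, y) points."""
    step = 2 * math.pi / n_points
    return [(radius * math.cos(k * step), radius * math.sin(k * step))
            for k in range(n_points)]

def generalized_cone(cross_section, axis_length, scale, n_slices=5):
    """Sweep a cross section along a straight z-axis.

    scale maps a position t in [0, 1] along the axis to a size multiplier,
    so a constant scale gives a cylinder and a shrinking one gives a cone.
    Requires n_slices >= 2.
    """
    points = []
    for i in range(n_slices):
        t = i / (n_slices - 1)
        s = scale(t)
        z = t * axis_length
        points.extend((s * x, s * y, z) for x, y in cross_section)
    return points

cylinder = generalized_cone(circle(), 2.0, lambda t: 1.0)      # constant size
cone = generalized_cone(circle(), 2.0, lambda t: 1.0 - t)      # shrinks to a point
```

Swapping the circle for a square cross section, or the constant scale for one that expands and then contracts, gives other members of the same family.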

Geons
Are hypothesized to be simple, typically symmetrical volumes lacking sharp concavities (e.g. blocks, cylinders, spheres)
Can be differentiated on the basis of perceptual properties in the 2D image that are readily detectable and relatively independent of viewing position and degradation (e.g. good continuation, symmetry)
Objects can be complex, but the units are simple and regular

Relations Among the Geons
The arrangement of primitives is necessary for representing a particular object
Different arrangements of the same components can lead to different objects

Perceptual Basis for RBC
Certain properties of edges in the 2D image are taken by the visual system as strong evidence that the 3D edges contain those same properties
Nonaccidental properties: properties that would only rarely be produced by accidental alignments of viewpoint and object features
Five nonaccidental properties:
Collinearity: a straight edge in the image is inferred to be straight in the 3D world as well
Curvilinearity: smoothly curved elements in the image are inferred to arise from smoothly curved features in the 3D world
Symmetry: the object projecting the image is inferred to be symmetrical as well
Parallelism
Cotermination
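Why collinearity counts as nonaccidental can be checked numerically. The sketch below assumes a simple pinhole camera at the origin looking down the z-axis (an assumption, not something the slides specify) and uses hypothetical helper names. A straight 3D edge always projects to a straight image line, while a bent 3D edge looks straight only from an accidental viewpoint, here one where the bend lies in a plane through the eye.

```python
def project(point, focal=1.0):
    """Pinhole projection of a 3D point (z > 0) onto the image plane."""
    x, y, z = point
    return (focal * x / z, focal * y / z)

def collinear_2d(a, b, c, tol=1e-9):
    """True if three image points lie on one line (twice the triangle area is ~0)."""
    area2 = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return abs(area2) < tol

# Three points on a straight 3D edge.
start, direction = (0.5, -1.0, 4.0), (1.0, 2.0, 0.5)
straight_edge = [tuple(s + t * d for s, d in zip(start, direction))
                 for t in (0.0, 1.0, 2.5)]

# A bent 3D edge seen from a general viewpoint.
bent_edge = [(0.0, 0.0, 4.0), (1.0, 1.0, 4.0), (2.0, 0.0, 4.0)]

# The same kind of bend, but lying in the plane y = 0 that contains the eye.
bent_edge_accidental = [(0.0, 0.0, 4.0), (1.0, 0.0, 5.0), (2.0, 0.0, 4.0)]

straight_looks_straight = collinear_2d(*[project(p) for p in straight_edge])
bent_looks_straight = collinear_2d(*[project(p) for p in bent_edge])
accidental_looks_straight = collinear_2d(*[project(p) for p in bent_edge_accidental])
```

Only the special viewpoint makes the bent edge look straight, which is why a straight image edge is treated as good evidence for a straight edge in the world.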

Nonaccidental Properties
Witkin & Tenenbaum (1983): a surface’s silhouette overrides the perceptual interpretation of the luminance gradient

Penrose Impossible Triangle
Cotermination – accidental alignment of the ends of noncoterminous segments

Müller-Lyer Illusion
Y, arrow, and L vertices allow inference as to the identity of the volume in the image

Generating Geons from GC
The primitives should be rapidly identifiable and invariant over viewpoint and noise
Differences among components are based on differences in nonaccidental properties
Variation over the nonaccidental relations of four attributes of generalized cones (GC) generates a set of 36 geons

Geon Set
The characteristics of the cross section: shape, symmetry, constancy of size along the axis (2 x 3 x 3)
The shape of the axis (x 2)
(Figures 6 and/or 7 go here)
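The count of 36 follows directly from the product 2 x 3 x 3 x 2. The sketch below enumerates the combinations. The slide gives only the number of values per attribute, so the value names are an assumption taken from the usual description of Biederman's geon set, and the dictionary keys are hypothetical.

```python
from itertools import product

# Attribute values are assumed; the slide states only how many there are.
ATTRIBUTES = {
    "cross_section_edge": ("straight", "curved"),
    "cross_section_symmetry": ("rotational_and_reflective", "reflective", "asymmetric"),
    "cross_section_size": ("constant", "expanded", "expanded_and_contracted"),
    "axis": ("straight", "curved"),
}

def enumerate_geons(attributes=ATTRIBUTES):
    """Return one dict per combination of attribute values."""
    names = list(attributes)
    return [dict(zip(names, values)) for values in product(*attributes.values())]

geons = enumerate_geons()
print(len(geons))  # 2 * 3 * 3 * 2 = 36
```

Each entry is a qualitative description only; no metric values are needed to tell one geon from another.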

Nonaccidental 2D Contrasts Among Geons
The values of the four attributes can be directly detected as differences in nonaccidental properties, e.g.:
Cross-section edges and curvature of the axis: collinearity or curvilinearity
Constant vs. expanding size of the cross section: parallelism
Specifying these is sufficient to uniquely classify a given arrangement of edges as one of the 36 geons

More Distinctive Nonaccidental Differences
The arrangement of vertices – a richer description

RBC - Summary
A specific set of primitives is derived from a small number of independent characteristics of the input
The perceptual system is designed to represent the free combination of a modest number of primitives based on simple perceptual contrasts
Geons are uniquely specified from their 2D image properties (so a 3D object-centered reconstruction is not needed)
The input is mapped onto this modest number of primitives; a representational system can then code and access free combinations of these primitives

RBC – General Principles
A line drawing that represents discontinuities is an efficient description and is sufficient for primal access
Objects are better represented and analyzed by decomposing them into their natural components: parts
A qualitative description of the components is necessary and sufficient to permit fast access to a database (DB) of object models
Nonaccidental instances of viewpoint-invariant features in the 2D line drawing are sufficient to permit fast access to the qualitative model of a 3D object
Primal access for visual OR is obtained by matching a description of the spatial structure of the components making up the object to an indexed DB of models in a similar representation

RBC – Computational Hypotheses
Five specific classes of 2D line groupings are sufficient to access the parts representation
Segmentation should happen at concavities in the outline of an object
The geons form an efficient qualitative shape representation for the parts, suitable for primal access
The symbolic description for objects and models should include geon labels, aspect ratios, and relative sizes of parts

Implementations
PARVO: Bergevin and Levine 1988
OPTICA: Dickinson, Rosenfeld, Pentland 1989
Munck-Fairwood 1991
Pentland and Sclaroff 1991
Raja and Jain 1992

Example - Recovering Geons using Superquadrics
Lamé curves (1818); superellipse (Hein 1960)
where p is an even positive integer and q is an odd positive integer (the slide's equation did not survive in this transcript)
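The conditions on p and q refer to an equation that is missing from the transcript. Assuming the slide followed the standard presentation of Lamé curves, with a rational exponent m = p/q, it probably read as follows. The symbols a, b, m, and the shape parameter \varepsilon are assumptions, not taken from the slide.

```latex
% Lamé curve with semi-axes a, b and rational exponent m
\left(\frac{x}{a}\right)^{m} + \left(\frac{y}{b}\right)^{m} = 1,
\qquad m = \frac{p}{q} > 0

% Superellipse, written with a shape parameter \varepsilon
\left(\frac{x}{a}\right)^{2/\varepsilon} + \left(\frac{y}{b}\right)^{2/\varepsilon} = 1
```

With p even and q odd, each term is real and nonnegative for every real x and y, so the curve is closed and symmetric about both axes. Setting \varepsilon = 1 gives an ordinary ellipse.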

Superellipse
From a star shape to a square in the limit

Superellipsoid
The 3D surface is obtained by the spherical product of two 2D curves
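The spherical product can be written out explicitly. The sketch below assumes the common parameterization in which both 2D curves are superellipses, with shape parameters e1 and e2, sizes a1, a2, a3, and angles eta in [-pi/2, pi/2] and omega in [-pi, pi). The function names are hypothetical.

```python
import math

def signed_pow(value, exponent):
    """|value| ** exponent with the sign of value, so odd symmetry is kept."""
    return math.copysign(abs(value) ** exponent, value)

def superellipsoid_point(eta, omega, size=(1.0, 1.0, 1.0), e1=1.0, e2=1.0):
    """Surface point from the spherical product of two superellipses.

    The curve in eta controls the shape along z (parameter e1); the curve
    in omega controls the shape of the x-y cross section (parameter e2).
    """
    a1, a2, a3 = size
    cos_eta = signed_pow(math.cos(eta), e1)
    sin_eta = signed_pow(math.sin(eta), e1)
    x = a1 * cos_eta * signed_pow(math.cos(omega), e2)
    y = a2 * cos_eta * signed_pow(math.sin(omega), e2)
    z = a3 * sin_eta
    return (x, y, z)

# With e1 = e2 = 1 the surface is an ordinary ellipsoid (here a unit sphere).
sphere_point = superellipsoid_point(0.3, 1.1)
# Small shape parameters push the surface toward a box.
boxy_point = superellipsoid_point(math.pi / 4, math.pi / 4, e1=0.1, e2=0.1)
```

Sampling eta and omega on a grid gives a mesh of the surface. Varying e1 and e2 moves the shape between box-like, ellipsoidal, and pinched forms.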

(Figure: superellipsoid shapes for the two shape parameters e1 and e2 at values 0.1, 1, and 2)

Superquadrics
Barr 1981: extension to include superhyperboloids (of one or two pieces) and supertoroids

Superquadrics in General Position
From world coordinates to SQ-centered coordinates (11 DOF)
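The 11 degrees of freedom are usually counted as 3 sizes, 2 shape parameters, 3 rotation angles, and 3 translations. The sketch below evaluates the standard superellipsoid inside-outside function for a point given in world coordinates. The Z-Y-Z Euler angle convention and all function names are assumptions made for illustration.

```python
import math

def rotate_z(v, angle):
    """Rotate vector v about the z-axis by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = v
    return (c * x - s * y, s * x + c * y, z)

def rotate_y(v, angle):
    """Rotate vector v about the y-axis by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = v
    return (c * x + s * z, y, -s * x + c * z)

def world_to_sq(point, rotation, translation):
    """Undo the pose R = Rz(phi) Ry(theta) Rz(psi) and the translation."""
    phi, theta, psi = rotation
    v = tuple(p - t for p, t in zip(point, translation))
    v = rotate_z(v, -phi)
    v = rotate_y(v, -theta)
    return rotate_z(v, -psi)

def inside_outside(point, size, shape, rotation=(0.0, 0.0, 0.0),
                   translation=(0.0, 0.0, 0.0)):
    """Return F < 1 inside, F = 1 on the surface, F > 1 outside."""
    a1, a2, a3 = size          # 3 size parameters
    e1, e2 = shape             # 2 shape parameters
    x, y, z = world_to_sq(point, rotation, translation)  # 3 + 3 pose parameters
    xy_term = (abs(x / a1) ** (2 / e2) + abs(y / a2) ** (2 / e2)) ** (e2 / e1)
    return xy_term + abs(z / a3) ** (2 / e1)

# A unit sphere (e1 = e2 = 1) centered at (1, 2, 3), arbitrarily rotated.
pose = dict(size=(1.0, 1.0, 1.0), shape=(1.0, 1.0),
            rotation=(0.3, 0.4, 0.5), translation=(1.0, 2.0, 3.0))
at_center = inside_outside((1.0, 2.0, 3.0), **pose)
on_surface = inside_outside((2.0, 2.0, 3.0), **pose)
far_away = inside_outside((5.0, 5.0, 5.0), **pose)
```

Recovering a part from range data then amounts to searching for the 11 parameters that bring F close to 1 at the measured surface points.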

Issues
Domain: suitable mainly for categorization.
Problems:
Extracting parts from the image is often difficult and unreliable.
Many objects cannot be distinguished by their part structure only.
Metric information is essential in many cases.
