Recognition by Parts Visual Recognition Lecture 12 “The whole is equal to the sum of its parts” Euclid.

Main approaches to recognition:
- Pattern recognition
- Invariants
- Alignment
- Part decomposition
- Functional description

Recognize!

“One of the most interesting aspects of the world is that it can be considered to be made up of patterns. A pattern is essentially an arrangement. It is characterized by the order of the elements of which it is made rather than by the intrinsic nature of the elements.” Norbert Wiener

Nonsense Object
- The description reflects the workings of a representational system
- Segmentation occurs at regions of deep concavity
- Parts are described with common volumetric terms
- The manner of segmentation and analysis into components does not depend on our familiarity with the object

Issues
- Why parts? Why partition the shape?
- How does the visual system decompose shapes into parts?
- Are parts chosen arbitrarily by the visual system?
- How are the 3D parts of an object inferred from the 2D projection delivered by the eye?
- Etc.

Between Speech and Object Recognition (OR)
- The number of object categories rivals the number of words that can be identified from speech
- Speech perception works by identification of primitive elements: phonemes
- A small set of primitives (44 in English), each with a handful of attributes
- The representational power derives from combinations of the primitives

OR – The Visual Domain
- Primitives: a modest number of simple geometric components
- Generally convex and volumetric (cylinders, blocks, cones, etc.)
- Segmentation at regions of sharp concavity
- Primitives derive from combinations of a few qualitative characteristics of the edges in the 2D image (straight vs. curved, symmetry, etc.)
- These particular properties of edges are invariant over changes in orientation and can be determined from just a few points on each edge
- Tolerance for variations in viewpoint, occlusion, and noise
- The representational power derives from the enormous number of combinations

Count vs. Mass Noun Objects
- Categorization of isolated (unanticipated) objects
- Modeling is limited to concrete entities with specified boundaries
- Mass nouns (water, sand) do not have a simple volumetric description and are identified differently, primarily through surface characteristics (texture, color)

Unexpected Object Recognition
- Is possible (not an obvious conclusion)
- Can be done rapidly
- When viewed from novel orientations
- Under a moderate level of visual noise
- When partially occluded
- When it is a new exemplar of a category

Resulting Constraints
- Access to the mental representation should not depend on absolute judgment of quantitative detail
- The information that is the basis of recognition should be relatively invariant with respect to orientation and modest degradation
- Partial matches should be computable

RBC: Recognition-By-Components
- The contribution: a proposal for a particular vocabulary of components derived from perceptual mechanisms, and an account of how an arrangement of these components can access a representation of an object in memory

Issues
- Stages up to and including the identification of components are assumed to be bottom-up
- It is likely that top-down routes (e.g. from expectancy, object familiarity, scene constraints) will be observed at a number of the stages (e.g. segmentation, component definition, matching); they are omitted in the interests of simplicity
- Matching of the components occurs in parallel
- Partial matches are possible (the degree of match is proportional to the similarity in the components between image and representation)
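A minimal sketch of what a graded partial match could look like. It assumes, purely for illustration, that an object model is a set of component labels and that similarity is measured by Jaccard overlap; RBC does not prescribe this formula, and the names `match_score`, `best_match`, and the toy `models` table are hypothetical. The arrangement of the components is ignored here.

```python
def match_score(image_components, model_components):
    """Jaccard overlap between two collections of component labels.

    This is an assumed similarity measure, not one specified by RBC.
    """
    image_set, model_set = set(image_components), set(model_components)
    if not image_set and not model_set:
        return 0.0
    return len(image_set & model_set) / len(image_set | model_set)


def best_match(image_components, models):
    """Return the name of the stored model with the highest overlap score."""
    return max(models, key=lambda name: match_score(image_components, models[name]))


# Toy object memory with made-up component labels.
models = {
    "mug": {"cylinder", "curved_handle"},
    "briefcase": {"brick", "curved_handle"},
    "lamp": {"cone", "cylinder", "brick"},
}

# Only one component was recovered from the image (e.g. the handle is occluded).
seen = ["cylinder"]
scores = {name: match_score(seen, parts) for name, parts in models.items()}
winner = best_match(seen, models)
```

With a single recovered cylinder, the mug still scores highest, which is the behavior a partial-match scheme is meant to provide under occlusion.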

Geons - Units of Representation
- Segmentation into separate regions at points of deep concavity (particularly at cusps, where there are discontinuities in curvature)
- Transversality: paired concavities arise whenever convex volumes are joined
- Each segmented region is approximated by one of a possible set of simple components: geons (geometrical ions)
- Geons can be modeled by generalized cones: the volume swept out by a cross section moving along an axis
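The concavity rule can be illustrated on a 2D outline. The sketch below assumes the silhouette is available as a simple polygon (a list of `(x, y)` vertices) and flags the vertices where the outline turns inward; the function name `concave_vertices` and the example shape are hypothetical, and a real system would work on noisy curved contours rather than clean polygons.

```python
def concave_vertices(polygon):
    """Return indices of the concave (inward-turning) vertices of a simple polygon."""
    n = len(polygon)
    # Twice the signed area (shoelace formula): positive means counter-clockwise order.
    area2 = sum(
        polygon[i][0] * polygon[(i + 1) % n][1] - polygon[(i + 1) % n][0] * polygon[i][1]
        for i in range(n)
    )
    orientation = 1 if area2 > 0 else -1
    concave = []
    for i in range(n):
        (x0, y0), (x1, y1), (x2, y2) = polygon[i - 1], polygon[i], polygon[(i + 1) % n]
        # z-component of the cross product of the incoming and outgoing edges.
        cross = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1)
        if cross * orientation < 0:
            concave.append(i)
    return concave


# Outline of a narrow block sitting on a wide block (two convex parts joined).
outline = [(0, 0), (4, 0), (4, 1), (3, 1), (3, 3), (1, 3), (1, 1), (0, 1)]
cut_points = concave_vertices(outline)  # [3, 6]
```

The two flagged vertices, (3, 1) and (1, 1), form the pair of concavities at the join between the two blocks, so a cut between them separates the outline into its two convex parts.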

Geons
- Are hypothesized to be simple, typically symmetrical volumes lacking sharp concavities (e.g. blocks, cylinders, spheres)
- Can be differentiated on the basis of perceptual properties in the 2D image that are readily detectable and relatively independent of viewing position and degradation (e.g. good continuation, symmetry)
- Objects can be complex, but the units are simple and regular

Relations Among the Geons
- The arrangement of primitives is necessary for representing a particular object
- Different arrangements of the same components can lead to different objects
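A small sketch of why the arrangement matters. It assumes a structural description can be written as a set of (component, relation, component) triples; the labels, the relation names, and the cup/pail pair are illustrative choices, not a format taken from the slides.

```python
# Hypothetical structural descriptions: each object is a set of
# (component, relation, component) triples. Labels are illustrative only.
cup = frozenset({("curved_handle", "attached_to_side_of", "cylinder")})
pail = frozenset({("curved_handle", "attached_to_top_of", "cylinder")})


def components(description):
    """Collect the component labels used in a structural description."""
    return {label for first, _, second in description for label in (first, second)}


same_parts = components(cup) == components(pail)  # both use a cylinder and a curved handle
same_object = cup == pail                         # but the arrangement differs
```

Here `same_parts` is true while `same_object` is false: the two descriptions share every component and differ only in the relation between them.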

Perceptual Basis for RBC
- Certain properties of edges in 2D are taken by the visual system as strong evidence that the 3D edges contain those same properties
- Nonaccidental properties: would only rarely be produced by accidental alignments of viewpoint and object features
- Five nonaccidental properties:
  - Collinearity – the edge in the 3D world is also straight
  - Curvilinearity – smoothly curved elements in the image are inferred to arise from smoothly curved features in the 3D world
  - Symmetry – the object projecting the image is also symmetrical
  - Parallelism
  - Cotermination
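Two of these properties, parallelism and cotermination, are easy to test on extracted 2D line segments. The sketch below assumes segments are given as pairs of `(x, y)` endpoints; the function names and the tolerance values (5 degrees, 2 pixels) are arbitrary assumptions for illustration, not values from the lecture.

```python
import math


def direction(segment):
    """Undirected orientation of a segment, as an angle in [0, pi)."""
    (x0, y0), (x1, y1) = segment
    return math.atan2(y1 - y0, x1 - x0) % math.pi


def are_parallel(seg_a, seg_b, tol_deg=5.0):
    """True if the two segments' orientations differ by at most tol_deg degrees."""
    diff = abs(direction(seg_a) - direction(seg_b))
    diff = min(diff, math.pi - diff)  # orientations wrap around at pi
    return math.degrees(diff) <= tol_deg


def coterminate(seg_a, seg_b, tol=2.0):
    """True if some endpoint of seg_a lies within tol of some endpoint of seg_b."""
    return any(math.dist(p, q) <= tol for p in seg_a for q in seg_b)


bottom = ((0, 0), (10, 0))
top = ((0, 5), (10, 5.2))       # nearly parallel to bottom
side = ((10, 1), (10, 10))      # starts close to the right end of bottom

parallel_pair = are_parallel(bottom, top)
corner = coterminate(bottom, side)
```

The tolerances encode the nonaccidental assumption: small deviations are treated as noise rather than as evidence that the 3D edges are not parallel or do not meet.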

Nonaccidental Properties
- Witkin & Tenenbaum (1983): a surface's silhouette overrides the perceptual interpretation of the luminance gradient

Penrose Impossible Triangle

Cotermination – accidental alignment of the ends of noncoterminous segments

Müller-Lyer Illusion

Y, arrow, and L vertices allow inference as to the identity of the volume in the image

Generating Geons from Generalized Cones (GC)
- The primitives should be rapidly identifiable and invariant over viewpoint and noise
- Differences among components are based on differences in nonaccidental properties
- Variation over the nonaccidental relations of four attributes of GC generates a set of 36 geons

Geon Set
- The characteristics of the cross section: shape, symmetry, constancy of size along the axis (2 x 3 x 3)
- The shape of the axis (x 2)
- [Figures 6 and/or 7 here]
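The count of 36 follows directly from the product 2 x 3 x 3 x 2. The sketch below enumerates the combinations; the attribute value names are assumptions based on the usual statement of the theory (Biederman 1987) and are not listed on the slide itself.

```python
from itertools import product

# Assumed attribute values; the slide gives only the counts (2, 3, 3, 2).
CROSS_SECTION_EDGES = ("straight", "curved")
CROSS_SECTION_SYMMETRY = ("rotational_and_reflective", "reflective_only", "asymmetric")
CROSS_SECTION_SIZE = ("constant", "expanding", "expanding_then_contracting")
AXIS_SHAPE = ("straight", "curved")

# Every geon is one choice of value for each of the four attributes.
geons = list(product(CROSS_SECTION_EDGES, CROSS_SECTION_SYMMETRY,
                     CROSS_SECTION_SIZE, AXIS_SHAPE))
geon_count = len(geons)  # 2 * 3 * 3 * 2 = 36

# Example: a cylinder-like geon has a curved, fully symmetric cross section
# of constant size swept along a straight axis.
cylinder_like = ("curved", "rotational_and_reflective", "constant", "straight")
```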

Nonaccidental 2D Contrasts Among Geons
- The values of the 4 attributes can be directly detected as differences in nonaccidental properties, e.g.:
  - Cross-section edges and curvature of the axis – collinearity or curvilinearity
  - Constant vs. expanding size of the cross section – parallelism
- Specification of the above is sufficient to uniquely classify a given arrangement of edges as one of the 36 geons

More Distinctive Nonaccidental Differences
- The arrangement of vertices – a richer description

RBC - Summary
- A specific set of primitives is derived from a small number of independent characteristics of the input
- The perceptual system is designed to represent the free combination of a modest number of primitives based on simple perceptual contrasts
- Geons are uniquely specified from their 2D image properties (so a 3D object-centered reconstruction is not needed)
- The input is mapped onto this modest number of primitives; a representational system is then used to code and access free combinations of these primitives

RBC – General Principles
- A line drawing which represents discontinuities is an efficient description and is sufficient for primal access
- Objects are better represented and analyzed by decomposing them into their natural components: parts
- A qualitative description of the components is necessary and sufficient to permit fast access to a DB of object models
- Non-accidental instances of viewpoint-invariant features in the 2D line drawing are sufficient to permit fast access to the qualitative model of a 3D object
- Primal access for visual OR is obtained by matching a description of the spatial structure of the components making up the object to an indexed DB of models in a similar representation

RBC – Computational Hypotheses
- Five specific classes of 2D line groupings are sufficient to access the parts representation
- Segmentation should happen at concavities in the outline of an object
- The geons form an efficient qualitative shape representation for the parts, suitable for primal access
- The symbolic description for objects and models should include geon labels, aspect ratios, and relative sizes of parts

Implementations
- PARVO – Bergevin and Levine 1988
- OPTICA – Dickinson, Rosenfeld, Pentland 1989
- Munck-Fairwood 1991
- Pentland and Sclaroff 1991
- Raja and Jain 1992

Example - Recovering Geons using Superquadrics
- Lamé curves (1818)
- Superellipse (Hein 1960), where p is an even positive integer and q is an odd positive integer
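For reference, the standard way of writing a Lamé curve, of which the superellipse is a special case, is given below. The symbols a, b (semi-axes) and m (exponent) are assumptions here, since the slide text names only p and q.

```latex
\left(\frac{x}{a}\right)^{m} + \left(\frac{y}{b}\right)^{m} = 1,
\qquad m = \frac{p}{q} > 0
```

Because p is even and q is odd, the m-th power of a negative coordinate is well defined and non-negative, so the curve is symmetric about both axes.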

Superellipse
- From a star shape to a square in the limit

Superellipsoid
- A 3D surface is obtained by the spherical product of two 2D curves
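The surface produced by the spherical product of two superellipses has a well-known implicit form, often called the inside-outside function. The sketch below evaluates it in the superellipsoid's own coordinate frame; the parameter names `a1`, `a2`, `a3` (sizes) and `e1`, `e2` (shape exponents) follow the common convention in the superquadrics literature and are assumptions as far as this lecture is concerned.

```python
def superellipsoid_f(x, y, z, a1=1.0, a2=1.0, a3=1.0, e1=1.0, e2=1.0):
    """Inside-outside function of a superellipsoid centered at the origin.

    Returns a value < 1 for points inside, 1 on the surface, > 1 outside.
    Absolute values keep fractional powers of negative coordinates real.
    """
    xy_term = (abs(x / a1) ** (2.0 / e2) + abs(y / a2) ** (2.0 / e2)) ** (e2 / e1)
    z_term = abs(z / a3) ** (2.0 / e1)
    return xy_term + z_term


corner = (0.9, 0.9, 0.9)
sphere_value = superellipsoid_f(*corner)                  # e1 = e2 = 1: unit sphere
box_value = superellipsoid_f(*corner, e1=0.1, e2=0.1)     # small exponents: box-like shape
```

The same point near the corner of the unit cube lies outside the sphere (`sphere_value` is greater than 1) but inside the box-like shape (`box_value` is less than 1), which shows how the two exponents alone change the shape from round to squared-off.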

[Figure: superellipsoid shapes for e1 = 0.1, 1, 2 and e2 = 0.1, 1, 2]

Superquadrics
- Barr (1981): extension to include superhyperboloids (of one or two pieces) and supertoroids

Superquadrics in General Position
- From world coordinates to SQ-centered coordinates (11 DOF)
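The 11 degrees of freedom are usually counted as 3 size parameters, 2 shape exponents, 3 rotation angles, and 3 translation components. The sketch below covers the six pose parameters only, mapping a world point into the superquadric's own frame. The Z-Y-Z Euler angle convention and all function and parameter names are assumptions chosen for illustration.

```python
import math


def rot_z(angle):
    """Rotation matrix for a rotation about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_y(angle):
    """Rotation matrix for a rotation about the y axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def world_to_sq(point, phi, theta, psi, px, py, pz):
    """Express a world point in the superquadric-centered frame.

    The pose is R = Rz(phi) Ry(theta) Rz(psi) with translation (px, py, pz).
    R is orthonormal, so its inverse is its transpose: local = R^T (point - t).
    """
    r = matmul(matmul(rot_z(phi), rot_y(theta)), rot_z(psi))
    d = (point[0] - px, point[1] - py, point[2] - pz)
    return tuple(sum(r[k][i] * d[k] for k in range(3)) for i in range(3))


# A superquadric turned 90 degrees about z: its x axis points along world y.
local = world_to_sq((0.0, 1.0, 0.0), math.pi / 2, 0.0, 0.0, 0.0, 0.0, 0.0)
```

Once a point is expressed in the local frame, the remaining five parameters (sizes and shape exponents) determine whether it lies inside, on, or outside the surface.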

Issues
- Domain: suitable mainly for categorization.
- Problems:
  - Extracting parts from the image is often difficult and unreliable.
  - Many objects cannot be distinguished by their part structure only.
  - Metric information is essential in many cases.
