Recognizing hand-drawn images using shape context Gyozo Gidofalvi Computer Science and Engineering Department University of California, San Diego November 29, 2001
Shape Context by Mori et al. Key idea: represent an image in terms of descriptors at certain locations that describe the image relative to those locations. The shape context of a point is the histogram of the relative positions of all other points in the image. Use bins that are uniform in log-polar space to emphasize close-by, local structure.
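A minimal NumPy sketch of such a descriptor for one point; the bin counts and radial limits below are illustrative assumptions, not the exact settings used by Mori et al.

import numpy as np

def shape_context(points, idx, n_r=5, n_theta=12, r_inner=0.125, r_outer=2.0):
    """Log-polar histogram of the positions of all other points relative to
    points[idx]. Radial edges are log-spaced, so nearby structure gets finer
    resolution (parameter values are assumptions for illustration)."""
    p = points[idx]
    others = np.delete(points, idx, axis=0)
    d = others - p                                # vectors to all other points
    r = np.hypot(d[:, 0], d[:, 1])
    theta = np.arctan2(d[:, 1], d[:, 0]) % (2 * np.pi)
    r = r / r.mean()                              # normalize for scale invariance
    r_edges = np.logspace(np.log10(r_inner), np.log10(r_outer), n_r + 1)
    t_edges = np.linspace(0, 2 * np.pi, n_theta + 1)
    hist, _, _ = np.histogram2d(r, theta, bins=[r_edges, t_edges])
    return hist.ravel()                           # n_r * n_theta bin descriptor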
Representative shape context: efficient retrieval of similar shapes by Mori et al. Matching: given two images, each represented as n shape context descriptors, we want to find a one-to-one assignment of these descriptors such that the χ² distance of the assignment is minimized, an O(n³) algorithm. Fast pruning (a sketch follows below):
1. Represent the query image by a small number of shape context descriptors.
2. To compute the cost of a match between the query image and an image in the DB, perform a nearest-neighbor search.
3. Return a short list of the first K best matches.
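A minimal Python sketch of the χ² cost and the pruning step; representing the DB as (name, descriptor_array) pairs and using a brute-force nearest-neighbor search are assumptions for illustration.

import numpy as np

def chi2_cost(g, h, eps=1e-10):
    """Chi-squared distance between two shape context histograms."""
    return 0.5 * np.sum((g - h) ** 2 / (g + h + eps))

def prune(query_descs, db, k=10):
    """Fast pruning sketch: each of the few representative descriptors of the
    query is matched to its nearest descriptor in every DB image; the summed
    cost ranks the DB and the best k images form the short list.
    `db` is assumed to be a list of (name, descriptor_array) pairs."""
    scores = []
    for name, descs in db:
        cost = sum(min(chi2_cost(q, d) for d in descs) for q in query_descs)
        scores.append((cost, name))
    scores.sort()
    return [name for _, name in scores[:k]]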
Hand-drawn images. Mori et al. tested the representative shape context method on the Snodgrass and Vanderwart line drawings. Queries were distorted versions of the original images. We gathered 6 sets of samples for these line drawings and used them as queries. All images were cropped and scaled to 500 by 500 pixels.
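A small sketch of that cropping-and-scaling step with Pillow, assuming the drawings are dark ink on a light background; the threshold of 128 is an assumption.

from PIL import Image

def preprocess(path, size=500):
    """Crop a scanned drawing to the bounding box of its ink pixels and scale
    it to size x size."""
    img = Image.open(path).convert("L")
    mask = img.point(lambda v: 255 if v < 128 else 0)   # ink pixels
    box = mask.getbbox()                                # bounding box of the drawing
    return img.crop(box).resize((size, size))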
Performance on hand-drawn images. Results of 300 queries for varying lengths of the returned short list. Pruning factor = (number of images in DB) / (length of the short list).
Sampling shape context. Can we improve performance? The shape contexts of which points should represent an image? Pixel-density based sampling: should we promote points with higher or with lower densities? [Figure: sampling strategies: spread out, promote higher density, promote lower density]
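A sketch of pixel-density based sampling, assuming the image has already been reduced to a set of edge points; the neighborhood radius and the inverse-density weighting for the low-density case are assumptions.

import numpy as np

def density_sample(points, n_samples, radius=20.0, promote="high", rng=None):
    """Pick the points whose shape contexts will represent the image, weighting
    each point by the number of other points within `radius` pixels.
    promote="high" favors dense regions, promote="low" favors sparse ones."""
    rng = np.random.default_rng() if rng is None else rng
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    density = (d < radius).sum(axis=1).astype(float)    # local point counts
    w = density if promote == "high" else 1.0 / density
    w = w / w.sum()
    idx = rng.choice(len(points), size=n_samples, replace=False, p=w)
    return points[idx]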
Density-based sampling results. [Figure: retrieval results when promoting points with higher pixel-densities vs. promoting points with lower pixel-densities]
Finding embedded objects. Task: given a query image that may contain some clutter around a hand-drawn object, find the objects in the DB that are most similar to it. How does the presence of clutter affect the recognition of the hand-drawn object? To obtain images of embedded objects we find the outline of the object, construct a binary mask for it, and copy the clutter around the object using logical operations (AND, OR). Finding the outline of objects is done using a method similar to flood-fill.
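A sketch of the embedding step on binary images; treating both inputs as boolean ink masks is an assumption, and scipy's hole filling stands in here for the flood-fill-like outline extraction described above.

import numpy as np
from scipy.ndimage import binary_fill_holes

def embed_in_clutter(obj, clutter):
    """Copy the clutter around the object: build a binary mask of the object's
    outline plus interior, keep the clutter only outside that mask, and OR the
    object back in. Both inputs are boolean arrays with True = ink."""
    mask = binary_fill_holes(obj)        # object outline and interior
    return (clutter & ~mask) | obj       # clutter outside the mask, object inside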
Embedding objects into some clutter
Acknowledgments. Thanks to Mori et al. for providing me with source code for the shape context matching. Thanks to Serge for guidance and ideas. Special thanks to my Dad, my brother, and Hector for drawing me sample objects.