Presentation transcript:

Chaincode Generation
Input: Image
Output: Chaincode contour
- Contour separation extracted by algorithm
- The chaincode contour is represented as an array of coordinates and corresponding slopes (0..7) at each contour point
- Eight contour directions
- Fields stored per contour point: Y, X, Status, Slope, Mode, Curvature
- Chain layout: data, information, data, information, ..., end of chain
CEDAR
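As a rough illustration of the representation described above, the sketch below turns an ordered, closed list of contour coordinates into (x, y, slope) triples. The slide only says that slopes run from 0 to 7; the direction numbering used here (0 = east, then counter-clockwise, with image rows growing downward) and the names `DIRECTION_CODES` and `contour_to_chaincode` are assumptions made for the example.

```python
# Assumed numbering (not given on the slide): 0 = east, then counter-clockwise
# in 45-degree steps, with image rows growing downward.
DIRECTION_CODES = {
    (1, 0): 0, (1, -1): 1, (0, -1): 2, (-1, -1): 3,
    (-1, 0): 4, (-1, 1): 5, (0, 1): 6, (1, 1): 7,
}


def contour_to_chaincode(points):
    """Return (x, y, slope) for each point of a closed 8-connected contour.

    The slope of a point is the direction code of the step to the next point;
    the last point steps back to the first one.
    """
    chain = []
    for index, (x, y) in enumerate(points):
        next_x, next_y = points[(index + 1) % len(points)]
        step = (next_x - x, next_y - y)
        if step not in DIRECTION_CODES:
            raise ValueError(f"point {index} and its successor are not 8-neighbors")
        chain.append((x, y, DIRECTION_CODES[step]))
    return chain


# A 2x2 pixel square traversed as right, down, left, up.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(contour_to_chaincode(square))
```

The Status, Mode, and Curvature fields listed on the slide are left out because the transcript does not say how they are computed.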

Algorithm
Input:
- Array of bytes representing pixels
- Value 0 for black and 255 for white
Steps:
- Start at upper right corner of image
- Travel to the left, then down a row, until you move from a white pixel to a black pixel
- Is it marked? If not, it is a new object, so mark the pixel and store it
- Travel counter-clockwise around the boundary, storing visited pixels and marking pixels as necessary, until you return to the start of the contour
- At lower left corner? If yes, produce the output; if no, continue the scan
Output: Contour representation of the image
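The scanning part of this algorithm (everything before the boundary walk) can be sketched as follows. This is only an illustration: the function name `find_contour_start`, the use of a Python set for marked pixels, and the choice to treat the area beyond the right edge as white are assumptions, and the counter-clockwise boundary walk itself is not shown.

```python
BLACK, WHITE = 0, 255


def find_contour_start(image, marked):
    """Scan rows top to bottom, each row from right to left.

    Returns (row, col) of the first unmarked black pixel that is entered from
    a white pixel, or None once the lower left corner is reached without one.
    Pixels beyond the right edge are treated as white (an assumption).
    """
    for row, pixels in enumerate(image):
        previous = WHITE
        for col in range(len(pixels) - 1, -1, -1):
            current = pixels[col]
            if previous == WHITE and current == BLACK and (row, col) not in marked:
                return (row, col)
            previous = current
    return None


image = [
    [255, 255, 255, 255],
    [255, 0, 0, 255],
    [255, 255, 255, 255],
]
print(find_contour_start(image, marked=set()))      # (1, 2)
print(find_contour_start(image, marked={(1, 2)}))   # None
```

A full tracer would call this repeatedly, walking and marking each boundary it finds before resuming the scan.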

Pre-scan Digit Recognition
Input: Chaincode contour of connected components in address block
- Use a fast digit recognizer (POLY or CP) on each appropriate component in the address block
Output: Recognition choice with confidence on each component
- Confidence of characters is typically low
- Confidence of "real" numerals is typically high

POLY Digit Recognizer
Method
- 1240 binary pixel pair features used
- Linear discriminant classifier used
Performance
- 1000 digits per second on an RS 6000
- 94% recognition rate on a standard test set
- Useful in separating alpha characters and numerals
Feature Extraction
- Set of 1240 binary (on, off) features
- Features are based on whether particular pixels or pairs of pixels are BLACK
- Pixel pairs are empirically determined
- Consider distinguishing "7" and "2"

Classification
- Uses linear discriminant functions
- Training: 1241 weights (one for each of the 1240 features, plus a constant) are determined for each of the 10 classes
- Testing: for each new test image do the following:
  - For each digit class (0..9) create a sum consisting of all the weights corresponding to a feature that is "on", then add in the constant
  - Compare the 10 sums and choose the largest value
  - This is the top choice class
- Output: Ranked list of the 10 classes sorted by the sums
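The testing step amounts to a sum of selected weights per class followed by a sort. A minimal sketch, assuming the weights are stored as one list per class and the "on" features are given as indices (the names `rank_classes`, `weights`, and `constants` are illustrative, and the tiny numbers below are made up):

```python
def rank_classes(on_features, weights, constants):
    """Rank classes by the sum of weights of the "on" features plus a constant.

    on_features: indices of the features that are "on" for the test image
    weights[k][f]: weight of feature f for class k
    constants[k]: constant term for class k
    """
    sums = []
    for k in range(len(weights)):
        total = constants[k] + sum(weights[k][f] for f in on_features)
        sums.append((total, k))
    # Largest sum first; ties are broken by the lower class index.
    sums.sort(key=lambda pair: (-pair[0], pair[1]))
    return [k for _, k in sums]


# Toy example with 2 classes and 3 features instead of 10 classes and 1240.
weights = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
constants = [0.5, 0.0]
print(rank_classes([0, 1], weights, constants))  # [1, 0]
```

The first entry of the returned list is the top choice class.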

CP Digit Recognizer
Method
- Combines a 3-layer back propagation neural network classifier using Curvature features with POLY
- The top 2 choices of POLY and the top 2 choices of the Curvature recognizer are combined using logistic regression
Performance
- 170 digits per second on an RS 6000
- 96% recognition rate on a standard test set

Curvature Digit Recognizer
Input
- Binary image of digit, size normalized by imposing a 4x4 grid on the image
- Since the features are region based (as opposed to pixel based), this form of size normalization is effective

Feature Extraction
- Set of 296 real-valued features
- 208 based on contour shape (slope and curvature)
  - For each of the 16 regions in the 4x4 grid determine:
    - percent of pixels with each of the 8 possible slopes
    - percent of pixels with each of 5 ranges of curvature, computed over a 12-pixel neighborhood window
Figure: Slopes and Curvatures (labels S and dS)
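The slope part of the contour-shape features can be sketched as a per-region histogram. The sketch assumes a uniform 4x4 grid, chain entries given as (x, y, slope) triples, and fractions taken relative to the contour pixels inside each region; none of these details is spelled out on the slide, and the 5 curvature ranges are omitted.

```python
def slope_fractions(chain, width, height, grid=4, slopes=8):
    """Fraction of contour pixels with each slope, for each grid region.

    chain: iterable of (x, y, slope) with 0 <= x < width, 0 <= y < height
    Returns grid*grid lists of `slopes` fractions, regions in row-major order.
    """
    counts = [[0] * slopes for _ in range(grid * grid)]
    for x, y, slope in chain:
        region = (y * grid // height) * grid + (x * grid // width)
        counts[region][slope] += 1
    features = []
    for region_counts in counts:
        total = sum(region_counts)
        features.append([c / total if total else 0.0 for c in region_counts])
    return features


chain = [(0, 0, 0), (1, 0, 0), (0, 1, 2), (7, 7, 4)]
features = slope_fractions(chain, width=8, height=8)
print(features[0])   # region 0: two thirds slope 0, one third slope 2
print(features[15])  # region 15: all slope 4
```

With 8 slopes and 5 curvature ranges per region, 16 regions give the 16 x 13 = 208 contour-shape features stated above.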

- 84 based on stroke transitions between regions
  - Chaincode represents the contour as a sequence of boundary pixels, so there is a notion of "moving" from one part of the image to another
  - In a 4x4 grid, there are 84 possible transitions
- 4 based on size, location, and number of interior contours
  - Image is divided into 3 regions: UPPER, MIDDLE, and LOWER
  - Determine the center of the region bounded by interior contours
  - Location of the center determines which of 3 features is set
  - Value of the feature is the ratio of "hole" area to area of the bounding box
  - Last feature stores the number of interior contours present
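The slide does not say how the 84 transitions are counted. One reading that reproduces the number, offered here as an assumption, is to count ordered moves from a region to any of its 8-connected neighbor regions:

```python
def count_transitions(rows=4, cols=4):
    """Count ordered moves between 8-connected neighboring grid regions."""
    count = 0
    for r in range(rows):
        for c in range(cols):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if (dr, dc) == (0, 0):
                        continue
                    if 0 <= r + dr < rows and 0 <= c + dc < cols:
                        count += 1
    return count


print(count_transitions())  # 84
```

Under this reading, the 4 corner regions contribute 3 moves each, the 8 edge regions 5 each, and the 4 interior regions 8 each: 12 + 40 + 32 = 84.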

Classification
- Uses a 3-layer back propagation neural network
  - 296 input nodes for feature values
  - 80 hidden nodes
  - 10 output nodes (1 for each digit class)
- Connections between nodes have associated weights determined during training
- Output node reporting the highest value corresponds to the classifier's top choice
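A forward pass through a network of this shape can be sketched in a few lines. The sigmoid activation, the bias terms, and the random weights are assumptions made so the example runs; in the recognizer the weights would come from back propagation training.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """One fully connected layer with sigmoid activation (assumed)."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def top_choice(features, hidden_w, hidden_b, output_w, output_b):
    """Return (digit with the highest output value, all output values)."""
    hidden = layer(features, hidden_w, hidden_b)
    outputs = layer(hidden, output_w, output_b)
    digit = max(range(len(outputs)), key=outputs.__getitem__)
    return digit, outputs


rng = random.Random(0)


def random_matrix(rows, cols):
    # Placeholder weights; a trained network would supply real ones.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


hidden_w, hidden_b = random_matrix(80, 296), [0.0] * 80
output_w, output_b = random_matrix(10, 80), [0.0] * 10
features = [rng.random() for _ in range(296)]
digit, outputs = top_choice(features, hidden_w, hidden_b, output_w, output_b)
print(digit, len(outputs))
```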

Thresholding Performance Graph

GSC - Top Level
Input
- Put bounding box around the image
- Hyper GSC Recognizer
- Is confidence level of first class 0? (YES / NO)
- GSC Recognizer
Output

GSC Digit Recognizer
Method
- 512 binary-valued features representing Gradient, Structural, and Concavity characteristics of the image
- Uses a nearest neighbor classifier
Performance
- 100 digits per second on an RS 6000
- 97% recognition rate on a standard test set

Image Processing
- Size normalization accomplished by imposing a 4x4 grid
  - Grid is determined by partitioning the image horizontally and vertically into 4 equal pixel mass partitions
Figure: Uniform and Variable Gridding (axis label: % Reject)
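Equal pixel mass partitioning can be sketched on a one-dimensional profile (the count of "on" pixels per column, or per row). The function below is an illustrative guess at the idea, not the recognizer's code; `equal_mass_cuts` and its rounding rule (cut after the first position whose running mass reaches each quarter) are assumptions, and the image is assumed to contain at least one "on" pixel.

```python
def equal_mass_cuts(mass, parts=4):
    """Return parts-1 cut positions that split `mass` into equal-mass runs.

    mass[i] is the number of "on" pixels in column (or row) i.
    A cut value c means the partition boundary lies before index c.
    """
    total = sum(mass)
    cuts = []
    running = 0
    next_part = 1
    for index, value in enumerate(mass):
        running += value
        # Integer comparison of running / total >= next_part / parts.
        while next_part < parts and running * parts >= total * next_part:
            cuts.append(index + 1)
            next_part += 1
    return cuts


print(equal_mass_cuts([1, 1, 1, 1, 1, 1, 1, 1]))  # [2, 4, 6]
print(equal_mass_cuts([4, 0, 0, 0, 2, 2, 4, 4]))  # [1, 6, 7]
```

Applying the same function to the column profile and to the row profile gives the vertical and horizontal grid lines of the variable 4x4 grid.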

Feature Extraction
- Set of 512 binary features
- Choice of features motivated by the belief that multi-scale features have the best chance of capturing the difference between classes of digits or characters

- 192 Gradient features (finest scale)
  - Gradient is the angle perpendicular to the local direction of the contour boundary and is computed at every pixel
  - Quantized to 12 different ranges of angles
  - Histograms of occurrences of angles (ranges) are computed for each of the 16 regions in the 4x4 grid
  - Histogram values that cross a threshold are turned "on"
- 192 Structural features (intermediate scale)
  - 12 structures consisting of groups of pixels form mini-strokes:
    - horizontal stroke: upper and lower surfaces
    - vertical stroke: left and right surfaces
    - diagonal rising: upper and lower surfaces
    - diagonal falling: upper and lower surfaces
    - corners (4)
  - If any pixel group falling in a region (4x4) satisfies the rule for a mini-stroke, the feature is "on"

- 128 Concavity features (coarsest scale)
  - 16 pixel density features
    - Does the percentage of "on" pixels in a region (4 x 4) exceed a threshold?
  - 32 large stroke features
    - Does a region (4 x 4) contain a horizontal run or vertical run of "on" pixels greater in length than a threshold?
  - 80 concavity features
    - Does a region (4 x 4) contain a concavity pointing up, down, left, or right, or an enclosed "hole"?

Classification
- Identifies the 6 nearest neighbors from among the templates
- Takes the weighted vote of the neighbors, where each neighbor's vote is weighted by its proximity to the test vector
- Performance of the classifier depends on how representative the templates are of the set of "all possible" digits
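A weighted vote over the 6 nearest templates might look like the sketch below. The slide does not give the distance measure or the weighting formula; Hamming distance on the binary vectors and a weight of 1 / (1 + distance) are stand-ins chosen for the example.

```python
def hamming(a, b):
    """Number of positions where two equal-length binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def classify(test_vector, templates, k=6):
    """Weighted k-nearest-neighbor vote.

    templates: list of (binary_vector, label) pairs
    Each of the k nearest templates votes for its label with weight
    1 / (1 + distance), an assumed proximity weighting.
    """
    nearest = sorted(
        ((hamming(test_vector, vector), label) for vector, label in templates),
        key=lambda pair: pair[0],
    )[:k]
    votes = {}
    for distance, label in nearest:
        votes[label] = votes.get(label, 0.0) + 1.0 / (1.0 + distance)
    return max(votes, key=votes.get)


# Toy 4-bit templates; the recognizer described above uses 512-bit vectors.
templates = [
    ([1, 1, 0, 0], 3),
    ([1, 1, 1, 0], 3),
    ([0, 0, 1, 1], 8),
    ([0, 0, 0, 1], 8),
]
print(classify([1, 1, 0, 0], templates))  # 3
```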

Gradient Features
Input
- Put a 4 x 4 non-uniform grid on the image by sampling at equal-mass divisions of the histogram
- Smooth the image by filtering
- Convolve the image with 3x3 Sobel operators to find the gradients
- Divide the range of directions into 12 non-overlapping regions, each of 2*pi/12 radians
- Do a histogram-based thresholding for each sampling region
- In each of the 4x4 regions, if the number of pixels with gradient direction in a particular range passes the threshold, set the corresponding bit in the feature vector to 1 (12 bit feature vector for each region, corresponding to the 12 bins of directions)
Output: 12x4x4 = 192 bit feature vector
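The Sobel, quantization, and thresholding steps can be sketched as below. To stay short, the sketch uses a uniform 4x4 grid and skips smoothing, both simplifications relative to the slide; the function names and the single global `threshold` are assumptions.

```python
import math

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def direction_bin(gx, gy, bins=12):
    """Quantize the gradient direction into `bins` ranges of 2*pi/bins each."""
    angle = math.atan2(gy, gx) % (2 * math.pi)
    return min(int(angle / (2 * math.pi / bins)), bins - 1)


def gradient_features(image, threshold, grid=4, bins=12):
    """Return grid*grid*bins bits: 1 where a region's bin count > threshold.

    Uses a uniform grid (the slide describes a non-uniform one) and ignores
    the one-pixel image border, where the 3x3 operator does not fit.
    """
    height, width = len(image), len(image[0])
    hist = [[0] * bins for _ in range(grid * grid)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = sum(SOBEL_X[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            if gx == 0 and gy == 0:
                continue  # flat neighborhood, no direction to record
            region = (y * grid // height) * grid + (x * grid // width)
            hist[region][direction_bin(gx, gy, bins)] += 1
    return [1 if count > threshold else 0 for region in hist for count in region]


# 8x8 image with a vertical edge: left half black (0), right half white (255).
image = [[0] * 4 + [255] * 4 for _ in range(8)]
bits = gradient_features(image, threshold=1)
print(len(bits), sum(bits))  # 192 4
```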

Structural Features
Input
- Place a 4x4 fixed grid on the image
- Apply a set of 12 rules to each pixel to find the stroke and corner features
- For each of the 4x4 regions: is the number of pixels satisfying a rule greater than the threshold set for the rule?
  - YES: set the corresponding bit in the feature vector to 1 (12 bit feature vector for each region, signifying the 12 rules)
  - NO: set the corresponding bit in the feature vector to 0
Output: 12x4x4 = 192 bit feature vector as structural features

Concavity Features
Input
- Place a 4x4 fixed grid on the image
- Convolve the image with a starlike operator by shooting rays in 8 directions and determining what each ray hits
- Define eight types of pixels depending on the way the rays shot out from the pixel hit the boundary
- For each type of pixel define a threshold
- For each type of pixel set aside a bit in the feature vector for each of the regions
- Is (number of pixels of the corresponding type) / (area of region) greater than the threshold set for the type of pixel?
  - YES: set the corresponding bit in the feature vector to 1
  - NO: set the corresponding bit in the feature vector to 0
Output: 8x4x4 bit feature vector as the concavity features

Word Recognition Control
Input:
- Word Image
- Lexicon
- Word Recognizer 1 (WMR)
- Word Recognizer 2 (CMR)
Flowchart steps:
- Call WMR with expanded lexicon, giving WMR results
- Call CMR with n-best WMR choices (n < 11), giving CMR results
- Classifiers concur? (YES / NO)
- Branches are labeled by confidence of the top choice: conf = HI, conf = MED, conf = LO
Output, one of:
- ACCEPT WMR top choice
- ACCEPT CMR top choice
- ACCEPT common top choice
- REJECT

WMR
Input:
- Chaincode of word image
- Lexicon
Steps:
- Over-segmentation of the word into characters so that no two characters remain merged
- Features extracted from each segment
- Match one or more (up to four) segments with each character of a single lexicon entry
- Derive "goodness" of match between segments and a lexicon entry
- Score match for all lexicon entries
Output: Rank the lexicon based on matching score

WMR Features
- 74 chaincode-based features are extracted: 2 global and 72 local features
- The distribution of the 8 directional slopes over the 9 (3 x 3) sub-images forms the 72 local features
- Global features: $F^g_i = \mathrm{sigmoid}\left(\frac{H_i - V_i}{V_i}\right)$ for $i = 1, 2$, where
  - $H_1 = X_{max} - X_{min}$, $V_1 = Y_{max} - Y_{min}$ for aspect ratio
  - $H_2 = N_{horizontal\_stroke}$, $V_2 = N_{vertical\_stroke}$ for stroke ratio
- Local features: $F^l_{ij} = \frac{s_{ij}}{N_i S_j}$ for $i = 1, 2, \ldots, 9$ and $j = 0, 1, \ldots, 7$, where
  - $s_{ij}$ = number of components with slope $j$ from sub-image $i$
  - $N_i$ = number of components from sub-image $i$
  - $S_j = \max_i \left(\frac{s_{ij}}{N_i}\right)$
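The feature definitions on this slide are damaged in the transcript. Reading them as a global feature sigmoid((H - V) / V) and a local feature s_ij / (N_i * S_j) with S_j the maximum over sub-images of s_ij / N_i, a sketch of the computation could be the following. That reading, the logistic form of the sigmoid, and the choice of N_i as the sum of the slope counts of sub-image i are all assumptions.

```python
import math


def global_feature(h, v):
    """sigmoid((H - V) / V), using the logistic sigmoid (assumed form)."""
    return 1.0 / (1.0 + math.exp(-(h - v) / v))


def local_features(slope_counts):
    """Normalize slope counts per sub-image and per slope direction.

    slope_counts[i][j] = s_ij, the number of components with slope j in
    sub-image i. N_i is taken as the row sum (an assumption).
    Returns F[i][j] = s_ij / (N_i * S_j), S_j = max over i of s_ij / N_i.
    """
    ratios = []
    for row in slope_counts:
        n_i = sum(row)
        ratios.append([s / n_i if n_i else 0.0 for s in row])
    n_slopes = len(slope_counts[0])
    s_max = [max(r[j] for r in ratios) for j in range(n_slopes)]
    return [[r[j] / s_max[j] if s_max[j] else 0.0 for j in range(n_slopes)]
            for r in ratios]


# 9 sub-images x 8 slopes; only the first two sub-images contain components.
counts = [[0] * 8 for _ in range(9)]
counts[0][0] = 2
counts[1][0], counts[1][1] = 1, 1
features = local_features(counts)
print(features[0][0], features[1][0], features[1][1])  # 1.0 0.5 1.0
print(global_feature(10.0, 10.0))                      # 0.5
```

The division by S_j scales each slope direction so that its largest value across the 9 sub-images is 1.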

WMR
- Find the best way of accounting for characters 'w', 'o', 'r', 'd' by consuming all segments 1 to 8 in the process
- Distance between the first character 'w' of lexicon entry 'word' and the image between:
  - segments 1 and 4 is [value missing]
  - segments 1 and 3 is [value missing]
  - segments 1 and 2 is 7.6
Figure labels (character[distance]): w[7.6] w[7.2] r[3.8] w[5.0] w[8.6] o[7.6] r[6.3] d[4.9] w[5.0] o[6.6] o[6.0] o[7.2] o[10.6] d[6.5] d[4.4] r[7.5] r[6.4] o[7.8] r[8.6] o[8.7] r[7.4] r[7.6] o[8.3] o[7.7] r[5.8] o[6.1]
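Finding the best way to account for the characters while consuming all segments is naturally a dynamic programming problem. The sketch below is one plausible formulation, not necessarily the one used in WMR: each character takes 1 to 4 consecutive segments, and the total distance is minimized. The `distance` callback stands in for the character-to-segments matcher, and the toy distance in the example is made up.

```python
def best_segmentation(word, n_segments, distance, max_span=4):
    """Assign consecutive segment runs to characters with minimum total distance.

    distance(char, first, last): cost of matching `char` against segments
    first..last (1-based, inclusive). Returns (total, spans), where spans
    holds one (first, last) pair per character, or (inf, []) if the word
    cannot consume all segments.
    """
    inf = float("inf")
    n_chars = len(word)
    # best[c][s]: minimum cost of matching the first c characters to the
    # first s segments; back[c][s]: span length used by character c.
    best = [[inf] * (n_segments + 1) for _ in range(n_chars + 1)]
    back = [[0] * (n_segments + 1) for _ in range(n_chars + 1)]
    best[0][0] = 0.0
    for c, char in enumerate(word, start=1):
        for end in range(1, n_segments + 1):
            for span in range(1, min(max_span, end) + 1):
                start = end - span
                if best[c - 1][start] == inf:
                    continue
                cost = best[c - 1][start] + distance(char, start + 1, end)
                if cost < best[c][end]:
                    best[c][end] = cost
                    back[c][end] = span
    total = best[n_chars][n_segments]
    if total == inf:
        return inf, []
    spans, end = [], n_segments
    for c in range(n_chars, 0, -1):
        span = back[c][end]
        spans.append((end - span + 1, end))
        end -= span
    return total, spans[::-1]


def toy_distance(char, first, last):
    # Hypothetical matcher: every character prefers exactly two segments.
    return abs((last - first + 1) - 2)


print(best_segmentation("word", 8, toy_distance))
```

Running the same routine for every lexicon entry and sorting by the total gives a ranked lexicon.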

CMR
Input:
- Chaincode of word image
- Lexicon
Steps:
- Over-segmentation of characters so that no two characters remain merged
- Features extracted from each segment
- Recognize one or more (up to four) segments as a single character of the alphabet
- Obtain character strings (ASCII) corresponding to the segments in the word image
- Derive "goodness" of match between character string and lexicon entries
Output: Rank the lexicon based on "goodness" score

CMR
- Image from segment 1 to 3 is a 'u' with 0.5 confidence
- Image from segment 1 to 4 is a 'w' with 0.7 confidence
- Image from segment 1 to 5 is a 'w' with 0.6 confidence and an 'm' with 0.3 confidence
- Find the best path in the graph from segment 1 to 8: w o r d
Figure labels (character[confidence]): i[.8], l[.8] u[.5], v[.2] w[.6], m[.3] w[.7] i[.7] u[.3] m[.2] m[.1] r[.4] d[.8] o[.5]

Hover System
Figure: inputs are image features (img ftrs) and lexicon features (lex ftrs); matching is done at word level and phrase level; checks shown are match length, match gaps, match word lengths, match ascenders, and match descenders; outcomes are reject and accept.