By George Kour Supervised By: Prof. Dana Ron Dr. Raid Saabne

Real-time Segmentation and Recognition of On-line Handwritten Arabic Script. By George Kour. Supervised by: Prof. Dana Ron, Dr. Raid Saabne. Master's Thesis Defense, 16 November 2014. Tel Aviv University - Faculty of Engineering - Department of Electrical Engineering

Agenda: Problem Statement; Motivation; Characteristics of the Arabic Script; Solution Outline; Real-time Segmentation; Fast Letter Classification; Demo; Future Work.

Problem Statement: Correct and efficient recognition of handwritten Arabic text is a challenging problem due to the cursive and unconstrained nature of the Arabic script. Conventional approaches to online handwriting recognition therefore usually wait until the entire curve is traced out before starting the analysis. However, this delays the recognition process and prevents implementing advanced input features such as automatic word completion.

Motivation

Characteristics of the Arabic Language
Letters with four shapes, depending on position:
Iso | Ini | Mid | Fin
ع | عـ | ـعـ | ـع
ه | هـ | ـهـ | ـه
Rasm (رسم) and i'jam (إعجام).
Harakat (حركات): ًٌٍَُِّْ
Fully vocalized script: اَلْعَرَبِيّةُ (compare العربية).
Segmentation Points (SPs) and baseline.
Word Parts (WPs) and strokes.

Solution Outline
Real-time recognition of handwritten Arabic script, i.e., performing analysis tasks during the course of writing.
How do we do that?
1. Continuously nominate points of interest (POIs) while a stroke is being scribed.
2. Attach scores to the resulting sub-strokes.
3. Select the best set of segmentation points.
This requires: a real-time POI nomination algorithm; a fast letter classifier; segmentation point filtering and selection algorithms.

Real-time Segmentation of On-line Handwritten Arabic Script. 14th International Conference on Frontiers in Handwriting Recognition (ICFHR 2014).

Definitions
Stroke: $S = \{(x_i, y_i)\}_{i=1}^{n}$.
Points of interest $\{POI_i\}_{i=1}^{L}$, i.e., potential segmentation points (SPs), are continuously nominated while the stroke is being scribed.
Horizontal Fragments (HFs) are ligatures that join pairs of connected letters. They are horizontal, directed right to left, and located near the baseline.
Key Points $\{KP_i\}_{i=0}^{L+1}$: the set of POIs together with the first and last points of the stroke.
Sub-stroke: $S_i^j = \{(x_k, y_k)\}_{k=KP_i}^{KP_j}$.
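
The definitions above can be sketched in a few lines of Python. This is only an illustration under assumptions not stated in the slides: points are indexed from 0, POIs are given as indices into the stroke, and the names `key_points` and `sub_stroke` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def key_points(n_points: int, poi_indices: List[int]) -> List[int]:
    """Return KP_0..KP_{L+1}: the first point, the POIs, and the last point.

    Indices are 0-based here (an assumption; the slides index from 1).
    """
    inner = sorted(i for i in set(poi_indices) if 0 < i < n_points - 1)
    return [0] + inner + [n_points - 1]


def sub_stroke(stroke: List[Point], kps: List[int], i: int, j: int) -> List[Point]:
    """Return S_i^j: the points from key point i to key point j, inclusive."""
    if not 0 <= i < j < len(kps):
        raise ValueError("need 0 <= i < j <= L+1")
    return stroke[kps[i]: kps[j] + 1]


# Usage: a 10-point stroke with two POIs yields four key points.
stroke = [(float(t), 0.0) for t in range(10)]
kps = key_points(len(stroke), [3, 7])
middle = sub_stroke(stroke, kps, 1, 2)  # points between the two POIs
```

With L POIs there are (L+2)(L+1)/2 candidate sub-strokes, which is why a fast classifier matters for scoring them while the user is still writing.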

Stage 1 - HF Identification

Stage 1 – Sub-strokes Scoring: The classification information of the sub-strokes imposed by the KPs is stored in the Scoring Matrix, where each cell $D_{i,j}$ contains the scoring information for the sub-stroke $S_i^j$. (Figure: scoring matrix with key points $KP_0$ and $KP_1$.)

Stage 1 – Sub-strokes Scoring (Figure: scoring matrix with key points $KP_0$, $KP_1$ and $KP_2$.)

Stage 1 – Sub-strokes Scoring

Stage 2 – POIs Filtering: Once the entire stroke is available, a rule-based process refines the set of POIs and re-scores the sub-strokes according to the following rules: SPs should lie close to the baseline; SPs should not reside in loops; the length of a sub-stroke should be proportional to the length of the containing stroke.

Stage 3 – Segmentation Selection: The matrix $D$ can be modeled as a directed, edge-weighted graph $G = (V, E)$, in which a path from vertex $KP_0$ to vertex $KP_{L+1}$ defines a possible segmentation.
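
One way to read the graph formulation above is as a shortest-path problem on a directed acyclic graph. The sketch below is not the selection algorithm from the thesis; it assumes that each cell of `D` holds a single additive cost (lower is better), that `None` marks a sub-stroke rejected by filtering, and the function name `best_segmentation` is hypothetical.

```python
import math
from typing import List, Optional, Tuple


def best_segmentation(D: List[List[Optional[float]]]) -> Tuple[float, List[int]]:
    """Cheapest path KP_0 -> KP_{L+1} when D[i][j] (i < j) is an edge cost.

    Assumption: costs are additive and None means "no edge".
    Returns the total cost and the key-point indices on the path.
    """
    n = len(D)
    best = [math.inf] * n   # best[j]: cheapest cost of reaching KP_j
    prev = [-1] * n         # prev[j]: predecessor of KP_j on that path
    best[0] = 0.0
    for j in range(1, n):   # key points are already in topological order
        for i in range(j):
            cost = D[i][j]
            if cost is not None and best[i] + cost < best[j]:
                best[j] = best[i] + cost
                prev[j] = i
    if math.isinf(best[-1]):
        raise ValueError("no path from KP_0 to KP_{L+1}")
    path = [n - 1]
    while path[-1] != 0:
        path.append(prev[path[-1]])
    return best[-1], path[::-1]


# Usage: four key points (two POIs); the cheapest segmentation keeps only KP_1.
N = None
D = [
    [N, 1.0, 5.0, 4.0],
    [N, N, 1.0, 0.5],
    [N, N, N, 1.0],
    [N, N, N, N],
]
cost, path = best_segmentation(D)
```

Because edges only go from a lower key-point index to a higher one, a single left-to-right pass is enough and no general shortest-path algorithm is needed.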

Results
Over-segmentation: a horizontal region in an initial form that does not accommodate an SP; a letter spanning several strokes.
Under-segmentation: letter pairs that are not separated by HFs (e.g., لم and لح); not selecting a POI in the third stage.
City name samples | 319
Num. of strokes | 1237
Segmentation rate | 83%
Recognition rate [Top 3] | 78%*

Recent Work
Work | Results | Dataset
(Randa et al., 2012) | 51% (SR) | OHASD, a self-collected dataset that includes 154 paragraphs (more than 3800 words) written by 48 writers.
(Daifallah et al., 2009) | 79% (RR) | Self-collected database containing 150 words.

Fast Classification of Handwritten On-line Arabic Characters. 6th International Conference of Soft Computing and Pattern Recognition (SoCPaR 2014).

Outline
Goal: fast classification and scoring of sub-strokes using k-NN based classification.
Challenges: metrics that imitate perceptual similarity are computationally expensive; finding the closest samples requires scanning the entire dataset.
Solution principles: approximate the metric by embedding into $L_1$; use indexing techniques to avoid a linear scan of the dataset.

Preprocessing (pipeline stage): input $(S = \{(x_i, y_i)\}_{i=1}^{n}, Pos)$; output: the stroke resampled to 40 points, $\{(x_i, y_i)\}_{i=1}^{40}$.

Preprocessing
Gives the data a uniform structure by removing: jagged and non-uniform sampling by the digitizer; imperfections caused by hand vibration during hesitant writing.
Normalization: a uniform-size bounding box surrounding the pattern.
Noise elimination: using the Douglas-Peucker algorithm.
Re-sampling: using a piecewise quadratic interpolation function.

Feature Extraction (pipeline stage): input: the preprocessed stroke $\{(x_i, y_i)\}_{i=1}^{40}$; output: $F_S \in \mathbb{R}^{40 \times 60}$.

Feature Extraction: Feature extraction is the process of extracting informative parameters for learning and recognition of patterns. Descriptors used: Multi Angular Descriptor (MAD) (Saabni, 2013); Shape Context (SC) (Belongie et al., 2002).

EMD Embedding (pipeline stage): input $F_S \in \mathbb{R}^{40 \times 60}$; output $W_S \in \mathbb{R}^{2422}$.

Earth Mover's Distance (EMD)
(Figure: (a) $L_p$ distance; (b) perceptual similarity.)
EMD: the minimum amount of work needed to transform histogram $P$ into histogram $Q$.
$EMD(P, Q) = \min_{f_{i,j}} \frac{\sum_{i,j} f_{i,j} d_{i,j}}{\sum_{i,j} f_{i,j}}$
EMD can be computed in $O(N^3 \log N)$ time for an $N$-bin histogram (using Orlin's algorithm).
When used to compare histograms with the same overall mass, namely distributions, EMD is a metric.
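
To make the difference between a bin-wise $L_p$ distance and EMD concrete, here is a minimal sketch for the special case of 1-D histograms with equal mass and a ground distance of 1 between adjacent bins. In that case the optimal flow has a closed form (the sum of absolute cumulative differences). The features in the slides are 2-D, so this is only an illustration, and `emd_1d` is a hypothetical name.

```python
from itertools import accumulate
from typing import Sequence


def emd_1d(p: Sequence[float], q: Sequence[float]) -> float:
    """EMD between two 1-D histograms of equal total mass.

    Assumes unit ground distance between adjacent bins; the result is the
    total work divided by the total flow, as in the formula above.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or abs(total_p - total_q) > 1e-9:
        raise ValueError("histograms must have equal, positive mass")
    diffs = [a - b for a, b in zip(p, q)]
    # Mass carried across each bin boundary is the running sum of p - q.
    work = sum(abs(c) for c in accumulate(diffs))
    return work / total_p


# Usage: both q_near and q_far differ from p by the same L1 distance (2.0),
# but EMD reflects how far the mass has to travel.
p = [1.0, 0.0, 0.0, 0.0]
q_near = [0.0, 1.0, 0.0, 0.0]
q_far = [0.0, 0.0, 0.0, 1.0]
near, far = emd_1d(p, q_near), emd_1d(p, q_far)
```

The general case with an arbitrary ground distance needs a transportation solver, which is the expensive step that the wavelet embedding is meant to avoid.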

Fast EMD Approximation
Linear-time embedding into the wavelet coefficient domain (Shirdhonkar and Jacobs, 2008):
$EMD(F_{S_1}, F_{S_2}) \cong \| W_{S_1} - W_{S_2} \|_1$
$d(p)_{wemd} = \sum_{\lambda} 2^{-j(1 + n/2)} |p_{\lambda}|$
The Haar wavelet achieved the best classification results.

Dimensionality Reduction (pipeline stage): input $W_S \in \mathbb{R}^{2422}$; output $RW_S \in \mathbb{R}^{<10}$.

Dimensionality Reduction: Mitigates the curse of dimensionality. Embedding the SC feature vectors produces sparse vectors in $\mathbb{R}^{2422}$. PCA: unsupervised but efficient. LDA: supervised but costly.

Dimensionality Reduction
Before applying LDA, each character class was partitioned into four clusters using the $L_1$ k-medoids algorithm, and a unique sub-label was assigned to each cluster.
The target number of dimensions was estimated using the maximum likelihood estimation method:
Letter Position | PCA | PCA+LDA
Ini | 48 | 9
Mid | 52 | 10
Fin | 44 |
Iso | 39 | 8

Metric Indexing (pipeline stage): input $RW_S \in \mathbb{R}^{<10}$.

Metric Indexing: Distance function approximation techniques alone cannot avoid a linear scan of the entire dataset. The k-d tree is an efficient data structure for storing a finite set of points from a k-dimensional space. (Figure: (a) the k-d tree decomposition of a region containing six data points; (b) the k-d tree representation for (a).)
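
As a rough illustration of how a k-d tree avoids a full scan, the sketch below builds a tree by median splits and answers a nearest-neighbor query under the $L_1$ distance, pruning any subtree that lies farther than the current best along the split axis. It is a textbook construction, not the indexing code used in the thesis, and all names are hypothetical.

```python
from typing import List, NamedTuple, Optional, Sequence, Tuple

Point = Tuple[float, ...]


class Node(NamedTuple):
    point: Point
    axis: int
    left: Optional["Node"]
    right: Optional["Node"]


def build_kd_tree(points: Sequence[Point], depth: int = 0) -> Optional[Node]:
    """Recursively split on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(pts[mid], axis,
                build_kd_tree(pts[:mid], depth + 1),
                build_kd_tree(pts[mid + 1:], depth + 1))


def l1_distance(a: Point, b: Point) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_neighbor(node: Optional[Node], query: Point,
                     best: Optional[Tuple[float, Point]] = None
                     ) -> Optional[Tuple[float, Point]]:
    """Return (distance, point) of the L1-nearest stored point."""
    if node is None:
        return best
    d = l1_distance(node.point, query)
    if best is None or d < best[0]:
        best = (d, node.point)
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest_neighbor(near, query, best)
    # The far side can only help if the splitting plane is closer than best.
    if abs(diff) < best[0]:
        best = nearest_neighbor(far, query, best)
    return best


# Usage: six 2-D points, as in the figure on the slide.
points = [(2.0, 3.0), (5.0, 4.0), (9.0, 6.0), (4.0, 7.0), (8.0, 1.0), (7.0, 2.0)]
tree = build_kd_tree(points)
distance, closest = nearest_neighbor(tree, (9.0, 1.5))
```

Pruning is valid for $L_1$ because the distance to any point on the far side is at least the gap along the split axis. The benefit shrinks as the dimension grows, which is one reason to reduce the vectors to about ten dimensions first.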

Classification Flow (pipeline overview): input $(S = \{(x_i, y_i)\}_{i=1}^{n}, Pos)$; output: candidates $C_1, \dots, C_k$.

Candidates Rescoring using DTW: The candidates are re-scored by calculating the DTW distance between the preprocessed version of the query sequence and each candidate.
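
The rescoring step can be sketched with the classic dynamic-programming form of DTW. The slides do not specify the local cost or any warping-window constraint, so the Euclidean point distance and the unconstrained warping path used here are assumptions, and `dtw_distance` and `rescore` are hypothetical names.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def dtw_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Unconstrained DTW with Euclidean local cost, O(len(a) * len(b))."""
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for i in range(1, n + 1):
        cur = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = math.dist(a[i - 1], b[j - 1])
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[m]


def rescore(query: Sequence[Point],
            candidates: Sequence[Tuple[str, Sequence[Point]]]
            ) -> List[Tuple[float, str]]:
    """Sort (label, sequence) candidates by DTW distance to the query."""
    return sorted((dtw_distance(query, seq), label) for label, seq in candidates)


# Usage: the candidate that repeats a point still matches the query exactly.
query = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
candidates = [
    ("shifted", [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]),
    ("same shape", [(0.0, 0.0), (1.0, 0.0), (1.0, 0.0), (2.0, 0.0)]),
]
ranked = rescore(query, candidates)
```

Running DTW only on the few candidates returned by the index keeps the quadratic cost off the critical path.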

Results
The system was trained and tested on characters and word parts extracted from the ADAB database.
Sample set size and distribution:
Letter Position | # of Samples
Ini | 1405
Mid | 1196
Fin | 1629
Iso | 1372
Letter classification results:
Shape Descriptor | Accuracy [Top 1] | Accuracy [Top 3]
SC | 91% | 96%
MAD | 88% | 94%
None | 87% | 93%

Recent Work
Work | Accuracy | Dataset
(AL Taani and Al Haj, 2010) | 75% | 1400 self-collected isolated characters
(Ismail, Abdullah and Siti, 2012) | 97% | 504 characters, 66% training set
(Addakiri and Bahaj, 2012) | 83% |

Future Work: Handle delayed strokes. Handle multi-stroke letters. Develop a word completion system. Build a recognizer based on a holistic approach. Standardize and publish the segmented version of the ADAB database.

Thank You!

Sub-Stroke Position: Using the relative location of the sub-stroke within the stroke, we restricted the classification process to search for similar samples only in the databases of feasible positions. (Table: a mapping between the sub-sequence types and the possible letter positions, where $S$ denotes a stroke containing $L$ POIs, $m > 0$ and $k < L+1$.)

Segmentation Selection Graph

Segmentation Selection Algorithms Performance

Preprocessing (1)
Given a stroke $S = \{(x_i, y_i)\}_{i=1}^{n}$:
Normalization: a uniform-size bounding box surrounding the pattern; the sequence is translated so that its center of gravity lies at the origin.
Noise elimination: using the Douglas-Peucker algorithm with tolerance parameter $\epsilon = \frac{1}{75}$.
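
The two steps above could look roughly as follows. The slides do not say how the bounding box is scaled or how distance to the chord is measured, so this sketch assumes the longer side of the box is scaled to 1 and uses distance to the chord segment (a common variant of the perpendicular-distance rule). Only the tolerance 1/75 comes from the slide, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def normalize(stroke: Sequence[Point]) -> List[Point]:
    """Scale so the longer bounding-box side is 1 and center the centroid."""
    xs = [p[0] for p in stroke]
    ys = [p[1] for p in stroke]
    size = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0
    cx, cy = sum(xs) / len(xs), sum(ys) / len(ys)
    return [((x - cx) / size, (y - cy) / size) for x, y in stroke]


def _segment_distance(p: Point, a: Point, b: Point) -> float:
    """Distance from p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg2 = dx * dx + dy * dy
    if seg2 == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg2
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def douglas_peucker(points: Sequence[Point], eps: float = 1 / 75) -> List[Point]:
    """Drop points that deviate from the simplified polyline by at most eps."""
    if len(points) < 3:
        return list(points)
    dists = [_segment_distance(p, points[0], points[-1]) for p in points[1:-1]]
    k = max(range(len(dists)), key=dists.__getitem__) + 1  # farthest point
    if dists[k - 1] <= eps:
        return [points[0], points[-1]]
    left = douglas_peucker(points[:k + 1], eps)
    right = douglas_peucker(points[k:], eps)
    return left[:-1] + right  # the split point appears in both halves


# Usage: the small wobble at x = 0.25 is removed, the real corner is kept.
raw = [(0.0, 0.0), (0.25, 0.001), (0.5, 0.0), (0.75, 0.5), (1.0, 0.0)]
simplified = douglas_peucker(raw)
```

Normalizing before simplification makes a fixed tolerance such as 1/75 meaningful regardless of how large the writer's strokes are.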

Preprocessing (2)
Re-sampling: $f_x(d)$ and $f_y(d)$ are the piecewise quadratic interpolation functions of $\{x_i\}_{i=1}^{n}$ and $\{y_i\}_{i=1}^{n}$, respectively. Let $t_i = i \frac{L}{R}$, where $L$ is the arc length of the pattern and $R$ is the resampling parameter. The resampled stroke is $\{(f_x(t_i), f_y(t_i))\}_{i=1}^{R}$.
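
A minimal sketch of arc-length resampling with $t_i = iL/R$ follows. To stay short and dependency-free it swaps in piecewise linear interpolation for the piecewise quadratic interpolation named in the slide, so it illustrates the sampling scheme rather than the thesis's exact interpolant. The name `resample` and the default `R=40` (taken from the 40-point strokes mentioned earlier) are assumptions.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def resample(stroke: Sequence[Point], R: int = 40) -> List[Point]:
    """Return R points at arc lengths t_i = i * L / R, for i = 1..R.

    Uses linear interpolation between the original points (an assumption;
    the slide uses piecewise quadratic interpolation).
    """
    if len(stroke) < 2 or R < 1:
        raise ValueError("need at least two points and R >= 1")
    cum = [0.0]  # cumulative arc length at each original point
    for a, b in zip(stroke, stroke[1:]):
        cum.append(cum[-1] + math.dist(a, b))
    L = cum[-1]
    if L == 0.0:
        raise ValueError("stroke has zero length")
    out: List[Point] = []
    seg = 0
    for i in range(1, R + 1):
        t = i * L / R
        # Advance to the segment that contains arc length t.
        while seg < len(stroke) - 2 and cum[seg + 1] < t:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        u = 0.0 if span == 0.0 else min(1.0, (t - cum[seg]) / span)
        (x0, y0), (x1, y1) = stroke[seg], stroke[seg + 1]
        out.append((x0 + u * (x1 - x0), y0 + u * (y1 - y0)))
    return out


# Usage: an L-shaped stroke of length 4 resampled to 4 equally spaced points.
corner = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0)]
samples = resample(corner, R=4)
```

Following the formula on the slide, the index starts at 1, so the first sample lies one step along the stroke and the last sample coincides with the stroke's end point.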

Activation Configuration
High Accuracy: the proposed approach.
Low Latency: avoids candidate rescoring.
Fast Learning: avoids DR and metric indexing.
High Accuracy: Accuracy [Top 1] 91%, Accuracy [Top 3] 96%, Time 29.9 ms
Low Latency: Accuracy [Top 1] 87%, Accuracy [Top 3] 94%, Time 0.12 ms
Fast Learning: Accuracy 90% (only one accuracy value survives in the transcript), Time 4.4 ms

Sample set: We plan to standardize and publish the character database extracted from the ADAB database and make it available to other researchers in the field. (Figure: manual segmentation.)