Problem description and pipeline

Slides:



Advertisements
Similar presentations
Applications of one-class classification
Advertisements

Advanced topics.
CSCI 347 / CS 4206: Data Mining Module 07: Implementations Topic 03: Linear Models.
A Novel Approach for Recognizing Auditory Events & Scenes Ashish Kapoor.
An Overview of Machine Learning
Simple Face Detection system Ali Arab Sharif university of tech. Fall 2012.
Reading for Next Week Textbook, Section 9, pp A User’s Guide to Support Vector Machines (linked from course website)
Machine Learning: Connectionist McCulloch-Pitts Neuron Perceptrons Multilayer Networks Support Vector Machines Feedback Networks Hopfield Networks.
Generative Models for Image Analysis Stuart Geman (with E. Borenstein, L.-B. Chang, W. Zhang)
Chapter 11 Beyond Bag of Words. Question Answering n Providing answers instead of ranked lists of documents n Older QA systems generated answers n Current.
Supervised and Unsupervised learning and application to Neuroscience Cours CA6b-4.
Chapter 1: Introduction to Pattern Recognition
Pattern Classification All materials in these slides were taken from Pattern Classification (2nd ed) by R. O. Duda, P. E. Hart and D. G. Stork, John.
Announcements  Project proposal is due on 03/11  Three seminars this Friday (EB 3105) Dealing with Indefinite Representations in Pattern Recognition.
Document Image Analysis CSE 717 An Introduction. Document Image Analysis  DIA is the theory and practice of recovering the symbol structures of digital.
Smart Traveller with Visual Translator. What is Smart Traveller? Mobile Device which is convenience for a traveller to carry Mobile Device which is convenience.
California Car License Plate Recognition System ZhengHui Hu Advisor: Dr. Kang.
K-means Based Unsupervised Feature Learning for Image Recognition Ling Zheng.
Con-Text: Text Detection Using Background Connectivity for Fine-Grained Object Classification Sezer Karaoglu, Jan van Gemert, Theo Gevers 1.
MACHINE LEARNING AND ARTIFICIAL NEURAL NETWORKS FOR FACE VERIFICATION
Pattern Classification All materials in these slides were taken from Pattern Classification (2nd ed) by R. O. Duda, P. E. Hart and D. G. Stork, John Wiley.
3D Scene Models Object recognition and scene understanding Krista Ehinger.
Introduction to machine learning
Computer Vision Systems for the Blind and Visually Disabled. STATS 19 SEM Talk 3. Alan Yuille. UCLA. Dept. Statistics and Psychology.
Pattern Classification All materials in these slides were taken from Pattern Classification (2nd ed) by R. O. Duda, P. E. Hart and D. G. Stork, John Wiley.
Recognizing some of the modern CAPTCHAs Dmitry Nikulin LCME, Saint-Petersburg, 2011.
Artificial Neural Nets and AI Connectionism Sub symbolic reasoning.
(CMSC5720-1) MSC projects by Prof K.H. Wong (21 July2014) (shb907) MSC projects supervised by Prof.
End-to-End Text Recognition with Convolutional Neural Networks
CS 6825: Binary Image Processing – binary blob metrics
Digital Image Processing & Analysis Spring Definitions Image Processing Image Analysis (Image Understanding) Computer Vision Low Level Processes:
Compiled By: Raj G Tiwari.  A pattern is an object, process or event that can be given a name.  A pattern class (or category) is a set of patterns sharing.
Rotation Invariant Neural-Network Based Face Detection
Building high-level features using large-scale unsupervised learning Anh Nguyen, Bay-yuan Hsu CS290D – Data Mining (Spring 2014) University of California,
1 Chapter 1: Introduction 1.1 Images and Pictures Human have evolved very precise visual skills: We can identify a face in an instant We can differentiate.
1 © 2010 Cengage Learning Engineering. All Rights Reserved. 1 Introduction to Digital Image Processing with MATLAB ® Asia Edition McAndrew ‧ Wang ‧ Tseng.
Object Recognition in Images Slides originally created by Bernd Heisele.
Handwritten Recognition with Neural Network Chatklaw Jareanpon, Olarik Surinta Mahasarakham University.
Handwritten Hindi Numerals Recognition Kritika Singh Akarshan Sarkar Mentor- Prof. Amitabha Mukerjee.
CS654: Digital Image Analysis Lecture 25: Hough Transform Slide credits: Guillermo Sapiro, Mubarak Shah, Derek Hoiem.
School of Engineering and Computer Science Victoria University of Wellington Copyright: Peter Andreae, VUW Image Recognition COMP # 18.
Handwritten digit recognition
APECE-505 Intelligent System Engineering Basics of Digital Image Processing! Md. Atiqur Rahman Ahad Reference books: – Digital Image Processing, Gonzalez.
Dr. István Marosi Scansoft-Recognita, Inc., Hungary SSIP 2005, Szeged Character Recognition Internals.
MACHINE LEARNING 7. Dimensionality Reduction. Dimensionality of input Based on E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1)
Data Mining and Decision Support
Scanned Documents INST 734 Module 10 Doug Oard. Agenda Document image retrieval  Representation Retrieval Thanks for David Doermann for most of these.
1 End-to-End Learning for Automatic Cell Phenotyping Paolo Emilio Barbano, Koray Kavukcuoglu, Marco Scoffier, Yann LeCun April 26, 2006.
POSTER TEMPLATE BY: Background Objectives Psychophysical Experiment Photo OCR Design Project Pipeline and outlines ❑ Deep Learning.
Neural Networks Lecture 4 out of 4. Practical Considerations Input Architecture Output.
Visual Information Processing. Human Perception V.S. Machine Perception  Human perception: pictorial information improvement for human interpretation.
Data Mining, Machine Learning, Data Analysis, etc. scikit-learn
IMAGE PROCESSING RECOGNITION AND CLASSIFICATION
Deep Learning Insights and Open-ended Questions
Image Recognition. Contents: Motivation Objective Definition Introduction Preprocessing / Edge Detection Neural Networks in Image Recognition Practical.
Regularizing Face Verification Nets To Discrete-Valued Pain Regression
Pattern Recognition Sergios Theodoridis Konstantinos Koutroumbas
COMP61011 : Machine Learning Ensemble Models
Final Year Project Presentation --- Magic Paint Face
State-of-the-art face recognition systems
Pattern Classification All materials in these slides were taken from Pattern Classification (2nd ed) by R. O. Duda, P. E. Hart and D. G. Stork, John.
network of simple neuron-like computing elements
Dr. István Marosi Recosoft Ltd., Hungary
Data Mining, Machine Learning, Data Analysis, etc. scikit-learn
Data Mining, Machine Learning, Data Analysis, etc. scikit-learn
© 2010 Cengage Learning Engineering. All Rights Reserved.
Using Manifold Structure for Partially Labeled Classification
Pattern Classification All materials in these slides were taken from Pattern Classification (2nd ed) by R. O. Duda, P. E. Hart and D. G. Stork, John.
Introduction.
Presentation transcript:

Problem description and pipeline Application example: Photo OCR Problem description and pipeline Machine Learning

The Photo OCR problem LULA B’s ANTIQUE MALL LULA B’s OPEN LULA B’s

2. Character segmentation Photo OCR pipeline 1. Text detection 2. Character segmentation 3. Character classification A N T

Character segmentation Character recognition Photo OCR pipeline Character segmentation Character recognition Image Text detection

Application example: Photo OCR Sliding windows Machine Learning

Text detection Pedestrian detection

Supervised learning for pedestrian detection pixels in 82x36 image patches Positive examples Negative examples

Sliding window detection

Sliding window detection

Sliding window detection

Sliding window detection

Text detection

Text detection Positive examples Negative examples

Text detection Expansion Rectangles around operator [David Wu]

1D Sliding window for character segmentation Positive examples Negative examples Change it to segmentation examples instead:

2. Character segmentation Photo OCR pipeline 1. Text detection 2. Character segmentation 3. Character classification A N T

Getting lots of data: Artificial data synthesis Application example: Photo OCR Getting lots of data: Artificial data synthesis Machine Learning

Character recognition Q A Sort it, spell antique

Abcdefg Artificial data synthesis for photo OCR Real data [Adam Coates and Tao Wang]

Artificial data synthesis for photo OCR Synthetic data Real data [Adam Coates and Tao Wang]

Synthesizing data by introducing distortions [Adam Coates and Tao Wang]

Synthesizing data by introducing distortions: Speech recognition Original audio: Audio on bad cellphone connection Noisy background: Crowd Noisy background: Machinery [www.pdsounds.org]

Synthesizing data by introducing distortions Distortion introduced should be representation of the type of noise/distortions in the test set. Audio: Background noise, bad cellphone connection Usually does not help to add purely random/meaningless noise to your data. intensity (brightness) of pixel random noise 2x2, add noise [Adam Coates and Tao Wang]

Discussion on getting more data Make sure you have a low bias classifier before expending the effort. (Plot learning curves). E.g. keep increasing the number of features/number of hidden units in neural network until you have a low bias classifier. “How much work would it be to get 10x as much data as we currently have?” Artificial data synthesis Collect/label it yourself “Crowd source” (E.g. Amazon Mechanical Turk)

Discussion on getting more data Make sure you have a low bias classifier before expending the effort. (Plot learning curves). E.g. keep increasing the number of features/number of hidden units in neural network until you have a low bias classifier. “How much work would it be to get 10x as much data as we currently have?” Artificial data synthesis Collect/label it yourself “Crowd source” (E.g. Amazon Mechanical Turk)

Ceiling analysis: What part of the pipeline to work on next Application example: Photo OCR Ceiling analysis: What part of the pipeline to work on next Machine Learning

Character segmentation Character recognition Estimating the errors due to each component (ceiling analysis) Character segmentation Character recognition Image Text detection What part of the pipeline should you spend the most time trying to improve? Component Accuracy Overall system 72% Text detection 89% Character segmentation 90% Character recognition 100%

Another ceiling analysis example Face recognition from images (Artificial example) Camera image Preprocess (remove background) Eyes segmentation Face detection Nose segmentation Logistic regression Label Mouth segmentation

Preprocess (remove background) Another ceiling analysis example Camera image Preprocess (remove background) Eyes segmentation Logistic regression Label Nose segmentation Face detection Component Accuracy Overall system 85% Preprocess (remove background) 85.1% Face detection 91% Eyes segmentation 95% Nose segmentation 96% Mouth segmentation 97% Logistic regression 100% Mouth segmentation