Con-Text: Text Detection Using Background Connectivity for Fine-Grained Object Classification
Sezer Karaoglu, Jan van Gemert, Theo Gevers

Can we achieve better object recognition with the help of scene text?

Goal
Exploit details conveyed by text in the scene to improve visual classification of very similar instances.
Applications: linking Google Street View images to textual business information such as the Yellow Pages, geo-referencing, information retrieval.
[Example images with scene text such as SKY, CAR, DJ, SUBS, Breakfast, Starbucks Coffee]

Challenges of Text Detection in Natural Scene Images
- Lighting
- Surface reflections
- Unknown background
- Non-planar objects
- Unknown text font
- Unknown text size
- Blur

Literature Review: Text Detection
Texture-based: Wang et al., "End-to-End Scene Text Recognition", ICCV '11
- High computational complexity
- Dataset specific
- Does not rely on heuristic rules
Region-based: Epshtein et al., "Detecting Text in Natural Scenes with Stroke Width Transform", CVPR '10
- Connectivity is hard to define
- Segmentation helps to improve OCR performance

Motivation for Removing the Background Before Text Detection
- Reduces the number of image regions that need further processing.
- Reduces false positives caused by text-like image regions (fences, bricks, windows, and vegetation).
- Reduces dependency on text style.

Proposed Text Detection Method
Pipeline: automatic BG (background) seed selection → BG reconstruction → text detection by BG subtraction.
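To make the flow of these three stages concrete, here is a minimal Python glue-code sketch. The functions select_bg_seeds and reconstruct_background are stand-ins for the components described on the next two slides (illustrative sketches follow those slides), and the threshold value is an assumption, not a number from the paper.

import numpy as np

def detect_text(image_gray, cue_maps, select_bg_seeds, reconstruct_background,
                threshold=25.0):
    """Hypothetical glue code for the three stages on this slide.
    `select_bg_seeds` and `reconstruct_background` stand for the components
    sketched after the next two slides; `threshold` is an assumed value."""
    seeds = select_bg_seeds(image_gray, cue_maps)            # 1) automatic BG seed selection
    background = reconstruct_background(seeds, image_gray)   # 2) BG reconstruction
    residual = image_gray.astype(np.float32) - background    # 3) BG subtraction
    return residual > threshold                              # candidate text mask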

Background Seed Selection
Color boosting, contrast, and objectness responses are used as features. A random forest classifier with 100 trees, selected based on out-of-bag error, is used; each tree is constructed with three random features, and nodes are split according to the Gini criterion.
[Figure: original image and its color boosting, contrast, and objectness maps]
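A minimal scikit-learn sketch of such a seed classifier, assuming the per-pixel color boosting, contrast, and objectness responses have already been computed; the training data below is random placeholder data and all variable names are illustrative.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Placeholder training data: per-pixel [color boosting, contrast, objectness]
# feature vectors with binary labels (1 = background seed). Illustrative only.
rng = np.random.default_rng(0)
X_train = rng.random((1000, 3))
y_train = rng.integers(0, 2, size=1000)

rf = RandomForestClassifier(
    n_estimators=100,    # 100 trees, as stated on the slide
    max_features=3,      # three random features considered per split
    criterion="gini",    # Gini splitting criterion
    oob_score=True,      # out-of-bag error used to assess the forest
    n_jobs=-1,
)
rf.fit(X_train, y_train)
print("OOB accuracy:", rf.oob_score_)

def select_bg_seeds(image_gray, cue_maps, clf=rf):
    """Predict a binary background-seed mask for one image.
    `cue_maps` is assumed to be an (H, W, 3) stack of the three cue responses."""
    h, w = image_gray.shape
    flat = cue_maps.reshape(-1, cue_maps.shape[-1])
    return clf.predict(flat).reshape(h, w).astype(bool)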

Conditional Dilation for BG Connectivity
The background is reconstructed by conditional (geodesic) dilation, iterated until stability:
    R_k = min(R_{k-1} ⊕ B, X),   repeat until R_k = R_{k-1},
where B is the structuring element (a 3×3 square), the initial marker R_0 is taken from M, the binary image in which BG seeds are ones, and X is the gray-level input image.
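A minimal NumPy/SciPy sketch of this reconstruction, assuming the marker is initialized as the gray-level image restricted to the seed pixels (a common initialization for grayscale reconstruction by dilation); names are illustrative.

import numpy as np
from scipy import ndimage

def reconstruct_background(seeds, image_gray):
    """Grayscale reconstruction of the background by conditional (geodesic) dilation.
    seeds: boolean background-seed mask; image_gray: gray-level input image X."""
    image = image_gray.astype(np.float32)
    marker = np.where(seeds, image, 0.0)        # assumed initialization from the seeds
    footprint = np.ones((3, 3), dtype=bool)     # structuring element B: 3x3 square
    while True:
        dilated = ndimage.grey_dilation(marker, footprint=footprint)
        nxt = np.minimum(dilated, image)        # dilate, then constrain by the image
        if np.array_equal(nxt, marker):         # stop once the result is stable
            return nxt
        marker = nxt

# Text detection then amounts to BG subtraction, e.g. (assumed threshold):
#   text_mask = (image_gray.astype(np.float32) - reconstruct_background(seeds, image_gray)) > 25

Where scikit-image is available, skimage.morphology.reconstruction(marker, image, method="dilation") computes the same fixed point without the explicit loop.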

Text Recognition Experiments
ICDAR'03 dataset with 251 test images, 5,370 characters, and 1,106 words.

ICDAR 2003 Dataset: Character Recognition Results

Method            Recognition rate (%)
ABBYY             36
Karaoglu et al.   62
Proposed          63

The proposed system removes 87% of the non-text regions (on average, 91% of the regions in the test set are non-text) while retaining approximately 98% of the text regions.

ImageNet Dataset
- ImageNet building and place-of-business dataset (28 classes; the largest dataset used for scene text recognition to date). The images do not necessarily contain scene text.
- Visual features: 4,000 visual words, standard gray SIFT only.
- Text features: bag-of-bigrams over the OCR results obtained for each image in the dataset.
- 3 repeats, to compute standard deviations of average precision.
- Histogram intersection kernel in LIBSVM.
- Text-only, visual-only, and fused results are compared.
[Example classes: Steak, Pizzeria, Funeral, Bakery, Discount House, Country House]
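A minimal sketch of the text channel, assuming "bag-of-bigrams" refers to character bigrams of the per-image OCR output, and using scikit-learn's SVC with a precomputed histogram intersection kernel as a stand-in for LIBSVM; the OCR strings and labels below are placeholder data.

import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import SVC

# Placeholder OCR output per image and class labels (illustrative only).
ocr_strings = ["starbucks coffee", "coffee house", "steak house grill", "grill steaks"]
labels = [0, 0, 1, 1]

# Bag of character bigrams over the OCR output of each image.
vectorizer = CountVectorizer(analyzer="char", ngram_range=(2, 2))
X_text = vectorizer.fit_transform(ocr_strings).toarray().astype(np.float64)

def histogram_intersection(A, B):
    """Histogram intersection kernel: K(a, b) = sum_i min(a_i, b_i)."""
    return np.minimum(A[:, None, :], B[None, :, :]).sum(axis=2)

K_train = histogram_intersection(X_text, X_text)
clf = SVC(kernel="precomputed")
clf.fit(K_train, labels)

# At test time, compute the kernel between test and training samples:
#   K_test = histogram_intersection(X_test, X_text); clf.predict(K_test)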

Fine-Grained Building Classification Results (average precision)

Text (OCR):           15.6 ± 0.4
Visual (BoW):         32.9 ± 1.7
Fusion (BoW + OCR):   39.0 ± 2.6

[Figure: rankings of Discount House images under the visual-only, text-only, and proposed fused classifiers]
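The slides do not specify how the text and visual channels are fused. One common option with precomputed kernels is to combine the two kernel matrices before training a single SVM; the sketch below illustrates that kind of kernel averaging as an assumption, not necessarily the fusion scheme used in the paper.

import numpy as np
from sklearn.svm import SVC

def fuse_kernels(K_visual, K_text, alpha=0.5):
    """Convex combination of two precomputed kernels.
    alpha is a hypothetical mixing weight (0.5 = equal weight)."""
    return alpha * K_visual + (1.0 - alpha) * K_text

# Toy example with random histogram features for both channels.
rng = np.random.default_rng(0)
F_visual = rng.random((4, 6))   # placeholder bag-of-visual-words histograms
F_text = rng.random((4, 5))     # placeholder bag-of-bigram histograms
hik = lambda A, B: np.minimum(A[:, None, :], B[None, :, :]).sum(axis=2)
K_visual, K_text = hik(F_visual, F_visual), hik(F_text, F_text)
labels = [0, 0, 1, 1]

clf = SVC(kernel="precomputed")
clf.fit(fuse_kernels(K_visual, K_text), labels)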

Conclusion
- Background removal is a suitable approach for scene text detection.
- A new text detection method using background connectivity together with color, contrast, and objectness cues is proposed.
- Improved scene text recognition performance.
- Improved fine-grained object classification performance by fusing visual and scene text information.

Demo