IIIT Hyderabad Document Image Retrieval using Bag of Visual Words Model Ravi Shekhar CVIT, IIIT Hyderabad Advisor : Prof. C.V. Jawahar.

Presentation transcript:

IIIT Hyderabad Document Image Retrieval using Bag of Visual Words Model Ravi Shekhar CVIT, IIIT Hyderabad Advisor : Prof. C.V. Jawahar

Motivation
- A large number of printed books are being digitized.
- Digital libraries include the Universal Digital Library (UDL), the Digital Library of India (DLI), and Google Books.
- An efficient and effective methodology for content-level access is needed.
[Figure: digital library database]

Process Overview
[Diagram labels: Scanning, Documents, Processing, Index, Database, Input Query, Matching, Retrieved Documents]
- Matching can be done at two levels: "text" and "image".

Matching Approaches
- Recognition-based approach (text-level matching): optical character recognition (OCR)
- Recognition-free approach (image-level matching): word spotting

Recognition Based Approach: Optical Character Recognition (OCR)
- Binarization of the document
- Segmentation using connected components: line level, word level, character level
- Character recognition using different features such as patch and profile features
- Classification using ANN or SVM

Limitations of Recognition Based Approach
- Cuts
- Merges
- Variation in script
- Variation in font and typesetting
- Underlined and overwritten text

Recognition Free Approach: Word Spotting
- Word images are represented using global (profile) features.
- Features are matched using distance measures such as L1 and L2.
- Word images of different sizes are compared using dynamic time warping (DTW).
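A minimal sketch of how DTW can compare two word images of different widths, assuming each image has already been reduced to a sequence of per-column feature vectors. The function name `dtw_distance` and the L1 local cost are illustrative choices, not details taken from the slides.

```python
from math import inf


def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of feature vectors."""

    def local_cost(u, v):
        # L1 distance between two feature vectors of equal length
        return sum(abs(x - y) for x, y in zip(u, v))

    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = local_cost(a[i - 1], b[j - 1])
            # extend the cheapest of: skip in a, skip in b, or advance both
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

The table has one cell per pair of columns, so the cost grows with the product of the two image widths, which is consistent with the later remark that word-spotting methods are slow for real-time use.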

Why Recognition Free Approach?
- Robust OCRs are unavailable for many non-Latin languages.
- These languages have a rich heritage, and there is a need for content-level search.
- Word-spotting-based methods are too slow for a real-time system.
- Most existing retrieval methods are memory intensive.
- Scalability is an immediate challenge.

Word Image Retrieval using Bag of Visual Words

Bag of Visual Words (BoVW)
- The Bag of Words (BoW) representation is the most popular representation for text retrieval.
- Efficient BoW-based systems such as Lucene are publicly available.
- Bag of Visual Words (BoVW) performs excellently for image and video retrieval.
- A BoVW-based system is flexible, powerful, and scalable to billions of images.

BoVW Representation
- Word images are represented using a histogram of visual words.
- Codebook generation:
  - A subset of the images is used.
  - Clustering is done using hierarchical k-means (HKM).
  - HKM is faster than k-means both in building the tree and in finding nearest neighbours.
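A minimal sketch of turning a word image's local descriptors into a histogram of visual words, assuming the codebook has already been learnt. For brevity it searches the codebook linearly rather than descending an HKM tree, and the names `nearest_word` and `bovw_histogram` are hypothetical.

```python
def nearest_word(descriptor, codebook):
    """Index of the codebook centre closest to the descriptor (squared L2)."""

    def sq_dist(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v))

    return min(range(len(codebook)), key=lambda k: sq_dist(descriptor, codebook[k]))


def bovw_histogram(descriptors, codebook):
    """L1-normalised histogram of visual words for one word image."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        # hard assignment: each descriptor votes for exactly one visual word
        hist[nearest_word(d, codebook)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

Whatever the width of the word image or the number of descriptors, the output length equals the codebook size, which is the fixed-size property the deck relies on.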

BoVW based Representation
[Figures: histograms of visual words for a word image, for a word image with cuts, and for a word image with merges]

Proposed Architecture
[Figure: proposed architecture]

Advantages of BoVW based Representation
- Fixed-size representation
- Robust against degradation
- Scalable to billions of images
- Language independent
[Figures: clean, cut, and merged word images]

Spatial Verification
- Lost geometry
[Figures: spatial verification examples, including clean word images]

Re-ranking
- SIFT-based re-ranking
- The higher the total score, the better the match.

Experiments: Books Used
[Table: columns Language, #Books, #Pages, #Words; rows Hindi, Malayalam, Telugu, Bangla, Hindi; cell values missing]

Quantitative Results: Performance Statistics
[Table: columns Language, #Images, #Query, mAP after re-ranking, mAP after spatial verification; rows Hindi, Malayalam, Telugu, Bangla, Hindi; cell values missing]
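For reference, a short sketch of the standard definition of mean average precision (mAP). It assumes each query's ranked result list has been labelled with relevance flags; the slides do not spell out the exact evaluation protocol, so the function names and input format are illustrative.

```python
def average_precision(ranked_relevance, num_relevant):
    """AP for one query: mean of precision@rank taken at each relevant hit."""
    hits, precision_sum = 0, 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / num_relevant if num_relevant else 0.0


def mean_average_precision(queries):
    """queries: list of (ranked_relevance, num_relevant) pairs, one per query."""
    return sum(average_precision(r, n) for r, n in queries) / len(queries)
```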

Quantitative Results: mAP vs. Query Length
- The more characters in the query, the better the results.
[Plot: mAP vs. query length]

Quantitative Results: Retrieval Time and Index Size

#Images   Retrieval Time   Index Size
25K       50 ms            28 MB
100K      209 ms           130 MB
0.5M      411 ms           550 MB
1M        700 ms           1.2 GB

Qualitative Results
[Figures: query images and their retrieved results]
- Sample output for noisy images where commercial OCR fails

Enhancement over Bag of Visual Words based Word Image Retrieval

Query Expansion
- Observation: top-ranked results are correct.
- The top-k results are used to form a new query.
- This improves the precision of the retrieved list.
- Modified average query expansion: instead of giving equal weight to each of the top-k results, a rank-based weight (1/2^rank) is given.
- Improves mAP by 2%.
[Diagram: the query image's histogram is used to query the index; the results at ranks 1 to 6 are combined with the query histogram into a refined histogram]
[Diagram: the expanded query histogram is used to query the index again; previous results (ranks 1 to 6) are replaced by modified results (ranks 1 to 6)]
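A minimal sketch of rank-weighted average query expansion, assuming the weight for the result at rank r is 1/2^r. Keeping the original query at weight 1, dividing by the total weight, and the default `k=5` are assumptions of this sketch, and `expand_query` is a hypothetical name.

```python
def expand_query(query_hist, ranked_hists, k=5):
    """Blend the query histogram with the top-k result histograms.

    The result at rank r (1-based) contributes with weight 1 / 2**r, so
    higher-ranked results influence the new query more than lower ones.
    """
    expanded = list(query_hist)  # original query keeps weight 1 (assumption)
    total_weight = 1.0
    for rank, hist in enumerate(ranked_hists[:k], start=1):
        weight = 1.0 / (2 ** rank)
        total_weight += weight
        for i, value in enumerate(hist):
            expanded[i] += weight * value
    # divide by the total weight so the result is a weighted average
    return [v / total_weight for v in expanded]
```

The expanded histogram is then issued against the index as a new query, as in the second diagram of the slide.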

Text Query Support
- The system was originally formulated in a "query by example" setting, but users would prefer a textual interface for document image collections.
- We propose a novel and simple framework for text query support.
- A small subset of the data with ground truth, covering all possible characters in a particular language, is used.
- Visual words are learnt specific to each character and averaged across its different variations.
- Given a textual query, we synthesize its BoVW histogram.
- Text query results are comparable to word image query results.
[Diagram: query-by-example setting: input query image to histogram]
[Diagram: text query support: input text query to text query histogram]
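One way the synthesis step could look, as a hedged sketch: average the histograms observed for each character, then sum the per-character histograms of the query text and normalise. The slides do not state how character histograms are combined, so the summation, the skipping of unknown characters, and the names below are assumptions.

```python
def average_histograms(variants):
    """Average several histograms observed for the same character."""
    n = len(variants)
    return [sum(column) / n for column in zip(*variants)]


def synthesize_query_histogram(text, char_histograms):
    """Build a BoVW histogram for a text query from per-character histograms."""
    vocab_size = len(next(iter(char_histograms.values())))
    hist = [0.0] * vocab_size
    for ch in text:
        if ch not in char_histograms:
            continue  # characters without a learnt histogram are skipped (assumption)
        for i, value in enumerate(char_histograms[ch]):
            hist[i] += value
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

The synthesized histogram can then be matched against the index in the same way as a histogram computed from a query image.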

Qualitative Results
[Figure: sample output for queries using different techniques]

Vector Quantization
- In vector quantization (VQ), each feature vector is mapped to a single visual word (VW), i.e., hard assignment.
- Problems with VQ:
  - Visual word uncertainty: a single VW is chosen out of two or more possible candidates.
  - Visual word plausibility: a visual word is assigned even when the vocabulary has no suitable candidate.
- Solution: soft assignment, which maps each feature vector to two or more possible VWs.
[Figure: (a) input descriptor]

Soft Assignment
- Map each feature vector to two or more possible VWs.
- Approaches to soft assignment:
  - Distance based: equal weight, or weight based on distance in feature space (Gaussian distance). This does not minimize reconstruction error.
  - Through learning an optimal reconstruction.
[Figure: input descriptor]
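A minimal sketch of distance-based soft assignment with a Gaussian kernel, assuming a descriptor is shared among its k nearest visual words. The parameters `k` and `sigma` and the function name `soft_assign` are illustrative and not taken from the slides.

```python
from math import exp


def soft_assign(descriptor, codebook, k=2, sigma=1.0):
    """Return {visual_word_index: weight} for the k nearest visual words."""

    def sq_dist(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v))

    # k nearest codebook centres as (squared distance, index) pairs
    nearest = sorted((sq_dist(descriptor, c), i) for i, c in enumerate(codebook))[:k]
    d_min = nearest[0][0]
    # Gaussian kernel on distance; subtracting d_min keeps the weight ratios
    # unchanged and stops every weight underflowing to zero for far descriptors
    raw = [(i, exp(-(d - d_min) / (2.0 * sigma ** 2))) for d, i in nearest]
    norm = sum(w for _, w in raw)
    return {i: w / norm for i, w in raw}
```

These weights depend only on distances to the visual words, so nothing in them is chosen to reconstruct the descriptor well, which is the limitation the slide notes for distance-based schemes.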

Locality-constrained Linear Coding (LLC)
- Similar patches should have similar codes.
- The locality of visual words is used to describe the feature vector.
- LLC coding process:
  1. Find the K nearest neighbours of x_i, denoted B.
  2. Reconstruct x_i using B.
  3. Replace the input x_i with the non-zero code obtained from the previous step.
[Figure: input descriptor]
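The three steps can be sketched with the commonly used approximated LLC solver: take the K nearest visual words, solve a small regularised least-squares problem whose weights sum to one, and place those weights in an otherwise zero code. This is a sketch under stated assumptions, not the authors' implementation; the regulariser `lam`, the helper `solve`, and the pure-Python linear algebra are choices made here for self-containment.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def llc_code(x, codebook, k=2, lam=1e-4):
    """Approximated LLC code of descriptor x over the given codebook."""

    def sq_dist(u, v):
        return sum((p - q) ** 2 for p, q in zip(u, v))

    # Step 1: indices of the k nearest visual words (the local base B)
    nearest = sorted(range(len(codebook)), key=lambda i: sq_dist(x, codebook[i]))[:k]
    k = len(nearest)
    # Step 2: reconstruct x from B; shift B by x and form the local covariance
    shifted = [[b - xi for b, xi in zip(codebook[i], x)] for i in nearest]
    cov = [[sum(p * q for p, q in zip(u, v)) for v in shifted] for u in shifted]
    trace = sum(cov[i][i] for i in range(k))
    reg = lam * trace if trace > 0 else lam  # keeps the system well conditioned
    for i in range(k):
        cov[i][i] += reg
    w = solve(cov, [1.0] * k)
    s = sum(w)
    # Step 3: code is zero except at the k neighbours, with weights summing to 1
    code = [0.0] * len(codebook)
    for idx, wi in zip(nearest, w):
        code[idx] = wi / s
    return code
```

For a descriptor midway between two visual words the code puts roughly half the weight on each, so nearby descriptors receive similar codes.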

Re-ranking
- SIFT-based re-ranking [1]
- Longest common subsequence (LCS) based re-ranking [2]
  - The score is the size of the LCS of the visual words projected on the x-axis.
  - The larger the size, the better the match.
- Linear combination [2]: Final Score = λ * Index_Score + (1 - λ) * Re-ranking_Score, where λ is a weighting parameter.
[Figure: visual words V1, V2, V6, V4, V4, V8, V9 of a word image, with x and y axes]

[1] Ravi Shekhar, C. V. Jawahar: Word Image Retrieval Using Bag of Visual Words. DAS.
[2] Ismet Zeki Yalniz, R. Manmatha: An Efficient Framework for Searching Text in Noisy Document Images. DAS 2012.
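A minimal sketch of LCS-based re-ranking: order each word image's visual words by x-coordinate, take the length of the longest common subsequence of the two orderings, and blend it with the index score. It assumes keypoints are available as `(x, visual_word_id)` pairs and that both scores are on comparable scales; the function names and the default `lam` are illustrative.

```python
def words_along_x(keypoints):
    """keypoints: (x, visual_word_id) pairs; return the ids ordered by x."""
    return [word for _, word in sorted(keypoints, key=lambda p: p[0])]


def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            # extend a match diagonally, else carry the best neighbouring value
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def final_score(index_score, rerank_score, lam=0.5):
    """Final Score = lam * Index_Score + (1 - lam) * Re-ranking_Score."""
    return lam * index_score + (1 - lam) * rerank_score
```

Unlike the plain histogram, the LCS respects the left-to-right order of visual words, which restores part of the geometry that BoVW discards.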

Dataset Used: Books Used for the Experiments
[Table: columns Book, #Pages, #Words; rows Telugu, Telugu, English; cell values missing]

Quantitative Results: LLC Based Statistics (mAP)
[Table: columns Book, BoVW, BoVW + SIFT re-ranking, BoVW + LCS re-ranking, LLC, LLC + LCS re-ranking; rows Telugu, Telugu, English; cell values missing]

Quantitative Results: Text Query Based Statistics

Book           Method       mAP
Telugu-1716    Text Query   (missing)
Telugu-1718    Text Query   0.90
English-1601   Text Query   0.87

Patch Based Word Image Retrieval
- A feature is designed based on patches.
- Each patch is represented using profile features:
  - Projection profile: measures the ink distribution of the word image.
  - Ink transition: measures the internal shape of the image.
  - Upper word profile: distance from the upper boundary of the word image.
  - Lower word profile: distance from the lower boundary of the word image.

Overview of Feature Calculation
- Input: word image.
- Calculate the 4 profile features: projection profile, lower word profile, ink transition, upper word profile.
- Concatenate the 4 profile features to form the descriptor.
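A minimal sketch of the four profile features for one binarized patch, assuming the patch is a list of rows with 1 for ink and 0 for background. Counting background-to-ink transitions down each column, using the patch height when a column has no ink, and the concatenation order are all assumptions of this sketch.

```python
def profile_features(image):
    """Concatenate projection, ink-transition, upper and lower profiles."""
    height, width = len(image), len(image[0])
    projection, transitions, upper, lower = [], [], [], []
    for c in range(width):
        column = [image[r][c] for r in range(height)]
        ink_rows = [r for r, v in enumerate(column) if v]
        # projection profile: number of ink pixels in the column
        projection.append(len(ink_rows))
        # ink transitions: background-to-ink changes going down the column
        transitions.append(
            sum(1 for r in range(height) if column[r] and (r == 0 or not column[r - 1]))
        )
        # upper profile: distance from the top edge to the first ink pixel
        upper.append(ink_rows[0] if ink_rows else height)
        # lower profile: distance from the bottom edge to the last ink pixel
        lower.append(height - 1 - ink_rows[-1] if ink_rows else height)
    return projection + transitions + upper + lower
```

For a patch of width w the descriptor has 4w entries, so patches of equal width yield descriptors of equal length.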

Fast Pre-Processing
- Convert the input patch to its corresponding patch vector.
- Check whether the patch vector is present in the lookup table (V1, V2, V3, ..., Vk).
  - Yes: retrieve the corresponding visual word.
  - No: find the corresponding visual word and update the lookup table.
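The lookup flow can be sketched as a dictionary cache in front of the visual word search. `make_cached_quantizer` and `find_visual_word` are hypothetical names; the latter stands for whatever nearest-visual-word search the system uses.

```python
def make_cached_quantizer(find_visual_word):
    """Wrap a visual word search with a patch-vector lookup table."""
    table = {}

    def quantize(patch_vector):
        key = tuple(patch_vector)  # lists are not hashable, tuples are
        if key not in table:
            # miss: run the search once and update the lookup table
            table[key] = find_visual_word(patch_vector)
        # hit (or freshly stored): retrieve the visual word from the table
        return table[key]

    return quantize, table
```

The cache only pays off when identical patch vectors recur, which is plausible for binarized patches of printed text but is not guaranteed in general.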

Dataset Used
[Table: columns Book, #Pages, #Words; rows Telugu, English; cell values missing]

Quantitative Results: Baseline Statistics

Book          Method                       mAP
Telugu-1718   SIFT                         (missing)
Telugu-1718   Patch                        0.53
Telugu-1718   Patch Feature                (missing)
Telugu-1718   Patch Feature with Overlap   0.7214

Quantitative Results: Enhancement on Baseline Statistics
[Table: columns Enhancement Method, SIFT, Patch Feature; rows Query Expansion, Spatial Verification, LCS Re-ranking; cell values missing]

Quantitative Results: Results with Split Features
[Table: columns Book, SIFT, Patch Feature; rows Telugu, English; cell values missing]

Qualitative Results
[Figure: qualitative results]

Contributions
- Language-independent system: tested on 4 different languages.
- Scalable to huge datasets: tested on 1 million word images.
- Handles noisy document images: performance demonstrated on a dataset where commercial OCR fails.
- Enhancements over baseline results: query expansion and text query support.
- Document-specific sparse coding.
- A document-specific descriptor is proposed.

Future Work
- Test on datasets with different fonts.
- Develop similar methods for handwritten and camera-based datasets.
- Learn character-level visual words automatically using annotated data.
- Multi-keyword support.
- Combine recognition-based and recognition-free methods.
- Improve the patch-based descriptor.

Related Publications
- Ravi Shekhar and C. V. Jawahar, “Word Image Retrieval using Bag of Visual Words”, in Proceedings of the 10th IAPR International Workshop on Document Analysis Systems (DAS).
- Praveen Krishnan, Ravi Shekhar and C. V. Jawahar, “Content Level Access to Digital Library of India Pages”, in Proceedings of the 8th Indian Conference on Vision, Graphics and Image Processing (ICVGIP).
- Ravi Shekhar and C. V. Jawahar, “Document Specific Sparse Coding for Word Retrieval”, in Proceedings of the 12th International Conference on Document Analysis and Recognition (ICDAR), 2013.

IIIT Hyderabad Thanks !!!