FRE 2645 SCSIT Talk, Nottingham University, Thursday 16th June 2005 Indexing of Graphic Document Images : a Perceptive Approach Mathieu Delalandre¹, ².

Presentation transcript:

FRE 2645
SCSIT Talk, Nottingham University, Thursday 16th June 2005
Indexing of Graphic Document Images: a Perceptive Approach
Mathieu Delalandre¹, ²
¹ PSI Laboratory, Rouen University, France
² SCSIT, Nottingham University, UK

Who am I?
- Mathieu Delalandre
- Thesis: fourth year of PhD (defence in September)
- Lab: PSI Laboratory, Rouen, France
- Supervisors: E. Trupin, J.M. Ogier, J. Labiche
- Team: S. Adam, H. Locteau, P. Héroux, E. Barbu, Y. Lecourtier
- Field: Document Image Analysis (Graphics Recognition)
- Postdoc: IPI, SCSIT, from April to September (4-5 months), with Tony Pridmore

Indexing of Graphic Document Images: a Perceptive Approach
- Introduction
- Systems Overview
- The Knowledge Level
- Conclusion

Introduction: Indexing & Retrieval (I & R)
- Indexing & Retrieval [Greengrass’00]
  - Indexing: identification and recording of attributes of data that will aid retrieval.
  - Retrieval: ability of a database management system to get back data that were stored there previously.
- Applications
  - videos (MPEG, AVI, ...)
  - Web pages (XML, XHTML, ...)
  - structured documents (PDF, PS, Word, ...)
  - images (JPG, GIF, ...)
Section outline: Indexing & Retrieval (I & R); Categorization of Images; I & R of Document Images; My Topic

Introduction: Categorization of Images
[Figure labels: document images; trademark; logo; heading; journal; manual; photographs; foreground/background images]

Introduction: I & R of Document Images (1/3)
- Today, document images are not indexed by search engines, due to the complexity of the Document Image Analysis (DIA) task [Doerman’98] [Walker’00] [Baird’03].
- Is indexing of document images really needed? This raises two questions.
- Question: how many document images are there, and where [Spring’95] [Cleveland’98] [Steve’99] [Ouf’01] [Baird’03] [Hu’04]?
[Figure labels: Web Pages; Images; Markup Languages (HTML, XHTML, ...); 30%, 70%; Document Images (logos, headings, ...); Photographs; 60%, 40%]
[Figure labels: Deep Web; Web ( ko); 0.3%, 99.3%; Digital Libraries; Others (software, databases, ...); large (or main) part; Document Images; Structured Documents; minor part, main part]

Introduction: I & R of Document Images (2/3)
- Question: new or just old document images?
- Paper (and image) has too many desirable properties; document images and structured documents will increasingly co-exist in the future [Breul’04].

Introduction: I & R of Document Images (3/3)
- To conclude:
  - (1) DIA is needed, and will remain needed, for the future of I & R of documents [Baird’03] [Breul’04].
  - (2) DIA must come back today by way of I & R [Baird’03].

Introduction: My Topic
- Indexing of graphic document images
- Indexing & Retrieval
  - Indexing
    - Identification and recording of attributes of data that will aid retrieval
    - First step before retrieval
- document images
  - graphic document images
[Figure labels: line drawing; symbol; logo; Asian script; historical heading]

Systems Overview: Introduction
- Overview of systems to index graphic document images: we call them Graphics Indexing Systems.
- Graphics Indexing Systems are specialized from DIA systems applied to the recognition and understanding of graphic document images [Tombre’03]: we call these Graphics Recognition Systems.
Section outline: Introduction; Graphics Recognition Systems; Graphics Indexing Systems; Open Problems

Systems Overview: Graphics Recognition Systems (1/3)
- Applications deal with the graphics parts (symbol and linear): text/graphics segmentation [Tombre’02], vectorisation [Mejbri’02], symbol recognition [Llados’02], document interpretation (or understanding) [Ablameko’00], ...
- Graphics Recognition Systems: from graphic document images to structured documents
[Figure labels: symbol; linear; text]

Systems Overview: Graphics Recognition Systems (2/3)
- Graphics are structured and connected.
- Graphics Recognition Systems are based on structural methods: “relational organization of low-level features (graphic primitives) into higher-level structures (graph)” [Tombre’96] [Shi’89].
[Figure labels: symbol and its structure; connected symbol in drawing; low-level features (graphic primitives): line, connect point, T link; higher-level structure (graph): line, connect edge, T edge; symbol recognition]

Systems Overview: Graphics Recognition Systems (3/3)
- Graphic primitive extraction, some methods [Wenyin’98] [Delalandre’04]: skeletonization [Hilaire’04], contouring [Ramel’00], tracking [Song’00], labelling [Badawy’02], transform [Couasnon’01], meshes [Vaxiviere’95], region segmentation [Cao’00], run-length [Burge’98], ...
- Recognition: Graph Matching [Bunke’00], Graph Transform [Blostein’05], Primitive Matching [Foggia’99], ...
- Architecture of Graphics Recognition Systems: document images -> Graphic Primitive Extraction -> graph of graphic primitives -> Recognition (with Graphic Models) -> structured document

Systems Overview: Graphics Indexing Systems (1/3)
- Graphics Indexing Systems [Doerman’98] [Tombre’03] fall into 3 classes:
  - Title block recognition [Arias’98], [Najman’01], [Lamiroy’02], ...
  - Statistical framework [Samet’96], [Worring’99], [Tabbone’03], [Terrades’03], ...
  - Graphics indexing [Kasturi’88], [Lorenz’95], [Huang’97], [Hu’97], [Barbu’04], [Valasoulis’04], ...
[Figure labels: connected, so not matched; partial matching]

Systems Overview: Graphics Indexing Systems (2/3)
- Architecture of Graphics Indexing Systems
[Figure labels: Graphic Primitive Extraction; Indexing; graph of graphic primitives; indexing attributes; specific set of graphic primitives; Index (attributes + document links)]
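
The "Index (attributes + document links)" box can be pictured as an inverted index. The following is only a minimal sketch under that assumption; the function names, attribute labels and file names are hypothetical and do not come from the talk.

```python
from collections import defaultdict


def build_index(documents):
    """Map each indexing attribute to the set of document links carrying it.

    `documents` maps a document link to an iterable of attribute labels.
    """
    index = defaultdict(set)
    for link, attributes in documents.items():
        for attribute in attributes:
            index[attribute].add(link)
    return index


def retrieve(index, query_attributes):
    """Return the links of documents that carry every queried attribute."""
    candidate_sets = [index.get(a, set()) for a in query_attributes]
    if not candidate_sets:
        return set()
    return set.intersection(*candidate_sets)


# Hypothetical attributes that an indexing stage might have recorded.
documents = {
    "plan_a.tif": {"loop", "long_line"},
    "plan_b.tif": {"loop"},
    "logo_c.tif": {"long_line", "dense_region"},
}
index = build_index(documents)
matches = retrieve(index, ["loop", "long_line"])
```

Indexing runs once per document, while retrieval only touches the index, which fits the separation between the two stages named in the figure.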

Systems Overview: Graphics Indexing Systems (3/3)
Work | Graphic primitives extraction | Graph of graphic primitives | Indexing
[Huang’97] | thinning and chaining | line graph of skeleton | cycle search, width and length matching of lines
[Kasturi’88] | run length encoding and polygonisation | straight line graph of contours and skeleton | Fourier approximation of line graph
[Lorenz’95] | contouring and polygonisation | 2-D strings of contours | string matching
[Barbu’04] | thinning and neighbour analysis of skeleton’s pixels | region adjacency graph | graph mining
[Hu’04] | thinning, chaining, and polygonisation | set of straight lines of skeleton | string matching
[Dosh’04] | thinning, chaining, and polygonisation | set of straight lines of skeleton | vectorial signature
[Figure labels: thinning; contouring; region graph; skeleton graph; statistical; structural]

Systems Overview: Open Problems (1/2)
- All these systems use a Lexical/Syntactic (or Bottom/Up) approach [Tombre’96]
  - Lexical (Bottom): extraction of graphical primitives from images in a fixed way
  - Syntactic (Up): analysis of graphical primitives without returning to the image
- So all these systems use a Document Understanding approach, but I & R is not an Understanding problem.
Group | Criterion | Understanding | I & R
complexity | Image Size | large | small and medium
complexity | Data Base Size | small | large
complexity | Process Execution | one shot | every-time
robustness | Graphic Primitives | accurate | approximated
robustness | Noise Level | high and medium | low and medium
content adaptation | Prior Knowledge | yes | no
content adaptation | Document Class | few and known | several and unknown
- Content adaptation is the most important feature of I & R systems.

Systems Overview: Open Problems (2/2)
- Examples of content adaptation
  - A broad class of documents
  - Context
- To conclude
  - An I & R system must deal with content adaptation.
  - Content adaptation can’t be solved without a knowledge-based approach.
[Figure labels: region based [Roque’03]; both based [Ramel’00]; line based [Hilaire’04]; text/graphics segmentation; noise adaptation]

The Knowledge Level: Introduction
- Some (general) definitions [Tuthill’90] [Holsapple’04]
  - Knowledge: human mental grasp of reality
  - Representation: placement (and meaning) of knowledge into (from) computer memory
  - Formalism: a set of symbols corresponding to knowledge inside computers
- Different types of knowledge
  - on strategies []
  - on case based reasoning []
  - on ontologies []
  - ...
[Figure labels: Knowledge (Human); Formalism(s) (Computer); placement; meaning; Human/Computer]
Section outline: Introduction; Graphical Knowledge; Graphics Model; a Perceptive Approach

The Knowledge Level: Graphical Knowledge (1/2)
- Graphical Knowledge [Delalandre’05]: a type of knowledge corresponding to the human mental grasp of graphics
[Figure: Levels of Graphical Knowledge. Labels: formalism levels (pixel-based formalisms, vector-based formalisms, graph-based formalisms; graphic primitives, high-level objects); abstraction levels (image, symbol; perception, interpretation); "it is a gate!"]

The Knowledge Level: Graphical Knowledge (2/2)
- Two formalism levels [Tombre’96]
  - Graphic primitives [Murray’96]
    - Pixel-based formalism: pixel, raster, run, connected component, ...
    - Vector-based formalism: vector, arc, curve, ellipse, square, ...
  - Graph-based formalisms [Sowa’99]: Relational Attributed Graphs (RAG), Frames, Object-Oriented Languages, ...
[Figure: Relational Attributed Graphs [Seong’93]. Labels: primitives; line; images]
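
To make the idea of a relational attributed graph concrete, here is a small sketch in which nodes stand for graphic primitives and edges for relations between them, both carrying attributes. The class name, attribute names and the "T"/"L" relation labels are illustrative assumptions, not the formalism of [Seong’93].

```python
from dataclasses import dataclass, field


@dataclass
class AttributedGraph:
    """Hypothetical minimal relational attributed graph (illustration only)."""
    nodes: dict = field(default_factory=dict)  # node id -> attribute dict
    edges: dict = field(default_factory=dict)  # (id, id) -> attribute dict

    def add_node(self, node_id, **attributes):
        self.nodes[node_id] = attributes

    def add_edge(self, a, b, **attributes):
        # Both endpoints must already be declared as nodes.
        if a not in self.nodes or b not in self.nodes:
            raise KeyError("both endpoints must be existing nodes")
        self.edges[(a, b)] = attributes

    def neighbours(self, node_id):
        """Return the ids of nodes sharing an edge with `node_id`, sorted."""
        found = []
        for a, b in self.edges:
            if a == node_id:
                found.append(b)
            elif b == node_id:
                found.append(a)
        return sorted(found)


# Three line primitives linked by two junction relations (assumed labels).
graph = AttributedGraph()
graph.add_node("l1", kind="line", length=40.0)
graph.add_node("l2", kind="line", length=25.0)
graph.add_node("l3", kind="line", length=25.0)
graph.add_edge("l1", "l2", relation="T")
graph.add_edge("l2", "l3", relation="L")
print(graph.neighbours("l2"))
```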

The Knowledge Level: Graphics Model (1/2)
- Model [Seguela’01]: a knowledge representation using given formalisms, for a given system’s purposes
- Graphics Model [Delalandre’05]: a model used to represent graphical knowledge
[Figure: a (simple) shape and its graphic primitives (extremity, junction, line). Line-based model: junction edge, line. Junction-based model: extremity, junction, line edge.]

The Knowledge Level: Graphics Model (2/2)
- One system = one model, hence a considerable number of models: [Joseph’92] [Pasternak’93] [Han’94] [Burgue’95] [Yu’97] [Lee’98] [Ramel’00] [Couasnon’01] [Badawy’02] [Yan’04] ...
- Models depend on the extracted graphic primitives; we can define a graphics model taxonomy of 3 classes [Delalandre’05]:
  - region-based models: component, loop, neighbour, include
  - contour-based models: quadrilateral, line link, junction link
  - skeleton-based models: extremity, junction, line edge

The Knowledge Level: a Perceptive Approach (1/6)
[Figure labels: Region Level; Contour Level; Skeleton Level; Perception; Level of Representations; Global; Local; two links between levels: specialisation, aggregation]

The Knowledge Level: a Perceptive Approach (2/6)
[Figure labels: Region Level; Contour Level; Skeleton Level; Perception; Level of Representations; Global; Local; classic models; hybrid models; perceptive approach (jump or browse)]

The Knowledge Level: a Perceptive Approach (3/6)
- First step, the region level: connected component analysis [Alnuweiri’92]
[Figure labels: foreground; background; foreground’s components; background’s components; main background; loops]
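
Connected component analysis at the region level can be sketched with a plain flood fill. This is not the algorithm of [Alnuweiri’92]; it assumes a binary image given as lists of 0/1 values, 4-connectivity for both foreground and background, and it treats any background component that touches the image border as main background and the others as loops. All function names are hypothetical.

```python
from collections import deque


def label_components(image, target):
    """Return the 4-connected components of pixels equal to `target`.

    `image` is a list of equal-length rows of 0/1 values (1 = foreground).
    Each component is a set of (row, col) pixels.
    """
    rows, cols = len(image), len(image[0])
    seen = set()
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != target or (r, c) in seen:
                continue
            # Breadth-first flood fill from an unvisited seed pixel.
            component = set()
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                component.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] == target
                            and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            components.append(component)
    return components


def split_background(image):
    """Separate border-touching background components from enclosed loops."""
    rows, cols = len(image), len(image[0])
    main, loops = [], []
    for comp in label_components(image, 0):
        touches_border = any(
            y in (0, rows - 1) or x in (0, cols - 1) for y, x in comp
        )
        (main if touches_border else loops).append(comp)
    return main, loops


# A ring (with one enclosed loop) next to a vertical bar.
IMAGE = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [0, 1, 0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
foreground = label_components(IMAGE, 1)
main_background, loops = split_background(IMAGE)
```

On this toy image the sketch finds two foreground components (the ring and the bar), one main background component, and one loop made of the single pixel enclosed by the ring.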

The Knowledge Level: a Perceptive Approach (4/6)
- Six features
  - (F) Foreground
  - (B) Background
  - (R) Resolution (i.e. distance)
  - (N) Neighbouring
  - (S) Size
  - (I) Inclusion
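
The slide only names the features, so the following sketch rests on loudly stated assumptions: components are sets of (row, col) pixels, Size is a pixel count, Inclusion is approximated by strict bounding-box containment, and Resolution is read as a distance threshold that decides Neighbouring. None of these definitions is confirmed by the talk, and all function names are hypothetical.

```python
def size(component):
    """(S) Size, assumed here to be the number of pixels."""
    return len(component)


def bounding_box(component):
    """Return (min_row, min_col, max_row, max_col) of a component."""
    ys = [y for y, _ in component]
    xs = [x for _, x in component]
    return min(ys), min(xs), max(ys), max(xs)


def includes(outer, inner):
    """(I) Inclusion, approximated by strict bounding-box containment."""
    oy0, ox0, oy1, ox1 = bounding_box(outer)
    iy0, ix0, iy1, ix1 = bounding_box(inner)
    return oy0 < iy0 and ox0 < ix0 and iy1 < oy1 and ix1 < ox1


def distance(a, b):
    """Smallest Chebyshev distance between any two pixels of a and b."""
    return min(
        max(abs(ay - by), abs(ax - bx)) for ay, ax in a for by, bx in b
    )


def neighbours(component, others, resolution):
    """(N) Neighbouring: components within `resolution` pixels (assumed (R))."""
    return [
        other for other in others
        if other is not component and distance(component, other) <= resolution
    ]


# A foreground ring, the background loop it encloses, and a foreground bar.
ring = {(1, 1), (1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2), (3, 3)}
hole = {(2, 2)}
bar = {(1, 5), (2, 5), (3, 5)}
components = [ring, hole, bar]

near_ring = neighbours(ring, components, resolution=1)
wider_ring = neighbours(ring, components, resolution=2)
```

Raising the resolution from 1 to 2 pixels brings the bar into the ring's neighbourhood, which is one way to picture how a query could combine a distance with a neighbouring constraint.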

The Knowledge Level: a Perceptive Approach (5/6)
- Use-case queries
[Figure labels: starting image; FR 1; FR 2; BR 2; BR 2 S 2; BR 2 S 2 N 2]

The Knowledge Level: a Perceptive Approach (6/6)
- True-life query
[Figure labels: FS 1; BR 2 N>2]

Conclusion
- Conclusion
  - This is only a bibliographic study and a set of ideas.
  - Start on these ideas?
- Perspectives
  - Contour and skeleton levels?
  - A system to control the building of the representation?