Retrieval of the Ornaments from the Hand-Press Period: an Overview
Etienne Baudrier, LSIIT (Illkirch, France); Sébastien Busson, CESR (Tours, France); Silvio Corsini, BCU (Lausanne, Switzerland); Mathieu Delalandre, CVC (Barcelona, Spain); Jérôme Landré, CReSTIC (Troyes, France); Frédéric Morain-Nicolier, CReSTIC (Troyes, France)
Plan
About this work …
Hand Press Period
About Ornaments
Digital Collections of Ornaments
How DIA can help?
Content Based Image Retrieval
Visual Comparison
Conclusions and Perspectives
About this work …
One-day Workshop, 13th November 2007, CESR, Tours, France
Participants: CESR, labs of human science, labs of computer science
Computer Science People: 1. Etienne Baudrier 2. Mickael Coustaty 3. Mathieu Delalandre 4. Nathalie Girard 5. Nicholas Journet 6. Dimosthenis Karatzas 7. Jérôme Landré 8. Kamel Ait-Mohand 9. Jean-Marc Ogier 10. Nicolas Ragot 11. Jean-Yves Ramel
Human Science People: 1. Pierre Aquilon 2. Sébastien Busson 3. Silvio Corsini 4. Marie-Luce Demonet 5. Stephen Rawles 6. Toshinori Uetani
Hand Press Period (1/2)
The hand-press period runs from around 1454 (approximate date of Gutenberg’s invention) through the first half of the nineteenth century (when mechanized presses started to appear).
Timeline: 1454, Gutenberg's hand press; first half of the 19th century, mechanized presses.
Figures: a hand-press book; a hand press; a character matrix.
Hand Press Period (2/2)
HPB Database: 22 European libraries; period up to the first half of the 19th century; 3 million books.
Trinity old library (Dublin, Ireland): 16th century to today.
Subjects: mathematics, medicine, history, music, religion, literature, etc.
About Ornaments (1/2)
Ornaments in pages (figure labels: ornaments, text)
Categories of ornaments:
“lettrine”: to start a paragraph
“fleuron”: trademark of a printing house
“cul de lampe”: to close a part or a chapter
“emblème”: to epitomize a concept, or to represent a person, such as a king or saint
About Ornaments (2/2)
Share of ornaments in books (BVH dataset, 46 books; sciences, medicine, religion …):
Per page: 3.4 ornaments
Per book: 103.4 ornaments
Foreground pixels [Journet’05]: text 63%, graphics 37%
Ornaments make up a large part of hand-press books. Pictures were a powerful means of communication in this period because of the low level of education of the population.
Digital Collections of Ornaments (1/2)
Processing chain: digitization; pre-processing (deskew, lighting correction, filtering, cropping…); layout analysis and segmentation [Ramel’07]; expert classification using a thesaurus.
Iconclass encoding of an emblem image:
25H112 rocks
44G312 prisoner; in fetters
91E461 punishment of Prometheus; he is chained to a rock, usually by Vulcan and/or Mercury
91E4611 an eagle tears at Prometheus' liver
Digital Collections of Ornaments (2/2)
Table (DLs, size, periods, web links): BVH, Fleuron, Impact (… -18th), Mouriau, Moriane.
Other, smaller datasets are ArtDico, Canadian heraldry, Printers' Devices, etc.
Collections of ornaments are small compared with mass digitization collections (e.g. Million Book Project), for two main reasons:
(1) Mass digitization projects are designed in terms of OCR only (layout analysis aims to perform text/graphics separation, final electronic documents are “ASCII code”, no use of a high-level document model). Digitization programs should give more consideration to graphics.
(2) Classification with a thesaurus by human experts is time consuming (15-20 min per image). Collaborative platforms integrating DIA components can help here.
How DIA can help? (1/2)
Redundancy of ornaments in books: a duplicated block; the same block used in 2 books (Vascosan 1555, Marnef 1576).
Printing houses: block exchange, copy.
Tracking of blocks.
Difficulties: noise, offset, skewing, scaling; precision; scalability (mass of data); weak resolution, lossy compression.
How DIA can help? (2/2)
Meta digital collections of ornaments: DB 1, DB 2, …, DB n.
Workflow: submit a query image; CBIR returns the retrieval results (R1, R2, R3); visual comparison supports comparison and visualization; assign the previous classification.
Context information: publication dates, publication places, practices of printers …
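The query workflow on this slide (submit a query, rank the collection, reuse a previous classification) can be sketched roughly as follows. This is only an illustration under stated assumptions: the feature vectors, the image ids, the pairing of Iconclass codes with images, and the helper names `retrieve` and `propose_classification` are all hypothetical, and the Euclidean distance stands in for whatever CBIR signature a real system would use.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve(query, collection, k=3):
    """Return the k entries of `collection` closest to `query`.

    `collection` maps an image id to (feature_vector, classification_codes).
    """
    ranked = sorted(collection.items(),
                    key=lambda item: euclidean(query, item[1][0]))
    return ranked[:k]

def propose_classification(query, collection):
    """Suggest the classification of the best match, for expert review."""
    best_id, (_, codes) = retrieve(query, collection, k=1)[0]
    return best_id, codes

# Hypothetical toy collection: ids, features and code assignments are made up.
collection = {
    "bvh_0001": ([0.10, 0.80, 0.30], ["25H112"]),
    "bvh_0002": ([0.90, 0.20, 0.40], ["44G312"]),
    "fleuron_0007": ([0.12, 0.79, 0.33], ["25H112", "91E461"]),
}
query = [0.11, 0.80, 0.31]

best_id, codes = propose_classification(query, collection)
print(best_id, codes)
```

The proposed codes are meant as a suggestion that an expert confirms or corrects, not as an automatic classification.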
Content Based Image Retrieval
Ideal method: high precision (weak difference); robust (noise, skew, offset); invariant to scale; fast comparison (online, mass of data); scalable.
Comparison of methods (precision / scale invariant / speed / images used for the experiments / image adequacy):
Bigun’96: - / no / ++ / 500 / medium
Chen’03: - / no / + / 50 / large
Baudrier’08: ++ / no / -- / 68 / none
Delalandre’07: + / no / - / 2048 / none
Bigun’96: orientation radiograms (radiogram 0°, radiogram 90°); Fourier descriptors; Euclidean distance comparison.
Chen’03: detection of key points (Harris); Zernike moments (local template); nearest points compared with a likelihood estimation.
Baudrier’08: expert set resolution analysis; Hausdorff distance between images; SVM classification.
Delalandre’07: Run Length Encoding (RLE); histogram centering; RLE comparison.
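The slide lists a Hausdorff distance between images as one step of Baudrier’08. As a rough illustration only, the plain (global) Hausdorff distance between the foreground pixels of two binary images can be sketched as below. This is not the Baudrier’08 pipeline, which also involves a resolution analysis and an SVM classification; the toy images and function names are assumptions made for the example.

```python
import math

def foreground(image):
    """Coordinates (row, col) of the foreground pixels (non-zero values)."""
    return [(r, c) for r, row in enumerate(image)
            for c, v in enumerate(row) if v]

def directed_hausdorff(a, b):
    """Largest distance from a point of `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(img_a, img_b):
    """Symmetric Hausdorff distance between two binary images (brute force)."""
    a, b = foreground(img_a), foreground(img_b)
    if not a or not b:
        raise ValueError("both images need at least one foreground pixel")
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Hypothetical 3x4 binary images (1 = ink, 0 = paper).
block   = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0]]
worn    = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]  # one pixel missing
shifted = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0]]  # moved 2 columns right

print(hausdorff(block, worn))     # 1.0
print(hausdorff(block, shifted))  # 2.0
```

The brute-force search compares every pair of foreground pixels, which is consistent with the low speed rating given to distance-based comparison in the table; a practical system would need a distance transform or another acceleration.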
Visual Comparison
Ideal method: highlight pertinent differences; make a hypothesis of relative dating; invariant to scale; robust (noise, skew, offset).
Methods (image registration / visualization):
Beusekom’07: detection of points of interest (connected components) / Pixel to Pixel Difference Map (PPDMap)
Baudrier’07: equivalent ellipse computation (first image moments) / Local Dissimilarity Map (LDMap)
Figures: Block A#1 and Block A#2 compared with PPDMap and LDMap.
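To give an idea of what a pixel-to-pixel difference map shows, here is a minimal sketch under two assumptions: the images are binary, and they have already been registered (aligned), which the slides identify as the hard step. The absolute difference used here is the simplest reading of "pixel to pixel difference" and is not taken from the Beusekom’07 implementation; the toy blocks are made up.

```python
def ppd_map(img_a, img_b):
    """Pixel-to-pixel difference map of two registered binary images.

    A cell is 1 where the two images disagree and 0 where they agree.
    """
    same_shape = len(img_a) == len(img_b) and all(
        len(ra) == len(rb) for ra, rb in zip(img_a, img_b))
    if not same_shape:
        raise ValueError("images must have the same size")
    return [[abs(a - b) for a, b in zip(ra, rb)]
            for ra, rb in zip(img_a, img_b)]

def render(diff):
    """Text rendering of a difference map: '#' marks a difference."""
    return "\n".join("".join("#" if v else "." for v in row) for row in diff)

# Hypothetical impressions of the same block; the second has lost one pixel.
block_a1 = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
block_a2 = [[1, 1, 1], [1, 0, 1], [1, 0, 1]]

diff = ppd_map(block_a1, block_a2)
print(render(diff))
```

A map like this reacts to every misaligned pixel, which is why registration quality matters so much for this kind of visualization.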
Conclusions and Perspectives
Large ornament material is available, but there are few digital collections.
  Digitization programs should give more consideration to graphics.
  Collaborative platforms integrating DIA components can help.
  Two database levels (with and without thesaurus classification).
DIA components
  CBIR systems (orientation signature, points of interest, image distance, compressed representation)
    The lack of evaluation of the methods makes comparison difficult.
    Benchmark datasets need to be defined (time, precision/recall).
    The methods offer trade-offs between complexity and precision; combining them is possible.
  Visual comparison (registration, PPDMap, LDMap)
    The hard point is registration; user interaction could help.
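The precision/recall measures mentioned for benchmark datasets can be computed as in the sketch below. It assumes a ground truth is available, that is, a list of the images an expert considers relevant for a query; the ids and the function name are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of one retrieval result against a ground truth.

    precision = relevant images retrieved / images retrieved
    recall    = relevant images retrieved / relevant images in the dataset
    """
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall

# Hypothetical query: 4 images returned, 3 images relevant in the dataset.
retrieved = ["img_a", "img_b", "img_c", "img_d"]
relevant = ["img_a", "img_c", "img_e"]

precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)
```

Averaging these values over a fixed set of queries, together with the response time, would give the kind of shared benchmark the conclusions call for.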