1
Filigranes pour tous / Watermarks for All
A new project based on deep-learning technology and crowdsourcing
Marc H. Smith
École nationale des chartes / Centre Jean-Mabillon
Paris Sciences & Lettres
Watermarks in digital collections, 4th International Conference
Vienna, October 2017
5
« Science des données, données de la science »
IRIS – Initiative de recherche interdisciplinaire et stratégique
École nationale des chartes: Christine Bénévent, Olivier Poncet, Marc Smith
École des Ponts ParisTech: Mathieu Aubry
INRIA – Institut national de recherche en informatique et en automatique: Joseph Sivic
IRHT – Institut de recherche et d’histoire des textes: François Bougard, Bruno Bon
6
Repertories of watermarks: evolution and limitations
From drawings to photographs
From paper to digital
From single/national corpora to portals and interoperability
Limitations:
– Identifying watermarks: image > word > image
– Number of reference images: more often “similar” than identical
– Closed data, from producer to user
7
Filigranes pour tous
Identification: image to image
Deep-learning technology for image comparison
Initial corpus: French watermarks > international collaboration?
User interaction: image matching and database augmentation > multiple images of (identical or variant) watermarks
8
Test corpus
Set of homogeneous watermarks from French archives
Notarial records from the Archives nationales (1650)
4 different watermarks × 61 photographs, using 3 light sheets and 3 smartphones
Minimal guidelines for framing
Pages with and without writing
9
Test sample: four watermarks
10
Random sample of multiple occurrences of a watermark
11
Image capture and pre-processing
12
Image capture and pre-processing
[Figure: framing diagram labelled 1/6 on each of its four sides]
13
Image capture and pre-processing
[Figure: framing diagram labelled 1/6 on each of its four sides, with the annotation 300 × 300 pixels]
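The slide gives only the labels 1/6 and 300 × 300 pixels, so the exact pre-processing is not documented here. As a rough sketch, assuming the 1/6 labels denote a margin trimmed from each side of the photograph before it is resized to 300 × 300 pixels, the crop box and scale factors could be computed as follows. The function names `crop_box` and `resize_scale` and the example photograph size are hypothetical.

```python
def crop_box(width, height, margin_divisor=6):
    """Return (left, top, right, bottom) after trimming 1/margin_divisor
    of the width and height from each side (an assumed reading of the
    1/6 labels on the slide)."""
    left = width // margin_divisor
    top = height // margin_divisor
    return left, top, width - left, height - top


def resize_scale(box, target=300):
    """Return the (horizontal, vertical) scale factors that map the
    cropped region onto a target x target pixel square."""
    left, top, right, bottom = box
    return target / (right - left), target / (bottom - top)


# Hypothetical 3000 x 2400 pixel smartphone photograph.
box = crop_box(3000, 2400)
scale_x, scale_y = resize_scale(box)
print(box)               # (500, 400, 2500, 2000)
print(scale_x, scale_y)  # 0.15 0.1875
```

Under this assumption a non-square crop is stretched unevenly when forced into a square, so a real pipeline would probably crop to a square first or pad the shorter side.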
14
Deep learning
Convolutional neural network:
– Iteration of simple operations with multiple parameters
– Parameters are optimized on training data, producing a different result for each watermark
Image > Layer 1 > Layer 2 > … > classifier
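To make the chain "image, layer 1, layer 2, classifier" concrete, here is a minimal sketch in plain Python. It is not the project's network: the layer sizes, the kernels and the classifier weights are arbitrary illustrative values (in a real network they would be the parameters optimized on training data), and the function names are hypothetical.

```python
def conv2d(image, kernel):
    """Slide a small kernel over the image ('valid' mode, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Keep positive responses, set negative ones to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def classify(feature_map, class_weights):
    """Linear classifier: one score per class, return the best class index."""
    features = [v for row in feature_map for v in row]
    scores = [sum(w * f for w, f in zip(ws, features)) for ws in class_weights]
    return scores.index(max(scores))


# Toy 4 x 4 'image' with a vertical edge in the middle.
image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]

# Arbitrary parameters standing in for trained ones.
kernel_1 = [[-1.0, 1.0], [-1.0, 1.0]]          # layer 1: vertical-edge detector
kernel_2 = [[1.0, 0.0], [0.0, 1.0]]            # layer 2: sums along a diagonal
class_weights = [                               # classifier: 4 watermark classes
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.5, 0.5, 0.5],
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.25],
]

layer_1 = relu(conv2d(image, kernel_1))
layer_2 = relu(conv2d(layer_1, kernel_2))
predicted_class = classify(layer_2, class_weights)
print(predicted_class)  # 1
```

Each layer repeats the same simple operation (multiply, add, threshold); only the parameter values differ, and training consists of adjusting those values.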
15
Elementary operation of a single ‘neuron’
x = input, w = parameters
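The formula on this slide did not survive in the transcript. A common form of the elementary operation, given here as an assumption rather than as the slide's own formula, is a weighted sum of the inputs x by the parameters w, plus a bias, passed through a simple non-linearity (ReLU in this sketch). The function name `neuron` and the bias term are illustrative.

```python
def neuron(x, w, bias=0.0):
    """Weighted sum of inputs x with parameters w, plus a bias,
    followed by a ReLU non-linearity (assumed form, see lead-in)."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    z = sum(wi * xi for wi, xi in zip(w, x)) + bias
    return max(0.0, z)


x = [1.0, 2.0, 3.0]
print(neuron(x, [0.5, 0.25, 0.5]))   # 2.5
print(neuron(x, [0.5, 0.25, -0.5]))  # 0.0 (negative sum is clipped)
```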
16
Image matching: first results
Training set: 200 images (50 / watermark), 100% correct matching
Control set: 44 images (11 / watermark), 95% correct matching (42/44)
Caution: “black box” syndrome. Is the matching actually based on watermarks?
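The figures are consistent with the test corpus: 4 watermarks × 61 photographs give 244 images, split into 200 for training and 44 for control. The sketch below shows how the control-set score can be computed. The slide does not say which two images were mismatched, so the two errors placed in `predicted` are hypothetical, as is the function name `accuracy`.

```python
def accuracy(predicted, expected):
    """Fraction of predictions that equal the expected label."""
    if len(predicted) != len(expected):
        raise ValueError("label lists must have the same length")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Control set: 11 images for each of the 4 watermarks (labels 0 to 3).
expected = [label for label in range(4) for _ in range(11)]

# Hypothetical predictions: everything right except two images.
predicted = list(expected)
predicted[0] = 1    # illustrative error, position not from the source
predicted[20] = 3   # illustrative error, position not from the source

score = accuracy(predicted, expected)
print(len(expected), round(score * 100))  # 44 95
```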
17
Further development: the app
Tools for image capture: ruler & framing mask > scale
Real-time uploading and image comparison
User-uploaded images and metadata added to the database
18
Open questions
Expanding the data set: how will the software adapt?
Minimum training data set? (a single image?)
Fragmentary/partially visible watermarks (sub-folio quires)
Capture: close-ups vs full pages, at 300 × 300 pixels!
Comparing photographs and drawings?
Stimulation of crowdsourcing
19
Research questions
Quantitative measurements
Watermark variants and evolution: copies, deterioration, etc.
Paper history: from production to circulation and consumption
Functional distribution of formats and quality: books vs documents vs art…