Filigranes pour tous Watermarks For All


1 Filigranes pour tous Watermarks For All
A new project based on deep-learning technology and crowd-sourcing
Marc H. Smith
École nationale des chartes / Centre Jean-Mabillon
Paris Sciences & Lettres
Watermarks in digital collections, 4th International Conference, Vienna, October 2017

5 « Science des données, données de la science »
IRIS – Initiative de recherche interdisciplinaire et stratégique
École nationale des chartes: Christine Bénévent, Olivier Poncet, Marc Smith
École des Ponts ParisTech: Mathieu Aubry
INRIA – Institut national de recherche en informatique et en automatique: Joseph Sivic
IRHT – Institut de recherche et d’histoire des textes: François Bougard, Bruno Bon

6 Repertories of watermarks: evolution and limitations
From drawings to photographs
From paper to digital
From single/national corpora to portals and interoperability
Limitations:
– Identifying watermarks: image > word > image
– Number of reference images: more often “similar” than identical
– Closed data, from producer to user

7 Filigranes pour tous
Identification: image to image
Deep-learning technology for image comparison
Initial corpus: French watermarks > international collaboration?
User interaction: image matching and database augmentation
> Multiple images of (identical or variant) watermarks

8 Test corpus
Set of homogeneous watermarks from French archives
Notarial records from the Archives nationales (1650)
4 different watermarks × 61 photographs, using 3 lightsheets and 3 smartphones
Minimal guidelines for framing
Pages with and without writing

9 Test sample: four watermarks

10 Random sample of multiple occurrences of a watermark

11 Image capture and pre-processing

12 Image capture and pre-processing
[Figure labels: 1/6, marked four times]

13 Image capture and pre-processing
[Figure labels: 1/6, marked four times; 300 × 300 pixels]
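The figure for this step is not preserved in the transcript; only the labels "1/6" (four times) and "300 x 300 pixels" survive. One plausible reading, which is an assumption and not something the slides state, is that a margin of 1/6 is trimmed from each side of the photograph and the remaining area is scaled to 300 × 300 pixels. A minimal pure-Python sketch of that reading, with hypothetical function names and a synthetic image standing in for a real photograph:

```python
def crop_margins(image, divisor=6):
    """Trim 1/divisor of the rows and columns from each side of a 2-D list.

    Assumption: this is one reading of the "1/6" labels on the slide.
    """
    height, width = len(image), len(image[0])
    dy, dx = height // divisor, width // divisor
    return [row[dx:width - dx] for row in image[dy:height - dy]]


def resize_nearest(image, size=300):
    """Scale a 2-D list to size x size by nearest-neighbour sampling."""
    height, width = len(image), len(image[0])
    return [
        [image[r * height // size][c * width // size] for c in range(size)]
        for r in range(size)
    ]


def preprocess(image, divisor=6, size=300):
    """Crop the margins, then scale to a square of size x size pixels."""
    return resize_nearest(crop_margins(image, divisor), size)


# Synthetic 1200 x 900 grayscale "photograph" (hypothetical data).
photo = [[(r + c) % 256 for c in range(900)] for r in range(1200)]
patch = preprocess(photo)
print(len(patch), len(patch[0]))  # 300 300
```

A real pipeline would more likely use an imaging library and a smoother interpolation; nearest-neighbour sampling is used here only to keep the sketch dependency-free.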

14 Deep learning
Convolutional neural network:
Iteration of simple operations with multiple parameters
Parameters are optimized on training data, producing a different result for each watermark
Image > Layer 1 > Layer 2 > classifier
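To make "iteration of simple operations" concrete, here is a minimal sketch of two stacked convolution layers in pure Python. The kernel values are hand-picked, hypothetical examples; in a trained network they would be the parameters optimized on the training data, and the project's actual architecture is not described in the slides.

```python
def conv2d(image, kernel):
    """'Valid' 2-D cross-correlation, the operation CNN layers call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Set negative responses to zero (a common non-linearity, assumed here)."""
    return [[max(0, v) for v in row] for row in feature_map]


# Toy 4 x 4 image: dark left half, bright right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

edge_kernel = [[-1, 1], [-1, 1]]   # responds to dark-to-bright vertical edges
pool_kernel = [[1, 1], [1, 1]]     # sums neighbouring responses

layer1 = relu(conv2d(image, edge_kernel))   # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
layer2 = relu(conv2d(layer1, pool_kernel))  # [[4, 4], [4, 4]]
print(layer1, layer2)
```

The same small operation is applied repeatedly, each layer working on the output of the previous one; a classifier would then take the final feature map as input.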

15 Elementary operation of a single ‘neuron’
x = input, w = parameters
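The formula on this slide was an image and is not preserved; only the legend (x = input, w = parameters) survives. A common form of such an elementary operation is a weighted sum of the inputs followed by a non-linearity. The sketch below assumes that form, with an optional bias term and a ReLU activation, neither of which is confirmed by the slide.

```python
def neuron(x, w, bias=0.0):
    """Weighted sum of inputs x with parameters w, then ReLU.

    Assumption: bias and ReLU are illustrative choices, not taken from the slide.
    """
    z = sum(xi * wi for xi, wi in zip(x, w)) + bias
    return max(0.0, z)


x = [1.0, 2.0, 3.0]          # input values (e.g. pixel intensities)
w = [0.5, -1.0, 1.0]         # parameters, hypothetical values
print(neuron(x, w))          # 0.5 - 2.0 + 3.0 = 1.5
print(neuron(x, [-1.0, -1.0, -1.0]))  # negative sum is clipped to 0.0
```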

16 Image matching: first results
Training set: 200 images (50 / watermark): 100% correct matching
Control set: 44 images (11 / watermark): 95% correct matching (42/44)
Caution: “black box” syndrome: is the matching actually based on watermarks?
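The figures on this slide can be cross-checked against the test corpus (4 watermarks × 61 photographs). The short sketch below only redoes that arithmetic; the variable names are illustrative.

```python
watermarks = 4
photos_per_watermark = 61
train_per_watermark = 50
control_per_watermark = 11

total = watermarks * photos_per_watermark        # 244 photographs
train = watermarks * train_per_watermark         # 200 training images
control = watermarks * control_per_watermark     # 44 control images

correct_control = 42
accuracy = correct_control / control

print(total == train + control)   # True: the two sets use the whole corpus
print(f"{accuracy:.1%}")          # 95.5%, reported as 95% on the slide
```

With only 44 control images, each misclassified image shifts the score by about 2.3 percentage points, so the 95% figure rests on two errors.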

17 Further development: the app
Tools for image capture: ruler & framing mask > scale
Real-time uploading and image comparison
User-uploaded images and metadata added to the database

18 Open questions
Expanding the data set: how will the software adapt?
Minimum training data set? (a single image?)
Fragmentary/partially visible watermarks (sub-folio quires)
Capture: close-ups vs full pages, at 300 × 300 pixels!
Comparing photographs and drawings?
Stimulation of crowdsourcing

19 Research questions
Quantitative measurements
Watermark variants and evolution: copies, deterioration, etc.
Paper history: from production to circulation and consumption
Functional distribution of formats and quality: books vs documents vs art…

