Multilingual Access to Online Content - the Europeana Experience
Vivien Petras (Humboldt-Universität zu Berlin)
With the help of many people involved in Europeana (referenced in the slides)
Eurovoc Conference, November 2010
Outline
  Europeana – a brief introduction
  Multilingual access to Europeana – approaches
  Europeana Semantic Data Layer
  Multilingual Alignments of Vocabularies
  Semantic Search Engine Prototype
Europeana
  "A digital library that is a single, direct and multilingual access point to the European cultural heritage."
  European Parliament, 27 September 2007
Europeana Today
  13 million objects
  28 data aggregators
  1500 participating institutions
  200 partners
  35 FTEs
  21 projects
  1 million visits in ,000 My Europeana signees
  2008: Prototype
  2010: Operational Service
  Stable portal
  Open Source Code
  EuropeanaLabs
  Public Domain Charter
  From: Cousins, Jill (2010). Europeana Overview. Europeana Open Cultures Conference, October Amsterdam
Europeana Contributions by Country
  Different languages!(?)
  From: Cousins, Jill (2010). Europeana Overview. Europeana Open Cultures Conference, October Amsterdam
Europeana Content Types
  Books, Articles, Postcards, Folklore objects, Photography, Art
  Goethe, Johann Wolfgang von
    Title: Goethe, Johann Wolfgang von
    Date: unknown
    Creator: Goethe, Johann Wolfgang von
    Description: Goethe, Johann Wolfgang von
    Language: de-DE
    Format: image/jpeg
    Source: SLUB/Deutsche Fotothek
    Rights: Deutsche Fotothek
    Provider: Deutsche Fotothek ; Germany
    Identifier:
    Subject: Bildnis; Bildniskatalog; Foto; Fotos; Portrait
    Type: image
Multilingual Access to Europeana
  Interface
    static pages
  Search
    query translation
    (document translation)
  Subject Browse (& Search)
    Controlled vocabularies
    Semantic Data Layer
  Languages: French, English, Spanish, German, Italian, Polish, Dutch, Portuguese, Hungarian, Swedish
Europeana Semantic Data Layer
  library, archive, museum
  Bridging isles of information by connecting objects from different domains via cross-vocabulary links.
  Doerr, M.; Gradmann, S.; Hennicke, S.; Isaac, A.; Van de Sompel, H. (2010). The Europeana Data Model (EDM).
Semantic Data Layer Alignment Example
  Irish vocabulary
  Norwegian vocabulary
  SKOS Mapping: skos:exactMatch
  From: Cousins, Jill (2010). Europeana Overview. Europeana Open Cultures Conference, October Amsterdam
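The alignment example above can be sketched in a few lines of Python. This is only an illustration: the two concept URIs and their labels ("capall" and "hest", both meaning horse) are invented placeholders, not the concepts shown on the slide. Only the SKOS namespace and the skos:exactMatch property come from the SKOS standard. Because skos:exactMatch is symmetric, the sketch writes the link in both directions.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical concept URIs and labels, invented for illustration only.
irish = {"uri": "http://example.org/irish-vocab/capall", "label": ("capall", "ga")}
norwegian = {"uri": "http://example.org/norwegian-vocab/hest", "label": ("hest", "no")}


def exact_match_triples(a, b):
    """Return N-Triples lines asserting skos:exactMatch in both directions."""
    template = "<{s}> <{p}> <{o}> ."
    predicate = SKOS + "exactMatch"
    return [
        template.format(s=a["uri"], p=predicate, o=b["uri"]),
        template.format(s=b["uri"], p=predicate, o=a["uri"]),
    ]


triples = exact_match_triples(irish, norwegian)
for line in triples:
    print(line)
```

Once such links exist, an object indexed with the Irish concept can be retrieved through the Norwegian concept and vice versa.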
Multilingual Alignment: Approach
  Identify and convert relevant semantic resources
  Pivot vocabularies for relevant categories (subject, persons, places…) = multilingual and with wide coverage
  E.g. UDC, DDC, VIAF, TGN, Geonames, Wordnets, DBpedia
  From: Isaac, Antoine; Schreiber, Guus (2010). Vrije Universiteit Amsterdam Approach to Multilingual Mapping of Vocabularies.
Multilingual Alignment: Approach
  Align more specific vocabularies to the pivots = anchoring mappings
  Finding instances of skos:exactMatch mappings
  Vocabulary characteristics important for matching:
    Lexical variance of labels (e.g. plural/singular, diacritics, multilinguality)
    Preferred / alternative labels
    Nature of hierarchy
  From: EuropeanaConnect Milestone (2010). Specification of preferred terms identification methodology.
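To illustrate why lexical variance of labels matters for matching, here is a minimal Python sketch of label normalisation. It is an assumption about how such variance could be reduced, not the method used in EuropeanaConnect: it folds case, strips diacritics via Unicode decomposition, and applies a deliberately naive English-only plural rule. The function name `normalize_label` is hypothetical.

```python
import unicodedata


def normalize_label(label):
    """Reduce a label to a comparison key (illustrative rules only)."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", label)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    folded = stripped.casefold().strip()
    # Naive, English-only singularisation: drop one trailing "s".
    if len(folded) > 3 and folded.endswith("s") and not folded.endswith("ss"):
        folded = folded[:-1]
    return folded


print(normalize_label("Portraits"))
print(normalize_label("Cafés"))
```

A real pipeline would need language-specific stemming and would compare both preferred and alternative labels; the single plural rule here only shows where such logic would sit.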
Multilingual Alignment: Approach
  Methodology:
    Conversion to SKOS/RDF
    Application of different alignment methods:
      Lexical matching
      Structure-based matching
      Instance-based matching
    Filtering / disambiguation of matching candidates:
      Analyzing children / parent matches
      Combining alignments
  From: EuropeanaConnect Milestone (2010). Specification of preferred terms identification methodology.
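Two of the steps above, lexical matching followed by disambiguation through parent matches, can be sketched as follows. This is a toy illustration under stated assumptions: the vocabularies, identifiers, and function names are hypothetical, labels are compared after simple case folding, and an ambiguous candidate is kept only when its parent is itself a candidate match for the source concept's parent.

```python
from collections import defaultdict

# Hypothetical toy vocabularies: id -> (label, parent id or None).
source = {
    "s1": ("Planets", None),
    "s2": ("Mercury", "s1"),
}
pivot = {
    "p1": ("planets", None),
    "p2": ("chemical elements", None),
    "p3": ("mercury", "p1"),
    "p4": ("mercury", "p2"),
}


def lexical_candidates(src, tgt):
    """Map each source concept to all target concepts with the same folded label."""
    index = defaultdict(list)
    for cid, (label, _) in tgt.items():
        index[label.casefold()].append(cid)
    return {sid: list(index.get(label.casefold(), []))
            for sid, (label, _) in src.items()}


def filter_by_parent(candidates, src, tgt):
    """Keep unambiguous matches; resolve ambiguous ones via matching parents."""
    result = {}
    for sid, cands in candidates.items():
        if len(cands) == 1:
            result[sid] = cands[0]
            continue
        parent_cands = set(candidates.get(src[sid][1], []))
        kept = [c for c in cands if tgt[c][1] in parent_cands]
        if len(kept) == 1:
            result[sid] = kept[0]
    return result


candidates = lexical_candidates(source, pivot)
alignment = filter_by_parent(candidates, source, pivot)
print(alignment)
```

Here "Mercury" has two lexical candidates in the pivot, and only the one under "planets" survives the parent check. Concepts that remain ambiguous are simply left unmapped in this sketch.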
VUA Vocabulary Alignment Tool Amalgame
  AMsterdam ALignment GenerAtion MEtatool
  Uses EDOAL (Expressive and Declarative Ontology Alignment Language) or SKOS
  Also provides pre- / post-mapping statistics and an evaluation tool
  From: EuropeanaConnect Milestone (2010). Semantics of descriptions aligned (intermediary).
VUA Vocabulary Alignment Tool Amalgame
  Skosified: en, fr, de, nl, hu
  Mappings (>500,000): en, fr, nl
  Mostly label matches
Europeana Semantic Search Engine
  Disambiguation of search terms
Europeana Semantic Search Engine
  Multilingual query expansion
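The slides do not describe how the prototype implements multilingual query expansion. One plausible reading, sketched below as an assumption, is that a query term is looked up among concept labels, the skos:exactMatch links are followed to aligned concepts, and the labels of those concepts in other languages are added to the query. All concept identifiers, labels, and function names in the sketch are hypothetical.

```python
# Hypothetical concepts: identifier -> {language code: label}.
labels = {
    "ex:en/horse": {"en": "horse"},
    "ex:fr/cheval": {"fr": "cheval"},
    "ex:de/pferd": {"de": "Pferd"},
}
# Hypothetical skos:exactMatch links between the concepts above.
exact_matches = [("ex:en/horse", "ex:fr/cheval"), ("ex:fr/cheval", "ex:de/pferd")]


def matched_cluster(start, pairs):
    """Collect every concept reachable from start, treating links as symmetric."""
    neighbours = {}
    for a, b in pairs:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, stack = {start}, [start]
    while stack:
        for nxt in neighbours.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def expand_query(term, labels, pairs):
    """Return the term plus the labels of all concepts aligned with it."""
    terms = {term}
    for uri, by_lang in labels.items():
        if term.casefold() in (label.casefold() for label in by_lang.values()):
            for match in matched_cluster(uri, pairs):
                terms.update(labels.get(match, {}).values())
    return sorted(terms, key=str.casefold)


print(expand_query("horse", labels, exact_matches))
```

The expanded term list could then be sent to the search index as a disjunction. A term with no matching concept is returned unchanged.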
Europeana Semantic Search Engine
  Clustering of search results
    Works created by matching person
    Works related to matching person
    Works created by a teacher of matching person
    Works related to an artefact created by matching person
    Works created by an artist professionally related to matching person
    Works titled
    Works showing concept
    Works with matching Location
    …
Next Steps
  Adding more vocabularies from the content providers:
    VIAF
    Spanish and Polish subject heading lists
  Switching metadata delivery to Europeana Data Model (EDM) format (2011)
  And: linking with the cloud…
Europeana & Linked Open Data
  Information Spaces
    DBpedia
    PND and SWD (prototype)
    Geonames
    LCSH
    …
  Doerr, M.; Gradmann, S.; Hennicke, S.; Isaac, A.; Van de Sompel, H. (2010). The Europeana Data Model (EDM).
Thank you.