Semantic & Multilingual Interoperability in Cultural Heritage Information Systems. Vivien Petras, Berlin School of Library and Information Science, 14 November 2012.

Presentation transcript:

Semantic & Multilingual Interoperability in Cultural Heritage Information Systems: The Case of Europeana
Vivien Petras, Berlin School of Library and Information Science
14 November 2012

Contents
Dimensions of Semantic & Multilingual Interoperability
Europeana: History
Europeana: Portal
Who is the Europeana User?
Multilingual Interoperability in Europeana
Semantic Interoperability in Europeana
Semantic Enrichment in Europeana
Previewing the new Europeana

Cultural Heritage Information Systems
Collect, store, preserve, organize, search and display cultural heritage objects or their (metadata) representations in a digital environment
Answer questions: who, why, where, how, when, what (Bearman & Trant, 2002)
Answers depend on
→ users and their cultural context
→ where the content is coming from
→ representation of content
Bearman & Trant (2002). Issues in Structuring Knowledge and Services for Universal Access to Online Science and Culture. Nobel Symposium (NS 120) “Virtual Museums and Public Understanding of Science and Culture”. Stockholm, Sweden.

Interoperability
→ Aggregate information resources from different information systems
→ Enable seamless information access by mapping / merging:
  → Formats
  → Vocabulary
  → Types of access
  → Result representation
  → Forms of interaction
  → (Meaning? / Context?)

Dimensions of Multilingual Interoperability
Interface
Search
  – Query translation
  – Document translation
Result presentation
Browsing

Dimensions of Semantic Interoperability
Data formats
Metadata content
Content terminology
  – Knowledge Organization Systems
  – Names
User terminology
  – Search vocabulary
  – Technical vocabulary

Europeana
“A digital library that is a single, direct and multilingual access point to the European cultural heritage.”
European Parliament, 27 September 2007

Europeana: History
2005 Google Books & the French
2005 EC: creation of European Digital Library (i2010 Strategy) → digitization
2006 Working group on technical & functional interoperability
2007 EDLnet Functional Specification
Nov 2008 Portal launch

Europeana Portal 2008

Europeana: History
2005 Google Books & the French
2005 EC: creation of European Digital Library (i2010 Strategy) → digitization
2006 Working group on technical & functional interoperability
2007 EDLnet Functional Specification
Nov 2008 Portal launch
Spring 2009 Portal Re-launch
2010 Rhine Release
Fall 2011 Re-Design
Fall 2012 Open Data
2013 Re-launch

Europeana Today
23.5 million objects
  – 14.5 million images
  – 8.4 million textual objects
  – 400,000 sound files
  – 200,000 video files
More than 2,200 institutions
33 countries

Who is the Europeana User?
→ European citizens
EU = 27 member states, ca. 1/2 billion people
23 official EU languages
60 regional / minority languages
Age                              Education
20% children/young (0-19 yrs)    30% basic school education
60% adults                       45% high school diploma
20% retired                      25% college degree

Who is a Cultural Heritage User?
cultural heritage professionals organizing or providing content
cultural heritage professionals producing or selling content
cultural heritage professionals creating content
educational users studying or teaching objects or cultural heritage in general
tourist users interested in visiting or providing guidance to cultural heritage objects or sights
general users interested in culture (the “informed citizen”)

Challenges for User-centered Design
→ Who are we designing for?
→ Representing different cultural, political and societal perspectives (both on the producer and user sides) in a multilingually balanced way
One default language is not an option.
Most Europeana objects are language-independent (images); metadata is sparse and needs to be translated.
→ valid for spoken & technical languages!

Multilingual Interoperability in Europeana
Interface translated into 31 languages
Query translation: prototype (EuropeanaConnect)
Query result filtering by language
Document translation (user enabled)
Semantic data layer
  – Multilingual alignment of controlled vocabularies
  – Multilingual enrichment of metadata

Multilingual Interoperability in Europeana?
Interface translated into 31 languages
→ static content only
→ Does not affect search
→ User awareness?

Multilingual Interoperability in Europeana?
Query translation: prototype (EuropeanaConnect)
→ How many languages?
→ How to deal with ambiguities?
→ How much user influence?
→ Which software?

Multilingual Interoperability in Europeana?
Query result filtering by language
→ Dependent on metadata record information
→ Language of record or language of content?
→ What is “multilingual”?

Multilingual Interoperability in Europeana?
Document translation (user enabled)
→ Only after record has been found

Multilingual Solutions in Europeana?
Semantic data layer
  – Multilingual alignment of controlled vocabularies
  – Multilingual enrichment of metadata
→ Europeana Semantic Data Layer
→ Excursus: Europeana Data Model
→ Semantic Alignment

Semantic Interoperability in Europeana
[Diagram: library, archive, museum]
Europeana Semantic Data Layer: Bridging “isles of information” by connecting objects from different domains via cross-vocabulary links.
Doerr, M.; Gradmann, S.; Hennicke, S.; Isaac, A.; Van de Sompel, H. (2010). The Europeana Data Model (EDM). 76th IFLA General Conference and Assembly, August 2010, Gothenburg, Sweden.

Europeana Semantic Data Layer
= linked metadata records + linked contextual resources (KOS)
→ Allows seamless structured search across different collections
→ Allows faceted browsing across different collections
→ Allows search across different vocabularies
→ Who does the linking of metadata records?
→ Who does the mapping of contextual resources?

Semantic Alignment of Contextual Resources
[Diagram: Irish vocabulary and Norwegian vocabulary connected by a SKOS mapping (skos:exactMatch)]
Identify and convert relevant semantic resources
Pivot vocabularies for relevant categories (subject, persons, places…)
Cousins, Jill (2010). Europeana Overview. Europeana Open Cultures Conference, October Amsterdam
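The pivot idea on this slide can be illustrated with a small sketch. It assumes that each local vocabulary is mapped to a shared pivot vocabulary through exactMatch-style links, and that a search on one local concept should also reach the concepts other vocabularies map to the same pivot. All concept identifiers below are hypothetical and do not come from Europeana data.

```python
from collections import defaultdict

# Hypothetical (local concept, pivot concept) pairs standing in for
# skos:exactMatch links; none of these identifiers are real.
EXACT_MATCH = [
    ("irish:castle", "pivot:castle"),
    ("norwegian:slott", "pivot:castle"),
    ("irish:harp", "pivot:harp"),
]


def build_equivalence_index(links):
    """Index the links in both directions: pivot -> locals, local -> pivot."""
    by_pivot = defaultdict(set)
    to_pivot = {}
    for local, pivot in links:
        by_pivot[pivot].add(local)
        to_pivot[local] = pivot
    return by_pivot, to_pivot


def expand_concept(concept, by_pivot, to_pivot):
    """Return all local concepts sharing a pivot with `concept` (itself included)."""
    pivot = to_pivot.get(concept)
    if pivot is None:
        # Unmapped concepts cannot be expanded; return them unchanged.
        return {concept}
    return set(by_pivot[pivot])


by_pivot, to_pivot = build_equivalence_index(EXACT_MATCH)
expanded = expand_concept("irish:castle", by_pivot, to_pivot)
print(sorted(expanded))
```

With a pivot, each vocabulary needs only one mapping instead of one mapping per pair of vocabularies, which is presumably why the slide mentions pivot vocabularies per category.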

Semantic Alignments of Vocabularies
Datacloud as developed in EuropeanaConnect, 2011
How many metadata records are covered?
How successful is the matching?
Is this useful for search?

Semantic Interoperability in Europeana
Semantic Similarity: “similar content” function
Based on textual similarity of metadata (title, subject, description)
Semantic Enrichment: Add mapped (multilingual) concepts from selected vocabularies to metadata records
Concept, agent, period, place
→ Increase search vocabulary
→ Increase semantic interoperability
→ Increase multilingual access
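The slide does not say how textual similarity is computed. As a minimal sketch, one common choice is cosine similarity over word counts from the three fields named above. The tokenizer, the equal weighting of fields, and the sample records are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Fields named on the slide; equal weighting is an assumption.
FIELDS = ("title", "subject", "description")


def tokens(record):
    """Build a bag of lower-cased words from the selected metadata fields."""
    text = " ".join(record.get(field, "") for field in FIELDS).lower()
    return Counter(re.findall(r"\w+", text))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def most_similar(target, candidates):
    """Return the candidate record whose metadata is closest to the target's."""
    target_vec = tokens(target)
    return max(candidates, key=lambda rec: cosine(target_vec, tokens(rec)))


# Invented sample records, for illustration only.
target = {"title": "Portrait of a lady", "subject": "painting portrait"}
c1 = {"title": "Portrait of a gentleman", "subject": "painting"}
c2 = {"title": "Map of Prague", "subject": "cartography"}
print(most_similar(target, [c1, c2])["title"])
```

A purely lexical measure like this one only matches records that share words, so it would not relate records described in different languages. That is the gap semantic enrichment is meant to narrow.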

Semantic Similarity in Europeana

Semantic (& Multilingual) Enrichment in Europeana
Vocabulary             Tag type   Enriched metadata fields
GEMET Thesaurus        Concept    dc:subject, dc:type, dcterms:alternative
DBpedia                Agent      dc:contributor, dc:creator
Semium Time Ontology   Period     dc:date, dc:coverage, dcterms:temporal
GeoNames               Place      dc:coverage, dcterms:spatial
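The vocabulary-to-field assignments listed above can be written as a small configuration table. The sketch below only enumerates which field values would be sent to which vocabulary. It performs no actual lookup, and the record structure (field name mapped to a list of values) and the function name are assumptions.

```python
# Tag type -> vocabulary and enriched fields, as listed on the slide.
ENRICHMENT_RULES = {
    "concept": {
        "vocabulary": "GEMET Thesaurus",
        "fields": ["dc:subject", "dc:type", "dcterms:alternative"],
    },
    "agent": {
        "vocabulary": "DBpedia",
        "fields": ["dc:contributor", "dc:creator"],
    },
    "period": {
        "vocabulary": "Semium Time Ontology",
        "fields": ["dc:date", "dc:coverage", "dcterms:temporal"],
    },
    "place": {
        "vocabulary": "GeoNames",
        "fields": ["dc:coverage", "dcterms:spatial"],
    },
}


def candidate_enrichments(record):
    """Yield (tag type, vocabulary, field, value) for each enrichable value."""
    for tag_type, rule in ENRICHMENT_RULES.items():
        for field in rule["fields"]:
            for value in record.get(field, []):
                yield tag_type, rule["vocabulary"], field, value


# Invented record; dc:title is not an enrichment field and is ignored.
record = {
    "dc:creator": ["Michelangelo"],
    "dc:coverage": ["Florence"],
    "dc:title": ["David"],
}
results = list(candidate_enrichments(record))
for row in results:
    print(row)
```

In this layout a `dc:coverage` value is offered to both the period and the place vocabulary, so the same string can be matched against two unrelated vocabularies.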

Semantic (& Multilingual) Enrichment in Europeana

Poisonous India…and other Enrichment Problems
Study of enrichments of 200 Europeana metadata records
Common errors and causes
Query: “poison”
Result: Indian movie posters (in Swiss collections)
Reason: India → (French) Inde; Inde = (Latvian) poison
Olensky, M., Stiller, J., Dröge, E. (2012). Poisonous India or the Importance of a Semantic and Multilingual Enrichment Strategy. In: Proc. of MTSR 2012: Metadata and Semantics Research Conference, Nov. 2012, Cádiz, Spain.
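One plausible way for such an error to arise is label matching that ignores language. The toy sketch below contrasts language-blind and language-aware matching for the “Inde” case. The label table, the concept identifiers, and the function name are hypothetical, and the actual Europeana matching rules are not described in the slides.

```python
# Toy multilingual label table: (concept, language code, label).
# Concept identifiers are hypothetical.
LABELS = [
    ("concept:india", "fr", "Inde"),
    ("concept:india", "en", "India"),
    ("concept:poison", "lv", "inde"),
    ("concept:poison", "en", "poison"),
]


def match_concepts(term, language=None):
    """Return concepts with a label equal to `term`, ignoring case.

    With language=None, labels in every language are considered, which is
    the language-blind behaviour that lets false friends through.
    """
    wanted = term.casefold()
    return {
        concept
        for concept, lang, label in LABELS
        if label.casefold() == wanted and (language is None or lang == language)
    }


blind = match_concepts("Inde")
aware = match_concepts("Inde", language="fr")
print(sorted(blind))
print(sorted(aware))
```

Language-aware matching presupposes that the record states its metadata language reliably, which the slides suggest is not always the case.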

Enrichments – Problem Diagnosis
Incorrect metadata (incorrect fields)
Inconsistent name structures
  Bongiorno, Michelangelo, Fr → Michelangelo (Buonarroti)
Inconsistent date structure
Inconsistent field structure / refinements
Choice of enrichment fields
  dc:type (?)
Named entity treatment
Common terms
  history and its enrichments

Enrichments – Problem Diagnosis
Syntax correct, semantics wrong (context needed)
  Córdoba = Spain | Argentina
  Daniel Richter = French trade unionist | German artist
Non-domain-specific enrichment vocabulary
  GEMET: print → (German) Druck → pressure
Cross-lingual ambiguity (false friends)
  electrical power → (German) Strom → (Czech) strom → (English) tree
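The “context needed” point can be sketched for the Córdoba example. The code assumes that some contextual signal, such as the country of the providing institution, is available to narrow the candidates, and that the enricher abstains when the label stays ambiguous. The candidate table and the use of provider country as context are illustrative assumptions, not a documented Europeana rule.

```python
# Hypothetical candidate places for an ambiguous label.
CANDIDATES = {
    "Córdoba": [
        {"id": "place:cordoba-es", "country": "ES"},
        {"id": "place:cordoba-ar", "country": "AR"},
    ],
}


def disambiguate_place(label, context_country=None):
    """Return a place id only if exactly one candidate fits the context.

    Returns None when the label is unknown or still ambiguous, so that no
    enrichment is added in doubtful cases.
    """
    options = CANDIDATES.get(label, [])
    if context_country is not None:
        options = [o for o in options if o["country"] == context_country]
    return options[0]["id"] if len(options) == 1 else None


print(disambiguate_place("Córdoba"))
print(disambiguate_place("Córdoba", context_country="AR"))
```

Abstaining trades coverage for precision: fewer records are enriched, but a wrong link such as the Spanish city on an Argentinian record is avoided.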

Enrichment – Problem Areas
Records:
  – metadata quality / structure
  – data cleaning / normalization before enrichment
Vocabulary
  – domain-specificity, appropriateness
  – language ambiguity
  – scope of enrichment
Work flow
  – Named entities
  – Matching rules
→ Unsolved problem at this scale!

Previewing the new Europeana
Re-launch with EDM-based metadata structure
→ Improved interface
→ Improved mobile access
→ More structure in search
→ Dynamic query suggestions
→ More structure in result representation
→ (no automatic enrichments?)

Previewing the new Europeana

Result List

Single View

EDM in Action

Summary
Major efforts have gone into improving information access in Europeana.
Lots of challenges still remain.
The dynamic growth of Europeana requires dynamic solutions.
→ What are the consequences of opening the data?
→ What are the consequences of moving to EDM?
→ Can collaborative features (collective intelligence) be the answer?
The good news: plenty of work for us!

Questions, comments, suggestions?