Published by Roberta Jones. Modified over 8 years ago.
1
Antoine Isaac Europeana – VU University Amsterdam Dagstuhl Multilingual Semantic Web seminar
2
Europeana
24 million objects (images, text, sound and video)
From over 2,200 libraries, museums, archives
From 33 countries
For everyone
“A digital library that is a single, direct and multilingual access point to the European cultural heritage.” (European Parliament)
3
Multilingual Access in Europeana
4
Dimensions of multilingual access
Interface
Search (query translation or document translation)
Result presentation
Browsing
5
Europeana's efforts
Interface translated into 26 languages
Query translation: prototype only
Query result filtering by country/language
Document translation (user enabled)
Semantic contextualization of objects
Multilingual enrichment/annotation of metadata
6
Making metadata work for multilingual access
7
Current metadata in Europeana
Simple object records
Flat (text values)
Without language tags!
The only language-related information in the metadata is at collection level, and it can be "mul" (multiple languages)
Need to change: a new Europeana Data Model (EDM)
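The difference between flat values and language-tagged values can be sketched as follows. This is only an illustration under stated assumptions, not EDM's actual serialization: the field name `title`, the record contents, and the helper `pick_label` are all hypothetical.

```python
from typing import Optional

# Flat record: the language of each value is unknown (hypothetical data).
flat_record = {"title": ["Tafelklavier", "Square pianoforte"]}

# Language-tagged record: each value carries its own language code
# (hypothetical structure, not the real EDM format).
tagged_record = {
    "title": [
        {"value": "Tafelklavier", "lang": "de"},
        {"value": "Square pianoforte", "lang": "en"},
    ]
}


def pick_label(record: dict, field: str, lang: str) -> Optional[str]:
    """Return the first value of `field` tagged with `lang`, or None."""
    for entry in record.get(field, []):
        if entry.get("lang") == lang:
            return entry["value"]
    return None
```

With the flat record there is nothing to select on, so a portal can only show every value to every user; with tagged values, `pick_label(tagged_record, "title", "en")` can choose the label that fits the interface language.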
8
"Semantic layer" of contextual resources (concepts, persons, places, events...)
Networked objects
Cultural artefact: painting, sculpture, building
Exploiting semantic relations, e.g. “broader concept”, “place of birth”, “involved person”…
9
Multilingual metadata
10
Fetching already available linked data http://www.w3.org/2005/Incubator/lld/XGR-lld-vocabdataset/ E.g., from libraries
11
Interoperability Encouraging the use of RDF + common and simple elements
12
Interoperability
Encouraging the use of common and simple data elements
Piano carré
Pianoforte a tavolino
Square pianoforte
Tafelklavier
Tafelpiano
Taffel Pianofortes
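One way to read the list of labels above is as a single concept carrying one label per language, so that a search in any of those languages reaches the same resource. The sketch below assumes this reading. The labels come from the slide; the language codes, the concept identifier, and the function `find_concept` are assumptions, and "Taffel Pianofortes" is tagged `und` (undetermined) rather than guessed.

```python
# Hypothetical identifier for the concept behind all the labels.
CONCEPT_ID = "concept:square-piano"

# (language code, label) pairs. Labels are from the slide; the language
# codes are assumptions, and "und" marks a language left undetermined.
LABELS = [
    ("fr", "Piano carré"),
    ("it", "Pianoforte a tavolino"),
    ("en", "Square pianoforte"),
    ("de", "Tafelklavier"),
    ("nl", "Tafelpiano"),
    ("und", "Taffel Pianofortes"),
]

# Index from case-folded label to concept identifier.
INDEX = {label.casefold(): CONCEPT_ID for _, label in LABELS}


def find_concept(term: str):
    """Return the concept identifier for a label in any language, or None."""
    return INDEX.get(term.strip().casefold())
```

Here `find_concept("tafelklavier")` and `find_concept("Piano carré")` resolve to the same identifier, which is what lets a query in one language retrieve objects described in another.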
13
Interoperability
Mixed nature of eligible contextual resources: dictionaries, synonym/translation lists, thesauri, authority lists, gazetteers…
Interplay: “semantic” data next to multilingual data
14
Simultaneous approaches
Getting richer semantic/multilingual metadata from providers
Fetching third-party contextual data and linking it to “uncontextualized” objects
Linking contextual data from one institution to another, more general or more commonly used contextual dataset (DBpedia.org, VIAF.org …)
15
Status and challenges
16
Current status
All this is work in progress and will take time
R&D prototypes (EuropeanaConnect) showing the challenges of gathering appropriate multilingual tools and data
First tests of simple techniques in the production portal: GeoNames (places) and GEMET (concepts)
Encouraging, but they illustrate issues with overly naïve approaches (no NLP) and incomplete data
Illustrated with: Cheval, Poison
http://www.europeana.eu
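The slide names "Cheval" as an illustration without spelling out the failure. One plausible way a naive approach can go wrong is sketched below, purely as an assumption: every metadata value is matched against a gazetteer by string equality, so a subject term that happens to coincide with a place name gets linked to that place. The gazetteer entry, the record, the field names, and both functions are hypothetical and do not reflect actual GeoNames data or the Europeana pipeline.

```python
# Hypothetical gazetteer entry; not actual GeoNames data.
GAZETTEER = {"cheval": "place:0001"}

# Hypothetical record: "Cheval" (French for "horse") is a subject, not a place.
record = {"subject": ["Cheval"], "spatial": ["Mali"]}


def naive_links(rec):
    """Match every value of every field against the gazetteer."""
    links = []
    for field, values in rec.items():
        for value in values:
            place = GAZETTEER.get(value.casefold())
            if place is not None:
                links.append((field, value, place))
    return links


def field_aware_links(rec, place_fields=("spatial",)):
    """Match only the values of fields expected to hold place names."""
    restricted = {f: v for f, v in rec.items() if f in place_fields}
    return naive_links(restricted)
```

In this toy setup `naive_links(record)` links the subject "Cheval" to a place, while `field_aware_links(record)` returns no links at all. The second result also shows the incomplete-data side of the problem: "Mali" is a genuine place value, but the toy gazetteer has no entry for it.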
17
Problems & requirements
For providers & Europeana
  Continue work on metadata
  Benchmarking (cf. CHiC lab @ CLEF)
  Positioning as consumers and contributors of data (cf. Asun’s slides): data.europeana.eu
For language-intensive tools and resources
  Availability: open resources
  Interoperability
  Simplicity, but not always! E.g., not only “first hit” translations
  Scale: scalability of tools, number and scope of datasets
  Many languages, some lesser-resourced (compared with English)
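The remark about “first hit” translations can be illustrated with a small query-translation sketch. It assumes a translation resource that returns several candidates per term. The dictionary contents, the function names, and the `OR` query syntax are all hypothetical and are not taken from any Europeana tool.

```python
# Hypothetical translation resource: (term, target language) -> candidates.
TRANSLATIONS = {
    ("bank", "fr"): ["banque", "rive"],
}


def first_hit(term: str, lang: str) -> list:
    """Keep only the first candidate translation (the overly simple option)."""
    return TRANSLATIONS.get((term.casefold(), lang), [])[:1]


def all_hits(term: str, lang: str) -> list:
    """Keep every candidate translation."""
    return list(TRANSLATIONS.get((term.casefold(), lang), []))


def expand_query(term: str, lang: str) -> str:
    """Build a query from the original term plus all of its translations."""
    return " OR ".join([term] + all_hits(term, lang))
```

With only the first hit, a search for "bank" never reaches objects described with "rive" (river bank). Keeping all candidates trades some precision for recall, which is the kind of tool behaviour the slide asks for.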
18
Another illustration: VOICES project
Something entirely different but not completely unrelated
VOICES: Voice-based community-centric mobile services for social development
Easing communication on agricultural trade
Listing of products/prices via phone/radio
Pilot in Mali
Challenges
  Data-centric project, but language technology plays a crucial role
  Objects should be provided with textual and audio labels (text-to-speech system) in different languages
  Local languages: e.g., Bambara
  Lack of resources: need low-cost, easy-to-adapt solutions
Victor de Boer, VU Amsterdam (v.de.boer@cs.vu.nl)
19
Thank you
aisaac@few.vu.nl
http://www.few.vu.nl/~aisaac/
Some slides based on Marlies Olensky and Juliane Stiller, Multilingual Web Workshop, June 11, 2012, Dublin