Controlled Vocabularies in TELPlus Antoine ISAAC Vrije Universiteit Amsterdam EDLProject Workshop 22-23 November 2007.



Agenda TELPlus Context Improving subject access –3 sub-tasks Services for TEL

TELPlus Context Started October 2007 Running 27 months Content WPs –OCRing previously digitised material –Improving the usability of TEL through OAI-PMH compliance –Improving Access –Integrating services with TEL portal –User personalisation services –Extending TEL to Bulgaria & Romania

WP3 – Improving Access Task 1: Indexing for usability –Review/test state-of-the-art semantic search engines On content of documents Task 2: Improving subject access Task 3: FRBR aggregation, search and browsing –Create/exploit FRBR metadata repositories Task 4: Focus on users –Focus groups on prototypes

WP 3 Task 2 – Improving Subject Access Improving subject access via semantic alignment between subjects Search through collections –Using metadata –In a controlled setting Paving the way for enhanced usages –Advanced treatments mentioned in TELplus need conceptual structures and links between these structures E.g. clustering

WP 3 Task 2 – Improving Subject Access Improving subject access via semantic alignment between subjects Reference: MACS project –Manually-built semantic equivalences between Rameau, SWD & LCSH headings

MACS: Querying Collections

MACS: Query Reformulation Options

WP 3 Task 2 – Improving Subject Access Improving subject access via semantic alignment between subjects Reference: MACS project –Manual equivalences between Rameau, SWD, LCSH headings Here: an experiment on deploying automatic alignment techniques –Determining possible strategies –Assessing feasibility and usefulness –MACS context

WP3.2 Sub-tasks Converting the subjects to standard representation language –Semantic web format (SKOS) Aligning the vocabularies –Semantic correspondences between subjects Deploying the alignment knowledge obtained into TEL framework –E.g. using links to reformulate queries from one subject list to the other

Converting subjects to standard representation language Goal: solving syntactic heterogeneity between vocabularies Enabling the use of standard tools –E.g. for query (re)formulation Paving the way for dealing with semantic heterogeneity –Definitions of concepts expressed according to a common model

Converting subjects to standard representation language Approach: Semantic Web and SKOS Semantic Web –Knowledge objects as web resources (URIs) –Description by linking resources (RDF) –Description using shared formal vocabularies (ontologies) SKOS –A standard Semantic Web model (ontology) –For knowledge organization systems (thesauri, subject heading lists…)

SKOS: Example [Diagram: a resource with rdf:type skos:Concept, carrying skos:prefLabel "the Virgin" and skos:prefLabel "la Vierge", a skos:broader link, and skos:inScheme pointing to a resource with rdf:type skos:ConceptScheme]

Converting subjects to standard representation language - Process Getting processable versions from owners –E.g. XML Analyzing the models Converting to SKOS
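The conversion step can be illustrated with a minimal Python sketch. It is not the project's converter: the record layout, the `example.org` namespace, the identifiers `c1` and `c0`, and the language tags are assumptions made for illustration. Only the two labels and the SKOS properties come from the slides.

```python
SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."
BASE = "http://example.org/shl/"  # placeholder namespace (assumption)

# Hypothetical record, standing in for one heading parsed from an owner's XML.
heading = {
    "id": "c1",
    "labels": {"en": "the Virgin", "fr": "la Vierge"},  # language tags assumed
    "broader": ["c0"],  # hypothetical parent heading
}


def quote(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_skos_turtle(record, base=BASE):
    """Serialise one heading as a skos:Concept in Turtle syntax."""
    subject = f"<{base}{record['id']}>"
    lines = [
        SKOS_PREFIX,
        "",
        f"{subject} a skos:Concept ;",
        f"    skos:inScheme <{base}scheme> ;",
    ]
    for lang, label in sorted(record["labels"].items()):
        lines.append(f'    skos:prefLabel "{quote(label)}"@{lang} ;')
    for parent in record.get("broader", []):
        lines.append(f"    skos:broader <{base}{parent}> ;")
    # Close the statement: the last predicate ends with "." instead of ";".
    lines[-1] = lines[-1][:-1] + "."
    return "\n".join(lines)


print(to_skos_turtle(heading))
```

Once every vocabulary is expressed this way, the same tools can read Rameau, SWD and LCSH headings without format-specific parsers.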

WP3.2 Sub-tasks Converting the subjects to standard representation language –Semantic web format (SKOS) Aligning the vocabularies –Semantic correspondences between subjects Deploying the alignment knowledge obtained into TEL framework –E.g. using links to reformulate queries from one subject list to the other

Vocabulary Alignment Specifying required alignment format (links) –Type of mapping links: equivalence, broader –Cardinality: one-to-one, one-to-many –Taking application context (TEL) into account
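One possible shape for such an alignment format is sketched below in Python. The class name `Mapping`, the helper `source_cardinality` and all concept identifiers are hypothetical. The two link types are the ones named on the slide, and cardinality is only checked from the source side.

```python
from collections import defaultdict
from dataclasses import dataclass

RELATIONS = {"equivalence", "broader"}  # link types named on the slide


@dataclass(frozen=True)
class Mapping:
    source: str    # concept identifier in the first subject heading list
    target: str    # concept identifier in the second subject heading list
    relation: str  # one of RELATIONS

    def __post_init__(self):
        if self.relation not in RELATIONS:
            raise ValueError(f"unknown relation: {self.relation}")


def source_cardinality(mappings):
    """Label each source concept by how many distinct targets it maps to."""
    targets = defaultdict(set)
    for m in mappings:
        targets[m.source].add(m.target)
    return {
        source: "one-to-one" if len(found) == 1 else "one-to-many"
        for source, found in targets.items()
    }


# Hypothetical alignment between two subject heading lists.
alignment = [
    Mapping("shl1:a", "shl2:x", "equivalence"),
    Mapping("shl1:b", "shl2:y", "equivalence"),
    Mapping("shl1:b", "shl2:z", "broader"),
]
print(source_cardinality(alignment))
```

Whether one-to-many links are acceptable depends on the application: a query reformulation service can use several targets, while a merge of two lists may need strict one-to-one equivalences.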

Vocabulary Alignment Specifying required alignment format (links) Selecting (& running) alignment techniques/tools –Inspired by semantic web approaches

Vocabulary Alignment Techniques Similar to ontology alignment problem Existing approaches for (semi-)automatic ontology alignment –Using techniques from linguistics, computer science, statistics Problem: performance does not allow 100% automatic alignment Problem: multilingual case –Some techniques cannot be used

Potential Technique: Using Background Knowledge Using a shared conceptual reference to find links [Diagram: SHL 1 and SHL 2 connected through background knowledge; example concepts: Calendar, Publication]
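A rough Python sketch of the idea follows. It assumes that each heading has already been anchored to one concept of the shared reference (how that anchoring is done is outside the sketch) and that the reference states that Calendar is narrower than Publication. That relation, the function name and the heading identifiers are illustrative assumptions.

```python
def align_via_background(anchors1, anchors2, broader):
    """Derive links between two lists from a shared reference.

    anchors1 / anchors2 map each heading to one reference concept.
    broader maps a reference concept to its parent in the reference.
    """
    links = set()
    for c1, ref1 in anchors1.items():
        for c2, ref2 in anchors2.items():
            if ref1 == ref2:
                links.add((c1, "equivalence", c2))
            elif broader.get(ref1) == ref2:
                links.add((c1, "broader", c2))  # c2 is broader than c1
    return links


# Hypothetical anchors and background relation.
anchors_shl1 = {"shl1:calendars": "Calendar"}
anchors_shl2 = {"shl2:publications": "Publication"}
background_broader = {"Calendar": "Publication"}

print(align_via_background(anchors_shl1, anchors_shl2, background_broader))
```

The two headings share no label, so a purely lexical comparison would miss the link. The background reference supplies it.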

Potential Technique: Statistical Alignment Object information (book indexing) [Diagram: SHL 1 and SHL 2 connected through dually-indexed books; example concepts: Dutch Literature, Dutch]
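As an illustration, the overlap between the books indexed with two headings can be scored with the Jaccard coefficient. The slide does not name a specific measure, so Jaccard is a choice made here. The book identifiers, the 0.5 threshold and the function name are likewise assumptions.

```python
def jaccard_candidates(index1, index2, threshold=0.5):
    """Propose heading pairs whose sets of indexed books overlap strongly.

    index1 / index2 map each heading to the set of books it indexes.
    Only dually-indexed books can contribute to the overlap.
    """
    candidates = []
    for c1, books1 in index1.items():
        for c2, books2 in index2.items():
            union = books1 | books2
            if not union:
                continue  # neither heading is used: nothing to compare
            score = len(books1 & books2) / len(union)
            if score >= threshold:
                candidates.append((c1, c2, score))
    # Strongest candidates first.
    return sorted(candidates, key=lambda item: item[2], reverse=True)


# Hypothetical indexing data: book identifiers per heading.
shl1_index = {"Dutch Literature": {"b1", "b2", "b3"}}
shl2_index = {"Dutch": {"b2", "b3", "b4"}, "Calendar": {"b9"}}

print(jaccard_candidates(shl1_index, shl2_index))
```

Because the score depends only on shared books, the technique works across languages, which matters for the multilingual case.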

Vocabulary Alignment Specifying required alignment format (links) Selection (& running) of tool/method Evaluation (& cleaning) –Considering application

Evaluation of Alignments MACS has produced mappings! –Possible gold standard But: has MACS produced all mappings? –Which proportion of the SHLs is covered? –Taking into account all indexing strings? Are MACS mappings the only interesting ones? –Serendipity mappings Concepts that are not equivalent but could bring useful results when added to queries –Compensating for indexing variability
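Using manual mappings as a gold standard can be sketched in Python as follows. The pairs and the vocabulary are invented for illustration. The function names are hypothetical, and the measures are plain precision, recall and coverage rather than anything prescribed by the project.

```python
def precision_recall(found, gold):
    """Compare automatically found pairs with gold-standard pairs."""
    found, gold = set(found), set(gold)
    correct = found & gold
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(gold) if gold else 0.0
    return precision, recall


def coverage(gold, vocabulary):
    """Proportion of a vocabulary's headings that appear as a gold source."""
    vocabulary = set(vocabulary)
    if not vocabulary:
        return 0.0
    mapped = {source for source, _ in gold}
    return len(mapped & vocabulary) / len(vocabulary)


# Hypothetical data: (SHL1 heading, SHL2 heading) pairs.
gold_pairs = {("a", "x"), ("b", "y")}
found_pairs = {("a", "x"), ("c", "z")}
shl1_headings = ["a", "b", "c", "d"]

print(precision_recall(found_pairs, gold_pairs))
print(coverage(gold_pairs, shl1_headings))
```

A pair that is found but absent from the gold standard is counted as an error here. If the gold standard is incomplete, or if non-equivalent mappings are useful, such pairs need manual inspection before being rejected.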

Evaluation of Alignments Several scenarios for using and evaluating alignments –Concept-based search –Re-indexing –Integration of one SHL into the other –SHL Merging –Free-text search –Navigation

Evaluation of Alignments Several scenarios for using and evaluating alignments –Concept-based search Retrieving books indexed by SHL1 using SHL2 concepts –Re-indexing –Integration of one SHL into the other –SHL Merging –Free-text search Matching user search terms to both SHL1 and SHL2 concepts –Navigation Browsing several collections using one SHL structure

Evaluation of Alignments Several settings for a single scenario –Fully automatic reformulation vs assisted reformulation (candidates) Different evaluation measures –Good mappings vs acceptable ones –Number of candidates for reformulation –Semantic closeness to original query

Vocabulary Alignment Specifying required alignment format (links) Selection (& running) of tool/method Evaluation (& cleaning) Assessment of the approach –Efforts required, quality, extensibility

WP3.2 Sub-tasks Converting the subjects to standard representation language –Semantic web format (SKOS) Aligning the vocabularies –Semantic correspondences between subjects Deploying the alignment knowledge obtained into TEL framework –E.g. using links to reformulate queries from one subject list to the other

Deploying the alignment knowledge obtained into TEL framework Observing integration of MACS data into TEL –Conceptual input for alignment requirements Integration of the obtained alignment in TEL Assessment of the alignment integration –Technical aspects, usage aspects
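Query reformulation with alignment links might look like the following Python sketch. It does not describe the TEL portal's actual interface: the function name, the triple layout and the heading identifiers are assumptions.

```python
def reformulate(query_concepts, mappings, relations=("equivalence",)):
    """Translate headings of one list into headings of the other.

    mappings is an iterable of (source, relation, target) triples.
    Only links whose relation is listed in `relations` are followed.
    """
    rewritten = set()
    for concept in query_concepts:
        for source, relation, target in mappings:
            if source == concept and relation in relations:
                rewritten.add(target)
    return sorted(rewritten)


# Hypothetical alignment between two subject heading lists.
links = [
    ("shl1:a", "equivalence", "shl2:x"),
    ("shl1:a", "broader", "shl2:w"),
    ("shl1:b", "equivalence", "shl2:y"),
]

# Strict reformulation: equivalences only.
print(reformulate(["shl1:a"], links))
# Looser reformulation: also accept broader headings as candidates.
print(reformulate(["shl1:a"], links, relations=("equivalence", "broader")))
```

The strict call suits fully automatic reformulation. The looser call produces a candidate list that a user could choose from in an assisted setting.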

Reminder Alignment is a difficult problem Application-specific alignment pretty much unexplored in Semantic Web research More a feasibility study than a complete solution to the problem Practical goal: investigate how automatic techniques could help MACS-like initiatives Manual mapping is labour-intensive

Agenda TELPlus Context Improving subject access –3 sub-tasks Services for TEL

WP4 – Integrating services with the European Library portal Theo van Veen (KB) Tasks: Identifying services that are going to give the user the greatest return Creating new services Integrating services within TEL …

WP4 – Some Services Mentioned Preliminary inventory: no official commitment! Services based on controlled vocabularies: Thesaurus and name authority service –Providing terms linked to query terms Semantic enrichment service –Users can annotate search results with terms Distance between terms and related terms

WP4 – Some Services Mentioned Preliminary inventory: no official commitment! Services based on controlled vocabularies: Thesaurus and name authority service Semantic enrichment service Distance between terms and related terms Adding more value from controlled vocabularies and alignments between them

Thanks!