Published by Tyree Bellman. Modified over 10 years ago.
1
Controlled Vocabularies in TELPlus
Antoine ISAAC
Vrije Universiteit Amsterdam
EDLProject Workshop, 22-23 November 2007
2
Agenda
TELPlus Context
Improving subject access
  – 3 sub-tasks
Services for TEL
3
TELPlus Context
Started October 2007
Running 27 months
Content WPs
  – OCRing previously digitised material
  – Improving the usability of TEL through OAI-PMH compliance
  – Improving Access
  – Integrating services with TEL portal
  – User personalisation services
  – Extending TEL to Bulgaria & Romania
4
WP3 – Improving Access
Task 1: Indexing for usability
  – Review/test state-of-the-art semantic search engines
      On content of documents
Task 2: Improving subject access
Task 3: FRBR aggregation, search and browsing
  – Create/exploit FRBR metadata repositories
Task 4: Focus on users
  – Focus groups on prototypes
5
WP 3 Task 2 – Improving Subject Access
Improving subject access via semantic alignment between subjects
Search through collections
  – Using metadata
  – In a controlled setting
Paving the way for enhanced usages
  – Advanced treatments mentioned in TELplus need conceptual structures and links between these structures
      E.g. clustering
6
WP 3 Task 2 – Improving Subject Access
Improving subject access via semantic alignment between subjects
Reference: MACS project
  – Manually-built semantic equivalences between Rameau, SWD & LCSH headings
7
MACS: Querying Collections
8
MACS: Query Reformulation Options
9
WP 3 Task 2 – Improving Subject Access
Improving subject access via semantic alignment between subjects
Reference: MACS project
  – Manual equivalences between Rameau, SWD, LCSH headings
Here: an experiment on deploying automatic alignment techniques
  – Determining possible strategies
  – Assessing feasibility and usefulness
  – MACS context
10
WP3.2 Sub-tasks
3.2.1. Converting the subjects to standard representation language
  – Semantic web format (SKOS)
3.2.2. Aligning the vocabularies
  – Semantic correspondences between subjects
3.2.3. Deploying the alignment knowledge obtained into TEL framework
  – E.g. using links to reformulate queries from one subject list to the other
11
Converting subjects to standard representation language
Goal: solving syntactic heterogeneity between vocabularies
Enabling the use of standard tools
  – E.g. for query (re)formulation
Paving the way for dealing with semantic heterogeneity
  – Definitions of concepts expressed according to a common model
12
Converting subjects to standard representation language
Approach: Semantic Web and SKOS
Semantic Web
  – Knowledge objects as web resources (URIs)
  – Description by linking resources (RDF)
  – Description using shared formal vocabularies (ontologies)
SKOS
  – A standard Semantic Web model (ontology)
  – For knowledge organization systems (thesauri, subject heading lists…)
13
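SKOS: Example
[Figure: RDF graph of an Iconclass concept, given here as triples]
http://www.iconclass.nl/s_11F rdf:type skos:Concept
http://www.iconclass.nl/s_11F skos:prefLabel "the Virgin Mary"@en
http://www.iconclass.nl/s_11F skos:prefLabel "la Vierge Marie"@fr
http://www.iconclass.nl/s_11F skos:broader http://www.iconclass.nl/s_11
http://www.iconclass.nl/s_11F skos:inScheme http://www.iconclass.nl/
http://www.iconclass.nl/ rdf:type skos:ConceptScheme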
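The SKOS example on this slide can be written out as plain triples. The sketch below is one possible reading of the figure, not the presenter's own code: it assumes that the labels and the skos:broader link belong to the concept s_11F, and it uses only the Python standard library. The helper names (`uri`, `literal`, `to_ntriples`) are made up for illustration.

```python
# Namespaces of the RDF and SKOS vocabularies.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# URIs taken from the slide. That s_11F is the labelled concept and s_11 its
# broader concept is an assumption about how the figure should be read.
SCHEME = "http://www.iconclass.nl/"
CONCEPT = SCHEME + "s_11F"
PARENT = SCHEME + "s_11"


def uri(value):
    """Format a URI the way N-Triples expects it."""
    return f"<{value}>"


def literal(text, lang):
    """Format a language-tagged literal."""
    return f'"{text}"@{lang}'


TRIPLES = [
    (uri(CONCEPT), uri(RDF + "type"), uri(SKOS + "Concept")),
    (uri(CONCEPT), uri(SKOS + "prefLabel"), literal("the Virgin Mary", "en")),
    (uri(CONCEPT), uri(SKOS + "prefLabel"), literal("la Vierge Marie", "fr")),
    (uri(CONCEPT), uri(SKOS + "broader"), uri(PARENT)),
    (uri(CONCEPT), uri(SKOS + "inScheme"), uri(SCHEME)),
    (uri(SCHEME), uri(RDF + "type"), uri(SKOS + "ConceptScheme")),
]


def to_ntriples(triples):
    """Serialise (subject, predicate, object) strings, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


print(to_ntriples(TRIPLES))
```

Each printed line is one statement about the concept or its scheme, which is the "description by linking resources" idea in its simplest form.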
14
Converting subjects to standard representation language - Process
Getting processable versions from owners
  – E.g. XML
Analyzing the models
Converting to SKOS
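The conversion process above could look roughly as follows for a vocabulary delivered as XML. This is a minimal sketch under stated assumptions: the XML layout (`heading`, `label`, `broader` elements), the `BASE` namespace and the `convert` function are all hypothetical, since the slides do not describe the actual export formats of the subject heading lists.

```python
import xml.etree.ElementTree as ET

# Hypothetical export format; real vocabulary dumps will differ.
SAMPLE_XML = """
<headings>
  <heading id="h1">
    <label lang="en">Dutch literature</label>
    <label lang="fr">Littérature néerlandaise</label>
    <broader ref="h0"/>
  </heading>
  <heading id="h0">
    <label lang="en">Literature</label>
  </heading>
</headings>
"""

BASE = "http://example.org/shl/"  # hypothetical namespace for minted URIs


def convert(xml_text, base=BASE):
    """Turn each <heading> into SKOS-style (subject, predicate, object) triples."""
    root = ET.fromstring(xml_text.strip())
    triples = []
    for heading in root.findall("heading"):
        subject = base + heading.get("id")
        triples.append((subject, "rdf:type", "skos:Concept"))
        triples.append((subject, "skos:inScheme", base))
        for label in heading.findall("label"):
            # Labels keep their language tag as a (text, lang) pair.
            triples.append((subject, "skos:prefLabel", (label.text, label.get("lang"))))
        for parent in heading.findall("broader"):
            triples.append((subject, "skos:broader", base + parent.get("ref")))
    return triples


for triple in convert(SAMPLE_XML):
    print(triple)
```

The "analyzing the models" step is where most of the real work lies: deciding which source fields correspond to preferred labels, hierarchy links and so on. The loop above only shows the mechanical part once that decision has been made.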
15
WP3.2 Sub-tasks
3.2.1. Converting the subjects to standard representation language
  – Semantic web format (SKOS)
3.2.2. Aligning the vocabularies
  – Semantic correspondences between subjects
3.2.3. Deploying the alignment knowledge obtained into TEL framework
  – E.g. using links to reformulate queries from one subject list to the other
16
Vocabulary Alignment
Specifying required alignment format (links)
  – Type of mapping links: equivalence, broader
  – Cardinality: one-to-one, one-to-many
  – Taking application context (TEL) into account
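To make the notions of link type and cardinality concrete, here is a small illustrative data structure. The `Mapping` class, the `cardinality` helper and the sample headings are assumptions made for this sketch and do not reflect an actual TELPlus or MACS format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """One alignment link between a source and a target heading."""
    source: str
    target: str
    relation: str  # "equivalence" or "broader"


def cardinality(mappings):
    """Classify each source heading by how many distinct targets it maps to.

    Simplification: only the source side is inspected, so "one-to-one" here
    means "exactly one target for this source".
    """
    targets = defaultdict(set)
    for m in mappings:
        targets[m.source].add(m.target)
    return {
        source: "one-to-one" if len(found) == 1 else "one-to-many"
        for source, found in targets.items()
    }


# Hypothetical sample links between two subject heading lists.
links = [
    Mapping("Calendars", "Calendriers", "equivalence"),
    Mapping("Dutch literature", "Littérature néerlandaise", "equivalence"),
    Mapping("Dutch literature", "Littérature", "broader"),
]

print(cardinality(links))
```

Whether one-to-many links are acceptable depends on the application, which is why the slide ties the format to the TEL context.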
17
Vocabulary Alignment
Specifying required alignment format (links)
Selecting (& running) alignment techniques/tools
  – Inspired by semantic web approaches
18
Vocabulary Alignment Techniques
Similar to ontology alignment problem
Existing approaches for (semi-)automatic ontology alignment
  – Using techniques from linguistics, computer science, statistics
Problem: performance does not allow fully automatic alignment
Problem: multilingual case
  – Some techniques cannot be used
19
Potential Technique: Using Background Knowledge
Using a shared conceptual reference to find links
[Figure: concepts from SHL 1 and SHL 2 (example labels: Calendar, Publication) linked through shared background knowledge]
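The idea of finding links through a shared conceptual reference can be sketched as follows. Everything in the data is hypothetical: the background hierarchy, the intermediate concept "periodical", and the assumption that headings have already been anchored to background concepts. Only the labels "calendar" and "publication" echo the slide.

```python
# Hypothetical background hierarchy: each concept points to its broader concept.
BACKGROUND_BROADER = {
    "calendar": "periodical",
    "periodical": "publication",
}


def ancestors(concept, broader):
    """Return all broader concepts of `concept`, nearest first."""
    seen = []
    while concept in broader:
        concept = broader[concept]
        if concept in seen:  # guard against cycles in the background data
            break
        seen.append(concept)
    return seen


def link(anchor1, anchor2, broader=BACKGROUND_BROADER):
    """Derive a link between two headings from their background anchors.

    anchor1 / anchor2 are the background concepts that a heading from SHL 1
    and a heading from SHL 2 have been matched to.
    """
    if anchor1 == anchor2:
        return "equivalence"
    if anchor2 in ancestors(anchor1, broader):
        return "broader"   # the SHL 2 heading is broader than the SHL 1 heading
    if anchor1 in ancestors(anchor2, broader):
        return "narrower"  # the SHL 2 heading is narrower than the SHL 1 heading
    return None


print(link("calendar", "publication"))
```

Anchoring the headings to the background resource is itself an alignment problem, so this technique moves part of the difficulty rather than removing it.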
20
Potential Technique: Statistical Alignment
Object information (book indexing)
[Figure: SHL 1 and SHL 2 linked through dually-indexed books; example labels: Dutch Literature, Dutch]
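One common way to exploit dually-indexed books is to compare the sets of books attached to each pair of headings. The sketch below uses the Jaccard coefficient as the overlap measure, which is an assumption: the slide does not say which statistic is intended. The headings, book identifiers and threshold are invented for illustration.

```python
def jaccard(a, b):
    """Overlap of two sets: size of intersection divided by size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical indexing data: heading -> identifiers of books indexed with it.
SHL1_INDEX = {
    "Dutch literature": {"b1", "b2", "b3"},
    "History": {"b4"},
}
SHL2_INDEX = {
    "Nederlandse letterkunde": {"b1", "b2", "b3", "b5"},
    "Geschiedenis": {"b4", "b6"},
}


def candidates(index1, index2, threshold=0.5):
    """Return (heading1, heading2, score) for pairs whose overlap reaches the threshold."""
    result = []
    for heading1, books1 in index1.items():
        for heading2, books2 in index2.items():
            score = jaccard(books1, books2)
            if score >= threshold:
                result.append((heading1, heading2, score))
    return sorted(result, key=lambda item: item[2], reverse=True)


for pair in candidates(SHL1_INDEX, SHL2_INDEX):
    print(pair)
```

Because this approach looks only at shared books and never at labels, it is one of the techniques that can still be used in the multilingual case.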
21
Vocabulary Alignment
Specifying required alignment format (links)
Selection (& running) of tool/method
Evaluation (& cleaning)
  – Considering application
22
Evaluation of Alignments
MACS has produced mappings!
  – Possible gold standard
But: has MACS produced all mappings?
  – Which proportion of the SHLs is covered?
  – Taking into account all indexing strings?
Are MACS mappings the only interesting ones?
  – Serendipity mappings
      Concepts that are not equivalent but could bring useful results when added to queries
  – Compensating for indexing variability
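If the MACS mappings are used as a gold standard, an automatic alignment can be scored with precision and recall, and the coverage question can be checked by counting which headings the gold standard touches at all. This is a minimal sketch with made-up data; the function names and the sample mappings are assumptions.

```python
def precision_recall(found, gold):
    """Score a set of (source, target) mappings against a gold standard."""
    correct = found & gold
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(gold) if gold else 0.0
    return precision, recall


def coverage(headings, gold):
    """Share of source headings that appear in at least one gold mapping."""
    mapped = {source for source, _ in gold}
    return len(headings & mapped) / len(headings) if headings else 0.0


# Hypothetical data: manual (gold) mappings and automatically found mappings.
GOLD = {("a", "x"), ("b", "y"), ("c", "z")}
FOUND = {("a", "x"), ("b", "y"), ("d", "w"), ("e", "v")}
SHL1_HEADINGS = {"a", "b", "c", "d", "e", "f"}

print(precision_recall(FOUND, GOLD))
print(coverage(SHL1_HEADINGS, GOLD))
```

If the gold standard is incomplete, a mapping counted as wrong here may simply be missing from the manual set, so precision measured this way can understate quality. The same holds for serendipity mappings, which are useful without being equivalences.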
23
Evaluation of Alignments
Several scenarios for using and evaluating alignments
  – Concept-based search
  – Re-indexing
  – Integration of one SHL into the other
  – SHL Merging
  – Free-text search
  – Navigation
24
Evaluation of Alignments
Several scenarios for using and evaluating alignments
  – Concept-based search
      Retrieving books indexed by SHL1 using SHL2 concepts
  – Re-indexing
  – Integration of one SHL into the other
  – SHL Merging
  – Free-text search
      Matching user search terms to either SHL1 or SHL2 concepts
  – Navigation
      Browsing several collections using one SHL structure
25
Evaluation of Alignments
Several settings for a single scenario
  – Fully automatic reformulation vs assisted reformulation (candidates)
Different evaluation measures
  – Good mappings vs acceptable ones
  – Number of candidates for reformulation
  – Semantic closeness to original query
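The difference between fully automatic and assisted reformulation can be illustrated with a small function. This is a sketch only: the mapping table, the confidence scores and the `reformulate` function are hypothetical, and ranking by a single score is just one possible stand-in for "semantic closeness".

```python
# Hypothetical alignment: SHL1 heading -> list of (SHL2 heading, confidence).
MAPPINGS = {
    "Dutch literature": [
        ("Letterkunde", 0.4),
        ("Nederlandse letterkunde", 0.9),
        ("Nederlandse taal", 0.3),
    ],
}


def reformulate(subject, mappings, automatic=True, max_candidates=2):
    """Translate a subject query from one heading list to the other.

    automatic=True  : return only the best-scoring target (no user involved).
    automatic=False : return up to `max_candidates` targets for the user to pick from.
    """
    ranked = sorted(mappings.get(subject, []), key=lambda pair: pair[1], reverse=True)
    if automatic:
        return [ranked[0][0]] if ranked else []
    return [target for target, _ in ranked[:max_candidates]]


print(reformulate("Dutch literature", MAPPINGS))
print(reformulate("Dutch literature", MAPPINGS, automatic=False))
```

In the automatic setting only the top mapping matters, so evaluation focuses on whether it is good. In the assisted setting acceptable mappings also help, and the number of candidates shown becomes a parameter to evaluate.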
26
Vocabulary Alignment
Specifying required alignment format (links)
Selection (& running) of tool/method
Evaluation (& cleaning)
Assessment of the approach
  – Effort required, quality, extensibility
27
WP3.2 Sub-tasks
3.2.1. Converting the subjects to standard representation language
  – Semantic web format (SKOS)
3.2.2. Aligning the vocabularies
  – Semantic correspondences between subjects
3.2.3. Deploying the alignment knowledge obtained into TEL framework
  – E.g. using links to reformulate queries from one subject list to the other
28
Deploying the alignment knowledge obtained into TEL framework
Observing integration of MACS data into TEL
  – Conceptual input for alignment requirements
Integration of the obtained alignment in TEL
Assessment of the alignment integration
  – Technical aspects, usage aspects
29
Reminder
Alignment is a difficult problem
Application-specific alignment is largely unexplored in Semantic Web research
More a feasibility study than a complete solution to the problem
Practical goal: investigate how automatic techniques could help MACS-like initiatives
Manual mapping is labour-intensive
30
Agenda
TELPlus Context
Improving subject access
  – 3 sub-tasks
Services for TEL
31
WP4 – Integrating services with the European Library portal
Theo van Veen (KB)
Tasks:
Identifying services that are going to give the user the greatest return
Creating new services
Integrating services within TEL
…
32
WP4 – Some Services Mentioned
Preliminary inventory: no official commitment!
Services based on controlled vocabularies:
Thesaurus and name authority service
  – Providing terms linked to query terms
Semantic enrichment service
  – Users can annotate search results with terms
Distance between terms and related terms
33
WP4 – Some Services Mentioned
Preliminary inventory: no official commitment!
Services based on controlled vocabularies:
Thesaurus and name authority service
Semantic enrichment service
Distance between terms and related terms
Adding more value from controlled vocabularies and alignments between them
34
Thanks!