Retrieval Concepts and Mapping Strategies: The Potential of CrissCross for Improving Access to the DDC Jessica Hubrich, M.A., M.L.I.S. Team leader CrissCross.

Presentation transcript:

Retrieval Concepts and Mapping Strategies: The Potential of CrissCross for Improving Access to the DDC
Jessica Hubrich, M.A., M.L.I.S., Team leader CrissCross project
Cologne University of Applied Sciences, Institute of Information Management
Symposium “Dewey goes Europe”, Austrian National Library, 28th April 2009

Starting Point
Functionality and efficiency of topical search processes depend on the underlying retrieval concepts and on the kind of subject data integrated into information retrieval systems. Compared to homogeneous retrieval environments, heterogeneous information spaces require enhanced concepts that take into account the specificity of the information space and the potential of the distinct indexing data in use.
Questions
- How do retrieval concepts influence search functionalities?
- To what extent can the establishment of links between distinct indexing languages improve the efficiency of topical queries in heterogeneous information spaces?
- What are the benefits of the linkages produced within the CrissCross project?
Vienna, 28th April 2009

Retrieval Concepts (I)
Retrieval concepts aim to support
- document retrieval in the narrower sense of the term
- information seekers in finding relevant documents by providing tools for orientation, navigation and exploration
Ideally, retrieval concepts are accompanied by concepts of relevance ranking.
Central retrieval concepts with respect to topical queries: Basic Search, Concept Search, Concept Exploration, Topical Exploration

Retrieval Concepts (II)
Basic search based on string matching
Initial search terms are compared with elements of a generated index and might refer to
- keywords of titles or of abstracts
- the main form of subject headings
- notations
Modifications of this search are found in many library OPACs, often combined with the possibility of searching within indices.
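The string-matching idea described on this slide can be illustrated with a minimal Python sketch. It is not taken from any particular OPAC: the record ids, field names, field values and the function names `build_index` and `basic_search` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical records; field names and values are invented for illustration.
RECORDS = {
    "r1": {"title": "Apfel und Birne im Garten", "subject": "Apfel", "notation": "N1"},
    "r2": {"title": "Gartenapfel Sorten", "subject": "Obstbau", "notation": "N2"},
}

def build_index(records):
    """Map each lower-cased token to the ids of the records containing it."""
    index = defaultdict(set)
    for record_id, fields in records.items():
        for text in fields.values():
            for token in text.lower().split():
                index[token].add(record_id)
    return index

def basic_search(index, query):
    """Return ids of records that contain every query token as an exact string."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    hits = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        hits &= index.get(token, set())
    return hits

INDEX = build_index(RECORDS)
```

In this sketch a query for "Apfel" finds only record r1. Record r2, which uses the variant "Gartenapfel", is missed, which is the limitation that concept matching addresses.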

Retrieval Concepts (III)
Conceptual query based on concept matching
Initial search terms are enhanced and modified with regard to the intended concept. The efficiency of this feature depends on the quality of the integrated controlled vocabulary that identifies synonyms. This search can be found in many library OPACs, sometimes combined with the possibility of searching within the specific subject index.
(Resource: ULB Münster)

Retrieval Concepts (IV)
Conceptual exploration based on a priori conceptual relations
The semantic environment of the concept that corresponds to the initial search term is provided for search modification. The degree of orientation and the efficiency of such a feature depend on the quality and expressiveness of the semantic structure of the knowledge system that is referred to. The expressiveness of semantic relations within indexing languages is often restricted. This retrieval concept has not yet been integrated adequately into library OPACs.
(Resource: ULB Münster)
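A minimal Python sketch of what "providing the semantic environment" could look like, assuming a thesaurus with broader (BT), narrower (NT) and related (RT) terms. The entries in `THESAURUS` are invented examples, not actual SWD data, and `semantic_environment` is a hypothetical helper name.

```python
# Invented thesaurus entries for illustration only (not actual SWD records).
THESAURUS = {
    "Apfel": {"BT": ["Kernobst"], "NT": ["Tafelapfel"], "RT": ["Apfelbaum"]},
    "Kernobst": {"BT": ["Obst"], "NT": ["Apfel", "Birne"], "RT": []},
}

def semantic_environment(thesaurus, heading):
    """Return the a priori relations of a heading, offered for search modification."""
    entry = thesaurus.get(heading, {})
    return {relation: list(entry.get(relation, [])) for relation in ("BT", "NT", "RT")}

def broaden(thesaurus, heading):
    """One possible search modification: move to the broader term(s)."""
    return semantic_environment(thesaurus, heading)["BT"]
```

A user who starts at "Apfel" would be shown "Kernobst" as a broader term and could continue exploring from there. How useful this is depends on how rich the relations in the underlying system are.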

Retrieval Concepts (V)
Topical exploration based on a posteriori conceptual relations
Taking former search results as starting points, this retrieval concept aims to support topical exploration processes that assist information seekers in clarifying their information needs. Expressive a priori semantic relations between the concepts of an integrated knowledge organization system, as well as syntactical operators, are provided that allow qualified statements about the a posteriori relations inherent in the topics of the specific documents. A system that adequately supports processes of topical exploration has not yet been realized.

Relevance Ranking
Search concepts in the narrower sense of the term can be supplemented by concepts of relevance ranking. Concepts of relevance ranking provide algorithms for the ordered display of search results, based on specific assumptions about the factors that may influence the relevance of a document with respect to the search conducted.
Criteria for topical ranking in library catalogues might be
- Uniqueness of search terms within the database
- Proportion of search terms present in a bibliographic record
- Fields in which search terms occur (subject fields vs. title fields)
- ...
With respect to heterogeneous information spaces, criteria concerning the relevance of embedded data from distinct indexing languages must be developed that integrate the potential offered by the specific mapping data.
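The three criteria listed above can be combined into a toy scoring function, sketched here in Python. The formula, the field weights in `FIELD_WEIGHTS` and the sample records are assumptions chosen for illustration; the slides do not prescribe a specific algorithm.

```python
import math

# Assumed weights: a hit in a subject field counts more than a hit in a title field.
FIELD_WEIGHTS = {"subject": 2.0, "title": 1.0}

# Invented sample records.
RECORDS = {
    "r1": {"title": "apple orchards", "subject": "fruit growing"},
    "r2": {"title": "cooking", "subject": "apple recipes"},
    "r3": {"title": "apple pests", "subject": "apple"},
}

def score(record, terms, doc_freq, n_records):
    """Combine term uniqueness, field weight and proportion of matched terms."""
    matched = set()
    total = 0.0
    for field_name, text in record.items():
        tokens = set(text.lower().split())
        for term in terms:
            if term in tokens:
                matched.add(term)
                # Rarer terms (low document frequency) contribute more.
                uniqueness = math.log(n_records / doc_freq[term]) + 1.0
                total += FIELD_WEIGHTS.get(field_name, 0.5) * uniqueness
    return total * (len(matched) / len(terms))

def rank(records, query):
    """Return record ids ordered by descending score; records without hits are dropped."""
    terms = query.lower().split()
    if not terms:
        return []
    doc_freq = {
        term: sum(
            any(term in text.lower().split() for text in record.values())
            for record in records.values()
        )
        for term in terms
    }
    scored = [(score(r, terms, doc_freq, len(records)), rid) for rid, r in records.items()]
    return [rid for s, rid in sorted(scored, key=lambda pair: (-pair[0], pair[1])) if s > 0]
```

With the sample data, `rank(RECORDS, "apple")` places r3 (hits in subject and title) before r2 (subject only) and r1 (title only).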

Retrieval Concepts and Mapping Strategies
With respect to heterogeneous information spaces, the functionality and efficiency of queries can be improved considerably by establishing links between the relevant indexing languages. However, the practicability of such links for the different retrieval concepts differs according to the specific mapping strategy applied.
[Diagram relating mapping strategies (Basic Mapping, Conceptual Mapping, Semantic Mapping) to retrieval concepts (Basic Search, Concept Search, Concept Exploration, Topical Exploration)]

Mapping Strategies (I)
Basic mapping focused on the main representation form of a concept
Crosswalks between indexing languages are established taking the main representation form of a concept as the starting point. The semantic relations between the mapped terms are not described further. Generally, the mappings are saved separately from the databases of the knowledge systems.
In retrieval scenarios the matching algorithms are extended, taking advantage of existing indexing data. Recall is improved.
- Equivalence links are conceived as term clusters.
- Controlled access points to other vocabularies are provided in the form of main headings; information seekers can use the language they are familiar with.
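A minimal Python sketch of equivalence links treated as term clusters. The cluster content and the vocabulary labels are illustrative assumptions and are not taken from an actual crosswalk file; `expand_heading` is a hypothetical helper.

```python
# One illustrative cluster of main headings across three vocabularies.
# The headings are assumed examples, stored separately from the vocabularies themselves.
CLUSTERS = [
    {"SWD": "Apfel", "LCSH": "Apples", "RAMEAU": "Pommes"},
]

def expand_heading(heading, clusters):
    """Return all main headings linked to the given one; the relation itself is not typed."""
    for cluster in clusters:
        if heading in cluster.values():
            return set(cluster.values())
    return {heading}
```

A search entered as "Apples" would thus also be run with "Apfel" and "Pommes", which raises recall. Because the links carry no further semantics, the sketch cannot say how closely the clustered headings correspond.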

Mapping Strategies (II)
The mapping strategy of Multilingual Access to Subjects (MACS) was originally based on this mapping concept.
(Resource: [missing])

Mapping Strategies (III)
Conceptual mapping focused on concepts
The mapping strategy aims to establish linkages between concepts of distinct indexing languages, taking the whole connotation scope of a concept as the starting point and describing the mapping direction exactly wherever necessary. The intersystem relations are described further and are stored together with the identifier of the mapped concept(s) within a knowledge organization system.
In retrieval scenarios the matching algorithms are extended further, taking advantage of existing indexing data. Recall is improved.
- Conceptual search is supported.
- Intersystem relations make it possible to influence recall and precision and to navigate more effectively between knowledge systems.

Mapping Strategies (IV)
Semantic mapping considering the concepts as well as intraconcept relations
Ideally, mapping relations complement highly expressive and accurately structured relational knowledge systems. The relational structure of the participating systems contributes to the meaning and usage of the individual concepts. The strategy of semantic mapping is characterized by taking the structural and functional setups of these systems into account and additionally establishing expressive, logically valid and specified intersystem relations.
Semantic mapping has not yet been conducted. However, the added value would be substantial: in retrieval scenarios all search matching processes would be supported, as well as intercultural and international concept exploration.

CrissCross
Project run time: 2006-2010
Project sponsor: German Research Foundation
Cooperation partners: German National Library, Cologne University of Applied Sciences
Aim: Creation of a thesaurus-based and user-friendly research vocabulary that facilitates research in heterogeneously indexed collections
Central focus: Linking of subject headings of the German Subject Heading Authority File (SWD) to notations of the Dewey Decimal Classification (DDC)
[Diagram: Basic Mapping, Conceptual Mapping, Semantic Mapping; Basic Search, Concept Search, Concept Exploration]

CrissCross: Mapping Strategy (I)
Characteristics of the CrissCross conceptual mapping
- unidirectional: SWD → DDC
- as comprehensive as possible / one-to-many mapping
- as specific as possible / deep-level mapping
Built numbers constructed within the frame of CrissCross are stored institutionally in MelvilClass (including number components).
Example (apples): interdisciplinary works on apples are located in the class for apples as food; further mappings cover works that refer to disciplinary aspects of the subject heading (botany / agriculture).

CrissCross: Mapping Strategy (II)
Allocated notations are stored directly in the data record of the specific SWD subject heading.
The semantic structure of the SWD is available together with the mappings.

CrissCross: Mapping Strategy (III)
The different levels of content congruence between SWD subject headings and the assigned DDC notations are expressed by four so-called Degrees of Determinacy, which are aligned to the direction of the mapping as well as to the mapping specificity and are, wherever possible, adjusted to the structure of the target classification (esp. instance-class relations).
- Det 4: Connotation scope is (nearly) identical
- Det 3: Connotation scope approximates the whole
- Det 2: Connotation scope reflects a part
- Det 1: Connotation scope corresponds to a small part
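One way to picture a one-to-many mapping with Degrees of Determinacy is the following Python sketch. The class and field names are hypothetical, the notation strings are placeholders rather than real DDC numbers, and the degrees assigned to each placeholder are invented; only the four-level scale comes from the slide.

```python
from dataclasses import dataclass
from enum import IntEnum

class Determinacy(IntEnum):
    """The four Degrees of Determinacy listed on the slide."""
    SMALL_PART = 1          # Det 1: corresponds to a small part
    PART = 2                # Det 2: reflects a part
    APPROXIMATES_WHOLE = 3  # Det 3: approximates the whole
    NEARLY_IDENTICAL = 4    # Det 4: (nearly) identical

@dataclass(frozen=True)
class Mapping:
    swd_heading: str      # source: SWD subject heading
    ddc_notation: str     # target: placeholder, not a real DDC number
    determinacy: Determinacy

# Invented example of a one-to-many, unidirectional mapping (SWD -> DDC).
MAPPINGS = [
    Mapping("Apfel", "NOTATION-AGRICULTURE", Determinacy.PART),
    Mapping("Apfel", "NOTATION-FOOD", Determinacy.NEARLY_IDENTICAL),
    Mapping("Apfel", "NOTATION-BOTANY", Determinacy.SMALL_PART),
]

def notations_for(heading, mappings):
    """Return the target notations of a heading, best-fitting class first."""
    hits = [m for m in mappings if m.swd_heading == heading]
    hits.sort(key=lambda m: (-m.determinacy, m.ddc_notation))
    return [(m.ddc_notation, int(m.determinacy)) for m in hits]
```

Storing the degree alongside each link keeps the mapping direction explicit and lets later retrieval steps distinguish a near-identical class from one that covers only a small part of the heading's scope.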

CrissCross: Retrieval Concepts (I)
String matching / concept matching
- String matching: SWD main headings as additional access points to the DDC (Apfel)
- Concept matching: SWD concepts as additional access points to the DDC (Apfel; UF Gartenapfel; UF Malus communis; UF Äpfel; UF Malus domestica)
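The difference between the two access paths can be sketched in Python using the Apfel entry shown on the slide. The function names and the lookup structure are assumptions made for this illustration.

```python
# Main heading and "used for" (UF) variants as shown on the slide.
SWD_ENTRIES = {
    "Apfel": ["Gartenapfel", "Malus communis", "Äpfel", "Malus domestica"],
}

def build_lookup(entries):
    """Map every case-folded form (main heading or UF variant) to its main heading."""
    lookup = {}
    for heading, variants in entries.items():
        for form in [heading, *variants]:
            lookup[form.casefold()] = heading
    return lookup

def match_string(term, entries):
    """String matching: only the main heading itself is an access point."""
    return term if term in entries else None

def match_concept(term, lookup):
    """Concept matching: any variant leads to the concept's main heading."""
    return lookup.get(term.strip().casefold())

LOOKUP = build_lookup(SWD_ENTRIES)
```

Here `match_string("Malus domestica", SWD_ENTRIES)` finds nothing, while `match_concept("Malus domestica", LOOKUP)` resolves to "Apfel", from which the mapped DDC notations could then be reached.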

CrissCross: Retrieval Concepts (II)
Conceptual exploration
- based on the semantic structure of the DDC (primarily hierarchical)
- based on the semantic structure of the SWD (BT, NT, RT)

CrissCross: Retrieval Concepts (III)
Conceptual exploration based on CrissCross

CrissCross: Retrieval Concepts (IV)
Conceptual exploration based on SWD and CrissCross

CrissCross: Relevance Ranking (I)
Owing to the qualitative mapping strategy, which is adjusted to the participating knowledge systems, CrissCross provides several possibilities for relevance ranking:
Ranking of documents that are assigned a specific DDC number, based on the Degrees of Determinacy, as the Degrees of Determinacy describe how a subject heading "fits" into a class.

CrissCross: Relevance Ranking (II)
Ranking of documents with different DDC numbers, based on the Degrees of Determinacy
The Degrees of Determinacy are adjusted to the relations between topics and classes as they are displayed in the DDC, and these relations are based on literary warrant. It is therefore likely that more relevant literature on the concept described by the subject heading can be found within a set of documents assigned a DDC number that is mapped with a higher Degree of Determinacy. Retrieval tests conducted so far have confirmed this assumption.
If the integration of the mapping data leads to an unmanageable search result set, the Degrees of Determinacy can likewise be used to control recall (and precision).
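Both uses of the Degrees of Determinacy described here, ordering results and limiting the result set, can be sketched in a few lines of Python. The notation labels, degrees and document ids are invented placeholders, and `rank_by_determinacy` is a hypothetical function, not the procedure used in the project's retrieval tests.

```python
# Invented mapping for one subject heading: placeholder notation -> Degree of Determinacy.
MAPPED_NOTATIONS = {"N-A": 4, "N-B": 2, "N-C": 1}

# Invented documents as (document id, assigned notation).
DOCUMENTS = [("d1", "N-B"), ("d2", "N-A"), ("d3", "N-C"), ("d4", "N-A"), ("d5", "N-X")]

def rank_by_determinacy(documents, mapped, min_degree=1):
    """Order documents by the degree of their notation; drop those below min_degree."""
    hits = [
        (mapped[notation], doc_id)
        for doc_id, notation in documents
        if mapped.get(notation, 0) >= min_degree
    ]
    return [doc_id for degree, doc_id in sorted(hits, key=lambda hit: (-hit[0], hit[1]))]
```

Calling the function with the default threshold returns all mapped documents with the best-fitting class first; raising `min_degree` to 3 keeps only d2 and d4, trading recall for precision.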

CrissCross: Relevance Ranking (III)
Even when search results are displayed following a search expansion that integrates a posteriori concepts, the Degrees of Determinacy indicate which assigned DDC numbers might be of higher relevance.
Search term: Schmetterling (Lepidoptera) #4#
Search results: 23
Automatic search expansion possible
- Documents with assigned notations that reflect subordinate classes, Ex (Papilionoidea (Butterfly))
- Documents with an assigned built number with base number, Ex (Lepidoptera in Europe) 12

CrissCross: Future Prospects
CrissCross and the Semantic Web
- Simple Knowledge Organization System (SKOS) is a quasi-standard for publishing knowledge organisation systems on the Semantic Web, but it is not adjusted to classifications or to mappings between typologically distinct knowledge systems
- CrissCross relations cannot be represented adequately in SKOS mapping relations
- Solution: using SKOS and OWL (Web Ontology Language), constructing an adequate RDF representation
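To make the representation problem concrete: a direct link between two concepts has no slot for a Degree of Determinacy, so one conceivable approach is to describe each mapping as a resource of its own. The Python sketch below emits such a description as N-Triples. It is only an illustration and not the project's solution: the `http://example.org/` namespace, the property names and the URIs in the usage note are all hypothetical.

```python
# Hypothetical namespace and property names, chosen only for this illustration.
EX = "http://example.org/crisscross#"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

def mapping_triples(mapping_id, swd_uri, ddc_uri, degree):
    """Describe one mapping as its own resource so the degree can be attached to it."""
    if degree not in (1, 2, 3, 4):
        raise ValueError("Degree of Determinacy must be between 1 and 4")
    node = f"<{EX}{mapping_id}>"
    return [
        f"{node} <{EX}sourceConcept> <{swd_uri}> .",
        f"{node} <{EX}targetClass> <{ddc_uri}> .",
        f'{node} <{EX}degreeOfDeterminacy> "{degree}"^^<{XSD_INTEGER}> .',
    ]
```

For example, `mapping_triples("m1", "http://example.org/swd/Apfel", "http://example.org/ddc/class-a", 4)` yields three statements about the mapping resource m1, keeping direction (source, target) and degree together.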

CrissCross: Future Vision?

Thank you for your attention!
Homepage CrissCross project
Jessica Hubrich, M.A., M.L.I.S., Team Leader CrissCross project