Advanced Information Systems Laboratory Department of Computer Science and Systems Engineering GI-DAYS MÜNSTER A software tool.

Presentation transcript:

Advanced Information Systems Laboratory, Department of Computer Science and Systems Engineering
GI-DAYS MÜNSTER
A software tool for thesauri management, browsing and supporting advanced searches
J. Nogueras-Iso, J.A. Bañares, J. Lacasta, J. Zarazaga-Soria
Münster, June 2003

Contents
- Introduction
- Architecture of THManager application
- Basic capabilities
- Enhanced capabilities
- Conclusions

Introduction to thesauri
- "A thesaurus is a set of terms that describe the vocabulary of a controlled indexing language, formally organized so that the a priori relationships between concepts (for example synonymous terms, broader terms, narrower terms and related terms) are made explicit" [ISO 2788]
- Used to improve the precision and recall of information retrieval in digital libraries
  - Provide a uniform and consistent vocabulary for indexing metadata ("description of the data holdings")
  - Supply users with a suitable vocabulary for retrieval
  - Expand users' queries by automatically adding new terms to the query
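Query expansion, the last use listed on this slide, can be sketched as follows. This is a minimal illustration, not the tool's implementation: the `THESAURUS` dictionary, its entries, the synonym "mishap" and the function name `expand_query` are all assumptions made up for the example.

```python
from collections import deque

# Toy thesaurus: every entry and relation here is invented for illustration.
THESAURUS = {
    "accident": {"synonyms": ["mishap"],
                 "narrower": ["traffic accident", "nuclear accident"]},
    "traffic accident": {"synonyms": ["road accident"], "narrower": []},
    "nuclear accident": {"synonyms": [], "narrower": ["core meltdown"]},
    "core meltdown": {"synonyms": [], "narrower": []},
}


def expand_query(terms, thesaurus):
    """Return the query terms followed by their synonyms and narrower terms."""
    expanded, seen = [], set()
    queue = deque(terms)
    while queue:
        term = queue.popleft()
        if term in seen:
            continue
        seen.add(term)
        expanded.append(term)
        entry = thesaurus.get(term)
        if entry is None:
            continue  # not in the controlled vocabulary: keep the term as typed
        queue.extend(entry["synonyms"])
        queue.extend(entry["narrower"])
    return expanded


print(expand_query(["nuclear accident"], THESAURUS))
# ['nuclear accident', 'core meltdown']
```

Adding narrower terms tends to raise recall, since a record indexed with a more specific keyword also matches the broader query.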

Introduction to thesauri
- A thesaurus management tool is a vital component in the development of any kind of digital library
- One of the main objectives of Spatial Data Infrastructures (SDIs) is to provide discovery, evaluation and access to spatial data for a community of users
  - An SDI can be considered a digital library specialised in geographic information resources
- A thesaurus management tool is therefore also a vital component in the development of SDIs

Architecture of THManager application
[Figure: layered architecture diagram; only its labels survive, grouped here by apparent level.
 Level 3, Application: ThManager (basic, enhanced).
 Level 2, GUI: Thesaurus.gui, generic GUI components for thesauri visualization.
 Level 1, Model: Thesaurus.model (thesaurus management, import/export, keywords expansion); ThesaurusMngmt; Lexicon; WordNet; Polisemy (polysemy extraction, branch disambiguation).
 Level 0, Database: Thesaurus (100% SQL for basic use, Oracle IntermediaText for enhanced use); WordNet files; metadata records; keywords.]

Basic capabilities
- Editing of thesauri according to ISO norms
  - Broader terms (BT), narrower terms (NT)
  - Related terms (RT), preferred terms (PT)
  - Scope notes (SN), synonyms (SYN, USE)
  - Language translations (TR)
- Visualization of thesauri
  - Hierarchical, alphabetical
- Search of terms
- Multilingual access support
  - Browsing according to the language selected by users
- Import/export
  - Proprietary text file formats
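The relations and views listed on this slide map naturally onto a small data model. The sketch below is only one possible shape; the class names `Term` and `Thesaurus`, their methods, and the sample terms and translations are hypothetical and not taken from the tool.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    label: str                                        # preferred term (PT)
    broader: list = field(default_factory=list)       # BT
    narrower: list = field(default_factory=list)      # NT
    related: list = field(default_factory=list)       # RT
    synonyms: list = field(default_factory=list)      # SYN / USE
    scope_note: str = ""                              # SN
    translations: dict = field(default_factory=dict)  # TR: language -> label


class Thesaurus:
    def __init__(self):
        self.terms = {}

    def add(self, label, broader=None, **kwargs):
        """Add a term and keep the BT/NT relation consistent on both sides."""
        term = Term(label, **kwargs)
        self.terms[label] = term
        if broader is not None:
            term.broader.append(broader)
            self.terms[broader].narrower.append(label)
        return term

    def alphabetical(self, lang=None):
        """Alphabetical view, using the translation for `lang` when present."""
        labels = [t.translations.get(lang, t.label) for t in self.terms.values()]
        return sorted(labels, key=str.lower)

    def hierarchical(self, root, depth=0):
        """Hierarchical view: one indented line per term below `root`."""
        lines = ["  " * depth + root]
        for child in self.terms[root].narrower:
            lines.extend(self.hierarchical(child, depth + 1))
        return lines


th = Thesaurus()
th.add("accident", translations={"es": "accidente"})
th.add("traffic accident", broader="accident",
       translations={"es": "accidente de tráfico"})
th.add("nuclear accident", broader="accident")
print("\n".join(th.hierarchical("accident")))
```

Storing translations per term is what allows browsing in the language selected by the user; terms without a translation fall back to their default label.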

Browsing / Editing

Import/export formats
- Formats
  - Dot-based notation
    - Succession of narrower terms + additional relationships (SYN, TR, ...)
  - Hierarchical numbering of terms
- It should use more standardized formats:
  - RDFS/XML, ...
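The slides do not specify the dot-based notation. The sketch below assumes one plausible reading, in which the number of leading dots on a line gives the depth of the term in the hierarchy; the sample text, the function name `parse_dot_notation` and the omission of the additional relationships (SYN, TR, ...) are all assumptions.

```python
SAMPLE = """\
accident
.traffic accident
.nuclear accident
..core meltdown
administration
"""


def parse_dot_notation(text):
    """Map each term to its broader term (None for top terms).

    Assumed format: one term per line, prefixed by as many dots as its depth.
    """
    broader = {}
    path = []  # path[d] is the most recent term seen at depth d
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        label = line.lstrip(".")
        depth = len(line) - len(label)
        label = label.strip()
        if depth > len(path):
            raise ValueError(f"term {label!r} skips a hierarchy level")
        del path[depth:]  # climb back up to the parent level
        broader[label] = path[-1] if path else None
        path.append(label)
    return broader


print(parse_dot_notation(SAMPLE)["core meltdown"])
# nuclear accident
```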

Enhanced capabilities
- Thesauri are intended for the homogeneous classification of resources
  - They are used to fill metadata keywords
- However, there is still heterogeneity in metadata keywords
  - Metadata creators use different thesauri in different application domains
  - If metadata catalogs provide access to the general public
    - Queries may not contain the same terms as the keywords in metadata records
- A possible solution to bridge the semantic gap
  - Disambiguation of thesauri (and queries) in relation to the concepts of an upper-level ontology

Enhanced capabilities
- Additional tools around semantic disambiguation
  - Browsing WordNet as another thesaurus
  - Searching polysemic senses in WordNet
  - Thesauri disambiguation
  - Automatic expansion of keywords
[Figure labels: Thesaurus 1, Thesaurus 2, Thesaurus N; Controlled list 1, Controlled list 2, Controlled list N; Other knowledge representation models; WordNet]

Browsing WordNet
- WordNet is structured as a hierarchy of synsets
  - A synset is a set of synonyms representing a particular concept (sense)
- WordNet libraries and files are accessed through JNI

Searching polysemic senses in WordNet
- Functionality provided by the Polisemy package
- Compound terms are partitioned if no synset is found
- If adjectives are found, the associated nouns are also searched to reduce the number of not-found words
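The lookup strategy on this slide can be sketched with a toy lexicon standing in for WordNet. Everything named here is hypothetical: the `LEXICON` contents, the sense identifiers, the `ADJECTIVE_TO_NOUN` table and the function `find_senses`. The sketch also assumes that the noun is tried only when the adjective itself has no entry, which is one possible reading of the slide.

```python
# Toy stand-in for WordNet: word -> candidate senses (identifiers invented).
LEXICON = {
    "accident": ["accident#1", "accident#2"],
    "traffic": ["traffic#1"],
    "environment": ["environment#1", "environment#2"],
}
# Hypothetical adjective -> associated noun table.
ADJECTIVE_TO_NOUN = {"environmental": "environment"}


def find_senses(term, lexicon=LEXICON, adj_to_noun=ADJECTIVE_TO_NOUN):
    """Return (senses found per word, words with no sense at all)."""
    term = term.lower()
    if term in lexicon:  # the whole term has a synset: no partitioning
        return {term: lexicon[term]}, []
    found, missing = {}, []
    for word in term.split():  # partition the compound term
        if word in lexicon:
            found[word] = lexicon[word]
        elif adj_to_noun.get(word) in lexicon:
            noun = adj_to_noun[word]  # fall back to the associated noun
            found[noun] = lexicon[noun]
        else:
            missing.append(word)
    return found, missing


print(find_senses("environmental accident"))
print(find_senses("shipping accident"))
```

The second call leaves "shipping" in the not-found list, which is the quantity the adjective rule is meant to reduce.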

Thesauri disambiguation
- Unsupervised disambiguation method
  - The senses of every thesaurus term are searched in WordNet
  - The hierarchical structure of the thesaurus is used as the word context for a voting algorithm to find the closest sense
- Thesauri are partitioned into branches (trees formed by BT/NT terms whose root has no BT)
[Figure: example thesaurus branches. Labels as extracted: accident source environmental accident major accident traffic accident work accident technological accident shipping accident nuclear accident core meltdown oil sick accident explosion leakage administration...]
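Partitioning into branches amounts to grouping every term under the top term reached by following its BT links. A minimal sketch, assuming each term has at most one broader term; the `BROADER` map, its relations and the function name are illustrative only.

```python
# term -> broader term (None when the term has no BT); relations are invented.
BROADER = {
    "accident": None,
    "traffic accident": "accident",
    "nuclear accident": "accident",
    "core meltdown": "nuclear accident",
    "administration": None,
}


def partition_into_branches(broader):
    """Group terms by the root (term without BT) of their BT chain."""

    def root_of(term):
        seen = set()
        while broader.get(term) is not None:
            if term in seen:
                raise ValueError("cycle in BT relations")
            seen.add(term)
            term = broader[term]
        return term

    branches = {}
    for term in broader:
        branches.setdefault(root_of(term), []).append(term)
    return branches


for root, terms in partition_into_branches(BROADER).items():
    print(root, "->", terms)
```

Each branch then serves as the context for disambiguating the terms it contains.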

Thesauri disambiguation II
- Voting algorithm to obtain the disambiguated synset of a term "a"
  - Every synset "s" associated with the other terms in the branch votes (with a proximity weight) for the synsets of term "a"
  - Main weight: number of subsumers in the WordNet hierarchy
    - Matches in the WordNet hierarchy of ancestors
  - Discounting factors:
    - Synset depth
    - Branch distance
    - Polysemy of the term associated with synset "s"
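The slide names the ingredients of the vote but not the formula. The sketch below assumes a simple combination: each vote is the number of shared ancestors divided by the product of the candidate's depth, the branch distance between the two terms and the polysemy of the voting term. That formula, the toy data and all identifiers are assumptions, not the authors' algorithm.

```python
# Toy data, invented for illustration.
BROADER = {"accident": None, "explosion": "accident", "leakage": "accident"}
SENSES = {
    "explosion": ["explosion#burst", "explosion#growth"],
    "accident": ["accident#mishap", "accident#chance"],
    "leakage": ["leakage#escape"],
}
ANCESTORS = {  # synset -> its subsumers in a made-up hierarchy
    "explosion#burst": ["event", "happening", "discharge"],
    "explosion#growth": ["event", "change", "increase"],
    "accident#mishap": ["event", "happening", "misfortune"],
    "accident#chance": ["event", "happening"],
    "leakage#escape": ["event", "happening", "discharge"],
}


def path_to_root(term, broader):
    path = [term]
    while broader[term] is not None:
        term = broader[term]
        path.append(term)
    return path


def branch_distance(a, b, broader):
    """Number of BT/NT links between two terms of the same branch."""
    pa, pb = path_to_root(a, broader), path_to_root(b, broader)
    common = set(pa) & set(pb)
    up_a = next(i for i, t in enumerate(pa) if t in common)
    up_b = next(i for i, t in enumerate(pb) if t in common)
    return up_a + up_b


def disambiguate(term, branch_terms, senses, ancestors, broader):
    """Return (best synset for `term`, score of every candidate synset)."""
    scores = {}
    for candidate in senses[term]:
        cand_anc = set(ancestors[candidate])
        total = 0.0
        for other in branch_terms:
            if other == term:
                continue
            voters = senses.get(other, [])
            for s in voters:
                shared = len(cand_anc & set(ancestors[s]))  # main weight
                discount = (len(ancestors[candidate])        # synset depth
                            * branch_distance(term, other, broader)
                            * len(voters))                   # polysemy of voter
                total += shared / discount
        scores[candidate] = total
    return max(scores, key=scores.get), scores


best, scores = disambiguate("explosion", list(BROADER), SENSES, ANCESTORS, BROADER)
print(best)
# explosion#burst
```

In this toy branch the "burst" sense wins because it shares more ancestors with the senses of "accident" and "leakage" than the "growth" sense does.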

Thesauri disambiguation III
Annotation of disambiguated synsets

Automatic expansion of keywords with new disambiguated thesauri
Comparison between the initial collection of synsets and the synsets of a new term
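The slide does not say how the comparison is carried out. The sketch below assumes the simplest criterion, exact synset equality: a term from a newly disambiguated thesaurus is added when its synset already occurs among the synsets of the record's keywords. The function name, parameters and sample data are hypothetical.

```python
def expand_keywords(keywords, keyword_synsets, new_thesaurus_synsets):
    """Add terms of a new thesaurus whose synset matches an existing keyword.

    keyword_synsets:       keyword -> disambiguated synset (initial collection)
    new_thesaurus_synsets: term of the new thesaurus -> disambiguated synset
    """
    initial = {keyword_synsets[k] for k in keywords if k in keyword_synsets}
    added = [term for term, synset in new_thesaurus_synsets.items()
             if synset in initial and term not in keywords]
    return list(keywords) + sorted(added)


# Invented example data.
record_keywords = ["traffic accident"]
keyword_synsets = {"traffic accident": "collision#1"}
new_thesaurus = {"road crash": "collision#1", "oil spill": "spill#2"}

print(expand_keywords(record_keywords, keyword_synsets, new_thesaurus))
# ['traffic accident', 'road crash']
```

A looser criterion, such as accepting synsets that are close in the WordNet hierarchy, would add more terms at the cost of precision.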

Expansion of keywords II

Conclusions & future lines
- ThManager is a flexible tool to manage thesauri
  - It provides enhanced functionality for the improvement of classifications
- This tool can be easily integrated into other tools
  - It is used by a metadata editing tool (also presented here) to select the appropriate term for the distinct metadata fields
- Future lines:
  - Creation of a thesaurus Web Service providing some of the functionality offered by this tool
    - Thesaurus browsing, WordNet polysemy extraction, keywords expansion, ...
  - Concept-based retrieval
    - Exploit the semantic disambiguation of thesauri to test different information retrieval strategies for geographic data catalogs
    - It is possible to index metadata records according to a unified system: the disambiguated WordNet synsets

Advanced Information Systems Laboratory