Vocabulary and Terminology Standards

Presentation transcript:

Vocabulary and Terminology Standards. Marjorie M.K. Hlava, President, Access Innovations, Inc.

What are they for? Accuracy; disambiguation; recall; precision; relevance; removing noise; increasing hits.

Vocabulary Control: Define the scope of each term (its meaning). Establish equivalence between synonyms and quasi-synonyms so that a single concept has a single term (Cars / Automobiles). Distinguish homographs (Mercury: planet, metal, myth, car).
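A minimal sketch of the two controls named on this slide, synonym equivalence and homograph disambiguation. The table contents, the parenthetical qualifiers, and the function name `normalize` are hypothetical and only illustrate the idea; a real thesaurus would hold far more entries and relationship types.

```python
# Hypothetical entry terms mapped to one preferred term per concept.
PREFERRED = {
    "cars": "Automobiles",
    "automobiles": "Automobiles",
}

# Homographs get a qualifier so that each controlled term has one meaning.
HOMOGRAPHS = {
    "mercury": [
        "Mercury (planet)",
        "Mercury (metal)",
        "Mercury (myth)",
        "Mercury (car)",
    ],
}


def normalize(term):
    """Return the controlled term(s) that a free-text entry term leads to."""
    key = term.strip().lower()
    if key in HOMOGRAPHS:
        # Ambiguous entry: the indexer or searcher has to choose one sense.
        return list(HOMOGRAPHS[key])
    if key in PREFERRED:
        return [PREFERRED[key]]
    return []  # the term is not in the vocabulary


print(normalize("Cars"))
print(normalize("Mercury"))
```

Returning a list in every case keeps the caller's logic uniform: one item means the term is resolved, several mean a choice is needed, and none means the term is a candidate for addition.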

Nice to have: Equivalence (synonyms, as in Roget's); Associative (related terms); Hierarchies (taxonomy view); History (development of the term); Notes (help the user); Notation (a different way to sort); Display (how to show the terms).

Where do they come from? ANSI/NISO (National Information Standards Organization); ISO (International Organization for Standardization), TC 46 and TC 37; W3C (World Wide Web Consortium); US OMB; IFLA.

Effectiveness of Indexing: Indexing as a means for identifying and retrieving content objects depends upon a well-constructed indexing language. It improves precision by defining the scope of the terms, and improves recall by allowing different terms for the same concept.
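The two measures named on this slide can be made concrete with a small calculation. The sketch below uses the standard definitions (precision = relevant retrieved / all retrieved, recall = relevant retrieved / all relevant); the document IDs and result sets are invented for illustration.

```python
def precision_recall(retrieved, relevant):
    """Compute (precision, recall) for a set of retrieved document IDs."""
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection: four documents are truly about automobiles.
relevant = {"d1", "d2", "d3", "d4"}

# A free-text search on "cars" misses documents that say "automobiles"
# and picks up one off-topic document.
free_text = {"d1", "d2", "d9"}

# A search on the controlled term finds every document indexed with it.
controlled = {"d1", "d2", "d3", "d4"}

print(precision_recall(free_text, relevant))
print(precision_recall(controlled, relevant))
```

In this toy case the controlled search raises both measures, which is the claim the slide makes; real collections will show smaller and less tidy differences.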

Vocabulary Control: Eliminate ambiguity. Control synonyms (ensure a term has only one meaning). Establish relationships. Test and validate terms. Gather the linguistic relationships in one place.

Facets: A bottom-up approach to organization. Find the parts of the knowledge. Useful in: developing areas; interdisciplinary fields; multiple hierarchies; cases where the location of the CO (content object) is not important; cases where several groups have ownership of the whole…

Facets: Multiple attributes for a single object. Examples: topic; by composition; by use; format; audience; intellectual level. Could be fields, could be facets.
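As a rough illustration of "multiple attributes for a single object", the sketch below stores a few facets as record fields and filters on any combination of them. The records, facet values, and function names are hypothetical.

```python
# Hypothetical content objects, each described by several facets.
RECORDS = [
    {"title": "Intro to Thesauri", "topic": "indexing",
     "format": "slides", "audience": "lay public"},
    {"title": "Thesaurus Construction", "topic": "indexing",
     "format": "book", "audience": "academic"},
    {"title": "Ontology Primer", "topic": "ontologies",
     "format": "slides", "audience": "academic"},
]


def filter_by_facets(records, **facets):
    """Keep the records whose fields match every requested facet value."""
    return [
        record for record in records
        if all(record.get(name) == value for name, value in facets.items())
    ]


def facet_counts(records, facet):
    """Count how many records carry each value of one facet."""
    counts = {}
    for record in records:
        counts[record[facet]] = counts.get(record[facet], 0) + 1
    return counts


print(filter_by_facets(RECORDS, topic="indexing", format="slides"))
print(facet_counts(RECORDS, "audience"))
```

Because no facet is privileged, the same object can be reached by topic, by format, or by audience, which is why facets suit material with several possible hierarchies.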

Interoperability: Same content, different domains; different vocabulary, same domain; degree of specificity / granularity; handling of synonyms; search methodology; Z39.50; pre-coordinated; post-coordinated; literary warrant established in the vocabulary used; intended purpose (academic vs. lay public).

Interoperability: Merging vocabularies; merging databases; indexing using a single vocabulary; federated searching; mapping or crosswalks; deciding on a master vocabulary; authority records.

Z39.19-2005 (Controlled Vocabularies): What’s new? Interoperability; synonym rings; facet analysis; taxonomy; semantic relationships; glossary; web format; changes in terminology.

Other methods - Zeng Derivation Modeling Translation and Adaptation Satellite controlled vocabularies Node or leaf linking

Other methods Section 10.8 Switching Linking Through a Temporary Union List Linking Through Controlled Vocabulary Servers
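Switching can be pictured as routing every vocabulary through one intermediate set of concept identifiers instead of mapping each pair of vocabularies directly. The sketch below is one possible reading of that idea; the vocabulary names, terms, and concept IDs are all hypothetical.

```python
# Each source vocabulary maps its own terms to a shared switching concept ID.
TO_SWITCH = {
    "vocab_a": {"Automobiles": "C001", "Heart attack": "C002"},
    "vocab_b": {"Cars": "C001", "Myocardial infarction": "C002"},
}


def crosswalk(term, source, target):
    """Translate a term from one vocabulary to another via the switching IDs."""
    concept = TO_SWITCH[source].get(term)
    if concept is None:
        return None  # the source term has no switching concept
    for candidate, concept_id in TO_SWITCH[target].items():
        if concept_id == concept:
            return candidate
    return None  # the target vocabulary has no term for this concept


print(crosswalk("Automobiles", "vocab_a", "vocab_b"))
print(crosswalk("Myocardial infarction", "vocab_b", "vocab_a"))
```

With a switching layer, adding a new vocabulary needs one mapping to the shared IDs rather than a separate crosswalk to every existing vocabulary.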

Semantic Network: Use to cluster terms from many sources. Create an underlying structure for all to map to. Group terms to a conceptual scheme. Define types of concepts. Define types of relationships. Generate synonym rings for retrieval.
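The last step on this slide, generating synonym rings for retrieval, can be sketched as grouping terms by the concept they were mapped to and then expanding a query with the whole group. The triples and function names below are hypothetical and stand in for a much larger combined vocabulary.

```python
from collections import defaultdict

# (source vocabulary, term, concept ID) triples; all values are hypothetical.
MAPPED_TERMS = [
    ("vocab_a", "Automobiles", "C001"),
    ("vocab_b", "Cars", "C001"),
    ("vocab_c", "Motor cars", "C001"),
    ("vocab_a", "Bridges", "C002"),
]


def build_rings(mapped_terms):
    """Cluster terms from all sources into one synonym ring per concept."""
    rings = defaultdict(set)
    for _source, term, concept in mapped_terms:
        rings[concept].add(term.lower())
    return dict(rings)


def expand_query(term, rings):
    """Return every term in the ring containing the query term."""
    key = term.lower()
    for ring in rings.values():
        if key in ring:
            return sorted(ring)
    return [key]  # unknown terms are searched as entered


rings = build_rings(MAPPED_TERMS)
print(expand_query("cars", rings))
```

A ring treats all its members as equal for searching, so unlike a thesaurus it does not need a preferred term.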

Semantic network for clustered terms from a combined vocabulary – UMLS portion

Semantic network showing types of relationships among concepts - UMLS

Lexical Database: Terms from many vocabularies. Create clusters of concepts. Allow many kinds of relationships: IsA, HasA, synonyms, hierarchical, etc.
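One simple way to hold "many kinds of relationships" is a list of typed triples. The sketch below is an assumption about how such a store might look, not a description of any particular lexical database; the triples and function names are invented.

```python
# (subject, relationship type, object) triples; the content is hypothetical.
RELATIONS = [
    ("suspension bridge", "IsA", "bridge"),
    ("bridge", "IsA", "structure"),
    ("bridge", "HasA", "deck"),
    ("bridge", "synonym", "span"),
]


def related(term, relation):
    """List the objects linked to a term by one relationship type."""
    return [obj for subj, rel, obj in RELATIONS
            if subj == term and rel == relation]


def ancestors(term):
    """Follow IsA links upward and return broader terms, nearest first."""
    result = []
    frontier = related(term, "IsA")
    while frontier:
        parent = frontier.pop(0)
        if parent not in result:
            result.append(parent)
            frontier.extend(related(parent, "IsA"))
    return result


print(related("bridge", "synonym"))
print(ancestors("suspension bridge"))
```

Keeping the relationship type as data means new kinds of links can be added without changing the lookup code.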

Multiple senses for the same noun in a lexical database:
bridge, span -- (a structure that allows people or vehicles to cross an obstacle such as a river or canal or railway etc.)
bridge, bridge circuit -- (a circuit consisting of two branches (4 arms arranged in a diamond configuration) across which a meter is connected)
bridge -- (something resembling a bridge in form or function; "his letters provided a bridge across the centuries")
bridge -- (the hard ridge that forms the upper part of the nose; "her glasses left marks on the bridge of her nose")
bridge -- (any of various card games based on whist for four players)
bridge -- (a wooden support that holds the strings up)
bridge, bridgework -- (a denture anchored to teeth on either side of missing teeth)
bridge, nosepiece -- (the link between two lenses; rests on nose)
bridge, bridge deck -- (an upper deck where a ship is steered and the captain stands)

ISO TC 46 (SC 6 or 9): Controlled vocabulary and other information standards. ISO 5127, Information and Documentation: Vocabulary. ISO 2788-1986, Guidelines for the establishment and development of monolingual thesauri (= BS 5723:1987). ISO 5964-1985, Guidelines for the establishment and development of multilingual thesauri (= BS 6723:1985). New: BS 8723, Parts 1 – 4 (Stella Dextre Clarke).

British Standards, BS 8723: Structured vocabularies for information retrieval – Guide. Part 1: General. Part 2: Thesauri. Part 3: Vocabularies other than thesauri. Part 4: Interoperability between vocabularies. Part 5: Interoperability with applications.

ISO TC 37. Scope: Standardization of principles, methods and applications relating to terminology and other language resources. TC 37/SC 1: Principles and methods; TC 37/SC 2: Terminography and lexicography; TC 37/SC 3: Computer applications for terminology; TC 37/SC 4: Language resource management.

Sample Standards. Principles of concept-oriented terminology and data categories:
ISO 704:2000 Terminology work - Principles and methods
ISO 860:1996 Terminology work - Harmonization of concepts and terms
ISO 1087-1:2000 Terminology work - Vocabulary - Part 1: Theory and application
ISO 1087-2:2000 Terminology work - Vocabulary - Part 2: Computer applications
ISO 10241:1992 Preparation and layout of international terminology standards
ISO 12200:1999 Computer applications in terminology - Machine-readable terminology interchange format (MARTIF) - Negotiated interchange
ISO 12616:2002 Translation-oriented terminography
ISO/TR 12618:1994 Computer aids in terminology - Creation and use of terminological databases and text corpora
ISO 12620:1999 Computer applications in terminology - Data categories used to create glossaries

W3C: OWL (Web Ontology Language); RDF (Resource Description Framework); Topic Maps; SKOS (Simple Knowledge Organization System). Which community to serve? Build on the current standard. Might make this link next.

Other things to watch: Other W3C and ISO areas; support groups; SIMILE; blogs; communities of practice; Web 2.0 activities; WSDL (Web Services Description Language).

Other Relevant ISO & W3C Standards: Markup Languages; Metadata Resources; Character Coding; Access Protocols and Interoperability; Content Creation, Manipulation, and Maintenance; Authoring Standards; Text and Content Markup; Translation Standards; Terminology and Lexicography Standards; ISO TC 37 Standards; Terminology Interchange Standards; Controlled Language Standards; Taxonomy and Ontology Standards; Corpus Management Standards; Locale-Related Standards. For translation, terminology and applied linguists, go to: http://appling.kent.edu/ResourcePages/LTStandards/Chart/standards.chart.htm#Ontology

SIMILE: Semantic Interoperability of Metadata and Information in unLike Environments. Forming a data reference for open source taxonomies.

Web Services – Web 2.0: Acts as middleware. Allows migration from legacy systems. Uses HTTP remote procedure calls. Allows interaction between distributed, loosely coupled and reusable software components. Openness: a machine-readable description is available in WSDL (Web Services Description Language), accessible via public (UDDI) registries. Will help integrate taxonomies into web presentations of data.

Summary: Standards for vocabulary now come from many places. There are 19 main ones to watch. If you have a current standard-compliant thesaurus, you can probably implement it in the Web 2.0 arena. Vocabulary control improves search: precision and accuracy, recall and relevance.

Thank you for your time. Marjorie M.K. Hlava, President, Access Innovations / Data Harmony. 505-998-0800, mhlava@accessinn.com

xrefer Research Mapping

Data Harmony View - VxInsight

Scenario: Web Services based Infrastructure for Digital Libraries. [Diagram labels: UDDI registry; existing web services (news service, scientific database, OAI-based document harvester); XML/SOAP; network; WSDL descriptions; node 1, node 2, node n.] This slide shows a situation, not the possible relations between all the elements. Each node is a web service as well; the point is that existing web services can be used.

DL-Services (logical view, user's perspective): Types of services; types of material; metamodel; functionality model and implementation; service provision (web service). Supporting users, developers, and providers of services in their different roles -> representing their different views within the framework.

Framework (logical view). [Diagram labels: XML-Repository; Metamodel of Services; Data processed by Services; Webserver (Cocoon2); Service creation and assembly; Support Services; Transaction-management; Service <<facade>>; WSDL-Service Support; Generator; Service-Localization; Dispatcher; Java-classes (framework application).]

For services and software, contact Marjorie Hlava or Jay Ven Eman, Access Innovations, Inc. Data Harmony: Thesaurus Master, MAI, XIS. Auto indexing and content management. 505-998-0800, mhlava@accessinn.com, j_ven_eman@accessinn.com