TMF - a tutorial TMF - Terminological Markup Framework Laurent Romary - Laboratoire Loria.



Three parts
• Part 1: Basic concepts
• Part 2: Representing data categories
• Part 3: Designing (schemas and) filters

Part 1: Basic concepts

Overview
• Background - ISO etc.
• The need for abstraction
• Structure and content of terminological data - picture virtual-actual
• The meta-model (structural skeleton)
• Describing data categories
• Styles and vocabularies
• XTMF as a mapping tool - examples
• Further work: extending the model to a wider scope (language engineering)

General principles
• Expressing constraints on the representation of computerized terminologies
  – What is the underlying structure of computerized terminologies?
  – Which data category is used and under which conditions?
• Maintaining interoperability between representations
  – Providing a conceptual tool to compare two given formats

Definitions
• TMF: Terminological Mark-up Framework
  – Definition of the underlying structures and mechanisms needed for the computer representation of terminological data
  – Independence with regard to any specific format
• GMT: Generic Mapping Tool
  – Abstract XML format equivalent to the underlying model of TMF

Definitions - cont.
• TML: Terminological Mark-up Language
  – One specific representation format generated within TMF
  – E.g.: DXLT is a possible TML

A family of formats
[Diagram: TMF, the TMLs defined within it (TML 1, TML 2, TML 3, …, TML i, with DXLT and Geneter as examples), and GMT]

Meta-model Representing the underlying structure of terminological data

[Diagram: meta-model class diagram relating Terminological Data Collection, Global Information, Terminological Entry, Complementary Information, Terminology-related Information, Language Section, Term Section and Term Component Section]

Meta-model description
• Terminological Data Collection (TDC)
  – A collection of data containing information on concepts of specific concept fields.
• Terminological Entry (TE)
  – An entry containing information on terminological units (i.e., subject-specific concepts, terms, etc.).
    » Example: Domain description, Conceptual relations etc.

Meta-model description - cont.
• Language Section (LS)
  – The part of a terminological entry containing information related to one language.
    » Note: One terminological entry may contain information on one, two or more languages.
• Term Section (TS)
  – The part of a language section giving information about a term.
    » Example: Term status (e.g. abbreviation), Usage information (temporal, geographical etc.)

Meta-model description - cont.
• Term Component Section (TCS)
  – The part of a term section giving information about components of a term.
    » Example: Component grammatical information (Part of speech)

Meta-model description - cont.
• Global Information (GI)
  – Technical and administrative information applying to the entire data collection.
    » Example: title of the data collection, revision history
• Complementary Information (CI)
  – Information supplementary to terminology-related information.
    » Example: bibliographical source, documentary language or description thereof.

The structural skeleton
Terminological Data Collection (TDC)
  Global Information (GI)
  Complementary Information (CI)
  Terminological Entry (TE) *
    Language Section (LS) *
      Term Level (TL) *
        Term Component Level (TCL) *
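The nesting of levels in the skeleton can be sketched as a small tree with a containment check. This is only an illustration under stated assumptions: the `Node` class, the `ALLOWED_CHILDREN` table and the `depth` helper are hypothetical names, and the level codes TS and TCS are used for what this slide labels Term Level and Term Component Level, following the abbreviations given in the meta-model description.

```python
from dataclasses import dataclass, field

# Which level may contain which, as read off the slides:
# TDC > TE > LS > TS > TCS, with GI and CI attached to the TDC.
ALLOWED_CHILDREN = {
    "TDC": {"GI", "CI", "TE"},
    "TE": {"LS"},
    "LS": {"TS"},
    "TS": {"TCS"},
    "TCS": set(),
    "GI": set(),
    "CI": set(),
}


@dataclass
class Node:
    """One locus in the structural skeleton (hypothetical helper class)."""
    level: str
    children: list = field(default_factory=list)

    def add(self, child):
        # Reject nestings that the skeleton does not allow.
        if child.level not in ALLOWED_CHILDREN[self.level]:
            raise ValueError(f"{child.level} cannot be nested in {self.level}")
        self.children.append(child)
        return child


def depth(node):
    """Number of levels on the longest path below and including node."""
    return 1 + max((depth(c) for c in node.children), default=0)


tdc = Node("TDC")
entry = tdc.add(Node("TE"))
english = entry.add(Node("LS"))
english.add(Node("TS"))
```

Adding a term section directly to an entry, for instance `entry.add(Node("TS"))`, raises a `ValueError` in this sketch, because a term section has to sit inside a language section.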

How does this work? Walking through an example…

DXLT example
manufacturing
A value between 0 and 1 used in...
alpha smoothing factor
fullForm
Alfa...

Identifying the structural skeleton
TE (Terminological Entry)
  id=‘ID67’ [attribute]
  subjectField=‘manufacturing’ [typedElement]
  definition=‘A value…’ [typedElement]
  LS (Language Section)
    lang=‘hu’ [attribute]
    TS (Term Section)
      term=‘…’ [element]
  LS (Language Section)
    lang=‘en’ [attribute]
    TS (Term Section)
      term=‘alpha smoothing factor’ [element]
      termType=‘fullForm’ [typedElement]

TMF information model
TE
  id=‘ID67’
  subjectField=‘manufacturing’
  definition=‘A value…’
  LS
    lang=‘hu’
    TS
      term=‘…’
  LS
    lang=‘en’
    TS
      term=‘alpha smoothing factor’
      termType=‘fullForm’

GMT representation
ID67
manufacturing
A value between 0 and 1 used in...
en
alpha smoothing factor
fullForm
hu
Alfa...
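As a rough sketch of how the information model above could be serialized in GMT, the following builds the entry from nested struct elements (one per level) and feat elements (one per data category), the two element names the slides use for GMT. The exact serialization, including carrying the identifier as a feat rather than as an attribute, is an assumption here, and the helper functions `struct` and `feat` are hypothetical.

```python
import xml.etree.ElementTree as ET


def struct(parent, level):
    """Create a struct element for one level of the skeleton (TE, LS, TS, ...)."""
    if parent is None:
        return ET.Element("struct", {"type": level})
    return ET.SubElement(parent, "struct", {"type": level})


def feat(parent, datcat, value):
    """Attach one data category value to a struct element."""
    el = ET.SubElement(parent, "feat", {"type": datcat})
    el.text = value
    return el


# Terminological entry with its entry-level features.
te = struct(None, "TE")
feat(te, "id", "ID67")
feat(te, "subjectField", "manufacturing")
feat(te, "definition", "A value between 0 and 1 used in...")

# English language section with one term section.
ls_en = struct(te, "LS")
feat(ls_en, "lang", "en")
ts_en = struct(ls_en, "TS")
feat(ts_en, "term", "alpha smoothing factor")
feat(ts_en, "termType", "fullForm")

# Hungarian language section with one term section.
ls_hu = struct(te, "LS")
feat(ls_hu, "lang", "hu")
ts_hu = struct(ls_hu, "TS")
feat(ts_hu, "term", "Alfa...")

gmt_xml = ET.tostring(te, encoding="unicode")
print(gmt_xml)
```

The text values appear in the same order as on the slide (ID67, manufacturing, the definition, en, the English term, fullForm, hu, the Hungarian term), which is what motivates this particular layout.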

TML à la mode ISO
– Ingredients
  – A structural skeleton
    » (take the TMF Metamodel)
  – A reference Data Category Registry
    » ISO is a good place to find one
– Recipe
  – Choose some data categories from the registry
    » You can even constrain the values of your datcats
  – Associate a style and vocabulary with each datcat
    » You can draw inspiration from others (DXLT)
  – Serve it hot to your software guy with a piece of SALT software

GMT Generic Mapping Tool

Background
• Interoperability principle
  – If any two TMLs have exactly the same DCS, even though they differ radically in style and vocabulary, they are equivalent.
• Consequence
  – It is always possible to define a filter from one TML to another when they are interoperable
  – GMT is the intermediate representation used to do so
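The interoperability principle can be illustrated with a very small check. The sketch assumes that DCS denotes the set of data categories a TML selects, and it ignores the level at which each category is attached; the three TML descriptions and their vocabularies are invented for the example.

```python
def interoperable(tml_a, tml_b):
    """Treat two TMLs as interoperable when they select the same data categories."""
    return set(tml_a) == set(tml_b)


# Each hypothetical TML maps a data category to (style, vocabulary).
tml_1 = {
    "definition": ("element", "def"),
    "term": ("element", "term"),
    "lang": ("attribute", "lang"),
}
tml_2 = {
    "definition": ("typedElement", ("descrip", "definition")),
    "term": ("element", "t"),
    "lang": ("attribute", "xml:lang"),
}
tml_3 = {"term": ("element", "term")}

print(interoperable(tml_1, tml_2))  # same categories, different styles and vocabularies
print(interoperable(tml_1, tml_3))  # tml_3 selects fewer categories
```

Styles and vocabularies play no part in the comparison, which is the point of the principle: only the selected data categories decide whether a filter between the two formats can be defined.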

From one TML to another
• GMT - Generic mapping tool
  – an abstract XML representation
• identification of levels
  – …
    » a recursive element
• representation of data categories
  – …

The tmf element
Description:
– The tmf element is the root element of any valid XTMF document. It contains the global information that applies to a terminological data collection, the collection itself, and the complementary information (in particular, external resources) needed to describe the various terminological entries.
Content model:

The struct element
Description:
– The struct element should be used to represent a locus in a given structural skeleton. The struct element is recursive and may also contain feat and/or brack elements to express attributes belonging to the corresponding level of the meta-model.
Attributes:
– type: level in the meta-model (TDC, TE, LS, TS or TCS)
Content model:

The feat element
Description:
– The feat element represents any feature that is directly attached to a locus in the structural skeleton (represented by a struct element).
Attributes:
– type: categorises the feat element through the reference to the name of the corresponding data category.
Content model (DTD):

Bracketing information

Rationale
• Describing the context of use of a given data category
  – Example 1:
    » Classification Code: AG1
    » Classification System: Lenoc
  – Example 2:
    » Transaction type: modification
    » Responsible person: Mr. X
    » Date: 23 April 1988

Formal model
• Hierarchical feature structure
  – Constraint: Type given by ‘main’ (first) data category

GMT description
Bracketing features
xxx
Lenoc
Note: no type for ‘brack’
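A bracket of this kind could be built as follows. The sketch assumes a flat layout in which the brack element simply groups its feat children, with the main data category first; the type names `classificationCode` and `classificationSystem` are hypothetical spellings of the two categories in Example 1, and `bracket_type` is an illustrative helper.

```python
import xml.etree.ElementTree as ET

# A locus (here an entry) carrying one bracket.
locus = ET.Element("struct", {"type": "TE"})
group = ET.SubElement(locus, "brack")  # brack itself carries no type attribute

# Main (first) data category, followed by the one that gives its context.
code = ET.SubElement(group, "feat", {"type": "classificationCode"})
code.text = "AG1"
system = ET.SubElement(group, "feat", {"type": "classificationSystem"})
system.text = "Lenoc"


def bracket_type(brack):
    """Per the formal model, a bracket takes the type of its first feature."""
    return brack[0].get("type")


print(bracket_type(group))
print(ET.tostring(locus, encoding="unicode"))
```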

Annotating content

Rationale
• Why should we annotate specific content?
  – To identify components which are not explicitly expressed as a specific part of a terminological entry
    » E.g.: Characteristics of a concept
  – To relate a component to another entry or an external resource
    » E.g.: bibliographical reference

Formal model ?

XML model
• Mixed content
  – Attributes
    – type: categorises the annot element through the reference to the name of the corresponding data category.
Note: Problems with mixed content in XML schemas

GMT description
Annotating information
pencil whose casing is fixed around a central graphite medium which is used for writing or making marks

Representation of relations

XML links
• Transparency as to the actual location of a resource (internal vs. external)
• May be useful to identify ontologies
  – External links between concepts
[Diagram: links between entry i and entry j]

Representation in GMT
• Two attributes
  – Target: a pointer to a ‘struct’ element, used when the feature expresses a relation between the current locus and another locus in the structural skeleton;
  – Source: a pointer to a ‘struct’ element, used when the feature is described outside the locus to which it is supposed to be attached.

Some examples
– Simple atomic feature attached directly to a locus: ID67
– Basic feature whose value is a reference to a locus in the structural skeleton:
– Basic feature anchored at the locus in the structural skeleton whose id attribute value is “TE24”: ID67
– Compound feature anchored at “TE 23” and which makes reference to “TE 24”:
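The last case, a feature described outside its locus that also points to another locus, might be resolved along the following lines. Everything specific here is an assumption made for illustration: the lowercase attribute names `source` and `target`, the `wrapper` root element, the `relatedConcept` type and the `resolve` helper are not taken from the slides.

```python
import xml.etree.ElementTree as ET

# Two entries and one feature stored outside both of them.
doc = ET.fromstring(
    "<wrapper>"
    "<struct type='TE' id='TE23'/>"
    "<struct type='TE' id='TE24'/>"
    "<feat type='relatedConcept' source='TE23' target='TE24'/>"
    "</wrapper>"
)

# Index every locus that carries an id.
loci = {s.get("id"): s for s in doc.iter("struct") if s.get("id")}


def resolve(feature):
    """Return (locus the feature is anchored at, locus it refers to)."""
    anchor = loci.get(feature.get("source"))
    referent = loci.get(feature.get("target"))
    return anchor, referent


anchor, referent = resolve(doc.find("feat"))
print(anchor.get("id"), "->", referent.get("id"))
```

A feature without a `source` attribute would be anchored at its parent struct instead; this sketch does not cover that case and returns `None` for the anchor.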

Styles and vocabularies

Implementing a DatCat
– Definitions:
  » ‘style’: the way a given DatCat is implemented as an XML object…
  » ‘vocabulary’: the symbols needed to express the implementation of a given DatCat in its associated style
– E.g.:
  » DatCat: /definition/
  » Vocabulary = [def]
  » Style = Element
  » pencil whose casing … (DatCat value)

Implementing a DatCat (Cont.)
– Definition:
  » ‘anchor’: the XML element(s) to which the implementation of a given DatCat can be attached
– E.g.: alpha smoothing factor

Styles - element
• Element
  – Def.: The DatCat is implemented as an element, child of its anchor
  – Vocabularies: the name of the corresponding element
  – E.g.: pencil whose casing … / alpha smoothing factor (DatCat value)

Styles - typedElement
• typedElement
  – Def.: The DatCat is implemented as a generic XML element, which is a child of the anchor, and which is further specified by means of a type attribute. Its content is the value of the feature in the structural skeleton.
  – Vocabularies: the element name and the value of the type attribute
  – E.g.: Bla, bla, bla… (DatCat value)

Styles - attribute
• Attribute
  – Def.: The DatCat is implemented as an attribute of its anchor
  – Vocabularies: the name of the corresponding attribute
  – E.g.: … (DatCat value)

• ValuedElement
• TypedValuedElement
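The three styles defined above (element, typedElement, attribute) can be sketched as one function that attaches a DatCat value to its anchor. The function name `implement`, the anchor name `entry` and the `note` element are hypothetical; ValuedElement and TypedValuedElement are left out because the slides only name them.

```python
import xml.etree.ElementTree as ET


def implement(anchor, style, vocabulary, value):
    """Attach a DatCat value to its anchor element using the given style."""
    if style == "element":
        # Vocabulary: the element name.
        child = ET.SubElement(anchor, vocabulary)
        child.text = value
    elif style == "typedElement":
        # Vocabulary: (element name, value of the type attribute).
        name, type_value = vocabulary
        child = ET.SubElement(anchor, name, {"type": type_value})
        child.text = value
    elif style == "attribute":
        # Vocabulary: the attribute name.
        anchor.set(vocabulary, value)
    else:
        raise ValueError(f"style not covered by this sketch: {style}")
    return anchor


entry = ET.Element("entry")  # hypothetical anchor
implement(entry, "attribute", "lang", "en")
implement(entry, "element", "def", "pencil whose casing ...")
implement(entry, "typedElement", ("note", "termType"), "fullForm")

print(ET.tostring(entry, encoding="unicode"))
```

Keeping the style and vocabulary as data rather than code means the same data category can be rendered differently for two TMLs by changing only the arguments.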