TMF - a tutorial TMF - Terminological Markup Framework Laurent Romary - Laboratoire Loria.


1 TMF - a tutorial TMF - Terminological Markup Framework Laurent Romary - Laboratoire Loria

2 Three parts
- Part 1: Basic concepts
- Part 2: Representing data categories
- Part 3: Designing (schemas and) filters

3 TMF - a tutorial. Part 1: Basic concepts

4
- Background - ISO etc.
- The need for abstraction
- Structure and content of terminological data - picture virtual-actual
- The meta-model (structural skeleton)
- Describing data categories
- Styles and vocabularies
- XTMF as a mapping tool - examples
- Further work: extending the model to a wider scope (language engineering)

5 Overview

6 General principles
- Expressing constraints on the representation of computerized terminologies
  – What is the underlying structure of computerized terminologies?
  – Which data category is used, and under which conditions?
- Maintaining interoperability between representations
  – Providing a conceptual tool to compare two given formats

7 Definitions
- TMF: Terminological Markup Framework
  – Definition of the underlying structures and mechanisms needed for the computer representation of terminological data
  – Independence from any specific format
- GMT: Generic Mapping Tool
  – Abstract XML format equivalent to the underlying model of TMF

8 Definitions - cont.
- TML: Terminological Markup Language
  – One specific representation format generated within TMF
  – E.g.: DXLT is a possible TML

9 A family of formats
[Figure: TMF and its family of formats TML 1, TML 2, TML 3, …, TML i (DXLT and Geneter shown as examples), together with GMT]

10 Meta-model Representing the underlying structure of terminological data

11 [Figure: meta-model diagram with cardinalities (1, *, 0:1) linking Terminological Data Collection, Global Information, Terminological Entry, Complementary Information, Terminology-related Information, Language Section, Term Section and Term Component Section]

12 Meta-model description
- Terminological Data Collection (TDC)
  – A collection of data containing information on concepts of specific concept fields.
- Terminological Entry (TE)
  – An entry containing information on terminological units (i.e., subject-specific concepts, terms, etc.).
    » Example: domain description, conceptual relations, etc.

13 Meta-model description - cont.
- Language Section (LS)
  – The part of a terminological entry containing information related to one language.
    » Note: one terminological entry may contain information on one, two or more languages.
- Term Section (TS)
  – The part of a language section giving information about a term.
    » Example: term status (e.g. abbreviation), usage information (temporal, geographical, etc.)

14 Meta-model description - cont.
- Term Component Section (TCS)
  – The part of a term section giving information about components of a term.
    » Example: component grammatical information (part of speech)

15 Meta-model description - cont.
- Global Information (GI)
  – Technical and administrative information applying to the entire data collection.
    » Example: title of the data collection, revision history
- Complementary Information (CI)
  – Information supplementary to terminology-related information.
    » Example: bibliographical source, documentary language or description thereof.

16 The structural skeleton
Terminological Data Collection (TDC)
  Global Information (GI)
  Complementary Information (CI)
  Terminological Entry (TE) *
    Language Section (LS) *
      Term Section (TS) *
        Term Component Section (TCS) *
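The nesting of levels in the skeleton can be made concrete with a small check. The following is a minimal sketch, not part of TMF itself: the table of allowed children is one reading of the skeleton slide, using the level codes TDC, TE, LS, TS and TCS that the deck lists for the struct element, and the names `ALLOWED_CHILDREN` and `check_nesting` are hypothetical.

```python
# Allowed parent -> child nesting, as read off the structural skeleton.
# This table is an interpretation of the slide, not a normative definition.
ALLOWED_CHILDREN = {
    "TDC": {"GI", "TE", "CI"},  # global and complementary info hang off the collection
    "TE": {"LS"},
    "LS": {"TS"},
    "TS": {"TCS"},
    "TCS": set(),
    "GI": set(),
    "CI": set(),
}


def check_nesting(level, children):
    """Return True if every (level, children) pair below respects ALLOWED_CHILDREN."""
    for child_level, grandchildren in children:
        if child_level not in ALLOWED_CHILDREN[level]:
            return False
        if not check_nesting(child_level, grandchildren):
            return False
    return True


# One entry with two language sections; the second term section has a component.
entry = ("TE", [("LS", [("TS", [])]), ("LS", [("TS", [("TCS", [])])])])
collection = ("TDC", [("GI", []), entry, ("CI", [])])
```

Calling `check_nesting(*collection)` accepts the tree above, while a term section placed directly under an entry, such as `check_nesting("TE", [("TS", [])])`, is rejected.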

17 How does this work? Walking through an example…

18 DXLT example
[XML listing; markup lost in extraction. Surviving values: manufacturing; A value between 0 and 1 used in...; alpha smoothing factor; fullForm; Alfa...]

19 Identifying the structural skeleton
TE
  id=‘ID67’ [attribute]
  subjectField=‘manufacturing’ [typedElement]
  definition=‘A value…’ [typedElement]
  LS
    lang=‘hu’ [attribute]
    TS
      term=‘…’ [element]
  LS
    lang=‘en’ [attribute]
    TS
      term=‘alpha smoothing factor’ [element]
      termType=‘fullForm’ [typedElement]
TE: Terminological Entry; LS: Language Section; TS: Term Section

20 TMF information model
TE
  id=‘ID67’
  subjectField=‘manufacturing’
  definition=‘A value…’
  LS
    lang=‘hu’
    TS
      term=‘…’
  LS
    lang=‘en’
    TS
      term=‘alpha smoothing factor’
      termType=‘fullForm’

21 GMT representation
[XML listing; markup lost in extraction. Surviving values, in order: ID67; manufacturing; A value between 0 and 1 used in...; en; alpha smoothing factor; fullForm; hu; Alfa...]
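The XML on this slide did not survive in the transcript, so here is a minimal sketch of how the ID67 entry might be assembled as nested struct and feat elements with Python's standard `xml.etree.ElementTree`. The element names struct and feat and the type attribute come from the later slides; the data category names used as type values (id, subjectField, lang, term, termType) are borrowed from the skeleton slide and are assumptions, as are the helper names.

```python
import xml.etree.ElementTree as ET


def struct(parent, level):
    """Create a <struct type=level>, nested under parent when one is given."""
    if parent is None:
        return ET.Element("struct", {"type": level})
    return ET.SubElement(parent, "struct", {"type": level})


def feat(parent, datcat, value):
    """Attach a <feat type=datcat> holding value to parent."""
    element = ET.SubElement(parent, "feat", {"type": datcat})
    element.text = value
    return element


# Terminological entry level: features that hold for the whole entry.
te = struct(None, "TE")
feat(te, "id", "ID67")                      # type names here are assumed
feat(te, "subjectField", "manufacturing")
feat(te, "definition", "A value between 0 and 1 used in...")

# English language section with one term section.
ls_en = struct(te, "LS")
feat(ls_en, "lang", "en")
ts_en = struct(ls_en, "TS")
feat(ts_en, "term", "alpha smoothing factor")
feat(ts_en, "termType", "fullForm")

# Hungarian language section with one term section.
ls_hu = struct(te, "LS")
feat(ls_hu, "lang", "hu")
ts_hu = struct(ls_hu, "TS")
feat(ts_hu, "term", "Alfa...")

gmt_xml = ET.tostring(te, encoding="unicode")
```

Printing `gmt_xml` gives a single-line serialization of the tree. Every level of the skeleton becomes a struct, and every piece of information becomes a feat regardless of whether the source format used an attribute, an element or a typed element.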

23 TML à la mode ISO
Ingredients
– A structural skeleton
  » take the TMF meta-model
– A reference Data Category Registry
  » ISO 12620 is a good place to find one
Recipe
– Choose some data categories from the registry
  » You can even constrain the values of your datcats
– Associate a style and vocabulary with each datcat
  » You can take inspiration from others (DXLT)
– Serve it hot to your software guy with a piece of SALT software

24 GMT Generic Mapping Tool

25 Background
- Interoperability principle
  – If any two TMLs have exactly the same DCS, they are equivalent, even though they may differ radically in style and vocabulary.
- Consequence
  – It is always possible to define a filter from one TML to another when they are interoperable.
  – GMT is the intermediate representation used to do so.
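The interoperability principle can be illustrated with a toy comparison. This is a minimal sketch under a strong simplification: a TML is modelled as a mapping from data category name to a (style, vocabulary) pair, and its DCS is approximated by the set of data category names alone, ignoring any constraints on values. The three example TMLs and all names below are hypothetical.

```python
def data_categories(tml):
    """Approximate a TML's DCS by the set of data categories it uses."""
    return frozenset(tml)


def interoperable(tml_a, tml_b):
    """Two TMLs count as equivalent here when they use the same data categories,
    whatever style and vocabulary each one picks to express them."""
    return data_categories(tml_a) == data_categories(tml_b)


# Same data categories, different styles and vocabularies.
tml_1 = {
    "/definition/": ("element", "def"),
    "/term/": ("element", "term"),
}
tml_2 = {
    "/definition/": ("typedElement", ("descrip", "definition")),
    "/term/": ("attribute", "t"),
}
# A TML that lacks /definition/ altogether.
tml_3 = {
    "/term/": ("element", "term"),
}
```

Here `interoperable(tml_1, tml_2)` holds even though the two formats would look quite different on the page, while `interoperable(tml_1, tml_3)` does not, because no filter could supply the missing definitions.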

26 From one TML to another
- GMT - Generic Mapping Tool
  – an abstract XML representation
  – identification of levels: the struct element
    » a recursive element
  – representation of data categories: the feat element

27 The tmf element
Description:
– The tmf element is the root element of any valid XTMF document. It contains the global information that corresponds to a terminological data collection, the collection itself, and the complementary information (in particular external resources) needed to describe the various terminological entries.
Content model:

28 The struct element
Description:
– The struct element should be used to represent a locus in a given structural skeleton. It is recursive and may also contain feat and/or brack elements to express attributes belonging to the corresponding level of the meta-model.
Attributes:
– type: level in the meta-model (TDC, TE, LS, TS or TCS)
Content model:

29 The feat element
Description:
– The feat element represents any feature attached directly to a locus in the structural skeleton (represented by a struct element).
Attributes:
– type: categorises the feat element through the reference to the name of the corresponding data category.
Content model (DTD):

30 Bracketing information

31 Rationale
- Describing the context of use of a given data category
  – Example 1:
    » Classification Code: AG1
    » Classification System: Lenoc
  – Example 2:
    » Transaction type: modification
    » Responsible person: Mr. X
    » Date: 23 April 1988

32 Formal model
- Hierarchical feature structure
  – Constraint: the type is given by the ‘main’ (first) data category

33 GMT description: bracketing features
[XML listing; markup lost in extraction. Surviving values: xxx; Lenoc]
Rem.: no type for ‘brack’
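Since the listing for this slide is missing from the transcript, the following is a minimal sketch of bracketing built with `xml.etree.ElementTree`. It assumes that a brack element simply wraps the feat elements it groups, with the main data category first and no type attribute of its own. The data category names classificationCode and classificationSystem are assumed spellings of the first rationale example, and the helper name is hypothetical.

```python
import xml.etree.ElementTree as ET


def brack(parent, features):
    """Group (datcat, value) pairs under one untyped <brack>.

    The first pair plays the role of the 'main' data category, which is
    what gives the group its type in the formal model.
    """
    group = ET.SubElement(parent, "brack")
    for datcat, value in features:
        feature = ET.SubElement(group, "feat", {"type": datcat})
        feature.text = value
    return group


te = ET.Element("struct", {"type": "TE"})
classification = brack(
    te,
    [
        ("classificationCode", "AG1"),       # main data category (assumed name)
        ("classificationSystem", "Lenoc"),   # context for the code (assumed name)
    ],
)

main_datcat = classification[0].get("type")
```

The classification system is only meaningful as context for the code it qualifies, which is why the two features travel together inside one bracket rather than as independent features of the entry.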

34 Annotating content

35 Rationale
- Why should we annotate specific content?
  – To identify components which are not explicitly expressed as a specific part of a terminological entry
    » E.g.: characteristics of a concept
  – To relate a component to another entry or an external resource
    » E.g.: bibliographical reference

36 Formal model ?

37 XML model
- Mixed content
  – Attributes
    » type: categorises the annot element through the reference to the name of the corresponding data category.
Rem.: Problems with mixed content in XML schemas

38 GMT description: annotating information
pencil whose casing is fixed around a central graphite medium which is used for writing or making marks
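The annotation markup on this slide was lost, so the following is only a sketch of what mixed content with an annot element could look like, built with `xml.etree.ElementTree`. Which span of the definition was annotated on the original slide is not recoverable; the span chosen here and the type value characteristic are assumptions based on the rationale slide.

```python
import xml.etree.ElementTree as ET

# A definition feature whose text is interrupted by one annotated span.
definition = ET.Element("feat", {"type": "definition"})
definition.text = "pencil whose "

# The annotated span (assumed): a characteristic of the concept.
characteristic = ET.SubElement(definition, "annot", {"type": "characteristic"})
characteristic.text = "casing is fixed around a central graphite medium"
# In ElementTree, text that follows a child element is stored as its tail.
characteristic.tail = " which is used for writing or making marks"

# Dropping the annotation must give back the plain definition.
plain_text = "".join(definition.itertext())
serialized = ET.tostring(definition, encoding="unicode")
```

The round trip through `itertext()` shows the property that makes annotation safe: the running text of the definition is unchanged, and the annot element only adds a typed label to part of it.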

40 Representation of relations

41 XML links
- Transparency as to the actual location of a resource (internal vs. external)
- May be useful to identify ontologies
  – External links between concepts
[Figure: links between entry i and entry j]

42 Representation in GMT
- Two attributes
  – Target: a pointer to a ‘struct’ element, used when the feature expresses a relation between the current locus and another locus in the structural skeleton
  – Source: a pointer to a ‘struct’ element, used when the feature is described outside the locus to which it is supposed to be attached

43 Some examples
- Simple atomic feature attached directly to a locus: ID67
- Basic feature whose value is a reference to a locus in the structural skeleton:
- Basic feature anchored at the locus in the structural skeleton whose id attribute value is “TE24”: ID67
- Compound feature anchored at “TE 23” and which makes reference to “TE 24”:
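The markup for these examples is missing from the transcript. As a rough sketch of how the two pointer attributes might be used, the code below builds feat elements with `xml.etree.ElementTree`. The lowercase attribute names target and source, the data category names id and relatedConcept, and the helper are all assumptions. The last case simply puts both attributes on one feat; the slide's compound example may well have used a different element, such as brack.

```python
import xml.etree.ElementTree as ET


def feat(datcat, value=None, target=None, source=None):
    """Build a <feat>, adding target/source pointers only when they are given."""
    attributes = {"type": datcat}
    if target is not None:
        attributes["target"] = target
    if source is not None:
        attributes["source"] = source
    element = ET.Element("feat", attributes)
    element.text = value
    return element


# Atomic feature meant to sit directly inside the struct it describes.
atomic = feat("id", "ID67")

# Feature whose value is a reference to another locus.
relation = feat("relatedConcept", target="TE24")

# Feature written outside its locus and anchored back to it.
anchored = feat("id", "ID67", source="TE24")

# Feature anchored at one locus and pointing at another.
both = feat("relatedConcept", target="TE24", source="TE23")
```

The difference between the two attributes is the direction of the link: target says what the feature refers to, while source says where the feature belongs when it cannot be nested inside that struct.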

44 Styles and vocabularies

46 Implementing a DatCat
– Definitions:
  » ‘style’: the way a given DatCat is implemented as an XML object
  » ‘vocabulary’: the symbols needed to express the implementation of a given DatCat in its associated style
– E.g.:
  » DatCat: /definition/
  » Vocabulary = [def]
  » Style = Element
  » pencil whose casing … (DatCat value)

47 Implementing a DatCat (cont.)
– Definition:
  » ‘anchor’: the XML element(s) to which the implementation of a given DatCat can be attached
– E.g.: alpha smoothing factor

48 Styles - element
- Element
  – Def.: The DatCat is implemented as an element, child of its anchor
  – Vocabularies: the name of the corresponding element
  – E.g.: pencil whose casing …; alpha smoothing factor (DatCat value)

49 Styles - typedElement
- typedElement
  – Def.: The DatCat is implemented as a generic XML element, which is a child of the anchor and is further specified by means of a type attribute. Its content is the value of the feature in the structural skeleton.
  – Vocabularies: the element name and the value of the type attribute
  – E.g.: Bla, bla, bla… (DatCat value)

50 Styles - attribute
- Attribute
  – Def.: The DatCat is implemented as an attribute of its anchor
  – Vocabularies: the name of the corresponding attribute
  – E.g.: … (DatCat value)
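The three styles defined so far can be compared side by side by implementing the same data category value on the same anchor in each of them. This is a minimal sketch using `xml.etree.ElementTree`; the anchor name termEntry, the typedElement vocabulary (descrip, definition) and the function name are hypothetical, and only the vocabulary [def] for the element style comes from the slides.

```python
import xml.etree.ElementTree as ET


def implement(anchor, style, vocabulary, value):
    """Attach a DatCat value to its anchor in the requested style."""
    if style == "element":
        # Vocabulary is the element name; the value is the element content.
        child = ET.SubElement(anchor, vocabulary)
        child.text = value
    elif style == "typedElement":
        # Vocabulary is (generic element name, value of its type attribute).
        name, type_value = vocabulary
        child = ET.SubElement(anchor, name, {"type": type_value})
        child.text = value
    elif style == "attribute":
        # Vocabulary is the attribute name; the value is the attribute value.
        anchor.set(vocabulary, value)
    else:
        raise ValueError(f"unsupported style: {style}")
    return anchor


value = "pencil whose casing ..."
as_element = implement(ET.Element("termEntry"), "element", "def", value)
as_typed = implement(
    ET.Element("termEntry"), "typedElement", ("descrip", "definition"), value
)
as_attribute = implement(ET.Element("termEntry"), "attribute", "def", value)
```

All three results carry the same information, which is the point of separating style and vocabulary from the data category itself: a filter only needs to know the (style, vocabulary) pair on each side to translate one form into another.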

51
- ValuedElement
- TypedValuedElement

