A Flexible XML-Based Glossary Approach for the Federal Government


1 A Flexible XML-Based Glossary Approach for the Federal Government
By Ken Sall for the US Federal XML Community of Practice January 19, 2005

2 Problem Statement
After examining standard glossary terminology (ISO 1087 and others), define an XML Schema or DTD that models “all useful” aspects of a term and its definition.
The model should be applicable to any government agency.
Consider flexibility and collaborative development as key design criteria. Many different agencies may use the model and many individuals may author specific term definitions.
Create an XSLT stylesheet that knows about the model and displays an XML glossary instance document as HTML in any modern browser.
Eventually consider XSL-FO for PDF rendering of the glossary.
1/19/2005 Flexible XML-Based Glossary Approach for the Federal Government

3 Design Goals
Standards-Based - XML element names are loosely based on an international standard, ISO 1087.
Flexible - The Glossary DTD, although initially a strawman to stimulate discussion, is fairly flexible, with few required elements, many optional elements, and several repeatable elements.
Provides a Framework - Since so few elements are required, terms can be added even before definitions are known. These terms act as placeholders that are fully supported by the DTD and XSLT. (For example, see the stub terms "DTD" and "XSLT" in the example instance.)
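The placeholder idea can be illustrated with a minimal sketch. It is written in Python rather than the XSLT the slides use, and it assumes a root element named Glossary and that a stub is simply a Term with a Name but no DefinitionSection; the sample document and the function name stub_terms are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance: two stub terms and one term with an abbreviated definition.
SAMPLE = """<Glossary>
  <Term id="dtd"><Name>DTD</Name></Term>
  <Term id="ontology">
    <Name>ontology</Name>
    <DefinitionSection>
      <Definition>Defines the common words and concepts of an area of knowledge.</Definition>
    </DefinitionSection>
  </Term>
  <Term id="xslt"><Name>XSLT</Name></Term>
</Glossary>"""


def stub_terms(glossary_xml: str) -> list[str]:
    """Return the names of terms that have no DefinitionSection yet."""
    root = ET.fromstring(glossary_xml)
    return [
        term.findtext("Name", default="")
        for term in root.iter("Term")
        if term.find("DefinitionSection") is None
    ]


print(stub_terms(SAMPLE))
```

A report like this could tell authors which placeholders still need definitions.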

4 Design Goals
Specialized - Any term may have multiple definitions so that different agencies may use the same term with their own specialized meaning, where necessary.
Collaborative - Since an XSLT stylesheet is used to sort the terms alphabetically, many individuals can work on their own glossary fragments (XML instances of the Glossary DTD). At any time, the various contributions can be easily merged without manual editing.
Leverages Links - Search links are automatically generated for each term by means of the XSLT, both to help kick-start and to augment the definition.
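The alphabetical sorting and the generated search links can be sketched as follows. This is a Python illustration of the logic only, not the stylesheet from the talk. The root element name Glossary, the case-insensitive sort order, and the search URL pattern are assumptions; the slides do not say which search engines the links point to.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote_plus

# Hypothetical search endpoint; the slides do not name the engines that are linked.
SEARCH_URL = "https://www.example.com/search?q={query}"

# Hypothetical glossary fragment with terms in authoring order.
SAMPLE = """<Glossary>
  <Term id="xslt"><Name>XSLT</Name></Term>
  <Term id="ontology"><Name>ontology</Name></Term>
  <Term id="dtd"><Name>DTD</Name></Term>
</Glossary>"""


def sorted_term_names(glossary_xml: str) -> list[str]:
    """Return term names in case-insensitive alphabetical order."""
    root = ET.fromstring(glossary_xml)
    names = [term.findtext("Name", default="") for term in root.iter("Term")]
    return sorted(names, key=str.casefold)


def search_link(term_name: str) -> str:
    """Build a search URL for a term, URL-encoding the query text."""
    return SEARCH_URL.format(query=quote_plus(term_name))


for name in sorted_term_names(SAMPLE):
    print(name, search_link(name))
```

Because the order is computed at display time, authors never need to keep their fragments sorted by hand.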

5 ISO 1087 Terminology (etc.)
Characteristic: Abstraction of a property of an object or of a set of objects. Note - Characteristics are used for describing concepts. [ISO :2000, 3.2.4]
Concept: A unit of thought constituted through abstraction on the basis of properties common to a set of objects. Note - Concepts are not bound to particular languages. They are, however, influenced by the social or cultural background. (ISO 1087:1990) Unit of knowledge created by a unique combination of characteristics. [ISO :2000, 3.2.1]
Definition: Statement which describes a concept and permits its differentiation from other concepts within a system of concepts. (ISO 1087:1990) Representation of a concept by a descriptive statement which serves to differentiate it from related concepts. [ISO :2000, 3.3.1]

6 ISO 1087 Terminology (etc.)
Designation: Representation of a concept by a sign which denotes it. [ISO :2000, 3.4.1]
Dictionary [see terminology and vocabulary]: Structured collection of lexical units with linguistic information about each of them. (ISO 1087:1990)

7 ISO 1087 Terminology (etc.)
Entry, Headword: The term headword appears in two different meanings. In lexicography, a headword is the word used as the heading in a dictionary entry or encyclopedia. In a descriptive terminology entry where no preference is given to any one term, there is no head term, but if preference is given to a term, head term is sometimes used in analogy to lexicography, as is main entry term. (Wright & Budin, 1997)

8 ISO 1087 Terminology (etc.)
Glossary [see dictionary, terminology, vocabulary]: Alphabetical list of terms or words found in or relating to a specific topic or text. It may or may not include explanations. Note - The distinguishing criterion is that glossaries are considered to reside in backmatter attached to books and other publications rather than being independent works in their own right. Glossaries are sometimes perceived as being less scientific in intent and methodology than terminologies, terminology standards, and even vocabularies, although a certain degree of synonymy exists. (Wright & Budin, 1997)

9 ISO 1087 Terminology (etc.)
Nomenclature: System of terms which is elaborated according to pre-established naming rules. (ISO 1087:1990)
Object: Anything perceivable or conceivable. Note - Objects may also be material (e.g. an engine, a sheet of paper, a diamond), immaterial (e.g. a conversion ratio, a project plan) or imagined (e.g. a unicorn). [Adapted from ISO :2000, 3.1.1]

10 ISO 1087 Terminology (etc.)
Synonym: A word with the same meaning or nearly the same meaning as another word in the same language. (Longman Dictionary of English Language and Culture: Longman Group UK Limited 1992) Note: Terminologists distinguish between real synonyms, i.e. terms which can be substituted with each other whatever the context, and the more common quasi-synonyms, which can differ from one another by context and sometimes by subject field. (Sager, 1990)
Term: Designation of a defined concept in a special language by a linguistic expression. Note - A term may consist of one or more words or even contain symbols. (ISO 1087:1990)

11 ISO 1087 Terminology (etc.)
Terminological Dictionary [see dictionary and vocabulary]: Dictionary containing terminological data from one or more specific subject fields. Note - admitted term: technical dictionary (ISO 1087:1990)
Terminological Record: Structured collection of terminological data relevant to one concept. (ISO 1087:1990)
Terminological Database: Structured sets of terminological records in an information processing system. (ISO 1087:1990)

12 ISO 1087 Terminology (etc.)
Terminology Work: Any activity concerned with the systematization and representation of concepts or with the presentation of terminologies on the basis of established principles and methods. (ISO 1087:1990)
Vocabulary [see terminology, dictionary, glossary]: Terminological dictionary containing the terminology of a specific subject field or of related subject fields and based on terminology work. (ISO 1087:1990)

13 Summary: ISO 1087 Terminology
Unused ISO 1087 Terms: Characteristic, Designation, Dictionary, Nomenclature, Object, PreferredTerm (TBD?), Terminological Dictionary / technical dictionary, Terminological Record, Terminological Database, Terminology Work, Vocabulary
ISO 1087 Terms Used: Concept, Definition, Term
Used but not ISO 1087: Glossary, Synonym, RelatedTerm
Additional Terms by Sall (next slide): Name, Acronym, ExpandedAcronym, DefinitionSection, Source, Usage

14 Additional (Non-Standard) Terminology
Glossary - change to Dictionary, Vocabulary, Technical Dictionary or Terminology?
Name - added only to allow Term to be a container; could change Term to Entry and Name to Term?
Acronym - necessary option for technical terms
ExpandedAcronym - ditto
DefinitionSection - added simply as a repeatable container to encompass all aspects pertaining to a specific definition of a term
Source - useful for traceability and credibility
Usage - useful to have an optional example sentence for a given definition (use in context)

15 XML Glossary Model Strawman

16 XML Example of One Term
<Term id="ontology">
  <Name>ontology</Name>
  <DefinitionSection>
    <Concept>semantic web</Concept>
    <Concept>knowledge management</Concept>
    <Definition>Defines the common words and concepts used to describe and represent an area of knowledge, and so standardizes the meanings. An ontology includes classes in the domains of interest, instances, relationships, properties and their values, functions of and processes involving the objects, and relevant constraints and rules.</Definition>
    <Source>Daconta, Obrst, Smith</Source>
    <Usage>An ontology can range from the simple notion of a taxonomy to a thesaurus, to a conceptual model, to a logical theory. [Daconta, Obrst, Smith]</Usage>
    <Synonym>classification system</Synonym>
    <RelatedTerm>taxonomy</RelatedTerm>
    <RelatedTerm>OWL</RelatedTerm>
  </DefinitionSection>
  <DefinitionSection>
    <Concept>philosophy</Concept>
    <Definition>[sometimes "Ontology"] the metaphysical study of the nature of being and existence</Definition>
    <Source>WordNet</Source>
    <Usage>Both the ontology and manner of human existence are of concern to Existentialism.</Usage>
    <Synonym>metaphysics</Synonym>
  </DefinitionSection>
</Term>
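To show how a consumer might walk this structure, here is a hedged Python sketch that reads one Term into plain dictionaries. It assumes each definition is wrapped in its own DefinitionSection and abbreviates the definition text; the function name read_term and the dictionary keys are illustrative, not part of the proposed model.

```python
import xml.etree.ElementTree as ET

# Abbreviated version of the example term, with one DefinitionSection per definition.
TERM_XML = """<Term id="ontology">
  <Name>ontology</Name>
  <DefinitionSection>
    <Concept>semantic web</Concept>
    <Concept>knowledge management</Concept>
    <Definition>Defines the common words and concepts used to describe an area of knowledge.</Definition>
    <Source>Daconta, Obrst, Smith</Source>
    <Synonym>classification system</Synonym>
    <RelatedTerm>taxonomy</RelatedTerm>
    <RelatedTerm>OWL</RelatedTerm>
  </DefinitionSection>
  <DefinitionSection>
    <Concept>philosophy</Concept>
    <Definition>the metaphysical study of the nature of being and existence</Definition>
    <Source>WordNet</Source>
    <Synonym>metaphysics</Synonym>
  </DefinitionSection>
</Term>"""


def texts(parent: ET.Element, tag: str) -> list[str]:
    """Collect the text of every direct child with the given tag (repeatable elements)."""
    return [child.text or "" for child in parent.findall(tag)]


def read_term(term_xml: str) -> dict:
    """Turn one Term element into a dictionary with one entry per DefinitionSection."""
    term = ET.fromstring(term_xml)
    sections = [
        {
            "concepts": texts(section, "Concept"),
            "definition": section.findtext("Definition"),  # None when absent
            "source": section.findtext("Source"),
            "synonyms": texts(section, "Synonym"),
            "related": texts(section, "RelatedTerm"),
        }
        for section in term.findall("DefinitionSection")
    ]
    return {"id": term.get("id"), "name": term.findtext("Name"), "sections": sections}


entry = read_term(TERM_XML)
for section in entry["sections"]:
    print(entry["name"], section["concepts"], section["source"])
```

Optional elements come back as None or an empty list, which matches the model's preference for few required elements.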

17 XML Ex: Client-Side XSLT (Firefox)

18 XML Ex: Client-Side XSLT (IExplorer)

19 XML Example: XSLT Details
CSS Styling
DefinitionSection based on Concept
Optional and Repeatable Elements
New DefinitionSection based on 2nd Concept
Auto-generated Search Links

20 Collaboration – Merging Instances
Since a Glossary consists of one or more Terms, a relatively simple XSLT can be created to merge the Term elements from two or more XML instances. This means different authors (from the same or different agencies) can work independently.
Issue: What if the same Term is defined by different authors? Automatically add each definition, even though they may overlap or conflict, or manually edit collisions (the merge could generate a conflict message)?
Issue: Should the agency name be a Source or another element (e.g., AgencySource)? The advantage of a separate element is that custom XSLT could extract or render terms on a per-agency basis, if desired. Should there be an optional, repeatable SourceLink element for a URL?
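One way to picture the merge and the collision question is the Python sketch below. It is not the XSLT the slide proposes. It takes the "add each definition and report the conflict" option, and it assumes a Glossary root element and that every Term carries a unique id attribute; the agency documents and the name merge_glossaries are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragments authored independently by two agencies.
AGENCY_A = """<Glossary>
  <Term id="ontology"><Name>ontology</Name>
    <DefinitionSection><Definition>Definition from agency A.</Definition><Source>Agency A</Source></DefinitionSection>
  </Term>
  <Term id="xslt"><Name>XSLT</Name></Term>
</Glossary>"""

AGENCY_B = """<Glossary>
  <Term id="dtd"><Name>DTD</Name></Term>
  <Term id="ontology"><Name>ontology</Name>
    <DefinitionSection><Definition>Definition from agency B.</Definition><Source>Agency B</Source></DefinitionSection>
  </Term>
</Glossary>"""


def merge_glossaries(*documents: str) -> tuple[ET.Element, list[str]]:
    """Merge Term elements by id; return the merged Glossary and colliding term ids."""
    by_id: dict[str, ET.Element] = {}
    collisions: list[str] = []
    for document in documents:
        for term in ET.fromstring(document).iter("Term"):
            term_id = term.get("id", "")  # assumption: every Term has an id
            existing = by_id.get(term_id)
            if existing is None:
                by_id[term_id] = term
                continue
            # Same term from another author: keep every definition, flag it for review.
            incoming = term.findall("DefinitionSection")
            already_defined = existing.find("DefinitionSection") is not None
            existing.extend(incoming)
            if incoming and already_defined and term_id not in collisions:
                collisions.append(term_id)
    merged = ET.Element("Glossary")
    merged.extend(sorted(by_id.values(), key=lambda t: t.findtext("Name", "").casefold()))
    return merged, collisions


merged, collisions = merge_glossaries(AGENCY_A, AGENCY_B)
print([term.findtext("Name") for term in merged], collisions)
```

A stub that later gains a definition from another author is not reported as a collision here; only terms defined in more than one fragment are.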

21 Alternative: GlossXML

22 Alternative: XML Acronym Demystifier

23 Next Steps
Determine interested agencies.
Establish funding.
Resolve terminology issues for the Glossary model.
Consider merge or replacement by GlossXML and/or XML Acronym Demystifier.
Need to finalize DTD or XML Schema before agencies start authoring.
Revise initial XSLT to match final Glossary model.
Determine repository and submission mechanisms. Could be a good use for CORE.gov? Coordinate with Plans for Derived XML Registry Prototype?
Write additional XSLT stylesheets for merging and pulling agency-specific terms, etc.
Develop XSL-FO stylesheets for PDF rendering of Glossary.

