The ISO 12620 Data Category Registry

Presentation transcript:

The ISO 12620 Data Category Registry
ISO 12620:2009 introduces
– A web-based electronic Data Category Registry (DCR) for simple, complex and (in the future) container Data Categories (DCs)
– ISO DIS compliant Persistent IDentifiers (PIDs) for each DC, e.g.,
– The DC Reference schema, a small XML vocabulary, to embed these DC PIDs in XML documents, e.g.,

Standards and Data Category references
Some standards already provide their own constructs for embedded DC references.
However, these constructs sometimes
– Use ambiguous DC identifiers instead of PIDs
– Are not able to handle the current DC PIDs
– Do not cover all DC types, i.e., container, complex and simple DCs
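The difference between an ambiguous identifier and a PID can be illustrated with a small check. This is only a sketch: it assumes a PID takes the form of an absolute http(s) URI with a path, and the function name `looks_like_pid` and all sample references (including the `example.org` URI) are hypothetical, not taken from the registry.

```python
from urllib.parse import urlparse


def looks_like_pid(ref: str) -> bool:
    """Return True when ref is an absolute http(s) URI with a non-empty path.

    Assumption: this is the shape a resolvable DC PID would have; a bare
    name such as "partOfSpeech" fails the check.
    """
    parts = urlparse(ref.strip())
    return (
        parts.scheme in ("http", "https")
        and bool(parts.netloc)
        and bool(parts.path.strip("/"))
    )


# Hypothetical references: one made-up PID-style URI and two bare names.
refs = ["http://example.org/datcat/DC-0000", "partOfSpeech", "pos"]
ambiguous = [r for r in refs if not looks_like_pid(r)]
print(ambiguous)  # ['partOfSpeech', 'pos']
```

A check like this only tests the form of a reference; it does not resolve the URI or confirm that the DC exists in the registry.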

| Specification | Can handle DC PIDs? | Handles DC types | Suggestion |
|---|---|---|---|
| DTDs | No | None | Use Relax NG or XML Schema instead |
| Relax NG | Yes | All | Use the DC Reference vocabulary |
| XML Schema | Yes | All | Use the DC Reference vocabulary |
| TEI ODD | Yes | All | Use |
| TMF | Yes | Complex DCs | Use Relax NG or XML Schema instead, and use the DC Reference vocabulary |
| LMF | Unspecified | | Use the DC Reference vocabulary for an LMF compliant schema |
| TBX XCS | Yes | Complex DCs | Value picklist needs to be opened up and may need provisions for the upcoming container DCs |
| Geneter | No | None | Use Relax NG or XML Schema instead or use the DC Reference vocabulary in the instance |
| MAF | Yes | Complex and simple DCs | May need provisions for the upcoming container DCs |
| LAF | Yes | Complex DCs | Needs provisions for the other DC types |

Improving the current situation
– Use Relax NG, XML Schema or ODD instead of DTD
– Create open schemas, which allow adding attributes and/or elements from foreign namespaces, or embed dcr:datcat or dcr:valueDatcat hooks at the proper places in the schemas
– The DC Reference vocabulary can then be used to embed DC references for various DC types at the right places
– For existing specifications with some support for DC references, make sure all relevant DC types can be covered, and make use of DC PIDs
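When a schema is open to attributes from foreign namespaces, DC references can be added to an existing instance without changing its own vocabulary. The following is a minimal sketch of that idea, not the procedure used by any of the standards above. The namespace URI bound to `dcr`, the `feat` element layout, the helper `add_datcat_refs` and the placeholder PID are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI of the DC Reference vocabulary; verify it against
# the latest published version of the vocabulary before relying on it.
DCR_NS = "http://www.isocat.org/ns/dcr"
ET.register_namespace("dcr", DCR_NS)

# Hypothetical mapping from feature names to placeholder PIDs.
PIDS = {"partOfSpeech": "http://example.org/datcat/DC-0000"}


def add_datcat_refs(root, pids):
    """Set dcr:datcat on every <feat> whose att value has a known PID.

    Returns the number of elements that were annotated.
    """
    count = 0
    for feat in root.iter("feat"):
        pid = pids.get(feat.get("att"))
        if pid is not None:
            # Attributes in a foreign namespace use the {uri}name form.
            feat.set(f"{{{DCR_NS}}}datcat", pid)
            count += 1
    return count


doc = ET.fromstring(
    '<Lemma>'
    '<feat att="partOfSpeech" val="commonNoun"/>'
    '<feat att="writtenForm" val="clergyman"/>'
    '</Lemma>'
)
annotated = add_datcat_refs(doc, PIDS)
print(annotated)
print(ET.tostring(doc, encoding="unicode"))
```

Only features with an entry in the mapping are annotated; the second `feat` is left untouched, so the result still validates only if the schema permits the foreign attribute on `feat`.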

References
– Latest version of the DC Reference vocabulary
– Survey of the support for DC references: M.A. Windhouwer, S.E. Wright, M. Kemps-Snijders. Referencing ISOcat data categories. In Proceedings of the LRT Standards Workshop (LREC 2010), Malta, May 18, 2010.

ODD example
… unknown the text is freely available. …
Note: this example uses PIDs from the ISOcat test server.

LMF example
…
<feat att="partOfSpeech" dcr:datcat="…" val="commonNoun" dcr:valueDatcat="…"/>
<feat att="writtenForm" dcr:datcat="…" val="clergyman"/>
…
Note: once the DCR supports container data categories, LexicalResource, LexicalEntry and Lemma could also have dcr:datcat attributes.
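A consumer of such a document can read the DC references back out with a namespace-aware parser. The sketch below shows one way to do so; it is not part of the LMF specification. The namespace URI, the enclosing `Lemma` element, the helper `collect_refs` and the `example.org` PIDs are assumptions or placeholders, since the PIDs of the original example are not reproduced here.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI of the DC Reference vocabulary.
DCR_NS = "http://www.isocat.org/ns/dcr"
DATCAT = f"{{{DCR_NS}}}datcat"
VALUE_DATCAT = f"{{{DCR_NS}}}valueDatcat"

# LMF-style fragment with placeholder PIDs (hypothetical, for illustration).
SAMPLE = f"""
<Lemma xmlns:dcr="{DCR_NS}">
  <feat att="partOfSpeech" dcr:datcat="http://example.org/datcat/DC-0000"
        val="commonNoun" dcr:valueDatcat="http://example.org/datcat/DC-0001"/>
  <feat att="writtenForm" dcr:datcat="http://example.org/datcat/DC-0002"
        val="clergyman"/>
</Lemma>
""".strip()


def collect_refs(xml_text):
    """Return (att, datcat PID, val, valueDatcat PID or None) per <feat>."""
    root = ET.fromstring(xml_text)
    return [
        (f.get("att"), f.get(DATCAT), f.get("val"), f.get(VALUE_DATCAT))
        for f in root.iter("feat")
    ]


for row in collect_refs(SAMPLE):
    print(row)
```

The open value "clergyman" carries no dcr:valueDatcat, so its last field comes back as None, while the closed value "commonNoun" is linked to its own DC.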

LAF example
…
Note: each value needs its own DC reference, hence the addition of the valueDescription element.