ISO TC 37/CLARIN DISCUSSION UTRECHT, DECEMBER 9/10 2013 Thinning Down a Bloated Cat SUE ELLEN WRIGHT DECEMBER 2013.



Terminology Communities of Practice
Discourse-oriented terminology
● Text & discourse production
● Semantic modeling of concept relations
Object-oriented terminology
● Thesauri and controlled language, library community
● Retrieval of objects and information
Terminology for semantic reasoning – MPI
● Automated reasoning across heterogeneous semantic networks
● Retrieval of information from aggregated and non-aggregated networks
Metadata-oriented terminology – TC 37/SC 3
● Definition of structured metadata
● Discovery and modeling of standards-compliant data sets
● Facilitation of highly efficient, highly precise interoperability

ISOcat History as a Metadata Registry
● Long evolution within ISO TC 37, Terminology and other language and content resources
● Metadata Registry (MDR) in the spirit of ISO/IEC
● Not intended as a concept database nor as a terminology database nor as a semantic registry

ISOcat History as a Metadata Registry
● Reasoning across a semantic resource would be interesting, but information retrieval is not our goal, although ontological resources would be something we would definitely be able to use.
● We are interested in collecting structured DCSs for integration into data model design environments.
● We are interested in tools and application integration via the intelligent extraction of structured DCSs.

Header Area
● We need an identifier, but it doesn’t have to appear multiple times. It could be hidden, or included in a synonyms class as long as it’s identified.
● We need the PID, but the appearance of the key is redundant.
● We would consider other options for hosting the DCR if we lose typing capability in some form.
● We are not currently contemplating eliminating type, because our whole raison d’être for the DCR (which predates ISOcat) centers here.

Administration Record
● We can live with changes here.
● Justification and origin are potentially redundant, but we need some sort of origin info for variants on the DC name.

Dates
● Dates are automated anyway, and can be easily hidden.

Data Element Description Section

● Data element name and English name could be conflated.
● One and only one English definition based on community consensus is good.
● We need either explanation or note, but not both.
● Sources – omitting them is a plagiarism problem for good definitions sourced from other resources; they could be optional.

Linguistic Section
● This we can live without, despite all our work on it.

Conceptual Domain, Simple DC Type
● Fundamental to our core.
● We contend that noun, for instance, defined as a complex DC, takes on new attributes that make it incompatible with its role as a simple DC, so it constitutes a different DC concept.
● isA is not fundamental for us, but it might be used as the seed for a new way of integrating complex with simple DC concepts.
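
As an illustration only, the slimmed-down specification discussed in the preceding slides might be sketched as follows. This is a minimal sketch, not ISOcat's actual data model: the class names `SimpleDC` and `ComplexDC`, every field name, and the `example:` PID values are hypothetical. It assumes one identifier and one PID per data category, a single English definition, an optional source, at most one of explanation or note, and a complex DC whose conceptual domain is a set of simple DCs.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SimpleDC:
    """Hypothetical simple data category: one permissible value, e.g. 'noun'."""
    pid: str                            # persistent identifier (made-up format)
    identifier: str                     # stated once, not repeated as a key
    definition: str                     # the one English definition
    source: Optional[str] = None        # optional credit for a borrowed definition
    explanation: Optional[str] = None   # explanation or note, never both
    note: Optional[str] = None

    def __post_init__(self) -> None:
        # Enforce the "either explanation or note, but not both" preference.
        if self.explanation is not None and self.note is not None:
            raise ValueError("give either an explanation or a note, not both")


@dataclass(frozen=True)
class ComplexDC:
    """Hypothetical complex data category with a conceptual domain."""
    pid: str
    identifier: str
    definition: str
    conceptual_domain: Tuple[SimpleDC, ...] = ()

    def allows(self, value: SimpleDC) -> bool:
        # A value is permissible only if it is listed in the conceptual domain.
        return value in self.conceptual_domain


# Placeholder definitions; real ones would come from community consensus.
noun = SimpleDC(pid="example:dc-1", identifier="noun",
                definition="placeholder definition")
part_of_speech = ComplexDC(pid="example:dc-2", identifier="partOfSpeech",
                           definition="placeholder definition",
                           conceptual_domain=(noun,))
```

Keeping the two types as separate classes mirrors the contention above that a value such as noun, once defined as a complex DC, becomes a different DC concept; an isA-style link between them is deliberately left out of this sketch.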

Language Sections
● We are concerned about the high proliferation of really bad translations and the potential for error that exists in the current language sections.
● We could live without this or, maybe better, move it to standoff status.
● At any rate, it should be carefully policed and subject to consensus. A wiki-like solution would be ideal.

Other Features
● TDGs/Profile – to be replaced with a more flexible, but potentially controllable, new system (avoid proliferation of near clones).
● Private/public collections, sharing groups – keep in some form.
● Eliminate standardization features, but keep recommendations in the context of new profiles.
Other features: TDGs; private collections; public collections; shared collections & groups; standardization features; recommendation features.

Other Features
● Output formats – highly desirable, and they don’t complicate the model; it’s a hidden functionality unless used.
● Display features – much could be hidden that is now visible.
● Multiple languages – come up with a way to clean this up; consensus-driven wiki functionality would be great.
● RRs – maybe use SKOS if it’s fully functional for us; are there good SKOS interfaces? If not, we are considering our own terminology management options.
Other features: output features and formats; display features; multiple languages; external ontology resources.