Draft Data Foundation and Terminology (DFT) Vocabulary Development Process Prepared for WG-Core meeting 24/25.2 Munich/Garching Gary Berg-Cross Co-Chair.

Presentation transcript:

Draft Data Foundation and Terminology (DFT) Vocabulary Development Process
Prepared for WG-Core meeting 24/25.2, Munich/Garching
Gary Berg-Cross, Co-Chair DFT WG
A PID record that points to a metadata record and to instantiations of identical bit-streams that may store additional attributes

DFT Goals
Describe a basic, abstract (but clear) data organization model that systemizes the already large body of definition work on data management terms, especially as involved in RDA's efforts.
The model and its derived reference data should be sound, practical and agreed to within the community for use:
- across communities and stakeholders, to better synchronize data conceptualization,
- to enable better understanding within and between communities, and
- to stimulate adopters and tool building, such as for data services, supportive of the basic model's use.
We need to get the story straight on the model in order to govern the use of related tools.
Timeline labels (from diagram): Candidate List Evolves to Refined List; Cross WGs; DFT WG Discussion & Plenary 3; Future Work 2015?

Draft Data Foundation and Terminology (DFT) Vocabulary Development Process, by Gary Berg-Cross
Five Stage process
1. Start up/Scoping: requirements analysis and development of candidate list
   a. Tool prototyping
2. Vocabulary Analysis & Revision Process (3rd Plenary)
   a. Tool demo and final requirements
3. Focused Vocabulary Design Process and Community Agreement (after 3rd Plenary)
4. Refinement & Maintenance (ongoing)
5. Draft Vocabulary Publication and Review (4th Plenary)

Overview of Term Development
Starter areas and items:
- Persistent Identifiers (PIDs and types)
- Digital Object - Data Object
- Collection - Data Set - Aggregation
- Repository (Registries and related Policies)
Process labels (from diagram): Scope; Terms from Model Papers; Placed in Tool; Defs & Refinement; Analysis and Revision Process; Getting Defs organized for review
Example entry in the Tool:
Digital Information Object: A digital item or group of items referred to as a unit, regardless of type or format, that a computer can address or manipulate as a single object.

Term Definition Example
digital entity: An entity represented as, or converted to, a machine-independent data structure consisting of one or more elements in digital form that can be parsed by different information systems; the structure helps to enable interoperability among diverse information systems in the Internet.
From: Framework for discovery of identity management information
Alternative?
This page was last modified on 9 December 2013, at 14:03.
Revision Discussion: This definition does not refer to our practice and is not specific enough. A digital object can cover different types of digital information such as data, software, knowledge etc., so we should separate data from other types of digital information. The reference to databases is also not useful enough, since there are many types of "containers" that data can be in; the term "database" does not help us because it can refer to any type of container. And in DFT we need to stress the fact that a DO is something that has an identity one can refer to, that has a number of properties that can be accessed, etc. (Peter)

PID Term and Discussion
Discussion on and Tool (

Tobias: We should emphasize that persistence is not purely technical, which is a point I think John Kunze in particular would agree with: there are social contracts associated with the idea of persistence. If you don't put those policies in place, persistence is undefined at best. On second thought, this also means that not just the resolution service is persistent, but also the association between identifier and target object. That is a contract probably put on the shoulders of the agent requesting the PID in the first place, because the service will be unable to decide or maintain this.

Gary: Tobias, you have evoked a few things, such as PID Service (we need to include this as a term). So should we have definitions with the idea of a Contract by Agent as part of the metadata for a PID? Assertions: a PID Requesting Agent (sub-type of Agent) contracts to maintain the connection (definition?) between ID and Target Object. TO has contract. PID service is a service.

TobiasWeigel (talk) 09:01, 10 December 2013 (UTC): The PID Service and the PID System might be the same thing in reality. One difference may be that the PID System maintains a Resolution Service, while the PID Service is the entity with which the contract is made. Each PID Service employs a PID System. Each PID System can be employed by several PID Services.
Example for a PID Service: DataCite
Example for a PID System: The DOI System
Example for a Resolution Service: 2a00:1a48:7805:112:2c13:65be:ff08:2e89, better known as dx.doi.org
(In reality, there really is a contract between e.g. DKRZ and DataCite, so this seems adequate.)
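The relationships proposed in this discussion can be written down as a small data model. The following is only a sketch under assumptions: the class and field names are invented for illustration, the second PID Service and the identifier and target values are hypothetical placeholders, and nothing here is an agreed DFT definition. It encodes three statements from the discussion: a PID System maintains a Resolution Service, each PID Service employs a PID System, and the contract is made between a requesting agent and a PID Service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResolutionService:
    address: str  # network name or address where identifiers are resolved


@dataclass(frozen=True)
class PIDSystem:
    name: str
    resolution_service: ResolutionService  # the PID System maintains this


@dataclass(frozen=True)
class PIDService:
    name: str
    system: PIDSystem  # each PID Service employs a PID System


@dataclass(frozen=True)
class PIDContract:
    requesting_agent: str  # the agent that requested the PID
    service: PIDService    # the entity with which the contract is made
    pid: str               # hypothetical placeholder value below
    target_object: str     # hypothetical placeholder value below


def services_employing(system, services):
    """Return the names of all PID Services that employ the given PID System."""
    return [s.name for s in services if s.system == system]


# Examples named in the discussion.
doi_system = PIDSystem("The DOI System", ResolutionService("dx.doi.org"))
datacite = PIDService("DataCite", doi_system)

# Invented second service, only to show that one system can serve several.
other = PIDService("Hypothetical Registrar", doi_system)

# The identifier and target are placeholders, not real records.
contract = PIDContract("DKRZ", datacite, "10.9999/example", "example-data-object")

print(services_employing(doi_system, [datacite, other]))
print(contract.service.system.resolution_service.address)
```

Keeping the contract as its own record, rather than as a field on the PID Service, reflects the point that the association between identifier and target is the responsibility of the requesting agent and not of the service.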

Today's Session - A focus on the following terms / term clusters:
- Data / Realtime Data / Gappy Data / Dynamic Data
- Digital Object / Registered Digital Object / Information Object
- Bit Stream / Instances of Bit Stream / Data Stream
- Identity / Integrity / Authenticity
- Object Property / Object Attribute / Property Record / Internal Property / External Property
- Persistent Identifier / PID Record / PID Attribute / PID Resolution / Reference Resolution
- Data Organization / Data Model
- Repository / Repository of Origin
- Aggregation / Collection / Data Set / Corpus / Container
- Data LifeCycle
We can prepare this by looking at the stuff which is in the wiki, comparing, etc. We need to make a quantum job in defining a few terms. We need to argue from the different data models/organizations that were presented and, of course, also look at what others have done.

Peter's Conceptual Space (diagram)
Concepts shown: digital object, bit stream, instance of a bit stream, service object, informational object, aggregation, data object, collection, metadata record, PID record, data stream, data set, corpus, attribute, property.
Relation labels used: is_a, is_part_of, has_a, has_many, is_equal, contains_a.

Some Notes/Questions on the Conceptual Space for the RDA DFT Term Definitions - Some existing stuff
(Diagram: the same conceptual space of digital object, bit stream, instance of a bit stream, service object, informational object, aggregation, data object, collection, metadata record, PID record, data stream, data set, corpus, attribute and property, linked by is_a, is_part_of, has_a, has_many, is_equal and contains_a.)
Notes on the diagram:
- Perhaps both Data and Service Objects are Informational Objects.
- Better to say that a data set is a type of (Data) Collection. Not every collection is a data set; it could be a Corpus.
- A data stream is a Data Object, right?
- An instance of a data stream is one Manifestation of the content of the original DO.
- Shouldn't we show the relation between the PID and Metadata?
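The is_a suggestions in these notes can be tried out as a small lookup table. This is a sketch under assumptions: it treats the tentative proposals as if they were settled (several are phrased as questions), the dictionary and function names are invented for illustration, and the other relation types (has_a, is_part_of, is_equal, contains_a) are left out because the diagram's actual links are not recoverable from this transcript.

```python
# Tentative is_a links taken from the notes; none of these is an agreed definition.
IS_A = {
    "data set": "collection",                  # "a data set is a type of (Data) Collection"
    "corpus": "collection",                    # a collection "could be a Corpus"
    "data stream": "data object",              # "A data stream is a Data Object, right?"
    "data object": "informational object",     # "Perhaps both Data and Service Objects..."
    "service object": "informational object",
}


def ancestors(term):
    """Follow is_a links upward and return the broader terms, nearest first."""
    chain = []
    seen = {term}
    while term in IS_A:
        term = IS_A[term]
        if term in seen:  # guard against accidental cycles in the table
            break
        seen.add(term)
        chain.append(term)
    return chain


def is_a(term, candidate):
    """True if candidate is a broader term reachable from term via is_a links."""
    return candidate in ancestors(term)


print(ancestors("data stream"))
print(is_a("data set", "collection"))
print(is_a("collection", "data set"))
```

Modelling "data set" and "corpus" as narrower than "collection", rather than equal to it, follows the note that not every collection is a data set.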

Status & Plan Going Forward
We now have a table of Core Terms with some initial Definitions. These are also in the Tool, with perhaps some still being updated.
Our Joint WG-Core meeting represents an opportunity to take stock and do some editing, testing of ideas and refining, as well as to strategize on next steps.
Get some sense of agreement and of where the issues are for the WG-Core.
Preparation for 3rd Plenary:
- Documents
- Tool and Demo
- Discussion of working Core

Checklist of Issues - What is Needed for DFT Term Progress?
- Ramp up of effort by DFT WG Community
- Review of table, categories and definition refinement
- Confirmation of scope of work
- How do we handle points of contention?
- What is the process by which we converge and move to adoption?
- Training in and exposure of Term Tool
- Use by other WGs for their needs: is our table example useful as a model for them?
- Test of Scenarios: are they useful?
- Examples of term-concepts involved with real data: what is a data set, what aggregation examples do we point to, etc.?