Data Foundation IG DF Organizing Chairs: Gary Berg-Cross & Peter Wittenburg.


2 Introduction to DF Group Achieving reproducible data science is a high priority. Only highly automated, self-documenting procedures that respect proper data organization principles will overcome current barriers. The Data Fabric group needs to work out directions in the belief that integrated data fabrics are a critical component of infrastructures paving the way to reproducible science. The idea for this IG emerged from discussions among the chairs of various RDA WGs. Characteristics of a Data Fabric: We are just beginning to scout out the data fabric landscape. In one view, it is a minimal set of infrastructure and service requirements that services must meet to plug into (belong to) the defined fabric. In a data fabric we ask how components that were developed separately can be made to work together; this means that the data fabric will differ for different sets of components. We note, strongly, that it is meant as a descriptive/conceptual way to deal with the interrelation between many components, rather than a prescriptive one (as an architecture would be).

3 Goals • The goal is to develop alternative views, components and aspects of the DF concept and related infrastructure. • Conceptualizations need to be discussed to come to an agreed RDA view on how the evolving DF landscape can be productively described. • As part of this, essential DF components and their interrelations need to be identified and defined. • Some of the existing RDA groups, including metadata WGs and IGs, are working on DF components and need to be positioned in such a landscape. • New working groups need to be defined to work on identified components and interfaces.

4 High Level View • This diagram provides a high-level view of possible actions within a Data Fabric, running from raw data to increasingly documented data that has been enriched and analyzed, creating referable and citable data. As shown, publications are part of this Data Fabric since they are often used for data mining and other analysis.

5 Infrastructure Component View (after Reagan Moore) • A data fabric is the set of software and hardware infrastructure components that are used to manage data, information, and knowledge. • When an enterprise implements a data management solution, one of multiple types of DF infrastructure is typically chosen, each enabling the enterprise to do one of the following: • Data management – build a data repository, manage an information catalog, and enforce management policy. • Data analysis – process a data collection, apply analysis tools, and automate a processing pipeline. • Data preservation – build reference collections and knowledge bases that comprise the intellectual capital, while managing technology evolution. • Data publication – provide discovery of and access to data collections. • Data sharing – provide controlled sharing of a data collection, shared analysis workflows, and information catalogs.

6 Data Fabric Service View (after Beth Plale) A DF should: • Be self-documenting – a service contributes to the lifecycle of the data objects it handles and must keep track of the scientifically relevant actions it performs on those data objects. • The resulting log files are periodically sent to a provenance consolidator. • Track data objects through its service processing using one of the well-known object identifier schemes. • Identify itself as one type of service, drawn from an RDA-agreed list of service types. • Implement an interface to a publish-subscribe system, which serves as the Data Fabric control mechanism.
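
The service requirements on this slide can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `SERVICE_TYPES` set stands in for an RDA-agreed list that the slides do not enumerate, `EventBus` is a toy in-memory substitute for a real publish-subscribe system, and the `provenance` topic and the identifier string are made up.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Dict, List

# Hypothetical stand-in for an RDA-agreed list of service types.
SERVICE_TYPES = {"repository", "analysis", "registry"}


class EventBus:
    """Toy in-memory publish-subscribe bus (assumed control mechanism)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = {}

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, message: dict) -> None:
        for handler in self._subscribers.get(topic, []):
            handler(message)


@dataclass
class FabricService:
    """A service that documents its own actions on identified data objects."""

    name: str
    service_type: str
    bus: EventBus
    log: List[dict] = field(default_factory=list)

    def __post_init__(self) -> None:
        # The service must identify itself with a type from the agreed list.
        if self.service_type not in SERVICE_TYPES:
            raise ValueError(f"unknown service type: {self.service_type}")

    def record_action(self, pid: str, action: str) -> dict:
        """Log one scientifically relevant action, keyed by object identifier."""
        entry = {
            "service": self.name,
            "service_type": self.service_type,
            "pid": pid,
            "action": action,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        }
        self.log.append(entry)
        return entry

    def flush_log(self) -> int:
        """Publish all pending log entries on the provenance topic, then clear them."""
        sent = len(self.log)
        for entry in self.log:
            self.bus.publish("provenance", entry)
        self.log.clear()
        return sent
```

A provenance consolidator would subscribe to the `provenance` topic (for example `bus.subscribe("provenance", records.append)`), and each service would call `flush_log()` on whatever schedule it chooses; the slides only say the logs are sent periodically.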

7 Data Object View (after Peter Wittenburg) The data fabric covers a domain of registered digital objects (DOs) that are stored in well-managed repositories. DOs are associated with metadata describing their creation context and history (provenance). The data fabric also covers a domain of registered software components (workflows, services), which are in fact a special class of DOs. Actions on DOs may be guided by abstract policies that are explicit and thus auditable. There can be multiple data fabric implementations, which should be highly interoperable.
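
As a rough illustration of the data object view, the following Python sketch models registered digital objects, software components as a special class of DO, and an explicit policy whose applications are recorded for audit. The class names, fields, identifier strings and the example policy are all hypothetical; the slides do not prescribe any data model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class DigitalObject:
    """A registered DO with metadata and a provenance history."""

    pid: str                      # persistent identifier (made-up scheme here)
    repository: str               # repository that stores the object
    metadata: Dict[str, str] = field(default_factory=dict)
    provenance: List[str] = field(default_factory=list)


@dataclass
class SoftwareComponent(DigitalObject):
    """Workflows and services are treated as a special class of DO."""

    kind: str = "service"


@dataclass
class Policy:
    """An explicit rule: a condition on a DO plus an action to perform."""

    name: str
    applies_to: Callable[[DigitalObject], bool]
    action: Callable[[DigitalObject], str]


class Registry:
    """Holds registered DOs and keeps an audit trail of policy applications."""

    def __init__(self) -> None:
        self._objects: Dict[str, DigitalObject] = {}
        self.audit_log: List[Tuple[str, str, str]] = []

    def register(self, obj: DigitalObject) -> None:
        if obj.pid in self._objects:
            raise ValueError(f"identifier already registered: {obj.pid}")
        self._objects[obj.pid] = obj

    def resolve(self, pid: str) -> DigitalObject:
        return self._objects[pid]

    def apply(self, policy: Policy) -> List[str]:
        """Apply a policy to every matching DO and record each outcome."""
        touched = []
        for obj in self._objects.values():
            if policy.applies_to(obj):
                outcome = policy.action(obj)
                obj.provenance.append(f"{policy.name}: {outcome}")
                self.audit_log.append((policy.name, obj.pid, outcome))
                touched.append(obj.pid)
        return touched
```

Because a policy is an ordinary object with a name, a condition and an action, every application leaves an entry in both the DO's provenance and the registry's audit log, which is one way to read "explicit and thus auditable".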

8 Discussion • The suggested Data Fabric IG is planned as a forum to discuss these alternative views, components and aspects of the DF concept. • To be discussed: What is the agreed RDA view on a Data Fabric? How do the outputs from the RDA working groups fit into the DF concept, and how do they relate to each other and to the various related WGs and IGs within the RDA? Which further activities are required to push the data fabric concept ahead? Continuation and initialization of working group activities related to the DF. Improving the uptake of the WG outputs by communicating them as a coherent whole within the DF concept.

Thanks for your attention.