Published by John Webster. Modified over 8 years ago.
1
Discussion of Data Fabric Terms & Preparation for RDA P7
Virtual Meeting, Monday, January 25, 2016
Organized by Gary Berg-Cross (DFT-IG) and Peter Wittenburg (DF-IG)
2
Agenda, Context and Recap
1. Brief update by Gary Berg-Cross on DFT IG activities
   1. New terms
   2. Use cases for vocabulary services
   3. Context for DF term discussion
   4. P7 plans
2. Overview of terms and vocabulary issues from the Data Fabric standpoint, by Peter Wittenburg
3. Discussion of how to handle vocabulary issues going forward
   1. This meeting is an opportunity to discuss some of the troublesome DF terms (and maybe other ones) in context, to see if we can develop working draft definitions that can be firmed up over time.
4. Vocabulary issues and plans from other RDA groups
   1. Interested people can respond here with candidate terms or issues, and perhaps working definitions, and can also bring them up at the meeting as noted in the agenda.
3
DFT IG Status and Plans
1. Some terms, for example about repository registries, have been entered into the RDA DFT term tool, based on recent DF discussions and posts as well as the RDA-WDS Data-Pub Workflows work. http://smw-rda.esc.rzg.mpg.de/index.php/Special:AllPages
   - Collection Registry
   - Repository Registry
   - Data repository entry
   - Data review
   - Data journal
2. In addition, we are working with the Vocabulary Services IG to use some of their tool-based services to improve our vocabularies:
   - Providing URLs for each term for referencing
   - Creating taxonomies from the definitions
   - Handling synonyms, etc.
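The three vocabulary services listed above (term URLs, taxonomies, synonym handling) can be pictured with a small in-memory sketch. This is only an illustration, not the Vocabulary Services IG tooling or the DFT term tool: the `Vocabulary` and `Term` classes, the `https://example.org/terms/` base URL, the synonym "registry of repositories" and the broader term "Registry" are all assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import Optional
from urllib.parse import quote

BASE_URL = "https://example.org/terms/"  # hypothetical base, not the real term tool


@dataclass
class Term:
    label: str                                   # preferred label
    synonyms: set = field(default_factory=set)   # alternative labels
    broader: Optional[str] = None                # parent term in the taxonomy


class Vocabulary:
    def __init__(self, base_url=BASE_URL):
        self.base_url = base_url
        self.terms = {}   # preferred label -> Term
        self.index = {}   # lowercased label or synonym -> preferred label

    def add(self, label, synonyms=(), broader=None):
        term = Term(label, set(synonyms), broader)
        self.terms[label] = term
        for name in (label, *synonyms):
            self.index[name.lower()] = label
        return term

    def resolve(self, name):
        """Map a label or synonym (case-insensitive) to the preferred label."""
        return self.index.get(name.strip().lower())

    def url(self, name):
        """Build a referenceable URL for a term, resolving synonyms first."""
        label = self.resolve(name)
        if label is None:
            raise KeyError(name)
        return self.base_url + quote(label.replace(" ", "_"))

    def narrower(self, label):
        """List the terms placed directly under `label` in the taxonomy."""
        return sorted(t.label for t in self.terms.values() if t.broader == label)


vocab = Vocabulary()
vocab.add("Registry")  # illustrative parent term
vocab.add("Repository Registry", synonyms=["registry of repositories"], broader="Registry")
vocab.add("Collection Registry", broader="Registry")

print(vocab.url("Registry of Repositories"))
print(vocab.narrower("Registry"))
```

Resolving every synonym to one preferred label before building the URL keeps references stable even when groups use different wording for the same concept.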
4
Broadening the Discussion (Stepwise or Scope-wise)
Data management (and use) is broad, so we are building out from our start:
- Digital Object Management (registered digital data)
- Digital Data Management, including unregistered data (a broader concept)
Where are datasets?
5
Integrate Concepts: Policy-based Digital Data Management Concept Graph (Reagan Moore)
Based on practical principles, policy defines when in a workflow a PID is created, as well as other curation activities. These definitions are linked.
6
Based on DF discussions we developed suggested concepts with candidate terminology. Examples:
1. Data practice is the actual application or use of ideas and methods (as opposed to theories) about how data are collected, created, stored (maintained), curated, used, shared and released (disseminated).
2. Data principles are rules that provide guidance across data management and use for such things as data acquisition, data lifecycle control, data policy and ownership, metadata practices, data quality, etc.
3. Common data solutions are agreed-upon, easily available, tested and approved approaches to widely occurring problems in data management and use.
4. Data discovery is a process of query and/or search to find (research) data of interest.
5. Database cracking features incremental partial indexing and/or sorting of the data. It combines features of automatic index selection and partial indexes. It reorganizes data within the query operators, integrating the reorganization effort (occasionally invoking creation or removal of indexes on tables and views based on use) into query execution. It shifts the cost of index maintenance from updates to query processing.
6. Adaptive indexing is characterized by the partial creation and refinement of preliminary or fixed DB indexes as side effects to support efficient query execution.
(after http://www.vldb.org/pvldb/vol4/p586-idreos.pdf)
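To make the database cracking definition more concrete, here is a minimal sketch of the idea: each range query partitions only the piece of the column it touches and records the split point, so the column becomes more ordered as a side effect of querying. It is a toy model under stated assumptions, not the implementation from the cited paper; the `CrackedColumn` class and its method names are hypothetical, and a real system would partition in place and handle updates.

```python
from bisect import bisect_left


class CrackedColumn:
    """Toy cracked column: every range query reorganizes the data a little more."""

    def __init__(self, values):
        self.data = list(values)  # the column copy that queries reorganize
        self.pivots = []          # sorted (pivot, position) pairs: the cracker index

    def _crack(self, pivot):
        """Arrange data so values < pivot precede values >= pivot; return the split."""
        i = bisect_left(self.pivots, (pivot,))
        if i < len(self.pivots) and self.pivots[i][0] == pivot:
            return self.pivots[i][1]  # already cracked on this value, no work needed
        # Only the piece between the neighbouring pivots has to be touched.
        lo = self.pivots[i - 1][1] if i > 0 else 0
        hi = self.pivots[i][1] if i < len(self.pivots) else len(self.data)
        piece = self.data[lo:hi]
        smaller = [v for v in piece if v < pivot]
        larger = [v for v in piece if v >= pivot]
        self.data[lo:hi] = smaller + larger
        pos = lo + len(smaller)
        self.pivots.insert(i, (pivot, pos))
        return pos

    def range_query(self, low, high):
        """Return all values v with low <= v < high (unsorted within the range)."""
        start = self._crack(low)
        end = self._crack(high)
        return self.data[start:end]


col = CrackedColumn([13, 16, 4, 9, 2, 12, 7, 1, 19, 3])
print(sorted(col.range_query(5, 13)))
print(col.pivots)
```

The index grows only where queries actually land, which is the sense in which the cost of index maintenance moves into query processing.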
7
Now we have a new, long list of terms to discuss
For example, "searchable": what makes data, a publication, etc. searchable? Rich metadata, use of a standard vocabulary, use of a registry, etc.
Some terms on our list have relevant RDA groups:
- Metadata (e.g. rich metadata)
- Data publishing workflow (e.g. workflow)
- Domain repository
- Repository Platforms for Research Data IG
- Active Data Management Plans IG
- BioSharing Registry: connecting data policies, standards & databases in life sciences WG
- Practical Policy (follow on)? etc.
For some general terms we can leverage standards organizations and bodies (NIST, ISO, etc.):
- System, architecture, actor, service, schema, protocols, layer, physical layer, re-usable
For some terms we may have particular advocates (Research Object, self-documentation, etc.)