RDS / AAF / ANDS / NeCTAR / AARNET Data Lifecycle framework


1 RDS / AAF / ANDS / NeCTAR / AARNET Data Lifecycle framework
Ian Duncan – Director, Research Data Services (RDS)

2 What is the research data lifecycle?
Creation / Discovery
Preserve / Archive / Discard
Description / Provenance
Analysis / Manipulation
Integration / Storage

3 Another way of looking at it

4 And another

5 What’s the problem? reliability

6 What’s the problem? accessibility

7 Findable, Accessible, Interoperable, Reusable

8 Research Data Services (RDS) *
Australian National Data Service (ANDS) *
National eResearch Collaboration Tools and Resources (NeCTAR) *
The Australian Access Federation (AAF)
Australia’s Academic and Research Network (AARNet)
* funded by the National Collaborative Research Infrastructure Strategy (NCRIS)

9 Existing components / Services
Diagram labels: researchers; projects; Identity / Authorisation / Access; ingest; share (dropbox-like); analyse/process (inc NCI, Pawsey, local HPC, etc); Various Storage Resources; portal 1, portal 2, portal 3; repo 1, repo 2, repo 3; Research Data Australia

10 Proposed New/Enhanced Components
New “connector” components:
Global Project ID
Group ID and Group management service
Minimal & extensible DMP metadata definition
Project-based resource allocation
Provisioning API (storage and allocation metadata)
Diagram labels: DMP and Provisioning; Data Management Plan; researchers; Project ID; projects; Local DMP systems; National Storage Resources; API; DLCF Metadata Store (DLCF-MS); Enhanced Identity / Authorisation / Access Services (ORCiD, eduGAIN); ingest; share (dropbox-like); analyse/process (inc NCI, Pawsey, local HPC, etc); portal 1, portal 2, portal 3; repo 1, repo 2, repo 3; Research Data Australia

11 Data Lifecycle framework Outline – phases/workflows
Phases / Workflows
Provisioning (phase 1)
1 “What can I access?”
2 “Where is it?”
3 “How can I feed my data into it?”
4 “How can I share and use it with my group and my collaborators?”
Use (phase 2)
5 “Can I process it on the Cloud?” “And here at Uni of X?” “I need a bigger machine..”
archive/publish/share/reuse/discard (phase 3)
6 “I’ve finished my project and I think this data could be useful to someone in the future, please pack it away and make it available somehow” “I don’t want to share it just yet, please hold on to it and let me know if someone wants access”
Diagram labels: DMP & provisioning; grants db; researchers; Project ID; projects; Local DMP systems; National Storage Resources; metadata API; ingest; share (dropbox-like); analyse/process (inc NCI, Pawsey, local HPC, etc); portal 1, portal 2, portal 3; repo 1, repo 2, repo 3; Research Data Australia

12 Data Lifecycle framework – All components – aspirational target
Possible Project Components
1 Researchers access the grants database, which indicates which grants they ‘own’ or have access to. This “surrounding” metadata is registered with the metadata db.
2 Space marked with this metadata is provisioned on dropbox-like storage which is visible to the NeCTAR cloud. This space should belong to a project, not a person.
3 Automated and manual ingest processes feed data to this store, harvesting additional metadata where possible and relevant.
4 Provisioned space should be as dropbox-like as possible.
5 Storage is immediately visible to the NeCTAR cloud, and processes are developed to ship data to local HPC or peak facilities using existing high-speed networks and tools.
6 Once the project is complete, the data is packaged, shipped to and indexed by the relevant domain repository, and registered with the RDA index.
Diagram labels: DMP & provisioning; grants db; researchers; Project ID; projects; Local DMP systems; National Storage Resources; metadata API; ingest; share (dropbox-like); analyse/process (inc NCI, Pawsey, local HPC, etc); portal 1, portal 2, portal 3; repo 1, repo 2, repo 3; Research Data Australia
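The six numbered steps can be read as an ordered workflow grouped under the three phases from the outline slide (steps 1 to 4 under provisioning, 5 under use, 6 under archive/publish). The following is a minimal Python sketch of that grouping only; the names `Phase`, `STEPS` and `steps_in` are illustrative assumptions and do not come from any DLCF specification.

```python
from enum import Enum


class Phase(Enum):
    PROVISIONING = 1  # "Provisioning (phase 1)"
    USE = 2           # "Use (phase 2)"
    ARCHIVE = 3       # "archive/publish/share/reuse/discard (phase 3)"


# (step number, phase, short paraphrase of the slide text)
STEPS = [
    (1, Phase.PROVISIONING, "register grant metadata with the metadata db"),
    (2, Phase.PROVISIONING, "provision project-owned, dropbox-like storage"),
    (3, Phase.PROVISIONING, "ingest data and harvest additional metadata"),
    (4, Phase.PROVISIONING, "share through the dropbox-like space"),
    (5, Phase.USE, "process on cloud, local HPC or peak facilities"),
    (6, Phase.ARCHIVE, "package, ship to a domain repository, register with RDA"),
]


def steps_in(phase):
    """Return the step numbers that belong to the given phase, in order."""
    return [number for number, step_phase, _ in STEPS if step_phase is phase]


if __name__ == "__main__":
    for phase in Phase:
        print(phase.name, steps_in(phase))
```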

13 Data Lifecycle framework Potential Areas of Responsibility
Possible Project Areas of Responsibility
Responsible parties (diagram legend): RDS; NeCTAR; ANDS; AAF; AARNet; Others (eg unis, NCRIS projects, etc)
Diagram labels: DMP & provisioning; grants db; researchers; Project ID; projects; DMP System; National Storage Resources; metadata API; Identity / Authorisation / Access; ingest; share (dropbox-like); analyse/process (inc NCI, Pawsey, local HPC, etc); portal 1, portal 2, portal 3; repo 1, repo 2, repo 3; Research Data Australia

14 Example: University Workflow
Examples: Virtual Lab
Phases: Provisioning (phase 1); Use (phase 2); archive/publish/share/reuse/discard (phase 3)
Diagram labels (steps 1 to 6 marked): DMP & provisioning; National grants db; Local grants db; Ethics db; researchers; Local DMP systems; projects; Local Storage; metadata API; ingest; share (dropbox-like); analyse/process (inc NCI, Pawsey, local HPC, etc); portal 1, portal 2, Uni Research Data portal; repo 1, repo 2, Local repo; Research Data Australia

15 Example: Amazon
Examples: Amazon
Phases: Provisioning (phase 1); Use (phase 2); archive/publish/share/reuse/discard (phase 3)
Diagram labels (steps 1 to 6 marked): DMP & provisioning; National grants db; Local grants db; Ethics db; PAP portal; Amazon S3; metadata API; ingest; share (dropbox-like); EC2 processing (inc NCI, Pawsey, local HPC, etc); portal 1, portal 2, portal 3; repo 1, repo 2, Glacier; Research Data Australia

16 Example: Cloudstor - a National Solution
Examples: Cloudstor
Phases: Provisioning (phase 1); Use (phase 2); archive/publish/share/reuse/discard (phase 3)
Diagram labels (steps 1 to 6 marked): DMP & provisioning; grants db; researchers; projects; Local DMP systems; metadata API; ingest; share (dropbox-like); analyse/process (inc NCI, Pawsey, local HPC, etc); portal 1, portal 2, portal 3; repo 1, repo 2, repo 3; Research Data Australia

17 Example: OwnCloud Federation
Examples: OwnCloud Federation
Federated sites: Uni A (Uni A provisioning portal); Uni B (uses the national provisioning portal); Uni C (Uni C provisioning portal)
Diagram labels (steps 1 to 6 marked): DMP & provisioning; grants db; researchers; projects; Local DMP systems; metadata API; ingest; share (dropbox-like); analyse/process (inc NCI, Pawsey, local HPC, etc); portal 1, portal 2, portal 3; repo 1, repo 2, repo 3; Research Data Australia

18

19 Engagement
The initial step for the DLCF was engagement: a “market scan” of who was doing what, with whom, using what, i.e. identifying the blocks.

20 Thought Bubbles
Output #1: DLCF Summary Document
This allowed us to identify and communicate the scale of the challenge, as well as to zero in on one area to focus on: the Provisioning Phase.

21 MVP – DLCF Connectors Framework
Metadata Store + REST API
(inter?)National Project ID
Group ID and Group Management Service
Resulting in (for step 1): Project-based data resource allocation, using AARNet Cloudstor+

22 Minimal & Extensible Metadata
Section Schema Comments Phase
Research Project ID <ID> Auto-generated (by national RDS system?) 1
Collaborators ORCiDs Identified by ORCiDs
Data Links <Defined by Data Providers> Data providers to define a JSON fragment 2
Service Links <Defined by the services> Service providers to define a JSON fragment 3
Project Title <text> May contain sensitive information
Public
Funding URIs Links to ARC/NHMRC/other funders
Institutional proj ID institution specific As per institutional requirements 4
Ethics Approval HREC?
Finance
Institutional Storage <UID> Local, Dropbox, OneDrive, as per local requirements
Required for DMP Connector
Optional for DMP Connector
Not Aggregated (institution specific)
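One way to picture a "minimal and extensible" record is a small fixed core plus free-form JSON fragments supplied by data and service providers. The sketch below is an assumption-laden illustration in Python: the class name `ProjectRecord`, the field names, and the choice of which fields are required are all hypothetical, since the slide's required/optional colour coding is not recoverable from the transcript. The only behaviour taken from the slide is that institution-specific fields are not aggregated.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ProjectRecord:
    # Assumed to be the only mandatory field in this sketch.
    project_id: str
    # Collaborators identified by ORCID iDs.
    collaborators: List[str] = field(default_factory=list)
    # JSON fragments whose shape is defined by data / service providers.
    data_links: List[Dict[str, Any]] = field(default_factory=list)
    service_links: List[Dict[str, Any]] = field(default_factory=list)
    # May contain sensitive information, so it is optional here.
    project_title: Optional[str] = None
    # URIs linking to funders (e.g. ARC, NHMRC).
    funding: List[str] = field(default_factory=list)
    # Institution-specific items (institutional project ID, ethics,
    # finance, storage); kept locally and not aggregated.
    institutional: Dict[str, Any] = field(default_factory=dict)

    def to_aggregated_json(self) -> str:
        """Serialise the record without the institution-specific section."""
        document = asdict(self)
        document.pop("institutional")
        return json.dumps(document, sort_keys=True)


if __name__ == "__main__":
    record = ProjectRecord(
        project_id="example-0001",  # placeholder, not a real ID format
        collaborators=["0000-0002-1825-0097"],
        data_links=[{"provider": "example", "path": "/placeholder"}],
        institutional={"storage": "local"},
    )
    print(record.to_aggregated_json())
```

Keeping the provider-defined parts as opaque dictionaries means new providers can add fields without a schema change, at the cost of weaker validation.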

23 Existing DMP tools

24 Metadata store & REST API
National Services - Metadata Store API/Service
Single API
Pass-through service
Two-way traffic
Where and how..?
REST API:
Project_ID Request: <Project_ID>
Group_ID Request: <Group_ID>
Group Membership: <Group_ID> <ORCiD 01> <ORCiD 02> <…>
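The three request types on this slide (Project_ID request, Group_ID request, group membership) could sit behind a single entry point. The following in-memory Python sketch shows that shape only. The class `MetadataStore`, the request dictionaries, the `type` values and the `proj-`/`grp-` prefixes are all assumptions made for illustration; the slide itself leaves "where and how" open, and no HTTP layer or pass-through behaviour is modelled here.

```python
import uuid


class MetadataStore:
    """Toy stand-in for a metadata store reached through one API."""

    def __init__(self):
        self._projects = set()
        self._groups = {}  # group_id -> list of ORCID iDs

    def handle(self, request):
        """Dispatch one request dictionary and return a response dictionary."""
        kind = request.get("type")
        if kind == "project_id_request":
            project_id = "proj-" + uuid.uuid4().hex  # assumed ID format
            self._projects.add(project_id)
            return {"project_id": project_id}
        if kind == "group_id_request":
            group_id = "grp-" + uuid.uuid4().hex  # assumed ID format
            self._groups[group_id] = list(request.get("members", []))
            return {"group_id": group_id}
        if kind == "group_membership":
            group_id = request.get("group_id")
            if group_id not in self._groups:
                return {"error": "unknown group"}
            return {"group_id": group_id, "members": list(self._groups[group_id])}
        return {"error": "unknown request type"}


if __name__ == "__main__":
    store = MetadataStore()
    print(store.handle({"type": "project_id_request"}))
    created = store.handle(
        {"type": "group_id_request", "members": ["0000-0002-1825-0097"]}
    )
    print(store.handle({"type": "group_membership", "group_id": created["group_id"]}))
```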

25 National Services – Project ID
National Project ID
User story: As a researcher, after completing a DMP for a project, I want to connect with the DLC tool and have a DLC project ID automatically allocated. This will provide a common key to all my resources for the duration of the project and post-publication.
DLCF developed
Project_ID <CERIF, ANDS, ORCiD?>
This identifier is a critical part of the DLC process; it provides a unique key for identifying not only the project but also all associated project entities and collaborations.
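The "common key" idea in this user story can be sketched as a registry that allocates an identifier when a DMP is completed and then indexes every later resource under it. This is a hedged Python illustration, not the DLCF design: `ProjectRegistry` and its methods are hypothetical names, and a random UUID stands in for the identifier scheme, which the slide leaves undecided (CERIF, ANDS, ORCiD?).

```python
import uuid


class ProjectRegistry:
    """Index project resources under one allocated project ID."""

    def __init__(self):
        self._resources = {}  # project_id -> list of (kind, reference)

    def allocate(self, dmp_reference):
        """Allocate a new project ID for a completed DMP."""
        project_id = str(uuid.uuid4())  # placeholder scheme, an assumption
        self._resources[project_id] = [("dmp", dmp_reference)]
        return project_id

    def attach(self, project_id, kind, reference):
        """Link another resource (storage, group, dataset, ...) to the project."""
        if project_id not in self._resources:
            raise KeyError("unknown project ID: " + project_id)
        self._resources[project_id].append((kind, reference))

    def resources(self, project_id):
        """Return everything linked to the project, in insertion order."""
        return list(self._resources.get(project_id, []))


if __name__ == "__main__":
    registry = ProjectRegistry()
    pid = registry.allocate("dmp-001")
    registry.attach(pid, "storage", "example-storage-allocation")
    registry.attach(pid, "group", "example-group")
    print(pid, registry.resources(pid))
```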

26 Group ID & Group Management
User story: As a data provider I want to determine who has specific access permissions for a research data set and associated project assets.
User story: As a project collaborator I need to have access to the datasets and tools associated with one or more projects. I will use my ORCID as my primary identifier and will then be able to access assets for which I have permissions, across all projects for which I am a collaborator.
People IDs vs Role IDs (Data Custodians)
User story: As a research organisation I want to ensure that research data and associated project assets have a reliable custodian assigned. I want the custodian to be aligned with an organisational role and not a specific person, although it is understood that a person will be assigned that role for a duration of time.
National Services - Groups
AARNet developed
VOOT
Group_ID <?????>: <ORCiD 1> <ORCiD 2> <ORCiD 3> <…> <Institution_ID 01> <GoogleID 01?> <...>
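The three user stories above map onto three small operations: listing who can access an asset, checking whether a given ORCID iD can access it, and resolving a custodian through a role rather than a person. The Python sketch below illustrates them with an in-memory structure. The class `GroupService` and its method names are hypothetical, and it does not implement VOOT. The ORCID check digit calculation follows the ISO 7064 MOD 11-2 scheme that ORCID iDs use.

```python
def orcid_checksum_ok(orcid):
    """Validate the final check character of an ORCID iD (ISO 7064 MOD 11-2)."""
    digits = orcid.replace("-", "")
    if len(digits) != 16 or not digits[:15].isdigit():
        return False
    total = 0
    for ch in digits[:15]:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return digits[15].upper() == expected


class GroupService:
    def __init__(self):
        self._members = {}         # group_id -> set of ORCID iDs
        self._allowed = {}         # asset -> set of group_ids
        self._custodian_role = {}  # asset -> role name
        self._role_holder = {}     # role name -> ORCID iD currently in the role

    def add_member(self, group_id, orcid):
        if not orcid_checksum_ok(orcid):
            raise ValueError("invalid ORCID iD: " + orcid)
        self._members.setdefault(group_id, set()).add(orcid)

    def grant(self, asset, group_id):
        self._allowed.setdefault(asset, set()).add(group_id)

    def who_has_access(self, asset):
        """Data provider view: every ORCID iD with access to the asset."""
        people = set()
        for group_id in self._allowed.get(asset, set()):
            people |= self._members.get(group_id, set())
        return sorted(people)

    def can_access(self, orcid, asset):
        """Collaborator view: access follows group membership, across projects."""
        return orcid in self.who_has_access(asset)

    def set_custodian_role(self, asset, role):
        self._custodian_role[asset] = role

    def assign_role(self, role, orcid):
        self._role_holder[role] = orcid

    def custodian(self, asset):
        """Organisation view: (role, current holder); the role outlives the person."""
        role = self._custodian_role[asset]
        return role, self._role_holder.get(role)
```

Because assets point at a role and the role points at a person, reassigning the role changes the custodian of every asset without touching the assets themselves.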

27 DLCF Connectors REST API
National Services - Metadata Store API/Service (REST API)
Project_ID Request: <Project_ID>
Group_ID Request: <Group_ID>
Group Membership: <Group_ID> <ORCiD 01> <ORCiD 02> <…>
Group Query
National Services – Project ID (DLCF developed)
Project_ID <CERIF or ANDS>
New Project Request / New project ID request
Institution or national example DMP service
DMP: <Project_ID 01>, Project Name, <Group_ID 01>, <Group_ID 02>
National Services - Groups (AARNet developed)
VOOT
Group_ID <?????>: <ORCiD 1> <ORCiD 2> <ORCiD 3> <…> <Institution_ID 01> <GoogleID 01?> <...>
New Group Request
National Services – Resource Provisioning

28 Roadmap & Minimum Viable Product

