Published by Dagny Sletten. Modified over 6 years ago.
1
Data Services at CSC
©2016 OKM ATT initiative. Licensed under Creative Commons BY 4.0.
2
CSC and Customers
Services:
  Computing Services
  Research Information Management Services
  Funet Network Services
  Education Management and Student Administration Services
  Identity and Access Management Services
  Datacenter and Capacity Services (IaaS)
  Training Services
  Consultation and Tailored Solutions
Customers:
  Ministry of Education and Culture
  Other ministries and state administration
  Higher education institutions
  Research institutions
  Companies
3
Data Service Portfolio
Data services for open science (details later)
HPC archive
  20 GB to 5 TB default quota per user
  iRODS (see IDA)
Cloud (ePouta, cPouta)
  ePouta: secure and private for organizations
  cPouta: general-purpose IaaS (includes FGCI resources)
Databases for HPC and others
HPC data analysis, off-the-shelf and tailored services
EUDAT
  Pan-European research data infrastructure, training and consultancy
  B2DROP*, B2Share*, B2Safe, B2Stage, B2Find*
  Coordinated by CSC
4
Open Science Services Development: Policy Work
E.g. requirements for higher education institutions
Framework architecture
  A target-level description of open science & research processes, services, information, data structures, actors, roles and information system services
  Defines the framework for national common solutions, components, data management, information system and local service design and implementation
  Finished in 2015, put into practice in 2016
Long-term preservation model for research data
  Recognize the most important or unique research output
  Ensure linkages between publications, data and methods
Make services easy to use, efficient and adaptable
Enable organizations to easily adopt the services in their own operations
5
Data Services for Open Science
Etsin: research data finder
IDA: research data storage service
AVAA: open data publishing portal
PAS: digital preservation solution
Tuuli: data management planning tool
Research infrastructure databank
6
Data lifecycle and services
Data planning
Data search
Data analysis
Data storage
Data sharing
Data reuse
Open science & research handbook
PAS
7
Current architecture
[Architecture diagram. Components shown: AVAA, AVAA sites, Liferay, IDA, SUI, My files, Davis, iRODS, Etsin, Reetta REMS, Apache, CKAN. Users: anyone, Haka users and IDA users, connecting by browser, API, folder or command line. Protocols and interfaces: http, https, WebDAVS, irods, SQL, OAI-PMH, RESTful API. Cross-service links: IDA-AVAA download, IDA-REMS download.]
8
research data storage service
IDA
9
IDA research data storage
Offered since 2012 for projects in Finnish universities, universities of applied sciences and the Academy of Finland
Organizational usage quotas vary according to size, from 30 TB to 1260 TB
Open-source iRODS technology provides secure storage procedures with data replication
openscience.fi/ida
10
IDA research data storage
Currently:
  130 projects
  500 registered users
  19 million data files
  470 TB used
Data owner decides on openness and use policy
[Diagram labels: metadata catalogue and open data portal; data; metadata; user; producer; research organization's service]
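The usage figures on this slide imply some rough averages. The short calculation below is only a back-of-the-envelope sketch: it assumes decimal terabytes (the slide does not say TB or TiB) and treats the rounded slide figures as exact.

```python
# Rough averages derived from the IDA usage figures on the slide.
# Assumption: 1 TB = 10**12 bytes; the slide does not specify the unit base.
TB = 10**12
MB = 10**6

projects = 130
files = 19_000_000
used_bytes = 470 * TB

avg_file_mb = used_bytes / files / MB          # mean size of one stored file
avg_project_tb = used_bytes / projects / TB    # mean storage used per project
files_per_project = files / projects           # mean file count per project

print(f"average file size:    {avg_file_mb:.1f} MB")
print(f"average per project:  {avg_project_tb:.2f} TB")
print(f"files per project:    {files_per_project:,.0f}")
```

These are means only; the slide gives no information about how usage is distributed across projects.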
11
Plans for new IDA
Needs:
  Everyday storage and sharing
  A medium-term (~10 years) preservation buffer for PAS (long term)
  Data lifecycle support: storage, "freezing" and hand-over
  Centralized metadata management: data registration in an external metadata resource, linking files to datasets to storage packages
  Access management improvements: roles, organizations
Upgrades planned for multiple layers: software (iRODS vs. OwnCloud?), storage solution (scale-out) and system architecture
12
research data finder Etsin
13
Etsin research data finder
National metadata catalogue for research data
Adheres to the national metadata model
URN PIDs assigned, also support for other IDs
Currently dataset metadata entries published
etsin.avointiede.fi
14
Etsin research data finder
Extension of CKAN data portal & Solr search engine
DDI and OAI-PMH metadata harvesting from outside sources
Lately: UX improvements, new datasets harvested, plans for integration with research organization catalogues
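As an illustration of what OAI-PMH harvesting involves, here is a minimal sketch that builds a `ListRecords` request and reads record identifiers from a response. It is not Etsin's harvester: the base URL, the sample response and the function names are hypothetical, and only the protocol elements (`verb`, `metadataPrefix`, `resumptionToken`, the OAI 2.0 namespace) come from the OAI-PMH specification.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it replaces metadataPrefix on follow-up requests.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"

def parse_list_records(xml_text):
    """Return (list of record identifiers, resumption token or None)."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text
        for el in root.iterfind(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token

# Hypothetical response, trimmed to the parts the parser reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:dataset-1</identifier></header></record>
    <record><header><identifier>oai:example.org:dataset-2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

ids, token = parse_list_records(SAMPLE_RESPONSE)
next_url = list_records_url("https://example.org/oai", resumption_token=token)
```

A real harvester would fetch each URL, repeat until no resumption token is returned, and map the harvested metadata onto the catalogue's own model.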
15
open data publishing portal
AVAA
16
AVAA open data publishing platform
For producers and users of open data since 2013
Pilot cases of research data and access tools developed together with researchers
Open data from IDA
Roughly 3000 users yearly, 10 million+ API requests
avaa.tdata.fi
17
AVAA open data publishing platform
Applications and interfaces for data download, analysis and visualization
Applications developed as open source: github.com/avaa-csc/
19
digital preservation solution
PAS
20
Layers in data storage and discovery
Managing status (is data integrity intact? is data available?)
Managing location (where is the data?)
Managing roles (who owns rights to the data? who is responsible for sustainability?)
Managing risks (how to keep data discoverable and usable? what actions are needed?)
Source: McDonald 2008
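The "managing status" layer asks whether data is available and whether its integrity is intact. One common way to answer both is a checksum (fixity) check against a previously recorded digest. The sketch below is a generic illustration, not a description of how PAS or IDA do it; the function names and the three status labels are assumptions made for the example.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def check_status(path, expected_sha256):
    """Answer both status questions: is the file available, is it intact?"""
    p = Path(path)
    if not p.is_file():
        return "missing"
    return "intact" if sha256_of(p) == expected_sha256 else "corrupted"

def demo():
    """Record a checksum, then check the file unchanged, altered and deleted."""
    with tempfile.TemporaryDirectory() as tmp:
        f = Path(tmp) / "dataset.csv"
        f.write_bytes(b"id,value\n1,42\n")
        recorded = sha256_of(f)            # stored at ingest time
        unchanged = check_status(f, recorded)
        f.write_bytes(b"id,value\n1,43\n")  # simulate silent corruption
        altered = check_status(f, recorded)
        f.unlink()                          # simulate loss of the file
        deleted = check_status(f, recorded)
    return unchanged, altered, deleted
```

Calling `demo()` walks through the three cases in order. In a preservation setting the recorded digests would be kept separately from the data and the check repeated on a schedule.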
21
PAS digital preservation solution
PAS infrastructure operational
National Digital Library digital preservation (KDK-PAS) in production since
  Under the administration of the Ministry of Education and Culture
  Preserving cultural heritage
  ISO 27001 audited service
Research Data PAS
  Same infrastructure as KDK-PAS
  Preservation model published in 12/2015
  In the piloting phase
  To production in stages starting 2017
22
data management planning tool
dmpTuuli
23
dmpTuuli data management planning tool
What: data management planning (DMP) tool for Finnish research organizations
How: a collaborative project with a user-driven approach
Why: DMP is an integral part of good research practice and ensures research integrity and quality
Where:
When: piloting with national funders in 2016
24
dmpTuuli
A data management plan (DMP) will help you manage your data, meet funder requirements and help others use your data if shared. DMPTuuli will help you write data management plans.
DMPTuuli is provided by the Finnish Tuuli project. The project has worked closely with researchers and research funders to produce guidance and templates that help researchers write an effective data management plan covering the whole lifecycle of a project, from the bid-preparation stage through to completion.
DMPTuuli is based on DMPonline code, developed by the UK's Digital Curation Centre.
25
Data management plan
A living document: updatable and reviewable
Create your data management plan early and review it regularly throughout the research project
Describes
  what data will be collected and how
  the usage and storage of your data
  how to enable the reuse of your data after the project
Covers issues concerning
  responsibilities
  data ownership and licensing
  costs
26
Research infrastructure databank
27
Research infrastructure databank
Unified descriptions of RIs and services
Promotes openness and sharing
Centralized and easily updatable
For researchers, RI service providers, funders
infras.openscience.fi
28
RI DB: Features in development
PIDs for RIs
Open API
Updates through harvesting
Linking data: publications, data, funding, projects, organizations, resources etc.
29
Common challenges
30
Common challenges
Metadata management & creation, metadata reserve
Levels of abstraction in research data management: file vs. dataset
Researcher vs. organization, handover
Roles of funders
International data
31
Thank you!