UPData: A data curation experiment at the University of Porto using DSpace
João Rocha da Silva, FEUP
Cristina Ribeiro, DEI-FEUP / INESC-Porto
João Correia Lopes, DEI-FEUP / INESC-Porto
Eugénia Matos Fernandes, U.PORTO Reitoria (Central Services)
Contents
Motivation
– Goals of the experiment
Our users, the researchers
– Researcher concerns & needs
– Adding data curation to the research workflow
Building a repository
– Using DSpace for research data curation
Conclusions
MOTIVATION
The “standard” research workflow
However…
GOALS
– Evaluating the research data management effort
– Interviewing researchers in several areas
– Collecting data samples
– Documenting use cases for research data
– Identifying data curation practices
Project Phases
Gather Datasets & Use Cases → Specify Workflow → Build platform → Deposit Datasets
Phase 1: Interviews
Our users, the researchers...
– ...are not data preservation experts
– ...use many document formats
– ...create and gather data from many sources
Researcher concerns and needs
– Repositories cannot be “graveyards for data”; they have to provide effective ways to access the stored data
– Data has to be well annotated, or else it cannot be reused (experiment contexts, meanings of variables…)
– Better ways to find data are needed (e.g. domain-specific restrictions and not just generic metadata)
Researcher concerns and needs
– Easy sharing of data (e.g. sending a link to the place where a user can find a specific dataset)
– Researchers can be cited by their peers through the datasets that they offer
– Ensuring reproducibility of scientific findings
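To illustrate what finding data through domain-specific restrictions, and not just generic metadata, could look like, here is a minimal Python sketch. It is not the UPData or DSpace implementation: the `DatasetRecord` structure, the `find_datasets` function, and the sample records are all hypothetical, and it assumes each dataset has been annotated with the variables it contains and their value ranges.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical catalogue entry for one deposited dataset."""
    title: str
    keywords: list                                 # generic metadata
    variables: dict = field(default_factory=dict)  # domain annotations: name -> (unit, min, max)


def find_datasets(records, keyword=None, variable=None, covers=None):
    """Return records matching a generic keyword and, optionally, a
    domain-specific restriction: the dataset must contain `variable`,
    and its annotated range must include the value `covers`."""
    matches = []
    for rec in records:
        if keyword is not None and keyword not in rec.keywords:
            continue
        if variable is not None:
            if variable not in rec.variables:
                continue
            _unit, low, high = rec.variables[variable]
            if covers is not None and not (low <= covers <= high):
                continue
        matches.append(rec)
    return matches


# Invented sample records, for illustration only.
records = [
    DatasetRecord("Fatigue tests, batch 1", ["materials"], {"load": ("kN", 0.0, 50.0)}),
    DatasetRecord("Fatigue tests, batch 2", ["materials"], {"load": ("kN", 40.0, 120.0)}),
    DatasetRecord("River sampling", ["hydrology"], {"ph": ("pH", 6.1, 8.3)}),
]

# Generic metadata alone returns both materials datasets.
generic = find_datasets(records, keyword="materials")

# Adding a domain-specific restriction narrows the result to batch 2.
specific = find_datasets(records, keyword="materials", variable="load", covers=100.0)
```

The point of the sketch is that the restriction only works if variables were annotated at deposit time, which ties this need to the annotation concern raised above.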
Project Phases
Gather Datasets & Use Cases → Specify Workflow → Build platform → Deposit Datasets
Phase 2: Determine changes to current workflow
ADDING A DATA CURATION STEP TO THE RESEARCH WORKFLOW
The role of the “Data Curator”
Data curation meeting
Annotating data
After the meeting
– Data + Metadata in Excel format
How other researchers will see it
– Explore
– Filter
– Download just what you need
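The "filter, then download just what you need" idea can be sketched in a few lines of Python using only the standard library. This is an illustrative assumption about how such a step might work on tabular data, not the code of the DSpace extension: `filter_rows`, `to_csv`, and the sample dataset are hypothetical.

```python
import csv
import io


def filter_rows(rows, **conditions):
    """Keep only the rows whose columns equal all of the given values."""
    return [
        row for row in rows
        if all(row.get(col) == val for col, val in conditions.items())
    ]


def to_csv(rows, columns):
    """Serialize only the selected columns of the rows as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Invented tabular dataset, for illustration only.
dataset = [
    {"station": "A", "year": "2010", "temperature": "14.2"},
    {"station": "B", "year": "2010", "temperature": "15.1"},
    {"station": "A", "year": "2011", "temperature": "14.8"},
]

# Filter: keep one station. Download: export only two of the columns.
subset = filter_rows(dataset, station="A")
download = to_csv(subset, ["year", "temperature"])
```

Here the researcher receives two rows and two columns instead of the whole file, which is the behaviour the slide describes.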
Project Phases
Gather Datasets & Use Cases → Specify Workflow → Build platform → Deposit Datasets
Phase 3: Build tools to support the workflow
Project Phases
Gather Datasets & Use Cases → Specify Workflow → Build platform → Deposit Datasets
Phase 4: Test tool using real world data
DATA DEPOSIT - DEMO
VIDEO 1
DATA EXPLORING AND DOWNLOAD - DEMO
VIDEO 2
FIND DATASETS - DEMO
VIDEO 3
Conclusions + Future Work
– Some data management requirements of the researchers at U.Porto have been analysed and addressed
– DSpace has been successfully customized to include data exploration capabilities for tabular data
Future Work
– Gather feedback on the data repository extension from the group of researchers who were interviewed
Thank you João Rocha da Cristina João Correia