1
Research data: lifecycle, plans and planning
CQU Librarians’ workshop, 11 February 2015. Kathryn Unsworth, Data Librarian, ANDS. How and why do we manage data through the data lifecycle? This segment unpacks the difference between a “data management plan” and “planning for data management”. Q&A. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
2
Research data management and lifecycle models… Why?
There must be something in it, as a lot of people have gone to the trouble of creating many different models. Refer to the link on this slide, which takes you to a document covering various lifecycle models. Because research activity often takes place in stages, it can be viewed as forming a lifecycle. Similarly, the activities around research data can be seen as cyclic, going through often iterative phases within the wider research lifecycle. Essentially, the data lifecycle is not a separate or standalone set of activities; it is an embedded part of the research lifecycle. [Data Lifecycle Models and Concepts Version 8]
3
“Data often have a longer lifespan than the research project that creates them. Researchers may continue to work on data after funding has ceased, follow-up projects may analyse or add to the data, and data may be re-used by other researchers. Well organised, well documented, preserved and shared data are invaluable to advance scientific inquiry and to increase opportunities for learning and innovation.”
The life of research data often extends beyond the discrete research project that produced it. Researchers tend to think about their research data within the scope of the project, but the research lifecycle extends past this, through to preservation and re-use of the data by the researcher or by other researchers. Alternatively, the data might end its life via secure disposal (where required) after the required retention period.
4
Some examples of research/data related lifecycle models
So let’s have a look at some examples, many of which you may well have seen before.
5
A simple version of the lifecycle, outlining actions for each of the phases of the cycle.
Processing data: enter data, digitise, transcribe, translate; check, validate, clean data; anonymise data where necessary; describe data; manage and store data
Analysing data: interpret data; derive data; produce research outputs; author publications; prepare data for preservation
Preserving data: migrate data to best format; migrate data to suitable medium; back-up and store data; create metadata and documentation; archive data
Giving access to data: distribute data; share data; control access; establish copyright; promote data
Re-using data: follow-up research; new research; undertake research reviews; scrutinise findings; teach and learn
6
ANDS data curation continuum
Quite a high-level view of the curation process, concentrating mainly on sharing (with collaborators and partners) and the dissemination/publication of research data as a research output. The data curation continuum begins in the private domain, with the creation of research data by a researcher (see Figure 1). There may be a large number of data objects which are updated frequently. At this stage, researchers typically manage their own data. Preservation and metadata may not be needed, and access to the data is limited. At the other end of the continuum is the public domain. There are likely to be a smaller number of selected static data objects which have accrued more metadata, and which may be managed and preserved through institutional arrangements such as repositories. This data is more likely to be publicly accessible, possibly in association with print publications.
7
Research lifecycle - JISC
JISC: Joint Information Systems Committee (UK). This lifecycle model appears to represent the research project and nothing beyond it: there is no mention of preservation or reuse. I do, however, think that conceptualising the research idea, seeking partners, proposal writing and publication are all part of the research process. Any ideas for a new label? The larger blue circle, excepting “research process”, represents activities that are not directly part of the “doing” of research; they are almost administrative tasks.
8
The DCC Curation Lifecycle Model
Most popular or well-known of the research/data related lifecycle models. Note: additional labels and arrows have been added by the presenter to point out the three levels of actions into which the model breaks the lifecycle: full lifecycle actions, sequential actions and occasional actions. My main criticism of this and some other data lifecycle models is that they give the impression that data curation is a separate or extraneous set of activities. I am also not sure about the placement of some of the sequential actions: as soon as a researcher creates or receives data, they usually have to “store” it, otherwise they won’t have data to appraise and select from. With the UK Data Archive model and the DCC Curation Lifecycle Model, we can see the areas of consideration that assist researchers in shaping a data management plan, and assist institutions and funders with wider planning around the management of research data.
9
United States Geological Survey (USGS) Science Data Lifecycle Model
Really like the interactive version on the website. Well worth a look. Good explanations of each of the stages. Criticism: too linear – research is not always this sequential. Process and analyse are greyed out because these phases are seen as being quite unique to a research project and its methodology – domain specific. Like the representation of the full lifecycle actions.
10
OK, so what’s the practical use for these lifecycle models?
11
A lifecycle model can provide the framework that informs planning for the management of data at the funder, institution and researcher levels, and from it researchers are given a clear roadmap for creating their data management plans. *Note: There is a difference between planning for the management of data and Data Management Plans.
12
Data management planning (or planning for the management of data) & Data Management Plans
Is there a difference? Just a quick word on terminology.
13
Let’s look at this visually.
Data management planning is a strategic action undertaken by funders and institutions, with heavy investment by the business units that support research. Data management planning (or planning the management of data) informs service development and delivery. Data management plans, on the other hand, are undertaken at the individual project level, but play a role in informing funder and institutional DM planning initiatives. Planning for the management of data is a strategic endeavour on behalf of funders, institutions and researchers. Whilst Data Management Plans can inform strategic directions, they are more simply the formal documentation of how data will be managed by a researcher or research team over the course of their research project (and hopefully beyond), to improve the efficiency of the research project and to provide appropriate care for the data created, collected or compiled: safekeeping for later use/reuse.
14
To further illustrate practical applications
Let’s take a look at my favourite model. The UCF research lifecycle shows how these models can be used in planning. Here we see a really nice referral map for research support at UCF, including the grey circles marked “activity not yet supported”. Note: This lifecycle represents this university’s current research support status. Your institution’s research lifecycle map would look quite different, particularly in who is providing what service, and when, within the research lifecycle. Your institution may have different gaps, i.e. services not yet supported, or may even provide services not listed in this lifecycle. When planning to manage research data, it’s important to take into account both the research lifecycle and the data lifecycle; the intersections between the two are interdependent and critical to informing all stakeholders of their roles and responsibilities in the context of the whole [research process].
15
Taken from the ANDS website, this table provides an overview of what needs to come together within an institution in order to meet funder requirements and, at the same time, give researchers the path of least resistance to good data management practice. That path runs through the creation of a living data management plan and the planning required to provide appropriate, fit-for-purpose data management services.
16
Data management plans are not mandatory inclusions in grant applications for Australian-funded research YET!
It is worth noting that Data Management Plans are not mandatory yet, but recent changes, with the inclusion of the “Management of Data” section in ARC grant applications, suggest that further change may come. A data management plan is, as the name suggests, a formal document which outlines how data will be managed over the course of a research project, which is quite different to the current ARC application requirements. In brief, the DM plan gives details of what sort of data the project expects to create, collect or compile, and what will be done with it. This might include:
A description of the type of data that will be used and where it will come from: how it will be created, or where it will be obtained from if pre-existing datasets are being used.
How the data will be stored and kept safe during the project.
What plans there are for preserving the data after the end of the project, and for sharing it with other researchers.
17
Group discussion: Lifecycle models: how practical are they?
EXERCISE: Group discussion, 10 minutes. Consider the models you’ve just seen. As a group, let’s discuss your overall impressions and whether you think they have a practical application in your role. Which of these, if any, would you prefer to use when discussing RDM with your library colleagues and/or researchers? Why?
18
This work is licensed under a Creative Commons Attribution 3.0 Australia License. ANDS is supported by the Australian Government through the National Collaborative Research Infrastructure Strategy (NCRIS).