Digital data – integrity and standards


1 Digital data – integrity and standards
Dr Simon Cockell, Bioinformatics Support Unit

2 Generating data

3 Generating big data

4 Problem…
Data doesn’t fit in: (try pasting an informative genome sequence, DIGE gel, array experiment or Excel spreadsheet into this)

5 So where do we store it instead?

6 Why is this a problem?
Is your PC backed up?
Do you check integrity of files?
Back up – daily? USB or ‘real’?
Data integrity – bit rot is a real problem
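
One way to check file integrity is to record a checksum for every file and compare again later; a mismatch means the file has changed, whether through bit rot or an accidental edit. The following is a minimal sketch only, assuming SHA-256 as the checksum and a single data directory. The function names (`sha256_of`, `build_manifest`, `changed_files`) are illustrative and do not come from the slides.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir):
    """Map each file's path (relative to data_dir) to its checksum."""
    root = Path(data_dir)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(data_dir, manifest):
    """List files that are missing or whose checksum no longer matches."""
    root = Path(data_dir)
    problems = []
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(name)
    return problems
```

Build the manifest when the data is first stored, keep it alongside the backup, and rerun `changed_files` periodically. An empty list means every recorded file still matches.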

7 Why is this a problem?
How organised is your electronic life?
How well does it correlate with your lab book?
How do you know which files go with which experiment?
Be organised, and systematic (true for small experiments as well as big – get into the habit).
Need to record what has happened to data, as well as the data itself.

8 Data Management Options
H: drive Lab/Department solution Research Data Warehouse Cloud Repository

9 Repositories
Functional genomics (transcriptomics, ChIP, epigenomics etc.): ArrayExpress, Gene Expression Omnibus
Proteomics: PRIDE (EBI)
Next generation sequencing data: SRA (NCBI), ENA (EBI)

10 Metadata
Data repositories not simple data dump
Also require metadata
Data about data
Good metadata describes experimental setup
Enables reproducibility
Recording as you go makes life easier
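
Recording metadata as you go can be as simple as writing a small structured file next to each data file. The sketch below assumes a JSON "sidecar" file. The function name `write_metadata` and every field name in the usage example are hypothetical placeholders, not a repository standard; each repository defines its own required fields.

```python
import json
from datetime import date
from pathlib import Path


def write_metadata(data_file, metadata):
    """Write metadata as JSON next to the data file; return the sidecar path."""
    data_path = Path(data_file)
    sidecar = data_path.with_name(data_path.name + ".meta.json")
    # Start with fields that can be filled in automatically,
    # then add whatever the experimenter supplies.
    record = {
        "data_file": data_path.name,
        "recorded_on": date.today().isoformat(),
    }
    record.update(metadata)
    sidecar.write_text(json.dumps(record, indent=2, sort_keys=True))
    return sidecar
```

A call such as `write_metadata("array_01.csv", {"experiment_id": "EXP-001", "lab_book_page": 42})` (illustrative values) would create `array_01.csv.meta.json`, tying the file back to an experiment and a lab book entry.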

11 There is another issue
Big data - scientific data - is EXPENSIVE to generate
It makes sense to get the most value out of it
Your funding bodies know this
Often your funding is from public money
You may think you have ownership of this data
But do you? Or the University?

12 Increasing pressure to share data
BBSRC expects research data generated as a result of BBSRC support to be made available with as few restrictions as possible in a timely and responsible manner to the scientific community for subsequent research. Applicants should make use of existing standards for data collection and management and make data available through existing community resources or databases where possible.

13 Increasing pressure to share data
The MRC expects valuable data arising from MRC-funded research to be made available to the scientific community with as few restrictions as possible so as to maximize the value of the data for research and for eventual patient and public benefit. Such data must be shared in a timely and responsible manner.

14 To sum up
Back up your machine! USB ‘flash’ drives do not count
Be as organised with your digital data as you are with your lab books
Be aware of the expectations for releasing data
Capturing good metadata will make many things easier (including writing up!)
<shameless plug>Want to talk about how best to analyse and store your digital data? Come talk to us!</shameless plug>

15 Digital Data – Integrity and Standards
Simon Cockell bsu.ncl.ac.uk @sjcockell

