Archiving electronic data: example from the NPEU
Peter Brocklehurst
National Perinatal Epidemiology Unit, Oxford
NPEU
National Perinatal Epidemiology Unit
Established 1978 by the Department of Health
Remit: Our aim is to conduct research with a view to providing information that can improve the health and welfare of babies, mothers and their families and promote the effective use of resources in the perinatal health services.
Perinatal research Pregnancy Newborn
ECMO
Projects Randomised controlled trials Disease registers Observational studies
NPEU trials
Pregnancy
–3rd trimester ultrasound
–Fetal movement
–Cervical cerclage trial
–Dublin Fetal Heart Rate Monitoring Trial
–Collaborative eclampsia trial
–BLASP
–HOOP
–Antenatal TRH trial
–TEAMS
–APPLE
–PEACH
–CAESAR
–CORONIS
–INFANT
Newborn
–ECMO
–Dexamethasone trial
–Ethamsylate
–PHVD
–OSIRIS
–INIS
–PROGRAMS
–TOBY
–ADEPT
–NEST
–BOOST-II UK
–PREFER
–Xenon
Birthplace
To compare outcomes of births planned at home, in different types of midwifery units, and in hospital units with obstetric services
Prospective cohort study of 60,000 births
Perinatal interventions
Unique population
Rapidly developing organism
–Fetus
–Preterm infant
–Term infant
Interventions (particularly drugs) can have long-term effects which are unpredictable
Rare adverse outcomes
Thalidomide
Diethylstilboestrol
NPEU data Data are stored indefinitely –always fit for purpose because of unpredictable need for long-term follow-up includes identifiers
NPEU archiving
All electronic datasets have been archived (from 1978)
Complemented by archiving of paper documents
–data collection forms, protocol, “print-outs”, published papers etc.
–library of existing datasets
Electronic archiving
Challenges
–Migration
–Anonymisation
–Access
  Ownership
  Sharing
–Documentation
Migration
113 datasets ‘migrated’ from the University mainframe to Windows platform
–mainly SPSS (SAS) files recreated as portable file format
–plus original data in text files
–three months' work by senior programmer
–ongoing need to ensure that new versions of software will read previous versions
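One way to support the ongoing readability check for the text-file copies is to record a small manifest entry for each file at migration time and re-check it later. The sketch below is an illustration only, not the NPEU procedure: the manifest idea, the function names `describe_text_file` and `verify`, and the assumption that the text files are simple comma-separated UTF-8 are all hypothetical.

```python
import csv
import hashlib
from pathlib import Path


def describe_text_file(path: Path) -> dict:
    """Build a manifest entry: checksum plus row and column counts.

    Assumes a simple comma-separated UTF-8 file with no line breaks
    inside quoted fields (an assumption made for this sketch).
    """
    data = path.read_bytes()
    rows = list(csv.reader(data.decode("utf-8").splitlines()))
    return {
        "file": path.name,
        "sha256": hashlib.sha256(data).hexdigest(),
        "rows": len(rows),                      # includes any header row
        "columns": len(rows[0]) if rows else 0,
    }


def verify(path: Path, expected: dict) -> bool:
    """Return True if the file still parses and matches its manifest entry."""
    return describe_text_file(path) == expected
```

A manifest entry would be stored alongside the archive when a dataset is migrated; running `verify` after a software or platform change flags any file that has changed or no longer parses the same way. It does not replace opening the SPSS or SAS files in the new software version.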
Anonymisation
Separation of identifiers and study data into separate files
Making data available for sharing
–dates and times (e.g. duration of labour, length of hospital stay etc.)
–hospital names/codes
–multiple pregnancy (triplets, quads etc.)
–sensitive data
  previous TOPs
  HIV status
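The separation of identifiers from study data, and the replacement of dates and times by derived durations, can be sketched as follows. This is a minimal illustration under stated assumptions: the field names, the timestamp format and the `study_id` linkage key are hypothetical, and treating the hospital name as an identifier is one possible choice rather than a documented NPEU rule.

```python
from datetime import datetime
from typing import Tuple

# Hypothetical field names, chosen only for this illustration.
IDENTIFIER_FIELDS = ("name", "nhs_number", "hospital_name")
TIMESTAMP_FIELDS = ("labour_start", "delivery_time")
DATE_FORMAT = "%Y-%m-%d %H:%M"


def hours_between(start: str, end: str) -> float:
    """Return the elapsed time between two timestamps, in hours."""
    delta = datetime.strptime(end, DATE_FORMAT) - datetime.strptime(start, DATE_FORMAT)
    return delta.total_seconds() / 3600


def split_record(record: dict, study_id: str) -> Tuple[dict, dict]:
    """Split one record into an identifier row and a shareable study row.

    Both rows carry the same study_id, so they can be re-linked only by
    someone who holds both files.
    """
    identifiers = {"study_id": study_id}
    study = {"study_id": study_id}
    for field, value in record.items():
        if field in IDENTIFIER_FIELDS:
            identifiers[field] = value
        elif field not in TIMESTAMP_FIELDS:
            study[field] = value
    # Replace the two raw timestamps with a derived duration.
    study["labour_hours"] = hours_between(
        record["labour_start"], record["delivery_time"]
    )
    return identifiers, study


# Made-up example record.
record = {
    "name": "Example Mother",
    "nhs_number": "0000000000",
    "hospital_name": "Example Hospital",
    "labour_start": "2001-03-04 22:30",
    "delivery_time": "2001-03-05 06:00",
    "birthweight_g": 3400,
}
identifiers, study = split_record(record, study_id="S0001")
```

The study row keeps the clinically useful quantity (duration of labour) without the calendar dates, while the identifier row would be stored separately under tighter access. Multiple pregnancies and sensitive variables need their own rules, which this sketch does not attempt.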
Access
Security
Access for research
Security
Limited access to data files
–Passwords (changed regularly)
–SOPs for gaining access specify
  duration of access
  purpose of access
  permission from Director or Chief Investigator
  emergency access
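The items an SOP specifies for gaining access could be captured as a structured record, so that expiry and missing approvals can be checked automatically. The sketch below is hypothetical: the class `AccessGrant`, its field names and the `validate` rules are assumptions for illustration, not the NPEU's actual SOP.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass(frozen=True)
class AccessGrant:
    """One time-limited permission to read an archived dataset."""
    user: str
    dataset: str
    purpose: str            # purpose of access
    approved_by: str        # e.g. Director or Chief Investigator
    start: date
    duration_days: int      # duration of access
    emergency: bool = False # emergency access, approved retrospectively

    @property
    def expires(self) -> date:
        """First day on which access is no longer valid."""
        return self.start + timedelta(days=self.duration_days)

    def is_active(self, today: date) -> bool:
        """True from the start date up to, but not including, expiry."""
        return self.start <= today < self.expires


def validate(grant: AccessGrant) -> List[str]:
    """Return a list of problems; an empty list means the grant is complete."""
    problems = []
    if not grant.purpose.strip():
        problems.append("purpose of access is missing")
    if grant.duration_days <= 0:
        problems.append("duration of access must be positive")
    if not grant.approved_by.strip() and not grant.emergency:
        problems.append("permission is missing for non-emergency access")
    return problems
```

Under these assumptions, a grant starting on 1 January with `duration_days=30` is active through 30 January and expires on 31 January. An emergency grant is allowed to have no approver recorded at the time it is created.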
Access for research
Other researchers
–Re-analysis
–Sub-group analysis
–Sample size estimation
–Individual patient meta-analysis
Access for research
Ownership
–Permissions
  ?Chief Investigator
  ?Steering Committee
  ?Funder
  ?Data guardian
Access for research
Sharing
–What information is needed to give permission?
  ?Study protocol
  ?Ethics approval
  ?Confidentiality agreement
–Process of sharing
  What variables?
  Funding for work to extract dataset/anonymise
–Transfer
  Data transfer regulations/legislation, e.g. outside EU
Documentation
Crucial activity at time of archiving
–by people who have handled data/undertaken analysis
–standardised
Requires adequate resources
–archiving usually occurs at end of project, often after project-funded staff have left
Wish list
Central University/National storage facility
Guidelines (preferably national) setting standards for:
–documentation
–format
–access
–sharing