research-driven data standards
CIMI, 11th April 2013
patient records
1. clinician’s notes for self or colleagues, for communication or justification
2. notifications and summary reports against standard data sets
3. detailed record of diagnosis, treatment, outcomes, and follow-up for translational research and service improvement
clinical studies
- ensure consistency between observations made by different people in different settings
- observers are trained to follow a single protocol, compiling the same, sequential record of observations for each participant
- observations are structured and coded for subsequent analysis, and reviewed for quality and consistency
patient records: cancer
date of referral, an agreed diagnosis, pathology and imaging data, chemotherapy prescriptions, and notes of consultations
but not (consistently): risk factors, comorbidities, adverse reactions, disease progression, recurrence, response, and quality of life
meta-analysis
“...the drug Tamoxifen, an oestrogen blocker that may prevent breast cancer cells growing, was the object of forty-two studies world-wide, of which only four or five had shown significant benefits. But this did not mean that Tamoxifen did not protect against breast cancer. When we put all the studies together it was blindingly obvious that it does...”
Richard Gray
this can work
Early Breast Cancer Trialists’ Collaborative Group (1983)
- 100s of participating institutions worldwide
- consensus on 30 variables
- analysis of data every 5 years
- computable data on 200,000 cases (by 2007)
but mostly it doesn’t
systematic review of TP53 and platinum response (2005)
- 75 clinical studies, 8331 patients
- no conclusions could be drawn
- most of the study metadata was missing
- insufficient immunohistochemistry detail
after the fact
problem
- data is collected to different definitions in different locations
- much of the information about definitions is not recorded
- even when it is, the definitions often turn out to be incompatible
solution
- create candidate data models for key therapeutic areas
- create semantic metadata to describe data sources and data standards
- publish semantic metadata to support
  – harmonisation of existing data
  – standardisation of clinical practice
semantic metadata
- linked data instances, models, and metamodels
- partial, extensible descriptions of context and intended interpretation
- components of documents, forms, and database schemas
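A minimal sketch of what a partial, extensible description of one data item might look like, and how two such descriptions could be compared before attempting harmonisation. Everything here is an assumption made for illustration: the `DataElement` class, the `incompatibilities` function, the site names, and the example values are hypothetical and are not taken from any published standard or from the projects described in these slides.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DataElement:
    """Hypothetical description of one item on a form or in a database schema."""
    name: str
    source: str                          # the form or schema this item belongs to
    datatype: str
    unit: Optional[str] = None
    permitted_values: frozenset = frozenset()
    # Partial, extensible context: any key may be absent from a description.
    context: dict = field(default_factory=dict)


def incompatibilities(a: DataElement, b: DataElement) -> list:
    """List the ways in which two descriptions of the 'same' item disagree."""
    problems = []
    if a.datatype != b.datatype:
        problems.append(f"datatype: {a.datatype} vs {b.datatype}")
    if a.unit != b.unit:
        problems.append(f"unit: {a.unit} vs {b.unit}")
    if a.permitted_values and b.permitted_values and a.permitted_values != b.permitted_values:
        only_a = sorted(a.permitted_values - b.permitted_values)
        only_b = sorted(b.permitted_values - a.permitted_values)
        problems.append(
            f"permitted values differ: only in {a.source}: {only_a}; "
            f"only in {b.source}: {only_b}"
        )
    # Descriptions are partial, so compare only the context keys both record.
    for key in sorted(a.context.keys() & b.context.keys()):
        if a.context[key] != b.context[key]:
            problems.append(f"context[{key}]: {a.context[key]} vs {b.context[key]}")
    return problems


# Hypothetical example: two sites record "response" against different code lists.
site_a = DataElement(
    name="response", source="site_a_form", datatype="code",
    permitted_values=frozenset({"complete", "partial", "none"}),
    context={"assessed_by": "imaging"},
)
site_b = DataElement(
    name="response", source="site_b_db", datatype="code",
    permitted_values=frozenset({"complete", "partial", "stable", "progressive"}),
    context={"assessed_by": "imaging", "timepoint": "end of treatment"},
)

for problem in incompatibilities(site_a, site_b):
    print(problem)
```

The comparison deliberately ignores context keys that only one description records: a missing key means "not stated", which is different from "incompatible".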
example: stratified medicines
improve access to molecular testing for cancer patients, while capturing genetic data and comparing it to patient outcomes
CR UK programme for cancer: 9,000 patients across 6 tumour types, 21 clinical sites, 3 labs, 14 genes (in Phase 1)
Cancer Outcomes and Services Dataset
dataset
also
question
- we need more detailed information, at a much higher quality
- we need comparable information about millions of people
- how can we make our data acquisition and curation processes scalable?
answer
- open, linked metadata standards, describing the context of data acquisition, processing, and use
- data tools whose behaviour is driven by linked metadata, but which also create and maintain linked data automatically
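One way to picture a tool whose behaviour is driven by metadata is a checker that contains no field-specific logic of its own: it reads the field definitions, applies them to each record, and links every accepted value back to the definition it was checked against. The sketch below is illustrative only. The `METADATA` table, the `example:` identifiers, the tumour-type code list, and the `check_record` function are all hypothetical.

```python
from datetime import date

# Hypothetical metadata: each entry describes one field the tool should collect.
METADATA = {
    "date_of_referral": {
        "id": "example:date-of-referral", "type": date, "required": True,
    },
    "tumour_type": {
        "id": "example:tumour-type", "type": str, "required": True,
        "permitted": {"breast", "lung", "colorectal"},  # illustrative code list
    },
    "quality_of_life_score": {
        "id": "example:quality-of-life-score", "type": int, "required": False,
    },
}


def check_record(record, metadata):
    """Check a record against the metadata.

    Returns (problems, links), where links maps each accepted field name
    to the identifier of the definition it was checked against.
    """
    problems, links = [], {}
    for name, rule in metadata.items():
        value = record.get(name)
        if value is None:
            if rule.get("required"):
                problems.append(f"{name}: missing")
            continue
        if not isinstance(value, rule["type"]):
            problems.append(f"{name}: expected {rule['type'].__name__}")
            continue
        if "permitted" in rule and value not in rule["permitted"]:
            problems.append(f"{name}: {value!r} not permitted")
            continue
        links[name] = rule["id"]
    # Values with no definition cannot be interpreted later, so report them.
    for name in sorted(record.keys() - metadata.keys()):
        problems.append(f"{name}: no definition")
    return problems, links


problems, links = check_record(
    {"date_of_referral": date(2013, 4, 11), "tumour_type": "skin"},
    METADATA,
)
print(problems)
print(links)
```

Changing what the tool collects or accepts then means publishing new metadata rather than rewriting the tool, and the returned links are a small example of linked data being created as a by-product of normal use.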
but also
- patient-reported outcomes
- patient-supplied data
- patient-managed data
- patient- (and carer-) engagement
oxford
oxford and cambridge
- Oxford has Cerner (and more than one hundred other systems)
- Cambridge has Epic (or will have, at some point in the next few years)
- we want to conduct collaborative research across the two institutions
integrated record
challenges
- standardisation: data needs to be collected in a consistent, computable fashion
- adaptation: context, systems, and requirements will change
- motivation: we are asking people for more information, and they should derive some benefit from providing it