Comparing the multiple sources of cancer treatment data (no acronym for the title yet). Sean McPhail, Sam Johnson, Daniela Tataru, Margreet Lüchtenborg, Sally Vernon, Michael Peake (with thanks to James Thomas)
Background: We are reasonably good at counting tumours and deaths, and at calculating survival. Historically we have been less good at counting treatments. Is that still true?
Treatment: Surgery; Radiotherapy (RT); Chemotherapy (CT)
5 treatment data sources: Inpatient Hospital Episode Statistics (HES); Cancer Waiting Times (CWT); Radiotherapy Data Set (RTDS); Systemic Anti-Cancer Therapy (SACT); Cancer Registration
Motivation: Can we measure the fraction of patients who are treated with each modality in each of these datasets? Are these sources consistent with one another? Can we use the most complete of these sources, or a combination, to assess the overall level of cancer treatment?
Methods: Look at: invasive tumours, excluding non-melanoma skin cancer; persons diagnosed 2012-10-01 to 2013-03-31; treatment within -30 to +182 days of diagnosis. Concentrate on: fact of treatment; trust of first treatment; date of first treatment.
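The treatment window in the Methods can be sketched in Python as below. This is an illustration only, not the authors' code: the function names are hypothetical, and treating both ends of the window as inclusive is an assumption the slide does not state.

```python
from datetime import date

# Window from the Methods slide: 30 days before to 182 days after diagnosis.
# ASSUMPTION: both boundaries are inclusive (the slide does not say).
WINDOW_START_DAYS = -30
WINDOW_END_DAYS = 182


def in_treatment_window(diagnosis: date, treatment: date) -> bool:
    """True if the treatment date falls inside the window around diagnosis."""
    offset = (treatment - diagnosis).days
    return WINDOW_START_DAYS <= offset <= WINDOW_END_DAYS


def first_treatment_in_window(diagnosis: date, treatment_dates):
    """Earliest treatment date inside the window, or None if there is none."""
    eligible = [d for d in treatment_dates if in_treatment_window(diagnosis, d)]
    return min(eligible) if eligible else None
```

A tumour would then count as treated ("fact of treatment") when `first_treatment_in_window` returns a date rather than `None`.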
The group of tumours under discussion: 143,132 tumours in 6 months; 43,073 CTs; 30,927 RTs.
Linking these tumours to treatments
Fraction of tumours treated by CT and RT
Source                               CT               RT
Registry                             24.8%            18.3%
Hospital Episode Statistics (HES)    22.4%            3.4%
Cancer Waiting Times (CWT)           22.9%            18.6%
SACT (CT) / RTDS (RT)                21.4%            20.2%
All of the above                     30.1% (43,073)   21.6% (30,927)
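The "all of the above" figures are the share of tumours with a treatment recorded in at least one source. The arithmetic can be checked from the counts given in the slides; the sketch below is illustrative, and the per-tumour record layout in `fraction_any_source` is a hypothetical one, not the structure of the real datasets.

```python
TOTAL_TUMOURS = 143_132          # tumours in the six-month cohort (from the slides)
TREATED_ANY_SOURCE = {"CT": 43_073, "RT": 30_927}


def percent_treated(n_treated: int, n_tumours: int) -> float:
    """Percentage of tumours treated, rounded to one decimal place."""
    return round(100 * n_treated / n_tumours, 1)


def fraction_any_source(records, sources):
    """Fraction of records flagged as treated in at least one of the sources.

    HYPOTHETICAL layout: each record is a dict mapping source name -> bool.
    """
    hits = sum(1 for r in records if any(r.get(s, False) for s in sources))
    return hits / len(records)


for modality, count in TREATED_ANY_SOURCE.items():
    print(modality, percent_treated(count, TOTAL_TUMOURS))
```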
Dividing up a dataset with Venn diagrams: Of the people in this room, how many have been to London and/or Dublin this year? (Two-way Venn diagram: London, Dublin; region counts 30, 20, 20)
2-way and 3-way Venn diagrams
Distribution of 43,073 chemotherapies (four-way Venn diagram: HES, CWT, Registry, SACT)
Distribution of 43,073 chemotherapies (four-way Venn diagram: HES, CWT, Registry, SACT). Region percentages: 3%, 2%, 1%, 4%, 1%, 11%, 4%, 4%, 4%, 43%, 4%, 2%, 10%, 6%, 1%
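A four-way Venn diagram has 2^4 - 1 = 15 non-empty regions, one for each combination of sources that can record a treatment. One way to tabulate such regions is sketched below; the input layout (one set of source names per treated tumour) and the example data are hypothetical, chosen only to show the counting.

```python
from collections import Counter

SOURCES = ("HES", "CWT", "Registry", "SACT")


def venn_regions(treatment_sources):
    """Percentage of treated tumours in each Venn region.

    treatment_sources: iterable of sets of source names, one per treated
    tumour (HYPOTHETICAL layout). Each distinct combination of sources is
    one region of the diagram.
    """
    counts = Counter(frozenset(s) for s in treatment_sources)
    total = sum(counts.values())
    return {region: 100 * n / total for region, n in counts.items()}


# Made-up example: four treated tumours.
example = [
    {"HES", "SACT"},
    {"SACT"},
    {"HES", "SACT"},
    {"HES", "CWT", "Registry", "SACT"},
]
for region, pct in venn_regions(example).items():
    print(sorted(region), pct)
```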
Are chemotherapy datasets consistent? Trust matches; date matches (exact)
Are chemotherapy datasets consistent? Trust matches; date matches (30 day)
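The two date comparisons, exact and within 30 days, differ only in the tolerance allowed between the first-treatment dates from two sources. A minimal sketch follows; the function name is hypothetical, and reading "30 day" as a symmetric, inclusive tolerance is an assumption.

```python
from datetime import date


def dates_match(a: date, b: date, tolerance_days: int = 0) -> bool:
    """True if two first-treatment dates agree within the given tolerance.

    tolerance_days=0 is an exact match. ASSUMPTION: the 30-day match is
    symmetric (either source may be earlier) and inclusive of day 30.
    """
    return abs((a - b).days) <= tolerance_days


sact_date = date(2013, 1, 10)   # made-up dates for illustration
cwt_date = date(2013, 1, 25)
print(dates_match(sact_date, cwt_date))                      # exact comparison
print(dates_match(sact_date, cwt_date, tolerance_days=30))   # 30-day comparison
```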
Distribution of 30,927 radiotherapies (four-way Venn diagram: HES, CWT, Registry, RTDS)
Distribution of 30,927 radiotherapies (four-way Venn diagram: HES, CWT, Registry, RTDS). Region percentages: 1%, 1%, 0%, 1%, 11%, 0%, 1%, 2%, 5%, 11%, 1%, 1%, 61%, 1%, 3%
Are radiotherapy datasets consistent? Trust matches; date matches (exact)
Are radiotherapy datasets consistent? Trust matches; date matches (30 day)
The overall level of cancer treatment? If the data sources were independent, and capture of treatment data was also at random, then a flattening of cumulative completeness would imply complete data. But: the registry feeds from CWT and HES; and there are systematic effects (e.g. private care, primary chemo, local issues, other reasons).
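"Cumulative completeness" can be read as the number of distinct treated tumours found as sources are added one at a time; if the curve stops rising, later sources are adding nothing new. The sketch below illustrates that reading only. The function name, the tumour identifiers, and the ordering of sources are all hypothetical, and a flat curve says nothing about systematically missing treatments.

```python
def cumulative_completeness(ids_by_source, order):
    """Running count of distinct treated tumours as sources are added.

    ids_by_source: dict of source name -> set of tumour IDs with a recorded
    treatment (HYPOTHETICAL layout). order: sequence of source names.
    """
    seen = set()
    curve = []
    for source in order:
        seen |= ids_by_source[source]
        curve.append((source, len(seen)))
    return curve


# Made-up IDs: the count flattens after the second source.
ids_by_source = {
    "Registry": {1, 2, 3, 4},
    "SACT": {3, 4, 5},
    "CWT": {1, 5},
    "HES": {2, 5},
}
print(cumulative_completeness(ids_by_source, ["Registry", "SACT", "CWT", "HES"]))
```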
But which datasets do we believe? Combining all sources: CT 30.1%, RT 21.6%. Insist on dual-source confirmation: CT 26.4%, RT 19.7%. Accept single-source Registry and SACT/RTDS, but only accept HES with CWT: CT 28.7%, RT 21.3%. Lazy solution, Registry and SACT/RTDS only: CT 28.2%, RT 21.3%.
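The four acceptance rules can be written as predicates over the set of sources that record a treatment for a tumour. This is a sketch under stated assumptions, not the authors' definitions: the function names are hypothetical, `dedicated` stands for SACT (chemotherapy) or RTDS (radiotherapy), and the third rule is read as "HES and CWT together count, either alone does not", which the slide leaves open for CWT on its own.

```python
def accept_all(sources: set) -> bool:
    """Combining all sources: any single record counts."""
    return len(sources) > 0


def accept_dual_source(sources: set) -> bool:
    """Insist on confirmation by at least two sources."""
    return len(sources) >= 2


def accept_hes_only_with_cwt(sources: set, dedicated: str) -> bool:
    """Registry or the dedicated dataset alone is enough.

    ASSUMPTION: otherwise HES and CWT must both be present; CWT alone is
    treated as insufficient, which the slide does not spell out.
    """
    if "Registry" in sources or dedicated in sources:
        return True
    return {"HES", "CWT"} <= sources


def accept_lazy(sources: set, dedicated: str) -> bool:
    """Lazy solution: look at Registry and the dedicated dataset only."""
    return "Registry" in sources or dedicated in sources


# Made-up example: a chemotherapy seen only in HES and CWT.
seen_in = {"HES", "CWT"}
print(accept_all(seen_in), accept_dual_source(seen_in),
      accept_hes_only_with_cwt(seen_in, "SACT"), accept_lazy(seen_in, "SACT"))
```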
Summary: Can we measure completeness in each? Yes. Are they consistent? Yes: good agreement on both trust of diagnosis and treatment period (except RT for HES), so we appear to be matching clinical treatments. Can we define an overall cohort? Yes (-ish): CT nearly 30%, RT just over 20%. We think we have eliminated much of the "random" missing data; "systematically" missing data may still be missing. Some clinical input and additional work are needed to finalise definitions.
Thank you for listening! sean.mcphail@phe.gov.uk