TCGA Clinical Data Analysis of Sources of Error Mary E. Edgerton, MD, PhD Department of Pathology UT MD Anderson Cancer Center October 7, 2010.


1 TCGA Clinical Data Analysis of Sources of Error Mary E. Edgerton, MD, PhD Department of Pathology UT MD Anderson Cancer Center October 7, 2010

2 Ship Date Distribution by Site
Clinical Aliquot MEE

3 Tissue Type as a Function of Ship Date-No Biased Relationship
Aliquot_MEE

4 Ship Date for Samples Receiving Targeted Therapy from Different Institutions
Clinical_drug_OC_3

5 Days to Death Should be Date of Last Follow-up: Scatter Plot of Differences by Institution

6 Difference between Age entry in years and DAYSTOBIRTH converted to years by Institution
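The comparison on this slide can be sketched as follows. This is a minimal illustration and not the analysis actually used for the plot: it assumes DAYSTOBIRTH is a day count relative to diagnosis (often stored as a negative number), that 365.25 days per year is an acceptable conversion, and that a one-year tolerance is reasonable. The record keys and sample values are hypothetical.

```python
DAYS_PER_YEAR = 365.25  # assumption: average year length including leap years


def age_discrepancy_years(age_years, days_to_birth):
    """Return recorded age minus the age derived from DAYSTOBIRTH, in years.

    abs() is used so the sketch works whether the day count is stored
    as a negative or a positive number.
    """
    derived_years = abs(days_to_birth) / DAYS_PER_YEAR
    return age_years - derived_years


def flag_age_mismatches(records, tolerance_years=1.0):
    """Return (patient_id, discrepancy) pairs that exceed the tolerance."""
    flagged = []
    for rec in records:
        if rec["age"] is None or rec["days_to_birth"] is None:
            continue  # nothing to compare
        diff = age_discrepancy_years(rec["age"], rec["days_to_birth"])
        if abs(diff) > tolerance_years:
            flagged.append((rec["patient_id"], diff))
    return flagged


# Hypothetical records for illustration only.
records = [
    {"patient_id": "P1", "age": 60, "days_to_birth": -22000},
    {"patient_id": "P2", "age": 45, "days_to_birth": -22000},
]
mismatches = flag_age_mismatches(records)
```

Grouping the flagged pairs by institution would give the per-site view described in the slide title.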

7 DAYSTODEATH Relationship to VITALSTATUS
All DECEASED values have corresponding DAYSTODEATH entries.
LIVING entries all have null DAYSTODEATH (correct).
There are 18 null entries for vital status, one of which has a DAYSTODEATH value.
T/C QC checks or installed procedures to normalize db entries: a non-null value of days to death should trigger a vital status change.
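A QC check of the kind suggested here might look like the sketch below. It assumes each record is a dictionary keyed by the field names on the slide and that null is represented as None; the function name and messages are illustrative, not part of any TCGA tooling.

```python
def check_vital_status(record):
    """Return a description of the inconsistency, or None if the record is consistent."""
    status = record.get("VITALSTATUS")
    days = record.get("DAYSTODEATH")

    if status == "DECEASED" and days is None:
        return "DECEASED without DAYSTODEATH"
    if status == "LIVING" and days is not None:
        return "LIVING with DAYSTODEATH"
    if status is None and days is not None:
        # The case found on the slide: a death date with no vital status.
        return "null VITALSTATUS with DAYSTODEATH; status should be DECEASED"
    return None


# Hypothetical records for illustration only.
ok = check_vital_status({"VITALSTATUS": "DECEASED", "DAYSTODEATH": 812})
bad = check_vital_status({"VITALSTATUS": None, "DAYSTODEATH": 430})
```

Run at data-entry time, the third rule is what would trigger the vital status change instead of leaving the field null.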

8 DAYSTOTUMORPROGRESSION should be orthogonal to DAYSTOTUMORRECURRENCE if progression is defined as occurring during therapy and recurrence as occurring after therapy

9 Mismatch of DAYSTOTUMORPROGRESSION with SITEOFTUMORFIRSTRECURRENCE
Site is METASTASIS
Has entry DAYSTOTUMORPROGRESSION with SITEOFTUMORFIRSTRECURRENCE
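The two consistency rules implied by these slides can be sketched as a record-level check. This assumes the definition stated above (progression during therapy, recurrence after therapy), treats null as None, and uses illustrative messages; whether a recurrence site alongside a progression date is truly an error depends on the definition TCGA adopts.

```python
def progression_recurrence_issues(record):
    """Return a list of rule violations for one clinical record (a dict)."""
    issues = []
    has_progression = record.get("DAYSTOTUMORPROGRESSION") is not None
    has_recurrence = record.get("DAYSTOTUMORRECURRENCE") is not None
    has_recurrence_site = record.get("SITEOFTUMORFIRSTRECURRENCE") is not None

    # Rule 1: under the stated definition the two dates are mutually exclusive.
    if has_progression and has_recurrence:
        issues.append("both progression and recurrence dates present")

    # Rule 2: a recurrence site should accompany a recurrence date,
    # not a progression date.
    if has_progression and has_recurrence_site and not has_recurrence:
        issues.append("recurrence site recorded with a progression date only")

    return issues


# Hypothetical record for illustration only.
example = progression_recurrence_issues({
    "DAYSTOTUMORPROGRESSION": 210,
    "DAYSTOTUMORRECURRENCE": None,
    "SITEOFTUMORFIRSTRECURRENCE": "METASTASIS",
})
```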

10 Mismatch by Institution

11 Need to better define Progression and Recurrence
While different disciplines may view these differently, TCGA needs to determine a single definition to use across sites and install DB checks to ensure quality control.

12 TUMORRESIDUALDISEASE and PRIMARYTHERAPYOUTCOMES
Null, COMPLETE RESPONSE, PARTIAL RESPONSE, PROGRESSIVE DISEASE, and STABLE DISEASE all have as entries: No Macroscopic Disease, 1-10, 11-20, >20.

13 Residual Disease as a Function of Therapy Outcomes
[Figure panels: CR, PR, SD, PD]

14 Consider
Take TUMORRESIDUALDISEASE to be a measure after surgery and PRIMARYTHERAPYOUTCOMESUCCESS to be the response after adjuvant therapy or after additional therapy (complete response, partial response, stable disease, or progression).
PR should never start with No Macroscopic Disease, as anything with NMD that developed disease would be Progressive Disease (PD).

15 Consider
Both TUMORRESIDUALDISEASE and PRIMARYTHERAPYOUTCOME are measures after surgery.
Complete Response (CR) should only have entries of No Macroscopic Disease (NMD).

16 Most common definition
TUMORRESIDUALDISEASE in mm would be after primary surgery and PRIMARYTHERAPYOUTCOME would be after chemotherapy or after additional therapy.
CR after chemo can have any starting TUMORRESIDUALDISEASE.
PR and Stable Disease (SD) would never have NMD as a starting point.
Given these rules, there are incorrect entries.
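The rule set above can be expressed as a small check that counts violations per institution. This is only a sketch of the stated rules, not the procedure used to produce the counts in the presentation; the record layout, the "institution" key, and the sample rows are hypothetical.

```python
from collections import Counter

NMD = "No Macroscopic Disease"
# Outcomes that, under the rules above, can never start from NMD.
NEVER_FROM_NMD = {"PARTIAL RESPONSE", "STABLE DISEASE"}


def is_incorrect(record):
    """True if the record pairs NMD with PR or SD."""
    return (
        record.get("TUMORRESIDUALDISEASE") == NMD
        and record.get("PRIMARYTHERAPYOUTCOME") in NEVER_FROM_NMD
    )


def incorrect_by_institution(records):
    """Count rule violations per institution."""
    return Counter(r["institution"] for r in records if is_incorrect(r))


# Hypothetical rows for illustration only.
records = [
    {"institution": "A", "TUMORRESIDUALDISEASE": NMD, "PRIMARYTHERAPYOUTCOME": "PARTIAL RESPONSE"},
    {"institution": "A", "TUMORRESIDUALDISEASE": NMD, "PRIMARYTHERAPYOUTCOME": "COMPLETE RESPONSE"},
    {"institution": "B", "TUMORRESIDUALDISEASE": "1-10", "PRIMARYTHERAPYOUTCOME": "STABLE DISEASE"},
    {"institution": "B", "TUMORRESIDUALDISEASE": NMD, "PRIMARYTHERAPYOUTCOME": "STABLE DISEASE"},
]
counts = incorrect_by_institution(records)
```

CR and PD are deliberately not flagged, since the rules allow CR from any starting residual disease and PD from NMD.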

17 Incorrect entries based on my definitions
[Chart: Number of Incorrect Entries by Institution. Labels recovered from the chart: 4, 20, 61, 13, 30, 1, 5]

18 PERSONNEOPLASMCANCERSTATUS
Is this at the end of primary therapy, as of last follow-up date, or after additional therapy?
Does not appear to correlate with at least end of primary therapy or as of last follow-up date, e.g. a patient with CR is WITH TUMOR, a patient with recurrence is TUMOR FREE.
Suggests that this is not computed from follow-up information but is an independent entry.
Db lacks normalization or internal QC.

19 Drug Therapy
Patient clinical data file (clinical_patient_public_OV.txt) has different therapy choices from drug file (clinical_drug_public_OV.txt), such that targeted therapy becomes other drugs.
There is nothing about salvage therapy vs primary therapy in clinical_drug_public_OV.txt.
Should Regimen Indication be split into two, so that Regimen has values Neoadjuvant, Adjuvant, and Salvage while Indication has values Primary Diagnosis, Progression, and Recurrence?
Can DAYSTODRUGTREATMENT be used to define this as salvage, etc.?
What is the meaning of INITIALCOURSE? All of the entries are null.

20 Adjuvant Therapies: RADIATION, CHEMOTHERAPY, IMMUNOTHERAPY, and TARGETEDMOLECULARTHERAPY
Database normalization queries: Are the entries seen in the clinical_drug_public.txt files generated from the same table elements as clinical_patient_public (i.e. is the database normalized)?
Definition issues: Within clinical_drug_public.txt there are entries that do not match NCI definitions for special therapies, e.g. Pt 1666 received OvaRex (oregovomab), an antibody targeted against CA-125, and this was entered as immunotherapy. This does not match the NCI definition of immunotherapy, in which the patient’s immune system is boosted, but matches the definition for targeted therapy.
How are these entries being QC’d?

21 Tumor Residual Disease
Kaplan-Meier curves for TUMORRESIDUALDISEASE field values of No Macroscopic Disease and null overlap.
Is null being used for No Macroscopic Disease?

22 Clinical_slide_public
Null is being used for Not Applicable.
In order to track completion of entries, should nulls be avoided as correct responses?

23 Other General Q/A of Clinical Data Files
Should defaults be employed, such as female for ovarian?
Allowable values for initial pathologic diagnosis method are a mix of pathology sample types and procedures. Should these be split into procedures and sample types?
If Informed Medical Consent Verified is not checked yes, should we be seeing this patient’s clinical data?

24 Where is CA-125?

