Published by Kevin Clyde Wade. Modified over 9 years ago.
1
UCLA MII A Formal Representation for Numerical Data Presented in Published Clinical Trial Reports. Maurine Tong, BS; William Hsu, PhD; Ricky K. Taira, PhD. Medical Imaging Informatics Group, University of California, Los Angeles
2
UCLA MII Problem: Querying Free Text CTRs. [Diagram labels: Clinical Trial Reports (CTRs); Patient Recruitment; Internal/External Validity Testing; Disease Modeling; Query Processor; Informatics Applications; Representation]
3
UCLA MII Why Focus on Numerical Info. Predictive disease modeling (e.g., Bayesian belief networks). Key to identifying trial quality: hypothesis testing context and measures. Key to synthesizing evidence: what is the context for reported probabilities, P(effect | cause, context)? [Diagram labels: Internal Validity; Disease Modeling; Patient Recruitment]
4
UCLA MII Background and Prior Work. Ontologies for experiments and clinical trials: Ontology of Clinical Research (OCRe), Sim et al.; Ontology of Scientific Experiments (EXPO), Soldatova et al. Standardizing and sharing clinical trial data: BRIDG, CDISC, SNOMED CT. Representing individual sections of a clinical trial report: eligibility criteria (EliXR, Weng et al.); scientific claims (Blake et al.). These systems primarily help to improve patient recruitment. Our focus is on modeling numerical information for quality assessment and disease modeling.
5
UCLA MII Problem: Fragmentation
6
UCLA MII Methods: Requirements Analysis. What are the queries to be supported by the representation? Study Quality; Disease Modeling.
7
UCLA MII Methods: Requirements Analysis. Study quality queries: What is the p-value (population parameter) associated with the hypothesis? What is the statistical test used to calculate the p-value? What is the power of the sample size tested? … Consulted textbooks and experts: James Sayre, PhD, Biostatistician.
8
UCLA MII Methods: Requirements Analysis. Disease modeling queries: What are the prior probabilities? Can we estimate posterior probabilities from p-values or other reported information? … Consulted experts, textbooks and literature: Thomas Belin, PhD, Biostatistician.
9
UCLA MII Methods: Initial Design. Conceptual model of representation. Domain: Metastatic Melanoma. Flaherty KT et al. N Engl J Med. 2010 Aug 26;363(9):809-19.
10
UCLA MII [Figure: conceptual model of the representation. Recoverable labels and values, in slide order:]
Column groups: Pop. Stats; Sample Pop.; Intervention; Baseline Measurements
Variables, by dose column: <240mg; 240mg; 320 / 360mg; 720 mg; <24mg; 240mg; 320 / 360mg; 720 mg
Prevalence of MAP kinase pathway mutation: 40-60%
Age: 23-86
Confirmed histology refractory to standard treatment: 0:5, 1:16, 2:5, >2:23
PLX4032 Formulation: Crystalline n=3/6; Microprecipitated bulk powder n=34
Plasma samples (uM x hr): 100 +/- 50; 350 +/- 78; 650 +/- 100; 1500 +/- 1000
CT Studies (<240mg; 240mg): Total Response Rate 100%, 34%, 67%, 80%; Partial Response 02,602,4
…
11
UCLA MII [Figure: the conceptual model from slide 10, repeated with one component highlighted.] A. Process Model
12
UCLA MII [Figure: the conceptual model from slide 10, repeated with one component highlighted.] B. Global Variable List
13
UCLA MII [Figure: the conceptual model from slide 10, repeated with one component highlighted.] C. Variable Characterization
14
UCLA MII [Figure: the conceptual model from slide 10, repeated with one component highlighted.] D. Statistical Hypothesis Testing
15
UCLA MII Results: Implementation
16
UCLA MII Example 1: Capturing context. Demonstration of how the representation captures context for the observations of an intervention group. Domain: Lung Cancer. Query: In Johnson et al., what is the context (e.g., intervention, population characteristics, measurement methodology) associated with progression free survival (PFS) in the high dose group (HDG)? Johnson DH et al. J Clin Oncol. 2004 Jun 1;22(11):2184-91.
17
UCLA MII Steps to Capture Context
1. Find the node in the process model
2. Find the corresponding column
3. Find the variable of interest
4. Backtrack through the process model to obtain context for the observations, and get the data associated with each backtracked node
5. Construct a logical representation of the context
6. Repeat steps 4-5 until the start node is reached
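The six steps above can be sketched as a backward walk over a small graph. This is a minimal illustration, not the authors' implementation: the `Node` class, the `capture_context` function, the edges between nodes, and the target node id 1 are all assumptions made for the example. Only the node numbers 474, 628 and 847 and the 15 mg/kg dose appear on the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a (hypothetical) process model.

    `data` stands in for the column of values linked to the node.
    `predecessors` lists the ids of the nodes that lead into this one.
    """
    node_id: int
    label: str
    data: dict = field(default_factory=dict)
    predecessors: list = field(default_factory=list)


def capture_context(nodes, target_id, variable):
    """Return the observation at the target node plus its upstream context."""
    target = nodes[target_id]                 # steps 1-2: node and its column
    observation = target.data.get(variable)   # step 3: variable of interest
    context = []
    seen = {target_id}
    frontier = list(target.predecessors)
    while frontier:                           # step 6: repeat until no predecessors remain
        node = nodes[frontier.pop(0)]         # step 4: backtrack one link
        if node.node_id in seen:
            continue
        seen.add(node.node_id)
        # step 5: record the node's label and data as one piece of context
        context.append((node.node_id, node.label, node.data))
        frontier.extend(node.predecessors)
    return observation, context


# Toy process model; the linear chain of edges is an assumption.
nodes = {
    847: Node(847, "Eligibility: Stage 3 recurrent NSCLC"),
    628: Node(628, "Eligibility: no prior chemotherapy", predecessors=[847]),
    474: Node(474, "Intervention: Bevacizumab", {"Dose": "15 mg/kg"}, [628]),
    1: Node(1, "Outcome: progression free survival",
            {"PFS": "<reported value>"}, [474]),
}

observation, context = capture_context(nodes, 1, "PFS")
for node_id, label, data in context:
    print(node_id, label, data)
```

The walk is breadth-first, so context is collected nearest node first; a real process model with branching arms would need a rule for which branches count as context.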
18
UCLA MII Step 1: Find the node in process model. This node represents the progression free survival time point for the high dose group.
19
UCLA MII Step 2: Find corresponding column. This column represents the numerical data and data elements associated with this node.
20
UCLA MII Step 3: Find variable of interest
21
UCLA MII Step 4: Backtrack & Obtain Data. Obtain context by looking at linked nodes in the process model.
22
UCLA MII Step 5: Construct logical context. Data modeling is straightforward from semantics of process model link and node.
Cell name: Bevacizumab
Cell Location #: 474
Drug: Bevacizumab
Dose: 15 mg/kg
How was it administered:
  Vehicle: Intravenous infusion
  Duration: Over 90 minutes
  Cycle: 3 weeks
  Maximum dose: 18 doses
Exception: Well tolerated
  Resulting Action: New duration
  Duration: 30-60 minutes
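One way to hold the cell shown on this slide in code is a small nested record. The sketch below is an assumption about a possible encoding, not the representation used in the talk: the class names, field names, and the `describe` helper are hypothetical, while the field values are copied from the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Administration:
    """How the drug was given (fields mirror the slide's sub-entries)."""
    vehicle: str
    duration: str
    cycle: str
    maximum_dose: str


@dataclass
class ExceptionRule:
    """A condition that changes the protocol, and what changes."""
    condition: str
    resulting_action: str
    new_duration: str


@dataclass
class InterventionCell:
    cell_name: str
    cell_location: int
    drug: str
    dose: str
    administration: Administration
    exception: Optional[ExceptionRule] = None

    def describe(self) -> str:
        """Flatten the cell into a single readable context string."""
        parts = [
            f"Drug: {self.drug}",
            f"Dose: {self.dose}",
            f"Vehicle: {self.administration.vehicle}",
            f"Duration: {self.administration.duration}",
        ]
        if self.exception is not None:
            parts.append(
                f"If {self.exception.condition}: "
                f"duration {self.exception.new_duration}"
            )
        return "; ".join(parts)


cell = InterventionCell(
    cell_name="Bevacizumab",
    cell_location=474,
    drug="Bevacizumab",
    dose="15 mg/kg",
    administration=Administration(
        vehicle="Intravenous infusion",
        duration="Over 90 minutes",
        cycle="3 weeks",
        maximum_dose="18 doses",
    ),
    exception=ExceptionRule(
        condition="Well tolerated",
        resulting_action="New duration",
        new_duration="30-60 minutes",
    ),
)
print(cell.describe())
```

Keeping the exception as its own record, rather than a free-text note, is a design choice that lets a query distinguish the default infusion duration from the shortened one.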
23
UCLA MII Step 6: Repeat steps 4-5 until start. Continue backtracking through the process model; aggregate associated data; repeat until the first node. Context for Adverse Event (Node #740): Name of n847
24
UCLA MII Example 1: Capturing context. Demonstration of how the representation captures context for the observations of an intervention group. Query: What is the context (e.g., intervention, population characteristics, measurement methodology) associated with progression free survival (PFS) in the high dose group?
25
UCLA MII Example 1: Capturing context. Data: Associated Context:
Context for Adverse Event (Node #740):
1) INTERVENTION: Bevacizumab (Node #474)
2) POPULATION CHARACTERISTICS: High Dose Bev (Arm #3)
   Eligibility Criteria: Stage 3 Recurrent NSCLC (Node #847); No Prior Chemotherapy (Node #628); Other criteria (Node #748)
   Baseline characteristics of the patient (Node #222)
3) METHODS: Progression Free Survival
26
UCLA MII Example 2: Comparisons. Comparison of outcomes in the intervention vs. control arms. Query: Compare PFS for the intervention and control arms. Context from two nodes can be placed on the same chart.
27
UCLA MII Example 3: Analyses. How was the p-value calculated? Visualization includes: Data; Test Statistics; P-value; Statement.
28
UCLA MII Pilot Evaluation. Can the representation answer user queries from the requirements analysis? Preliminary evaluation questions: characteristics of the trial; quality of the trial; significance of the science.
29
UCLA MII Evaluation: Objectives. Objective 1: Utility of the representation to accurately identify numerical data to support key contributions made by a clinical trial report. Objective 2: Intuitiveness of the representation through reproducibility of the visualization by different users.
30
UCLA MII Evaluation: Study Design
Study design: 2-arm study
  Status quo group using paper copy
  Intervention group using proposed representation
Participants (n=6): Graduate students in biology, biostatistics, informatics, or engineering
Statistical methods: Student's paired t-test
Gold standard: Established by graduate student supervised by domain expert
4 clinical trial papers in NSCLC:
  J Clin Oncol. 2004 Jun 1;22(11):2184-91.
  J Clin Oncol. 2008 May 20;26(15):2442-9.
  Lancet Oncol. 2012 Jan;13(1):33-42.
  J Clin Oncol. 2011 Nov 1;29(31):4113-20.
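The statistical method named on this slide, Student's paired t-test, reduces to a t statistic computed on per-pair differences. The sketch below shows that calculation using only the Python standard library. The function name and the example scores are hypothetical; the slides do not report per-participant data, so the numbers are placeholders and not study results.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, degrees_of_freedom) for paired samples.

    t = mean(d) / (stdev(d) / sqrt(n)), where d are the pairwise
    differences and stdev is the sample standard deviation.
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    if len(first) < 2:
        raise ValueError("need at least two pairs")
    diffs = [a - b for a, b in zip(first, second)]
    sd = statistics.stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Placeholder scores (NOT from the study): one pair per participant.
with_representation = [5, 7, 9, 6]
with_paper_copy = [4, 5, 8, 3]
t, df = paired_t_statistic(with_representation, with_paper_copy)
print(f"t = {t:.3f}, df = {df}")
```

Turning `t` into a p-value requires the Student t distribution with `df` degrees of freedom, which the standard library does not provide; a statistics package would be needed for that step.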
31
UCLA MII Evaluation: Questions. What is the purpose of this trial? What is the sample size for each experimental arm? How was the primary outcome assessed? How many patients experienced positive outcomes in this trial? How was the data analyzed?
32
UCLA MII Evaluation: Results. Users of the representation were able to accurately identify numerical data that support key contributions, as compared with the status quo. User visualizations were reproducible: 68.1% ± 6.45% of the gold standard was reproduced by users.
                 Accuracy  SD   Time  SD
Representation   79%       18%  30    9%
Status Quo       76%       9%   34    7%
33
UCLA MII Discussion. Our work supports queries related to study quality and disease modeling. We developed a representation to associate appropriate context with numerical data within clinical trial reports. The pilot evaluation shows that the utility of the representation is promising. To extend this work: instantiate the representation using automatic methods and capture numerical data using NLP methods; develop an interface to support frequently asked queries for specific clinical trial reports; test in a journal club setting.
34
UCLA MII Conclusion. We are establishing a systematic way of extracting information from clinical trial reports into a machine-understandable form. The overarching objective is to have a computer reason over this representation to facilitate clinical decision making.
35
UCLA MII Acknowledgements. James Sayre, PhD, Biostatistician; Domain experts; Research participants; NLM Training Grant; NLM R01-LM009961
36
UCLA MII Thank You