1
Transforming Data Quality Using Machine Learning
SCOPE, February 2019
2
Stacey Yount, Vice President, Product Clinical Trial Management
Medidata Solutions
3
Objectives
Assess the efficiency and effectiveness of current-state data quality oversight
Demonstrate why machine learning provides opportunities for transformation
Review case studies that highlight the impact of ML in the data review cycle
4
EDC Data Feeds Are Complex And Increasing
[Diagram of data feeds surrounding EDC: protocol design, budgeting, CTMS, SAS extracts, IRT, labs, LIMS, ePRO, safety, coding, sensors, payments, apps, imaging, SDR, SQM, EHR/EMR, batch upload, web services, biomarkers, eDiaries, claims, mobile health, listings]
The problem is that there is so much noise in today’s clinical data, noise we didn’t have 10 years ago. It’s becoming harder and harder to see the signals, both good and bad.
5
Processes and Technology: Inefficient edit checks and manual data review, with 7.3% of submissions failing on the first attempt due to data issues
Study Milestones: Database locks averaging 30+ days due to issues uncovered late in the study lifecycle
Investigator Sites: Increased frustration among sites and clinical teams as queries increase near database lock
Organizational Value: Concerns voiced by Statistics organizations regarding the cleanliness of data and the use of Stats resources for cleaning
Employee Engagement: Negative perception due to inefficient processes, impact on study milestones, and lack of insights prior to final reviews
6
Centralized Statistical Analytics
Known and unknown risk discovery through the use of statistical and machine learning analytics
Data quality and site performance
Single place to ensure logical and statistical data quality across all of your “monitoring” functions
[Screenshot: plot with axes Study Day of Resection and Study Day of Diagnosis]
7
Data Management Reimagined
[Process diagram: Robust Edit Checks & Standards, Integrated Quality & Risk Management Process, Risk Oversight Team, CSA Reviews, Earlier Stats Reviews, Risk Evaluation Outcome, Risk Communication Pathway]
Risk Oversight Team provides cross-functional governance over data quality and site performance
Technology-enabled process optimization provides early insights
Iterative process provides feedback and insights into IQRM and oversight
Risk Communication Pathways ensure clear understanding of risk
8
Technology Enabled Convergence
RBM and advanced analytics create opportunities for operational convergence
9
Hurdles
Learning curve with new technology
Significant differences between as-is and to-be processes
Lack of industry best practices
Misalignment with current skill set
10
Best Practices
Design a new process instead of changing an existing process
Adjust implementation to account for organizational plans (e.g., size, structural changes)
Gain cross-functional alignment on process and technology changes
Leverage a team approach to oversight of data quality and site performance management
11
CSA Case Studies
12
Transformation
5-day database locks
13
>25% of data quality issues found across recent studies had the potential to delay drug approval
14
Medidata CSA impact on submission
[Timeline figure comparing Sponsor A and Sponsor B. Labels: Centralized Statistical Analytics, Remediation, FDA Approval, Q1 ’18 Launch, Extra analyses, FDA Letter, Extra analyses, Delay, Remediation]
15
Supplement
16
Illustration “My medical record says my systolic BP was 180 in Jan 2017.” Is this right or wrong? What can you use from my medical record to help answer the question?
17
Illustration
“My medical record says my systolic BP was 180 in Jan 2017.” Is this right or wrong?
What can you use from my medical record to help answer the question?
Systolic BP before/after
Diastolic BP measurement
Medical history of hypertension
Concurrent hypertension adverse event reported
Concomitant medication (hypertension drug)
Risk factors (age, sex, etc.)
After reviewing all of my data, could you answer the question with reasonable confidence?
18
Illustration How could we use medical records of people with characteristics similar to mine and give an even more accurate, quantitative answer to whether the Systolic BP is an error?
19
Illustration
How could we use medical records of people like me, together with the list we came up with, to give an even more accurate, quantitative answer to whether the systolic BP is an error?
Normal variation of systolic BP
Slope of the systolic BP vs. diastolic BP relationship
Ranking of risk factors by their relationship to systolic BP
Bottom line: You can calculate the likelihood of data being wrong with higher confidence
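As a rough illustration of the idea on this slide, the sketch below scores one systolic BP reading against the systolic-vs-diastolic relationship seen in a cohort of similar patients. It is not the method used by Medidata CSA, which the slides do not describe. The cohort values, the function names (`fit_line`, `residual_sd`, `error_score`, `tail_probability`), and the assumption that residuals are roughly normal are all hypothetical choices made for this example.

```python
import math
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def residual_sd(xs, ys, slope, intercept):
    """Spread of the cohort around the fitted line (n - 2 degrees of freedom)."""
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (len(xs) - 2))

def error_score(systolic, diastolic, cohort):
    """Z-score of an observed systolic value relative to the cohort's SBP-vs-DBP line."""
    dbp = [d for d, _ in cohort]
    sbp = [s for _, s in cohort]
    slope, intercept = fit_line(dbp, sbp)
    sd = residual_sd(dbp, sbp, slope, intercept)
    expected = intercept + slope * diastolic
    return (systolic - expected) / sd

def tail_probability(z):
    """Two-sided probability of a deviation at least this large, assuming normal residuals."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical (diastolic, systolic) pairs from patients with similar characteristics.
COHORT = [(70, 112), (75, 118), (80, 124), (82, 130),
          (85, 133), (90, 141), (95, 150), (78, 121)]

if __name__ == "__main__":
    z = error_score(systolic=180, diastolic=80, cohort=COHORT)
    print(f"z-score: {z:.1f}, tail probability: {tail_probability(z):.3g}")
```

A large absolute z-score (small tail probability) suggests the reading deserves a query; it does not prove the value is wrong. A fuller version would also draw on the other items in the list, such as prior readings and risk factors.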
20
Current data review practices lead to inefficiency and sub-optimal data quality
A significant fraction of data quality issues cannot be identified by traditional data review
Site performance is not equal; some sites are significantly worse
Most sites make mistakes in specific areas (which may differ from site to site)
Most EDC data entered by sites are correct, but errors can have a significant impact on the speed and success of a study
21
Most common quality issues uncovered with statistical data cleaning
Site inconsistency for unknown risks
Site inconsistency for known risks
Differences in adverse event reporting
Inconsistencies in how sites evaluate or measure endpoints
Inconsistencies in how sites follow the protocol
Differences in the actions sites take with regard to an adverse event
Potential misconduct
Data inconsistency
Sites that make up data out of neglect or forgetfulness
Data that are impossible or highly unlikely due to data entry errors
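One of the issues listed here, differences in adverse event reporting between sites, lends itself to a simple statistical check. The sketch below compares each site's adverse event count with what the pooled study-wide rate would predict and flags large deviations. It is a minimal example under stated assumptions, not the approach taken by any particular product: the site identifiers, counts, the threshold of 2.0, and the Poisson-style approximation are all hypothetical.

```python
import math

def site_ae_zscores(sites):
    """Map site id -> z-score of its AE count against the pooled rate.

    `sites` maps site id -> (ae_count, subject_visits).
    """
    total_ae = sum(ae for ae, _ in sites.values())
    total_visits = sum(visits for _, visits in sites.values())
    pooled_rate = total_ae / total_visits
    scores = {}
    for site, (ae, visits) in sites.items():
        expected = pooled_rate * visits
        # Poisson approximation: the variance of a count is about its expected value.
        scores[site] = (ae - expected) / math.sqrt(expected)
    return scores

def flag_sites(sites, threshold=2.0):
    """Return the sorted ids of sites whose AE reporting deviates beyond the threshold."""
    return sorted(s for s, z in site_ae_zscores(sites).items() if abs(z) > threshold)

# Hypothetical data: site id -> (adverse events reported, subject visits).
SITES = {
    "site_01": (30, 200),
    "site_02": (28, 190),
    "site_03": (3, 210),
    "site_04": (33, 220),
}

if __name__ == "__main__":
    for site, z in sorted(site_ae_zscores(SITES).items()):
        print(f"{site}: z = {z:+.2f}")
    print("flagged:", flag_sites(SITES))
```

In this made-up data only site_03 is flagged, because it reports far fewer adverse events than its visit volume would suggest. The pooled rate includes the flagged site itself, which a more careful analysis would correct for.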
22
Iterative value of process
Risk Oversight Team provides cross-functional governance over data quality and site performance
Technology-enabled process optimization provides early insights
Iterative process provides feedback and insights into IQRM and oversight
Risk Communication Pathways ensure clear understanding of risk
Site Corrective Action Plan(s)
Study Team Accountability Assignments
Early Escalation of Safety and Efficacy Issues
Portfolio Compound Risk Evaluation
Expedited Database Lock Timelines
23
Mid-sized sponsor transforms clinical data management using Medidata CSA
Processes and Technology: Efficient iterative processes supported by ML-based technology, nearly doubling the number of database locks year over year
Study Milestones: Database locks average 5 days for critical studies and 10 days for others, with significantly fewer findings after LPLV
Investigator Sites: Improved site and clinical team communications and involvement in review cycles; excited to participate in study milestones
Organizational Value: Statistics findings 0 to <2%; positive organizational satisfaction
CDM Job Satisfaction: Empowered through early insights, with increased confidence in data quality and completeness; high job satisfaction and retention