Data Anomaly Detection in CDER: An Overview of Efforts and Methods


1 Data Anomaly Detection in CDER: An Overview of Efforts and Methods
Paul Schuette, Scientific Computing Coordinator, Food and Drug Administration, Center for Drug Evaluation and Research, Office of Translational Sciences, Office of Biostatistics

2 Disclaimer
This presentation reflects the views of the author and should not be construed to represent the FDA's views or policies.

3 What is a data anomaly?
A definition derived from Hawkins: A data anomaly is a data point or collection of data points which differs so much from the other data as to arouse suspicions that it was generated by a different mechanism. Data which are too similar may also be considered anomalous. Data anomalies may arise from multiple sources, including sloppiness, incorrectly calibrated instruments, poor training, and fraud. The presence of data anomalies can lead to concerns regarding data quality and data integrity.
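The first half of this definition (points that differ markedly from the rest) can be illustrated with a small sketch. It is not a method described in the presentation: the median/MAD "robust z-score", the 3.5 cutoff, the function names, and the sample readings are all assumptions chosen for illustration.

```python
from statistics import median


def robust_z_scores(values):
    """Return robust z-scores based on the median and the MAD.

    The 0.6745 factor rescales the MAD so that scores are comparable
    to ordinary z-scores when the data are roughly normal.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; the values are too uniform to score")
    return [0.6745 * (v - med) / mad for v in values]


def flag_outliers(values, threshold=3.5):
    """Return the indices whose robust z-score exceeds the (assumed) threshold."""
    scores = robust_z_scores(values)
    return [i for i, z in enumerate(scores) if abs(z) > threshold]


# Hypothetical instrument readings; the sixth value looks out of place.
readings = [120, 118, 122, 119, 121, 180, 117]
suspect = flag_outliers(readings)
```

Note that a MAD of zero is itself informative here: values that are identical or nearly so correspond to the "too similar" case in the definition and would need a separate check.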

4 Office of Bioresearch Monitoring (BIMO) Operations
FDA's Bioresearch Monitoring (BIMO) program is a comprehensive program of on-site inspections and data audits designed to monitor all aspects of the conduct and reporting of FDA-regulated research. The BIMO Program was established to assure the quality and integrity of data submitted to the agency in support of new product approvals and marketing applications, as well as to protect the rights and welfare of the thousands of human subjects and animals involved in FDA-regulated research. It has become a cornerstone of the FDA preapproval process for new medicines, medical devices, food and color additives, veterinary products and, just recently, new tobacco products introduced to the U.S. consumer. The program is implemented domestically and internationally through seven multi-center compliance programs, resulting in over 1500 inspections annually. These compliance programs address inspections of nonclinical testing laboratories in accordance with Good Laboratory Practice (GLP), clinical investigators in accordance with Good Clinical Practice (GCP), sponsors/Contract Research Organizations (CROs)/clinical trial monitors, clinical and analytical in vivo bioequivalence facilities, and institutional review boards (IRBs).

5 CDER BIMO Efforts
Office of Regulatory Affairs (ORA) staff conduct inspections, assisted by subject matter experts from:
Office of Scientific Investigations (OSI): clinical and nonclinical studies, BLA, NDA
Office of Study Integrity and Surveillance (OSIS): pharmacokinetic, bioavailability/bioequivalence (BA/BE), Good Laboratory Practice (GLP), and Animal Rule (AR) studies

6 Software Tools
Clinical Investigator Site Selection Tool (CISST)
R Shiny apps based on inspection data (prototypes)
CISST Assist: pilot for using JMP Clinical with CISST
CluePoints CRADA: Statistical Monitoring Applied to Research Trials (SMART)
Data Anomalies in BioEquivalence R Shiny (DABERS) app

7 CDER’s Clinical Investigator Site Selection Tool (CISST)
Based on expert elicitation (2009) from reviewers in the Office of Biostatistics, Office of Compliance, Office of New Drugs, and Office of Business Informatics

8 Clinsite Data Set

9 CISST Score Determination

10 Data Mining Approaches (ORISE 2016, 2017)
Inspection outcomes: No Action Indicated (NAI), Voluntary Action Indicated (VAI), Official Action Indicated (OAI).
2016 ORISE: Random Forest and Boosted Tree methods used to predict NAI/VAI/OAI, trained on data from inspections and clinsite data sets.
2017 ORISE: Random Forest, Boosted Tree, and Boosted Tree with dropout used to predict NAI versus not NAI (VAI together with OAI).

11 CISST Assist Service Provided through Office of Computational Science
Adds JMP Clinical Anomaly Detection (JCAD) analyses to sites ranked by CISST. Requires SDTM data. Labor intensive: entails manual evaluation of potential signals.

12 CISST Assist Example, Proportion of Site Visits on a Weekend
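The metric named in this slide title can be sketched in a few lines. This is only an illustration of the calculation, not the JMP Clinical implementation: the function name, the `(site_id, visit_date)` input shape, and the sample visits are assumptions, and a real analysis would read visit dates from SDTM data.

```python
from collections import defaultdict
from datetime import date


def weekend_visit_proportion(visits):
    """Return {site_id: share of visits falling on Saturday or Sunday}.

    `visits` is an iterable of (site_id, visit_date) pairs.
    """
    totals = defaultdict(int)
    weekend = defaultdict(int)
    for site, visit_date in visits:
        totals[site] += 1
        # date.weekday(): Monday is 0, so Saturday is 5 and Sunday is 6.
        if visit_date.weekday() >= 5:
            weekend[site] += 1
    return {site: weekend[site] / totals[site] for site in totals}


# Hypothetical visit records for two sites.
visits = [
    ("101", date(2017, 3, 6)), ("101", date(2017, 3, 7)),
    ("101", date(2017, 3, 4)), ("101", date(2017, 3, 8)),
    ("102", date(2017, 3, 4)), ("102", date(2017, 3, 5)),
    ("102", date(2017, 3, 11)), ("102", date(2017, 3, 9)),
]
proportions = weekend_visit_proportion(visits)
```

A site whose proportion is far from that of the other sites is a candidate for manual review; a high value alone does not establish a problem, since clinic schedules differ.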

13 CluePoints CRADA Software
Statistical Monitoring Applied to Research Trials (SMART). Applies a battery of statistical tests (missing value, categorical, binary, means, standard deviations, date, outlier, and propagated values) to multiple domains. Compares subject/site level values for a variable to all sites in the trial. Creates a matrix of p-values. Creates a Data Inconsistency Score (DIS), based on a scoring algorithm which down-weights correlated values. Returns a ranked list of sites with data anomalies and a bubble plot of sites. Implemented via configurable scripts. Approaching the end of the 2nd year of a 3-year CRADA. Note: Paul Schuette is the PI for this CRADA.
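The general workflow on this slide (test each site against the rest, collect p-values in a matrix, combine them into a site score, rank) can be sketched as follows. This is not the CluePoints algorithm: the single means test with a normal approximation, the comparison against all other sites, the score (mean of -log10 p), and every name and data value below are assumptions. In particular, the sketch does not down-weight correlated tests as the DIS is said to do.

```python
from math import log10, sqrt
from statistics import NormalDist, mean, variance


def site_p_value(site_values, other_values):
    """Two-sided p-value for a difference in means (normal approximation).

    Each group needs at least two values so a variance can be computed.
    """
    se = sqrt(variance(site_values) / len(site_values)
              + variance(other_values) / len(other_values))
    diff = mean(site_values) - mean(other_values)
    if se == 0:
        return 1.0 if diff == 0 else 0.0
    return 2 * (1 - NormalDist().cdf(abs(diff / se)))


def p_value_matrix(data):
    """Map {variable: {site: values}} to {site: {variable: p_value}}."""
    matrix = {}
    for variable, by_site in data.items():
        for site, values in by_site.items():
            others = [v for s, vals in by_site.items() if s != site for v in vals]
            matrix.setdefault(site, {})[variable] = site_p_value(values, others)
    return matrix


def rank_sites(matrix, floor=1e-12):
    """Rank sites by the mean of -log10(p); larger means more atypical.

    p-values are floored so that a p-value of zero does not break log10.
    """
    scores = {
        site: mean([-log10(max(p, floor)) for p in row.values()])
        for site, row in matrix.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical trial data: site C reports shifted, unusually uniform values.
data = {
    "systolic_bp": {"A": [120, 125, 118, 130], "B": [122, 127, 119, 128],
                    "C": [150, 152, 149, 151]},
    "weight_kg": {"A": [70, 82, 65, 90], "B": [72, 80, 68, 88],
                  "C": [75, 76, 75, 76]},
}
matrix = p_value_matrix(data)
ranked = rank_sites(matrix)
```

In this toy data the atypical site also pulls the comparison group for the other two sites, which lowers their p-values as well; with few sites, that effect should be kept in mind when reading the ranking.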

14 Example of CRADA software output
Data from an approved product. Most identified anomalous sites (magenta) were outside of the US. Treatment effect was observed primarily outside of the US. Sponsor's audit revealed problems at the two largest anomalous sites. Clinsite BIMO data were not available for this trial.

15 Example of a Data Anomaly

16 DABERS app
Internally developed (Office of Biostatistics) R Shiny app; author Nam Hee Choi. Addresses a need unfilled by commercial software packages. Deals with bioequivalence data, typically from a single site, rather than multiple-site BIMO or clinical endpoint SDTM data. Employs EDA, multiple metrics, and statistical methods to allow reviewers to identify potentially suspect profiles.
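The slide does not say which metrics DABERS uses. As one hedged illustration of what screening for "potentially suspect profiles" might look like, the sketch below flags pairs of concentration-time profiles that are nearly identical, in line with the earlier remark that data which are too similar may be anomalous. The metric (mean relative difference), the 2% tolerance, the function names, and the concentrations are all assumptions, not features of the app.

```python
from itertools import combinations


def mean_relative_difference(profile_a, profile_b):
    """Average of |a - b| / mean(a, b) over matching time points.

    Time points where both concentrations are zero are skipped.
    """
    ratios = [
        abs(a - b) / ((a + b) / 2)
        for a, b in zip(profile_a, profile_b)
        if a + b > 0
    ]
    return sum(ratios) / len(ratios) if ratios else 0.0


def near_duplicate_profiles(profiles, tolerance=0.02):
    """Return (id_1, id_2, difference) for pairs closer than the tolerance."""
    flagged = []
    for (id_a, prof_a), (id_b, prof_b) in combinations(profiles.items(), 2):
        diff = mean_relative_difference(prof_a, prof_b)
        if diff < tolerance:
            flagged.append((id_a, id_b, diff))
    return flagged


# Hypothetical plasma concentrations sampled at the same six time points.
profiles = {
    "S01": [1.2, 4.8, 7.9, 6.1, 3.0, 1.1],
    "S02": [0.8, 3.5, 6.2, 5.9, 3.8, 1.6],
    "S03": [1.2, 4.9, 7.9, 6.0, 3.0, 1.1],
    "S04": [2.0, 6.5, 9.1, 7.2, 4.1, 1.9],
}
flagged = near_duplicate_profiles(profiles)
```

A flagged pair is only a prompt for closer review; the appropriate tolerance would depend on the assay and the expected between-subject variability.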

17 Observations
Different tools are based on different data sets and employ very different methodologies. Concordance between tools, where it can be assessed, can be relatively weak. The recent ICH E6(R2) guidance suggests that multiple approaches be employed, including source data verification, quality tolerance limits, and centralized statistical monitoring.

18 Conclusions
Many issues and problems are not discovered with source data verification. Data anomaly detection can be an important component of ensuring data quality. Clinical significance of data anomalies may not be obvious. Multiple approaches may be required. Need both clinical endpoint (typically multiple sites) and bioequivalence (typically single site) tools. There are heterogeneity challenges with multiregional trials.

19 Questions?
