Data Cleaning. Understanding Discrepancies Discrepancies are “Inconsistencies” found in the clinical trial data which need to be corrected as per the.

Slides:



Advertisements
Similar presentations
1 US Investigator Meeting DIAS-4, Chicago, July 2011 Good Clinical Practice (GCP) for Investigators and the Research Team.
Advertisements

Tips to a Successful Monitoring Visit
Quality Management within the Clinical Research Process
Audit Ebru Mutlu-Omega CRO. General Purpose to help quality (to maintain quality at present) to assure quality (make sure that quality in future is maintained)
1 What’s New in the Therapy Prior Authorization Review Process? December 2011 Therapy Clinical Webinars.
Medication Reconciliation Insert your hospital’s name here.
PPA 502 – Program Evaluation Lecture 5b – Collecting Data from Agency Records.
Using EDC-Rave to Conduct Clinical Trials at Genentech
Clinical Pharmacy’s Role in Research Trials Sheree Miller Pharm.D. Investigational Drug Service University of Washington Medical Center.
Software Development Unit 2 Databases What is a database? A collection of data organised in a manner that allows access, retrieval and use of that data.
Clinical Data Management is involved in all aspects of processing the clinical data, working with a range of computer applications, database systems.
ACRIN 6698 Diffusion-weighted MRI Biomarkers for Assessment of Breast Cancer Response to Neoadjuvant Treatment: An I-SPY 2 Trial Substudy Presented by:
M ODULE H I NTERIM M ONITORING V ISIT Denise Thwing 21 Apr Version: Final 21-Apr-2010.
© 2011 Octagon Research Solutions, Inc. All Rights Reserved. The contents of this document are confidential and proprietary to Octagon Research Solutions,
Topics Covered: Data preparation Data preparation Data capturing Data capturing Data verification and validation Data verification and validation Data.
Performing the Study Data Collection
How to process data from clinical trials and their open label extensions PhUSE, Berlin, October 2010 Thomas Grupe and Stephanie Bartsch, Clinical Data.
Solutions Summit 2014 Discrepancy Processing & Resolution Terri Sullivan.
The 2 nd Clinical Data Management Training CDM System & Validation Maggie Fu EPS International (China) Co.,Ltd September, 2010 at SMMU, Shanghai.
OCAN College Access Program Data Submissions Vonetta Woods HEI Analyst, Ohio Board of Regents
Prescribing Errors in General Practice The PRACtICe Study (2012) GMC Investigating Prevalence and Causes.
Joint Research & Enterprise Office Training The team, the procedures, the monitor and the Sponsor Lucy H H Parker Clinical Research Governance Manager.
MODULE B: Case Report Forms Jane Fendl & Denise Thwing April 7, Version: Final 07-Apr-2010.
September, 2010 at SMMU, Shanghai
Research Studies GOTCHA’S By Sally Duffy. Failure to follow protocol, investigator agreements and regulations Did not use device/drug in manner specified.
CLINICAL TRIALS – PHASE III. What are phase III trials  Confirmatory phase (Therapeutic confirmatory trial)  Trials are done to obtain sufficient evidence.
The 2 nd Clinical Data Management Training Clinical Data Review September, 2010 at SMMU, Shanghai.
ISV Innovation Presented by ISV Innovation Presented by Business Intelligence Fundamentals: Data Cleansing Ola Ekdahl IT Mentors 9/12/08.
Siebel 8.0 Module 5: EIM Processing Integrating Siebel Applications.
SAE data entry: Clinical versus Pharmacovigilance standards Daniel Becker Solvay Pharmaceuticals Hannover, Germany T:
Data Validation OPEN Development Conference September 19, 2008 Sushmita De Systems Analyst.
HIT Policy Committee METHODOLOGIC ISSUES Tiger Team Summary Helen Burstin National Quality Forum Jon White Agency for Healthcare Research and Quality October.
Investigator’s Meeting
Using EDC-Rave to Conduct Clinical Trials at Genentech Susanne Prokscha Principal CDM PTM Process Analyst February 2012.
Statistical Expertise for Sound Decision Making Quality Assurance for Census Data Processing Jean-Michel Durr 28/1/20111Fourth meeting of the TCG - Lubjana.
Development and Approval of Drugs and Devices EPI260 Lecture 6: Late Phase Clinical Trials April 28, 2011 Richard Chin, M.D.
بسم الله الرحمن الرحيم جامعة أم درمان الإسلامية كلية الطب و العلوم الصحية - قسم طب المجتمع مساق البحث العلمي / الدفعة 21 Basics of Clinical Trials.
0 / Database Management. 1 / Identify file maintenance techniques Discuss the terms character, field, record, and table Describe characteristics.
Flat Files Relational Databases
Session 1 Module 1: Introduction to Data Integrity
Office of Human Research Protection Georgia Health Sciences University.
UNIT-II CLINICAL DATA. UNIT-II CLINICAL DATA: Clinical Data, Application, Challenges, Solutions, Clinical Data Management System.
Data Management in Clinical Research Rosanne M. Pogash, MPA Manager, PHS Data Management Unit January 12,
24 Nov 2007Data Management and Exploratory Data Analysis 1 Yongyuth Chaiyapong Ph.D. (Mathematical Statistics) Department of Statistics Faculty of Science.
CSS OCTraining 1 Oracle Clinical Overview/Hands-On.
HEI/OCAN College Access Program Data Submissions.
CONDUCTING COMPLIANCE ASSESSMENTS Allen Ditch Director Corporate Quality Bristol Myers Squibb Medical Research Summit March 6, 2003.
Session 6: Data Flow, Data Management, and Data Quality.
Korea Food & Drug Administration Deputy director Kwang-Soo Joo Korea FDA Sep. 29, 2000 : Korean Good Clinical Practice & Relative Guidelines How to Manage.
How good is your SEND data? Timothy Kropp FDA/CDER/OCS 1.
How Good is Your SDTM Data? Perspectives from JumpStart Mary Doi, M.D., M.S. Office of Computational Science Office of Translational Sciences Center for.
Research Documentation Betty Wilson, CIP, MS Senior Compliance Manager MU IRB Lori Wilcox, EdD Director of Academic Compliance, Corporate Compliance.
Responsibilities of Sponsor, Investigator and Monitor
Data Management in Clinical Research
Design of Case Report Forms
Data quality & VALIDATION
Responsibilities of Sponsor, Investigator and Monitor
CLINICAL DATA MANAGEMENT
Texas Student Data System
HOW TO ENTER BASELINE DATA
Texas Student Data System
Clinical Data Management and Case Report Form
Data Management in Support of A Clinical Event Committee (CEC)
E-CRF Overview Oracle® Clinical Remote Data Capture Training (Version 4.6 HTML) e-CRF Completion John McDonach Manager CDM, PPD.
The ultimate in data organization
Discrepancy Management
A Clinical Data Analyst with SAS Programming
Serious Adverse Event Reconciliation
Presentation transcript:

Data Cleaning

Understanding Discrepancies Discrepancies are “Inconsistencies” found in the clinical trial data which need to be corrected as per the study protocol (the guiding document) For better query management, one has to know the source or origin of discrepancies Data in clinical trial should be congruent with the study protocol

Data Inconsistencies Data consistent? Data legible? Data complete? Data correct? Data logical? Data Inconsistencies

Discrepancy Types Completeness & Consistency:  Checks for empty fields  Looking for all related items  Cross-checking values Real-world checks:  Range checks  Discrete value checks  One value greater/less than/equal to another

Discrepancy Types Quality control:  Are dates in logical sequence?  Is header information consistent?  Any missing visits or pages? Compliance & Safety:  Are visits in compliance with protocol?  Inclusion/exclusion criteria met?  Checking lab values against normals  Comparison of values over time

Validating/Cleaning Data  Data cleaning or validation is a collection of activities used to assure validity & accuracy of data  Logical & Statistical checks to detect impossible values due to  data entry errors  coding  inconsistent data

Who Cleans the Database? Data Management Plan through SOPs clearly defines tasks & responsibilities involved in database cleaning Please see SOPs… who is Responsible for Cleaning!

Cleaning Data Activities include addressing discrepancies by:  Manual review of data and CRFs by clinical team or data management  Computerized checks of data by the clinical data management system:  Validation/discrepancy designing and build-up into the database

 Making sure that raw data were accurately entered into a computer-readable file  Checking that character variables contain only valid values  Checking that numeric values are within predetermined ranges  Checking for & eliminating duplicate data entries  Checking if there are missing values for variables where complete data are necessary Definition of data cleaning

Definition of data cleaning …contd  Checking for uniqueness of certain values, such as subject IDs  Checking for invalid data values & invalid date sequences  Verifying that complex multi-file (or cross panel) rules have been followed. For e g., if an AE of type X occurs, other data such as concomitant medications or procedures might be expected

Documents to be Followed  Protocol  Guidelines – General & Project-Specific  SOPs  Subject Flowcharts  Clean Patient Check Lists  Tracking Spreadsheets

Clean Data Checklist  Refers to a list of checks to be performed by data management while cleaning database  Checklist is developed & customized as per client specifications  Provides list of checks to be performed both on  ongoing/periodic basis  towards end of study  Strict adherence to checklist prevents missing out on any of critical activities

Point-by- point checks Textual data Continuing events Query generation Query integration Missing data Duplicate data Protocol violation Coding SAE Recon Data consistency Ranges External data Visit sequence Data Cleaning

Point-by-Point Checks  Refers to cross checking between CRF & database for every data point  Constitutes a “second-check” apart from data entry  Incorrect entries/entries missed out by Data Entry are corrected during cleaning  Special emphasis to be given for  Dates  Numerical values  Header information (including indexing)

Missing Data Checks  Missing responses to be queried for, unless indicated by investigator as  not done  not available  not applicable  Validations to be programmed to flag missing field discrepancies Missing Data…!!

Missing Page Checks  Expected pages identified during setup of studies  Tracking reports of missing pages to be maintained to identify  CRFs misrouted in-house  CRFs never sent from Investigator’s site AE Record

Protocol Violation Checks  Protocol adherence to be reviewed & violations, if any, to be queried  Primary safety & efficacy endpoints to be reviewed, to ensure protocol compliance

Key protocol violations  Inclusion & exclusion criteria adherence  Age  Concomitant medications/antibiotics  Medical condition  Study drug dosing regimen adherence  Study or drug termination specifications  Switches in medications

Continuity of Data Checks  Refers to checking continuity of events that occur  across study  across visits  Includes  Adverse Events  Medications  Treatments/Procedures  Overlapping Start/Stop Dates & Outcomes to be checked across visits

Continuity of Data Checks Overlapping dates across visits: Scenario: Per protocol, AEs are to be recorded on Visits 1, 2 & 3 “Headache” is recorded as follows: VisitStart DateStop DateOutcome 101-Jan Jan-2004Continues 201-Jan Jan-2004Resolved 320-Jan-2004 Resolved

Consistency Checks  Designed to identify potential data errors by checking  sequential order of dates  corresponding events  missing data (indicated as existing elsewhere)  Involves cross checking between data points  across CRFs  within same CRF

Consistency Checks Cross check across different CRFs: AE reported with action “concomitant medication” (AE Record) Ensure corresponding concomitant medication reported in appropriate timeframe (Concomitant Medication Record) EventStart DateStop DateOutcome Fever13-Jun Jun-2005Resolved MedicationStart DateStop DateOutcome Paracetamol14-Jun Jun-2005Stopped

Consistency Checks Cross check within same CRF: 1 st DCM: Report doses of antibiotics taken “before” intake of first dose of study drug 2 nd DCM: Report doses of antibiotics taken “after” intake of first dose of study drug: NOTE: First dose of study drug is taken on 15-May-2001 AntibioticDoseRouteStart DateStop Date Amoxicillin6 mgOral11-May May-2001 AntibioticDoseRouteStart DateStop Date Streptomycin7 mgIV16-May May-2001

Coding Checks  Textual or free text data collected & reported (AEs, medications) must be coded before they can be aggregated & used in summary analysis  Coding consists of matching text collected on CRF to terms in a standard dictionary  Items that cannot be matched, or coded without clarification from site  Ulcers, for example, require a location (gastric, duodenal, mouth, foot, etc.) to be coded code

Range Checks  Designed to identify  statistical outliners  values that are physiologically impossible  values that are outside normal variation of population under study  Ensure that appropriate range values are applied  For eg., ranges for WBCs can be applied either in ‘percentage’ or in ‘absolute’  Ensure that appropriate ranges are applied depending on whether lab used is  Primary  Secondary

Range Checks Cross check between Hematology record & AE record: Hematology Test DateResultNormal Range WBC05-Jan ,710 cells/µL/cu mm 4, ,800 cells/µL/cu EventStart DateStop DateOutcome Streptococcal infection 04-Jan Jan-2006Resolved

External Data Checks  Ensure receipt of all required external data from centralized vendors:  Laboratory Data  Device Data (ECG, Bioimages)  Missing e-data records to be tracked & requested from vendor on a periodic basis  Missing data to be noted & corresponding values to be ‘re-loaded’ by vendor

External Data Checks Examples of missing data/values:  Missing collection time of blood sample  Missing date of ECG  Missing location of chest radiograph  Missing systolic blood pressure  Missing microbiological culture transmittal ID

External Data Checks Examples of invalid data/values:  Incorrect loading of visit number  Incorrect loading of subject number  Incorrect loading of date/time of collection

Duplicate Data Checks  Refers to duplicate entries  within a single CRF  across similar CRFs  Duplicate entries & duplicate records to be deleted per guideline specifications Examples:  Treatment ‘physiotherapy’ on ‘30-Aug-2001’ reported twice on either same Treatment Record or across two different Treatment Records

Duplicate Data Checks Examples:  Both Visit 4 & Visit 10 Blood Chemistry CRFs (with different collection dates) are updated with same values for all tests performed  Both ‘primary’ & ‘additional’ Medical History CRFs at Screening are reported with same details of abnormalities Which one to Retain…?

Textual Data Checks  All textual data to be proofread & checked for spelling errors  Obvious mis-spellings to be corrected per Internal Correction (as specified by guidelines)  Common examples of textual data:  Abnormalities/pre-existing conditions in Medical History record  Adverse Events  Medications/Antibiotics  Project & study-specific data

Visit Sequence Checks  Sequence of visits should be reviewed & if out of sequence, should be either  queried  corrected per Internal Correction (as per guidelines)  Either a single CRF or a group of CRFs could be out of sequence with that particular visit

Visit Sequence Checks VisitVisit date 101-Jan Jan Jan Jan-2000 VisitVitals Record Date of Vitals 101-Jan Jan Jan Jan-2000 Screening RecordVisit date Demography20-Feb-2006 Med. History20-Feb-2005 Inclusion Criteria 20-Feb-2006 AE20-Feb-2006

SAE Reconciliation Checks  All SAEs reported on CRFs should be reconciled with those reported on SAE Reports & vice versa  Communication to be maintained with  Sponsor  Clinical Scientist

Questions ?

Thank you!