FDA Hematology and Pathology Devices Panel Meeting October 22-23, 2009


1 Practical Issues on Clinical Validation of Digital Imaging Applications in Routine Surgical Pathology
FDA Hematology and Pathology Devices Panel Meeting, October 22-23, 2009
Tan Nguyen, MD, PhD, RAC
FDA/CDRH/OIVD/DIHD-DCTD

2 Digitalization Not a Barrier to Pathologic Diagnosis
- Image-based telepathology has been in place for a number of years
- Capable automated high-speed, high-resolution whole slide imaging (WSI) technology is now available
- At issue: How can we demonstrate that pathologists can safely and effectively sign out routine surgical cases via WSI of H&E glass slides?
  - Compare with diagnoses made by light microscopy

3 Presentation Outline
- Quality of images
  - Image acquisition, image display
- Clinical performance study
  - Possible study designs
  - Selection of study participants
  - Case (specimen) selection
  - Establishing "reference" diagnosis
  - Evaluating diagnosis agreement
- Other issues

4 Image Acquisition
- Optimal objective lens power for image scanning?
- Digital magnification or magnification by interchangeable objective lenses?
- Single focal plane or 3-D image enhancement?
  - Z-stacks needed for certain examinations (e.g., surgical margins, H. pylori, microcalcifications, nucleoli)
- Compression algorithm, user-selectable ratio? (see the sketch below)
- Diagnosis made on the uncompressed image or on an image retrieved from a previously compressed image data file?
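
As an illustration of how a user-selectable compression setting could be characterized, the minimal Python sketch below re-encodes a scanned image tile at a chosen JPEG quality and reports the resulting compression ratio. The file name and quality value are hypothetical placeholders, not drawn from the presentation.

```python
from io import BytesIO

from PIL import Image  # pip install Pillow


def compression_ratio(tile_path: str, quality: int = 80) -> float:
    """Re-encode an image tile as lossy JPEG and return the ratio of
    uncompressed (raw RGB) size to compressed size."""
    tile = Image.open(tile_path).convert("RGB")
    raw_size = tile.width * tile.height * 3  # 3 bytes per RGB pixel

    buffer = BytesIO()
    tile.save(buffer, format="JPEG", quality=quality)
    compressed_size = buffer.getbuffer().nbytes

    return raw_size / compressed_size


if __name__ == "__main__":
    # "tile.png" and quality=80 are illustrative values only.
    ratio = compression_ratio("tile.png", quality=80)
    print(f"Approximate compression ratio: {ratio:.1f}:1")
```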

5 Image Display
- Viewing monitor
  - Standardized size, aspect ratio, display resolution (low, medium, high)?
- Viewing software
  - Image storage, retrieval, annotation
- Viewer functionality (see the sketch below)
  - "Thumbnail" view
  - Panning, zooming, side-by-side viewing of multiple images
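
To make the viewer functions concrete, here is a minimal sketch using the open-source OpenSlide Python bindings (an assumption; the presentation does not name any particular software) to pull a low-resolution thumbnail and a full-resolution region from a pyramidal WSI file. The file name, coordinates, and region size are placeholders.

```python
import openslide  # pip install openslide-python (requires the OpenSlide C library)

# "slide.svs", the coordinates, and the region size are illustrative values only.
slide = openslide.OpenSlide("slide.svs")

# "Thumbnail" view: a small overview image of the whole slide.
thumbnail = slide.get_thumbnail((512, 512))
thumbnail.save("overview.png")

# Panning/zooming: read a full-resolution region at level 0,
# anchored at (x, y) in level-0 pixel coordinates.
region = slide.read_region(location=(30_000, 20_000), level=0, size=(1_024, 1_024))
region.convert("RGB").save("region.png")

slide.close()
```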

6 Types of Possible Clinical Study
- Prospective study ("field study")?
  - Replicating real-world surgical pathology practice
  - Minimizing case selection bias
  - Introducing multiple new sources of variation
    - e.g., non-uniform specimen selection/suitability, variable quality of glass slides
  - Impractical?
    - Resource constraints at each study site
    - Possibly longer overall study duration

7 Types of Possible Clinical Study
- Retrospective study?
  - Ability to select archival cases to challenge ("stress test") the competing diagnostic modalities
  - Possible to incorporate more case variation
  - Inherent case selection bias
  - Often employed in MRMC ROC studies* to assess diagnostic accuracy of radiologic imaging interpretations
  - Large study to detect small differences in accuracy possible

* Multiple-reader, multiple-case receiver operating characteristic studies

8 MRMC ROC Paradigm
- Possible to adopt the MRMC ROC paradigm? (see the sketch below)
  - Frequently used tool in diagnostic radiology
  - More information per case, smaller sample sizes
  - Ability to compare the accuracy of diagnostic modalities that rely on a wide range of subjective interpretations by readers of varying skill levels
  - Generalizable to similar readers and similar cases
  - Potentially complicated by multiple observations (diagnoses) in the same specimen
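
The core input to an MRMC ROC analysis is a reader-by-case matrix of confidence scores for each modality, from which a per-reader AUC is computed and then summarized across readers; full MRMC methods (e.g., Obuchowski-Rockette or Dorfman-Berbaum-Metz) additionally model reader and case variance. The sketch below, with invented scores and labels, shows only the per-reader AUC step and is not the analysis used in any actual submission.

```python
import numpy as np
from sklearn.metrics import roc_auc_score  # pip install scikit-learn

# Hypothetical data: 4 readers x 6 cases, confidence scores (0-100) that a
# lesion is malignant, recorded per modality, plus reference truth labels.
truth = np.array([1, 0, 1, 1, 0, 0])  # 1 = malignant by reference diagnosis
scores = {
    "light_microscopy": np.array([[90, 20, 70, 85, 30, 10],
                                  [80, 35, 60, 90, 25, 15],
                                  [95, 10, 80, 70, 40, 20],
                                  [85, 30, 65, 80, 20, 25]]),
    "whole_slide_imaging": np.array([[88, 25, 65, 80, 35, 15],
                                     [75, 40, 55, 85, 30, 20],
                                     [92, 15, 75, 65, 45, 25],
                                     [80, 35, 60, 75, 25, 30]]),
}

for modality, reader_scores in scores.items():
    aucs = [roc_auc_score(truth, s) for s in reader_scores]  # one AUC per reader
    print(f"{modality}: reader AUCs = {np.round(aucs, 2)}, mean = {np.mean(aucs):.2f}")
```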

9 Selecting Study Participants
- A spectrum from pathologists without formal specialty training to specialty experts, or a more homogeneous population?
- Prior exposure to digital pathology
- Study locations
  - Community/academic practices, commercial laboratories
- Number of study participants?
  - Traditional MRMC ROC studies: readers; cases

10 Selecting a Balanced Set of Cases
- Adequate mix of specimens, from biopsies to radical excisions
- Broad spectrum of diagnostic complexity
  - Not based on ease of diagnosis or typicality of appearance
- Randomly or sequentially selected specimens
- Anonymized archival or prospectively collected cases
- Use of enriched samples for low-prevalence diseases?
- Including all or only representative diagnostic part(s)?
- How many cases? Statistical power weighed against reader burden (see the sketch below)
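
One way to frame the "how many cases" question is a sample-size calculation for detecting a difference in agreement rates between the two modalities. The sketch below uses statsmodels' normal-approximation power machinery with invented planning values (95% vs. 90% agreement, two-sided alpha 0.05, 80% power); the numbers are placeholders, not recommendations, and a real MRMC design would also need to account for reader and case correlation.

```python
from statsmodels.stats.power import NormalIndPower  # pip install statsmodels
from statsmodels.stats.proportion import proportion_effectsize

# Hypothetical planning values: 95% agreement with the reference diagnosis by
# light microscopy vs. 90% by whole slide imaging.
effect_size = proportion_effectsize(0.95, 0.90)

cases_per_arm = NormalIndPower().solve_power(
    effect_size=effect_size,
    alpha=0.05,                # two-sided type I error
    power=0.80,                # desired power
    alternative="two-sided",
)
print(f"Approximate cases per arm (independent-samples approximation): {cases_per_arm:.0f}")
```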

11 Observer Variation
- Inherent subjectivity in interpretation thresholds
  - e.g., "atypia," tumor grading, borderline or uncommon lesions
- Paucity of lesional area; intra-lesional variation
- Lack of clear diagnostic criteria
- Non-quantitative nature of scoring (e.g., pleomorphism)
- Subjective distinctions on a histologic continuum
- Broad spectrum of experience and confidence
- Diagnostic "aggressiveness" or hedging under uncertainty

12 Reducing Observer Variation
- Strict adherence to diagnostic criteria and guidelines
- Use of a pro forma histopathology reporting form
  - Use of a checklist of standardized diagnostic lines
- Free-text diagnosis for diagnostic uncertainty?
  - Accommodates personal reporting style and judgment
  - Statistically problematic to evaluate
- Collapsed 2-tiered versus 3-tiered grading system? (see the sketch below)
- Circulating an annotated training set prior to the study?
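
The sketch below illustrates, with a hypothetical mapping, what collapsing a 3-tiered grading scheme into 2 tiers could look like when standardized diagnostic lines are captured as structured data rather than free text; the category names and grouping rule are assumptions, not drawn from the presentation.

```python
from dataclasses import dataclass

# Hypothetical collapse of a 3-tiered grade into a 2-tiered one for analysis.
THREE_TO_TWO_TIER = {
    "grade 1": "low grade",
    "grade 2": "low grade",   # assumption: intermediate grouped with low
    "grade 3": "high grade",
}


@dataclass
class DiagnosticLine:
    """One standardized line from a pro forma reporting form."""
    site: str
    diagnosis: str
    grade_3_tier: str

    @property
    def grade_2_tier(self) -> str:
        return THREE_TO_TWO_TIER[self.grade_3_tier]


line = DiagnosticLine(site="breast, left", diagnosis="invasive ductal carcinoma",
                      grade_3_tier="grade 2")
print(line.grade_2_tier)  # -> "low grade"
```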

13 Establishing the Light Microscopy "Reference" Diagnosis
- Diagnosis by an expert or by a consensus panel?
  - Number of experts
- Consensus diagnosis by the study participants themselves?
- Unanimous agreement or majority agreement? (see the sketch below)
- Allowing an "acceptable" diagnosis?
  - Disagreement in opinion, but not "error" (i.e., no amendment necessary)?
- Should the "reference" diagnosis be an abstraction of the primary diagnosis or include all diagnostic lines?
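
To illustrate the unanimous-versus-majority question, here is a minimal sketch, with invented panel reads, that derives a reference diagnosis by majority vote and flags cases where the panel is split; the adjudication rule and data are assumptions for illustration, not the panel's recommendation.

```python
from collections import Counter

# Hypothetical panel reads for three cases.
panel_reads = {
    "case-001": ["invasive carcinoma", "invasive carcinoma", "invasive carcinoma"],
    "case-002": ["DCIS", "DCIS", "ADH"],
    "case-003": ["ADH", "DCIS", "usual ductal hyperplasia"],
}

for case_id, reads in panel_reads.items():
    diagnosis, votes = Counter(reads).most_common(1)[0]
    if votes == len(reads):
        status = "unanimous"
    elif votes > len(reads) / 2:
        status = "majority"
    else:
        status = "no consensus; needs adjudication"
        diagnosis = None
    print(f"{case_id}: reference = {diagnosis!r} ({status})")
```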

14 Evaluating Diagnosis Agreement
- Primary diagnosis agreement only?
  - Secondary diagnoses often pose no clinical impact
  - Yet it is unacceptable for a pathologist simply to make an accurate diagnosis of malignancy and nothing more!
- Line-by-line agreement (primary and secondary diagnoses)? (see the sketch below)
  - Ideal for collecting performance testing data
  - Unrealistic to expect high agreement without clearly defined diagnostic criteria for all lesions under inquiry
  - How to handle incomplete agreement on secondary diagnoses?
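
As a sketch of how primary-only and line-by-line agreement could be tabulated, the Python below computes both rates from invented case data with a simple exact-match rule; chance-corrected statistics such as Cohen's kappa would typically be layered on top of counts like these.

```python
# Hypothetical reads: for each case, the reference and a participant's
# diagnostic lines, with the primary diagnosis listed first.
reference = {
    "case-001": ["invasive carcinoma", "DCIS"],
    "case-002": ["fibroadenoma"],
    "case-003": ["DCIS", "microcalcifications"],
}
participant = {
    "case-001": ["invasive carcinoma", "DCIS"],
    "case-002": ["fibroadenoma", "usual ductal hyperplasia"],
    "case-003": ["ADH", "microcalcifications"],
}

primary_hits = sum(reference[c][0] == participant[c][0] for c in reference)
line_hits = sum(set(reference[c]) == set(participant[c]) for c in reference)

n = len(reference)
print(f"Primary-diagnosis agreement: {primary_hits}/{n}")
print(f"Line-by-line (exact set) agreement: {line_hits}/{n}")
```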

15 Evaluating Diagnosis Agreement
- "Major" versus "minor" discrepancy (see the sketch below)
  - Determined by clinical impact or by flat-out histopathologic error?
  - Compound nevus versus junctional nevus; CIN II versus CIN III → a flat-out error, but no difference in treatment
  - Tumor on the inked margin versus within 1 mm of the inked margin in a breast biopsy → often a subjective call if the specimen is not adequately inked, but it greatly affects the treatment decision
- False-positive versus false-negative diagnosis?
  - Treated differently or equally in the statistical evaluation?

16 Evaluating Diagnosis Agreement
[Diagram: the panel's "reference" diagnoses by light microscopy, the participants' diagnoses by light microscopy, and the participants' diagnoses by digital pathology, with the pairwise agreement comparisons labeled R1, R2, and R3.]

17 “Wash-out” Period E.g., a study involving the same pathologist reading: ½ cases: digital imaging followed by light microscopy ½ cases: light microscopy followed by digital imaging “Wash-out” period between digital imaging reading and light microscopy reading? Easier said than done! Not necessary, if desirable to know whether one modality, when seen first, resulting in improved agreement rate of the subsequent one?

18 Evaluating Diagnosis Agreement
- If there is significant disagreement between R1, R2, and R3, possible sources include:
  - Case-sample variation
  - Intra- and interobserver variation
  - Variation intrinsic to each diagnostic modality
- Is it possible, or necessary, to tease out all of these variations?
- Or, account for the effects of case and reader variation on the accuracy of the competing diagnostic modalities?
  - e.g., by MRMC statistical models, then comparing the overall accuracy (see the sketch below)
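
Full MRMC models (e.g., Obuchowski-Rockette or Dorfman-Berbaum-Metz) partition reader and case variance explicitly. A much simpler stand-in, sketched below with invented agreement data, is to bootstrap over both cases and readers when estimating the uncertainty of the between-modality difference in agreement; this is an illustrative simplification, not the method the presentation endorses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical agreement-with-reference indicators: readers x cases, per modality.
lm = np.array([[1, 1, 0, 1, 1, 1], [1, 0, 1, 1, 1, 0], [1, 1, 1, 1, 0, 1]])
wsi = np.array([[1, 1, 0, 1, 0, 1], [1, 0, 1, 1, 1, 0], [1, 1, 1, 0, 0, 1]])

observed_diff = lm.mean() - wsi.mean()

diffs = []
for _ in range(5_000):
    readers = rng.integers(0, lm.shape[0], lm.shape[0])  # resample readers
    cases = rng.integers(0, lm.shape[1], lm.shape[1])    # resample cases
    diffs.append(lm[np.ix_(readers, cases)].mean() - wsi[np.ix_(readers, cases)].mean())

lo, hi = np.percentile(diffs, [2.5, 97.5])
print(f"Observed difference (LM - WSI): {observed_diff:.3f}, 95% bootstrap CI [{lo:.3f}, {hi:.3f}]")
```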

19 Other Issues
- Assuming valid performance data exist for one tissue type (e.g., breast pathology):
  - Can the test system be generalized and labeled for all other surgical pathology tissue types without the need for further validations?
  - Can it be generalized and labeled for intraoperative (frozen section) diagnosis and telepathology?
  - If not, how should the label explicitly state the test system's limitations?

20 Other Issues
- Generalizing performance of WSI of H&E glass slides to non-H&E-stained glass slides?
- Required training of pathologists prior to using WSI? What type of training?
- Need for a post-marketing study for additional safety and effectiveness data? How should such a study be conducted? What data should be collected?

