Download presentation
Presentation is loading. Please wait.
Published byΈρασμος Δασκαλόπουλος Modified over 6 years ago
1
Joel Saltz MD, PhD Director Center for Comprehensive Informatics
Data Archive Issues: Integrative Analysis of Pathology, Radiology and High Throughput Molecular Data Joel Saltz MD, PhD Director Center for Comprehensive Informatics
2
Objectives of Integrative Analysis
Create categories of jointly classified data to describe pathophysiology, predict prognosis, response to treatment Reproducible anatomic/functional characterization at gross level (Radiology) and fine level (Pathology) Integration of anatomic/functional characterization with multiple types of “omic” information Data modeling standards, semantics Curation and propagation of results
3
Informatics Approach Integrative analysis of complementary data sources to improve ability to forecast survival & response Leverage public “omic” datasets Create, analyze microscopy and Radiology datasets Radiology Imaging Patient Outcome “Omic” Data Pathologic Features
4
In Silico Center for Brain Tumor Research
Specific Aims: Influence of necrosis/ hypoxia on gene expression and genetic classification. Molecular correlates of high resolution nuclear morphometry. Gene expression profiles that predict glioma progression. 4. Molecular correlates of MRI enhancement patterns.
5
In Silico Center for Brain Tumor Research
Key Public Data Sets REMBRANDT: Gene expression and genomics data set of all glioma subtypes The Cancer Genome Atlas (TCGA): Rich “omics” set of GBM, digitized Pathology and Radiology Vasari Feature Set: Standardized annotation of gliomas of all subtypes
6
TCGA Research Network Digital Pathology Neuroimaging
7
Pathology Image Analysis and in Silico Example: Distinguishing Among the Gliomas
“There are also many cells which appear to be transitions between gigantic oligodendroglia and astrocytes. It is impossible to classify them as belonging in either group” Bailey P, Bucy PC. Oligodendrogliomas of the brain. J Pathol Bacteriol 1929: 32:735
8
Nuclear Qualities Oligodendroglioma Astrocytoma
9
Interaction between gene expression classification and Pathology Characterization
Hierarchical clustering of 176 Rembrandt samples using TCGA classification genes defines four major subtypes. Proneural Neural Mesenchymal Classical (Lee Cooper and Carlos Moreno)
10
Predicting Recurrence/Survival
Lee Cooper Carlos Moreno
11
TCGA: GBM Frozen Sections
179 cases were assessed for % necrosis on frozen section slides for TCGA quality assurance. Cox-based regression analysis for % necrosis vs. gene expression (795 probe sets; 647 distinct genes)
12
Network Analysis based on % Necrosis
Carlos Moreno David Gutman
13
Identify correlates of MRI enhancement patterns in astrocytic neoplasms with underlying vascular changes and gene expression profiles. Rim-enhancement Vascular Changes Rapid progression No enhancement Normal Vessels Stable lesion ?
14
Angiogenesis Segmentation
H&E Image Color Deconvolution Hematoxylin Eosin Eosin intensity image Eosin Image Spatial Norm. Density Image Calculation Boundary Smoothing Density Image Object ID Segmented Vessels Angiogenic Segmentation
15
Sharing and Archiving Multi-scale Integrative In Silico Experiments
Metadata Identifiers Federated Archives Semantic Query Support 9/17/20189/17/2018
16
Metadata Precisely and unambiguously describe an in silico experiment (reproducibility and reuse for new analyses) Experiment design and protocol Input datasets: context, characteristics, provenance Analysis methods and workflows: at different granularities Analysis results Semantic information: datasets, results, workflows Standards based, common data elements, and controlled vocabularies Versioned, scalable, and computable semantics Different metadata models for different data types Linkages through backbone models, common attributes, ontologies 9/17/20189/17/2018
17
Metadata: Semantic Information
Data and semantic models to represent virtual slide related image, annotation, markup and feature information. context relating to patient data, specimen preparation, special stains etc, human observations involving Path classification and characteristics and algorithm and human-described segmentations, features and classifications. Semantic description of the computation being carried out, semantic and concrete identification of input and output datasets. Desirable to have computable semantics 9/17/20189/17/2018
18
Identifiers Databases of in silico experiments may be updated over time Data may be moved to different archives Globally uniquely identify an in silico experiment and its components Should be possible to validate an identifier Scalable mechanisms for generation, assignment, and resolution of identifiers Leverage standards such as IHE PIX, LSID, URI 9/17/20189/17/2018
19
Federated Archives May not be possible to store in silico experiments and its components in a centralized location Multiple types of input data and results may be hosted in different systems caArray AIM and PAIS services PACS and NBIA Interoperability is critical Common interfaces Common information models: common data elements and controlled vocabularies 9/17/20189/17/2018
20
Semantic Query Capabilities
Ability to query metadata on experiments Ability to query data and analysis results Analysis results are semantically complex Features, shapes, annotations Use of domain ontologies Scalable mechanisms for query Large volumes of analysis results Millions of nuclei in high-resolution microscopy images 9/17/20189/17/2018
21
Challenges Structural and semantic metadata management: how to manage tradeoff between flexibility and curation Semantic modeling infrastructures and policies able to scale to handle distributed systems with an aggregate of 109 or more data models/concepts. How to curate changes to ontologies/data models and how to curate instance data in face of changing semantic models
22
Challenges Describe, store and replay analyses
Modeling needs to include a semantic description of the computation being carried out, curation/semantic description of input and output datasets. Identifiers and version control Computer assisted annotation and markup for very large datasets - large collection of cascaded algorithm/parameter dependencies challenge reproducibility
23
Thanks to: In silico center team: Dan Brat (Science PI), Tahsin Kurc, Ashish Sharma, Tony Pan, David Gutman, Jun Kong, Sharath Cholleti, Carlos Moreno, Chad Holder, Erwin Van Meir, Daniel Rubin, Tom Mikkelsen, Adam Flanders, Joel Saltz (Director) caGrid Knowledge Center: Joel Saltz, Mike Caliguiri, Steve Langella co-Directors; Tahsin Kurc, Himanshu Rathod Emory leads caBIG In vivo imaging team: Eliot Siegel, Paul Mulhern, Adam Flanders, David Channon, Daniel Rubin, Fred Prior, Larry Tarbox and many others In vivo imaging Emory team: Tony Pan, Ashish Sharma, Joel Saltz Emory ATC Supplement team: Tim Fox, Ashish Sharma, Tony Pan, Edi Schreibmann, Paul Pantalone Digital Pathology R01: Foran and Saltz; Jun Kong, Sharath Cholleti, Fusheng Wang, Tony Pan, Tahsin Kurc, Ashish Sharma, David Gutman (Emory), Wenjin Chen, Vicky Chu, Jun Hu, Lin Yang, David J. Foran (Rutgers)
24
Thanks!
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.