Integrative Analysis of Pathology, Radiology and High Throughput Molecular Data Joel Saltz MD, PhD Director Center for Comprehensive Informatics.


1 Integrative Analysis of Pathology, Radiology and High Throughput Molecular Data Joel Saltz MD, PhD Director Center for Comprehensive Informatics

2 Objectives Reproducible anatomic/functional characterization at gross level (Radiology) and fine level (Pathology) Integration of anatomic/functional characterization with multiple types of “omic” information Create categories of jointly classified data to describe pathophysiology, predict prognosis, response to treatment Data modeling standards, semantics Data research issues

3 In Silico Program Objectives (from NCI) In silico is an expression used to mean "performed on computer or via computer simulation" (Wikipedia: http://en.wikipedia.org/wiki/Computer, http://en.wikipedia.org/wiki/Computer_simulation). In silico science centers support investigator-initiated, hypothesis-driven research in the etiology, treatment, and prevention of cancer using in silico methods: generating and publishing novel cancer research findings leveraging caBIG tools and infrastructure; identifying novel bioinformatics processes and tools to exploit existing data resources; encouraging the development of additional data resources and caBIG analytic services; assessing the capabilities of current caBIG tools. Emory, Columbia, Georgetown, Fred Hutchinson Cancer, Translational Genomics Research Institute

4 In Silico Center for Brain Tumor Research Specific Aims: 1. Influence of necrosis/hypoxia on gene expression and genetic classification. 2. Molecular correlates of high resolution nuclear morphometry. 3. Gene expression profiles that predict glioma progression. 4. Molecular correlates of MRI enhancement patterns. Workflow diagram: the source data sets (TCGA, Rembrandt, Vasari) supply Molecular Data, Image Data, and Tissue Specimens. Tissue Specimens feed Molecular Data Generation and Digitized Pathology Slides. Molecular Data (including data generated from tissue specimens) goes to Molecular Analysis; Image Data goes to Radiology Image Analysis & Annotation; Digitized Pathology Slides go to Pathology Image Analysis & Annotation. All three analyses feed Correlative Analysis, Quality Control, Clinical and Biologic Assessment. From there the flow either loops back through Refinement of Analysis Methods to the three analyses, or proceeds to Research Reports and caBIG-compatible, caGrid-enabled data and analytical resources.

5 Informatics Requirements Parallel initiatives: Pathology, Radiology, “omics”. Exploit synergies between all initiatives to improve ability to forecast survival and response. Informatics requirements involve four areas: Radiology Imaging, Patient Outcome, Pathologic Features, and “Omic” Data. Each area has its own distinct requirements, but the areas share some similarities.

6 In Silico Center for Brain Tumor Research Key Data Sets REMBRANDT: Gene expression and genomics data set of all glioma subtypes The Cancer Genome Atlas (TCGA): Rich “omics” set of GBM, digitized Pathology and Radiology Vasari Feature Set: Standardized annotation of gliomas of all subtypes

7 TCGA Research Network The TCGA Research Network pipeline starts with Molecular Analysis from BCR. Data then branch into different paths depending on what needs to be analyzed, and all paths end at Integrated Multi-Dimensional Analysis. Starting at Molecular Analysis from BCR, data go to one of: Gene Sequencing → Genome Sequencing Centers → Sanger Sequencing Phase I Genes & Phase II Genes → Validation → Gene Mutation → Integrated Multi-Dimensional Analysis. DNA Copy Number Analysis → Agilent Human Genome CGH Microarray 244A, Affymetrix Genome Wide SNP Array 6.0, Illumina Infinium 550K Bead Chip → Copy Number Change Consensus → Integrated Multi-Dimensional Analysis. DNA Methylation Analysis → Illumina GoldenGate BeadArray → Integrated Multi-Dimensional Analysis. Transcriptome Analysis → Affymetrix Human Genome U133 Plus 2.0 Array, Agilent 244K Array, Affymetrix GeneChip Human Exon 1.0 ST Array → Gene Expression Consensus → Integrated Multi-Dimensional Analysis. MicroRNA Analysis → Agilent microRNA → Integrated Multi-Dimensional Analysis. The only exception is Clinical Data, which goes straight to Integrated Multi-Dimensional Analysis. Clinical data includes digital pathology and neuroimaging.

8 Distinguishing Among the Gliomas “There are also many cells which appear to be transitions between gigantic oligodendroglia and astrocytes. It is impossible to classify them as belonging in either group” Bailey P, Bucy PC. Oligodendrogliomas of the brain. J Pathol Bacteriol 1929: 32:735

9 Correlate nuclear shape and texture features of gliomas to genetics and gene expression defined by REMBRANDT and TCGA data sets. Define specific features that carry genetic and prognostic weight. Determine molecular correlates of high resolution nuclear morphometry of gliomas using Rembrandt and TCGA datasets.

10 Nuclear Qualities In the oligodendroglioma microscope image, the nucleus is round, like a circle. In the astrocytoma image, the nucleus is more oblong. Panels: Oligodendroglioma, Astrocytoma

11 TCGA Neuropathology Attributes 120 TCGA specimens; 3 reviewers. Presence and degree of: Microvascular hyperplasia (complex/glomeruloid, endothelial hyperplasia); Necrosis (pseudopalisading pattern, zonal necrosis); Inflammation (macrophages/histiocytes, lymphocytes, neutrophils); Differentiation (small cell component, gemistocytes, oligodendroglial, multi-nucleated/giant cells, epithelial metaplasia, mesenchymal metaplasia); Other features (perineuronal/perivascular satellitosis, entrapped gray or white matter, micro-mineralization)

12 TCGA Whole Slide Images Feature Extraction Feature extraction covers four categories, each with its own features. 1. Nuclear Morphometry: Nuclei Area, Nuclei Perimeter, Eccentricity, Circularity, Major Axis, Minor Axis, Extent Ratio, Fourier Shape Descriptor. 2. Intensity Information: Avg Inty, Std Inty, Max Inty, Min Inty. 3. Texture Information: Entropy, Energy, Skewness, Kurtosis. 4. Gradient Statistics: Avg GM, Std GM, Entropy GM, Skewness GM, Energy GM, Kurtosis GM, Edge Pixel Summation, Edge Pixel Percentage. (Jun Kong)
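
The slide names the nuclear morphometry features but not the formulas behind them. The following is a minimal sketch of three of them (area, perimeter, circularity) computed from a nucleus boundary given as a polygon. The function names, the polygon input, and the circularity definition 4*pi*area/perimeter^2 are assumptions made for illustration, not the method used in the presentation.

```python
import math

def polygon_area(points):
    """Area by the shoelace formula; points are (x, y) vertices in boundary order."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def polygon_perimeter(points):
    """Sum of edge lengths around the closed boundary."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))

def circularity(points):
    """Assumed definition: 4*pi*area / perimeter^2 (1.0 for a circle, lower when elongated)."""
    perimeter = polygon_perimeter(points)
    return 4.0 * math.pi * polygon_area(points) / (perimeter * perimeter)

def ellipse_outline(a, b, n=360):
    """Hypothetical test shape: n boundary points of an ellipse with semi-axes a and b."""
    return [(a * math.cos(2 * math.pi * k / n), b * math.sin(2 * math.pi * k / n))
            for k in range(n)]

round_nucleus = ellipse_outline(5, 5)    # round outline
oblong_nucleus = ellipse_outline(10, 3)  # elongated outline
print(circularity(round_nucleus), circularity(oblong_nucleus))
```

Under this definition a round outline scores close to 1 and an elongated one scores lower, which is the kind of contrast a shape feature needs in order to separate round from oblong nuclei.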

13 Astrocytoma vs Oligodendroglioma Overlap in genetics, gene expression, histology. Assess nuclear size (area and perimeter), shape (eccentricity, circularity, major axis, minor axis, Fourier shape descriptor and extent ratio), intensity (average, maximum, minimum, standard error) and texture (entropy, energy, skewness and kurtosis).
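
As an illustration of the intensity and texture features listed above, here is a minimal sketch that computes them from the pixel intensities inside one nucleus. The histogram-based definitions of entropy and energy, the use of the population standard deviation, and all function names are assumptions; the slides do not state which definitions were used.

```python
import math
from collections import Counter

def intensity_stats(pixels):
    """Average, standard deviation (population), maximum and minimum intensity."""
    n = len(pixels)
    mean = sum(pixels) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in pixels) / n)
    return {"avg": mean, "std": std, "max": max(pixels), "min": min(pixels)}

def texture_stats(pixels):
    """Entropy and energy of the intensity histogram, plus skewness and kurtosis."""
    n = len(pixels)
    probs = [count / n for count in Counter(pixels).values()]
    entropy = -sum(p * math.log2(p) for p in probs)
    energy = sum(p * p for p in probs)
    stats = intensity_stats(pixels)
    mean, std = stats["avg"], stats["std"]
    if std == 0:
        # A constant patch has no spread, so the standardized moments are set to 0.
        skewness = kurtosis = 0.0
    else:
        skewness = sum(((v - mean) / std) ** 3 for v in pixels) / n
        kurtosis = sum(((v - mean) / std) ** 4 for v in pixels) / n
    return {"entropy": entropy, "energy": energy,
            "skewness": skewness, "kurtosis": kurtosis}

patch = [0, 0, 255, 255]  # hypothetical pixel values
print(intensity_stats(patch))
print(texture_stats(patch))
```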

14 Machine-based Classification of TCGA GBMs (J Kong) Whole slide scans from 14 TCGA GBMs (69 slides): 7 purely astrocytic in morphology; 7 with 2+ oligo component. 399,233 nuclei analyzed for astro/oligo features. Cases were categorized based on ratio of oligo/astro cells. TCGA Gene Expression Query: c-Met overexpression

15 Nuclear Feature Analysis: TCGA Using the parallel computation infrastructure of Sun Grid Engine, we analyzed 4096x4096 image tiles from 213 whole-slide TCGA images of permanent tissue sections. Approximately 90 million nuclei were segmented. Of 79 patients, 57 are diagnosed as GBM (‘oligo 0’), 17 as GBM with ‘oligo 1’, and 5 as GBM with ‘oligo 2+’. With each data file including all nuclear features from one patient, all nuclei were classified, with blue, green, and red representing nuclei scored as 1~3, 4~6, and 7~10, respectively.
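
Two steps described on this slide lend themselves to a short sketch: cutting a whole-slide image into 4096x4096 tiles that can be processed independently, and mapping a nucleus score to its display color. The function names are hypothetical, edge tiles are assumed to be clipped to the image, and scores are assumed to be integers from 1 to 10.

```python
def tile_regions(width, height, tile=4096):
    """Yield (x, y, w, h) tiles covering a width x height image.

    Tiles on the right and bottom edges are clipped to the image bounds.
    """
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            yield x, y, min(tile, width - x), min(tile, height - y)

def score_color(score):
    """Color class from the slide: 1-3 blue, 4-6 green, 7-10 red."""
    if 1 <= score <= 3:
        return "blue"
    if 4 <= score <= 6:
        return "green"
    if 7 <= score <= 10:
        return "red"
    raise ValueError(f"score out of range: {score}")

# Hypothetical slide size; each tile could be submitted as one cluster job.
tiles = list(tile_regions(10000, 5000))
print(len(tiles), tiles[-1])
```

Because each tile is independent, a scheduler such as Sun Grid Engine can run one segmentation job per tile and merge the per-nucleus features afterward.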

16 Discriminating Features (Grade 1 vs. Grade 7-10) The graph plots probability density function against circularity for oligo 0 and oligo 2. The oligo 2 curve peaks at a density of 0.03, at a circularity of about 0.6. The oligo 0 curve peaks at a density slightly above 0.03 (well below 0.035), at a circularity slightly over 1 (below 1.2).
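
A comparison like the one in this graph can be approximated by estimating a density for the circularity values of each group and locating its peak. The sketch below uses a normalized histogram; the binning scheme, the function names, and the sample values are assumptions, and the normalization used for the original plot is not stated on the slide.

```python
def density_histogram(values, lo, hi, bins):
    """Normalized histogram: bin heights times bin width sum to 1."""
    width = (hi - lo) / bins
    counts = [0] * bins
    kept = 0
    for v in values:
        if lo <= v <= hi:
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
            kept += 1
    if kept == 0:
        return [0.0] * bins
    return [c / (kept * width) for c in counts]

def peak_location(values, lo, hi, bins):
    """Center of the bin with the highest estimated density."""
    density = density_histogram(values, lo, hi, bins)
    width = (hi - lo) / bins
    best = max(range(bins), key=lambda i: density[i])
    return lo + (best + 0.5) * width

circularities = [0.1, 0.45, 0.5, 0.55, 0.9]  # hypothetical values for one group
print(density_histogram(circularities, 0.0, 1.0, 5))
print(peak_location(circularities, 0.0, 1.0, 5))
```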

17 Progression to GBM Anaplastic Astrocytoma (WHO grade III), Glioblastoma (WHO grade IV). Glioblastoma is a devastating disease with a very poor prognosis. Brain tumors are divided into 4 WHO grades. Grade 1 (pilocytic) rarely advances to the other grades. Grade 4, the highest grade, is the most clinically aggressive: very rapidly growing and infiltrative, with hypercellularity, nuclear atypia, mitoses, necrosis, and endothelial cell proliferation.

18 Examine gene expression profiles of low grade gliomas that progress to GBM for predictive clustering and correlates with pathologic and radiologic features. (Figure labels: Imaging, Pathology, Molecular; Time: 1 – 8 yrs)

19 Hierarchical clustering of 176 Rembrandt samples using TCGA classification genes defines four major subtypes. The image displays the clusters, from left to right: A) Proneural B) Neural C) Mesenchymal D) Classical (Lee Cooper and Carlos Moreno)

20 Predicting Recurrence/Survival (Lee Cooper, Carlos Moreno) Top left graph: Survival vs. Time (months) of All Cases (n=176), comparing the four major subtypes: Proneural, Neural, Mesenchymal, Classical. Top right graph: Survival vs. Time (months) of GBM Cases (n=101), comparing the four major subtypes: Proneural, Neural, Mesenchymal, Classical. Bottom left graph: Survival vs. Time (months) of Oligodendrogliomas (n=43), comparing Classical, Neural, Proneural. Bottom right graph: Survival vs. Time (months) of Astrocytomas (n=32), comparing the four major subtypes: Proneural, Neural, Mesenchymal, Classical.
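
Survival-versus-time curves of this kind are commonly drawn with the Kaplan-Meier estimator. The slide does not say how its curves were computed, so the following is only a sketch of that standard estimator, run on made-up follow-up data; one such curve would be computed per subtype.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up time for each patient (e.g. months)
    events: True if death was observed at that time, False if censored
    Returns a list of (time, survival_probability) at each observed event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical cohort, not data from the presentation.
months = [5, 8, 8, 12, 20]
died = [True, True, False, True, False]
print(kaplan_meier(months, died))
```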

21 GBM: necrosis, hypoxia, angiogenesis and gene expression Does the presence or degree of necrosis within digitized frozen section slides correlate with specific gene expression patterns or determine algorithm-based unsupervised clustering of GBM gene expression categories? Does the presence or degree of necrosis influence the type of angiogenesis or pro-angiogenic gene expression patterns within human gliomas?

22 GBM: % Necrosis

23 TCGA: GBM Frozen Sections 179 cases were assessed for % necrosis on frozen section slides for TCGA quality assurance. Cox-based regression analysis for % necrosis vs. gene expression (795 probe sets; 647 distinct genes)

24 Carlos Moreno David Gutman Network Analysis based on % Necrosis

25 Identify correlates of MRI enhancement patterns in astrocytic neoplasms with underlying vascular changes and gene expression profiles. No enhancement Normal Vessels Stable lesion ? Rim-enhancement Vascular Changes Rapid progression

26 Angiogenesis Segmentation Pipeline stages shown: H&E Image, Color Deconvolution, Hematoxylin Image, Eosin Image (eosin intensity image), Spatial Norm., Density Calculation, Density Image, Object ID, Boundary Smoothing, Segmented Vessels
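
The color deconvolution stage of this pipeline separates an H&E image into hematoxylin and eosin channels. Below is a minimal single-pixel sketch of the usual optical-density approach. The stain vectors are illustrative values chosen as an assumption (real pipelines calibrate them per scanner and stain batch), the third vector is built as a cross product so the matrix can be inverted, and none of the names come from the presentation.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

# Assumed optical-density directions (R, G, B) for each stain.
HEMATOXYLIN = normalize([0.65, 0.70, 0.29])
EOSIN = normalize([0.07, 0.99, 0.11])
RESIDUAL = normalize(cross(HEMATOXYLIN, EOSIN))
STAINS = [HEMATOXYLIN, EOSIN, RESIDUAL]

def invert3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return [[(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
            [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
            [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det]]

INVERSE = invert3(STAINS)

def deconvolve(rgb, background=255.0):
    """Return [hematoxylin, eosin, residual] amounts for one RGB pixel."""
    od = [-math.log10(max(v, 1.0) / background) for v in rgb]
    return [sum(od[k] * INVERSE[k][j] for k in range(3)) for j in range(3)]

def stain_to_rgb(amounts, background=255.0):
    """Forward model, used here only to build a test pixel."""
    od = [sum(amounts[s] * STAINS[s][ch] for s in range(3)) for ch in range(3)]
    return [background * 10 ** (-v) for v in od]

pixel = stain_to_rgb([0.8, 0.3, 0.0])
print(deconvolve(pixel))
```

Applying `deconvolve` to every pixel and keeping the second component would give an eosin intensity image of the kind the later pipeline stages operate on.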

27 States of Angiogenesis The stages of angiogenesis include: Endothelial hypertrophy, Endothelial hyperplasia, Complex microvascular hyperplasia (Lee Cooper, Sharath Cholleti)

28 Vessel Characterization Bifurcation detection
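
The slide does not describe how bifurcations are detected. One common approach, shown here purely as an assumed illustration, is to thin the segmented vessel to a one-pixel-wide skeleton and flag pixels whose 8-neighborhood contains three or more separate branches (the crossing number). The skeleton grid and function name are hypothetical.

```python
# Neighbors in circular order: N, NE, E, SE, S, SW, W, NW.
RING = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]

def find_bifurcations(skeleton):
    """Return (row, col) of skeleton pixels where three or more branches meet.

    skeleton is a list of rows of 0/1 values, assumed already thinned
    to one-pixel-wide lines.
    """
    rows, cols = len(skeleton), len(skeleton[0])

    def at(r, c):
        return 1 if 0 <= r < rows and 0 <= c < cols and skeleton[r][c] else 0

    hits = []
    for r in range(rows):
        for c in range(cols):
            if not skeleton[r][c]:
                continue
            ring = [at(r + dr, c + dc) for dr, dc in RING]
            # Count background-to-vessel transitions around the pixel.
            crossings = sum(1 for k in range(8)
                            if ring[k] == 0 and ring[(k + 1) % 8] == 1)
            if crossings >= 3:
                hits.append((r, c))
    return hits

# Hypothetical Y-shaped vessel skeleton.
picture = ["1...1",
           ".1.1.",
           "..1..",
           "..1..",
           "..1.."]
skeleton = [[1 if ch == "1" else 0 for ch in row] for row in picture]
print(find_bifurcations(skeleton))
```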

29 Vasari Imaging Criteria (Adam Flanders, TJU; Dan Rubin, Stanford, Lori Dodd, NCI) Require standardized validated feature sets to describe de novo disease. Fundamental obstacle to new imaging criteria as treatment biomarkers is lack of standard terminology: To define a comprehensive set of imaging features of cancer For reporting imaging results To provide a more quantitative, reproducible basis for assessing baseline disease and treatment response

30 Defining Rich Set of Qualitative and Quantitative Image Biomarkers Community-driven ontology development project; collaboration with ASNR Imaging features (5 categories) Location of lesion Morphology of lesion margin (definition, thickness, enhancement, diffusion) Morphology of lesion substance (enhancement, PS characteristics, focality/multicentricity, necrosis, cysts, midline invasion, cortical involvement, T1/FLAIR ratio) Alterations in vicinity of lesion (edema, edema crossing midline, hemorrhage, pial invasion, ependymal invasion, satellites, deep WM invasion, calvarial remodeling) Resection features (extent of nCE tissue, CE tissue, resected components)

31 Coupling in silico methodology with a clinical trial: Will treatment work and if not, why not? Example: Avastin and Glioblastoma as in RTOG-0825 (plus institutional accrual). Treatment: Radiation therapy and Avastin (anti-angiogenesis). Predict and Explain: Genetic, gene expression, microRNA, Pathology, Imaging. Imaging/RT reproducibility, Integration with EMR, PACS, RT systems

32 Use of caBIG Tools in Emory in Silico Brain Tumor Research Center

33 Same Infrastructure – Different Integrative Project: Image Mining for Comparative Analysis of Expression Patterns in Tissue Microarray (PI’s: Foran and Saltz) Build reference library of expression signatures, integrate state-of-the-art multi-spectral imaging capability and build a deployable clinical decision support system for analyzing imaged specimens. Technologies and computational tools developed during the course of the project to be tested on a Grid-enabled, virtual laboratory established among strategic sites located at CINJ, Emory, RU, UPenn, OSU, and ASU. Funded by NIH through grant #5R01LM009239-02 David J. Foran, Ph.D.

34 Annotations and Imaging Markup (AIM) Annotations and Imaging Markup Developer (AIM) provides a standard for medical image annotation and markup for images used in the research space, and in particular, the image based cancer clinical trial. It is notable that there is no existing standard for radiology annotation and markup. The caBIG® program is working with almost every standards body such as DICOM to elicit consensus regarding use of AIM as the accepted standard for radiology annotation and markup, and is positioned to extend AIM to digital pathology. Example annotation: The pixel at the tip of the arrow [coordinates (x,y)] in this image [DICOM: 1.2.814.234543.23243] represents the Ascending Thoracic Aorta [SNOMED:A3310657]. AIM Data Service Emory (AIME) is a caGrid data service that manages AIM documents in XML databases. AIME supports query, enumerationQuery, queryByTransfer, submit and submitByTransfer methods.
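
To make the example annotation concrete, the sketch below encodes it as a small XML document. The element and attribute names are invented for illustration and are not the AIM schema; the pixel coordinates are placeholders because the slide gives them only as (x,y).

```python
import xml.etree.ElementTree as ET

def build_annotation(image_uid, x, y, coding_scheme, code, label):
    """Build a toy annotation: one point markup tied to one coded anatomic entity.

    All element and attribute names here are hypothetical, not AIM's.
    """
    root = ET.Element("ImageAnnotation")
    ET.SubElement(root, "ImageReference", {"uid": image_uid})
    markup = ET.SubElement(root, "Markup", {"shape": "Point"})
    ET.SubElement(markup, "Coordinate", {"x": str(x), "y": str(y)})
    ET.SubElement(root, "AnatomicEntity",
                  {"codingScheme": coding_scheme, "code": code, "label": label})
    return root

doc = build_annotation("1.2.814.234543.23243", 120, 85,  # 120, 85 are placeholders
                       "SNOMED", "A3310657", "Ascending Thoracic Aorta")
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

Storing annotations as structured documents like this is what lets a data service answer queries over the image reference, the markup geometry, and the coded term instead of over free text.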

35 Modeling and Managing Pathology Image Analysis Results Pathology images contain rich metadata, including both annotations and image markups, made either by humans or computer-generated approaches The metadata can be used for sharing knowledge about images and for comprehensive data analysis No applicable data model that can effectively support the modeling, managing, querying and sharing of such metadata Our goal: develop a generic and effective data model standard for pathology image analysis and characterization * Joint work with Center for Biomedical Imaging & Informatics, The Cancer Institute of New Jersey

36 Challenges Large scale image metadata set: one whole slide image may contain up to a million markups; each markup could contain dozens of annotations, e.g., 7GB metadata in XML for one image. Support of comprehensive queries. Metadata based queries such as: Count nuclei where [ grade ≤ 3 ]. Spatial queries such as: Find density of nuclei where [ 1 ≤ grade ≤ 3 ] in selected ROI; Find where brain tumor nuclei classified by observer O and brain tumor nuclei classified by observer P exhibit spatial overlaps. Semantic queries, e.g., reasoning on spatial relationships and ontology relationships
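
The three example queries can be sketched over an in-memory list of nuclei to show what each one asks for. This is only an illustration: the record layout, the use of bounding boxes for overlap, and the function names are assumptions, and at the scale described on the slide the work would be done by a database with spatial indexing instead.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Nucleus:
    xmin: float
    ymin: float
    xmax: float
    ymax: float
    grade: int
    observer: str

def count_by_grade(nuclei, max_grade):
    """Metadata query: count nuclei where grade <= max_grade."""
    return sum(1 for n in nuclei if n.grade <= max_grade)

def density_in_roi(nuclei, roi, lo, hi):
    """Spatial query: nuclei with lo <= grade <= hi per unit area of the ROI.

    roi is (xmin, ymin, xmax, ymax); a nucleus counts if its center is inside.
    """
    x0, y0, x1, y1 = roi
    inside = 0
    for n in nuclei:
        cx, cy = (n.xmin + n.xmax) / 2, (n.ymin + n.ymax) / 2
        if lo <= n.grade <= hi and x0 <= cx <= x1 and y0 <= cy <= y1:
            inside += 1
    return inside / ((x1 - x0) * (y1 - y0))

def boxes_overlap(a, b):
    return a.xmin < b.xmax and b.xmin < a.xmax and a.ymin < b.ymax and b.ymin < a.ymax

def overlapping_pairs(nuclei, observer_a, observer_b):
    """Spatial query: pairs marked by two observers whose boxes overlap."""
    first = [n for n in nuclei if n.observer == observer_a]
    second = [n for n in nuclei if n.observer == observer_b]
    return [(a, b) for a in first for b in second if boxes_overlap(a, b)]

# Hypothetical markups.
n1 = Nucleus(0, 0, 2, 2, grade=2, observer="O")
n2 = Nucleus(1, 1, 3, 3, grade=5, observer="P")
n3 = Nucleus(10, 10, 12, 12, grade=3, observer="P")
n4 = Nucleus(4, 4, 6, 6, grade=1, observer="O")
nuclei = [n1, n2, n3, n4]
print(count_by_grade(nuclei, 3))
print(density_in_roi(nuclei, (0, 0, 10, 10), 1, 3))
print(overlapping_pairs(nuclei, "O", "P"))
```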

37 Complex Data Complex data objects. Image: resolution, magnification, region, coordinate reference. Markup: geometric shape such as line, polygon, multiline and multipolygon, or image masks. Annotation: observation, inference, machine computed feature or classification, or reference to external annotation. Provenance: derivation history of results. Complex relationships. Multi-level granularities: image level, tile level, cellular or subcellular anatomic entity level. Multiple annotations per image, derived annotations
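
The object types named on this slide can be pictured as a few linked record types. The sketch below is a deliberately small, hypothetical layout written for illustration; the class and field names are assumptions and do not reproduce any published data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

@dataclass
class ImageReference:
    image_id: str
    magnification: float
    resolution_um_per_px: float
    region: Tuple[int, int, int, int]  # x, y, width, height

@dataclass
class Provenance:
    algorithm: str
    parameters: Dict[str, float]
    derived_from: List[str] = field(default_factory=list)  # ids of input results

@dataclass
class Annotation:
    kind: str   # "observation", "inference", or "computed"
    name: str
    value: float

@dataclass
class Markup:
    markup_id: str
    image: ImageReference
    polygon: List[Tuple[float, float]]
    annotations: List[Annotation] = field(default_factory=list)
    provenance: Optional[Provenance] = None

def annotations_of_kind(markup, kind):
    """All annotations on one markup with the given kind."""
    return [a for a in markup.annotations if a.kind == kind]

# Hypothetical example: one segmented nucleus with two annotations.
tile = ImageReference("slide-001", 20.0, 0.5, (0, 0, 4096, 4096))
nucleus = Markup(
    markup_id="m1",
    image=tile,
    polygon=[(10.0, 10.0), (14.0, 10.0), (14.0, 13.0), (10.0, 13.0)],
    annotations=[Annotation("computed", "area", 12.0),
                 Annotation("observation", "grade", 3.0)],
    provenance=Provenance("segmentation-sketch", {"threshold": 0.4}),
)
print(annotations_of_kind(nucleus, "computed"))
```

Keeping provenance on each markup is what allows derived annotations to be traced back to the algorithm and parameters that produced them.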

38 Pathology Analytical Imaging Standards PAIS |pās| : Pathology Analytical Imaging Standards Designed to provide a standardized, semantically enabled data model to support pathology analytical imaging PAIS provides highly generalized data objects, comprehensive data types, and flexible relationships Storage and performance efficiency oriented, alternative implementations Object-oriented design, easily extensible Reuse existing standards Reuse relevant classes already defined in AIM Follow DICOM WG 26 metadata specifications on WSI reference Specimen information in DICOM Supplement 122 and caTissue Use caDSR for CDE and NCI Thesaurus for ontology concepts

39 PAIS The logical model is defined in UML and consists of 62 classes, grouped into the following categories: Image reference information: the reference and metadata of the images. Image target information: who, where, and how the images are generated. Organizational information: who performs the study and annotation and for what purpose. Markup: graphical symbols representing areas on images. Annotation: explanatory or descriptive information on top of markups or images. AnnotationReference: reference to external annotations. Provenance: computational derivation history of objects

40 Thanks to: In silico center team: Dan Brat (Science PI), Tahsin Kurc, Ashish Sharma, Tony Pan, David Gutman, Jun Kong, Sharath Cholleti, Carlos Moreno, Chad Holder, Erwin Van Meir, Daniel Rubin, Tom Mikkelsen, Adam Flanders, Joel Saltz (Director) caGrid Knowledge Center: Joel Saltz, Mike Caliguiri, Steve Langella co-Directors; Tahsin Kurc, Himanshu Rathod Emory leads caBIG In vivo imaging team: Eliot Siegel, Paul Mulhern, Adam Flanders, David Channon, Daniel Rubin, Fred Prior, Larry Tarbox and many others In vivo imaging Emory team: Tony Pan, Ashish Sharma, Joel Saltz Emory ATC Supplement team: Tim Fox, Ashish Sharma, Tony Pan, Edi Schreibmann, Paul Pantalone Digital Pathology R01: Foran and Saltz; Jun Kong, Sharath Cholleti, Fusheng Wang, Tony Pan, Tahsin Kurc, Ashish Sharma, David Gutman (Emory), Wenjin Chen, Vicky Chu, Jun Hu, Lin Yang, David J. Foran (Rutgers)

