Integrative omics analysis Qi Liu Center for Quantitative Sciences Vanderbilt University School of Medicine


Content
– Introduction
– Data sources
– Methods
– Tools
– Things to be aware of

Why?

What? Integrative analysis of at least two different types of omics data
Technologies
– Genomics: WGS, WES
– Transcriptomics: RNA-Seq
– Epigenomics: Bisulfite-Seq, ChIP-Seq
Data analysis
– Genomics: small indels, point mutation, copy number variation, structural variation
– Transcriptomics: differential expression, gene fusion, alternative splicing, RNA editing
– Epigenomics: methylation, histone modification, transcription factor binding
Integration and interpretation
– Functional effect of mutation
– Network and pathway analysis
– Integrative analysis
Patient
– Further understanding of cancer and clinical applications

Objectives
1. Understand relationships between different types of molecular data
2. Understand the phenotype
– Latent: disease subtype
– Observable: patient outcome

Data sources TCGA

Firehose

cBioPortal

ICGC

COSMIC

ENCODE

FANTOM

GTEx

Methods
– Sequential or overlap analysis
– Clustering
– Correlation analysis
– Linear regression
– Network based analysis
– Bayesian
– …

Sequential or overlap analysis
Confirmation or refinement of findings
– Each data type is analyzed independently to produce a list of interesting entities
– The lists of interesting entities are then linked together
Examples:
Chin, K. et al. Genomic and transcriptional aberrations linked to breast cancer pathophysiologies. Cancer Cell 10, 529–541 (2006).
Lando, M. et al. Gene dosage, expression, and ontology analysis identifies driver genes in the carcinogenesis and chemoradioresistance of cervical cancer. PLoS Genet. 5, e (2009).
Beroukhim, R. et al. The landscape of somatic copy-number alteration across human cancers. Nature 463, 899–905 (2010).
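The overlap step described above can be sketched in a few lines. This is a minimal illustration, not the procedure used in the cited studies: the function name `overlap_candidates` and the three gene lists are hypothetical, and the gene symbols are placeholders rather than results.

```python
def overlap_candidates(*entity_lists):
    """Return the entities that appear in every independently derived list."""
    sets = [set(lst) for lst in entity_lists]
    return sorted(set.intersection(*sets)) if sets else []


# Hypothetical per-data-type results (illustrative gene symbols only).
amplified = ["ERBB2", "MYC", "CCND1", "GRB7"]      # from copy number analysis
overexpressed = ["ERBB2", "GRB7", "ESR1", "MYC"]   # from expression analysis
poor_outcome = ["ERBB2", "MYC", "TP53"]            # from survival analysis

# Genes supported by all three analyses are the refined candidates.
candidates = overlap_candidates(amplified, overexpressed, poor_outcome)
print(candidates)
```

Requiring support from every list is the strictest choice; relaxing it to "at least two lists" would be a reasonable variation when individual analyses are underpowered.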

Correlation analysis
Reveals the relationships between different molecular layers
– The strength of association indicates regulation in trans
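As a rough sketch of how the strength of association between two layers might be measured, the code below computes a rank (Spearman) correlation between one miRNA and one mRNA across samples. The choice of Spearman correlation is an assumption, since the slides do not name a specific coefficient, and the expression values are made up for illustration.

```python
from math import sqrt


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical expression of one miRNA and one candidate target mRNA
# measured in the same five samples.
mirna = [1.0, 2.0, 3.0, 4.0, 5.0]
mrna = [9.1, 7.4, 7.9, 5.2, 3.3]
rho = spearman(mirna, mrna)
print(round(rho, 3))
```

A strongly negative value is the pattern expected for a repressive miRNA-target pair; in practice the coefficient would be paired with a significance test and multiple-testing correction.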

miRNA

Integrative method (workflow figure)
Data: GSE10843, GSE10833, microRNA
Functional evidence: significant inverse correlation (p<0.005)
– miRNA-mRNA correlation: mRNA decay
– miRNA-ratio correlation (ratio = protein/mRNA): translational repression
– miRNA-protein correlation: combined effect
Binding evidence: supported by TargetScan, miRanda or MirTarget2
Counts shown in the figure
– 7235 functional relationships
– 79 miRNAs, 5144 genes
– microRNA-target interactions: 580 interactions, 60 miRNAs, 423 genes
Downstream analyses
– The relative contribution of translational repression
– Association of sequence features with estimated mRNA decay or translational repression (sequence features on site efficacy): site type, site location, local AU-context, additional 3' pairing

Features on site efficacy for these two regulation types
Site type
– mRNA decay: 8mer sites are efficient
– Translational repression: 8mer sites do not show significant efficacy
Site location
– mRNA decay: 3' UTR > ORF > 5' UTR
– Translational repression: marginal significance in ORF

Features on site efficacy for these two regulation types
– AU-rich context appears to favor both mRNA decay and translational repression
– 3' pairing enhances mRNA decay but disfavors efficacy for translational repression

miRNA-target interactions
– 60 miRNAs, 423 genes
– 580 interactions, of which 332 (57.2%) were discovered by the integration of proteomics data
Figure: overlap of interactions supported by function (miRNA-mRNA, miRNA-ratio, miRNA-protein correlation) and by sequence (TargetScan, miRanda, MirTarget2)

miR-138 prefers translational repression
SW620 and SW480 (derived from the same patient)
                 SW620        SW480
Source           lymph node   primary
Metastasis       high         poor
miR-138 (log2)

Linear regression
– Estimate the strength of association between different data
– Predict the outcome by modeling the combined effect of multiple types of data

Ridge: L2 penalized
Lasso: L1 penalized
Elastic net: L1 + L2 penalized
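The three penalties can be written as one objective. The notation here is an assumption and follows the parameterization used by the glmnet package: y is the outcome for n patients, X holds the combined features from the different omics data types, λ ≥ 0 sets the penalty strength, and α mixes the L1 and L2 terms.

```latex
\hat{\beta} = \arg\min_{\beta}\; \frac{1}{2n}\,\lVert y - X\beta \rVert_2^2 + \lambda\, P(\beta),
\qquad
P(\beta) =
\begin{cases}
\tfrac{1}{2}\lVert \beta \rVert_2^2 & \text{ridge (L2)} \\[2pt]
\lVert \beta \rVert_1 & \text{lasso (L1)} \\[2pt]
\alpha \lVert \beta \rVert_1 + \tfrac{1-\alpha}{2}\lVert \beta \rVert_2^2,\; 0 < \alpha < 1 & \text{elastic net (L1 + L2)}
\end{cases}
```

The L1 term sets some coefficients exactly to zero, which helps when features greatly outnumber patients. The L2 term shrinks correlated features together instead of picking one arbitrarily.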

Clustering
Unsupervised clustering of omics data to find inherent structures
– Using common latent variables among all data types
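One common way to formalize "common latent variables" is a joint latent variable model of the kind used in integrative clustering (see Shen R, et al. in the References). The symbols below are assumptions introduced for illustration: X_t is the p_t × n matrix for data type t, W_t is a data-type-specific loading matrix, Z is the latent matrix shared by all m data types, and ε_t is noise.

```latex
X_t = W_t Z + \varepsilon_t, \qquad t = 1, \dots, m
```

Because Z is shared, patients are clustered on Z rather than on any single X_t, so the grouping reflects structure supported across data types.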

Network based analysis
– Uses inferred networks or known network interactions to guide the analysis

Illustrative example of SNF (similarity network fusion) steps
The advantage of the integrative procedure is that weak similarities (low-weight edges) disappear, which reduces noise, while strong similarities (high-weight edges) present in one or more networks are added to the others. In addition, low-weight edges supported by all networks are retained, depending on how tightly connected their neighborhoods are across networks.
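The fusion idea can be sketched for two data types as an iterative cross-diffusion between patient similarity networks. This is a simplified illustration and not the SNFtool implementation: it uses plain row normalization, keeps the k strongest neighbours per patient (self included), and all function names and the toy similarity matrices are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def row_normalize(w):
    """Scale each row so that it sums to 1."""
    return [[v / sum(row) for v in row] for row in w]


def knn_kernel(w, k):
    """Keep each patient's k strongest similarities, zero the rest, normalize."""
    out = []
    for row in w:
        keep = set(sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k])
        out.append([v if j in keep else 0.0 for j, v in enumerate(row)])
    return row_normalize(out)


def fuse_two_views(w1, w2, k=2, iterations=10):
    """Diffuse each network through the other's status, then average them."""
    p1, p2 = row_normalize(w1), row_normalize(w2)   # full similarity per view
    s1, s2 = knn_kernel(w1, k), knn_kernel(w2, k)   # local (sparse) similarity
    for _ in range(iterations):
        new_p1 = row_normalize(matmul(matmul(s1, p2), transpose(s1)))
        new_p2 = row_normalize(matmul(matmul(s2, p1), transpose(s2)))
        p1, p2 = new_p1, new_p2
    n = len(p1)
    return [[(p1[i][j] + p2[i][j]) / 2 for j in range(n)] for i in range(n)]


# Toy patient-by-patient similarities for two data types (made-up numbers).
# Both views group patients {0, 1} and {2, 3}; the second view is noisier.
w1 = [[1.0, 0.9, 0.2, 0.1],
      [0.9, 1.0, 0.1, 0.2],
      [0.2, 0.1, 1.0, 0.8],
      [0.1, 0.2, 0.8, 1.0]]
w2 = [[1.0, 0.6, 0.3, 0.2],
      [0.6, 1.0, 0.2, 0.3],
      [0.3, 0.2, 1.0, 0.4],
      [0.2, 0.3, 0.4, 1.0]]

fused = fuse_two_views(w1, w2)
for row in fused:
    print([round(v, 3) for v in row])
```

Restricting diffusion to each patient's nearest neighbours is what lets weak, unsupported edges fade while edges backed by a consistent neighbourhood persist.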

Patient similarities for each data type compared to SNF-fused similarity

Comparison of SNF with iCluster and concatenation

Methods

Extension to more than 2 data types

Tools
– Sequential or overlap analysis
– Clustering: R packages iCluster, iClusterPlus
– Correlation based
– Linear regression: R package glmnet
– Network based: R package SNFtool
– Bayesian
– …

Visualization: Circular map for omics data Chen et al. Cell 2012, 148(6):

Circos plot
– Circos
– RCircos
– OmicCircos

IGV

NetGestalt

Things to be aware of
– The importance
– The challenge in integrative analyses: dimensionality
– Integration attempts are best carried out using known biological knowledge

References
Kristensen VN, et al. Principles and methods of integrative genomic analyses in cancer. Nat Rev Cancer. 2014, 14(5):
Wang B, et al. Similarity network fusion for aggregating data types on a genomic scale. Nat Methods. 2014, 11(3):
Yuan Y, et al. Assessing the clinical utility of cancer genomic and proteomic data across tumor types. Nat Biotechnol Jul;32(7):
Shen R, et al. Integrative clustering of multiple genomic data types using a joint latent variable model with application to breast and lung cancer subtype analysis. Bioinformatics Nov 15;25(22):
Liu Q, et al. Integrative omics analysis reveals the importance and scope of translational repression in microRNA-mediated regulation. Mol Cell Proteomics. 2013, 12(7):
Setty M, et al. Inferring transcriptional and microRNA-mediated regulatory programs in glioblastoma. Mol Syst Biol. 2012; 8:605
Lappalainen T, et al. Transcriptome and genome sequencing uncovers functional variation in humans. Nature 2013, 501, 506–511
Jacobsen A, et al. Analysis of microRNA-target interactions across diverse cancer types. Nat Struct Mol Biol. 2013, 20(11):