Advances and challenges in computational modeling and statistical learning of biological systems Qi Liu Department of Biomedical Informatics Vanderbilt.

Presentation transcript:

Advances and challenges in computational modeling and statistical learning of biological systems Qi Liu Department of Biomedical Informatics Vanderbilt University School of Medicine

BIG Data: Genomics, Transcriptomics, Proteomics, Epigenomics, Clinical Data
Methods: Data mining, Machine learning, Regression models, Exploratory analysis, Statistics
Applications: Disease marker, Prediction model, Classification, Precise treatment, Hypothesis test

NGS Technologies

A decade’s perspective on DNA sequencing technology Elaine R. Mardis, Nature(2011) 470,

Technologies
– Genomics: WGS, WES
– Transcriptomics: RNA-Seq
– Epigenomics: Bisulfite-Seq, ChIP-Seq
Data analysis
– Genomics: point mutation, small indels, copy number variation, structural variation
– Transcriptomics: differential expression, gene fusion, alternative splicing, RNA editing
– Epigenomics: methylation, histone modification, transcription factor binding
Integration and interpretation
– Functional effect of mutation
– Network and pathway analysis
– Integrative analysis
Patient
– Further understanding of cancer and clinical applications
Source: Shyr D, Liu Q. Biol Proced Online. (2013) 15, 4

Objectives
1. Understand relationships between different types of molecular data
2. Understand the phenotype
   – Latent: disease subtype
   – Observable: patient outcome

GTEX

TCGA

Inferring regulation networks
DNA, RNA, Protein (transcription, post-transcription)
– TF: transcriptional regulation network
– miRNA: post-transcription regulation network

Reveal the relationships between different molecular layers
– The strength of association indicates in trans-regulation.

miRNA

Integrative method
Data: GSE10843, GSE10833, microRNA; mRNA, protein, protein/mRNA ratio
Correlations
– miRNA-mRNA correlation: mRNA decay
– miRNA-ratio correlation: translational repression
– miRNA-protein correlation: combined effect
microRNA-target interactions
– Significant inverse correlation (p<0.005)
– 7235 functional relationships
– 79 miRNAs, 5144 genes
– Binding evidence: supported by TargetScan, miRanda or MirTarget2
– 580 interactions, 60 miRNAs, 423 genes
Sequence features on site efficacy
– Association of sequence features with estimated mRNA decay or translation repression
– Site type
– Site location
– Local AU-context
– Additional 3’ pairing
The relative contribution of translation repression
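The correlation step of the integrative method can be illustrated with a minimal sketch. It assumes log2-scaled measurements, so that the protein/mRNA ratio becomes a difference. The function names and the five-sample toy values are hypothetical and not taken from the slides. The p<0.005 significance filter and the binding-evidence filter are not implemented here.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def regulation_scores(mirna, mrna, protein):
    """Correlate a miRNA with mRNA, protein/mRNA ratio, and protein.

    All inputs are assumed to be log2 values over the same samples, so
    the log2 protein/mRNA ratio is simply protein - mrna.
    """
    ratio = [p - m for p, m in zip(protein, mrna)]
    return {
        "mrna_decay": pearson(mirna, mrna),                 # miRNA-mRNA
        "translational_repression": pearson(mirna, ratio),  # miRNA-ratio
        "combined_effect": pearson(mirna, protein),         # miRNA-protein
    }

# Hypothetical toy data: mRNA stays flat while protein drops as the
# miRNA rises, the pattern expected under translational repression.
mirna = [1.0, 2.0, 3.0, 4.0, 5.0]
mrna = [8.0, 8.1, 7.9, 8.0, 8.05]
protein = [9.0, 8.2, 7.1, 6.0, 5.2]
scores = regulation_scores(mirna, mrna, protein)
```

In this toy case the miRNA-ratio correlation is strongly negative while the miRNA-mRNA correlation is near zero, which is the signature the slide associates with translational repression rather than mRNA decay.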

Features on site efficacy for these two regulation types
– mRNA decay: 8mer is efficient
– Translational repression: 8mer sites do not show significant efficacy
– mRNA decay: 3’UTR > ORF > 5’UTR
– Translational repression: marginal significance in ORF

Features on site efficacy for these two regulation types
– AU-rich context appears to favor both mRNA decay and translational repression
– 3’ pairing enhances mRNA decay, but disfavors efficacy for translational repression

miR-138 prefers translational repression
SW620 and SW480 (derived from the same patient)

                  SW620        SW480
Source            lymph node   primary
Metastasis        high         poor
miR-138 (log2)

Figure panels A-D: enriched gene sets, each labeled UP or DOWN in the figure
– GPROTEIN_COUPLED_RECEPTOR_SIGNALING (FDR=0.005)
– GOLGI_VESICLE_TRANSPORT (FDR=0.07)
– KEGG_AMINOACYL_TRNA_BIOSYNTHESIS (FDR=0.03)
– CYTOKINE_METABOLIC_PROCESS (FDR=0.09)
– FEEDING_BEHAVIOR (FDR=0.005)
– KEGG_PROTEASOME (FDR=0.03)
– KEGG_PRIMARY_IMMUNODEFICIENCY (FDR=0.002)
– KEGG_CELL_ADHESION_MOLECULES_CAMS (FDR=0.003)
– T_CELL_ACTIVATION (FDR=0.002)
– KEGG_ALLOGRAFT_REJECTION (FDR=0.005)

Stage-dependent alterations
Data: mRNA, CNV, Methylation
Samples: 123 Stage I, 55 Stage IV
Methods: Limma, Correlation, Regression Model
Effects: TF-target, CNV effect, Methylation effect
Result: Stage-dependent TF activity changes
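One way to picture how a regression model separates a CNV effect from a stage-dependent change is sketched below. This is a deliberately simplified illustration and not the model used in the talk: it fits a single-predictor least-squares line of expression on copy number, then compares the residual means of Stage IV and Stage I samples. The function names and the eight toy samples are hypothetical, and methylation and TF-target terms are left out.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

def stage_shift_after_cnv(expr, cnv, stage):
    """Return (CNV slope, Stage IV minus Stage I mean residual)."""
    slope, intercept = fit_line(cnv, expr)
    # Residuals: expression not explained by copy number.
    resid = [e - (intercept + slope * c) for e, c in zip(expr, cnv)]
    early = [r for r, s in zip(resid, stage) if s == "I"]
    late = [r for r, s in zip(resid, stage) if s == "IV"]
    return slope, sum(late) / len(late) - sum(early) / len(early)

# Hypothetical toy data: expression = copy number + 0.5 for Stage IV.
cnv = [2, 2, 3, 3, 2, 2, 3, 3]
stage = ["I", "I", "I", "I", "IV", "IV", "IV", "IV"]
expr = [2.0, 2.0, 3.0, 3.0, 2.5, 2.5, 3.5, 3.5]
cnv_slope, stage_shift = stage_shift_after_cnv(expr, cnv, stage)
```

Because copy number is balanced across the two stages in the toy data, the fitted slope recovers the CNV effect and the residual difference recovers the stage shift. With real data, a joint model including all covariates would be the more appropriate choice.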

Stage-dependent TF activity changes (panels A-D)

Regulator                 Target regulation   Effect size   FDR
GATA6                     Up                                e-13
NFIL3                     Down                              e-08
SREBF2                    Up                                e-08
SREBF1                    Down                              e-07
TBP                       Up                                e-07
HLF                       Up                                e-07
TCF12                     Up                                e-06
GATA1                     Down                              e-05
FOSB                      Up                                e-05
RARA/RARB/RARG/RXRB       Up                                e-05
REST                      Up                                e-05
FOXF2                     Down                              e-04
FOXC1                     Up                                e-04
HMGA1                     Up                                e-04
E2F7                      Up                                e-04
NKX2-1                    Up                                e-04

Challenges
– Complex structure, but limited sample size
– Cooperative regulation
– Incorporate prior knowledge
– Nonlinear effect
– Long range chromatin interaction
– Data heterogeneity
– Complexity and model sparsity

Individual omics analysis

Integrative omics analysis

Illustrative example of SNF steps
The advantage of the integrative procedure is that weak similarities (low-weight edges) disappear, helping to reduce the noise, and strong similarities (high-weight edges) present in one or more networks are added to the others. Additionally, low-weight edges supported by all networks are retained depending on how tightly connected their neighborhoods are across networks.
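The fusion idea described above can be sketched for two patient similarity networks. This is a simplified illustration under stated assumptions, not the reference SNF implementation: it uses plain row normalization instead of the published normalization, keeps the k largest entries per row as the local neighborhood, and runs a fixed number of cross-diffusion iterations. All function names and the 4-patient matrices are hypothetical.

```python
def row_normalize(w):
    """Scale each row so that it sums to 1."""
    return [[v / sum(row) for v in row] for row in w]

def knn_kernel(w, k):
    """Keep the k largest entries of each row, then row-normalize."""
    out = []
    for row in w:
        order = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        keep = set(order[:k])
        total = sum(row[j] for j in keep)
        out.append([row[j] / total if j in keep else 0.0
                    for j in range(len(row))])
    return out

def matmul(a, b):
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def fuse(w1, w2, k=2, iterations=10):
    """Cross-diffuse two similarity matrices and average the result."""
    p1, p2 = row_normalize(w1), row_normalize(w2)
    s1, s2 = knn_kernel(w1, k), knn_kernel(w2, k)
    for _ in range(iterations):
        # Each network is updated through its own local kernel using
        # the other network's current global similarities.
        new_p1 = row_normalize(matmul(matmul(s1, p2), transpose(s1)))
        new_p2 = row_normalize(matmul(matmul(s2, p1), transpose(s2)))
        p1, p2 = new_p1, new_p2
    n = len(p1)
    return [[(p1[i][j] + p2[i][j]) / 2 for j in range(n)] for i in range(n)]

# Hypothetical similarities for 4 patients in two groups, {0, 1} and {2, 3}.
w1 = [[1.0, 0.9, 0.1, 0.1], [0.9, 1.0, 0.1, 0.1],
      [0.1, 0.1, 1.0, 0.9], [0.1, 0.1, 0.9, 1.0]]
w2 = [[1.0, 0.6, 0.3, 0.2], [0.6, 1.0, 0.2, 0.3],
      [0.3, 0.2, 1.0, 0.5], [0.2, 0.3, 0.5, 1.0]]
fused = fuse(w1, w2)
```

In the toy example the second network is noisier than the first, yet the fused matrix still gives patients 0 and 1 a higher mutual similarity than patients 0 and 2, because the within-group edges are supported by both networks.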

Methods
– Extension to more than 2 data types
– Inspired by the theoretical multiview learning framework developed for computer vision and image processing applications

Patient similarities for each data type compared to SNF fused similarity

Comparison of SNF with iCluster and concatenation

Challenges
– Systems-level probabilistic modeling of multiple data types
– Correlated data
– Missing values
– Dependence among genes

Thank you very much for your attention!