Published by Shawn Lang. Modified over 8 years ago.
1
Advances and challenges in computational modeling and statistical learning of biological systems
Qi Liu
Department of Biomedical Informatics
Vanderbilt University School of Medicine
qi.liu@vanderbilt.edu
2
BIG Data: Genomics, Transcriptomics, Proteomics, Epigenomics, Clinical Data
Methods: Data mining, Machine learning, Regression models, Exploratory analysis, Statistics, Hypothesis test
Applications: Disease marker, Prediction model, Classification, Precise treatment
http://jdr.sagepub.com/content/90/5/561
3
NGS Technologies http://www.slideshare.net/mkim8/a-comparison-of-ngs-platforms
4
A decade’s perspective on DNA sequencing technology
Elaine R. Mardis, Nature (2011) 470, 198-203
7
Technologies
– Genomics: WGS, WES
– Transcriptomics: RNA-Seq
– Epigenomics: Bisulfite-Seq, ChIP-Seq
Data Analysis
– Genomics: Small indels, point mutation, Copy number variation, Structural variation
– Transcriptomics: Differential expression, Gene fusion, Alternative splicing, RNA editing
– Epigenomics: Methylation, Histone modification, Transcription Factor binding
Integration and interpretation
– Functional effect of mutation
– Network and pathway analysis
– Integrative analysis
Patient
– Further understanding of cancer and clinical applications
Shyr D, Liu Q. Biol Proced Online. (2013) 15, 4
8
Objectives
1. Understand relationships between different types of molecular data
2. Understand the phenotype
– Latent: disease subtype
– Observable: patient outcome
9
GTEX http://www.gtexportal.org/home/
10
TCGA https://tcga-data.nci.nih.gov/tcga/ http://www.nature.com/ng/journal/v45/n10/full/ng.2764.html
11
Inferring regulation networks
DNA → RNA (transcription) → Protein (post-transcription)
– TF: Transcriptional regulation network
– miRNA: Post-transcription regulation network
12
Reveal the relationships between different molecular layers
– The strength of association indicates trans-regulation.
13
miRNA
14
Integrative method
Data: GSE10843, GSE10833 (microRNA, mRNA, protein, protein/mRNA ratio)
Correlations
– miRNA-mRNA correlation: mRNA decay
– miRNA-ratio correlation: Translational repression
– miRNA-protein correlation: Combined effect
Selection of microRNA-target interactions
– Significant inverse correlation (p<0.005)
– Binding evidence: supported by TargetScan, miRanda or MirTarget2
Counts reported on the slide
– 7235 functional relationships
– 580 interactions, 60 miRNAs, 423 genes
– 79 miRNAs, 5144 genes
Sequence features on site efficacy (association of sequence features with estimated mRNA decay or translation repression)
– Site type
– Site location
– Local AU-context
– Additional 3’ pairing
Also estimated: the relative contribution of translation repression
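The correlation step of this workflow can be sketched in a few lines. This is only an illustration under stated assumptions: the slide gives the inverse-correlation criterion (p<0.005) but not the correlation statistic or the test, so Pearson correlation with a one-sided permutation test is an assumed choice, the data are synthetic, and the function names `pearson_r` and `inverse_correlation_test` are hypothetical.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation between two equal-length vectors."""
    xc = np.asarray(x, dtype=float) - np.mean(x)
    yc = np.asarray(y, dtype=float) - np.mean(y)
    return float(xc @ yc / np.sqrt((xc @ xc) * (yc @ yc)))

def inverse_correlation_test(mirna, target, n_perm=2000, seed=0):
    """Return (r, one-sided permutation p-value) for a negative correlation."""
    rng = np.random.default_rng(seed)
    r_obs = pearson_r(mirna, target)
    # Count permutations whose correlation is at least as negative as observed.
    hits = sum(pearson_r(rng.permutation(mirna), target) <= r_obs
               for _ in range(n_perm))
    return r_obs, (hits + 1) / (n_perm + 1)

# Synthetic log-scale profiles across 30 hypothetical samples (assumption).
rng = np.random.default_rng(1)
mirna = rng.normal(size=30)
mrna = -0.2 * mirna + rng.normal(scale=0.3, size=30)    # weak mRNA decay
ratio = -0.8 * mirna + rng.normal(scale=0.3, size=30)   # log protein/mRNA ratio
protein = mrna + ratio                                  # log protein = log mRNA + log ratio

profiles = {"mRNA": mrna, "ratio": ratio, "protein": protein}
results = {name: inverse_correlation_test(mirna, values)
           for name, values in profiles.items()}

for name, (r, p) in results.items():
    call = "inverse correlation" if (r < 0 and p < 0.005) else "not called"
    print(f"miRNA-{name}: r={r:.2f}, p={p:.4f} -> {call}")
```

In this toy setup the miRNA acts mostly through the protein/mRNA ratio, which mirrors the slide's reading of a miRNA-ratio correlation as translational repression and a miRNA-mRNA correlation as mRNA decay.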
15
Features on site efficacy for these two regulation types
Site type
– mRNA decay: 8mer sites are efficient
– Translational repression: 8mer sites do not show significant efficacy
Site location
– mRNA decay: 3’UTR > ORF > 5’UTR
– Translational repression: marginal significance in ORF
16
Features on site efficacy for these two regulation types
– AU-rich context appears to favor both mRNA decay and translational repression
– 3’ pairing enhances mRNA decay but reduces efficacy for translational repression
17
miR-138 prefers translational repression
SW620 and SW480 (derived from the same patient)

                  SW620         SW480
source            lymph node    primary
metastasis        high          poor
miR-138 (log2)    3.06          6.39
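The two log2 values can be turned into a fold difference with a short calculation. This assumes both miR-138 measurements are on the same log2 scale, as the slide's "log 2" label suggests; the variable names are illustrative.

```python
# miR-138 levels (log2) as given on the slide.
log2_sw620 = 3.06   # lymph node derived, high metastasis
log2_sw480 = 6.39   # primary tumor derived, poor metastasis

# A difference on the log2 scale corresponds to a ratio on the linear scale.
log2_difference = log2_sw480 - log2_sw620
fold_difference = 2 ** log2_difference

print(f"log2 difference: {log2_difference:.2f}")
print(f"SW480 / SW620 fold difference: {fold_difference:.1f}")
```

Under that assumption, miR-138 is roughly ten times more abundant in SW480 than in SW620.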
24
Enriched gene sets (figure panels A-D, labeled UP or DOWN)
– GPROTEIN_COUPLED_RECEPTOR_SIGNALING (FDR=0.005)
– GOLGI_VESICLE_TRANSPORT (FDR=0.07)
– KEGG_AMINOACYL_TRNA_BIOSYNTHESIS (FDR=0.03)
– CYTOKINE_METABOLIC_PROCESS (FDR=0.09)
– FEEDING_BEHAVIOR (FDR=0.005)
– KEGG_PROTEASOME (FDR=0.03)
– DOWN (FDR=0.00001)
– KEGG_PRIMARY_IMMUNODEFICIENCY (FDR=0.002)
– KEGG_CELL_ADHESION_MOLECULES_CAMS (FDR=0.003)
– T_CELL_ACTIVATION (FDR=0.002)
– KEGG_ALLOGRAFT_REJECTION (FDR=0.005)
25
Samples: 123 Stage I, 55 Stage IV
Data: mRNA, CNV, Methylation
Methods: Limma, Correlation, Regression Model
Components: Stage-dependent alterations, TF-target, CNV effect, Methylation effect
Result: Stage-dependent TF activity changes
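One way to read the regression step is as a per-gene linear model in which the stage-related change in expression is explained by a CNV effect, a methylation effect, and membership in a transcription factor's target set. The slide does not give the model's form, so everything below is an assumed illustration: the variable names, the simulated data and the use of ordinary least squares via `numpy.linalg.lstsq` are not taken from the presentation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes = 500

# Hypothetical per-gene covariates (Stage IV minus Stage I differences).
delta_cnv = rng.normal(size=n_genes)                       # copy number change
delta_meth = rng.normal(size=n_genes)                      # methylation change
is_tf_target = (rng.random(n_genes) < 0.2).astype(float)   # TF-target indicator

# Simulated expression change with known effects, for illustration only.
delta_expr = (0.5 * delta_cnv - 0.3 * delta_meth + 0.3 * is_tf_target
              + rng.normal(scale=0.2, size=n_genes))

# Design matrix: intercept, CNV effect, methylation effect, TF activity change.
X = np.column_stack([np.ones(n_genes), delta_cnv, delta_meth, is_tf_target])
coef, *_ = np.linalg.lstsq(X, delta_expr, rcond=None)
intercept, b_cnv, b_meth, b_tf = coef

print(f"CNV effect: {b_cnv:.2f}")
print(f"Methylation effect: {b_meth:.2f}")
print(f"TF activity change (target-set coefficient): {b_tf:.2f}")
```

In this reading, the coefficient on the target indicator plays the role of a stage-dependent TF activity change after accounting for CNV and methylation.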
26
Stage-dependent TF activity changes

Regulator              Target regulation   Effect size   FDR
GATA6                  Up                   0.14         1.2e-13
NFIL3                  Down                -0.12         1.0e-08
SREBF2                 Up                   0.12         7.3e-08
SREBF1                 Down                -0.08         1.0e-07
TBP                    Up                   0.05         1.4e-07
HLF                    Up                   0.11         7.5e-07
TCF12                  Up                   0.10         3.1e-06
GATA1                  Down                -0.07         1.6e-05
FOSB                   Up                   0.10         1.7e-05
RARA/RARB/RARG/RXRB    Up                   0.21         6.5e-05
REST                   Up                   0.14         9.2e-05
FOXF2                  Down                -0.05         1.3e-04
FOXC1                  Up                   0.09         1.7e-04
HMGA1                  Up                   0.09         1.9e-04
E2F7                   Up                   0.12         3.6e-04
NKX2-1                 Up                   0.06         8.2e-04
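FDR values like those in this table come from a multiple-testing adjustment of per-regulator p-values. The slide does not say which procedure was used; the sketch below assumes the Benjamini-Hochberg adjustment, a common choice, and applies it to made-up p-values. The function name `benjamini_hochberg` is illustrative.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted

# Made-up raw p-values for four hypothetical regulators.
raw = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw)
for p, q in zip(raw, fdr):
    print(f"p={p:.3f} -> FDR={q:.3f}")
```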
28
Challenges
– Complex structure, but limited sample size
– Cooperative regulation
– Incorporate prior knowledge
– Nonlinear effect
– Long range chromatin interaction
– Data heterogeneity
– Complexity and model sparsity
29
Individual omics analysis
30
Integrative omics analysis
31
Illustrative example of SNF (similarity network fusion) steps
The advantage of the integrative procedure is that weak similarities (low-weight edges) disappear, helping to reduce the noise, and strong similarities (high-weight edges) present in one or more networks are added to the others. Additionally, low-weight edges supported by all networks are retained depending on how tightly connected their neighborhoods are across networks.
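The fusion behavior described here (weak edges fading, strong edges spreading across networks) can be seen in a small sketch of the cross-diffusion idea. This is a simplified illustration, not the published implementation: it assumes a fixed-bandwidth Gaussian kernel, skips per-iteration renormalization, and uses synthetic data; the function names and the parameters `sigma`, `k` and `n_iter` are hypothetical.

```python
import numpy as np

def affinity(X, sigma=1.0):
    """Gaussian similarity between patients (rows of X)."""
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2 * sigma ** 2))

def full_kernel(W):
    """Global similarity: off-diagonal rows sum to 1/2, diagonal is 1/2."""
    W = W.copy()
    np.fill_diagonal(W, 0.0)
    P = W / (2 * W.sum(axis=1, keepdims=True))
    np.fill_diagonal(P, 0.5)
    return P

def local_kernel(W, k):
    """Local similarity: keep each patient's k nearest neighbors, row-normalize."""
    W = W.copy()
    np.fill_diagonal(W, 0.0)
    nearest = np.argsort(-W, axis=1)[:, :k]
    rows = np.arange(W.shape[0])[:, None]
    S = np.zeros_like(W)
    S[rows, nearest] = W[rows, nearest]
    return S / S.sum(axis=1, keepdims=True)

def fuse(views, k=5, n_iter=10):
    """Diffuse each view's network through the average of the other views."""
    Ws = [affinity(X) for X in views]
    Ps = [full_kernel(W) for W in Ws]
    Ss = [local_kernel(W, k) for W in Ws]
    for _ in range(n_iter):
        Ps = [S @ np.mean([P for u, P in enumerate(Ps) if u != v], axis=0) @ S.T
              for v, S in enumerate(Ss)]
    fused = np.mean(Ps, axis=0)
    return (fused + fused.T) / 2   # symmetrize

# Two synthetic data types for 20 patients in two subtypes (assumption).
rng = np.random.default_rng(0)
labels = np.array([0] * 10 + [1] * 10)
view1 = 3.0 * labels[:, None] + rng.normal(scale=0.5, size=(20, 5))
view2 = 3.0 * labels[:, None] + rng.normal(scale=0.8, size=(20, 4))

fused = fuse([view1, view2])
same = labels[:, None] == labels[None, :]
off_diagonal = ~np.eye(20, dtype=bool)
print("mean within-subtype similarity:", fused[same & off_diagonal].mean())
print("mean between-subtype similarity:", fused[~same].mean())
```

Because each update passes one network through the local neighborhoods of another, edges that are weak and unsupported shrink toward zero while edges backed by strong neighborhoods persist. The same loop works unchanged for three or more views.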
32
Methods
– Extension to more than 2 data types
– Inspired by the theoretical multiview learning framework developed for computer vision and image processing applications.
33
Patient similarities for each data type compared to SNF fused similarity
34
Comparison of SNF with iCluster and concatenation
36
Challenges
– Systems-level probabilistic modeling of multiple data types
– Correlated data
– Missing values
– Dependence among genes
37
Thank you very much for your attention!