
1 Advances and challenges in computational modeling and statistical learning of biological systems. Qi Liu, Department of Biomedical Informatics, Vanderbilt University School of Medicine, qi.liu@vanderbilt.edu

2 BIG Data: Genomics, Transcriptomics, Proteomics, Epigenomics, Clinical Data
Methods: Data mining, Machine learning, Regression models, Exploratory analysis, Statistics, Hypothesis test
Applications: Disease marker, Prediction model, Classification, Precise treatment
http://jdr.sagepub.com/content/90/5/561

3 NGS Technologies http://www.slideshare.net/mkim8/a-comparison-of-ngs-platforms

4 A decade’s perspective on DNA sequencing technology. Elaine R. Mardis, Nature (2011) 470, 198-203

7 Technologies: Genomics (WGS, WES); Transcriptomics (RNA-Seq); Epigenomics (Bisulfite-Seq, ChIP-Seq)
Data analysis: small indels, point mutation, copy number variation, structural variation; differential expression, gene fusion, alternative splicing, RNA editing; methylation, histone modification, transcription factor binding
Integration and interpretation: functional effect of mutation, network and pathway analysis, integrative analysis
Patient: further understanding of cancer and clinical applications
Shyr D, Liu Q. Biol Proced Online. (2013) 15, 4

8 Objectives 1. Understand relationships between different types of molecular data 2. Understand the phenotype – Latent: disease subtype – Observable: patient outcome

9 GTEX http://www.gtexportal.org/home/

10 TCGA https://tcga-data.nci.nih.gov/tcga/ http://www.nature.com/ng/journal/v45/n10/full/ng.2764.html

11 Inferring regulation networks. DNA to RNA (transcription): TF, transcriptional regulation network. RNA to protein (post-transcription): miRNA, post-transcriptional regulation network.

12 Reveal the relationships between different molecular layers – The strength of association indicates regulation in trans.

13 miRNA

14 Integrative method
Data: GSE10843, GSE10833; microRNA, mRNA, protein, protein/mRNA ratio
miRNA-mRNA correlation: mRNA decay
miRNA-ratio correlation: translational repression
miRNA-protein correlation: combined effect
microRNA-target interactions: significant inverse correlation (p<0.005); binding evidence supported by TargetScan, miRanda or MirTarget2
7235 functional relationships; 580 interactions; 60 miRNAs; 423 genes; 79 miRNAs; 5144 genes
Sequence features on site efficacy: association of sequence features (site type, site location, local AU-context, additional 3’ pairing) with estimated mRNA decay or translational repression
The relative contribution of translational repression
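The three correlations named on this slide can be sketched in a few lines. This is only an illustration under stated assumptions, not the analysis code behind the slide: the function names, the use of Pearson correlation, and the toy profiles are all hypothetical, values are assumed to be log-scaled so that the protein/mRNA ratio becomes a difference, and no p-value (such as the p<0.005 cutoff) is computed.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equally long numeric profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / sqrt(sxx * syy)

def regulation_correlations(mirna, mrna, protein):
    """Correlate one miRNA with one gene's mRNA, protein/mRNA ratio, and protein.

    All inputs are assumed to be log-scaled, so the log ratio is protein - mRNA.
    """
    ratio = [p - m for p, m in zip(protein, mrna)]
    return {
        "mRNA_decay": pearson(mirna, mrna),                  # miRNA-mRNA correlation
        "translational_repression": pearson(mirna, ratio),  # miRNA-ratio correlation
        "combined_effect": pearson(mirna, protein),         # miRNA-protein correlation
    }

# Hypothetical profiles across five cell lines (toy numbers, not real data):
# mRNA stays flat while protein falls as the miRNA rises.
mirna = [1.0, 2.0, 3.0, 4.0, 5.0]
mrna = [5.1, 4.9, 5.0, 4.9, 5.1]
protein = [8.0, 7.0, 6.0, 5.0, 4.0]
result = regulation_correlations(mirna, mrna, protein)
```

In this toy case the miRNA-mRNA correlation is close to zero while the miRNA-ratio and miRNA-protein correlations are strongly negative, which is the pattern the slide associates with translational repression rather than mRNA decay.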

15 Features on site efficacy for these two regulation types. Site type – mRNA decay: 8mer is efficient; translational repression: 8mer sites do not show significant efficacy. Site location – mRNA decay: 3’UTR > ORF > 5’UTR; translational repression: marginal significance in ORF.

16 Features on site efficacy for these two regulation types. AU-rich context appears to favor both mRNA decay and translational repression. 3’ pairing enhances mRNA decay but disfavors efficacy for translational repression.

17 miR-138 prefers translational repression. SW620 and SW480 (derived from the same patient):
                 SW620        SW480
source           lymph node   primary
metastasis       high         poor
miR-138 (log2)   3.06         6.39

24 Gene sets with FDR (panels A-D; UP/DOWN): GPROTEIN_COUPLED_RECEPTOR_SIGNALING (FDR=0.005); GOLGI_VESICLE_TRANSPORT (FDR=0.07); KEGG_AMINOACYL_TRNA_BIOSYNTHESIS (FDR=0.03); CYTOKINE_METABOLIC_PROCESS (FDR=0.09); FEEDING_BEHAVIOR (FDR=0.005); KEGG_PROTEASOME (FDR=0.03); DOWN (FDR=0.00001); KEGG_PRIMARY_IMMUNODEFICIENCY (FDR=0.002); KEGG_CELL_ADHESION_MOLECULES_CAMS (FDR=0.003); T_CELL_ACTIVATION (FDR=0.002); KEGG_ALLOGRAFT_REJECTION (FDR=0.005)

25 Data: mRNA, CNV, Methylation; 123 Stage I, 55 Stage IV
Limma: stage-dependent alterations
Correlation: TF-target, CNV effect, methylation effect
Regression model: stage-dependent TF activity changes
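One way to picture a regression model that separates a CNV effect, a methylation effect, and a TF activity change is an ordinary least-squares fit across genes. The sketch below is an assumption made for illustration, not the model used in the talk: the model form (expression change = intercept + CNV change + methylation change + TF-target indicator), the helper names, and the toy numbers are all hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        if abs(M[c][c]) < 1e-12:
            raise ValueError("design matrix is not full rank")
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_ols(X, y):
    """Least squares with an intercept, via the normal equations."""
    Xi = [[1.0] + list(row) for row in X]
    p = len(Xi[0])
    XtX = [[sum(r[i] * r[j] for r in Xi) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(Xi, y)) for i in range(p)]
    return solve(XtX, Xty)

# Hypothetical per-gene inputs (toy numbers): change in copy number, change in
# methylation, and whether the gene is a target of the TF of interest.
delta_cnv = [0.5, -0.2, 0.0, 0.3, -0.4, 0.1]
delta_meth = [0.1, 0.3, -0.2, 0.0, 0.2, -0.1]
is_target = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]

# Simulate noise-free expression changes from known coefficients:
# intercept, CNV effect, methylation effect, TF activity change.
true_coefs = [0.05, 0.8, -0.5, 0.14]
X = [list(row) for row in zip(delta_cnv, delta_meth, is_target)]
y = [true_coefs[0] + sum(c * v for c, v in zip(true_coefs[1:], row)) for row in X]

estimated = fit_ols(X, y)
tf_activity_change = estimated[3]  # coefficient on the TF-target indicator
```

Because the toy data contain no noise, the fit recovers the simulated coefficients; with real data the coefficient on the target indicator would need a significance test and multiple-testing correction before being read as an activity change.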

26 Stage-dependent TF activity changes
Regulator             Target regulation   Effect size   FDR
GATA6                 Up                  0.14          1.2e-13
NFIL3                 Down                -0.12         1.0e-08
SREBF2                Up                  0.12          7.3e-08
SREBF1                Down                -0.08         1.0e-07
TBP                   Up                  0.05          1.4e-07
HLF                   Up                  0.11          7.5e-07
TCF12                 Up                  0.10          3.1e-06
GATA1                 Down                -0.07         1.6e-05
FOSB                  Up                  0.10          1.7e-05
RARA/RARB/RARG/RXRB   Up                  0.21          6.5e-05
REST                  Up                  0.14          9.2e-05
FOXF2                 Down                -0.05         1.3e-04
FOXC1                 Up                  0.09          1.7e-04
HMGA1                 Up                  0.09          1.9e-04
E2F7                  Up                  0.12          3.6e-04
NKX2-1                Up                  0.06          8.2e-04

28 Challenges: complex structure, but limited sample size; cooperative regulation; incorporating prior knowledge; nonlinear effects; long-range chromatin interaction; data heterogeneity; complexity and model sparsity

29 Individual omics analysis

30 Integrative omics analysis

31 Illustrative example of SNF steps. The advantage of the integrative procedure is that weak similarities (low-weight edges) disappear, helping to reduce the noise, and strong similarities (high-weight edges) present in one or more networks are added to the others. Additionally, low-weight edges supported by all networks are retained depending on how tightly connected their neighborhoods are across networks.
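The fusion idea can be sketched as an iterative cross-diffusion between patient similarity matrices. The code below is a simplified illustration, not the published SNF implementation: it uses plain row normalization and a k-nearest-neighbour kernel that counts each patient as its own neighbour, and the function names and toy matrices are hypothetical. Each view is updated from the average of the other views, so the same function covers two or more data types.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def row_normalize(W):
    return [[v / sum(row) for v in row] for row in W]

def knn_kernel(W, k):
    """Keep the k largest similarities in each row, then row-normalize."""
    S = []
    for row in W:
        keep = set(sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k])
        S.append([row[j] if j in keep else 0.0 for j in range(len(row))])
    return row_normalize(S)

def fuse(similarities, k=2, iterations=10):
    """Fuse two or more patient-by-patient similarity matrices (simplified)."""
    m, n = len(similarities), len(similarities[0])
    if m < 2:
        raise ValueError("need at least two data types to fuse")
    P = [row_normalize(W) for W in similarities]   # global similarity per view
    S = [knn_kernel(W, k) for W in similarities]   # local similarity per view
    for _ in range(iterations):
        updated = []
        for v in range(m):
            others = [P[u] for u in range(m) if u != v]
            mean_other = [[sum(Q[i][j] for Q in others) / len(others)
                           for j in range(n)] for i in range(n)]
            # Diffuse the other views' similarities through this view's local structure.
            step = matmul(matmul(S[v], mean_other), transpose(S[v]))
            updated.append(row_normalize(step))
        P = updated
    mean_P = [[sum(Q[i][j] for Q in P) / m for j in range(n)] for i in range(n)]
    return [[(mean_P[i][j] + mean_P[j][i]) / 2 for j in range(n)] for i in range(n)]

# Toy similarities for four patients in two groups, {0, 1} and {2, 3}:
# the first data type separates the groups clearly, the second is noisier.
W1 = [[1.0, 0.9, 0.1, 0.1], [0.9, 1.0, 0.1, 0.1],
      [0.1, 0.1, 1.0, 0.9], [0.1, 0.1, 0.9, 1.0]]
W2 = [[1.0, 0.6, 0.3, 0.2], [0.6, 1.0, 0.2, 0.3],
      [0.3, 0.2, 1.0, 0.6], [0.2, 0.3, 0.6, 1.0]]
fused = fuse([W1, W2], k=2, iterations=10)
```

In the toy example the fused matrix keeps within-group similarities above between-group ones; the kernel size `k` and the number of iterations are tuning choices picked here only for illustration.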

32 Methods. Extension to more than 2 data types. Inspired by the theoretical multiview learning framework developed for computer vision and image processing applications.

33 Patient similarities for each data type compared to SNF fused similarity

34 Comparison of SNF with iCluster and concatenation

36 Challenges: systems-level probabilistic modeling of multiple data types; correlated data; missing values; dependence among genes

37 Thank you very much for your attention!

