Review of major computational approaches to facilitate biological interpretation of high-throughput microarray and RNA-Seq experiments


2 Outline
- Review of major computational approaches to facilitate biological interpretation of high-throughput microarray and RNA-Seq experiments.

3 Workflow diagram:
- Input: microarray / RNA-Seq data
- DEG (Differentially Expressed Genes) analysis; co-expression analysis / clustering
- Gene set-wise differential expression analysis
- Differential co-expression analysis
- Result: gene of interest, gene list, gene pair or gene list pair
- FAA (Functional Annotation Analysis): Gene Ontology (GO) or pathway analysis
- Result: gene list with annotations
- Visualization, semantic assembling and knowledge learning: concept lattice analysis (BioLattice)

4 Glossary
- FAA: Functional Annotation Analysis
- GO: Gene Ontology
- Pathway
- DEG: Differentially Expressed Genes
- GSEA: Gene Set Enrichment Analysis
- Biological Interpretation and Biological Semantics
- Concept lattice analysis

5 Pathway and Ontology-Based Analysis
- GO and biological pathway-based analysis:
  - one of the most powerful methods for inferring the biological meaning of expression changes
- The input is a list of genes obtained by:
  - differential expression analysis
  - co-expression analysis (or clustering)

6 Pathway and Ontology-Based Analysis


8 Pathway and Ontology-Based Analysis
- Attributes that can be used for FAA:
  - transcription factor binding
  - clinical phenotypes such as disease associations
  - MeSH (Medical Subject Headings) terms
  - microRNA binding sites
  - protein family memberships
  - chromosomal bands, etc.
  - GO terms
  - biological pathways

9 Pathway and Ontology-Based Analysis
- Features may have their own ontological structures
- GO is structured as a DAG (Directed Acyclic Graph)
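
A minimal sketch of what the DAG structure means in practice: a term may have more than one parent, so collecting all of its ancestors requires following every parent path. The term names and the `PARENTS` mapping are hypothetical toy data, not real GO identifiers.

```python
from collections import deque

# Hypothetical toy DAG: each term maps to its direct parents.
PARENTS = {
    "biological_process": [],
    "cell death": ["biological_process"],
    "signaling": ["biological_process"],
    "programmed cell death": ["cell death"],
    "apoptotic signaling": ["programmed cell death", "signaling"],  # two parents
}

def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links from `term`."""
    seen = set()
    queue = deque(parents[term])
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(parents[current])
    return seen

print(sorted(ancestors("apoptotic signaling")))
```

Because a gene annotated with a term is implicitly annotated with all of its ancestors, enrichment results for related terms are not independent of one another.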

10 Pathway and Ontology-Based Analysis
- DEGs (Differentially Expressed Genes)


12 Pathway and Ontology-Based Analysis
- DEGs:
  - Three techniques that help obtain DEGs:
    - t-test
    - Wilcoxon's rank sum test
    - ANOVA
  - Note: because one test is run per gene, the multiple-hypothesis-testing problem must be properly managed
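
One common way to manage the multiple-testing problem is the Benjamini-Hochberg false discovery rate adjustment. The slide does not name a specific correction, so this choice is an assumption, and the p-values below are made-up inputs standing in for per-gene test results.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-gene p-values (e.g. from a t-test on each gene).
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

Genes whose adjusted p-value falls below the chosen FDR threshold would then be reported as DEGs.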

13 Pathway and Ontology-Based Analysis
- Co-expression analysis

14 Pathway and Ontology-Based Analysis
- Co-expression analysis:
  - groups similar expression profiles together and keeps dissimilar ones apart
  - returns genes that are assumed to be co-regulated
- Clustering algorithms:
  - hierarchical-tree clustering
  - partitional clustering
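
A minimal sketch of hierarchical (agglomerative) clustering of expression profiles. Using 1 minus the Pearson correlation as the distance and average linkage as the merge rule are assumptions; the slide does not specify either. The gene names and values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_linkage(profiles, k):
    """Merge the two closest clusters repeatedly until only k clusters remain."""
    names = list(profiles)
    dist = {(a, b): 1 - pearson(profiles[a], profiles[b]) for a in names for b in names}
    clusters = [[name] for name in names]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average distance over all cross-cluster gene pairs.
                d = sum(dist[a, b] for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)
    return clusters

# Hypothetical profiles: g1/g2 rise across samples, g3/g4 fall.
profiles = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8.5],
    "g3": [4, 3, 2, 1],
    "g4": [8, 6, 5, 2],
}
print(average_linkage(profiles, 2))  # [['g1', 'g2'], ['g3', 'g4']]
```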

15 Pathway and Ontology-Based Analysis
- Pathways are powerful resources for understanding shared biological processes
- E.g.: KEGG, MetaCyc and BioCarta (signaling pathways)

16 Pathway and Ontology-Based Analysis
- MetaCyc:
  - a non-redundant database of experimentally determined metabolic pathways
  - the largest collection of this kind, containing over 1400 metabolic pathways

17 Pathway and Ontology-Based Analysis
- Ontology / GO:
  - provides a shared understanding of a certain domain of information
  - controlled vocabularies
  - DAG structures, with the 3 vocabularies of GO:
    - Molecular Function (MF)
    - Cellular Component (CC)
    - Biological Process (BP)

18 Pathway and Ontology-Based Analysis
- Common ontologies:
  - MIPS: an integrated source covering protein properties for a variety of complete genomes
  - MeSH: clinical terms, including disease names
  - OMIM (Online Mendelian Inheritance in Man)
  - UMLS (Unified Medical Language System)

19 Pathway and Ontology-Based Analysis
- GO enrichment test:
  - For example, the GO term "apoptosis" is a candidate for enrichment in a gene list if:
    - 20% of the genes in the list are annotated with "apoptosis", while
    - only 1% of the genes in the whole human genome fall into this functional category

20 Pathway and Ontology-Based Analysis
- Common statistical tests:
  - Chi-square test
  - binomial test
  - hypergeometric test

21 Pathway and Ontology-Based Analysis
- Hypergeometric test
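
A minimal sketch of the hypergeometric enrichment test. It asks how likely it is to see at least k annotated genes in a list of n genes drawn without replacement from a background of N genes, K of which carry the term. The symbols N, K, n, k and the counts below are assumptions chosen to mirror the 20% versus 1% "apoptosis" example; they are not taken from the slides.

```python
from math import comb

def hypergeom_enrichment_p(N, K, n, k):
    """Upper-tail probability P(X >= k) of the hypergeometric distribution.

    N: background size, K: background genes with the term,
    n: gene list size,  k: list genes with the term.
    """
    total = comb(N, n)
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Hypothetical counts: 1% of a 20000-gene background carries the term,
# while 20 of 100 genes (20%) in the list do.
p = hypergeom_enrichment_p(N=20000, K=200, n=100, k=20)
print(p)
```

With an expected count of about one annotated gene in the list, observing 20 gives a very small p-value, which is what "over-represented" means here.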

22 Pathway and Ontology-Based Analysis
- Pitfalls to avoid when using the hypergeometric test:
  - The choice of background has a substantial impact on the result. Options include:
    - all genes having at least one GO annotation
    - all genes ever known in genome databases
    - all genes on the microarray
  - GO has a hierarchical tree (or graph) structure, while the hypergeometric test assumes independence of categories
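
To illustrate the background pitfall, the sketch below evaluates the same gene list against two backgrounds. All counts are hypothetical, and it assumes for simplicity that every gene carrying the term is present in both backgrounds.

```python
from math import comb

def enrichment_p(N, K, n, k):
    """Hypergeometric upper tail P(X >= k) for background N, term size K, list size n."""
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)

# Same list: 100 genes, 5 of them annotated with a 200-gene term.
p_genome = enrichment_p(N=20000, K=200, n=100, k=5)  # background: all known genes
p_array = enrichment_p(N=5000, K=200, n=100, k=5)    # background: genes on the array
print(p_genome, p_array)
```

The smaller background raises the expected number of annotated genes in the list from about 1 to about 4, so the same observation looks far less surprising.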

23 Pathway and Ontology-Based Analysis
- Common tools:
  - DAVID
  - ArrayXPath
  - Pathway Miner
  - EASE
  - GOFish
  - GOTree, etc.


25 Gene Set-Wise Differential Expression Analysis


27 Gene Set-Wise Differential Expression Analysis
- Evaluates coordinated differential expression of gene groups
- Gene Set Enrichment Analysis (GSEA):
  - the first method developed in this category
  - evaluates, for each pre-defined gene set, whether it is significantly associated with the phenotypic classes

28 Gene Set-Wise Differential Expression Analysis
- Difference between FAA and GSEA:
  - FAA: find over-represented GO terms in a gene list of interest
  - GSEA: start from a pre-defined gene set and test how its expression changes under different conditions


30 Gene Set-Wise Differential Expression Analysis
- Advantages of gene set-wise differential expression analysis:
  - identifies modest but coordinated changes in gene expression that might be missed by conventional "individual gene-wise" differential expression analysis (many tiny expression changes can collectively create a big change)
  - straightforward biological interpretation, because the gene sets are defined by biological knowledge

31 Gene Set-Wise Differential Expression Analysis
- The Enrichment Score (ES) is calculated by evaluating the fraction of genes in S ("hits"), weighted by their correlation, and the fraction of genes not in S ("misses") present up to a given position i in the ranked gene list L, where the N genes are ordered according to their correlation with the phenotype.
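
A minimal sketch of the running-sum calculation described on this slide, assuming the usual weighted Kolmogorov-Smirnov-like form: the running sum goes up by a correlation-weighted step at each hit and down by a constant step at each miss, and ES is the largest deviation from zero. The exponent `p`, the gene names and the correlation values are illustrative assumptions.

```python
def enrichment_score(ranked_genes, correlations, gene_set, p=1.0):
    """Running-sum ES over a gene list already sorted by correlation (descending).

    Assumes the gene set has at least one hit and at least one miss in the list.
    """
    in_set = [g in gene_set for g in ranked_genes]
    # Normaliser for hits: sum of |correlation|^p over genes in the set.
    n_r = sum(abs(r) ** p for r, hit in zip(correlations, in_set) if hit)
    n_miss = len(ranked_genes) - sum(in_set)
    running, best = 0.0, 0.0
    for r, hit in zip(correlations, in_set):
        if hit:
            running += abs(r) ** p / n_r
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best

# Hypothetical ranked list L with N = 6 genes.
genes = ["a", "b", "c", "d", "e", "f"]
corr = [0.9, 0.8, 0.1, -0.2, -0.5, -0.7]
print(enrichment_score(genes, corr, {"a", "b"}))  # set concentrated at the top
print(enrichment_score(genes, corr, {"e", "f"}))  # set concentrated at the bottom
```

A set concentrated at the top of the list gives a positive ES and a set concentrated at the bottom gives a negative one; significance is normally assessed separately, for example by permuting phenotype labels.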

32 Gene Set-Wise Differential Expression Analysis
- Typical gene sets:
  - regulatory-motif sets
  - function-related sets
  - disease-related sets
- Database:
  - MSigDB:
    - 6769 gene sets
    - classified into five different collections
    - has some interesting extensions

33 Differential Co-Expression Analysis


35 Differential Co-Expression Analysis
- Co-expression analysis:
  - determines the degree of co-expression of a cluster of genes under a certain condition
- Differential co-expression analysis:
  - determines the difference in the degree of co-expression of a gene pair or a gene cluster across different conditions

36 Differential Co-Expression Analysis
- 3 major types:
  - (a) differential co-expression of gene cluster(s)
  - (b) gene pair-wise differential co-expression
  - (c) differential co-expression of paired gene sets


38 Differential Co-Expression Analysis
- Type (a): identify differentially co-expressed gene cluster(s) between two conditions
- Let conditions and genes be denoted by J and I, respectively. The mean squared residual of the model is a measure of the co-expression of the genes.
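
The formula on this slide did not survive extraction. As a hedged sketch, the code below assumes the common additive-model form of the mean squared residual, in which each entry is compared with its gene (row) mean, condition (column) mean and overall mean. Whether the presentation used exactly this form is an assumption; the matrices are hypothetical.

```python
def mean_squared_residue(matrix):
    """Mean squared residual of a genes-by-conditions matrix.

    matrix[i][j] is the expression of gene i (in I) under condition j (in J).
    A value near 0 means the genes move together across the conditions.
    """
    n_genes, n_conds = len(matrix), len(matrix[0])
    row_mean = [sum(row) / n_conds for row in matrix]
    col_mean = [sum(row[j] for row in matrix) / n_genes for j in range(n_conds)]
    all_mean = sum(row_mean) / n_genes
    total = 0.0
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            residue = value - row_mean[i] - col_mean[j] + all_mean
            total += residue ** 2
    return total / (n_genes * n_conds)

# Hypothetical clusters: the first is perfectly additive, the second is not.
coherent = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
incoherent = [[1, 0], [0, 1]]
print(mean_squared_residue(coherent), mean_squared_residue(incoherent))
```

For type (a) analysis, the score would be computed for the same gene cluster under each condition group and the two values compared.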

39 Differential Co-Expression Analysis
- Type (a) cont.

40 Differential Co-Expression Analysis
- Type (b)

41 Differential Co-Expression Analysis
- Type (b): identify differentially co-expressed gene pairs
- Techniques:
  - F-statistic
  - a meta-analytic approach
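
The slide names an F-statistic and a meta-analytic approach without giving details. As an illustration of the general idea only, the sketch below uses a different, widely known technique: a Fisher z-transformation test for the difference between a gene pair's correlation in two conditions. It is not the method from the presentation, and the correlations and sample sizes are hypothetical.

```python
from math import atanh, erfc, sqrt

def fisher_z_difference(r1, n1, r2, n2):
    """Test whether correlations r1 (n1 samples) and r2 (n2 samples) differ.

    Requires |r| < 1 and more than 3 samples per condition.
    Returns the z statistic and a two-sided p-value under a normal approximation.
    """
    z = (atanh(r1) - atanh(r2)) / sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value

# Hypothetical gene pair: strongly co-expressed in condition 1, not in condition 2.
z, p = fisher_z_difference(r1=0.8, n1=30, r2=-0.2, n2=30)
print(z, p)
```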

42 Differential Co-Expression Analysis
- Note that identification of differentially co-expressed gene clusters or gene pairs usually does not use pre-defined gene sets or pairs.
- Thus, the interpretation may also be improved by ontology and pathway-based annotation analysis.

43 Differential Co-Expression Analysis
- Type (c): the dCoxS (differential co-expression of gene sets) algorithm identifies gene set pairs that are differentially co-expressed across different conditions
- Biological pathways can be used as pre-defined gene sets, and the differential co-expression of the biological pathway pairs between conditions is analyzed.

44 Differential Co-Expression Analysis
- Type (c) cont.
- To measure the expression similarity between paired gene sets under the same condition, dCoxS defines the interaction score (IS) as the correlation coefficient between the sample-wise entropies.
- IS can be obtained even when the two pathways contain different numbers of genes, because it uses only sample-wise distances.
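
A minimal sketch of the interaction score idea: compute a sample-by-sample distance within each gene set, then correlate the two distance vectors. Euclidean distance is used here as a simple stand-in for the entropy-based measure mentioned on the slide; that substitution, the function names and the data are assumptions.

```python
from itertools import combinations
from math import sqrt

def sample_distances(genes_by_samples):
    """Distances between every pair of samples (columns), using one gene set's rows."""
    n_samples = len(genes_by_samples[0])
    columns = [[row[s] for row in genes_by_samples] for s in range(n_samples)]
    return [
        sqrt(sum((a - b) ** 2 for a, b in zip(columns[s], columns[t])))
        for s, t in combinations(range(n_samples), 2)
    ]

def pearson(x, y):
    """Pearson correlation (assumes both vectors have non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def interaction_score(set_a, set_b):
    """Correlation between the sample-wise distance vectors of two gene sets."""
    return pearson(sample_distances(set_a), sample_distances(set_b))

# Hypothetical pathways measured on the same 4 samples: 2 genes versus 3 genes.
pathway_a = [[0, 1, 2, 4], [0, 0, 0, 0]]
pathway_b = [[0, 2, 4, 8], [1, 1, 1, 1], [5, 5, 5, 5]]
print(interaction_score(pathway_a, pathway_b))
```

Both distance vectors have one entry per sample pair, which is why the two gene sets do not need the same number of genes. Differential co-expression would then compare the IS obtained under each condition.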

45 Differential Co-Expression Analysis
- Type (c) cont.

46 Biological Interpretation and Biological Semantics


48 Biological Interpretation and Biological Semantics
- Biomedical semantics provides rich descriptions for biomedical domain knowledge.
- Motivation for biological semantics: GO has limitations:
  - the result of a GO analysis is typically a long, unordered list of annotations
  - most analysis tools evaluate only one cluster at a time
  - reading the massive annotation lists is time-consuming
  - the results are hard to assemble manually
  - many annotations are redundant

49 Biological Interpretation and Biological Semantics
- Introducing BioLattice:
  - a mathematical framework based on concept lattice analysis
  - organizes traditional clusters and associated annotations into a lattice of concepts
  - provides a graphical summary
  - considers gene expression clusters as objects and annotations as attributes
  - thus, complex relations among clusters and annotations are clarified, ordered and visualized
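
A minimal sketch of the concept lattice idea, with clusters as objects and annotations as attributes. A formal concept is a pair (set of clusters, set of annotations) in which the clusters are exactly those sharing all of the annotations, and the annotations are exactly those shared by all of the clusters. The code enumerates concepts by brute force over cluster subsets, which is only practical for tiny inputs and is not the algorithm used by BioLattice. The context is hypothetical.

```python
from itertools import combinations

# Hypothetical context: expression clusters and their enriched annotations.
CONTEXT = {
    "cluster1": {"apoptosis", "cell cycle"},
    "cluster2": {"apoptosis", "immune response"},
    "cluster3": {"apoptosis", "cell cycle", "DNA repair"},
}

def common_attributes(objects, context):
    """Annotations shared by every cluster in `objects` (all annotations if empty)."""
    shared = set().union(*context.values())
    for obj in objects:
        shared &= context[obj]
    return shared

def objects_having(attributes, context):
    """Clusters annotated with every attribute in `attributes`."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}

def formal_concepts(context):
    """All (extent, intent) pairs, found by closing every subset of clusters."""
    concepts = set()
    names = sorted(context)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            intent = common_attributes(subset, context)
            extent = objects_having(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts

for extent, intent in sorted(formal_concepts(CONTEXT), key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

Ordering the concepts by inclusion of their cluster sets gives the lattice: here the concept for "apoptosis" (all three clusters) sits above the concept for "apoptosis" plus "cell cycle" (cluster1 and cluster3).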

50 Biological Interpretation and Biological Semantics
- Another advantage of BioLattice is that heterogeneous biological knowledge resources can be added


52 Biological Interpretation and Biological Semantics
- Tool to construct BioLattice:
  - the Ganter algorithm
  - http://www.snubi.org/software/biolattice/


54 Conclusion
- Review of major computational approaches to facilitate biological interpretation of high-throughput microarray and RNA-Seq experiments.
