Gene expression analysis

Presentation on theme: "Gene expression analysis"— Presentation transcript:

1 Gene expression analysis
Tutorial 7 Gene expression analysis

2 Gene expression analysis
Expression data: GEO, UCSC, ArrayExpress
General clustering methods: unsupervised clustering (hierarchical clustering, K-means clustering)
Tools for clustering: EPCLUST, MeV
Functional analysis: GO annotation

3 Gene expression data sources
Microarrays; RNA-seq experiments

4 Expression Data Matrix
Gene 1: -1.2 -2.1 -3 -1.5 1.8 2.9
Gene 2: 2.7 0.2 -1.1 1.6 -2.2 -1.7
Gene 3: -2.5 1.5 -0.1 -1 0.1
Gene 4: 2.6 2.5 -2.3
Gene 5: 2.2
Gene 6: -2.9 -1.9 -2.4
Each column holds the expression levels of all genes in a single experiment. Each row holds the expression of one gene across all experiments.

5 Expression Data Matrix
Each element of the matrix is a log ratio: log2(T/R).
T: the gene expression level in the testing sample
R: the gene expression level in the reference sample
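The log ratio defined on this slide can be computed directly. The following is a minimal sketch; the function name `log_ratio` and the choice to return `None` for missing or non-positive intensities are assumptions made for illustration, not part of the tutorial.

```python
import math

def log_ratio(test, reference):
    """Return log2(T/R), or None when a value is missing or non-positive."""
    if test is None or reference is None or test <= 0 or reference <= 0:
        return None  # treated as missing data (assumption)
    return math.log2(test / reference)

print(log_ratio(400.0, 100.0))  # 2.0  (T > R, positive log ratio)
print(log_ratio(100.0, 400.0))  # -2.0 (T < R, negative log ratio)
print(log_ratio(250.0, 250.0))  # 0.0  (T equals R)
```

Using log2 makes a fourfold increase and a fourfold decrease symmetric around zero (+2 and -2).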

6 Expression Data Matrix
Red indicates a positive log ratio, i.e. T>R
Black indicates a log ratio of zero, i.e. T≈R
Green indicates a negative log ratio, i.e. T<R
Grey indicates missing data
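The colour legend can be written as a small lookup. This is only a sketch: the function name `heatmap_colour` and the `tolerance` that decides when a log ratio counts as "zero" are assumptions, since the slide does not give a threshold.

```python
def heatmap_colour(log_ratio, tolerance=0.05):
    """Map a log ratio to the colour used in the heat map legend."""
    if log_ratio is None:
        return "grey"   # missing data
    if abs(log_ratio) <= tolerance:
        return "black"  # T is approximately equal to R
    return "red" if log_ratio > 0 else "green"

# Gene 1 from the expression matrix, plus one missing value
row = [-1.2, -2.1, -3, -1.5, 1.8, 2.9, None]
print([heatmap_colour(v) for v in row])
```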

7 Different representations
Microarray data: different representations
[Figure: log ratio plotted per experiment (Exp); positive values indicate T>R, negative values T<R]

8 How to search for expression profiles
GEO (Gene Expression Omnibus); Human genome browser; ArrayExpress

10 Searching for expression profiles in the GEO
Datasets: further curated, statistically comparable datasets, suitable for analysis with GEO tools
Expression profiles by gene
Probe sets
Microarray experiments
Groups of related microarray experiments

11 Clustering, dataset download, statistical analysis

12 Clustering analysis

13 Clustering, dataset download, statistical analysis

14 The expression distribution for different lines in the cluster

15 Searching for expression profiles in the Human Genome browser.

16 Keratin 10 is highly expressed in skin

17 ArrayExpress

22 How to analyze gene expression data

23 Unsupervised Clustering - Hierarchical Clustering

24 Hierarchical Clustering
Genes with similar expression patterns are grouped together and connected by a series of branches (a dendrogram). Leaves (shapes in our case) represent genes, and the length of the path between two leaves represents the distance between the two genes.

25 How to determine the similarity between two genes? (for clustering)
Patrik D'haeseleer, "How does gene expression clustering work?", Nature Biotechnology 23 (2005).
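Two measures commonly used to answer this question are Euclidean distance and Pearson correlation. The sketch below compares them on Gene 1 and Gene 2 from the expression matrix; the function names are illustrative, and the slide itself does not state which measure the tutorial prefers.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pearson_correlation(a, b):
    """Correlation of two profiles; assumes neither profile is constant."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

gene1 = [-1.2, -2.1, -3, -1.5, 1.8, 2.9]
gene2 = [2.7, 0.2, -1.1, 1.6, -2.2, -1.7]
scaled = [2 * v for v in gene1]  # same shape as gene1, twice the amplitude

print(euclidean_distance(gene1, gene2), pearson_correlation(gene1, gene2))
print(euclidean_distance(gene1, scaled), pearson_correlation(gene1, scaled))
```

The scaled profile shows the difference between the two measures: its correlation with Gene 1 is 1 because the shape is identical, while its Euclidean distance is non-zero because the magnitudes differ.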

26 Hierarchical clustering finds an entire hierarchy of clusters.
To obtain a given number of clusters, cut the tree at the level that yields that number (in this case, four).
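A bottom-up (agglomerative) version of this idea can be sketched in plain Python: start with one cluster per gene, repeatedly merge the two closest clusters, and stop when the requested number of clusters remains, which corresponds to cutting the tree at that level. Euclidean distance, average linkage, and the toy profiles are assumptions chosen for illustration; the tools shown in this tutorial may use other settings.

```python
import math
from itertools import combinations

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(profiles[i], profiles[j]) for i in c1 for j in c2]
    return sum(dists) / len(dists)

def agglomerate(profiles, n_clusters):
    """Merge the closest clusters until only n_clusters remain."""
    clusters = [[i] for i in range(len(profiles))]  # one cluster per gene
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], profiles),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so index a stays valid
        del clusters[b]
    return clusters

# Made-up two-experiment profiles for six genes (indices 0 to 5)
toy_profiles = [[0, 0], [0, 1], [10, 10], [10, 11], [20, 0], [50, 50]]
print(agglomerate(toy_profiles, 4))
```

Recording the order and distance of each merge, instead of stopping early, would give the full dendrogram.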

27 Hierarchical clustering result
Five clusters

28 Unsupervised Clustering – K-means clustering
An algorithm that partitions the data into K groups (here K=4).

29 How does it work?
Step 1: k initial "means" (in this case k=3) are randomly selected from the data set (shown in color).
Step 2: k clusters are created by associating every observation with the nearest mean.
Step 3: The centroid of each of the k clusters becomes the new mean.
Step 4: Steps 2 and 3 are repeated until convergence has been reached.
The algorithm iteratively divides the genes into K groups and recalculates the center of each group. The result is a partition into K clusters that locally minimizes the distances of genes to their cluster centers; it can depend on the initial means.
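The four steps can be sketched in plain Python as follows. The function names, the fixed random `seed`, the Euclidean distance, and the toy profiles are assumptions for illustration; this is not the implementation used by the tools in this tutorial.

```python
import math
import random

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_profile(profiles):
    """Centroid: the column-wise mean of a list of profiles."""
    return [sum(col) / len(profiles) for col in zip(*profiles)]

def k_means(profiles, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    # Step 1: pick k initial means at random from the data set
    means = [list(p) for p in rng.sample(profiles, k)]
    labels = None
    for _ in range(max_iter):
        # Step 2: assign every profile to its nearest mean
        new_labels = [min(range(k), key=lambda c: euclidean(p, means[c]))
                      for p in profiles]
        # Step 4: stop once the assignment no longer changes
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: the centroid of each cluster becomes its new mean
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # keep the old mean if a cluster is empty
                means[c] = mean_profile(members)
    return labels, means

toy = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, means = k_means(toy, k=2)
print(labels)
```

Running the function with several seeds and keeping the tightest result is a common way to reduce the dependence on the initial means.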

30 How should we determine K?
Trial and error; or, as a rule of thumb, take K as the square root of the number of genes.
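As a worked example of the square-root rule of thumb, the sketch below rounds the square root of the gene count to the nearest integer. The function name `suggested_k` and the rounding choice are assumptions; the slide gives only the rule itself.

```python
import math

def suggested_k(n_genes):
    """Rule-of-thumb starting value for K: square root of the gene count."""
    return max(1, round(math.sqrt(n_genes)))

print(suggested_k(100))  # 10
print(suggested_k(6))    # 2
```

The value is best treated as a starting point for the trial-and-error approach rather than a final answer.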

31 Tools for clustering - EPCLUST

38 In the input matrix each column should represent a gene and each row should represent an experiment (or individual).
Options shown: Hierarchical clustering; Edit the input matrix (Transpose, Normalize, Randomize); K-means clustering
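The Transpose and Normalize edits can be illustrated in plain Python. What EPCLUST does internally is not stated on the slide, so the per-row z-score used here (mean 0, standard deviation 1) is an assumption, as are the function names.

```python
import math

def transpose(matrix):
    """Swap rows and columns of a rectangular matrix."""
    return [list(col) for col in zip(*matrix)]

def normalize_rows(matrix):
    """Scale each row to mean 0 and standard deviation 1 (z-scores)."""
    result = []
    for row in matrix:
        mean = sum(row) / len(row)
        sd = math.sqrt(sum((v - mean) ** 2 for v in row) / len(row))
        # A constant row has sd 0; map it to zeros instead of dividing by 0
        result.append([(v - mean) / sd if sd > 0 else 0.0 for v in row])
    return result

# Gene 1 and Gene 2 from the expression matrix: rows are genes here
genes_by_experiments = [
    [-1.2, -2.1, -3, -1.5, 1.8, 2.9],
    [2.7, 0.2, -1.1, 1.6, -2.2, -1.7],
]
experiments_by_genes = transpose(genes_by_experiments)
print(len(experiments_by_genes), len(experiments_by_genes[0]))  # 6 2
print(normalize_rows(genes_by_experiments)[0])
```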

39 In the input matrix each column should represent a gene and each row should represent an experiment (or individual). Hierarchical clustering

40 Data Clusters

41 In the input matrix each column should represent a gene and each row should represent an experiment (or individual). K-means clustering

42 Samples found in cluster
Graphical representation of the cluster

43 10 clusters, as requested

44 Tools for clustering - MeV
MultiExperiment Viewer

45 Gene expression function analysis
Probe set IDs: 1007_s_at, 1053_at, 117_at, 121_at, 1255_g_at, 1294_at, 1316_at, 1320_at, 1405_i_at, 1431_at, 1438_at, 1487_at, 1494_f_at, 1598_g_at
What can we learn from clusters?

46 Gene Ontology (GO)
The Gene Ontology project provides an ontology of defined terms representing gene product properties. The ontology covers three domains:

47 Gene Ontology (GO)
Cellular Component (CC): the parts of a cell or its extracellular environment.
Molecular Function (MF): the elemental activities of a gene product at the molecular level, such as binding or catalysis.
Biological Process (BP): operations or sets of molecular events with a defined beginning and end, pertinent to the functioning of integrated living units: cells, tissues, organs, and organisms.

48 The GO tree

49 GO sources
ISS: Inferred from Sequence/Structural Similarity
IDA: Inferred from Direct Assay
IPI: Inferred from Physical Interaction
TAS: Traceable Author Statement
NAS: Non-traceable Author Statement
IMP: Inferred from Mutant Phenotype
IGI: Inferred from Genetic Interaction
IEP: Inferred from Expression Pattern
IC: Inferred by Curator
ND: No Data available
IEA: Inferred from Electronic Annotation

50 Search by AmiGO

51 Results for alpha-synuclein

52 DAVID http://david.abcc.ncifcrf.gov/
DAVID: Functional Annotation Bioinformatics Microarray Analysis
Identify enriched biological themes, particularly GO terms
Discover enriched functional-related gene/protein groups
Cluster redundant annotation terms
Explore gene names in batch
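To give a feel for what "enriched" means, the sketch below computes a plain one-sided hypergeometric p-value: the chance of seeing at least the observed number of genes with a given GO term in a cluster if the cluster were drawn at random. This is a simplified stand-in chosen for illustration, not DAVID's own statistic, and the function name, parameter names, and example counts are all hypothetical.

```python
from math import comb

def enrichment_p_value(hits, cluster_size, annotated, population):
    """P(at least `hits` annotated genes in a random cluster).

    hits         - genes in the cluster that carry the GO term
    cluster_size - number of genes in the cluster
    annotated    - genes in the whole background that carry the term
    population   - number of genes in the background
    """
    total = comb(population, cluster_size)
    upper = min(cluster_size, annotated)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, cluster_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total

# Hypothetical counts: 8 of 20 cluster genes carry a term that
# annotates 100 of 10000 background genes
print(enrichment_p_value(8, 20, 100, 10000))
```

A small p-value suggests the term is over-represented in the cluster; when many terms are tested, the p-values would also need a multiple-testing correction.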

53 Annotation, classification, ID conversion

54 Functional annotation
Upload; annotation options

57 Gene expression analysis
Expression data: GEO, UCSC, ArrayExpress
General clustering methods: unsupervised clustering (hierarchical clustering, K-means clustering)
Tools for clustering: EPCLUST, MeV
Functional analysis: GO annotation

