Expression profiling & functional genomics Exercises.



Differential expression

Use the normalized data to find genes that are statistically significantly differentially expressed, using the CyberT software.
– File: oefnbaldi.xls. The file contains the 4 normalised ratios (see SNOMAD).
– T test on the ratios.
– Per gene, per condition, 4 measurements are available (paired samples).

Array   Condition   Dye   Replica
1       1           1     L
1       1           1     R
1       2           2     L
1       2           2     R
2       2           1     L
2       2           1     R
2       1           2     L
2       1           2     R

Results CyberT
– Mn: mean ratio
– # obs: number of ratios available to calculate the statistics
– SD: standard deviation of the ratio estimates
– T, p: calculated t and p values that indicate the significance of the measurement
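The columns listed above can be reproduced by hand for a single gene. The following is a minimal sketch, assuming the ratios are log ratios (so that "no change" corresponds to 0) and using a plain one-sample t statistic; it is not necessarily the exact statistic CyberT computes. The function name `ratio_summary` and the example values are hypothetical. The p value is left out because it requires the t distribution, which the Python standard library does not provide.

```python
import math
import statistics

def ratio_summary(log_ratios):
    """Return (Mn, n_obs, SD, t) for one gene's log ratios.

    Missing measurements are passed as None and skipped.
    Assumption: the null hypothesis is a mean log ratio of 0.
    """
    values = [x for x in log_ratios if x is not None]
    n = len(values)
    if n < 2:
        # Not enough observations for a standard deviation or a t statistic.
        return (values[0] if values else None, n, None, None)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD, n - 1 in the denominator
    t = mean / (sd / math.sqrt(n)) if sd > 0 else None
    return mean, n, sd, t

# Hypothetical gene with the 4 normalised log ratios of one condition.
mn, n_obs, sd, t = ratio_summary([1.0, 1.2, 0.8, 1.0])
print(f"Mn={mn:.3f}  #obs={n_obs}  SD={sd:.3f}  T={t:.3f}")
```

With only 4 measurements per gene the standard deviation is estimated from very little data, which is why a small SD can produce a large t value for a modest mean ratio.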

Results CyberT

SAM

MARAN (ANOVA based): Filtering, Linearisation, Bootstrapping, Log transformation

Two typical cDNA designs: Reference design (Spellman data set)
– Reference: unsynchronized cells
– Condition: synchronized cells sampled during the cell cycle at 18 distinct time intervals
– Experimental design: on each array, one condition (Condition 1, 2, 3, 4, …) is labelled with Dye1 and Condition 19 with Dye2, with one replica (L) per array

MARAN: data were precalculated
– Login: username userGGS, password Njoedel
– Uploaded data:
  Spellman: test cell cycle (reference design)
  Mouse: Latin square design (log transformed)

Spellman non log transformed

MARAN

Complex cDNA design: Latin square (mouse data set)
– Reference: normal mouse
– Condition: pygmy mouse
– Two experiments: T=1 and T=2 reflect two sample time points
– 2 batches (B1, B2): not all genes of the genome are on one array
– R and G denote the red and green channels

Array   Time   Batch   Test   Ref
A1      T1     B1      R      G
A2      T1     B1      G      R
A3      T1     B2      R      G
A4      T1     B2      R      G
A5      T2     B1      R      G
A6      T2     B1      G      R
A7      T2     B2      R      G
A8      T2     B2      G      R
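Because the test sample is labelled red on some arrays and green on others, the log ratio of each array has to be oriented before arrays are compared. The sketch below shows one way to do this, assuming raw red and green intensities are available per gene; the function name `oriented_log_ratio` and the intensity values are hypothetical and do not come from the mouse data set.

```python
import math

def oriented_log_ratio(red, green, test_channel):
    """Return log2(test / reference) from raw red and green intensities.

    test_channel is "R" when the test sample is in the red channel
    and "G" when it is in the green channel.
    """
    if test_channel == "R":
        return math.log2(red / green)
    if test_channel == "G":
        return math.log2(green / red)
    raise ValueError("test_channel must be 'R' or 'G'")

# Hypothetical intensities for one gene on a dye-swapped pair of arrays.
a1 = oriented_log_ratio(red=800.0, green=200.0, test_channel="R")
a2 = oriented_log_ratio(red=250.0, green=1000.0, test_channel="G")

# Averaging the two oriented ratios cancels a gene-specific dye bias,
# since that bias enters the two arrays with opposite sign.
dye_swap_mean = (a1 + a2) / 2
print(a1, a2, dye_swap_mean)
```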

Clustering of expression profiling experiments

Complex cDNA design: Latin square (mouse data set), with the same array layout as above. Experimental design: 8 arrays, 2 batches, 2 dyes, 2 conditions.

Dataset: yeast cell cycle data set
– Data set is preprocessed (slide by slide)
– Expression level of each gene is expressed as the log of the ratio
– 15 experiments, 7000 genes
– Filtering based on variance => retain 3000 genes
– Rescaling (mean, variance)
– Cluster the experiment using:
  K-means (EPCLUST)
  Hierarchical clustering (EPCLUST)
  AQBC (INCLUsive)
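The filtering and rescaling steps can be sketched as follows. This is an illustration only: it assumes that "rescaling (mean, variance)" means shifting each gene profile to mean 0 and scaling it to variance 1, and the helper names `filter_by_variance` and `rescale` and the toy profiles are hypothetical.

```python
import statistics

def filter_by_variance(profiles, keep):
    """Keep the `keep` genes whose profiles vary most across experiments."""
    ranked = sorted(
        profiles,
        key=lambda gene: statistics.pvariance(profiles[gene]),
        reverse=True,
    )
    return {gene: profiles[gene] for gene in ranked[:keep]}

def rescale(profile):
    """Shift a profile to mean 0 and scale it to variance 1."""
    mean = statistics.mean(profile)
    sd = statistics.pstdev(profile)
    if sd == 0:
        # A flat profile carries no shape information.
        return [0.0] * len(profile)
    return [(x - mean) / sd for x in profile]

# Toy log ratios for 3 genes over 4 experiments.
toy = {
    "geneA": [0.1, 0.0, -0.1, 0.0],
    "geneB": [2.0, -2.0, 2.0, -2.0],
    "geneC": [1.0, 0.0, -1.0, 0.0],
}
kept = filter_by_variance(toy, keep=2)
scaled = {gene: rescale(profile) for gene, profile in kept.items()}
print(sorted(scaled))
```

For the data set described above, `keep` would be 3000 out of 7000 genes. After rescaling, genes are compared by the shape of their profile rather than by the amplitude of the change.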

Exercises Clustering INCLUsive


Exercises Clustering INCLUsive Average profile

Exercises EPCLUST

Exercises EPCLUST Remember the ID of the file

Exercises EPCLUST: Check if your data were uploaded. Go back and refresh the page to return to the original page. Continue here.


Exercises EPCLUST: Make a selection of the most interesting genes. Because filtering was already performed, select all data.

Exercises EPCLUST: Try hierarchical clustering and K-means clustering

Exercises EPCLUST: K-means result. K-means with 30 clusters, Euclidean distance.
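For readers who want to see what K-means with Euclidean distance does, here is a minimal pure-Python sketch. It is not the EPCLUST implementation; the function name `kmeans`, the random initialisation from the data points, and the toy profiles are assumptions made for illustration. The exercise uses 30 clusters on the filtered yeast data; the toy example uses 2.

```python
import math
import random

def kmeans(points, k, iterations=100, seed=0):
    """Plain K-means with Euclidean distance.

    Returns (labels, centroids). Initial centroids are k distinct
    data points chosen at random (an assumption of this sketch).
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each profile goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Toy profiles forming two well separated groups.
profiles = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centroids = kmeans(profiles, k=2)
print(labels)
```

The number of clusters has to be chosen in advance, and the result can depend on the initial centroids, so it is worth rerunning with a different `seed` and comparing the clusters.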

Exercises EPCLUST: Try hierarchical clustering and K-means clustering

The comparison between the content of these two clusters can be seen in the file vergelijkingcluster.xls

Exercises EPCLUST: hierarchical clustering. Analyze the tree and try to detect the number of clusters in the dataset. Click on a node to view the profile of a subcluster.
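Detecting the number of clusters in the tree amounts to deciding where to cut it. The sketch below builds the tree bottom-up and stops once a requested number of clusters remains. It assumes average linkage with Euclidean distance, which is only one of several possible linkage choices and may differ from the setting used in EPCLUST; the function name `average_linkage` and the toy profiles are hypothetical.

```python
import math

def average_linkage(points, n_clusters):
    """Agglomerative clustering; returns clusters as lists of point indices."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average Euclidean distance over all pairs drawn from a and b.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        pairs = [
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters

# Toy profiles: two tight pairs and one outlier.
profiles = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0]]
result = average_linkage(profiles, n_clusters=3)
print(result)
```

Unlike K-means, the full tree does not require the number of clusters up front; the choice is made afterwards by inspecting the subclusters under each node.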

Exercises EPCLUST: automatic linking to other tools



FATIGO: calculating statistical overrepresentation using GO
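Statistical overrepresentation of a GO term in a cluster can be illustrated with a one-sided hypergeometric test. This is a common choice for this kind of question and is shown here as a sketch only; the exact test and the multiple-testing correction applied by the tool are not described in these slides. The function name `overrepresentation_p` and the numbers in the example are hypothetical.

```python
from math import comb

def overrepresentation_p(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the GO term.

    k: genes in the cluster annotated with the term
    n: genes in the cluster
    K: genes in the whole data set annotated with the term
    N: genes in the whole data set
    """
    upper = min(n, K)
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)
    )
    return favourable / comb(N, n)

# Toy example: 10 genes in total, 4 annotated with the term,
# a cluster of 3 genes containing 2 annotated ones.
p = overrepresentation_p(k=2, n=3, K=4, N=10)
print(round(p, 4))
```

When many GO terms are tested for the same cluster, the resulting p values need a correction for multiple testing before they are interpreted.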