Introduction to Microarrays Dr. Özlem İLK & İbrahim ERKAN 2011, Ankara.


Gene  The fundamental unit in a living organism. It holds information necessary for building cells and passing genetic traits to offsprings.  Every single cell in the human body have the same set of genes.  However, different genes are active (and therefore "expressed") in different kinds of cells and tissues. 2

Gene  The amount of activity (expression) in a gene is an indicator whether it is being used to form an organic structure or not. 3

What is a microarray?
• DNA microarrays are small solid surfaces that hold thousands of gene sequences.

Microarray  The spots of gene sequences are placed in an order.  Therefore, the researcher can keep track of the gene sequences and uses the location of each spot in the array to identify a particular gene sequence. 5

Microarray  Microarrays help easily and quickly analyze thousands of genes at once.  Analyzing genes refers to the determination of activity and also the amount of activity in a specific gene.  This can help finding active genes in the existence of a certain disease or a treatment providing guidance for medical researchers.  Interdisciplinary work – statisticians, biologists, doctors, engineers, … 6

Methods  Preprocessing – MAS5, dChip,RMA, gcRMA...  fold change  Clustering  ANOVA, t-test  multiple testing correction Probe ID _g_at _at _at

Fold change: (GE1 / GE2)
+ Easy
+ Can be used on data sets without any replicates
− Not a statistical method: it does not take the variance into account, so its sensitivity and reliability are in doubt
[Table with columns Probe ID, Case, Control, Fold change. Only the ratio for probe 100_g_at survives in the transcript: 7.4587/7.321. The other rows are not preserved.]
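The ratio above can be sketched in a few lines of Python. This is only an illustration: the function name `fold_change` is hypothetical, and the two numbers are the ones shown for probe 100_g_at, assumed here to be the case value (GE1) and the control value (GE2).

```python
def fold_change(case, control):
    """Return the fold change GE1 / GE2 for one probe."""
    if control == 0:
        raise ValueError("control expression must be non-zero")
    return case / control


# Values taken from the slide's ratio for probe 100_g_at
# (assumed to be case / control).
fc = fold_change(7.4587, 7.321)
print(round(fc, 4))
```

Because no variance enters the calculation, the same ratio is reported whether the two values are stable across replicates or not, which is the weakness listed above.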

Clustering
+ Easy
− Cannot measure statistical significance, and would find clusters even if the dataset contains no meaningful ones
− Can be affected by the data transformation or by the measurement unit (Xu et al., Human Molecular Genetics, 2002)

ANOVA
H0: µ1i = µ2i = µ3i, H1: at least one mean is different (i: probe set)
[Table with columns Probe ID; Drug1, Rep1; Drug1, Rep2; Drug2, Rep1; Drug2, Rep2; Drug3, Rep1; Drug3, Rep2; p-value. The values are not preserved in the transcript.]
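For a single probe set, the one-way ANOVA F statistic behind this test can be sketched as follows. The expression values are made up for illustration (three drugs, two replicates each, matching the table layout), and `one_way_anova_f` is a hypothetical helper name. Converting F into a p-value requires the F distribution, which is left to a statistics package.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    k = len(groups)                      # number of groups (drugs)
    n = sum(len(g) for g in groups)      # total number of measurements
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the replicates around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between, df_within = k - 1, n - k
    if df_within <= 0 or ss_within == 0:
        raise ValueError("replicates with non-zero variance are required")
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical values for one probe set: Drug1, Drug2, Drug3.
f_stat, df_b, df_w = one_way_anova_f([[7.0, 7.2], [8.0, 8.2], [9.0, 9.2]])
print(round(f_stat, 2), df_b, df_w)
```

The `ValueError` branch reflects the limitation that ANOVA cannot be applied without replicates: with one value per group there is no within-group variance to compare against.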

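Interpretation of the results
[Figure: distribution of the test statistic z, with a rejection region in each tail]
• If p-value < alpha, reject H0.
• Alpha is usually 0.05, 0.01, or 0.1.
• E.g., if 10,000 genes are tested at alpha = 0.05 and none is truly differentially expressed, the expected number of false positives is 10,000 × 0.05 = 500.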
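The decision rule and the false-positive arithmetic can be written out as a short Python sketch. The helper names are hypothetical, and the expected count assumes that every null hypothesis is true.

```python
def reject_h0(p_value, alpha=0.05):
    """Decision rule from the slide: reject H0 when p-value < alpha."""
    return p_value < alpha


def expected_false_positives(n_tests, alpha):
    """Expected number of false positives when all null hypotheses hold."""
    return n_tests * alpha


expected = expected_false_positives(10_000, 0.05)
print(round(expected))
print(reject_h0(0.03), reject_h0(0.20))
```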

ANOVA
+ Can test the difference between groups
+ Takes the variance into account
− Cannot be applied to data without replicates
− Assumes the data come from a normal distribution
− Rejecting H0 does not tell us which groups differ; pairwise comparisons (t-tests) are needed for that.

t-test
H0: µ1i = µ2i, H1: µ1i ≠ µ2i
+ Can measure the difference between two groups
+ Takes the variance into account
− Cannot be applied to data without replicates
− Assumes the data come from a normal distribution
Warning: choose the paired or the unpaired t-test according to the experimental design.
[Table with columns Probe ID, p (1,2), p (2,3), p (1,3). The values are not preserved in the transcript.]
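A minimal sketch of the unpaired (pooled-variance) t statistic for one pairwise comparison, using only the Python standard library. The data are hypothetical and `pooled_t_statistic` is an illustrative name; a paired design would instead test the per-pair differences, which this sketch does not cover.

```python
import math
import statistics


def pooled_t_statistic(group1, group2):
    """Return (t, df) for an unpaired two-sample t-test with pooled variance."""
    n1, n2 = len(group1), len(group2)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two replicates")

    mean1, mean2 = statistics.mean(group1), statistics.mean(group2)
    var1, var2 = statistics.variance(group1), statistics.variance(group2)

    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    if std_err == 0:
        raise ValueError("zero variance in both groups")
    return (mean1 - mean2) / std_err, df


# Hypothetical replicates for one probe set under Drug1 and Drug2.
t_stat, df = pooled_t_statistic([7.0, 7.2], [8.0, 8.2])
print(round(t_stat, 3), df)
```

The p-value for each comparison, such as p (1,2), would then come from the t distribution with `df` degrees of freedom.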

Multiple testing correction
• Assume we run two independent tests (!), one for each of two genes. Let the probability of each test being correct be 0.95. By independence, the probability of both tests being correct is 0.95 × 0.95 = 0.9025.
− Bonferroni
  • New alpha = alpha / number of tests
  • Too conservative
+ Benjamini-Hochberg FDR
  • Example: suppose you found 100 expressed genes out of 10,000 at alpha = 0.0001.
  • Expected number of false positives: 10,000 genes × 0.0001 = 1
  • False Discovery Rate (FDR) = 1/100 = 0.01
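The ideas on this slide can be sketched in Python as follows. The function names and the list of p-values are hypothetical; the Benjamini-Hochberg step shown is the standard step-up procedure (reject the k smallest p-values, where k is the largest rank with p(k) ≤ k/m × q), assumed here to be what the slide refers to.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where H0 is rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest rank k whose sorted p-value satisfies p(k) <= k/m * q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank

    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Probability that two independent tests are both correct.
both_correct = 0.95 * 0.95

# FDR arithmetic from the slide: 1 expected false positive in 100 discoveries.
fdr = (10_000 * 0.0001) / 100

# Hypothetical p-values for eight genes.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
bh_rejected = benjamini_hochberg(p_values, q=0.05)
bonf_level = bonferroni_alpha(0.05, len(p_values))
bonf_rejected = [p < bonf_level for p in p_values]
print(sum(bh_rejected), sum(bonf_rejected))
```

On these made-up p-values, Bonferroni keeps only the smallest one while Benjamini-Hochberg keeps the two smallest, which illustrates why Bonferroni is described as too conservative.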

Recommended references
• Pavlidis, P. (2003). Using ANOVA for gene selection from microarray studies of the nervous system. Methods, 31, 282–289.
• Books under call number QP624.5 in the library
• A few courses in the Computer Engineering Department (Tolga Can)