Introduction to Microarrays Dr. Özlem İLK & İbrahim ERKAN 2011, Ankara.


1 Introduction to Microarrays Dr. Özlem İLK & İbrahim ERKAN 2011, Ankara

2 Gene
• The fundamental unit of heredity in a living organism. It holds the information necessary for building cells and passing genetic traits to offspring.
• Almost every cell in the human body has the same set of genes.
• However, different genes are active (and therefore "expressed") in different kinds of cells and tissues.

3 Gene
• The amount of activity (expression) of a gene indicates whether or not it is being used to form an organic structure.

4 What is a microarray?
• DNA microarrays are small solid surfaces that hold thousands of gene sequences.

5 Microarray
• The spots of gene sequences are placed in a fixed order.
• Therefore, the researcher can keep track of the gene sequences and use the location of each spot in the array to identify a particular gene sequence.

6 Microarray
• Microarrays make it possible to analyze thousands of genes at once, easily and quickly.
• Analyzing a gene means determining whether it is active and how active it is.
• This can help find the genes that are active in the presence of a certain disease or treatment, providing guidance for medical researchers.
• Interdisciplinary work: statisticians, biologists, doctors, engineers, …

7 Methods
• Preprocessing: MAS5, dChip, RMA, gcRMA, ...
• Fold change
• Clustering
• ANOVA, t-test
• Multiple testing correction

Probe ID    1        2        3        4        5        6
100_g_at    7.4587   7.321    7.2037   7.3204   7.333    7.486
1000_at     6.2708   6.276    6.0370   6.046    6.052    6.262
1708_at     11.62    11.409   11.398   4.35     4.358    4.534

8 Fold change: (GE1 / GE2)
+ Easy
+ Can be used in data sets without any replicates
− Not a statistical method: it doesn't take the variance into account, so its sensitivity and reliability are in doubt

Probe ID    Case     Control   Fold change
100_g_at    7.4587   7.3204    1.019 (7.4587 / 7.3204)
1000_at     6.2708   6.046     1.037
1708_at     11.62    4.35      2.67
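The ratio on this slide can be sketched in a few lines of Python. This is only an illustration: the helper name `fold_change` is made up here, and the case/control values are the ones listed in the slide's table.

```python
def fold_change(case: float, control: float) -> float:
    """Fold change as defined on the slide: GE1 / GE2 (case over control)."""
    if control == 0:
        raise ValueError("control expression must be non-zero")
    return case / control


# (case, control) expression values taken from the slide's table
expression = {
    "100_g_at": (7.4587, 7.3204),
    "1000_at": (6.2708, 6.046),
    "1708_at": (11.62, 4.35),
}

fold_changes = {
    probe: fold_change(case, control)
    for probe, (case, control) in expression.items()
}

for probe, fc in fold_changes.items():
    print(f"{probe}: {fc:.3f}")
```

Only 1708_at moves far from 1, but since there are no replicates the ratio alone says nothing about whether that difference exceeds the measurement noise.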

9 Clustering
+ Easy
− Can't measure statistical significance, and will find clusters even if the dataset contains no meaningful ones
− Can be affected by data transformation or by the measurement unit (Xu et al., Human Molecular Genetics, 2002)
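A small sketch of the measurement-unit problem, assuming Euclidean distance as the similarity measure (the slide does not name one). The three profiles A, B, C and the helper `nearest` are hypothetical, chosen only to make the effect visible.

```python
import math


def nearest(name, points):
    """Return the name of the point closest to `name` (Euclidean distance)."""
    origin = points[name]
    others = {k: v for k, v in points.items() if k != name}
    return min(others, key=lambda k: math.dist(origin, others[k]))


# Hypothetical profiles measured on two features: (feature 1, feature 2)
raw = {"A": (1.0, 100.0), "B": (2.0, 100.0), "C": (1.0, 110.0)}

# The same data with feature 2 expressed in a unit 100 times larger
rescaled = {k: (x, y / 100.0) for k, (x, y) in raw.items()}

nearest_raw = nearest("A", raw)
nearest_rescaled = nearest("A", rescaled)
print(nearest_raw, nearest_rescaled)
```

The nearest neighbour of A switches from B to C after the unit change, so a distance-based clustering of the same samples could group them differently.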

10 ANOVA
H0: µ1i = µ2i = µ3i; H1: at least one mean is different; i: probe set

Probe ID    Drug1, Rep1   Drug1, Rep2   Drug2, Rep1   Drug2, Rep2   Drug3, Rep1   Drug3, Rep2   p-value
100_g_at    7.4587        7.321         7.2037        7.3204        7.333         7.486         0.46
1000_at     6.2708        6.276         6.0370        6.046         6.052         6.262         0.0043
1708_at     11.62         11.409        11.398        11.35         4.358         4.534         0.00288
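A minimal sketch of the one-way ANOVA computation for one probe set, using only the Python standard library. The function name is hypothetical. The p-value uses a closed form that holds only when there are exactly three groups (two numerator degrees of freedom); in practice a statistics library such as SciPy (`scipy.stats.f_oneway`) would normally be used. The slide does not say how its p-value column was computed, so the output is not expected to match it exactly.

```python
from statistics import mean


def one_way_anova_3_groups(groups):
    """Return (F, p) for a one-way ANOVA with exactly three groups."""
    if len(groups) != 3:
        raise ValueError("the closed-form p-value below needs exactly 3 groups")
    values = [x for g in groups for x in g]
    grand_mean = mean(values)
    df_between = len(groups) - 1            # 2
    df_within = len(values) - len(groups)   # N - 3
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    # With 2 numerator degrees of freedom: P(F > f) = (1 + 2f/d2) ** (-d2/2)
    p_value = (1 + 2 * f_stat / df_within) ** (-df_within / 2)
    return f_stat, p_value


# Replicates per drug, copied from the slide's table
data = {
    "100_g_at": [(7.4587, 7.321), (7.2037, 7.3204), (7.333, 7.486)],
    "1708_at": [(11.62, 11.409), (11.398, 11.35), (4.358, 4.534)],
}

results = {probe: one_way_anova_3_groups(groups) for probe, groups in data.items()}
for probe, (f_stat, p_value) in results.items():
    print(f"{probe}: F = {f_stat:.2f}, p = {p_value:.3g}")
```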

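11 Interpretation of the results
[Figure: standard normal curve over z, with rejection regions in both tails, below -1.96 and above 1.96]
• If p-value < alpha, then reject H0.
• Alpha is usually 0.05, 0.01 or 0.1.
• E.g., with 10,000 genes tested at alpha = 0.05 (and H0 true for all of them), the expected number of false positives is 10,000 × 0.05 = 500.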

12 ANOVA
+ Can test the difference between groups
+ Takes the variance into account
− Can't be applied to data without replicates
− Assumes the data come from a Normal distribution
− Rejecting H0 does not tell us which groups are different; we need pairwise comparisons (t-tests) for this.

13 t-test
H0: µ1i = µ2i; H1: µ1i ≠ µ2i
+ Can measure the difference between two groups
+ Takes the variance into account
− Can't be applied to data without replicates
− Assumes the data come from a Normal distribution
Warning: paired vs. unpaired t-test

Probe ID    p (1,2)   p (2,3)   p (1,3)
100_g_at    0.292     0.263     0.863
1000_at     0.0005    0.386     0.383
1708_at     0.325     0.00017   0.0004
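A sketch of an unpaired, equal-variance t-test for two groups of two replicates each, in standard-library Python. The function name is hypothetical, and the closed-form p-value is valid only for 2 degrees of freedom (2 + 2 - 2). With the 1708_at replicates from the ANOVA slide, it gives values close to the 1708_at row of the table, which suggests, but does not confirm, that the slide used this kind of test.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_two_reps(a, b):
    """Unpaired equal-variance t-test for two groups of two replicates each.

    Returns (t, two-sided p).
    """
    if len(a) != 2 or len(b) != 2:
        raise ValueError("the closed-form p-value below assumes 2 + 2 replicates")
    df = len(a) + len(b) - 2  # 2
    pooled_var = ((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
    # Student t with 2 degrees of freedom: two-sided p = 1 - |t| / sqrt(2 + t^2)
    p = 1 - abs(t) / sqrt(2 + t * t)
    return t, p


# 1708_at replicates per drug, copied from the ANOVA slide
drug = {1: (11.62, 11.409), 2: (11.398, 11.35), 3: (4.358, 4.534)}

p_values = {}
for i, j in [(1, 2), (2, 3), (1, 3)]:
    t, p = pooled_t_two_reps(drug[i], drug[j])
    p_values[(i, j)] = p
    print(f"p ({i},{j}) = {p:.3g}")
```

A paired test would instead work on the per-pair differences, which is why the warning on the slide matters: the two variants can give very different p-values for the same numbers.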

14 Multiple testing correction
• Assume that we are running two independent tests (!) for two genes. Let the probability of each being correct be 0.95. Due to independence, the probability of both tests being correct is 0.95 × 0.95 = 0.9025, so the chance of at least one error rises to 1 - 0.9025 = 0.0975.
− Bonferroni
  • New alpha = alpha / number of tests
  • Too conservative
+ Benjamini-Hochberg FDR
  • Example: suppose you found 100 differentially expressed genes out of 10,000 at alpha = 0.0001
  • Expected number of false positives: 10,000 genes × 0.0001 = 1
  • False Discovery Rate (FDR) = 1/100 = 0.01
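The two corrections can be sketched as follows. The Benjamini-Hochberg function implements the standard step-up procedure, which is more than the back-of-the-envelope FDR estimate on the slide. The function names and the list of p-values are hypothetical, chosen so that the two methods disagree.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H0 where p < alpha / number of tests."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure at FDR level q.

    Rejects the k smallest p-values, where k is the largest rank
    with p_(k) <= (k / m) * q.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * q:
            k = rank
    reject = [False] * m
    for i in order[:k]:
        reject[i] = True
    return reject


# Hypothetical p-values for six genes
p_values = [0.001, 0.008, 0.012, 0.02, 0.03, 0.6]

n_bonferroni = sum(bonferroni(p_values))
n_bh = sum(benjamini_hochberg(p_values))
print(f"Bonferroni rejects {n_bonferroni}, Benjamini-Hochberg rejects {n_bh}")

# The slide's arithmetic: 10,000 genes tested at alpha = 0.0001, 100 found
expected_false_positives = 10_000 * 0.0001
estimated_fdr = expected_false_positives / 100
print(f"expected false positives = {expected_false_positives:.0f}, FDR = {estimated_fdr:.2f}")
```

On these made-up values Bonferroni keeps fewer genes than Benjamini-Hochberg, which illustrates the "too conservative" remark above.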

15 Recommended references
• Pavlidis, P. (2003). Using ANOVA for gene selection from microarray studies of the nervous system. Methods, 31, 282–289.
• Books in the QP624.5 series in the library
• A few courses in the Computer Engineering Department (Tolga Can)

