Post-Genomics Experimental Design CSC8309 - Gene Expression and Proteomics Simon Cockell & Cedric Simillion.



Outline
Introduction
  – Post-Genomic Technologies
  – The Importance of Design
Experimental Design
  – When Design Goes Bad
  – More Commonly Made Mistakes
  – Things Done Right
  – Types of Experiment

Post-Genomic Technologies
Set of technologies that have become prevalent since the advent of genome sequencing
Also referred to as 'functional genomics' technologies
  – Transcriptomics
  – Proteomics
  – Metabolomics
'High-throughput' techniques: generate lots of data, fast

Importance of Design
Functional genomics experiments are expensive
Noise in the large quantity of data can mask interesting biological variation
Bad design can increase noise, or at least fail to minimise it

When Design Goes Wrong: a trivial example
Bill and Ben want to identify proteins upregulated in response to water starvation in a drought-resistant plant
So Bill went away and grew some plants, and so did Ben

When Design Goes Wrong (continued)
Bill chose 3 plants, and Ben chose 4
Bill grew his at home in normal conditions, and Ben grew his in the lab with minimal water
After a few days of growth, they each took samples from their plants and ran 2D-PAGE

When Design Goes Wrong: analysis
They used average gels of the 2 groups of plants to find differentially expressed proteins
They ran a t-test for every spot on the gels, and found 400 of 2500 proteins with significantly altered expression in drought conditions (95% level)
What now? They only wanted 10-20
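A quick piece of arithmetic shows why 400 hits is less impressive than it looks. This is a minimal sketch that assumes the worst case, in which no protein truly changed, so every significant spot would be a false positive; the variable names are illustrative only.

```python
# Expected false positives when every spot is tested at the same threshold.
n_tests = 2500        # spots tested (from the slide)
alpha = 0.05          # per-test significance threshold ("95% level")
observed_hits = 400   # spots called significant (from the slide)

# Assumption: no protein truly changed, so each test still has
# probability alpha of producing a false call.
expected_false_positives = n_tests * alpha

# Rough share of the observed hits that could be false under that assumption.
rough_false_share = expected_false_positives / observed_hits

print(f"Expected false positives: {expected_false_positives:.0f}")
print(f"Rough share of hits that may be false: {rough_false_share:.0%}")
```

Under this assumption roughly 125 of the 400 hits could be chance findings, which is already several times the 10-20 proteins Bill and Ben hoped for.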

When Design Goes Wrong: what did they do wrong?
Confounding
  – The experiment can't distinguish between several factors: drought, experimenter effects, and the difference between home and lab
Selection
  – Bill or Ben could be biased in how they selected plants, even unconsciously
  – Randomised selection is preferred
Unbalanced
  – Better to have equal numbers in each group for many statistical analyses

When Design Goes Wrong: how to improve
Grow plants together under the same conditions
Select an equal number randomly for both Bill and Ben
Both halve their plants, and grow normal and drought plants to the same protocol
Better still, either Bill or Ben should do the whole experiment
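Random, balanced allocation is easy to script, which removes any unconscious bias in choosing plants. The following is a minimal sketch; the `allocate` helper, the plant labels and the group names are hypothetical and not part of the original slides.

```python
import random


def allocate(plant_ids, groups=("normal", "drought"), seed=None):
    """Randomly split plants into equally sized treatment groups.

    Hypothetical helper for illustration only.
    """
    if len(plant_ids) % len(groups) != 0:
        raise ValueError("number of plants must divide evenly among groups")
    rng = random.Random(seed)   # seeded so the allocation can be reproduced
    shuffled = list(plant_ids)
    rng.shuffle(shuffled)       # random order removes selection bias
    size = len(shuffled) // len(groups)
    return {
        group: sorted(shuffled[i * size:(i + 1) * size])
        for i, group in enumerate(groups)
    }


# Eight plants grown together, then split 4 vs 4 at random.
plants = [f"plant{i:02d}" for i in range(1, 9)]
design = allocate(plants, seed=1)
for group, members in design.items():
    print(group, members)
```

Recording the seed alongside the lab notes means the allocation can be checked later.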

When Design Goes Wrong: post mortem
Even with a rigorously designed experiment, Bill and Ben may still have obtained confusing results
  – It is common to identify many differentially expressed genes/proteins
  – This can be a true reflection of the biology
  – The number of false positives is necessarily high in post-genomic experiments, because of the number of hypotheses being tested
Good experimental design could have reduced the complexity of their output, providing a base for a robust statistical analysis of the data
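One common way to handle the number of hypotheses is to control the false discovery rate with the Benjamini-Hochberg procedure. The slides do not name a specific method, so this is a sketch of one option rather than the authors' analysis, and the p-values below are made up for illustration.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Implements the Benjamini-Hochberg step-up procedure at the given FDR.
    """
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for ten spots (not data from the slides).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
calls = benjamini_hochberg(p_values, fdr=0.05)
print("uncorrected hits:", sum(p <= 0.05 for p in p_values))
print("hits after BH:   ", sum(calls))
```

With these example values, five spots pass an uncorrected 0.05 threshold but only two survive the correction.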

Choice of Technology
Microarray or proteomics?
Affy or two-colour arrays?
  – Reference sample?
2D gels or LC-MS?
Single stain or DIGE?
  – Reference sample?
No easy (or correct) answers
  – Depends very much on the individual experiment

Further Pitfalls
Fahrenheit and the Cow
Based on urban myth
Still an important message
  – No individual is typical
  – Biological, as well as technical, replicates required

Further Pitfalls
The pester problem
  – Dad, can I have a puppy? Dad, can I have a puppy? Dad, can I have a puppy? (repeated dozens of times)
Ask a question often enough, and eventually you'll get the answer you're after
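The pester problem can be put into numbers. The sketch below assumes the tests are independent and that every null hypothesis is true, which is a simplification; the function name is illustrative.

```python
def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance of one or more false positives across n_tests.

    Assumes independent tests with every null hypothesis true.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


# The more often the question is asked, the more likely a chance "yes".
for n in (1, 10, 36, 100):
    print(n, round(prob_at_least_one_false_positive(n), 3))
```

Even ten tests at the 0.05 level give about a 40% chance of at least one spurious hit.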

Further Pitfalls
The universe doesn't exist, on average
  – Pooling samples makes little sense: it leaves no information about the distribution, and a standard deviation is needed for a significance test
"My machine/technique is so accurate, I don't need replicates"
  – Measurement accuracy has little effect on biological variance

Doing Things Right: some ideas for good design
Blocking
Replicates
  – Calculating power

Doing Things Right: blocking
[Figure: blocking diagram with the labels Flask, Gel, IEF and PAGE]

Doing Things Right Replication

Doing Things Right: calculating power
[Figure: probability density curves for the null hypothesis and the alternative hypothesis]
α = probability of a false positive (Type I error)
β = probability of a false negative (Type II error)
Power = 1 - β
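These quantities can be turned into a rough estimate of how many replicates are needed. The sketch below uses the standard normal approximation for comparing two group means with a two-sided test; it is not taken from the slides, the function name and example numbers are hypothetical, and an exact t-test calculation would ask for slightly more replicates.

```python
from math import ceil
from statistics import NormalDist


def replicates_per_group(effect, sd, alpha=0.05, power=0.8):
    """Approximate replicates per group for a two-sided, two-group comparison.

    Normal approximation: n = 2 * ((z_(1-alpha/2) + z_(power)) * sd / effect)^2.
    effect is the smallest difference in means worth detecting, and sd is the
    expected standard deviation within a group (both assumed known).
    """
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # controls Type I error (alpha)
    z_beta = z.inv_cdf(power)             # power = 1 - beta
    n = 2 * ((z_alpha + z_beta) * sd / effect) ** 2
    return ceil(n)


# Hypothetical example: detect a difference of one standard deviation.
print(replicates_per_group(effect=1.0, sd=1.0))
# Halving the variability sharply reduces the replicates needed.
print(replicates_per_group(effect=1.0, sd=0.5))
```

The dependence on `sd / effect` squared is why reducing noise through good design pays off so quickly in replicate numbers.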

Types of Experiment
Time course
  – Cell cycle
  – Following drug challenge
  – Following external stimulus
  – Following release of mutant
Mutant vs Wild-Type
Normal vs Diseased
Developmental Changes
Different Tissues
Within-cell differences

Types of Experiment
Novel microarray techniques
  – Genotyping
  – SNP detection
  – Copy Number Assessment
Novel proteomics techniques
  – High-throughput interaction detection
  – Phospho-proteomics
Also…
  – Protein binding arrays
  – Ligand binding arrays

A couple of quotes
"You know, the most amazing thing happened to me tonight. I was coming here, on the way to the lecture, and I came in through the parking lot. And you won't believe what happened. I saw a car with the license plate ARW 357. Can you imagine? Of all the millions of license plates in the state, what was the chance that I would see that particular one tonight? Amazing!"
  – Richard P. Feynman
"To consult a statistician after an experiment is finished is often merely to ask him to conduct a post-mortem examination. He can perhaps say what the experiment died of."
  – R. A. Fisher, 1938

Summary
Post-genomics technologies are powerful, but expensive
Good design gives maximum return for minimum effort

Any questions? After the fact questions: