Introductory Statistics for Laboratorians dealing with High Throughput Data sets Centers for Disease Control.

1 Introductory Statistics for Laboratorians dealing with High Throughput Data sets Centers for Disease Control

2 Statistics and Order Random vs. Accidental Snowflakes Quincunx – http://www.stattucino.com/berrie/dsl/Galton.html – http://www.jcu.edu/math/ISEP/Quincunx/Quincunx.html Statistical Determinism

3 Hypothesis Testing Statistical methods are used to test scientific hypotheses. You already understand this logic; you do this every day. Statistical methods simply provide a way to put numbers to your logic. That is, they compute the chances that you are wrong or right.

4 Harry and Sue This is the story of Harry Heartthrob and his girlfriend Sue Sweetheart. Harry would like to know that everything is fine between him and Sue. Begin by stating two hypotheses: – H0: Things are fine in Harry’s love life. – Ha: Harry has romantic problems. We assume that H0 is true, that things are fine. This leads us to expect to see certain things when we observe nature/reality.

5 Hypotheses H0 is the “Null Hypothesis.” It states that nothing is going on. Ha is the “Alternative Hypothesis.” It states that something is going on. The null and alternative hypotheses must be mutually exclusive and exhaustive. That is, they can’t both be true and they can’t both be false.

6 Harry and Sue Believing that H0 is true (as we assume) leads to certain expectations about what will be observed in reality. Specifically, Harry will expect there to be “no signs of men” in Sue’s apartment. If Harry observes things in her apartment that differ from his expectations, he will begin to doubt the truth of H0.

7 Harry’s Love Life H0: Assume all is OK. Expect no signs of men in her apartment. Ha: Harry has romantic problems.

8 Proof Can you ever prove that H0 is true? NO!!! What is the strongest evidence that H0 is true? – No indication of men in the apartment! Could this happen if H0 is false? YES!! The strongest evidence for H0 is still weak. We “assume” H0 is true if the evidence that it is false is weak.

9 Proof Can Harry prove that H0 is false? YES!!! There can be things that are so improbable if H0 is true that when you observe those things (XXX, for example), you know H0 is false. There can be strong evidence that H0 is false. When we see that strong evidence we reject H0. Then we say “we conclude Ha is true” or “we have proved Ha.”

10 Statistically Speaking H0: Assume all is OK. Expect no signs of men in her apartment. Ha: Harry has romantic problems. [Figure labels: 98%, 20%, 2%, .05%] Reject H0 if the probability of the observed event is small enough when H0 is assumed to be true.

11 Statistical Error H0: Assume all is OK. Expect no signs of men in her apartment. Ha: Harry has romantic problems. [Figure labels: 98%, 20%, 2%, .05%] Two Types of Error – Type I Error: Reject H0 when it is true (too jealous). – Type II Error: Continue to believe H0 when it is false (too trusting).

12 Statistical Error Truth about Sue Sweetheart vs. Harry’s decision based on observed data: – Reject H0 (concludes she is cheating): Type I Error (Alpha), too jealous, if H0 is true (she is actually not cheating); Correct Decision if H0 is false (she is cheating her A off). – Fail to reject H0 (concludes everything is OK): Correct Decision if H0 is true; Type II Error (Beta), too trusting, if H0 is false.

13 Quality Control Have the samples been watered down? There is a severe shortage of flu vaccine in the USA this season. However, Canada has a large surplus and they are willing to sell it to us. We are a little paranoid and wonder if the reason they have extra is because they watered it down. We make a surprise visit to their warehouse and request a sample of the vaccine for evaluation purposes before we commit to purchase the lot. They allow us to select 70 vials at random from the whole lot for testing.

14 Quality Control Example All flu vaccine is made to standard specifications. It is all supposed to be 16 m/dl with a standard deviation of 0.4. (That’s what it’s supposed to be if it is not watered down). We measure the 70 vials from Canada and get a mean of 15.8 m/dl. Is the Canadian surplus watered down?

15 Step 1: State H0 and Ha H0: This sample of 70 vials (with a mean of 15.8) comes from a population with a mean of 16 and a standard deviation of 0.4. – (Everything is fine.) Ha: This sample of 70 vials could not have been drawn from a population with a mean of 16 and a standard deviation of 0.4. – (There is a problem.)

16 Step 2: Select a Region of Rejection If the probability of observing a sample mean this low, assuming the null hypothesis is true, is less than 5 chances in 100 (.05), we will reject the null hypothesis. Alpha = .05

17 Step 3: Make Observations Conduct the experiment – make surprise trip to the Canadian warehouse, select vials at random, test each vial. Compute the mean for the 70 vials. – mean = 15.8.

18 Step 4: Test the Null Hypothesis What does the Central Limit Theorem tell us about the distribution of means of samples of size N = 70 from a population with a mean of 16 and standard deviation of 0.4? The Central Limit Theorem says: – The mean of the means of all possible samples should be 16. – The Standard Error (standard deviation) of the means is 0.4/sqrt(70) = .048.
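
The standard error on this slide can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names are illustrative and do not come from the presentation.

```python
import math

# Values taken from the slides
population_mean = 16.0   # specification mean
population_sd = 0.4      # specification standard deviation
n = 70                   # number of vials sampled

# Standard error of the mean: population SD divided by sqrt(sample size)
standard_error = population_sd / math.sqrt(n)

print(f"Standard error of the mean: {standard_error:.3f}")
```

Rounded to three decimals, this agrees with the .048 quoted on the slide.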

19 Step 4: Test the Null Hypothesis Use http://davidmlane.com/hyperstat/z_table.html to compute the probability that a mean would be 15.8 or less if the Sampling Distribution of the Mean has a mean of 16 and a standard deviation of .048. 15.8 is 4.17 standard errors (standard deviations) below the mean of the population (16). Z = -4.17

20 Step 4: Test the Null Hypothesis The probability of the mean of a sample of size 70 being 15.8 or less is .000015 (15 chances in 1,000,000). This is in the region of rejection. Reject H0. There is a problem. This stuff has been watered down.
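
As an alternative to the online z table, the same one-sided test can be sketched with Python's standard library. The function name `one_sided_z_test` is hypothetical, and the code uses the unrounded standard error, so it gives a Z near -4.18 rather than the slide's -4.17 (which comes from rounding the standard error to .048). The resulting probability is slightly smaller than the slide's .000015 but leads to the same decision.

```python
import math
from statistics import NormalDist

def one_sided_z_test(sample_mean, mu0, sigma, n):
    """Return (z, p) for the lower-tail test that the true mean is below mu0."""
    standard_error = sigma / math.sqrt(n)
    z = (sample_mean - mu0) / standard_error
    # Probability of a sample mean this low or lower if H0 is true
    p = NormalDist().cdf(z)
    return z, p

# Values from the slides: mean of 70 vials is 15.8, spec is 16 with SD 0.4
z, p = one_sided_z_test(sample_mean=15.8, mu0=16.0, sigma=0.4, n=70)
alpha = 0.05

print(f"Z = {z:.2f}, p = {p:.6f}")
print("Reject H0" if p < alpha else "Fail to reject H0")
```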

21 Region of Rejection for a Sample of size N = 70 from a Population with mean 16 and standard deviation of 0.4 The region of rejection is anything below 15.921. 15.921 cuts off .05 of the distribution. There are only 5 chances in 100 of a mean being less than 15.921. Our mean was 15.8.
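
The cutoff of 15.921 can be reproduced by asking for the 5th percentile of the sampling distribution of the mean. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu0, sigma, n, alpha = 16.0, 0.4, 70, 0.05

# Sampling distribution of the mean under H0
sampling_dist = NormalDist(mu=mu0, sigma=sigma / math.sqrt(n))

# The value that cuts off the lowest 5% of sample means
critical_mean = sampling_dist.inv_cdf(alpha)

observed_mean = 15.8
in_rejection_region = observed_mean < critical_mean

print(f"Reject H0 for sample means below {critical_mean:.3f}")
print(f"Observed mean {observed_mean} in region of rejection: {in_rejection_region}")
```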

22 Types of Error Truth about the population from which the sample came vs. decision based on the sample: – Reject H0: Type I Error (Alpha) if H0 is true; Correct Decision if H0 is false. – Fail to reject H0: Correct Decision if H0 is true; Type II Error (Beta) if H0 is false.
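
One way to see what Alpha means in practice is to simulate many surprise visits to a warehouse where H0 really is true and count how often the test rejects anyway. The sketch below assumes vial concentrations are normally distributed, which the slides do not state; the function name `rejection_rate`, the number of trials, and the seed are all illustrative choices.

```python
import math
import random
from statistics import NormalDist, fmean

def rejection_rate(true_mean, trials=5000, n=70, mu0=16.0, sigma=0.4,
                   alpha=0.05, seed=0):
    """Fraction of simulated samples whose mean falls in the region of rejection."""
    rng = random.Random(seed)
    # Lower-tail cutoff for the sample mean under H0
    cutoff = NormalDist(mu0, sigma / math.sqrt(n)).inv_cdf(alpha)
    rejections = 0
    for _ in range(trials):
        # Assumption: individual vials are normally distributed around true_mean
        sample_mean = fmean(rng.gauss(true_mean, sigma) for _ in range(n))
        if sample_mean < cutoff:
            rejections += 1
    return rejections / trials

# H0 is true here, so every rejection is a Type I error
type_i_rate = rejection_rate(true_mean=16.0)
print(f"Estimated Type I error rate: {type_i_rate:.3f}")
```

The estimate should land close to Alpha = .05. Passing a different `true_mean` (a hypothetical alternative, not a value given in the slides) estimates how often the test correctly rejects; one minus that fraction is the Type II error rate (Beta) for that alternative.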

23 Error in Diagnostic Testing Truth about the person vs. what the diagnostic test tells us: – Positive (says they have the disease): Type I Error (Alpha), a False Positive, if H0 is true (they really don’t have the disease); True Positive if H0 is false (they really do have the disease). – Negative (says they don’t have the disease): True Negative if H0 is true; Type II Error (Beta), a False Negative, if H0 is false.
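
The mapping between the table and the two error rates can be sketched as follows. The counts are made up purely for illustration and are not from the presentation; `error_rates` is a hypothetical helper name.

```python
def error_rates(true_pos, false_pos, true_neg, false_neg):
    """Return (false positive rate, false negative rate) from the four table cells."""
    # Type I analogue: among people who really don't have the disease,
    # the fraction the test wrongly calls positive
    false_positive_rate = false_pos / (false_pos + true_neg)
    # Type II analogue: among people who really do have the disease,
    # the fraction the test wrongly calls negative
    false_negative_rate = false_neg / (false_neg + true_pos)
    return false_positive_rate, false_negative_rate

# Hypothetical counts: 100 people with the disease, 100 without
fpr, fnr = error_rates(true_pos=90, false_pos=5, true_neg=95, false_neg=10)
print(f"False positive rate (Alpha analogue): {fpr:.2f}")
print(f"False negative rate (Beta analogue): {fnr:.2f}")
```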

