Chapter 1 Introduction to the Statistical Process

Section 1.1: Introduction to Statistics Statistics vs. Anecdotal Evidence Smoking causes cancer. Seat belts save lives.

Autism and Vaccines Nelson says it wasn't long after her son Parker's shots at 15 months that she noticed something was wrong. "He had run a slight fever after the vaccinations, but I didn't think anything of it," said Nelson. "You know kids run fevers all the time, but about a week after that he just completely stopped talking." After months of worrying, wondering, and going back and forth with doctors, an official diagnosis was made: autism. Nelson believes it started with the vaccines. "Gradually, I started piecing it together. He got sick after his vaccinations and about a week later everything changed. He was a completely different little boy then," said Nelson. http://www.wsaz.com/charleston/headlines/19376044.html

What is Statistics? Statistics is the discipline that guides us to produce or collect data, which are then analyzed in order to draw inferences or make predictions. Numerical summaries such as means, percentages, and standard deviations are also called statistics.

Descriptive Statistics Descriptive Statistics refers to methods for summarizing data. These summaries consist of graphs (histograms, scatterplots, pie charts, etc.) and numbers (means, standard deviations, regression equations, percentages, etc.).

Inferential Statistics Inferential statistics refers to methods of making decisions or predictions about a population or a process, based on data obtained from a sample. We will use tests of significance and confidence intervals to achieve this.

This semester, we will be looking at and conducting a number of studies

Statistical Process 1. Ask a research question (research conjecture) 2. Design a study 3. Collect data 4. Explore the data 5. Draw inferences 6. Formulate conclusions 7. Communicate findings. Logic of inference: significance, estimation. Scope of inference: generalize, cause/effect.

Physicians’ Health Study I 1. Research Question: Will taking aspirin help reduce heart attacks? 2. Design Study: Started in 1982 with 22,071 male physicians. Half took a 325 mg aspirin every other day; the other half took a placebo.

Physicians’ Health Study I 3. Collect Data: The study was intended to run until 1995, but the aspirin component was stopped in 1988 after 189 heart attacks occurred in the placebo group and 104 in the aspirin group. The study also tested beta carotene, which some had hoped would be a wonder drug; it was found to have no benefit or harm. This result allowed investigators to turn to other, more promising agents.

Physicians’ Health Study I 4. Explore Data: 1.7% in the placebo group had heart attacks while only 0.9% in the aspirin group had heart attacks. (45% reduction in heart attacks for the aspirin group) 5. Draw Inferences: The likelihood of the difference between the proportions of heart attacks in each group being as large as it was just by chance is very, very small.
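
The percentages in step 4 can be checked with a few lines of arithmetic. This is only a sketch: the heart attack counts come from the slides, but the exact group sizes are not given, so the code assumes each group was half of the 22,071 physicians.

```python
# Counts are from the slides. The even split of the 22,071 physicians is an
# assumption, since the slides say only that "half" took aspirin.
n_total = 22_071
n_per_group = n_total // 2          # assumed group size

heart_attacks_placebo = 189
heart_attacks_aspirin = 104

p_placebo = heart_attacks_placebo / n_per_group
p_aspirin = heart_attacks_aspirin / n_per_group

# Relative reduction: how much lower the aspirin rate is,
# expressed as a share of the placebo rate.
relative_reduction = 1 - p_aspirin / p_placebo

print(f"placebo: {p_placebo:.1%}, aspirin: {p_aspirin:.1%}")
print(f"relative reduction: {relative_reduction:.0%}")
```

Note that the 45% figure is a relative reduction (0.9% compared with 1.7%), not a drop of 45 percentage points.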

Physicians’ Health Study I 6. Formulate Conclusions: They concluded that taking aspirin does reduce the likelihood of heart attacks in middle-aged and older males. 7. Report Findings:

Terminology The individual entities on which data are recorded are called observational units. The recorded characteristics of the observational units are the variables of interest. What are the observational units and variables in the Physicians’ Health Study?

Section 1.2 Introduction to the Logic of Statistical Inference

Dolphin Communication Can dolphins communicate abstract ideas? In an experiment done in the 1960s, Doris was instructed which of two buttons to push. She then had to communicate this to Buzz (who could not see Doris). If he picked the correct button, both dolphins would get a reward. What are the observational units and variables in this study?

Dolphin Communication In one set of trials, Buzz chose the correct button 15 out of 16 times. Based on these results, do you think Buzz knew which button to push, or was he just guessing? How might we justify an answer? How might we model this situation?

Modeling Buzz and Doris Flip Coins Applet
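
The coin-flipping model can also be sketched in code instead of with the applet. The snippet below is an illustration, not the applet itself: the function name, the seed, and the choice of 10,000 repetitions are all assumptions made for the example.

```python
import random

def simulate_correct_counts(n_trials, p_correct, n_reps, seed=0):
    """Return n_reps simulated counts of correct picks under the chance model."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < p_correct for _ in range(n_trials))
        for _ in range(n_reps)
    ]

observed = 15   # Buzz chose the correct button 15 times out of 16

# If Buzz is just guessing, each trial is like a fair coin flip.
null_counts = simulate_correct_counts(n_trials=16, p_correct=0.5, n_reps=10_000)

# How often does guessing alone give a result at least as good as Buzz's?
as_extreme = sum(count >= observed for count in null_counts)
print(f"{as_extreme} of {len(null_counts)} simulated sets had 15 or more correct")
```

If very few simulated sets reach 15 correct, guessing is a poor explanation for what Buzz did.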

Can Chimps Solve Problems? http://youtu.be/ySMh1mBi3cI

Exploration 1.2: Can Chimps Solve Problems? Sarah, a 30-year-old chimp, is shown videos of a person struggling with some problem (can’t reach a banana, cage door locked, record player not working, etc.). She is then shown two pictures, one showing the solution and one not. She then picks one of the pictures. Does Sarah understand the solution to these problems or is she just randomly picking a picture?

Exploration 1.2 (pg 15) Read the first paragraph. State the research question. (This is a broad statement.) State the research conjecture. (This is more specific to our test.) Sarah correctly picked 7 of the 8 pictures. Is this unlikely if she is just guessing? Continue working on the exploration.

Section 1.3 Statistical Significance: Other Random Choice Models

Can dogs sniff out cancer? Marine sniffing samples

Can Dogs Sniff Out Cancer? 1. Research Question: Can dogs detect a patient with cancer by smelling their breath? 2. Design a study: Five breath bags were shown to Marine, one from a cancer patient and four from non-cancer patients. 3. Collect data: Marine completed 33 attempts at this procedure. 4. Explore the data: Marine identified the correct bag 30 out of 33 times.

Can Dogs Sniff Out Cancer? How is the chance model we will use for this situation different than our previous ones? Can we use coins again?

Can Dogs Sniff Out Cancer? 5. Draw Inferences: Three S Strategy. (1) Statistic: Compute the statistic from the observed data. (2) Simulate: Identify a model that represents a chance explanation. Use the model to simulate data that “could have happened” when the chance model is true. Calculate the value of the statistic from the could-have-been data. Repeat the simulation process to generate a distribution of the could-have-been values for the statistic. (3) Strength of evidence: Consider whether the value of the observed statistic is unlikely to occur when the chance model is true.

Can Dogs Sniff Out Cancer? We have the statistic. Marine made the correct identification 30 out of 33 times. How could we set up a simulation? Tactile (how could this be done?) Applet Strength of evidence. Is 30 out of 33 very unlikely under the chance model?
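
One way to set up the simulation for Marine is sketched below. Coins no longer work because the chance of a correct pick by guessing is 1 in 5, not 1 in 2; a tactile version might use a spinner with five equal sectors (an assumption, since the slides leave this as a question). The function name `three_s_p_value`, the seed, and the number of repetitions are hypothetical choices for this example.

```python
import random

def three_s_p_value(observed_count, n_attempts, p_chance, n_reps=10_000, seed=1):
    """Estimate how often the chance model does at least as well as observed."""
    rng = random.Random(seed)
    at_least_as_extreme = 0
    for _ in range(n_reps):
        # Simulate: one could-have-been set of attempts under the chance model.
        simulated = sum(rng.random() < p_chance for _ in range(n_attempts))
        # Strength of evidence: compare with the observed statistic.
        if simulated >= observed_count:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_reps

# Statistic: Marine was correct on 30 of 33 attempts, with 5 bags per attempt.
marine_p_value = three_s_p_value(observed_count=30, n_attempts=33, p_chance=1 / 5)
print(f"proportion of simulations with 30 or more correct: {marine_p_value}")
```

Under random guessing, Marine would be expected to get only about 33/5, or roughly 7, correct, so 30 correct sits far out in the tail.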

Can Dogs Sniff Out Cancer? 6. Formulate conclusions: Can we conclude that Marine can identify cancerous breath? Can we conclude that all dogs can do this? Some dogs? 7. Communicate findings: Marine, the dog that can sniff out bowel cancer By Jeremy Laurance, Health Editor A labrador retriever called Marine has been trained to sniff out cancer with stunning accuracy, researchers report today.

Terminology: Hypotheses The null hypothesis is the chance explanation. Typically the alternative hypothesis is what the researchers think is true. Null hypothesis: Marine is randomly choosing which bag to sit next to. Alternative hypothesis: Marine is not randomly choosing which bag to sit next to.

Terminology: Null Distribution We will refer to the distribution of chance outcomes as the null distribution. For Marine, we should have gotten a null distribution similar to the following.

Terminology: P-value The p-value is the proportion of outcomes in the null distribution that are at least as extreme as the value of the statistic actually observed in the study. What was our p-value for Marine? Were they all the same? Were they all close to the same?

Guidelines for evaluating strength of evidence from p-values
p-value > 0.10: not much evidence against the null hypothesis
0.05 < p-value < 0.10: moderate evidence against the null hypothesis
0.01 < p-value < 0.05: strong evidence against the null hypothesis
0.001 < p-value < 0.01: very strong evidence against the null hypothesis
p-value < 0.001: extremely strong evidence against the null hypothesis
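
The guidelines above can be written as a small lookup function. This is a sketch with a hypothetical function name. The slides use strict inequalities and do not say where a p-value of exactly 0.05 (or 0.10, 0.01, 0.001) belongs, so the handling of those boundary values here is an assumption.

```python
def strength_of_evidence(p_value):
    """Translate a p-value into the verbal guideline from the slides."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must be between 0 and 1")
    # Assumption: a value exactly on a cutoff falls into the stronger category.
    if p_value > 0.10:
        return "not much evidence against the null hypothesis"
    if p_value > 0.05:
        return "moderate evidence against the null hypothesis"
    if p_value > 0.01:
        return "strong evidence against the null hypothesis"
    if p_value > 0.001:
        return "very strong evidence against the null hypothesis"
    return "extremely strong evidence against the null hypothesis"

print(strength_of_evidence(0.03))
```

These cutoffs are guidelines, not hard rules; a p-value of 0.049 and one of 0.051 tell nearly the same story.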

Terminology: Statistically Significant If the observed results provide strong evidence that the data did not arise by random chance alone then the research result is called statistically significant. Are Marine’s results statistically significant?

Let’s play some rock-paper-scissors. Rock smashes scissors. Paper covers rock. Scissors cut paper. Play the novice version at least 30 times and keep track of all your choices.

Activity 1.4 Now work on activity 1.4.

Criminal Justice System vs. Significance Tests Innocent until proven guilty. We assume a defendant is innocent and the prosecution has to collect evidence to try to prove the defendant is guilty. Likewise, we assume our chance model (or null hypothesis) is true and we collect data and calculate a sample proportion. We then show how unlikely our proportion is if the chance model is true.

Criminal Justice System vs. Significance Tests If the prosecution shows lots of evidence that goes against this assumption of innocence (DNA, witnesses, motive, contradictory story, etc.), then the jury concludes that the assumption of innocence is wrong. Likewise, if after collecting data we find that the likelihood (p-value) of such a proportion is so small that it would rarely occur by chance if the null hypothesis is true, then we conclude that our assumption of the chance model being true is wrong.

Review For Sarah the chimp, you could have gotten a null distribution similar to the one shown here. What does a single dot represent? What does the whole distribution represent? What is the p-value for this simulation? What does this p-value mean?
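
A null distribution like the one referred to above can be rebuilt as a rough text dotplot. This is a sketch, not the figure from the slide: the seed, the 1,000 repetitions, and the one-star-per-10-dots scale are all assumptions, so the resulting p-value will differ a little from the one in the original figure.

```python
import random
from collections import Counter

rng = random.Random(2)
n_pictures, n_reps = 8, 1_000

# Each repetition is one "dot": the number correct in 8 pure guesses.
dots = [sum(rng.random() < 0.5 for _ in range(n_pictures)) for _ in range(n_reps)]
tally = Counter(dots)

# The whole distribution: what guessing alone produces over many repetitions.
for correct in range(n_pictures + 1):
    bar = "*" * (tally[correct] // 10)      # one star per 10 dots
    print(f"{correct} correct | {bar} ({tally[correct]})")

# p-value: share of dots at 7 or more correct, matching Sarah's observed result.
p_value = sum(tally[c] for c in (7, 8)) / n_reps
print(f"simulated p-value: {p_value:.3f}")
```

Running it with a different seed gives a slightly different p-value, which is why simulated p-values from different students are close but not identical.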

More Review The null hypothesis is the chance explanation. Typically the alternative hypothesis is what the researchers think is true. Three S Strategy: Statistic, Simulate, Strength of evidence. The p-value is the proportion of outcomes in the null distribution that are at least as extreme as the value of the statistic actually observed in the study.

Still More Review A small p-value gives evidence against the null and for the alternative. If the observed results provide strong evidence that the data did not arise by random chance alone then the research result is called statistically significant.

Section 1.4 Other Chance Models

Ron Artest, choker at the line? In the 2009-10 basketball season, Ron Artest made 68.8% of his free throws, similar to his career average. In his first 15 attempts in the playoffs, he made only 7 free throws (46.7%). Is this evidence that he is “choking” and performing significantly worse than during the regular season?

Ron Artest Example What are the observational units? Artest’s 15 free throw attempts. What is the variable? Whether or not he makes the free throw. What is the statistic of interest? 7/15

Notation Our sample proportion (statistic) can be described using the symbol p̂ (p-hat). A parameter is a numerical summary of a variable that is either an unobservable long-run outcome or a value for an entire population. It can be described using the symbol π (pi). In our example, π = 0.688 (the value under the null hypothesis) and p̂ = 0.467.

Hypotheses Null hypothesis: Ron Artest’s performance at the free throw line during the 2010 NBA finals is the same as his regular season performance; his probability of making a basket in the playoffs is 0.688. Alternative hypothesis: Ron Artest’s performance at the free throw line during the 2010 NBA finals is worse than his regular season performance; his probability of making a basket in the playoffs is less than 0.688.

Simulated Chance Model Coins, cards, dice, spinners, etc. don’t really work well here to develop a chance model of a 68.8% success rate. But we can still use the magic of an applet. (While this will be a different applet than the first two we used, it is essentially the same.)

Ron Artest Continued So we have moderate evidence against the null. Let’s see what would happen if we had more data. Suppose he continued to shoot 46.7% from the free throw line so that he made 7 out of 15 of his next attempts as well for a total of 14 out of 30. Let’s return to the applet to see how our p-value would change.

Ron Artest Continued As the sample size increases, there is less variability in our null distribution. It is still centered around 0.688, but it becomes narrower. As a result, 0.467 gets farther and farther out in the tail and thus the p-value gets smaller. This should make intuitive sense in that with a larger sample size, we have more evidence.

Ron Artest Continued Besides a larger sample size, how else could we get more evidence against the null? Artest could make fewer shots. Is that what really happened? No. Artest made 4 of his next 5 shots for a total of 11 out of 20 (55%) for the playoffs. Let’s return to the applet and see how this changes our p-value.
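
The three Artest scenarios can be compared side by side with a short simulation in place of the applet. This is a sketch under stated assumptions: the function name, seed, and 20,000 repetitions are choices made for the example, and each free throw is treated as an independent attempt with a 0.688 chance of success.

```python
import random

def simulated_p_value(made, attempts, p_make=0.688, n_reps=20_000, seed=3):
    """Share of simulated sets with `made` or fewer makes (a left-tail p-value)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        simulated_made = sum(rng.random() < p_make for _ in range(attempts))
        # "At least as extreme" means as few makes as observed, or fewer.
        if simulated_made <= made:
            hits += 1
    return hits / n_reps

p_first_15 = simulated_p_value(7, 15)          # observed: 7 of first 15
p_hypothetical_30 = simulated_p_value(14, 30)  # hypothetical: 14 of 30
p_actual_20 = simulated_p_value(11, 20)        # what really happened: 11 of 20

print(f" 7 of 15: {p_first_15:.3f}")
print(f"14 of 30: {p_hypothetical_30:.3f}")
print(f"11 of 20: {p_actual_20:.3f}")
```

The same sample proportion with twice the attempts gives a smaller p-value, while the actual 11 of 20 moves the proportion closer to 0.688 and gives a larger one.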

Exploration 1.4 Shaky Putting? Phil Mickelson is one of the best golfers in the world. He’s won the Masters Tournament three times. However, 2011 was not his best year. He seemed to struggle with his putting and switched to a “belly putter” late in the year.

Exploration 1.4 Was Mickelson a poor putter in 2011? In this exploration, you will compare Mickelson’s 2011 record of putting from 10 feet away from the hole with that of all other professional golfers that year. Was he significantly worse than his peers?

Section 1.5 Modeling More Complex Situations

Infant preference for helper or hinderer?

Helper Toy

Baby chooses a toy

Helper or Hinderer? Sixteen babies were shown two demonstrations, one with a helper toy and one with a hinderer toy. Which toy was used for each role and the order of the demonstrations were random. When presented with the two toys (which was on the left and which on the right was also random), 14 of the babies chose the helper toy. How is this experiment different than any we have looked at so far?

Helper or Hinderer? The key difference is that each attempt was made by a different baby. Our chance model implies that each baby has the same chance of choosing the helper toy (50%). It could be that some babies randomly choose and some do not. We will talk about this in our conclusion. Let’s run the test.

Helper or Hinderer? Null Hypothesis: Each baby is randomly choosing one of two toys. (The babies choose the helper toy 50% of the time in the long run.) Alternative Hypothesis: The babies are not randomly choosing, but show a preference for the helper toy. (The babies choose the helper toy more than 50% of the time in the long run.) We can use any applet to test this. Remember that our sample proportion is 14 out of 16.
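
As a cross-check on whatever the applet reports, the p-value for 14 of 16 can also be counted exactly. Note that this is an exact binomial calculation, not the simulation approach used in the slides, and it rests on the same chance model: every baby independently picks the helper toy with probability 0.5.

```python
from fractions import Fraction
from math import comb

n_babies, chose_helper = 16, 14

# Under the null hypothesis all 2**16 sequences of choices are equally likely,
# so count the sequences in which 14, 15, or 16 babies pick the helper toy.
favorable = sum(comb(n_babies, k) for k in range(chose_helper, n_babies + 1))
exact_p_value = Fraction(favorable, 2 ** n_babies)

print(f"exact p-value: {exact_p_value} (about {float(exact_p_value):.4f})")
```

A simulated p-value from the applet should land close to this value, with small differences from run to run.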

Helper or Hinderer? So what can we conclude? Do all the babies prefer the helper toy? Do some of the babies prefer the helper toy? Because we had a low p-value, we can conclude that not all the babies are randomly choosing and that at least some of them prefer the helper toy. Can we make conclusions beyond these 16 babies?

Which Tire? Two students missed a chemistry exam because of excessive partying, but blamed their absence on a flat tire. The professor allowed them to take a make-up exam, and he sent them to separate rooms to take it. The first question, worth 5 points, was quite easy. The second question, worth 95 points, asked: Which tire was flat?

Which Tire? How would you answer this question? Passenger’s side front, Driver’s side front, Driver’s side rear, Passenger’s side rear

Exploration 1.5: Tire Story Falls Flat We will use the data from class to determine if students have a preference for picking one of the four tires. This is similar to the helper-hinderer example because our observational units are different people. Let’s work exploration 1.5 (page 50).