An Overview of Statistical Inference – Learning from Data



Chapter 7: An Overview of Statistical Inference – Learning from Data
Created by Kathy Fritz

Statistical Inference: What You Can Learn from Data

With the increasing popularity of online dating services, the truthfulness of the information users give in their personal profiles is a topic of interest. A study was designed to investigate misrepresentation of personal characteristics. The researchers hoped to answer three questions:
1. What proportion of online daters believe they have misrepresented themselves in an online profile?
2. What proportion of online daters believe that others frequently misrepresent themselves?
3. Are people who place a greater importance on developing a long-term, face-to-face relationship more honest in their online profiles?
The first two of these questions are estimation problems because they involve using sample data to learn something about a population characteristic. The third question is a hypothesis testing problem because it involves determining if sample data support a claim about the population of online daters.

Learning from Sample Data
When you obtain information from a sample selected from some population, it is usually because you want to learn something about characteristics of the population, or because you want to use sample data to decide whether there is support for some claim or statement about the population. An estimation problem involves using sample data to estimate the value of a population characteristic. A hypothesis testing problem involves using sample data to test a claim about a population. Methods for estimation and hypothesis testing are called statistical inference methods because they involve generalizing (making an inference) from a sample to the population from which the sample was selected.

Learning from Data When There Are Two or More Populations
Sometimes sample data are obtained from two or more populations of interest, and the goal is to learn about differences between the populations. Consider the following example: College students spend a lot of time online, but do members of Facebook spend more time online than non-members? Data were collected from two samples of college students, one consisting of Facebook members and the other consisting of non-members. One of the variables studied was the amount of time spent on the Internet in a typical day. Based on the resulting data, it was concluded that there was no support for the claim that the mean time spent online for Facebook members was greater than the mean time for non-members. This study involves generalizing from samples, and it is a hypothesis testing problem because it involves testing a claim about the difference between the two groups.

Learning from Experimental Data
Statistical inference methods are also used to learn from experiment data. When data are obtained from an experiment, it is usually for one of two reasons. You may want to learn about the effect of the different experimental conditions (treatments) on the measured response. This is an estimation problem because it involves using sample data to estimate a characteristic of the treatments, such as the mean response for a treatment. Or you may want to determine if experiment data provide support for a claim about how the effects of two or more treatments differ. This is a hypothesis testing problem because it involves testing a claim (hypothesis) about treatment effects.

Do U Smoke After Txt?
Researchers in New Zealand investigated whether mobile phone text messaging could be used to help people stop smoking. An experiment was designed to compare two treatments. Subjects for the experiment were 1705 smokers who were older than 15 years, owned a mobile phone, and wanted to quit smoking. People in the first group received personalized text messages providing support and advice on stopping smoking. The second group was a control group, and people in this group did not receive any of these text messages. After 6 weeks, each person participating in the study was contacted and asked if he or she had smoked during the previous week. Data from the experiment were used to estimate the difference in the proportion who had quit between those who received the text messages and those who did not. Researchers estimated that the proportion of those who successfully quit smoking was greater by 0.15 for those who received text messages.
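The estimate described on this slide is a difference between two sample proportions. A minimal sketch of that calculation follows. The function name and the counts are hypothetical, invented purely for illustration; the slide reports only the total number of subjects and the estimated difference, not the group counts.

```python
def proportion_difference(quit_treatment, n_treatment, quit_control, n_control):
    """Return (treatment proportion, control proportion, difference)."""
    if n_treatment <= 0 or n_control <= 0:
        raise ValueError("group sizes must be positive")
    p_treatment = quit_treatment / n_treatment
    p_control = quit_control / n_control
    return p_treatment, p_control, p_treatment - p_control


# Hypothetical counts for illustration only (NOT the study's data).
p_t, p_c, diff = proportion_difference(
    quit_treatment=40, n_treatment=200, quit_control=20, n_control=200
)
print(f"treatment: {p_t:.2f}, control: {p_c:.2f}, difference: {diff:.2f}")
```

A positive difference means the treatment group had the larger proportion of quitters in the sample. How far such an estimate might be from the true difference is a separate question that the inference methods in later chapters address.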

Statistical Inference Involves Risk
The risks associated with statistical inference arise because you are attempting to draw conclusions on the basis of data that provide partial rather than complete information. In estimation problems, the risk is that the estimates may be inaccurate. You need to understand that the method used to produce the estimates and the accompanying measures of accuracy might mislead.

Statistical Inference Involves Risk (continued)
In hypothesis testing situations, the risk is an inaccurate conclusion. You need to understand how likely it is that the method used to decide whether or not a claim is supported might lead to an incorrect decision.

Variability in Data
When there is variability in the population, you need to consider whether this partial picture (the sample) is representative of the population. Suppose we wanted to estimate the mean length of fish in a large lake. We could catch a sample of 20 fish from the lake. One sample may have a symmetric distribution, while another sample may have a skewed distribution.
[Figures: distributions of fish length in three different samples]
This sample-to-sample variability should be considered when you assess the risk associated with drawing conclusions about the population from sample data.
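Sample-to-sample variability can be seen in a small simulation. The sketch below assumes a made-up population of fish lengths (normally distributed, mean 30 cm, standard deviation 5 cm); these values and the function name are illustrative assumptions, not figures from the presentation.

```python
import random
import statistics


def sample_means(population, sample_size, num_samples, seed=0):
    """Draw several random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]


# Hypothetical population of fish lengths in cm (assumed for illustration).
population_rng = random.Random(42)
population = [population_rng.gauss(30.0, 5.0) for _ in range(10_000)]

means = sample_means(population, sample_size=20, num_samples=5, seed=1)
for i, m in enumerate(means, start=1):
    print(f"sample {i}: mean length = {m:.2f} cm")
print(f"population mean = {statistics.mean(population):.2f} cm")
```

Each sample of 20 fish gives a somewhat different mean, and none is guaranteed to equal the population mean. That spread is the variability an inference method has to account for.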

Variability in Data
An experiment might be designed to determine if noise level has an effect on the time required to perform a task requiring concentration. There are 20 individuals available to serve as subjects in this experiment with two treatment conditions (quiet environment vs. noisy environment). The response variable is the time required to complete the task. If noise level has NO effect on completion time, the time observed for each of the 20 subjects would be the same whether they are in the quiet group or the noisy group. Any observed differences in the completion times for the two treatments would NOT be due to noise level, but to person-to-person variability and the random assignment of subjects to treatments. You must understand how differences might result from variability in the response and the random assignment to treatment groups in order to distinguish them from differences created by a treatment effect.
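The reasoning on this slide can be sketched as a simulation: fix each subject's completion time (as if noise has no effect), then repeatedly re-do the random assignment and record the difference in group means. The completion times, the equal split into two groups of 10, and the function name are assumptions made for illustration.

```python
import random
import statistics


def random_assignment_differences(times, group_size, num_assignments, seed=0):
    """Difference in group means (noisy - quiet) over many random assignments.

    Each subject's time is fixed, which models 'no treatment effect':
    only the assignment to groups changes between repetitions.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(num_assignments):
        shuffled = times[:]
        rng.shuffle(shuffled)
        quiet = shuffled[:group_size]
        noisy = shuffled[group_size:]
        diffs.append(statistics.mean(noisy) - statistics.mean(quiet))
    return diffs


# Hypothetical completion times in seconds for the 20 subjects.
times_rng = random.Random(7)
times = [round(times_rng.uniform(40, 80), 1) for _ in range(20)]

diffs = random_assignment_differences(times, group_size=10, num_assignments=1000, seed=3)
print(f"smallest difference: {min(diffs):.2f} s")
print(f"largest difference:  {max(diffs):.2f} s")
```

Even with no treatment effect at all, the difference in group means is rarely exactly zero. An observed difference has to be judged against this chance variation before it is attributed to the treatment.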

Selecting an Appropriate Method: Four Key Questions

Four Key Questions
In the following chapters, you will encounter different types of inference problems. The answers to the following questions will lead you to a suggested method to use.
Question Type (Q): Is the question you are trying to answer an estimation problem or a hypothesis testing problem? You will choose different methods depending on the answer to this question.
Study Type (S): Does the situation involve generalizing from a sample to learn about the population (an observational study or survey), or does it involve generalizing from an experiment to learn about treatment effects? The answer to this question affects the choice of the method as well as the type of conclusion that can be drawn.

Four Key Questions Continued . . .
Type of Data (T): What type of data will be used to answer the question? Is the data set univariate (one variable) or bivariate (two variables)? Are the data categorical or numerical?
Univariate versus Bivariate: Identify whether these examples involve univariate or bivariate data. Explain your choice.
1. The study of deception in online dating profiles investigated whether people who place a greater importance on developing a long-term face-to-face relationship are more honest in their online profiles.
2. A study was performed to learn how the proportion with a TV in the bedroom differed for children in two age groups.

Four Key Questions Continued . . .
Type of Data (T), Categorical versus Numerical: If you have a single variable and the data are categorical, the question of interest is probably about a population proportion. If the data are numerical, the question of interest is probably about a population mean.

Four Key Questions Continued . . .
Number of Samples or Treatments (N): How many samples are there? Or, if the data are from an experiment, how many treatments are being compared? For situations that involve sample data, different methods are used depending on whether there are one, two, or more than two samples. Also, you may choose a different method to analyze data from an experiment with only two treatments than you would for an experiment with more than two treatments.

QSTN
Think of this as the word QUESTION without the vowels.
Q (Question Type): Estimation or hypothesis testing?
S (Study Type): Sample data or experiment data?
T (Type of Data): Univariate or bivariate? Categorical or numerical?
N (Number of Samples or Treatments): How many samples or treatments?

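Answering Four Key Questions to Identify an Appropriate Method
Q (Question Type) | S (Study Type) | T (Type of Data) | N (Number) | Method to Consider | Chapter
Estimation | Sample | Univariate Categorical | 1 | One Sample z Confidence Interval for a Proportion | 9
Hypothesis Test | Sample | Univariate Categorical | 1 | One Sample z Test for a Proportion | 10
Estimation | Sample | Univariate Categorical | 2 | Two Sample z Confidence Interval for a Difference in Proportions | 11
Hypothesis Test | Sample | Univariate Categorical | 2 | Two Sample z Test for a Difference in Proportions | 11
Estimation | Sample | Univariate Numerical | 1 | One Sample t Confidence Interval for a Mean | 12
Hypothesis Test | Sample | Univariate Numerical | 1 | One Sample t Test for a Mean | 12
Estimation | Sample | Univariate Numerical | 2 | Two Sample t Confidence Interval for a Difference in Means | 13
Hypothesis Test | Sample | Univariate Numerical | 2 | Two Sample t Test for a Difference in Means | 13
Hypothesis Test | Sample | Univariate Numerical | More than 2 | ANOVA F Test | 17 (online)
 | Sample | Univariate Numerical | More than 2 | Multiple Comparisons | 17 (online)
You will be able to refer to this table in the following chapters to identify an appropriate method to use.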
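Using the table amounts to a lookup keyed on the answers to the key questions. The sketch below encodes the sample-data rows as a dictionary. The function name, the key spellings, and the decision to label each row as estimation or hypothesis test from the method's name are assumptions; the Multiple Comparisons row is left out because its question type is not clear from the table as transcribed.

```python
# Keys: (question type, data type, number of samples). All rows are for
# sample data with one variable, as in the table.
METHODS = {
    ("estimation", "categorical", 1): ("One Sample z Confidence Interval for a Proportion", "9"),
    ("hypothesis test", "categorical", 1): ("One Sample z Test for a Proportion", "10"),
    ("estimation", "categorical", 2): ("Two Sample z Confidence Interval for a Difference in Proportions", "11"),
    ("hypothesis test", "categorical", 2): ("Two Sample z Test for a Difference in Proportions", "11"),
    ("estimation", "numerical", 1): ("One Sample t Confidence Interval for a Mean", "12"),
    ("hypothesis test", "numerical", 1): ("One Sample t Test for a Mean", "12"),
    ("estimation", "numerical", 2): ("Two Sample t Confidence Interval for a Difference in Means", "13"),
    ("hypothesis test", "numerical", 2): ("Two Sample t Test for a Difference in Means", "13"),
    ("hypothesis test", "numerical", "more than 2"): ("ANOVA F Test", "17 (online)"),
}


def suggest_method(question_type, data_type, num_samples):
    """Return (method, chapter) for the given answers, or raise KeyError."""
    key = (question_type.lower(), data_type.lower(), num_samples)
    if key not in METHODS:
        raise KeyError(f"no method listed for {key}")
    return METHODS[key]


method, chapter = suggest_method("Estimation", "Categorical", 1)
print(f"{method} (Chapter {chapter})")
```

The suggested method is only a starting point. Its conditions still have to be checked against the data before it is used.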

A Five-Step Process for Statistical Inference: Estimation Problems and Hypothesis Testing Problems

A Five-Step Process for Estimation Problems (EMC3)
E (Estimate): Explain what population characteristic you plan to estimate.
M (Method): Select a potential method using QSTN.
C (Check): Check to make sure that the method is appropriate. It is important to verify that any conditions are met before proceeding.
C (Calculate): Sample data are used to perform any necessary calculations.
C (Communicate Results): This is a critical step in the process. You will answer the questions of interest, explain what you have learned from the data, and acknowledge potential risk.
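To make the five steps concrete, here is a minimal sketch that walks through them for one method from the QSTN table, the one-sample z confidence interval for a proportion. The data (120 successes in 400 observations) are hypothetical, and the large-sample check used here (at least 10 successes and 10 failures) is a common rule of thumb assumed for illustration. Whether the sample was randomly selected cannot be checked in code.

```python
import math
from statistics import NormalDist


def one_sample_z_interval(successes, n, confidence=0.95):
    """Large-sample z confidence interval for a population proportion."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n

    # Check: assumed rule of thumb for the large-sample condition.
    if n * p_hat < 10 or n * (1 - p_hat) < 10:
        raise ValueError("sample too small for the large-sample z interval")

    # Calculate: p_hat +/- z* times the estimated standard error.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Estimate: a population proportion. Method: one-sample z interval (via QSTN).
# Hypothetical data, for illustration only.
lower, upper = one_sample_z_interval(successes=120, n=400)

# Communicate: report the interval and acknowledge the risk of being wrong.
print(f"95% confidence interval for the proportion: ({lower:.3f}, {upper:.3f})")
```

The code covers only the Check and Calculate steps mechanically. Stating what is being estimated and interpreting the interval in context remain the analyst's job.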

A Five-Step Process for Hypothesis Testing Problems (HMC3)
H (Hypotheses): Define the hypotheses that will be tested.
M (Method): Select a potential method using QSTN.
C (Check): Check to make sure that the method is appropriate. It is important to verify that any conditions are met before proceeding.
C (Calculate): Sample data are used to perform any necessary calculations.
C (Communicate Results): This is a critical step in the process. You will answer the questions of interest, explain what you have learned from the data, and acknowledge potential risk.
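The hypothesis testing steps can be sketched the same way for the one-sample z test for a proportion. The hypotheses (null value 0.25, upper-tailed alternative), the data, and the sample-size check are all assumptions chosen for illustration, not taken from the presentation.

```python
import math
from statistics import NormalDist


def one_sample_z_test_upper(successes, n, p0):
    """Upper-tailed large-sample z test of H0: p = p0 against Ha: p > p0."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    if not 0 < p0 < 1:
        raise ValueError("p0 must be between 0 and 1")

    # Check: assumed rule of thumb, using the hypothesized proportion.
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("sample too small for the large-sample z test")

    # Calculate: standardize the sample proportion under H0.
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Hypotheses: H0: p = 0.25 versus Ha: p > 0.25 (hypothetical claim and data).
z, p_value = one_sample_z_test_upper(successes=120, n=400, p0=0.25)

# Communicate: a small P-value means the data would be surprising if H0 were true.
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

A small P-value supports the alternative hypothesis, but the conclusion can still be wrong. That possibility is the risk the Communicate step asks you to acknowledge.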