

From Wikipedia: “Parametric statistics is a branch of statistics that assumes (that) data come from a type of probability distribution and makes inferences about the parameters of the distribution. Most well-known elementary statistical methods (e.g. the ones from our class) are parametric.” But there are alternative methods that don’t require any assumptions about the shape of the population’s probability distribution. Resampling methods are an example.

Resampling Methods

There are three kinds of resampling methods:

Permutation methods: used most commonly with correlations, where the probability of the observed data is estimated by comparing the observed pairings to a large number of random pairings of the data.
Monte Carlo methods: estimate the population probability distribution through simulation.
Bootstrap methods: the sampling distribution of an observed statistic is estimated by repeatedly resampling the data with replacement and recalculating the statistic.

Example of a permutation method: Suppose you measured the IQs of 25 pairs of twins and found a correlation of r = 0.36. Is the observed correlation significantly greater than zero? (use α = .05)

[Scatter plot: IQ of Twin 1 vs. IQ of Twin 2, correlation r = 0.36]

The (parametric) test used in our class compares the observed r with a critical value, r crit. By that test we would reject H0 and conclude that a correlation of 0.36 is (barely) significantly greater than zero.

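The distribution under the null hypothesis can be estimated by repeatedly shuffling (or ‘permuting’) the relationship between the X and Y values and calculating the correlation each time.

[Tables: the X values paired with the original Y values and with several shuffled orderings Y’, each with its own correlation, e.g. r = .20, …]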

Correlations from a sample of the shuffled data sets: r = 0.05, -0.01, -0.00, -0.22, 0.25, 0.05, -0.32, -0.47, -0.34, -0.18, -0.01, 0.31, -0.20, -0.25, 0.15, -0.37, -0.11, -0.24, -0.38, -0.36, -0.26, -0.30, -0.09, -0.24, 0.07, 0.05, 0.13, -0.05, -0.16, 0.02, -0.17, 0.11, -0.12, -0.19, -0.01.

This generates a distribution of correlations that should be centered around zero. We can then use this distribution to calculate the probability of obtaining a correlation as large as our observed sample correlation when the null hypothesis is true.

After the permutation reps, Pr(r > 0.36) = 0.0378. Only 3.78% of the correlations generated by permutation exceed the observed correlation of 0.36, so we’d reject the null hypothesis using α = .05.
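The shuffling procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the twin IQ scores are not given in the slides, so the data below are synthetic, and the names `pearson_r` and `permutation_p_value`, the seed, and the number of reps are all hypothetical choices.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def permutation_p_value(x, y, reps=10000, seed=0):
    """One-tailed permutation test: Pr(r_shuffled >= r_observed)."""
    rng = random.Random(seed)
    observed = pearson_r(x, y)
    shuffled = list(y)          # copy so the caller's data is untouched
    count = 0
    for _ in range(reps):
        rng.shuffle(shuffled)   # break the X-Y pairing at random
        if pearson_r(x, shuffled) >= observed:
            count += 1
    return observed, count / reps

# Synthetic stand-in for the 25 twin pairs (NOT the data from the slides).
rng = random.Random(1)
twin1 = [rng.gauss(100, 15) for _ in range(25)]
twin2 = [100 + 0.4 * (t - 100) + rng.gauss(0, 14) for t in twin1]

r_obs, p = permutation_p_value(twin1, twin2, reps=5000)
print(f"observed r = {r_obs:.2f}, Pr(r >= observed) = {p:.4f}")
```

If the returned proportion is below α, the observed correlation is declared significantly greater than zero, which mirrors the 3.78% versus α = .05 comparison above.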

Example of a Monte Carlo simulation: Liar’s dice

This is a game where n players roll 6-sided dice (40 dice in total in this example) and keep the outcome hidden under their own separate cups. The goal is to guess how many of the dice equal the mode. After a player makes a guess, the next player must decide if the guess is too high, or otherwise guess a higher number. If the next player decides that the guess is too high, the cups are lifted and the number of dice equal to the mode is counted. If the guess was indeed too high, he/she wins and the player that made the guess must drink (lemonade).

Suppose there are eight players, each with 5 dice. The player to your right just guessed that 14 dice equal the mode. What is the probability that the number of dice equal to the mode, out of the 40 dice, is that high or higher?

Here’s an example of 40 throws. The mode is 5, and 10 of these throws equal the mode.

[Table: the 40 throws, with the mode and the number of throws equal to the mode (#)]

Example of 20 simulations. Each row is a throw of 40 dice. The last column is the number of throws that equal the mode.

[Table: columns are rep, the 40 throws, mode, and # (number of throws equal to the mode)]

A computer simulation of one million rolls generated this histogram. Shown in red are the examples when the number of dice equal to the mode is 14 or higher. Only 2.31% of the simulations found a count of 14 or higher. This small number means that the player should ask all players to lift their cups and calculate the value.
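A simulation of this kind can be sketched as follows. It assumes fair dice and independent rolls; the function names, the seed, and the number of reps are hypothetical, and the sketch uses far fewer reps than the one million mentioned above, so its estimate will only approximate the reported figure.

```python
import random
from collections import Counter

def modal_count(num_dice, rng):
    """Roll num_dice fair 6-sided dice; return how many show the modal face."""
    rolls = rng.choices(range(1, 7), k=num_dice)
    return max(Counter(rolls).values())

def prob_modal_count_at_least(threshold, num_dice=40, reps=50_000, seed=0):
    """Monte Carlo estimate of Pr(number of dice equal to the mode >= threshold)."""
    rng = random.Random(seed)
    hits = sum(modal_count(num_dice, rng) >= threshold for _ in range(reps))
    return hits / reps

estimate = prob_modal_count_at_least(14)
print(f"Pr(count >= 14) is approximately {estimate:.4f}")
```

Raising `reps` tightens the estimate at the cost of run time; the one-million-roll simulation in the slides is the same idea at a larger scale.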

Third method of resampling: bootstrapping to conduct a hypothesis test on medians. Suppose you measured the amount of time it takes for a subject to perform a simple mental rotation. Previous research shows that it should take a median of 2 seconds to conduct this task. Your subject conducts 500 trials and generates the distribution of response times below, which has a median of 2.15 seconds. Is this number significantly greater than 2? (use α = .05)

The trick to bootstrapping is to generate an estimate of the sampling distribution of your observed statistic by repeatedly sampling the data with replacement and recalculating the statistic.

[Figure: histograms of twelve bootstrap resamples, each labeled with its median, e.g. median = 2.22]

For our example, we can count the proportion of times that the bootstrapped median falls below 2.

Since more than 5% of our bootstrapped medians fall below 2, we (just barely) cannot conclude that our observed median is significantly greater than 2.
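The bootstrap test on the median can be sketched like this. The response times from the slides are not available, so the 500 values below are drawn from a log-normal distribution purely as a hypothetical stand-in; `bootstrap_medians`, `prop_below`, the seeds, and the number of reps are likewise assumptions.

```python
import random
from statistics import median

def bootstrap_medians(data, reps=5000, seed=0):
    """Resample data with replacement reps times; return each resample's median."""
    rng = random.Random(seed)
    n = len(data)
    return [median(rng.choices(data, k=n)) for _ in range(reps)]

def prop_below(values, cutoff):
    """Proportion of values strictly below cutoff."""
    return sum(v < cutoff for v in values) / len(values)

# Hypothetical response times in seconds (NOT the data from the slides).
rng = random.Random(2)
times = [rng.lognormvariate(0.75, 0.5) for _ in range(500)]

meds = bootstrap_medians(times, reps=2000)
p = prop_below(meds, 2.0)
print(f"sample median = {median(times):.2f}, "
      f"proportion of bootstrapped medians below 2 = {p:.4f}")
```

The proportion of bootstrapped medians below 2 is then compared with α: if it is larger than .05, as in the example above, the observed median is not declared significantly greater than 2.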