+ Using StatCrunch to Teach Statistics Using Resampling Techniques Webster West Texas A&M University.

Presentation transcript:

+ Using StatCrunch to Teach Statistics Using Resampling Techniques Webster West Texas A&M University

+ Background
George Cobb sparked interest in using resampling methods for introductory statistics with his plenary talk at the First USCOTS. Several groups are now working on integrating these approaches into the curriculum. I have added numerous resampling procedures to StatCrunch, which is widely used in teaching introductory statistics. Roger Woodard and I hold the INCIST NSF grant to develop teaching materials that incorporate these methods. We have conducted numerous workshops around the country where we have presented these materials to statistics teachers.

+ A Randomization Activity
Students were randomly assigned to a version of an exam (Yellow or Green) when they entered the classroom. Afterwards, both sets of students complained that their version was harder. Students investigate whether the observed difference of 6.3 between the two group means might be due to random chance alone. They shuffle cards with scores written on them into yellow and green groups and then calculate the difference between the two means. Each student places a Post-it note on a whiteboard at the location of their difference, and the class evaluates the resulting randomization distribution.
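The card-shuffling procedure can be mimicked in a few lines of code. The following is a minimal Python sketch, not the StatCrunch applet itself: the function name `randomization_test` and the exam scores are invented for illustration (the slides do not give the class data), so the difference here is not the 6.3 observed in class.

```python
import random
from statistics import mean


def randomization_test(group_a, group_b, n_shuffles=10_000, seed=0):
    """Shuffle the pooled scores into two groups and record the difference in means."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    diffs = []
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # the software version of shuffling the cards
        diffs.append(mean(pooled[:n_a]) - mean(pooled[n_a:]))
    # Two-sided: proportion of shuffles at least as extreme as the observed difference
    extreme = sum(abs(d) >= abs(observed) for d in diffs)
    return observed, diffs, extreme / n_shuffles


# Hypothetical exam scores, invented for this sketch
yellow = [78, 85, 92, 70, 88, 95, 81, 76, 90, 84]
green = [72, 80, 69, 75, 83, 66, 79, 74, 71, 77]

observed, diffs, p_value = randomization_test(yellow, green)
print(f"observed difference: {observed:.1f}")
print(f"proportion of shuffles at least as extreme: {p_value:.4f}")
```

Each entry of `diffs` plays the role of one Post-it note on the whiteboard; a histogram of `diffs` is the randomization distribution.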

+ The randomization applet

+ A Sampling Distribution Activity
A very inconvenient printed roster of 12,000 students at a fictitious university is provided to the students. They build a sampling distribution by each collecting a random sample of size 30 and reporting the mean number of Facebook friends for their sample via a StatCrunch survey. From the sampling distribution, students see that the normal curve is a good descriptor of the sampling variability of the sample mean. This leads to the CLT and the idea of statistic ± 2×standard error as a 95% confidence interval for an unknown population mean.

+ A Sampling Distribution Activity Mary’s Sample Mean

+ A Sampling Distribution Activity
The instructor then uses their access to the data in electronic form within StatCrunch to compute 1000 sample means, each based on a sample of 30 students. This is like doing the activity with 1000 students instead of 10.
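The instructor's step can be sketched outside StatCrunch as well. The Python below is an illustration under stated assumptions: the roster data are not in the slides, so the population of 12,000 friend counts is simulated from a right-skewed distribution chosen arbitrarily, and `sampling_distribution` is a hypothetical helper name.

```python
import random
from statistics import mean, stdev

rng = random.Random(1)

# Hypothetical population: 12,000 right-skewed "Facebook friend" counts (simulated, not real data)
population = [int(rng.expovariate(1 / 300)) for _ in range(12_000)]


def sampling_distribution(population, sample_size=30, n_samples=1000):
    """Means of repeated simple random samples drawn without replacement."""
    return [mean(rng.sample(population, sample_size)) for _ in range(n_samples)]


sample_means = sampling_distribution(population)
standard_error = stdev(sample_means)  # spread of the sampling distribution

# Proportion of sample means within 2 standard errors of the population mean
mu = mean(population)
coverage = mean(abs(m - mu) <= 2 * standard_error for m in sample_means)

# statistic ± 2 × standard error, applied to one student's sample
one_sample = rng.sample(population, 30)
estimate = mean(one_sample)
interval = (estimate - 2 * standard_error, estimate + 2 * standard_error)

print(f"standard error of the mean: {standard_error:.1f}")
print(f"share of sample means within 2 SE of the population mean: {coverage:.3f}")
print(f"interval from one sample: ({interval[0]:.1f}, {interval[1]:.1f})")
```

Even though the simulated population is skewed, the share of sample means within two standard errors should come out close to 95%, which is the observation that motivates the statistic ± 2×standard error interval.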

+ A Bootstrapping Activity
Building on the sampling distribution activity, students are then tasked with estimating the standard error of the sample mean using a single sample. Using the sample as a proxy for the population, each student draws a resample of 30 values, taken with replacement from the common sample data, and reports the mean of their bootstrap sample. The student results are then augmented with applet results to compute a 95% confidence interval for the population mean.
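A rough Python sketch of the bootstrap step follows. It is not the StatCrunch applet: the 30 friend counts are invented, `bootstrap_means` is a hypothetical helper, and the slides do not say which interval the applet reports, so both the statistic ± 2×standard error interval and a simple percentile interval are shown as possibilities.

```python
import random
from statistics import mean, stdev


def bootstrap_means(sample, n_resamples=1000, seed=0):
    """Means of resamples drawn with replacement, each the same size as the sample."""
    rng = random.Random(seed)
    n = len(sample)
    return [mean(rng.choices(sample, k=n)) for _ in range(n_resamples)]


# Hypothetical common sample of 30 Facebook friend counts, invented for this sketch
sample = [120, 340, 85, 410, 230, 560, 95, 310, 275, 150,
          620, 45, 380, 205, 330, 180, 490, 260, 70, 515,
          295, 140, 365, 220, 430, 110, 285, 195, 350, 240]

boot = bootstrap_means(sample)
boot_se = stdev(boot)  # bootstrap estimate of the standard error of the mean

# Interval 1: statistic ± 2 × standard error, centered on the original sample mean
estimate = mean(sample)
ci_se = (estimate - 2 * boot_se, estimate + 2 * boot_se)

# Interval 2: approximate middle 95% of the bootstrap means (percentile interval)
ordered = sorted(boot)
ci_pct = (ordered[int(0.025 * len(ordered))], ordered[int(0.975 * len(ordered)) - 1])

print(f"sample mean: {estimate:.1f}, bootstrap SE: {boot_se:.1f}")
print(f"2-SE interval: ({ci_se[0]:.1f}, {ci_se[1]:.1f})")
print(f"percentile interval: ({ci_pct[0]:.1f}, {ci_pct[1]:.1f})")
```

Both intervals estimate the population mean, not the sample mean; the bootstrap means are used only to gauge how much a sample mean varies from sample to sample.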

+ The bootstrapping applet

+ What we have learned about the randomization approach
This approach appeals nicely to a basic intuition that most people have about the problem. The tactile simulation adds a great deal of value to students' understanding of what is taking place in the applet. It can be introduced with little or no background in other statistical concepts such as normal theory or even the jargon of hypothesis testing. It can easily be used at a variety of points in the standard introductory course. Students and instructors seem to really take to it!

+ What we have learned about the bootstrapping approach
The bootstrap can be used to reinforce the idea of a sampling distribution. The bootstrap approach requires a great deal of backstory before it can be effectively introduced. It is probably best to rely on technology alone for the bootstrap after the student does the sampling activity in a more tactile way. Students seem to like it, but instructors not so much! The bootstrap may also lead to misconceptions on the part of students: “Am I getting a confidence interval for the sample mean?”

+ For discussion
With these approaches, two people can get different results even when using the same data set.
Taking a simple random sample from a large list of values is difficult to do in a tactile way. Must we always rely on help from the computer?
How can we make one-sample problems more interesting to students? Students are interested in samples but not interested in population parameters! Why do we care about the mean number of Facebook friends?
Are we too focused on inference in the introductory course? How often will the average student ever be confronted with a random sample?