Lecture 18. Final Project Design your own survey! – Find an interesting question and population – Design your sampling plan – Collect Data – Analyze using.

Slides:



Advertisements
Similar presentations
EDUC 200C Two sample t-tests November 9, Review : What are the following? Sampling Distribution Standard Error of the Mean Central Limit Theorem.
Advertisements

Genetic Statistics Lectures (5) Multiple testing correction and population structure correction.
Topics Today: Case I: t-test single mean: Does a particular sample belong to a hypothesized population? Thursday: Case II: t-test independent means: Are.
CHAPTER 21 Inferential Statistical Analysis. Understanding probability The idea of probability is central to inferential statistics. It means the chance.
Sampling: Final and Initial Sample Size Determination
Chapter 4 Probability and Probability Distributions
Copyright © 2010 Pearson Education, Inc. Chapter 18 Sampling Distribution Models.
ESTIMATION AND HYPOTHESIS TESTING
Stat 301 – Day 15 Comparing Groups. Statistical Inference Making statements about the “world” based on observing a sample of data, with an indication.
Statistical Background
Sampling error Error that occurs in data due to the errors inherent in sampling from a population –Population: the group of interest (e.g., all students.
Hypothesis Testing:.
Chapter 13: Inference in Regression
1 © Lecture note 3 Hypothesis Testing MAKE HYPOTHESIS ©
Testing Hypotheses about a Population Proportion Lecture 29 Sections 9.1 – 9.3 Tue, Oct 23, 2007.
QBM117 - Business Statistics Estimating the population mean , when the population variance  2, is unknown.
ECE 8443 – Pattern Recognition LECTURE 06: MAXIMUM LIKELIHOOD AND BAYESIAN ESTIMATION Objectives: Bias in ML Estimates Bayesian Estimation Example Resources:
Copyright © 2010 Pearson Education, Inc. Slide
Populations, Samples, and Probability. Populations and Samples Population – Any complete set of observations (or potential observations) may be characterized.
Breaking Statistical Rules: How bad is it really? Presented by Sio F. Kong Joint work with: Janet Locke, Samson Amede Advisor: Dr. C. K. Chauhan.
Chapter 4 Statistics. 4.1 – What is Statistics? Definition Data are observed values of random variables. The field of statistics is a collection.
Chapter Twelve Census: Population canvass - not really a “sample” Asking the entire population Budget Available: A valid factor – how much can we.
Bayesian inference review Objective –estimate unknown parameter  based on observations y. Result is given by probability distribution. Bayesian inference.
Decisions from Data: The Role of Simulation Gail Burrill Gail Burrill Michigan State University
From the Data at Hand to the World at Large
Copyright © 2009 Pearson Education, Inc. Chapter 18 Sampling Distribution Models.
1 Chapter 18 Sampling Distribution Models. 2 Suppose we had a barrel of jelly beans … this barrel has 75% red jelly beans and 25% blue jelly beans.
Hypothesis Testing A procedure for determining which of two (or more) mutually exclusive statements is more likely true We classify hypothesis tests in.
Hypothesis Testing. The 2 nd type of formal statistical inference Our goal is to assess the evidence provided by data from a sample about some claim concerning.
Copyright © 2010 Pearson Education, Inc. Chapter 22 Comparing Two Proportions.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Chapter 22 Comparing Two Proportions.
Chapter 8 : Estimation.
Not in FPP Bayesian Statistics. The Frequentist paradigm Defines probability as a long-run frequency independent, identical trials Looks at parameters.
Chapter 2 Statistical Background. 2.3 Random Variables and Probability Distributions A variable X is said to be a random variable (rv) if for every real.
Lecture PowerPoint Slides Basic Practice of Statistics 7 th Edition.
PROBABILITY IN OUR DAILY LIVES
Repeated Measures Analysis of Variance Analysis of Variance (ANOVA) is used to compare more than 2 treatment means. Repeated measures is analogous to.
Testing Hypotheses about a Population Proportion Lecture 29 Sections 9.1 – 9.3 Fri, Nov 12, 2004.
AP Statistics, Section 7.2, Part 1 2  The Michigan Daily Game you pick a 3 digit number and win $500 if your number matches the number drawn. AP Statistics,
Section 7.2 P1 Means and Variances of Random Variables AP Statistics.
Medical Statistics Medical Statistics Tao Yuchun Tao Yuchun 7
Section 6.4 Inferences for Variances. Chi-square probability densities.
Lecture PowerPoint Slides Basic Practice of Statistics 7 th Edition.
Multinomial Distribution World Premier League Soccer Game Outcomes.
1 Guess the Covered Word Goal 1 EOC Review 2 Scientific Method A process that guides the search for answers to a question.
An Overview of Statistics Lesson 1.1. What is statistics? Statistics is the science of collecting, organizing, analyzing, and interpreting data in order.
Plan for Today: Chapter 1: Where Do Data Come From? Chapter 2: Samples, Good and Bad Chapter 3: What Do Samples Tell US? Chapter 4: Sample Surveys in the.
Introduction Imagine the process for testing a new design for a propulsion system on the International Space Station. The project engineers wouldn’t perform.
Confidence Intervals. Point Estimate u A specific numerical value estimate of a parameter. u The best point estimate for the population mean is the sample.
Fundamentals of Data Analysis Lecture 3 Basics of statistics.
Irwin/McGraw-Hill © Andrew F. Siegel, 1997 and l Chapter 7 l Hypothesis Tests 7.1 Developing Null and Alternative Hypotheses 7.2 Type I & Type.
Ch3.5 Hypergeometric Distribution
Testing Hypotheses about a Population Proportion
Probability and Statistics
Inference: Conclusion with Confidence
Overview of Statistics
Lecture 17.
Lecture 18.
Lecture 20.
Discrete Event Simulation - 4
Statistical NLP: Lecture 4
Probability and Statistics
Sampling Distribution Models
Additional notes on random variables
Additional notes on random variables
Hypothesis Tests for a Standard Deviation
LECTURE 07: BAYESIAN ESTIMATION
Section Means and Variances of Random Variables
Section Means and Variances of Random Variables
Testing Hypotheses about a Population Proportion
Presentation transcript:

Lecture 18

Final Project Design your own survey! – Find an interesting question and population – Design your sampling plan – Collect Data – Analyze using R Write 5 page paper on your results Due November 25 (just before Thanksgiving)

Final presentation During the last class (December 3) all students will be required to give a short presentation – Select one of the three projects – Make a powerpoint presentation (no more than 3- 5 slides) – Present your results to the class

Sampling exercise We have discussed need to do random sampling Fun exercise (thanks to Dr. Kelli)

Statistics Main ideas of statistics – Given multiple plausible models select one (or several) that is (are) the most consistent with the observed data – Quantify a measure of belief in our solution The main idea is that if something looks like a very unlikely coincidence we would prefer another more likely explanation

Example 1 There is one model we favor and want to check if a particular feature of the data is consistent with it (hypotheses testing). The UK National Lottery is 6/49 Genoese lottery. – In the first 1240 drawings since 2000 there has been a lucky number 38 (drawn 181 times) and unlucky number 20 (drawn 122 times). [All things being equal we would expect each number to be drawn 151.8] – Similarly number 17 took a staggering break of 72 drawings in a row! Is this consistent with the assumption that the lottery is random and all numbers are equally likely?

Idea Generate similar data from the known distribution and compare with the results observed. Statistics: number of times “luckiest number” drawn, number of times “unluckiest number” drawn, size of the biggest gap

R simulation Code Loterry1240.R What is our conclusion?

Example 2 Premier League 2006/2007 – 20 teams – playing home and away (total 380 mathes) – 3 points for victory, 1 point each for a draw – At the end Manchester United ended up with 89 points, Chelsea with 83, Watford with 28 Could we view this as random uncertainty-premier-league?src=aop

R simulation Data – Statistic – Max (89), min (28), variance (238.7) Issue – it is known that there is a big difference between home and away. – Simple model: (p-home,p-draw,p-away) If all things were equal we can estimate this to be (48%,26%,26%) Conclusion?

Other issues In sports – successive trials are probably not independent Can we test this? What would we need? – Data – Statistics (numerical measurement that caries information about the feature we are interested in) – Simulation scheme/model

Other statistical problems Having several models and deciding how likely each model is given data. Bayesian statistics – Need prior believe in each model – Update the believe based on data