Lecture 17.

Slides:



Advertisements
Similar presentations
Lecture 18. Final Project Design your own survey! – Find an interesting question and population – Design your sampling plan – Collect Data – Analyze using.
Advertisements

Lecture Inference for a population mean when the stdev is unknown; one more example 12.3 Testing a population variance 12.4 Testing a population.
Copyright © 2010 Pearson Education, Inc. Chapter 19 Confidence Intervals for Proportions.
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 8-1 Chapter 8 Fundamentals of Hypothesis Testing: One-Sample Tests Statistics.
The Calibration Process
Hypothesis Testing:.
Chapter 13: Inference in Regression
ECE 8443 – Pattern Recognition LECTURE 06: MAXIMUM LIKELIHOOD AND BAYESIAN ESTIMATION Objectives: Bias in ML Estimates Bayesian Estimation Example Resources:
Populations, Samples, and Probability. Populations and Samples Population – Any complete set of observations (or potential observations) may be characterized.
Bayesian inference review Objective –estimate unknown parameter  based on observations y. Result is given by probability distribution. Bayesian inference.
AP Statistics Chapter 13 Notes. Two-sample problems The goal is to compare the responses of two treatments given to randomly assigned groups, or to compare.
Not in FPP Bayesian Statistics. The Frequentist paradigm Defines probability as a long-run frequency independent, identical trials Looks at parameters.
Lecture: Forensic Evidence and Probability Characteristics of evidence Class characteristics Individual characteristics  features that place the item.
PROBABILITY IN OUR DAILY LIVES
Copyright © 2009 Pearson Education, Inc. Chapter 19 Confidence Intervals for Proportions.
Testing Hypotheses about a Population Proportion Lecture 29 Sections 9.1 – 9.3 Fri, Nov 12, 2004.
Testing Hypotheses about a Population Proportion Lecture 29 Sections 9.1 – 9.3 Wed, Nov 1, 2006.
Section 7.2 P1 Means and Variances of Random Variables AP Statistics.
Section Copyright © 2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series.
1 Math 4030 – 10b Inferences Concerning Proportions.
Lecture PowerPoint Slides Basic Practice of Statistics 7 th Edition.
Multinomial Distribution World Premier League Soccer Game Outcomes.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Chapter 19 Confidence Intervals for Proportions.
Statistical NLP: Lecture 4 Mathematical Foundations I: Probability Theory (Ch2)
1 Where we are going : a graphic: Hypothesis Testing. 1 2 Paired 2 or more Means Variances Proportions Categories Slopes Ho: / CI Samples Ho: / CI / CI.
Copyright © Cengage Learning. All rights reserved. 3 Discrete Random Variables and Probability Distributions.
Lecture Slides Elementary Statistics Twelfth Edition
Confidence Intervals for Proportions
ESTIMATION.
CHAPTER 10 Comparing Two Populations or Groups
Multiple Imputation using SOLAS for Missing Data Analysis
Testing Hypotheses about a Population Proportion
Probability and Statistics
AP Statistics Comparing Two Proportions
Chapters 20, 21 Hypothesis Testing-- Determining if a Result is Different from Expected.
Confidence Intervals for Proportions
Confidence Intervals for Proportions
The Calibration Process
Lecture Slides Elementary Statistics Twelfth Edition
Lecture 18.
CHAPTER 10 Comparing Two Populations or Groups
Lecture 20.
The Nature of Probability and Statistics
More about Posterior Distributions
MATH 2311 Section 4.4.
Chapter 18 – Sampling Distribution Models
Statistical NLP: Lecture 4
Lecture: Forensic Evidence and Probability Characteristics of evidence
Comparing two Rates Farrokh Alemi Ph.D.
Probability and Statistics
CHAPTER 10 Comparing Two Populations or Groups
Sampling Distribution Models
Sampling Distribution of a Sample Mean
Additional notes on random variables
Chapter 8: Estimating with Confidence
Additional notes on random variables
Testing Hypotheses about a Population Proportion
Section Means and Variances of Random Variables
Confidence Intervals for Proportions
CHAPTER 10 Comparing Two Populations or Groups
CHAPTER 10 Comparing Two Populations or Groups
CHAPTER 10 Comparing Two Populations or Groups
CHAPTER 10 Comparing Two Populations or Groups
Testing Hypotheses about a Population Proportion
CHAPTER 10 Comparing Two Populations or Groups
CHAPTER 10 Comparing Two Populations or Groups
CHAPTER 10 Comparing Two Populations or Groups
Testing Hypotheses about a Population Proportion
MATH 2311 Section 4.4.
Presentation transcript:

Lecture 17

Final Project Design your own survey! Find an interesting question and population Design your sampling plan Collect Data Analyze using R Write 5 page paper on your results Due April 21

Final presentation During the last class (April 26) all students will be required to give a short presentation Select one of the three projects Make a powerpoint presentation (no more than 3-5 slides) Present your results to the class

Final Project Design your own survey! Find an interesting question and population Design your sampling plan Collect Data Analyze using R Write 5 page paper on your results Due December 1

Final presentation All are required to give a short presentation Last two classes (December 1 and 6) Select one of the three projects Make a powerpoint presentation (no more than 3-5 slides) Present your results to the class

Statistics Main ideas of statistics Given multiple plausible models select one (or several) that is (are) the most consistent with the observed data Quantify a measure of belief in our solution The main idea is that if something looks like a very unlikely coincidence we would prefer another more likely explanation

Example 1 There is one model we favor and want to check if a particular feature of the data is consistent with it (hypotheses testing). The UK National Lottery is 6/49 Genoese lottery. In the first 1240 drawings since 2000 there has been a lucky number 38 (drawn 181 times) and unlucky number 20 (drawn 122 times). [All things being equal we would expect each number to be drawn 151.8] Similarly number 17 took a staggering break of 72 drawings in a row! Is this consistent with the assumption that the lottery is random and all numbers are equally likely?

Idea Generate similar data from the known distribution and compare with the results observed. Statistics: number of times “luckiest number” drawn, number of times “unluckiest number” drawn, size of the biggest gap

R simulation Code Loterry1240.R What is our conclusion?

Example 2 Premier League 2006/2007 Could we view this as random 20 teams – playing home and away (total 380 matches) 3 points for victory, 1 point each for a draw At the end Manchester United ended up with 89 points, Chelsea with 83, Watford with 28 Could we view this as random http://plus.maths.org/content/understanding-uncertainty-premier-league?src=aop

R simulation Data Statistic Issue Conclusion? http://en.wikipedia.org/wiki/2006–07_Premier_League Statistic Max (89), min (28), variance (238.7) Issue it is known that there is a big difference between home and away. Simple model: (p-home,p-draw,p-away) If all things were equal we can estimate this to be (48%,26%,26%) Conclusion?

Other issues In sports – successive trials are probably not independent Can we test this? What would we need? Data Statistics (numerical measurement that caries information about the feature we are interested in) Simulation scheme/model

Other statistical problems Having several models and deciding how likely each model is given data. Bayesian statistics Need prior believe in each model Update the believe based on data

Deciding between several models One option is to use Bayesian approach

Bayesian statistics A method for updating and combining information expressed in terms of probabilities (data and prior believes) Recent notable uses Nate Silver http://fivethirtyeight.com Search for Air France Flight 447 http://en.wikipedia.org/wiki/Air_France_Flight_447

Example 1: Testing for disease or drug. Two models: Prior information Subject has the disease (uses drugs) – Test is positive with probability q1 (sensitivity) Subject does not have the disease (is clean) – Test is negative with probability q2 (specificity) Prior information Certain proportion p of the population has the disease

About 4% of the population uses It is claimed that a DrugWipe5 has sensitivity of 91% and specificity of 95% About 4% of the population uses A randomly selected worker tested positive. What is the chance he is a user? http://www.drugwarfacts.org/cms/Drug_Testing

Excel Calculation

A spotter has a 30% success rate in spotting workers who use A spotter has a 30% success rate in spotting workers who use. A person selected by a spotter was tested positive. Is there a difference? Comments?

The rule of succession Derived by Laplace (1814) We have an experiment that can end up in success or failure We performed n experiments and all of them were successes. What is the chance that the next one will be success?

Observed data n out of n successes Prior – all models equally likely Consider N models Chance of success p=0/(N-1),1/(N-1),…,(N-1)/(N-1) Observed data n out of n successes Prior – all models equally likely Posterior probability

Example 2b What if we observed s out of n successes? Other priors?