Communicating Quantitative Information Inflation Election district Polling, predictions, confidence intervals, margin of error Homework: Identify topic.


Communicating Quantitative Information Inflation Election district Polling, predictions, confidence intervals, margin of error Homework: Identify topic for Project 1. Postings. Prepare for Midterm

Inflation is when goods and services cost more over time –money is worth less. Government agencies analyze a 'shopping cart' of goods and services and calculate (and publish) a number. If annual inflation is 2% = .02, it means that something that cost $100 last year would cost $102 this year (on average). new_cost = old_cost * (1 + inflation_rate)

Hint Need to change the percentage into a fraction –2% becomes .02 Need to add 1 Multiply old by 1.02 Hint: if inflation is positive (if goods and services are increasing in price), then new must be more than old, so you need to multiply by something that increases the value.
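The two hints (turn the percentage into a fraction, then add 1) can be sketched in a few lines of Python. The function name new_cost and its parameter names are chosen here for illustration only; they are not from the slides.

```python
def new_cost(old_cost: float, inflation_percent: float) -> float:
    """Price after one year of inflation (illustrative helper, not from the slides)."""
    rate = inflation_percent / 100      # 2% becomes .02
    return old_cost * (1 + rate)        # add 1, then multiply the old cost


# The slide's example: something that cost $100 with 2% inflation.
print(round(new_cost(100, 2), 2))
```

The same call with other amounts and rates can be used to check the exercise answers after working them by hand.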

Exercises If inflation is 4%, what would new prices be for something –$50 –$10 If inflation is 12%, what would new prices be for something –$50 –$10

History Mostly, there is inflation, though deflation is possible (and generally not good for economy) Central banks ('the fed') try to regulate inflation by changes in the interest rates Calculation is complex –Consider computers –digital cameras

What is meant by Grade Inflation?

Dental expenses Yes, expenses have gone up, but have they gone up faster than inflation, that is, faster than everything else? Look at the graph –Gray line versus blue line –NOTE: both are increases

Pie chart versus Bar graph Pie is to show parts of a whole –For example, different categories of spending Bar graphs can show categories, also. –Better than pie charts if categories are not everything Bar graphs good for showing different time periods –Horizontal (x-axis) typically holds the time Clustered bar good for comparisons Stacked bar good for parts of a whole

On graphs Graphs and diagrams are for showing context …. Telling a story (the relevant story) Complexity is okay –Want to encourage AND reward study Remember: definitions, denominator, distribution, difference (context), dimension Dimension: may be an axis in a graph. Gapminder uses color, size of 'dot', and timing. Napoleon marching to/from Moscow: color, thickness of line, geography, temperature

On re-districting One technique is to concentrate [known] voters of one type to remove from other districts Are voters so predictable? Do the qualities of the individual representatives count?

New topic(s) Measurement Polling and sampling

Measurements Measuring something can require defining a system / process –Competitive figure skating ‘operational’ definition –‘likely voter’ someone who voted in x% of last general elections and/or y% of primaries And knows the voting place Fixed place and time For surveys: answered a specific question in the context of other questions, …

Source The Cartoon guide to Statistics by Larry Gonick and Woollcott Smith HarperResource

Caution Procedures (formulas) presented without proof, though, hopefully, motivated Go over process different ways Next class: models of population, subpopulations in sample

Task Want to know the percentage (proportion) of some large group –adults in USA –television viewers –web users For a particular thing –think the president is doing a good job –watched specific program –viewed specific commercial –visited specific website

Strategy: Sampling Ask a small group –phone –solicitation at a mall –other? Monitor actions of a small group, group defined for this purpose Monitor actions of a panel chosen ahead of time

Quality of sample Recall discussion on students who 'took the bait' to take special survey More on quality of sample later More on adjusting data from panel for statement about total population later

Two approaches Estimating with a confidence interval: estimate the proportion p in the general population based on the proportion p hat in the sample. Hypothesis testing: H0 (null hypothesis) p = p0 versus Ha (alternate hypothesis) p > p0

Estimation process Construct a sample of size n and determine p hat –Ask who they are voting for (for now, let this be binomial choice) Use this as estimate for actual proportion p. … but the estimate has a margin of error. This means: The actual value is within a range centered at p hat …UNLESS the sample was really strange. The confidence value specifies what the chances are of the sample being that strange.

Statement I'm 95% sure that the actual proportion is in the following range…. p hat – m <= p <= p hat + m Notice: if you want to claim more confidence, you need to make the margin bigger.

Image from Cartoon book You are standing behind a target. An arrow is shot at the target, at a specific point in the target. The arrow comes through to your side. You draw a circle (more complex than +/- error) and say Chances are: the target point is in this circle unless shooter was 'way off'. Shooter would only be way off X percent of the time. (Typically X is 5% or 1%.)

Mathematical basis The p hat values produced by different samples are themselves (approximately) normally distributed… –if the sample size and p satisfy certain conditions. Most samples produce p hat values that are close to the p value of the whole population. Only a small number of samples produce values that are way off. –Think of outliers of normal distribution

Actual (mathematical) process Can use these techniques when n*p>=5 and n*(1-p)>=5 (the sample size must be this big). The p hat values are distributed close to a normal distribution with standard deviation sd(p) = sqrt(p*(1-p)/n) Can estimate this using p hat in place of p in the formula! Choose the level of confidence you want (again, typically 95% or 99%, that is, a 5% or 1% chance of being wrong). For 95% confident, look up (or learn by heart) the value 1.96: this is the number of standard deviations such that 95% of values fall in this area. So .95 is P(-1.96 <= (p - p hat)/sd(p) <= 1.96)
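As a minimal sketch of the two ingredients on this slide, the following Python checks the sample-size conditions and computes the standard deviation of p hat using the standard formula sqrt(p*(1-p)/n). The function names are hypothetical, picked for this illustration.

```python
import math


def normal_approx_ok(n: int, p: float) -> bool:
    """Rule of thumb from the slide: n*p >= 5 and n*(1-p) >= 5."""
    return n * p >= 5 and n * (1 - p) >= 5


def sd_of_p_hat(n: int, p: float) -> float:
    """Standard deviation of the sample proportion: sqrt(p*(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)


# In practice p is unknown, so p hat from the sample is plugged in for p.
print(normal_approx_ok(1000, 0.55), round(sd_of_p_hat(1000, 0.55), 4))
```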

Notes p is less than 1 so (1-p) is positive. Margin of error decreases as p varies from .5 in either direction. (Check using Excel.) –if sample produces a very high (close to 1) or very low value (close to 0), p * (1-p) gets smaller –(.9)*(.1) = .09; (.8)*(.2) = .16; (.6)*(.4) = .24; (.5)*(.5) = .25
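The slide suggests checking this in Excel; the same check can be sketched in Python. The helper name p_times_q is an assumption made for this example.

```python
def p_times_q(p: float) -> float:
    """The product p*(1-p) that sits under the square root in the margin of error."""
    return p * (1 - p)


# The product is largest at p = .5 and shrinks toward 0 or 1.
for p in (0.5, 0.6, 0.8, 0.9):
    print(p, round(p_times_q(p), 2))
```

Because the product is symmetric, a value such as .1 gives the same result as .9.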

Notes Need to quadruple the n to halve the margin of error.
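The reason is that n sits under a square root, so multiplying n by 4 divides the margin by 2. A small Python sketch, assuming the usual margin formula z * sqrt(p hat*(1-p hat)/n) and using illustrative names and sample sizes:

```python
import math


def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Margin of error for a sample proportion (assumed formula: z * sqrt(p_hat*(1-p_hat)/n))."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


small = margin_of_error(0.55, 1000)
large = margin_of_error(0.55, 4000)   # four times the sample size
print(round(large / small, 3))        # ratio of the two margins
```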

Formula Margin of error m = z * sqrt((p hat)*(1-p hat)/n) Use a value z called the z transform (z-score) –for 95% confidence, the value is 1.96

Level of confidence    1-leg or 2-leg    Standard deviations (z-score)
80%                    .10 or .20        1.28
90%                    .05 or .10        1.64
95%                    .025 or .05       1.96
99%                    .005 or .01       2.58
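The z-scores do not have to come from a printed table. As a sketch, Python's standard library can compute them from the confidence level; the function name z_score is made up for this example, and the approach (inverse of the normal cumulative distribution) is a standard one rather than something stated on the slides.

```python
from statistics import NormalDist


def z_score(confidence: float) -> float:
    """Two-sided z-score: the number of standard deviations that covers
    the given share of a normal distribution, centered on the mean."""
    one_leg = (1 - confidence) / 2          # area left over in each tail
    return NormalDist().inv_cdf(1 - one_leg)


for level in (0.80, 0.90, 0.95, 0.99):
    print(f"{level:.0%}  z = {z_score(level):.2f}")
```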

Mechanics Process is: Gather data (get p hat and n). Choose confidence level. Using table, calculate margin of error. Book example: 55% (.55 of a sample of 1000) said they backed the politician. sd(p hat) = square_root((.55)*(.45)/1000) = .0157 Multiply by z-score (e.g., 1.96 for a 95% confidence) to get margin of error. So p is within the range: .550 – (1.96)*(.0157) to .550 + (1.96)*(.0157), that is, .519 to .581 or 51.9% to 58.1%
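The mechanics can be sketched as one small Python function applied to the book example. The name confidence_interval and its signature are illustrative assumptions, not part of the slides or the book.

```python
import math


def confidence_interval(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Return (lower, upper) for the population proportion.

    p_hat: proportion observed in the sample
    n:     sample size
    z:     z-score for the chosen confidence level (1.96 for 95%)
    """
    sd = math.sqrt(p_hat * (1 - p_hat) / n)   # estimated sd of p hat
    margin = z * sd                           # margin of error
    return p_hat - margin, p_hat + margin


# Book example: .55 of a sample of 1000, 95% confidence.
lower, upper = confidence_interval(0.55, 1000)
print(round(lower, 3), round(upper, 3))
```

Changing n or z in the call shows how the range widens for smaller samples or higher confidence.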

Example, continued 51.9% to 58.1% may round to 52% to 58% or may say 55% plus or minus 3 percent. What is typically left out is that there is a 1/20 chance that the actual value is NOT in this range.

95% confident means 95/100 probability that this is true 5/100 chance that this is not true 5/100 is the same as 1/20 so, There is only a 1/20 chance that this is not true. Only 1 in 20 truly random samples would give an answer that deviated more from the real value –ASSUMING NO INTRINSIC QUALITY PROBLEMS –ASSUMING IT IS RANDOMLY CHOSEN
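One way to see the "1 in 20" claim is to simulate many random samples from a population whose true proportion is known, and count how often the computed range contains it. This is only a sketch: the true proportion, sample size, number of trials and random seed below are arbitrary choices for the demonstration.

```python
import math
import random


def coverage(true_p: float, n: int, trials: int, z: float = 1.96, seed: int = 0) -> float:
    """Fraction of simulated random samples whose interval contains true_p."""
    rng = random.Random(seed)             # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Draw one random sample of size n and compute p hat.
        yes = sum(rng.random() < true_p for _ in range(n))
        p_hat = yes / n
        margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / trials


# Expect a value near .95: roughly 1 sample in 20 misses the true proportion.
print(coverage(true_p=0.55, n=500, trials=1000))
```

The simulation assumes truly random sampling, which is exactly the assumption the slide flags in capital letters.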

99% confidence means [Give fraction positive] [Give fraction negative]

Why Confidence intervals given mainly for 95% and 99%?? History, tradition, doing others required more computing….

Let's ask a question How many of you watched the last Super Bowl? World Cup? –Sample is whole class How many registered to vote? –Sample size is number in class 18 and older ????

Excel: columns A & B
Row  A (label)            B (value or formula)
1    students             (number in class)
2    watchers             (number who watched)
3    psample              =B2/B1
4    sd                   =SQRT(B3*(1-B3)/B1)
5    Ztransform for 95%   1.96
6    margin               =B5*B4
7    lower                =MAX(0,B3-B6)
8    upper                =MIN(B3+B6,1)

Variation of book problem Say sample was 300 (not 1000). sd(p hat) = square_root((.55)*(.45)/300) = .0287 Bigger number (the divisor is smaller). The circle around the arrow is larger. The margin is larger because it was based on a smaller sample. Multiplying by 1.96 get .056, subtracting and adding from the .55 get .494 to .606 You/we are 95% sure that true value is in this range. Oops: may be better, but may be worse. The fact that the lower end is below .5 is significant for an election!

Exercise Determine / choose / read size of sample n proportion in sample (p hat ) claimed confidence level (and consult table). Hint: go back to Mechanics slide and Table slide and plug in the numbers!

Exercise size of sample is n proportion in sample is p hat confidence level produces factor called the z-score –Can be anything but common values are [80%], 90%, 95%, 99% –Use table. For example, 95% value is 1.96; 99% is 2.58 Calculate margin of error m –m = zscore * sqrt((p hat)*(1-p hat)/n) Actual value is >= p hat – m and <= p hat + m

Hypothesis testing Pre-election polling Repeat example Source (again) The Cartoon Guide to Statistics by Gonick and Smith –See also for Jury selection, product inspection, etc.

Hypothesis testing Null hypothesis p = p0 Alternate hypothesis p > p0 Do a test and decide if there is evidence to reject the Null hypothesis. (Need more evidence to reject than to keep). –Similar analysis (not giving proof!)

Hypothesis testing, continued Test statistic is Z = (.55 - .50)/(sqrt(.5*.5)/sqrt(1000)) = 3.16 Use Excel =1-normsdist(3.16) P(z>=3.16) = .0008 Reject Null hypothesis. If the Null hypothesis were true (p = p0), the chance of getting a sample result this high would be only .0008.
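The same calculation can be sketched in Python, with the standard library's NormalDist standing in for Excel's normsdist. The function name one_sided_z_test is hypothetical; the numbers are the book example with p0 = .50, a sample proportion of .55 and n = 1000.

```python
import math
from statistics import NormalDist


def one_sided_z_test(p_hat: float, p0: float, n: int) -> tuple[float, float]:
    """Test H0: p = p0 against Ha: p > p0.

    Returns (z, tail), where tail is P(Z >= z) computed as if H0 were true.
    """
    sd0 = math.sqrt(p0 * (1 - p0) / n)    # sd of p hat when p really is p0
    z = (p_hat - p0) / sd0
    tail = 1 - NormalDist().cdf(z)        # same role as =1-normsdist(z) in Excel
    return z, tail


z, tail = one_sided_z_test(0.55, 0.50, 1000)
print(round(z, 2), round(tail, 4))
```

A small tail probability means a sample this favorable would be very unusual if the Null hypothesis held, which is the evidence used to reject it.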

Project I Paper or presentation on news story involving mathematics and/or quantitative reasoning –Involving the audience is good –Everybody be ready with paper or ready to present. Some presentations may go to next class. Use multiple sources Explain the mathematics!!!

Ways to get topic Topic, assignment in other course that involves quantitative information –Double dipping Alternative: compare how two different newspapers/writers/media treat the same topic. There must be real differences. –Variant (special case): election polling. Talk about similarities and differences, perhaps definition of 'toss-up', how they describe sources,? Paulos TV series: ng/ ng/

Homework Topic for project 1 due by October 20 –You can re-use any topic you or anyone else posted –You can re-use spreadsheet or diagram topics –You can use topics I suggested –You can use topics from another class –YOU MUST post your proposal even if it is a topic I suggested. Midterm is October 18 Presentation and project 1 paper due Nov. 4 (Guide to midterm is on-line. Reviewing will assume you have studied the guide.)