Estimation and Uncertainty 12-706/73-359 Lecture 3 - Sept 8, 2004.


Estimation and Uncertainty / Lecture 3 - Sept 8, 2004

Estimation in the Course
- We will encounter estimation problems in sections on demand, cost and risks.
- We will encounter estimation problems in several case studies.
- Projects will likely have estimation problems.
- Need to make quick, “back-of-the-envelope” estimates in many cases.
- Don’t be afraid to do so!

Problem of Unknown Numbers
- If we need a piece of data, we can:
  - Look it up in a reference source
  - Collect the number through a survey/investigation
  - Guess it ourselves
  - Get experts to help us guess it
- Often only a ‘ballpark’, ‘back of the envelope’ or ‘order of magnitude’ estimate is needed
  - Situations when the actual number is unavailable or where rough estimates are good enough
  - E.g. 100s, 1000s, … (10^2, 10^3, etc.)
- Source: Mosteller handout

Notes about Reference Sources
- Some obvious: Statistical Abstract of US
- Always check sources and secondary sources of data
  - Usually found in footnotes - these also tell you about assumptions/conditions for using the data
- Sometimes the summarized data is wrong!
- Look in multiple sources
  - Different answers imply something about the data and method - and uncertainty

Estimation gets no respect
- The 2 extremes - and the respect thing
- Aristotle:
  - “It is the mark of an instructed mind to rest satisfied with the degree of precision which the nature of the subject permits and not to seek an exactness where only an approximation of the truth is possible.”
- Archbishop Ussher of Ireland, 1658 AD:
  - “God created the world in 4028 BC on the 9th of September at nine o’clock in the morning.”
- We consider it somewhere in between

In the absence of “Real Data”
- Are there similar or related values that we know or can guess? (proxies)
  - Mosteller: registered voters and population
- Are there ‘rules of thumb’ in the area?
  - E.g. ‘Rule of 72’ for compound interest
  - r*t = 72: investment at 6% doubles in 12 yrs
  - MEANS construction manual
- Set up a ‘model’ to estimate the unknown
  - Linear, product, etc. functional forms
  - Divide and conquer
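The Rule of 72 can be checked against the exact doubling time. The sketch below assumes annual compounding, which the slide does not specify; the function names are illustrative only.

```python
import math

def doubling_time_rule_of_72(rate_percent):
    """Approximate years to double: 72 divided by the annual rate in percent."""
    return 72.0 / rate_percent

def doubling_time_exact(rate_percent):
    """Exact years to double under annual compounding (assumed): ln 2 / ln(1 + r)."""
    return math.log(2.0) / math.log(1.0 + rate_percent / 100.0)

# Compare the rule of thumb with the exact value at a few rates.
for rate in (2, 6, 10):
    approx = doubling_time_rule_of_72(rate)
    exact = doubling_time_exact(rate)
    print(f"{rate}%: rule of 72 = {approx:.1f} yr, exact = {exact:.1f} yr")
```

At 6% the rule gives 12 years, and the exact figure under this assumption is slightly under 12, so the rule of thumb is good enough for back-of-the-envelope work.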

Methods
- Similarity - do we have data that can be made applicable to our problem?
- Stratification - segment the population into subgroups, estimate each group
- Triangulation - create models with different approaches and compare results
- Convolution - use probability or weightings (see Selvidge’s table, Mosteller p. 181)
  - Note - example of a ‘secondary source’!!

Notes on Estimation
- Move from abstract to concrete, identifying assumptions
- Draw from experience and basic data sources
- Use statistical techniques/surveys if needed
- Be creative, BUT
- Be logical and able to justify
- Find answer, then learn from it.
- Apply a reasonableness test

Attributes of Good Assumptions
- Need to document assumptions in course
  - Write them out and cite your sources
- Have some basis in known facts or experience
  - Write why you make the specific assumptions
- Are unbiased towards the answer
- Example: what is inflation rate next year?
  - Is past inflation a good predictor?
  - Can I find current inflation?
  - Should I assume change from current conditions?
  - We typically use history to guide us

How many TV sets in the US?
- Can this be calculated?
- Estimation approach #1: Survey/similarity
  - How many TV sets owned by class?
  - Scale up by number of people in the US
  - Should we consider the class a representative sample? Why not?

TV Sets in US - another way
- Estimation approach #2 (segmenting):
  - Work from # households and # TVs per household - may survey for one input
  - Assume x households in US
  - Assume z segments of ownership (i.e. what % owns 0, owns 1, etc)
  - Then estimated number of television sets in US = x*(4*z_5 + 3*z_4 + 2*z_3 + 1*z_2 + 0*z_1), where z_k is the share of households in segment k (z_1 owns 0 sets, ..., z_5 owns 4 sets)

TV Sets in US - sample
- Estimation approach #2 (segmenting):
  - Work from # households and # TVs per household - may survey for one input
  - Assume 50,000,000 households in US
  - Assume 19% have 4, 30% have 3, 35% have 2, 15% have 1, 1% have 0 television sets
  - Then 50,000,000*(4*.19 + 3*.30 + 2*.35 + 1*.15 + 0*.01) = 50,000,000*2.51 = 125.5M television sets
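The segmenting estimate is a weighted sum, which a few lines of code make explicit. This is a minimal sketch using the household count and ownership shares assumed on the slide; the variable names are illustrative.

```python
# Assumed inputs from the slide (guesses, not published data).
households = 50_000_000
# Share of households owning each number of TV sets.
ownership_shares = {4: 0.19, 3: 0.30, 2: 0.35, 1: 0.15, 0: 0.01}

# Sanity check: the segments should cover the whole population.
assert abs(sum(ownership_shares.values()) - 1.0) < 1e-9

# Weighted average number of sets per household.
tvs_per_household = sum(n * share for n, share in ownership_shares.items())

# Scale up to the national total.
total_tvs = households * tvs_per_household

print(f"TVs per household: {tvs_per_household:.2f}")
print(f"Estimated TVs in US: {total_tvs / 1e6:.1f} million")
```

Keeping the inputs in one place makes it easy to see which assumption drives the answer when one of them is changed.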

TV Sets in US - still another way
- Estimation approach #3 - published data
- Source: Statistical Abstract of US
  - Gives many basic statistics such as population, areas, etc.
  - Done by accountants/economists - hard to find ‘mass of construction materials’ or ‘tons of lead production’.
- How close are we?

How well did we do?
- Most recent data = 2001
  - But ‘recently’ increasing < 2% per year
  - StatAb: 2.4 TVs/HH, 248M TVs
  - % error: (248M - 125.5M)/125.5M ~ 100%
- What assumptions are crucial in determining our answer? Were we right?
- What other data on this table validate our models?
- See ‘SAMPLE ESTIMATION’ linked on web page to see how you are expected to answer these types of questions.

Changing Assumptions
- Statistical Abstract gave additional info:
  - Average TVs/HH = 2.4 (ours was 2.5)
  - Number of households: 100 million (ours 50)
- Thus to redo our analysis, we should do a better job at estimating households
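To see how much the household assumption matters, the percent error can be recomputed after swapping in the published household count. The sketch follows the slide's convention of measuring error relative to our estimate; `percent_error` is an illustrative helper, not part of the course materials.

```python
def percent_error(estimate, actual):
    """Percent error relative to the estimate, as on the slide."""
    return (actual - estimate) / estimate * 100.0

actual_tvs = 248e6          # Statistical Abstract figure
tvs_per_household = 2.51    # our estimate from the segment shares

# Original guess of 50 million households.
original_estimate = 50e6 * tvs_per_household
# Same model with the published figure of 100 million households.
revised_estimate = 100e6 * tvs_per_household

print(f"Original: {original_estimate / 1e6:.1f}M TVs, "
      f"error {percent_error(original_estimate, actual_tvs):.1f}%")
print(f"Revised:  {revised_estimate / 1e6:.1f}M TVs, "
      f"error {percent_error(revised_estimate, actual_tvs):.1f}%")
```

Almost all of the error comes from the household count; the TVs-per-household guess was already close.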

Significant Figures
- We estimated 125,500,000 tvs in US
- How accurate is this - to the nearest 50,000, the nearest 500,000, the nearest 5,000,000 or the nearest 50,000,000?
- Should only report estimates to your confidence - perhaps 1 or 2 “significant figures” could be reported here.
- Figures are only carried along to document calculations or avoid rounding errors.
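Rounding an estimate to a chosen number of significant figures is easy to automate. The helper below is one possible sketch; its name and approach are assumptions for illustration.

```python
import math

def round_to_sig_figs(value, sig_figs):
    """Round value to the given number of significant figures."""
    if value == 0:
        return 0.0
    # Position of the leading digit, e.g. 8 for 125,500,000.
    magnitude = math.floor(math.log10(abs(value)))
    # Size of the last digit we want to keep.
    factor = 10 ** (magnitude - sig_figs + 1)
    return round(value / factor) * factor

estimate = 125_500_000
for figures in (1, 2):
    print(f"{figures} significant figure(s): {round_to_sig_figs(estimate, figures):,}")
```

Reported this way, the TV estimate becomes roughly 100 million or 130 million, which better reflects how little we actually know.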

Some handy/often used data
- Population of US btw … million
- Number of households ~ 100 million
- Average personal income ~ $30,000

Exercise #2: Estimate Annual Vehicle Miles Travelled (VMT) in the US
- Estimate “How many miles per year are passenger automobiles driven in the US?”
- Types of models
  - Similar to TVs: Guess number of cars, segment population into miles driven per year
  - Find fuel consumption data, guess at fuel economy ratio for passenger vehicles
  - Other ideas? Let’s try it on the board.

Estimate VMT in the US
- Table 1093 of 2003 Stat. Abstract suggests 2001 VMT was 2.28 trillion miles (yes - twice as much as the 1972 figure implied in the Mosteller handout)!
  - 235 billion ‘passenger car trips’ per year
  - About 200 million cars
  - Avg VMT 21,000 mi., about 10,000 miles per car
- Note the Dept of Transportation separately specifies “passenger car VMT” as 1.62 trillion miles - does a better job of separating trucks
  - About 16k VMT per household
  - http:// ion_statistics/2003/index.html (Table 1-32)
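The per-car and per-household figures can be cross-checked from the published totals. This sketch uses the numbers quoted on the slide plus the round figure of 100 million households from the handy-data slide; treat the household count as an assumption.

```python
total_vmt = 2.28e12            # Stat. Abstract, 2001, all vehicles
passenger_car_vmt = 1.62e12    # Dept of Transportation, passenger cars only
cars = 200e6                   # about 200 million cars
households = 100e6             # assumed round figure

# Top-down checks: divide published totals by the counts.
miles_per_car = total_vmt / cars
passenger_vmt_per_household = passenger_car_vmt / households

# Bottom-up estimate: cars times a guessed 10,000 miles per car.
bottom_up_vmt = cars * 10_000

print(f"Miles per car: {miles_per_car:,.0f}")
print(f"Passenger-car VMT per household: {passenger_vmt_per_household:,.0f}")
print(f"Bottom-up VMT: {bottom_up_vmt / 1e12:.1f} trillion miles")
```

The bottom-up guess of about 2 trillion miles lands within the same order of magnitude as both published totals.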

More clever: Cobblers in the US
- Cobblers repair shoes
- On average, assume 20 min/task
- Thus 20 jobs / day ~ 5000/yr
  - How many jobs are needed overall for US?
- I get shoes fixed once every 5 years
  - About 280M people in US
- Thus 280M/5 = 56M shoes fixed/year
  - 56M/5000 ~ 11,000 => 10^4 cobblers in US
- Actual: Census dept says 5,120 in US
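The cobbler estimate divides national demand for repairs by one cobbler's yearly capacity. In the sketch below, the 250 working days per year is an assumption chosen to reproduce the slide's figure of about 5000 jobs per year; the other inputs come from the slide.

```python
population = 280e6               # people in the US
years_between_repairs = 5        # one repair per person every 5 years
jobs_per_day = 20                # per cobbler, at roughly 20 min per task
working_days_per_year = 250      # assumption, gives ~5000 jobs per year

# Demand: repairs needed nationally each year.
repairs_per_year = population / years_between_repairs
# Supply: repairs one cobbler can complete each year.
jobs_per_cobbler_per_year = jobs_per_day * working_days_per_year

cobblers = repairs_per_year / jobs_per_cobbler_per_year
actual_cobblers = 5_120          # Census figure quoted on the slide

print(f"Estimated cobblers: {cobblers:,.0f}")
print(f"Ratio to actual: {cobblers / actual_cobblers:.1f}x")
```

The estimate is about twice the Census count, which is still within the order of magnitude the exercise aims for.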

An Energy Example
- Power is measured in SI units = Watts; energy is power used over time, in watt-hours (as opposed to BTUs, etc)
- In practice, we usually talk about kilowatts of power or kilowatt-hours of energy
- Rule: 1 Watt of power used for one hour is one watt-hour (compound unit) = 1Wh
  - 1000 Watts used for one hour = 1kWh
- ‘How much energy used by lighting in US residences?’

‘How much energy used by lighting in US residences?’
- Assume 50 light fixtures per house
- Assume each in use avg 2 hours per day
- Assume average fixture is 50W
- Thus each fixture uses 100Wh/day
- Each house uses 5000Wh/day (5kWh/day)
- 100 million households would use 500 million kWh/day
  - 182,500 million kWh/yr
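The lighting estimate is a chain of multiplications with one unit conversion, sketched below using the assumptions listed on the slide. Variable names are illustrative.

```python
# Assumed inputs from the slide.
fixtures_per_house = 50
hours_per_day = 2
watts_per_fixture = 50
households = 100e6

# Energy per fixture per day, in watt-hours.
wh_per_fixture_day = watts_per_fixture * hours_per_day
# Energy per house per day, converted from Wh to kWh.
kwh_per_house_day = fixtures_per_house * wh_per_fixture_day / 1000
# National totals.
national_kwh_per_day = kwh_per_house_day * households
national_kwh_per_year = national_kwh_per_day * 365

print(f"Per fixture: {wh_per_fixture_day} Wh/day")
print(f"Per house: {kwh_per_house_day} kWh/day")
print(f"National: {national_kwh_per_year / 1e6:,.0f} million kWh/yr")
```

Because the result is a simple product, doubling any one input (fixtures, hours, or wattage) doubles the national total.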

‘How much energy used by lighting in US residences?’
- Our guess: 182,500 million kWh/yr
  - DOE: “lighting is 5-10% of household elec”
  - http://
- 2000 US residential demand ~ 1.2 million million kWh (source below)
  - 10% is 120,000 million kWh
  - 5% is 60,000 million kWh
  - 2000 demand source: epmt44p1.html
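A quick way to judge the guess is to convert it into a share of residential demand and compare it with the DOE range. This sketch uses only the figures quoted on the slide; the variable names are illustrative.

```python
our_estimate_kwh = 182_500e6        # our lighting guess, kWh/yr
residential_demand_kwh = 1.2e12     # 2000 US residential demand, kWh/yr

# DOE range: lighting is 5-10% of household electricity.
doe_low_kwh = 0.05 * residential_demand_kwh
doe_high_kwh = 0.10 * residential_demand_kwh

# Share of residential demand implied by our guess.
implied_share = our_estimate_kwh / residential_demand_kwh

print(f"DOE range: {doe_low_kwh / 1e6:,.0f} to {doe_high_kwh / 1e6:,.0f} million kWh/yr")
print(f"Our guess implies lighting is {implied_share:.1%} of residential demand")
print(f"Ratio to DOE upper bound: {our_estimate_kwh / doe_high_kwh:.2f}x")
```

The guess overshoots the upper end of the DOE range by roughly half, which suggests revisiting the fixture count, hours of use, or wattage assumptions.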

A Random Example
- Select a random panel of data from the Statistical Abstract of the U.S.
  - Can you formulate an ‘estimation question’?
  - Can you estimate the answer?
  - How close were you to the ‘actual answer’?
- Let’s try this ourselves

Uncertainty
- Investment planning and benefit/cost analysis is fraught with uncertainties
  - forecasts of future are highly uncertain
  - applications often made to preliminary designs
  - data is often unavailable
- Statistics has confidence intervals - economists need them, too.