Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-1 Lesson 1: Analysis of Economic Data is difficult but intuitive.



Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-2 Outline: Capture-recapture experiment; Estimator; Simulations; What is statistics?; Sampling; How to estimate the unemployment rate.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-3 Capture/Re-capture Goals: 1. Illustrate how to estimate the population size when the cost of counting all individuals is prohibitive. 2. Illustrate how easy and intuitive statistics can be. Statistics need not be completely deep, murky, and mysterious. Our common sense can help us negotiate our way through the course.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-4 Counting the stones We are interested in knowing the number of black stones in the box. We only need to obtain a reasonable estimate of the number of stones in the box, allowing for errors of counting or estimation.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-5 Two examples Example #1: The box contains only a small number of stones. Example #2: The box contains a lot of stones that will take days to count.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-6 History and examples of the capture/recapture method Capture-recapture methods were originally developed in wildlife biology to monitor populations of birds, fish, and insects, where counting all individuals is prohibitive. More recently, these methods have been used considerably in disease and event monitoring.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-7 The fish example Estimating the number of fish in a lake or pond: C fish are caught, tagged, and returned to the lake. Later on, R fish are caught and checked for tags. Say T of them have tags. The numbers C, R, and T are used to estimate the fish population.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-8 Stones in a box The objective is to estimate the number of fish (represented by black stones) in a box. Capture one handful of fish (black stones). Count them and call the count C. Mark the fish by replacing the black stones with red stones, and put them back into the box. Capture another handful of fish (stones). Count the total number of fish or stones in this handful (R) and the number of marked fish or marked stones among them (T). Based on this information, how can we obtain a reasonable estimate of the number of fish or stones in the box?

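Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-9 Stones in a box Let N be the unknown total number of fish or stones in the box. The share of marked fish in the box, C/N, should be close to the share of marked fish in the second handful, T/R, so C/N ≈ T/R. Hence, a simple estimate of N is CR/T, where C = the number of fish or stones captured in the first round, R = the total number of fish or stones captured in the second round, and T = the number of marked fish or marked stones captured in the second round.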
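
As a rough illustration of the CR/T formula, here is a minimal Python sketch. The function name `capture_recapture_estimate` and the example counts are hypothetical and chosen only for illustration; they are not taken from the lecture.

```python
def capture_recapture_estimate(c: int, r: int, t: int) -> float:
    """Estimate the population size N as C*R/T.

    c: number captured and marked in the first round
    r: number captured in the second round
    t: number of marked individuals among the second-round capture
    """
    if t == 0:
        # With no marked individuals recaptured, CR/T is undefined.
        raise ValueError("T = 0: the estimate CR/T is undefined")
    return c * r / t


# Hypothetical counts: 50 marked, 40 recaptured, 4 of those carry a mark.
estimate = capture_recapture_estimate(50, 40, 4)
print(estimate)  # 500.0
```

Raising an error when T = 0 is one possible design choice; a simulation might instead skip such runs.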

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-10 Simulations to see the properties of this proposed estimator How good is the proposed estimator? To see its properties, I have used MATLAB to simulate our capture-recapture experiment with different numbers of captures (C) and different numbers of recaptures (R), relative to the total number of fish in the pond. Throughout, N = 500 and each setting is simulated 1000 times.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-11 Definition: Estimator An estimator is a formula or a rule that takes a set of data and returns an estimate of the population quantity (also known as a population parameter) we are interested in: $\theta(x_1, x_2, \ldots, x_n)$. Example: an estimator for the population mean. If we are interested in the population mean, a very intuitive estimator based on a sample $(x_1, x_2, \ldots, x_n)$ is $\theta(x_1, x_2, \ldots, x_n) = (x_1 + x_2 + \cdots + x_n)/n$.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-12 Simulating the properties of a sample mean estimator Suppose we want to study the properties of the following two estimators of the population mean: $\theta(x_1, x_2, \ldots, x_n) = (x_1 + x_2 + \cdots + x_n)/n$ versus $\theta(x_1, x_2, \ldots, x_n) = (x_1 + x_2 + \cdots + x_n + 1)/n$. With some basic computing skills, we may perform Monte Carlo simulations to compare their properties.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-13 Simulating the properties of a sample mean estimator
1. We will need to define a population. Suppose the population consists of 10 balls numbered from 1 to 10 in a bag. We know that the population mean is $(1 + 2 + \cdots + 10)/10 = 5.5$.
2. We will need to define the sampling process. Suppose we draw a sample of size 5 with replacement. For the sample, compute the two sample mean estimates of the population mean.
3. We will need to decide on the number of repetitions. Suppose we repeat the process 10,000 times.
4. After repeating the sampling process 10,000 times, we will have 10,000 sample means for each of the estimators, each of which is an estimate of the population mean based on the respective sample.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-14 Simulating the properties of a sample mean estimator The above simulation was performed using MATLAB. The means of the 10,000 sample means of the two estimators are [value not preserved] and [value not preserved].
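
The four-step design can be sketched in a few lines of Python. This is not the MATLAB code used in the lecture. The function name `simulate_mean_estimators` and the fixed seed are assumptions made for reproducibility, and the second estimator is taken to be $(x_1 + x_2 + \cdots + x_n + 1)/n$.

```python
import random
from statistics import mean


def simulate_mean_estimators(reps=10_000, n=5, seed=0):
    """Monte Carlo comparison of two estimators of the population mean."""
    rng = random.Random(seed)        # fixed seed: an assumption, for reproducibility
    population = list(range(1, 11))  # balls numbered 1..10; true mean is 5.5
    first, second = [], []
    for _ in range(reps):
        sample = rng.choices(population, k=n)  # draw n balls with replacement
        total = sum(sample)
        first.append(total / n)         # (x_1 + ... + x_n) / n
        second.append((total + 1) / n)  # (x_1 + ... + x_n + 1) / n
    # Average each estimator over all repetitions.
    return mean(first), mean(second)


mean_first, mean_second = simulate_mean_estimators()
print(mean_first, mean_second)
```

By construction the second estimator exceeds the first by exactly 1/n = 0.2 in every sample, so the two averages differ by 0.2. The first average should be close to the true mean of 5.5.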

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-15 Which estimator is more desirable?

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-16 Simulation design – via MATLAB Individual simulation experiment: Create 500 "black" fish, labelled 1 to 500. Capture a random sample of C fish and mark them by converting their labels to zero (i.e., red fish). Capture another random sample of R fish. Count the number of marked fish in this sample and call it T. Compute the estimate as CR/T. If T = 0, the estimate is undefined, so experiments with T = 0 are dropped. Repeat this experiment 1000 times to obtain up to 1000 estimates. Compute the mean and standard deviation of these estimates.
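
A minimal Python sketch of this simulation design follows. The lecture's version was written in MATLAB and is not reproduced here. The function name, the default values C = 100 and R = 100, the fixed seed, and sampling without replacement within each capture are all assumptions.

```python
import random
from statistics import mean, stdev


def simulate_capture_recapture(n=500, c=100, r=100, reps=1000, seed=0):
    """Return (S, mean, std) of the CR/T estimates over repeated experiments.

    S is the number of experiments with at least one marked fish recaptured.
    """
    rng = random.Random(seed)  # fixed seed: an assumption, for reproducibility
    fish = range(n)            # fish labelled 0..n-1
    estimates = []
    for _ in range(reps):
        marked = set(rng.sample(fish, c))  # first capture (assumed without replacement)
        recaptured = rng.sample(fish, r)   # second capture, from the whole pond
        t = sum(1 for f in recaptured if f in marked)
        if t == 0:
            continue                       # CR/T is undefined; drop this experiment
        estimates.append(c * r / t)
    return len(estimates), mean(estimates), stdev(estimates)


s, est_mean, est_std = simulate_capture_recapture()
print(s, est_mean, est_std)
```

Calling the function with different `c` and `r` values gives one row of summary statistics per setting.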

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-17 Properties of our estimator: increasing C and R [Table with columns N, C, R, S, Mean, Std; values not preserved.] N = total number of fish in the pond. C = number of captured fish. R = number of recaptured fish. S = number of simulations with at least one marked fish in the recapture.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-18 Properties of our estimator: constant C and increasing R [Table with columns N, C, R, S, Mean, Std; values not preserved.] N = total number of fish in the pond. C = number of captured fish. R = number of recaptured fish. S = number of simulations with at least one marked fish in the recapture.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-19 Properties of our estimator: increasing C and constant R [Table with columns N, C, R, S, Mean, Std; values not preserved.] N = total number of fish in the pond. C = number of captured fish. R = number of recaptured fish. S = number of simulations with at least one marked fish in the recapture.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-20 Conclusion from the simulations The proposed estimator generally overestimates the number of fish in the pond, i.e., the estimate is larger than the true number of fish. That is, there is a bias. Holding R constant, increasing the number of captures (C) helps: the bias is reduced, i.e., the mean is closer to the true population size, and the estimator is more precise, i.e., its standard deviation is smaller. Holding C constant, increasing the number of recaptures (R) does not help: both the bias and the precision of the estimator are more or less unchanged.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-21 Additional issues Our proposed estimator is good enough, but it can be improved. Alternative estimators have been developed to reduce or eliminate the bias in estimating N. For instance, Seber (1982, p. 60) suggests the following estimator of N: (C+1)(R+1)/(T+1) - 1. (Note that our proposed formula is CR/T.) Reference: Seber, G. (1982): The Estimation of Animal Abundance and Related Parameters, second edition, Charles.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-22 Simulations to see the properties of this modified estimator How good is the modified estimator? To see its properties, we repeat the above simulation exercise with the new formula: (C+1)(R+1)/(T+1) - 1.
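
One way to compare the two formulas is to apply both to the same simulated experiments, as in the Python sketch below. The function names, the settings C = 100 and R = 100, the fixed seed, and sampling without replacement are assumptions for illustration. Experiments with T = 0 are dropped for both estimators so that the comparison uses identical runs.

```python
import random
from statistics import mean, stdev


def original_estimate(c, r, t):
    """The simple estimator CR/T (requires t > 0)."""
    return c * r / t


def modified_estimate(c, r, t):
    """The modified estimator (C+1)(R+1)/(T+1) - 1."""
    return (c + 1) * (r + 1) / (t + 1) - 1


def compare_estimators(n=500, c=100, r=100, reps=1000, seed=0):
    """Return ((mean, std) of original, (mean, std) of modified)."""
    rng = random.Random(seed)  # fixed seed: an assumption, for reproducibility
    orig, mod = [], []
    for _ in range(reps):
        marked = set(rng.sample(range(n), c))
        t = sum(1 for f in rng.sample(range(n), r) if f in marked)
        if t == 0:
            continue  # dropped, so both estimators see the same experiments
        orig.append(original_estimate(c, r, t))
        mod.append(modified_estimate(c, r, t))
    return (mean(orig), stdev(orig)), (mean(mod), stdev(mod))


original_stats, modified_stats = compare_estimators()
print("original:", original_stats)
print("modified:", modified_stats)
```

Unlike CR/T, the modified formula stays finite when T = 0, although this sketch still drops those runs.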

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-23 Properties of modified estimator: increasing C and R [Table with columns N, C, R, S, Mean, Std; values not preserved.] N = total number of fish in the pond. C = number of captured fish. R = number of recaptured fish. S = number of simulations with a non-zero number of marked fish in the recapture.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-24 Properties of modified estimator: constant C and increasing R [Table with columns N, C, R, S, Mean, Std; values not preserved.] N = total number of fish in the pond. C = number of captured fish. R = number of recaptured fish. S = number of simulations with a non-zero number of marked fish in the recapture.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-25 Properties of modified estimator: increasing C and constant R [Table with columns N, C, R, S, Mean, Std; values not preserved.] N = total number of fish in the pond. C = number of captured fish. R = number of recaptured fish. S = number of simulations with a non-zero number of marked fish in the recapture.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-26 Conclusion from the simulations The modified estimator performs better than the original estimator: there is no apparent bias, and the estimator is more precise. Holding R constant, increasing the number of captures (C) helps: the estimator is more precise, i.e., its standard deviation is smaller. Holding C constant, increasing the number of recaptures (R) does not help: the precision of the estimator is more or less unchanged.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-27 What is Meant by Statistics? Statistics is the science of 1. collecting, 2. organizing, 3. presenting, 4. analyzing, and 5. interpreting numerical data to assist in making more effective decisions.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-28 Who Uses Statistics? Statistical techniques are used extensively by economists, people working in marketing, accounting, and quality control, consumers, professional sports people, hospital administrators, educators, politicians, physicians, etc.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-29 Who Uses Statistics? As economists, we must verify our models with data. We need to provide forecasts of the economy (e.g., GDP growth). We need quantitative estimates of (1) how individual decisions are influenced by policy variables (such as unemployment benefits or education subsidies), in order to forecast the impact of public policies, and (2) how macro policies (such as government expenditure) will affect output.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-30 Who Uses Statistics? In the business community, managers must make decisions based on what will happen to such things as demand, costs, and profits. These decisions are an effort to shape the future of the organization. If the managers make no effort to look at the past and extrapolate into the future, the likelihood of achieving success is slim.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-31 Why do we need to understand Statistics? We are constantly deluged with statistics in the media (newspapers, magazines, journals, text books, etc.). We need to have a means to condense large quantities of information into a few facts or figures. We need to predict what will likely occur given what has occurred in the past. We need to generalize what we have learned in specific situations to the more general case.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-32 We are users of statistics We do not want to become professors of statistics. We do not want to develop advanced statistical theory. We are users of statistics. To be effective users, we need a good grasp of basic statistical theory, and we need to practice using the tools. This course will give you the basics, enough for you to move on to your next Econometrics class.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-33 Populations and Samples A population is a collection of all possible individuals, objects, or measurements of interest. A sample is a portion, or part, of the population of interest.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-34 Populations and Samples [Figure with labels: Population, Sample]

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-35 Sampling a Population of Existing Units Random sampling: a procedure for selecting a subset of the population units in such a way that every unit in the population has an equal chance of selection. Sampling with replacement: when a unit is selected as part of the sample, its value is recorded and the unit is placed back into the population for possible reselection. Sampling without replacement: units are not placed back into the population after selection.
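
The difference between the two sampling schemes can be seen with Python's standard `random` module. The population of ten numbered units and the fixed seed below are hypothetical choices for illustration.

```python
import random

rng = random.Random(0)           # fixed seed: an assumption, for reproducibility
population = list(range(1, 11))  # hypothetical population: units numbered 1..10

# Sampling with replacement: a unit may be selected more than once.
with_replacement = rng.choices(population, k=5)

# Sampling without replacement: every selected unit is distinct.
without_replacement = rng.sample(population, k=5)

print(with_replacement)
print(without_replacement)
```

`random.choices` draws each unit independently, so repeats are possible, whereas `random.sample` never returns the same unit twice.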

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-36 Approximate Random Samples Frame: a list of all population units. It is required for random sampling, but not for approximate random sampling methods such as systematic and voluntary response sampling. Systematic sample: every k-th element of the population is selected for the sample. Voluntary response sample: sample units are self-selected (as in radio/TV surveys).
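
A systematic sample can be sketched with list slicing. The function name `systematic_sample` is hypothetical, and starting at a random position among the first k units is an assumption that the slide does not state.

```python
import random


def systematic_sample(units, k, seed=None):
    """Select every k-th unit, starting from a random position in the first k.

    The random starting point is an assumption; the slide only says that
    every k-th element is selected.
    """
    rng = random.Random(seed)
    start = rng.randrange(k)  # integer in 0..k-1
    return units[start::k]    # slice with step k


# Hypothetical population of 100 units, selecting every 10th one.
sample = systematic_sample(list(range(100)), k=10, seed=1)
print(sample)
```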

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-37 How to estimate the unemployment rate First, survey a large number of individuals (say, 1000). Ask each one: Are you aged 15 or over? If not, you are definitely not in the labor force. If you are 15 or over: Have you worked for pay or profit during the seven days before enumeration, or do you have a formal job attachment? If yes, you are counted as employed. If not employed: Have you been available for work during the seven days before enumeration? And have you sought work during the 30 days before enumeration? If yes to both questions, you are counted as unemployed.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-38 How to estimate the unemployment rate The unemployment rate is computed as #unemployed / (#unemployed + #employed). Note that the estimate of the unemployment rate is based on a random subset (which we call a sample) of the individuals in an economy, not on all individuals in the economy.
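
The survey questions and the rate formula can be put together in a short Python sketch. The function names and the six respondents are hypothetical. Treating someone who is neither employed nor both available and seeking work as outside the labor force is an assumption, since the slides do not say how such respondents are counted.

```python
def classify(age, has_job, available, sought_work):
    """Classify one respondent following the survey questions."""
    if age < 15:
        return "not in labor force"
    if has_job:                    # worked for pay or profit, or has a job attachment
        return "employed"
    if available and sought_work:  # yes to both questions
        return "unemployed"
    return "not in labor force"    # assumption: not stated explicitly in the slides


def unemployment_rate(statuses):
    """#unemployed / (#unemployed + #employed)."""
    unemployed = statuses.count("unemployed")
    employed = statuses.count("employed")
    return unemployed / (unemployed + employed)


# Hypothetical respondents: (age, has_job, available, sought_work)
respondents = [
    (12, False, False, False),
    (30, True, False, False),
    (45, True, False, False),
    (22, False, True, True),
    (67, False, False, False),
    (35, True, False, False),
]
statuses = [classify(*person) for person in respondents]
rate = unemployment_rate(statuses)
print(rate)  # 0.25: one unemployed out of four people in the labor force
```

The sketch does not guard against a sample with nobody in the labor force, in which case the denominator would be zero.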

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-39 Simulation: An estimation of the unemployment rate The process of estimating the unemployment rate may be simulated at home or in a classroom with a bag of black and white stones (as in a game of Go). Suppose black stones stand for unemployed individuals and white stones stand for employed individuals. A random selection of 20 individuals is like randomly grabbing 20 stones from the bag. We ask each selected individual whether they are white (employed) or black (unemployed). The unemployment rate may be computed using the formula #unemployed / (#unemployed + #employed), with black stones counted as unemployed and white stones as employed.
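
The bag-of-stones exercise can also be mimicked on a computer. In the Python sketch below, the bag composition (50 black and 950 white stones), the function name, and the fixed seed are hypothetical choices. Only the sample size of 20 comes from the slide.

```python
import random


def estimate_unemployment_rate(bag, sample_size=20, rng=None):
    """Grab `sample_size` stones without replacement; return the share of black ones."""
    rng = rng or random.Random()
    drawn = rng.sample(bag, sample_size)
    unemployed = drawn.count("black")    # black stones stand for unemployed
    employed = drawn.count("white")      # white stones stand for employed
    return unemployed / (unemployed + employed)


# Hypothetical bag: 50 black and 950 white stones (a true rate of 5%).
bag = ["black"] * 50 + ["white"] * 950
rate = estimate_unemployment_rate(bag, rng=random.Random(0))
print(rate)
```

Repeating the draw many times shows how much an estimate based on only 20 stones varies from sample to sample.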

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson1-40 What to take away today Statistics could be easy and intuitive. Statistics need not be completely deep, murky, and mysterious. Our common sense can help us to negotiate our way through the course.

Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson END - Lesson 1: Analysis of Economic Data is difficult but intuitive