Econ 140 Lecture 3: Univariate Populations


1 Econ 140 Lecture 3: Univariate Populations

2 Today’s Plan
Univariate statistics: the distribution of a single variable
Making inferences about population parameters from sample statistics
– (For future reference: how can we relate the ‘a’ and ‘b’ parameters from last lecture to sample data?)
Dealing with two types of probability
– ‘A priori’ classical probability
– Empirical classical probability

3 A Priori Classical Probability
Characterized by a finite number of known outcomes
The expected value of Y can be defined as \( E(Y) = \sum_{k} p_k Y_k \), where \( p_k \) is the probability of outcome \( Y_k \)
The expected value will always be the mean value
– \( \mu_Y \) is the population mean
– \( \bar{Y} \) is the sample mean
The outcome of an experiment is a randomized trial

4 Flipping Coins
Example: flipping 2 fair coins
– Possible outcomes are: HH, TT, HT, TH
– We know there are only 4 possible outcomes
– We get discrete outcomes because there are a finite number of possible outcomes
– We can represent known outcomes in a matrix

5 Flipping Coins (2)
The probability of some event A is \( P(A) = m / n \)
– where m is the number of outcomes consistent with event A and n is the total number of possible outcomes
– If A is the number of heads when flipping 2 coins, we can represent the probability distribution function (PDF) like this: P(A = 0) = 0.25, P(A = 1) = 0.5, P(A = 2) = 0.25

6 Flipping Coins (3)
If we graph the PDF we get:
[Figure: PDF of the number of heads from flipping 2 coins]
The expected value is \( E(A) = 0(0.25) + 1(0.5) + 2(0.25) = 1 \)
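The coin example can be checked by brute force. The sketch below enumerates the four outcomes, builds the PDF of the number of heads as m/n, and sums value times probability to get the expected value. The variable names are illustrative and do not come from the lecture.

```python
from collections import Counter
from itertools import product

# Enumerate every equally likely outcome of flipping 2 fair coins.
outcomes = list(product("HT", repeat=2))  # HH, HT, TH, TT
n = len(outcomes)

# m = number of outcomes consistent with each number of heads; P(A) = m / n.
counts = Counter(flips.count("H") for flips in outcomes)
pdf = {heads: m / n for heads, m in sorted(counts.items())}

# Expected value: each possible value times its probability, summed.
expected_heads = sum(heads * p for heads, p in pdf.items())

print(pdf)
print(expected_heads)
```

Changing `repeat=2` to a larger number of coins gives the corresponding PDF without any other change.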

7 Empirical Classical Probability
Characterized by an infinite number of possible outcomes
With empirical classical probability, we use sample data to make inferences about underlying population parameters
– Most of the time, we don’t know what the population values are, so we need to use a sample
Example: GPAs in the Econ 140 population
– We can take a sample of every 5th person in the room
– Assuming that our sample is random (that Econ 140 students do not sit in some systematic fashion), we’ll have a representative sample of the population

8 Empirical Classical Probability
Statisticians and economists collect sample data for many other purposes
The Current Population Survey (CPS) is another example: sampling occurs at the household level
The CPS uses weights to correct data for over-sampling
– Over-sampling would be if we picked 1 in 3 in the front of the room and only 1 in 5 in the back of the room. In that case we would over-sample the front
– There’s a spreadsheet example on the course website (the weighted mean is our best guess of the population mean, whereas the unweighted mean is the sample mean)
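To see how weights undo over-sampling, here is a minimal sketch of the front-of-room and back-of-room example. It assumes each sampled person is weighted by the inverse of their sampling rate (3 for the front, 5 for the back), which is a common convention but is not spelled out on the slide. The GPA values are invented for illustration.

```python
# Hypothetical GPAs for the people sampled in each part of the room.
front_sample = [3.8, 3.6, 3.4]           # sampled 1 in 3: each stands for 3 people
back_sample = [3.2, 3.0, 2.8, 3.0, 3.0]  # sampled 1 in 5: each stands for 5 people

gpas = front_sample + back_sample
# Assumption: weight = 1 / sampling rate.
weights = [3] * len(front_sample) + [5] * len(back_sample)

unweighted_mean = sum(gpas) / len(gpas)
weighted_mean = sum(g * w for g, w in zip(gpas, weights)) / sum(weights)

print(f"estimated population size: {sum(weights)}")
print(f"unweighted (sample) mean:  {unweighted_mean:.3f}")
print(f"weighted mean:             {weighted_mean:.3f}")
```

Because the front of the room is over-represented in the sample and has higher GPAs in this made-up data, the unweighted mean overstates the weighted estimate.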

9 Empirical Classical Probability
On the course website you’ll find an Excel spreadsheet that we will use to calculate the following:
– Expected value
– PDF and CDF
– Weights to translate sample data into population estimates
– The difference between the sample (unweighted) mean and the estimated population (weighted) mean:
Weighted mean = sum(EARNWKE*EARNWT)/sum(EARNWT)
This approximates the population mean

10 Empirical Classical Probability (3)
So how do we construct a PDF for our spreadsheet example?
– Pick sensible earnings bands (i.e. 10 bands of $100)
– We can pick as many bands as we want: the greater the number of bands, the more closely the shape of the PDF matches the ‘true population’ distribution. More bands = more calculation!

11 Empirical Classical Probability (2)
Constructing PDFs:
– Count the number of observations in each band to get an absolute frequency
– Use weights to translate sample frequencies into estimates of the population frequencies
– Calculate relative frequencies for each band by dividing the absolute frequency for the band by the total frequency

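12 Empirical Classical Probability (4)
– A weighted way to approximate the PDF: \( p_k = \frac{\sum_{i \in \text{band } k} w_i}{\sum_{i} w_i} \), where \( w_i \) is the weight on observation i
– When we have k bands, always check that \( \sum_{k} p_k = 1 \): if the probabilities don’t sum to 1, we’ve made a mistake!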
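The band-counting steps can be sketched as follows. The (earnings, weight) pairs are made up for illustration and are not the course spreadsheet, and five bands of $100 are used only to keep the example short.

```python
# Hypothetical sample of (weekly earnings, weight) pairs; not the course data.
sample = [(80, 1200), (140, 1500), (160, 900), (230, 2000),
          (270, 1100), (310, 1700), (390, 1300), (450, 800)]
band_width = 100
n_bands = 5

abs_freq = [0] * n_bands    # unweighted count of observations per band
wt_freq = [0.0] * n_bands   # weighted count (estimated population frequency)
for earnings, weight in sample:
    # Band index; anything above the top band is folded into the last band.
    k = min(earnings // band_width, n_bands - 1)
    abs_freq[k] += 1
    wt_freq[k] += weight

# Relative frequency = band frequency / total frequency.
pdf_unweighted = [f / sum(abs_freq) for f in abs_freq]
pdf_weighted = [f / sum(wt_freq) for f in wt_freq]

# The check from the slide: probabilities must sum to 1.
assert abs(sum(pdf_unweighted) - 1) < 1e-12
assert abs(sum(pdf_weighted) - 1) < 1e-12

print(pdf_unweighted)
print(pdf_weighted)
```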

13 Empirical Classical Probability (5)
Going back to our expected value…
The expected value of Y will be: \( E(Y) = \sum_{k} p_k Y_k \)
– The \( p_k \) are relative frequencies, and they can be weighted or not
– The \( Y_k \) are the earnings band midpoints (50, 150, 250, and so on in the spreadsheet)
From our spreadsheet example our weighted mean was $316.63 and the unweighted mean was $317.04
– Since the sample is so large, there is little difference between the sample (unweighted) mean and the population (weighted) mean

14 Empirical Classical Probability (6)
We can also calculate the weighted and unweighted expected values:
E(Weighted value): $326.85
E(Unweighted value): $327.31
Why are the expected values different from the means?
– We lose some information (by grouping the wage data into bands) in calculating the expected values!
So why would we want to weight the observations?
– With a small sample of what we think is a large population, we might not have sampled randomly. We use weights to make the sample more closely resemble the population.
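The gap between the mean and the expected value comes from replacing each observation with its band midpoint. A minimal sketch with invented (earnings, weight) pairs, not the course spreadsheet, makes the information loss visible:

```python
# Hypothetical sample of (weekly earnings, weight) pairs; not the course data.
sample = [(80, 1200), (140, 1500), (160, 900), (230, 2000),
          (270, 1100), (310, 1700), (390, 1300), (450, 800)]
band_width = 100
total_weight = sum(w for _, w in sample)

# Weighted mean uses every observation's actual value.
weighted_mean = sum(y * w for y, w in sample) / total_weight

def midpoint(y):
    """Midpoint of the $100 band containing y (50, 150, 250, ...)."""
    return (y // band_width) * band_width + band_width / 2

# Weighted relative frequency p_k for each band midpoint Y_k.
p = {}
for y, w in sample:
    p[midpoint(y)] = p.get(midpoint(y), 0.0) + w / total_weight

# Expected value uses only the band midpoints: E(Y) = sum of p_k * Y_k.
expected_value = sum(y_k * p_k for y_k, p_k in p.items())

print(f"weighted mean:  {weighted_mean:.2f}")
print(f"expected value: {expected_value:.2f}")
```

Narrower bands shrink the difference, since each midpoint then sits closer to the observations it replaces.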

15 Empirical Classical Probability (7)
The mean is the first moment of the distribution of earnings
We may also want to consider how variable earnings are
– We can do this by finding the variance, or the standard deviation
Calculate the variance
– In our example, the weighted variance is 29730.34
– The difference between the weighted and unweighted variances is 623.44
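The variance formula used in the spreadsheet is not preserved in the transcript. As an assumption, the sketch below uses the weighted average of squared deviations from the weighted mean, and the plain average of squared deviations for the unweighted case. The data are invented for illustration.

```python
import math

# Hypothetical sample of (weekly earnings, weight) pairs; not the course data.
sample = [(80, 1200), (140, 1500), (160, 900), (230, 2000),
          (270, 1100), (310, 1700), (390, 1300), (450, 800)]

values = [y for y, _ in sample]
total_weight = sum(w for _, w in sample)

# Unweighted: every observation counts equally.
mean_u = sum(values) / len(values)
var_u = sum((y - mean_u) ** 2 for y in values) / len(values)

# Weighted (assumed form): deviations from the weighted mean, weighted by w.
mean_w = sum(y * w for y, w in sample) / total_weight
var_w = sum(w * (y - mean_w) ** 2 for y, w in sample) / total_weight

print(f"unweighted variance: {var_u:.2f}, std dev: {math.sqrt(var_u):.2f}")
print(f"weighted variance:   {var_w:.2f}, std dev: {math.sqrt(var_w):.2f}")
```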

16 Empirical Classical Probability (8)
[Figure: weighted and unweighted PDFs of earnings; the weighted PDF is pink]
It’s tough to see, but the weighting scheme makes the population distribution tighter

17 Empirical Classical Probability (9)
We can use our PDF to answer:
– What is the probability that someone earns between $300 and $400?
But we can’t use this PDF to answer:
– What is the probability that someone earns between $253 and $316?
Why?
– The second question can’t be answered using our PDF because $253 and $316 fall somewhere within the earnings bands, not at the endpoints

18 Standard Normal Curve
We need to calculate something other than our PDF, using the sample mean, the sample variance, and an assumption about the shape of the distribution function
– We will examine the assumption later
The standard normal curve (whose areas are tabulated in the Z table) will approximate the probability distribution of almost any continuous variable as the number of observations approaches infinity

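19 Standard Normal Curve (2)
The standard deviation (which measures the distance from the mean) is the square root of the variance: \( \sigma_Y = \sqrt{\sigma_Y^2} \)
[Figure: normal curve with about 68% of the area within 1 standard deviation of the mean, 95% within 2, and 99.7% within 3]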
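The 68%, 95%, and 99.7% figures can be reproduced from the standard normal CDF. This sketch writes the CDF with the error function from Python's standard library; the helper names are illustrative.

```python
import math

def standard_normal_cdf(z):
    """P(Z <= z) for Z ~ N(0, 1), written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def area_within(k):
    """Area under the standard normal curve within k standard deviations of the mean."""
    return standard_normal_cdf(k) - standard_normal_cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
```

The 95% figure is a rounding of the area within 2 standard deviations; exactly 95% of the area lies within about 1.96 standard deviations.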

20 Standard Normal Curve (3)
Properties of the normal curve
– The curve is centered around \( \mu_Y \)
– The curve reaches its highest value at \( \mu_Y \) and tails off symmetrically at both ends
– The distribution is fully described by the expected value and the variance
You can convert any distribution for which you have estimates of \( \mu_Y \) and \( \sigma_Y \) to a standard normal distribution

21 Standard Normal Curve (4)
A distribution only needs to be approximately normal for us to convert it to the standardized normal. The mass of the distribution must fall in the center, but the shape of the tails can be different
[Figure: two approximately normal distributions with differently shaped tails]

22 Standard Normal Curve (5)
If we want to know the probability that someone earns at most $C, we are asking: \( P(Y \le C) \)
We can rearrange terms to get: \( P\left(\frac{Y - \mu_Y}{\sigma_Y} \le \frac{C - \mu_Y}{\sigma_Y}\right) = P\left(Z \le \frac{C - \mu_Y}{\sigma_Y}\right) \)
Properties of the standard normal variate Z:
– It is normally distributed with a mean of zero and a variance of 1, written in shorthand as Z~N(0,1)

23 Standard Normal Curve (5)
If we have some variable Y, we can assume that Y will be normally distributed, written in shorthand as \( Y \sim N(\mu_Y, \sigma_Y^2) \)
We can use Z to convert Y to a standard normal distribution
Look at the Z standardized normal distribution handout
– You can calculate the area under the Z curve from the mean of zero to the value of interest
– For example: read down the left-hand column to 1.6 and along the top row to .04, and you’ll find that the area under the curve between Z=0 and Z=1.64 is 0.4495
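The table lookup can be cross-checked in code. This sketch uses `statistics.NormalDist` from the Python standard library (available from Python 3.8) to compute the area between Z=0 and Z=1.64; the function name `z_table_area` is illustrative.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)  # the standard normal variate, Z ~ N(0, 1)

def z_table_area(z):
    """Area under the standard normal curve between 0 and z, as in the handout."""
    return Z.cdf(z) - Z.cdf(0.0)

area = z_table_area(1.64)
print(f"area between Z=0 and Z=1.64: {area:.4f}")
```

Because the curve is symmetric, the area between Z=-1.64 and Z=0 is the same size, so a table of non-negative Z values is enough.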

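24 Standard Normal Curve (6)
Going back to our earlier question: what is the probability that someone earns between $300 and $400?
\( P(300 \le Y \le 400) = P(Z_1 \le Z \le Z_2) \), where \( Z_1 = \frac{300 - \mu_Y}{\sigma_Y} \) and \( Z_2 = \frac{400 - \mu_Y}{\sigma_Y} \)
[Figure: standard normal curve with the area between \( Z_1 \) and \( Z_2 \) marked]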
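As a worked sketch of this calculation, the code below plugs in the weighted mean ($316.63) and weighted variance (29730.34) reported on the earlier slides. Whether the lecture used these particular estimates at this step is an assumption, and the answer depends on earnings being approximately normal.

```python
import math
from statistics import NormalDist

# Estimates taken from the earlier slides (assumed to be the inputs here).
mu = 316.63                   # weighted mean of weekly earnings
sigma = math.sqrt(29730.34)   # square root of the weighted variance

# Standardize both endpoints: Z = (Y - mu) / sigma.
z1 = (300 - mu) / sigma
z2 = (400 - mu) / sigma

Z = NormalDist()  # standard normal, N(0, 1)
prob = Z.cdf(z2) - Z.cdf(z1)

print(f"Z1 = {z1:.3f}, Z2 = {z2:.3f}")
print(f"P(300 <= Y <= 400) is approximately {prob:.3f}")
```

The same two lines of standardization also handle endpoints such as $253 and $316, which the banded PDF could not.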

25 What we’ve done
‘A priori’ classical probability
– There are a finite number of possible outcomes
– Flipping coins example
Empirical classical probability
– There are an infinite number of possible outcomes
– Difference between sample and population means
– Difference between sample and population expected values
– Difference in calculating PDFs of a univariate population
Use of the standard normal distribution

