Introduction to Econometrics: The Statistical Analysis of Economic (and Related) Data
2 What do economists study?
3 How do we answer these questions?
4 Review of Probability and Statistics (SW Chapters 2, 3)
5 The California Test Score Data Set
6 Initial look at the data (you should already know how to interpret this table). This table doesn’t tell us anything about the relationship between test scores and the student-teacher ratio (STR).
7 Do districts with smaller classes have higher test scores? [Figure: scatterplot of test score vs. student-teacher ratio] What does this figure show?
8 How do we answer this question with data?
9 Compare districts with “small” (STR < 20) and “large” (STR ≥ 20) class sizes:
1. Estimate the difference between group means
2. Test the hypothesis that the difference is 0
3. Construct a confidence interval for the difference
[Table: for each class size (small, large), the average score (Ȳ), the standard deviation (s_Y), and n]
10 1. Estimation
11 2. Hypothesis testing
12 Compute the difference-of-means t-statistic:
13 3. Confidence interval
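The three steps (estimation, hypothesis test, confidence interval) can be sketched in a few lines of Python. This is a minimal sketch and not the slides' own calculation. It assumes the usual large-sample formulas: the estimate is the difference in sample means, its standard error is √(s_a²/n_a + s_b²/n_b), and the 95% interval uses the normal critical value 1.96. The function name `difference_of_means` and the summary statistics passed to it are hypothetical; they are not the values from the California data table.

```python
import math

def difference_of_means(mean_a, sd_a, n_a, mean_b, sd_b, n_b, z_crit=1.96):
    """Return (estimate, standard error, t-statistic, CI) for mean_a - mean_b."""
    estimate = mean_a - mean_b
    # Standard error of a difference between two independent sample means.
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    t_stat = estimate / se  # tests the null hypothesis of zero difference
    ci = (estimate - z_crit * se, estimate + z_crit * se)
    return estimate, se, t_stat, ci

# Hypothetical summary statistics (NOT the numbers from the slide's table).
estimate, se, t_stat, ci = difference_of_means(660.0, 20.0, 200, 650.0, 18.0, 150)
print(f"estimate = {estimate:.2f}, SE = {se:.2f}, t = {t_stat:.2f}")
print(f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

With these made-up inputs, |t| exceeds 1.96, so the null of no difference would be rejected at the 5% level, and the 95% interval excludes zero.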
14 Review of Statistical Theory
15 (a) Population, random variable, and distribution
16 Population distribution of Y
17 (b) Characteristics (a.k.a. moments) of a population distribution
18 Flip a coin twice and let Y be the number of heads:
E(Y) = 0×(0.25) + 1×(0.50) + 2×(0.25) = 1
var(Y) = (0.25)×(0 − 1)² + (0.50)×(1 − 1)² + (0.25)×(2 − 1)² = 0.50
stdev(Y) = √0.50 ≈ 0.71
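The same moments can be checked numerically. The sketch below uses the probabilities 0.25, 0.50, 0.25 from the coin-flip example; the variable names are illustrative.

```python
import math

# Probability distribution of Y = number of heads in 2 fair coin flips.
pmf = {0: 0.25, 1: 0.50, 2: 0.25}

# Mean: probability-weighted average of the outcomes.
mean = sum(y * p for y, p in pmf.items())
# Variance: probability-weighted average squared deviation from the mean.
variance = sum(p * (y - mean) ** 2 for y, p in pmf.items())
std_dev = math.sqrt(variance)

print(mean, variance, round(std_dev, 4))  # prints: 1.0 0.5 0.7071
```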
21 Two random variables: joint distributions and covariance
22 Joint Probability
Example: the relationship between commute time and rain. Pr(X = x, Y = y) is the joint probability, where
X = 0 if raining, X = 1 otherwise
Y = 1 if commute time is short (< 20 minutes), Y = 0 if commute time is long (≥ 20 minutes)
Positive or negative relationship?
23 Conditional Probability
Conditional probability is the probability of one event given that another, related event has occurred. Conditional probabilities are written P(X | Y), read as “the probability of X given Y”, and calculated as: P(X | Y) = P(X, Y) / P(Y), provided P(Y) > 0.
24 Joint Independence
Two random variables X and Y are independently distributed if, for all values x and y, Pr(X = x, Y = y) = Pr(X = x) × Pr(Y = y), or equivalently Pr(Y = y | X = x) = Pr(Y = y).
1. Do these hold in the rain and commute example?
2. Pr(X = 1, Y = 1) = ?
3. E(X | Y = 1) = ?
4. Pr(X = 0 | Y = 0) = ?
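Questions of this kind can be answered mechanically once a joint probability table is available. The table is not reproduced in this transcript, so the probabilities below are assumed, illustrative values only, and the helper names are hypothetical. The sketch computes marginals, a conditional probability, a conditional mean, and checks the independence condition cell by cell.

```python
# ASSUMED joint probabilities Pr(X = x, Y = y); illustrative values only.
# X = 0 if raining, 1 otherwise; Y = 1 if the commute is short, 0 if long.
joint = {
    (0, 0): 0.15,  # rain, long commute
    (0, 1): 0.15,  # rain, short commute
    (1, 0): 0.07,  # no rain, long commute
    (1, 1): 0.63,  # no rain, short commute
}

def marginal_x(x):
    """Pr(X = x): sum the joint probabilities over all y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def marginal_y(y):
    """Pr(Y = y): sum the joint probabilities over all x."""
    return sum(p for (_, yi), p in joint.items() if yi == y)

def cond_x_given_y(x, y):
    """Pr(X = x | Y = y) = Pr(X = x, Y = y) / Pr(Y = y)."""
    return joint[(x, y)] / marginal_y(y)

# E(X | Y = 1): average of x weighted by the conditional probabilities.
expected_x_given_y1 = sum(x * cond_x_given_y(x, 1) for x in (0, 1))

# Independence requires joint = product of marginals in every cell.
independent = all(
    abs(p - marginal_x(x) * marginal_y(y)) < 1e-12 for (x, y), p in joint.items()
)

print(f"Pr(X=1, Y=1) = {joint[(1, 1)]:.2f}")
print(f"E(X | Y=1) = {expected_x_given_y1:.4f}")
print(f"Pr(X=0 | Y=0) = {cond_x_given_y(0, 0):.4f}")
print(f"independent: {independent}")
```

Under these assumed numbers the product of the marginals differs from the joint probability, so X and Y would not be independent.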
25 The correlation coefficient is defined in terms of the covariance:
26 The correlation coefficient measures linear association
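A small sketch can make the "linear association" point concrete. It uses the standard definition, correlation = covariance / (product of standard deviations), and two hypothetical joint distributions that are not taken from the slides: one where Y is an exact linear function of X, and one where Y = X² depends on X completely but not linearly.

```python
import math

def correlation(joint):
    """Correlation of (X, Y) given a dict {(x, y): probability}."""
    mean_x = sum(x * p for (x, _), p in joint.items())
    mean_y = sum(y * p for (_, y), p in joint.items())
    cov = sum((x - mean_x) * (y - mean_y) * p for (x, y), p in joint.items())
    var_x = sum((x - mean_x) ** 2 * p for (x, _), p in joint.items())
    var_y = sum((y - mean_y) ** 2 * p for (_, y), p in joint.items())
    return cov / math.sqrt(var_x * var_y)

# Hypothetical example 1: Y = 2X + 1, an exact linear relationship.
linear = {(0, 1): 0.5, (1, 3): 0.5}
corr_linear = correlation(linear)

# Hypothetical example 2: Y = X**2 with X uniform on {-1, 0, 1}.
nonlinear = {(-1, 1): 1 / 3, (0, 0): 1 / 3, (1, 1): 1 / 3}
corr_nonlinear = correlation(nonlinear)

print(f"linear: {corr_linear:.2f}, nonlinear: {corr_nonlinear:.2f}")
```

The second case shows that a correlation of zero does not imply independence: Y is fully determined by X, yet the linear association is nil.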
27 (c) Conditional distributions and conditional means
28 Conditional mean, ctd.
29 (d) Distribution of a sample of data drawn randomly from a population: Y₁, …, Yₙ
30 Distribution of Y₁, …, Yₙ under simple random sampling
32 (a) The sampling distribution of Ȳ
33 The sampling distribution of Ȳ, ctd.
34 The sampling distribution of Ȳ when Y is Bernoulli (p = 0.78):
35 Things we want to know about the sampling distribution:
36 The mean and variance of the sampling distribution of Ȳ
38 Mean and variance of sampling distribution of Ȳ, ctd.
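Under simple random sampling, the standard results are E(Ȳ) = μ_Y and var(Ȳ) = σ_Y²/n. The simulation below is a rough check of those two facts, reusing the Bernoulli example with p = 0.78. The sample size, number of replications, and seed are arbitrary assumptions chosen for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable
p, n, reps = 0.78, 25, 20000  # n and reps are arbitrary illustrative choices

def sample_mean(size):
    """Mean of `size` independent Bernoulli(p) draws."""
    return sum(rng.random() < p for _ in range(size)) / size

# Draw many samples and record the sample mean of each one.
means = [sample_mean(n) for _ in range(reps)]

sim_mean = statistics.fmean(means)
sim_var = statistics.pvariance(means)
theory_var = p * (1 - p) / n  # sigma^2 / n for a Bernoulli(p) population

print(f"mean of sample means: {sim_mean:.4f} (theory: {p})")
print(f"variance of sample means: {sim_var:.5f} (theory: {theory_var:.5f})")
```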
39 The sampling distribution of Ȳ when n is large
40 The Law of Large Numbers:
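The law of large numbers says that the sample mean gets close to the population mean with high probability as n grows. A quick, informal illustration, again assuming Bernoulli draws with p = 0.78 and an arbitrary seed and set of sample sizes:

```python
import random

rng = random.Random(1)  # arbitrary fixed seed for repeatability
p = 0.78

def sample_mean(size):
    """Mean of `size` independent Bernoulli(p) draws."""
    return sum(rng.random() < p for _ in range(size)) / size

# One sample mean per sample size; larger n should typically land closer to p.
results = {n: sample_mean(n) for n in (10, 1000, 100000)}
for n, ybar in results.items():
    print(f"n = {n:>6}: sample mean = {ybar:.4f}, |error| = {abs(ybar - p):.4f}")
```

A single run is only suggestive: the theorem is a statement about probabilities, so a small-n draw can occasionally land closer to p than a large-n draw.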
41 The Central Limit Theorem (CLT):
42 Sampling distribution of Ȳ when Y is Bernoulli, p = 0.78:
43 Same example: sampling distribution of the standardized Ȳ:
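The central limit theorem can be illustrated by standardizing Ȳ, that is, computing (Ȳ − μ_Y)/(σ_Y/√n), and checking that roughly 95% of the standardized values fall within ±1.96, as they would for a standard normal variable. This is a sketch with an assumed sample size, replication count, and seed, not a reproduction of the slide's figure.

```python
import math
import random

rng = random.Random(2)  # arbitrary fixed seed for repeatability
p, n, reps = 0.78, 100, 5000  # n and reps are illustrative assumptions
sd_ybar = math.sqrt(p * (1 - p) / n)  # standard deviation of the sample mean

def standardized_mean():
    """Draw one sample of size n and return (ybar - mu) / sd(ybar)."""
    ybar = sum(rng.random() < p for _ in range(n)) / n
    return (ybar - p) / sd_ybar

z_values = [standardized_mean() for _ in range(reps)]

# Under the normal approximation, about 95% of values lie within +/- 1.96.
share_within = sum(abs(z) <= 1.96 for z in z_values) / reps
print(f"share of standardized means within 1.96: {share_within:.3f}")
```

Because Ȳ is discrete here, the simulated share will not equal 0.95 exactly even with many replications; the approximation improves as n grows.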
44 Summary: The Sampling Distribution of Ȳ
45 (b) Why use Ȳ to estimate μ_Y?
47 Language of Hypothesis Testing
Test statistic = t-statistic
Significance level: the specified probability of a Type I error, denoted α
Critical value: the value of the test statistic for which the test just rejects the null at a given significance level
48 Language of Hypothesis Testing, ctd.
p-value: the probability of drawing a statistic (e.g. Ȳ) at least as adverse to the null hypothesis as the value computed with your data, assuming the null hypothesis is true. Equivalently, the smallest significance level at which you can reject the null hypothesis.
|Test statistic| > |critical value| → reject the null hypothesis
|Test statistic| < |critical value| → fail to reject the null hypothesis
49 Calculating the p-value with σ_Y known:
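When the population standard deviation is known and n is large, the usual approach is to standardize the sample mean under the null and read the two-sided p-value off the standard normal distribution: p-value = 2Φ(−|z|). The sketch below assumes that large-sample normal approximation. The function name and the input numbers are hypothetical.

```python
import math
from statistics import NormalDist

def two_sided_p_value(sample_mean, mu_null, sigma, n):
    """Two-sided p-value for H0: mean = mu_null, with sigma known."""
    # Standardize the sample mean under the null hypothesis.
    z = (sample_mean - mu_null) / (sigma / math.sqrt(n))
    # Probability of a standard normal draw at least as extreme as |z|.
    return 2 * NormalDist().cdf(-abs(z))

# Hypothetical inputs chosen so that z = 2.0.
p_value = two_sided_p_value(sample_mean=0.5, mu_null=0.0, sigma=2.0, n=64)
alpha = 0.05
print(f"p-value = {p_value:.4f}, reject at 5%: {p_value < alpha}")
```

Comparing the p-value with α gives the same decision as comparing |z| with the two-tail critical value for that α.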
50 Estimator of the variance of Y:
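The standard estimator is the sample variance, s_Y² = (1/(n − 1)) Σ (Yᵢ − Ȳ)², and the corresponding standard error of Ȳ is s_Y/√n. A minimal sketch, with a made-up data list, cross-checked against Python's built-in `statistics.variance`:

```python
import math
import statistics

def sample_variance(values):
    """Sample variance with the n - 1 divisor."""
    n = len(values)
    mean = sum(values) / n
    return sum((y - mean) ** 2 for y in values) / (n - 1)

# Hypothetical data, for illustration only.
data = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]

s2 = sample_variance(data)
standard_error = math.sqrt(s2 / len(data))  # estimated sd of the sample mean

print(f"s^2 = {s2:.2f}, SE of the mean = {standard_error:.4f}")
print(f"matches statistics.variance: {math.isclose(s2, statistics.variance(data))}")
```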
51 What is the link between the p-value and the significance level?
52 Common Critical Values
[Table: one-tail test with columns 1 − α, α, critical value; two-tail test with columns 1 − α, α/2, critical value]
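The table's entries did not survive in this transcript, but critical values of this kind can be regenerated. The sketch below assumes the critical values come from the standard normal distribution (the large-sample case); the slide may instead have used a t distribution. The helper names are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def one_tail_critical(alpha):
    """Value c with Pr(Z > c) = alpha."""
    return std_normal.inv_cdf(1 - alpha)

def two_tail_critical(alpha):
    """Value c with Pr(|Z| > c) = alpha, i.e. alpha/2 in each tail."""
    return std_normal.inv_cdf(1 - alpha / 2)

for alpha in (0.10, 0.05, 0.01):
    print(
        f"alpha = {alpha:.2f}: one-tail = {one_tail_critical(alpha):.3f}, "
        f"two-tail = {two_tail_critical(alpha):.3f}"
    )
```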