Econ 488 Lecture 2 Cameron Kaplan


Hypothesis Testing Suppose you want to test whether the average person receives a B or higher (3.0) in econometrics. The null hypothesis ($H_0$) is the one we are usually trying to reject: $H_0: \mu = 3.0$

Hypothesis Testing Alternative hypothesis ($H_A$ or $H_1$): the null hypothesis is not true. Either $H_A: \mu \neq 3.0$ (two-sided) or $H_A: \mu > 3.0$ (one-sided). Usually we pick the two-sided test unless we can rule out the possibility that $\mu < 3.0$.

Hypothesis Testing Suppose we sample 20 former econometrics students and find: sample mean = 3.30, standard deviation = 0.25. How likely is it that a sample of 20 would give a sample average of 3.30 if the population average was really 3.0?

Hypothesis Testing When we test a hypothesis about the population mean using the sample mean (x-bar) and an estimated standard error, we need to use the t-distribution.

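Hypothesis Testing Test statistic: $t = (\bar{x} - \mu_0) / (s / \sqrt{N})$, where $\bar{x}$ is the sample mean, $\mu_0$ is the mean under the null hypothesis, $s$ is the sample standard deviation, and $N$ is the sample size. Significance level: the most common choices are 5% or 1%.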

5% significance level If $\mu$ really was 3.0, what values of t would give us a test that would reject the null when it's correct only 5% of the time?

Hypothesis Testing We have a sample size of 20, so we have N - 1 = 20 - 1 = 19 degrees of freedom. Looking in a t-table for a two-sided test at the 5% level gives t* = 2.093. So if our value of t is greater than 2.093 OR less than -2.093, we should reject the null hypothesis.

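Hypothesis Testing $t = (3.30 - 3.0) / (0.25 / \sqrt{20}) \approx 5.366$, which is well beyond the critical value t*. So, we should reject the null.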
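The test above can be reproduced in a few lines. This is a minimal sketch using the numbers from the slides; the function name `t_statistic` is made up for illustration, and the critical value 2.093 is taken from a standard t-table (19 degrees of freedom, 5% two-sided) rather than computed.

```python
import math

def t_statistic(sample_mean, mu_0, sample_sd, n):
    """t = (x_bar - mu_0) / (s / sqrt(n))."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - mu_0) / standard_error

# Numbers from the slides: 20 students, mean 3.30, standard deviation 0.25.
t = t_statistic(sample_mean=3.30, mu_0=3.0, sample_sd=0.25, n=20)

# Two-sided 5% critical value for 19 degrees of freedom, read from a t-table.
T_CRITICAL = 2.093

# Two-sided rule: reject if t is above +t* or below -t*.
reject_null = abs(t) > T_CRITICAL
print(f"t = {t:.3f}, reject H0: {reject_null}")
```

Using `abs(t)` covers both rejection regions at once, which matches the "greater than OR less than" rule for a two-sided test.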

P-value Suppose we want to know: if the average student really got a 3.0, how likely would it be for us to observe a value at least as far from 3.0 as we did in our sample? In other words, if $\mu = 3.0$, how likely is it that a sample of 20 would give a sample mean of 3.3 or greater (or 2.7 or less)?

P-value We want to know the probability that |t| > 5.366. We can't look this up in most tables, but most stats software gives it to you. In this case, p ≈ 0.000034. In other words, if the null were true, we would only get a value that extreme about 0.0034% of the time (1 out of 29,000 times). This is strong evidence that we should reject the null.
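Stats software normally reports this probability directly. As a rough sketch of what happens underneath, the two-sided p-value can be approximated with only the Python standard library by numerically integrating the t density. The helper names `t_pdf` and `two_sided_p_value` and the choice of Simpson's rule are assumptions made for this illustration, not how any particular package does it.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p_value(t, df, steps=20_000):
    """P(|T| >= |t|), integrating the density from 0 to |t| with Simpson's rule."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 <= T <= |t|)
    # By symmetry, each tail has probability 0.5 - central_area.
    return 2 * (0.5 - central_area)

p = two_sided_p_value(5.366, df=19)
print(f"two-sided p-value: {p:.7f}")
print(f"roughly 1 in {round(1 / p):,}")
```

The same function gives a p-value of about 0.05 at the table's critical value for 19 degrees of freedom, which is a quick sanity check on the integration.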

P-value If the p-value is smaller than the significance level, reject the null. The p-value is nice because, if you are given it, you don't have to look anything else up in a table. Smaller p-values mean stronger evidence against the null hypothesis.

Bias A biased sample is a sample that differs systematically from the population.

Common Types of Bias Selection Bias: the sample systematically excludes or underrepresents certain groups, e.g. calculating the average height of US men using data from Medicare records. We are systematically excluding the young, who may be different for many reasons.
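A small simulation can make the Medicare example concrete. Everything in this sketch is hypothetical: the height rule in `simulate_height_cm`, the age range, and the age cutoff of 65 are invented purely to show how a sample restricted to one group can miss the population mean.

```python
import random
import statistics

random.seed(0)

def simulate_height_cm(age):
    """Hypothetical rule: older men are slightly shorter on average."""
    return 178 - 0.1 * (age - 20) + random.gauss(0, 7)

# Made-up population of adult men aged 20 to 90.
ages = [random.randint(20, 90) for _ in range(10_000)]
population = [(age, simulate_height_cm(age)) for age in ages]

all_heights = [height for _, height in population]
# Assumed for illustration: the records only cover men aged 65 and over.
medicare_heights = [height for age, height in population if age >= 65]

population_mean = statistics.mean(all_heights)
medicare_mean = statistics.mean(medicare_heights)
print(f"population mean:    {population_mean:.1f} cm")
print(f"Medicare-only mean: {medicare_mean:.1f} cm")
```

Drawing a larger Medicare-only sample would not help here, because the gap comes from who is in the sample rather than from how many people are in it.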

Common Types of Bias Self-Selection Bias/Non-Response Bias: bias that occurs when people choose whether to give certain information, e.g. ads to participate in medical studies, or calculating average CSUCI GPA by asking students to volunteer to let us look at their transcripts.

Common Types of Bias Survivor Bias Suppose we are looking at the historical average performance of companies on the NYSE, and wanted to know how that was related to CEO pay. One problem that we might have is that we might only look at companies that are still around. We are excluding companies that went out of business.

Review of Regression Regression - Attempt to explain movement in one variable as a function of a set of other variables Example: Are higher campaign expenditures related to more votes in an election?

Review of Regression Dependent Variable - Variable that is observed to change in response to the independent variable e.g. share of votes in the election Independent Variable(s) (AKA explanatory variable) - variables that are used to explain variation in dependent variable. e.g. campaign expenditures.

Review of Regression Example: Demand. Quantity is the dependent variable; price, income, price of complements, and price of substitutes are all independent variables.

Simple Regression $Y = \beta_0 + \beta_1 X$, where Y: dependent variable, X: independent variable, $\beta_0$: intercept (or constant), $\beta_1$: slope coefficient

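Simple Regression [Figure: plot of Y against X, labeled with $\beta_0$ and $\beta_1$]

$\beta_1$ is the response of Y to a one-unit increase in X: $\beta_1 = \Delta Y / \Delta X$. When we look at real data, the points aren't all on the line.

Simple Regression [Figure: plot of Y against X]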

How do we deal with this? By adding a stochastic error term to the equation: $Y = \beta_0 + \beta_1 X + \varepsilon$. Here $\beta_0 + \beta_1 X$ is the deterministic component and $\varepsilon$ is the stochastic component.

Simple Regression [Figure: plot of Y against X, labeled with $\beta_0 + \beta_1 X$ and $\varepsilon$]

Why do we need $\varepsilon$? 1. Omitted variables 2. Measurement error 3. The underlying relationship may have a different functional form 4. Human behavior is random

Notation There are really N equations because there are N observations: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ (i = 1, 2, …, N). E.g. $Y_1 = \beta_0 + \beta_1 X_1 + \varepsilon_1$, $Y_2 = \beta_0 + \beta_1 X_2 + \varepsilon_2$, …, $Y_N = \beta_0 + \beta_1 X_N + \varepsilon_N$
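The N-equation notation can be illustrated by generating data from a known line plus a random error and then fitting a line back to it. This is only a sketch: the "true" values `BETA_0` and `BETA_1`, the error distribution, and the helper `fit_line` are assumptions for illustration, and the fit uses the standard least-squares formulas, which this lecture has not yet introduced.

```python
import random

random.seed(1)

BETA_0, BETA_1 = 2.0, 0.5   # hypothetical "true" intercept and slope
N = 1_000

xs = [random.uniform(0, 10) for _ in range(N)]
# One equation per observation: each i gets its own draw of the error term.
ys = [BETA_0 + BETA_1 * x + random.gauss(0, 1) for x in xs]

def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

b0_hat, b1_hat = fit_line(xs, ys)
# Residuals are the estimated counterparts of the unobserved error terms.
residuals = [yi - (b0_hat + b1_hat * xi) for xi, yi in zip(xs, ys)]
print(f"estimated intercept {b0_hat:.2f}, estimated slope {b1_hat:.2f}")
```

The estimates land close to, but not exactly on, the values used to generate the data, because the points do not all sit on the line.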

Multiple Regression We can have more than one independent variable: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \varepsilon_i$. What does $\beta_1$ mean? It is the impact of a one-unit increase in $X_1$ on the dependent variable (Y), holding $X_2$ and $X_3$ constant.
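The "holding the others constant" reading of a coefficient can be checked with simple arithmetic. In this sketch the coefficient values in `betas` and the function `predict` are hypothetical, chosen only to show the mechanics.

```python
def predict(x1, x2, x3, betas):
    """Deterministic part of Y = b0 + b1*X1 + b2*X2 + b3*X3."""
    b0, b1, b2, b3 = betas
    return b0 + b1 * x1 + b2 * x2 + b3 * x3

betas = (1.0, 0.5, -2.0, 3.0)  # hypothetical coefficients

baseline = predict(x1=4.0, x2=1.0, x3=2.0, betas=betas)
# Raise X1 by one unit while holding X2 and X3 at the same values.
shifted = predict(x1=5.0, x2=1.0, x3=2.0, betas=betas)

effect_of_x1 = shifted - baseline
print(f"change in Y: {effect_of_x1}, beta_1: {betas[1]}")
```

Whatever values X2 and X3 are held at, the difference equals the coefficient on X1, because their terms cancel in the subtraction.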

Steps in Empirical Economic Analysis 1. Specify an economic model. 2. Specify an econometric model. 3. Gather data. 4. Analyze data according to econometric model. 5. Draw conclusions about your economic model.

Step 1: Specify an Economic Model Example: An Economic Model of Crime (Gary Becker). Crimes have clear economic rewards (think of a thief), but most criminal behavior has economic costs. The opportunity cost of crime prevents the criminal from participating in other activities such as legal employment. In addition, there are costs associated with the possibility of being caught, and then, if convicted, there are costs associated with being incarcerated.

Economic Model of Crime $y = f(x_1, x_2, x_3, x_4, x_5, x_6, x_7)$, where $y$ = hours spent in criminal activity; $x_1$ = "wage" for an hour spent in criminal activity; $x_2$ = hourly wage in legal employment; $x_3$ = income from sources other than crime/employment; $x_4$ = probability of getting caught; $x_5$ = probability of being convicted if caught; $x_6$ = expected sentence if convicted; $x_7$ = age

Economic Model of Education What is the effect of education on wages? wage=f(educ,exper,tenure) educ=years of education exper=years of workforce experience tenure=years at current job

Step 2: Specify an econometric model In the crime example, we can't reasonably observe all of the variables, e.g. the "wage" someone gets as a criminal, or even the probability of being arrested. We need to specify an econometric model based on observable factors.

Econometric Model of Crime $\text{crime}_i = \beta_0 + \beta_1 \text{wage}_i + \beta_2 \text{othinc}_i + \beta_3 \text{freqarr}_i + \beta_4 \text{freqconv}_i + \beta_5 \text{avgsen}_i + \beta_6 \text{age}_i + \varepsilon_i$, where crime = some measure of frequency of criminal activity; wage = wage earned in legal employment; othinc = income earned from other sources; freqarr = frequency of arrests for prior infractions

Econometric Model of Crime $\text{crime}_i = \beta_0 + \beta_1 \text{wage}_i + \beta_2 \text{othinc}_i + \beta_3 \text{freqarr}_i + \beta_4 \text{freqconv}_i + \beta_5 \text{avgsen}_i + \beta_6 \text{age}_i + \varepsilon_i$, where freqconv = frequency of convictions; avgsen = average length of sentence; age = age in years; $\varepsilon$ = stochastic error term

Econometric Model of Crime The stochastic error term contains all of the unobserved factors, e.g. wage for criminal activity, probability of arrest, etc. We could add variables for family background, parental education, etc., but we will never get rid of $\varepsilon$.

Wage and Education $\text{wage}_i = \beta_0 + \beta_1 \text{educ}_i + \beta_2 \text{exper}_i + \beta_3 \text{tenure}_i + \varepsilon_i$. What are the signs of the betas? Run the regression in Gretl! (wage1.gdt)

Step 3: Gathering Data Types of data: cross-sectional data, time series data, pooled cross sections, panel/longitudinal data

Cross-Sectional Data A sample of individuals, households, firms, cities, states, or other units, taken at a given point in time. Random sampling. Mostly used in applied microeconomics. Examples: General Social Survey, US Census, most other surveys

Cross-Sectional Data [Table with columns: obs, wage, educ, exper, female, married]

Time Series Data Observations on a variable or several variables over time, e.g. stock prices, money supply, CPI, GDP, annual homicide rates, etc. Because past events can influence future events, and lags in behavior are common in economics, time is an important dimension of time-series data.

Time Series Data More difficult to analyze than cross-sectional data. Observations across time are not independent. May also have to control for seasonality.

Time Series Data [Table with columns: obs, year, avgmin, avgcov, unemp, gnp]
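Because lags matter in time series, a common first step is to line each observation up with the previous period's value. The sketch below uses made-up numbers, and the helper `lag` is a hypothetical name for illustration.

```python
def lag(series, periods=1):
    """Shift a series forward in time; the earliest periods have no lagged value."""
    return [None] * periods + series[:len(series) - periods]

periods_observed = [1, 2, 3, 4, 5]
unemp = [5.0, 5.4, 6.1, 5.8, 5.2]   # made-up values for illustration

unemp_lag1 = lag(unemp)
rows = list(zip(periods_observed, unemp, unemp_lag1))
# The first period is lost because nothing precedes it.
usable = [row for row in rows if row[2] is not None]

for period, current, previous in rows:
    print(period, current, previous)
```

Unlike a random cross-sectional sample, the row order here carries information, so the observations cannot be shuffled before creating the lag.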

Pooled Cross-Sections Have both time series and cross-sectional features. Suppose we collect data on households in 1985 and 1990. We can combine both of these into one data set by creating a pooled cross-section. Good if there is a policy change between years. Need to control for time in the analysis.

Pooled Cross-Sections [Table with columns: obs, year, hprice, proptax]
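Pooling amounts to stacking the two samples and recording which year each row came from. In this sketch the house prices and tax values are invented, and the function `pool` and the dummy column name `y90` are hypothetical choices for illustration.

```python
# Two hypothetical cross sections; different households are sampled each year.
sample_1985 = [{"hprice": 100_000, "proptax": 40}, {"hprice": 80_000, "proptax": 35}]
sample_1990 = [{"hprice": 120_000, "proptax": 30}, {"hprice": 90_000, "proptax": 25}]

def pool(samples_by_year, dummy_year):
    """Stack cross sections, adding a year column and a 0/1 dummy for dummy_year."""
    pooled = []
    for year, sample in samples_by_year.items():
        for row in sample:
            pooled.append({**row, "year": year, "y90": int(year == dummy_year)})
    return pooled

pooled = pool({1985: sample_1985, 1990: sample_1990}, dummy_year=1990)
for obs, row in enumerate(pooled, start=1):
    print(obs, row)
```

The year dummy is one simple way to control for time in the analysis: it lets the later year have its own intercept.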

Panel/Longitudinal Data A panel data set consists of a time series for each cross-sectional member E.g. select a random sample of 500 people, and follow each for 10 years.

Panel Data [Table with columns: obs, personid, year, wage, dinout]
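What distinguishes a panel from a pooled cross-section is that the same units reappear in every period, so each one can be followed over time. The sketch below uses a tiny invented data set in long format (one row per person and year); the names and values are hypothetical.

```python
from collections import defaultdict

# Hypothetical long-format panel: one row per (person, year).
panel = [
    {"personid": 1, "year": 2000, "wage": 10.0},
    {"personid": 1, "year": 2001, "wage": 11.0},
    {"personid": 2, "year": 2000, "wage": 15.0},
    {"personid": 2, "year": 2001, "wage": 15.5},
]

# Group rows into one short time series per person.
by_person = defaultdict(list)
for row in panel:
    by_person[row["personid"]].append(row)

# A panel is balanced when every person is observed in the same years.
years_per_person = {pid: tuple(sorted(r["year"] for r in rows))
                    for pid, rows in by_person.items()}
is_balanced = len(set(years_per_person.values())) == 1

# Following the same person lets us compute within-person changes.
wage_change = {}
for pid, rows in by_person.items():
    ordered = sorted(rows, key=lambda r: r["year"])
    wage_change[pid] = ordered[-1]["wage"] - ordered[0]["wage"]

print("balanced:", is_balanced)
print("wage change by person:", wage_change)
```

A within-person change like this cannot be computed from a pooled cross-section, where different households are sampled in each year.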

Causality & Ceteris Paribus What we really want to know is: does the independent variable have a causal effect on the dependent variable? But correlation does not imply causation. Suppose we want to know if higher education leads to higher worker productivity.

Causality and Ceteris Paribus If we find a relationship between education and wages, we don't know much. Why? What if highly educated people have higher IQs, and it's really high IQ that leads to higher wages? If you give a random person more education, will they get higher wages?

Causality and Ceteris Paribus What we want to know is: does higher education lead to higher wages ceteris paribus, i.e. holding all else constant? We have to control for IQ, experience, gender, job training, etc. But we can't control for everything!
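The IQ story can be illustrated with a simulation in which IQ raises both education and wages. All numbers below are invented, including the assumed "true" effect of education. The controlled estimate is obtained by first removing the part of each variable explained by IQ and then relating what is left over, which is one standard way to get the education coefficient from a regression that also includes IQ.

```python
import random

random.seed(42)
N = 5_000
TRUE_EDUC_EFFECT = 1.0   # hypothetical causal effect of one more year of education

iq = [random.gauss(0, 1) for _ in range(N)]
# Assumed for illustration: higher IQ leads to more schooling and to higher wages.
educ = [12 + 2 * q + random.gauss(0, 1) for q in iq]
wage = [TRUE_EDUC_EFFECT * e + 3 * q + random.gauss(0, 1) for e, q in zip(educ, iq)]

def slope(x, y):
    """Least-squares slope from regressing y on x."""
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx

def residualize(y, x):
    """Remove the part of y that is linearly explained by x."""
    b1 = slope(x, y)
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    return [(b - y_bar) - b1 * (a - x_bar) for a, b in zip(x, y)]

# Ignoring IQ: the education slope also picks up the effect of IQ.
naive = slope(educ, wage)
# Holding IQ constant: relate the parts of wage and educ not explained by IQ.
controlled = slope(residualize(educ, iq), residualize(wage, iq))

print(f"slope ignoring IQ:    {naive:.2f}")
print(f"slope controlling IQ: {controlled:.2f}")
```

In this made-up setup the slope that ignores IQ overstates the effect of education, while the controlled slope comes out near the assumed true value. With real data the control is only possible for factors we can actually observe.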