Causality and Identification in Structural Econometrics Causality and Probability in the Sciences University of Kent, Canterbury September 10 2008 Damien.

Presentation transcript:

Causality and Identification in Structural Econometrics
Causality and Probability in the Sciences, University of Kent, Canterbury, September 10, 2008
Damien Fennell, CPNSS, London School of Economics

Introducing Econometrics
What is econometrics? Very crudely, it is a sub-discipline of economics that attempts to use observational economic data to predict, to measure and to identify economic structures.
Why is it interesting?
(1) The methodology of structural econometrics is philosophically rich; it offers a particular approach to structural equation modelling.
(2) Econometric analysis influences policy, which in turn affects everyone.

Simultaneous Equation Models
Economics makes widespread use of simultaneous equation models to model equilibrium relations, e.g. in supply and demand models. As a result, conventional econometrics incorporated simultaneous equation models into its structural models. So, in contrast to many other forms of structural equation modelling, econometrics relies centrally on non-recursive models. This raises distinctive methodological issues. I focus on just one of these in this talk: identification.

Avoiding causes
In the early days of econometrics (the 1930s and 1940s) its founders, e.g. Frisch, Haavelmo, Koopmans and Simon, wrote in depth on how structural equation models should be interpreted. However, as Kevin Hoover notes (2001), under the influence of the then dominant logical positivism, causal talk was frowned upon, and econometrics developed structural modelling methods that avoided causal terminology. Hoover proposes two reasons for this:
1. Wold lost the debate with Haavelmo over restricting models to those that represent causal chains.
2. ‘Simon [1952] showed … that a linear system was identified if and only if it was causally ordered’
→ I focus on the second reason here: does Simon’s (1952) analysis do this, i.e. license dropping causal talk?

Identification and causal order
Herbert Simon’s paper does claim an important equivalence between identification and causal order. But does it license a focus on identifiability rather than on causal order? I also ask: what does identifiability require of causal order?

Background: Simon’s causal order in brief
Herbert Simon defines a causal order for a linear set of equations.
STEP 1: Formal definition. The causal order is the order in which we solve for the variables in a set of equations, when we solve the equations using the fewest equations.
STEP 2: Causal semantics. An intervention-based interpretation of structural equations.

Simon’s causal order: example
Consider the following simple abstract supply and demand model.
x_1 = a + u_1 … (det1)
x_2 = b + u_2 … (det2)
q = c + dp + dx_1 + u_3 … (supply)
q = e + fp + gx_2 + u_4 … (demand)
x_1: a determinant of supply; x_2: a determinant of demand; p: equilibrium price; q: equilibrium quantity; u_1, ..., u_4: errors; a, b, c, ..., g: parameters.
Solving for the variables block-recursively gives the causal order:
Variable ordering: {x_1}, {x_2} → {p, q}
Equation ordering: {det1}, {det2} → {supply, demand}
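The block-recursive solution order in this example can be reproduced mechanically. The following is a minimal sketch, not Simon's own procedure: it assumes each equation is represented only by the set of variables it contains, and it searches by brute force for minimal subsets of equations that contain exactly as many unsolved variables as equations. The function name `causal_order` and the dictionary encoding are illustrative choices, not part of the talk.

```python
from itertools import combinations

def causal_order(equations):
    """Return levels of (equation block, variables solved) in solution order.

    equations: dict mapping an equation name to the set of variables in it.
    Assumes the system is self-contained (as many equations as variables).
    """
    remaining = {name: set(vs) for name, vs in equations.items()}
    solved = set()
    order = []
    while remaining:
        blocks = []  # minimal self-contained subsets found at this level
        for size in range(1, len(remaining) + 1):
            for names in combinations(sorted(remaining), size):
                chosen = set(names)
                # Skip subsets that contain a smaller block: they are not minimal.
                if any(eqs <= chosen for eqs, _ in blocks):
                    continue
                unknowns = set().union(*(remaining[n] for n in names)) - solved
                if len(unknowns) == size:
                    blocks.append((chosen, unknowns))
        if not blocks:
            raise ValueError("no self-contained subset of equations found")
        order.append(blocks)
        for eqs, unknowns in blocks:
            solved |= unknowns
            for n in eqs:
                remaining.pop(n, None)
    return order

# The abstract supply and demand model from the slide, as variable sets.
model = {
    "det1": {"x1"},
    "det2": {"x2"},
    "supply": {"q", "p", "x1"},
    "demand": {"q", "p", "x2"},
}
order = causal_order(model)
for level, blocks in enumerate(order):
    for eqs, variables in blocks:
        print(level, sorted(eqs), "->", sorted(variables))
```

On this model the first level contains the blocks {det1} and {det2}, and the second level contains the block {supply, demand}, which jointly determines p and q. The exhaustive search is exponential in the number of equations, so it is only suitable for small illustrative systems.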

Simon’s semantics in brief
To give substance to this formal ordering, Simon asks us to imagine an experimenter (who can also be nature) that can control or change certain factors (philosophical position → manipulability view, cf. Woodward). We are to interpret the structural equations as follows.
Formal term: Interpretation
equation: mechanism
parameter: directly controllable factor
variable: indirectly controllable factor
error term: omitted factors treated ‘as if’ a directly controllable factor

More semantics
Interpretation of a minimal set of equations: it denotes a set of mechanisms that together (just) jointly determine the values of a set of indirectly controllable factors (denoted by the minimal set of variables), given the directly controlled factors (parameters) and the previously determined (causally precedent) indirectly controllable factors (other previously solved variables in the equations).
The causal order: x causally precedes y iff x must be determined for y to be determined and there is a chain of mechanisms from x to y (represented by the series of equations used when solving for y using x).

The Identification problem
In econometrics one assumes a general structural model relating variables (some observable) and attempts to infer the values of the structural parameters by regression on the data for the observable variables. However, if the structural model is too general, then the data may not be sufficient to infer a unique structure → the identification problem.

Identification problem in simultaneous equation models
Assume that the form of the equations correctly represents the ‘true’ causal model and that all variables are observable, but that the parameters are unknown. How can they be inferred from data? This is not always possible.
Simple supply-demand model (cf. the supply-demand situation faced by early econometricians, Morgan 1990).
[Figure 1: The identification problem: causes change in both mechanisms. Axes: price p, quantity q. Shown: curves Demand1, Demand2, Supply1, Supply2, the observations (q_1, p_1) and (q_2, p_2), and a mistaken regression line.]

Simon: Identification and Causal Order
In his paper, Simon proves the following: ‘In order to permit the determination of coefficients of an identifiable complete subset of equations we need to relax at most the equations that are [causally] precedent to this subset’ (p. 33)
Interpreted using his semantics this means: an experimenter can infer the structural parameters in an identifiable complete subset of equations by ‘relaxing’ the causally precedent mechanisms to create observations constrained only by the mechanisms in the set to be identified.

Example 1: Identifiable supply and demand model
[Figure 2: No identification problem: income changes for the demand mechanism. Axes: price p, quantity q. Shown: curves Demand1 and Demand2, the observations (q_1, p_1) and (q_2, p_2), and Supply1, which coincides with the regression line.]
Causal order: {i} → {p, q}; {income} → {supply, demand}
The experimenter varies income to ‘relax’ the demand equation. The changes in income cause changes in the demand mechanism. The resulting observations are constrained only by the supply mechanism, allowing it to be identified.

The ‘experiment’ to identify supply
Causal order: {i} → {p, q}; {income} → {supply, demand}
[Figure: axes price p and quantity q. Shown: the demand curve and the supply curve, which coincides with the regression line.]

Example 2: Unidentifiable Supply and Demand
[Figure 1: The identification problem: change in both mechanisms. Axes: price p, quantity q. Shown: demand and supply curves and a mistaken regression line.]
Problem: here varying income causes shifts in both mechanisms. There is no cause that allows one to relax one mechanism but not the other.
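The contrast between Example 1 and Example 2 can be illustrated with a small simulation. This is a sketch under stated assumptions, not the talk's own model: the linear forms, the parameter values (c, d, e, f, g, h) and the noise-free setup are all hypothetical choices made so that the arithmetic is easy to follow. Supply is taken as q = c + d·p + (supply shift) and demand as q = e + f·p + (demand shift).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical parameter values, chosen only for illustration.
c, d = 1.0, 1.0      # supply intercept and slope
e, f = 10.0, -1.0    # demand intercept and slope
g, h = 2.0, 1.0      # effect of income on demand and (in Example 2) on supply

def equilibrium(supply_shift, demand_shift):
    """Solve q = c + d*p + supply_shift and q = e + f*p + demand_shift."""
    p = (e - c + demand_shift - supply_shift) / (d - f)
    q = c + d * p + supply_shift
    return p, q

def ols_slope(p, q):
    """Slope of the least-squares line of q on p."""
    return np.polyfit(p, q, 1)[0]

income = rng.normal(size=n)

# Example 1: income moves only the demand mechanism.
p1, q1 = equilibrium(np.zeros(n), g * income)
slope_income_shifts_demand = ols_slope(p1, q1)

# Example 2: income moves both mechanisms.
p2, q2 = equilibrium(h * income, g * income)
slope_income_shifts_both = ols_slope(p2, q2)

print("supply slope d:", d, "demand slope f:", f)
print("Example 1 regression slope:", slope_income_shifts_demand)
print("Example 2 regression slope:", slope_income_shifts_both)
```

With these values the first regression recovers the supply slope d, because every observation lies on the unchanged supply line. In the second case the observations lie on a line with slope (g + h) / (g - h), which is neither the supply slope nor the demand slope: the "mistaken regression line" of the figure.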

An extension
In earlier work I prove: a mechanism is identifiable if and only if, for any two factors in that mechanism, x and y, x and y can be varied while holding all other factors in the mechanism fixed. In turn this holds if and only if x or y has a cause, z, that is a directly controllable factor such that either:
1. z does not cause any other factor in the mechanism, OR
2. z does cause some of the other factors in the mechanism, but it is possible to change it so as to vary x and/or y while not changing any other factor in that mechanism (this may require changing some other directly controllable factors to cancel out the impact of z on the other factors in the mechanism).

What this means
This shows that identifiability puts limits on how causally ‘connected’ the system is. In the examples above:
In the first income example, the supply equation is identifiable because income only impacts demand.
In the second case the income factor is too ‘connected’: it impacts both mechanisms. As a result it is not possible to vary quantity and price as the only two factors in either the demand or the supply mechanism, so neither can be identified.

Return to my earlier question
I have attempted to clarify what identifiability requires of a set of structural equations (interpreted using Simon’s semantics). Does this rationalise (following Hoover) econometricians focusing on identification in place of causal order? Nothing in the analysis of identification above suggests this. Simon’s identification result does not show the equivalence of identification and causal order.
Unsurprising perhaps! It is easy to see the non-equivalence of the concepts.
NOT NECESSARY: causal order is intelligible in non-identifiable systems.
NOT SUFFICIENT: one can have spurious but identifiable equations.

Another perspective: identification in its conventional form
The rank condition: given a system of m linear equations in m endogenous variables and n exogenous variables in which all variables are observable, a necessary and sufficient condition for the coefficients of an equation to be identified is that the submatrix formed from the columns of the coefficients of the variables (endogenous and exogenous) omitted from that equation has rank m-1.
Satisfying the rank condition depends on excluding variables from equations, but on what basis? Causal content is needed here to justify the exclusions. Restriction to the mathematics obscures this. Identification isn’t enough!
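The rank condition as stated above can be checked numerically. The sketch below is illustrative only: the function name `rank_condition`, the column ordering (q, p, income) and the coefficient values are assumptions, and a variable is treated as omitted from an equation when its coefficient is exactly zero. Each row holds the coefficients of one equation after moving all variables to one side.

```python
import numpy as np

def rank_condition(coeffs, eq):
    """Check the rank condition for equation `eq` (a row index).

    coeffs: m x (m + n) array of coefficients on all variables,
            one row per equation. A zero entry means the variable
            is omitted from that equation.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    m = coeffs.shape[0]
    excluded = np.flatnonzero(coeffs[eq] == 0)
    if excluded.size == 0:
        # No exclusions: the submatrix is empty and has rank 0.
        return m - 1 == 0
    return bool(np.linalg.matrix_rank(coeffs[:, excluded]) == m - 1)

# Columns: q, p, income. Hypothetical coefficient values.
# Example 1: income enters demand only.
example_1 = [
    [1.0, -1.0, 0.0],   # supply:  q - p = ...
    [1.0, 1.0, -2.0],   # demand:  q + p - 2*income = ...
]
# Example 2: income enters both equations.
example_2 = [
    [1.0, -1.0, -1.0],  # supply
    [1.0, 1.0, -2.0],   # demand
]

supply_1 = rank_condition(example_1, 0)
demand_1 = rank_condition(example_1, 1)
supply_2 = rank_condition(example_2, 0)
demand_2 = rank_condition(example_2, 1)
print(supply_1, demand_1, supply_2, demand_2)
```

In the first system the supply equation satisfies the condition because income is excluded from it and enters demand, while the demand equation excludes nothing and so fails. In the second system neither equation excludes any variable, so neither satisfies the condition. The zeros in the matrix are exactly the exclusion restrictions whose justification, on the argument of this slide, has to come from causal knowledge rather than from the mathematics.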

So what work does identification do?
Identification in structural equations permits one to infer certain structural parameters. How? Because when we assert identifiability, we assert strong constraints on the causal structure generating the observations. These strong constraints allow us to ‘reconstruct’ the impact of the different variables on each other even when we cannot perform experiments on them. It is a condition on the causal structure that permits inference. BUT this does not license replacing causal order with it!