CREST-ENSAE Mini-course: Microeconometrics of Modeling Labor Markets Using Linked Employer-Employee Data
John M. Abowd
Portions of today's lecture are the work of Kevin McKinney (U.S. Census Bureau) and Ian Schmutte (University of Georgia)
June 3, 2013

Topics
– May 30: Basics of analyzing complex linked data
– June 3: Basics of graph theory with applications to labor markets
– June 6: Matching and sorting models
– June 10: Endogenous mobility models
– Online course materials

Lecture 2
– Basic graph theory for linked employer-employee data models
– Estimation by fixed- and mixed-effects methods
– More basic graph theory
– Sampling employer-employee graphs
– Modularity and community models for employer-employee graphs

BASIC GRAPH THEORY

Graph Basics

Graphs
Fully connected graph (network)

The Bipartite Labor Market Graph

Realized Employment Network

Adjacency Matrices

STATISTICAL MODELING: A FIRST EXAMPLE BASED ON AKM

Basic Statistical Model
The dependent variable is compensation. The function J(i,t) indicates the employer of i at date t.
– The first component is the person effect.
– The second component is the firm effect.
– The third component is the measured characteristics effect.
– The fourth component is the statistical residual, orthogonal to all other effects in the model.
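The equation itself is not reproduced in the transcript; in the notation of Abowd, Kramarz and Margolis (1999), the model described component by component above can be written as

\[ y_{it} = \theta_i + \psi_{J(i,t)} + x_{it}\beta + \varepsilon_{it}, \]

where \theta_i is the person effect, \psi_{J(i,t)} the firm effect, x_{it}\beta the measured characteristics effect, and \varepsilon_{it} the statistical residual.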

Matrix Notation: Basic Model
– All vectors/matrices have row dimensionality equal to the total number of observations.
– Data are sorted by person ID and ordered chronologically within person.
– D is the design matrix for the person effect: columns equal to the number of unique person IDs.
– F is the design matrix for the firm effect: columns equal to the number of unique firm IDs times the number of effects per firm.
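With these definitions the basic model has the compact matrix form

\[ y = D\theta + F\psi + X\beta + \varepsilon, \]

where y stacks the compensation observations and \varepsilon the residuals.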

True Industry Effect Model
The function K(j) indicates the industry of firm j.
– The first component is the person effect.
– The second component is the firm effect net of the true industry effect.
– The third component is the true industry effect, an aggregation of firm effects since industry is a property of the employer.
– The fourth component is the effect of personal characteristics.
– The fifth component is the statistical residual.
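The decomposition just described can be written, in the same notation, as

\[ y_{it} = \theta_i + \bigl(\psi_{J(i,t)} - \kappa_{K(J(i,t))}\bigr) + \kappa_{K(J(i,t))} + x_{it}\beta + \varepsilon_{it}, \]

where \kappa_k denotes the true industry effect for industry k.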

Matrix Notation: True Industry Effect Model
– The matrix A is the classification matrix taking firms into industries.
– The matrix FA is the design matrix for the true industry effect.
– The true industry effect κ can be expressed as an aggregation of the firm effects within each industry, as shown below.
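In matrix form the model becomes y = Dθ + F(ψ − Aκ) + FAκ + Xβ + ε, and one standard expression for the true industry effect, following Abowd, Kramarz and Margolis (1999), is the employment-weighted average of the firm effects within each industry:

\[ \kappa = (A'F'FA)^{-1}A'F'F\psi. \]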

Raw Industry Effect Model
– The first component is the raw industry effect.
– The second component is the measured personal characteristics effect.
– The third component is the statistical residual.
– The raw industry effect is an aggregation of the appropriately weighted average person and average firm effects within the industry, since both have been excluded from the model.
– The true industry effect is only an aggregation of the appropriately weighted average firm effect within the industry, as shown above.
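Using the notation of the later slides (κ** for the raw industry effect), one way to write the raw model described above is

\[ y = FA\kappa^{**} + X\beta^{**} + \varepsilon^{**}, \]

i.e., compensation is regressed on industry indicators and measured personal characteristics only, with the person and firm effects excluded.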

Industry Effects Adjusted for Person Effects Model
– The first component is the industry effect adjusted for person effects.
– The second component is the individual effect (with firm effects omitted).
– The third component is the measured personal characteristics effect.
– The fourth component is the statistical residual.
– The industry effects adjusted for person effects are also biased.

Relation: True and Raw Industry Effects
– The vector κ** of raw industry effects can be expressed as the true industry effect κ plus a bias that depends upon both the person and firm effects.
– The matrix M is the residual matrix (column null space) after projection onto the column space of the matrix in the subscript, as defined below.
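The residual-maker matrix referred to here has the standard form, for a generic matrix Z appearing in the subscript,

\[ M_Z = I - Z(Z'Z)^{-}Z'; \]

for example, M_X = I − X(X'X)⁻X' projects off the column space of the measured characteristics X.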

Relation: Industry, Person and Firm Effects
– The vector κ** of raw industry effects can be expressed as a matrix-weighted average of the person effects θ and the firm effects ψ.
– The matrix weights are related to the personal characteristics X and the design matrices for the person and firm effects (see Abowd, Kramarz and Margolis, 1999).

IDENTIFICATION

Estimation by Fixed-Effect Methods
– The normal equations for least squares estimation of fixed person, firm and characteristic effects are of very high dimension.
– Estimation of the full model by either fixed-effect or mixed-effect methods requires special algorithms to deal with the high dimensionality of the problem.

Least Squares Normal Equations
The full least squares solution to the basic estimation problem solves these normal equations for all identified effects.
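For the basic model y = Dθ + Fψ + Xβ + ε, the normal equations take the form

\[
\begin{pmatrix} X'X & X'D & X'F \\ D'X & D'D & D'F \\ F'X & F'D & F'F \end{pmatrix}
\begin{pmatrix} \beta \\ \theta \\ \psi \end{pmatrix}
=
\begin{pmatrix} X'y \\ D'y \\ F'y \end{pmatrix}.
\]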

Identification of Effects
– Use of the decomposition formula for the industry (or firm-size) effect requires a solution for the identified person, firm and characteristic effects.
– The usual technique of eliminating singular row/column combinations from the normal equations won't work if the least squares problem is solved directly.

Identification by Finding Connected Sub-graphs (Depth-first)
– Firm 1 is in group g = 1. Repeat until no more persons or firms are added:
  – Add all persons employed by a firm in group 1 to group 1.
  – Add all firms that have employed a person in group 1 to group 1.
– For g = 2, ..., repeat until no firms remain:
  – The first firm not assigned to a group is in group g.
  – Repeat until no more firms or persons are added to group g:
    – Add all persons employed by a firm in group g to group g.
    – Add all firms that have employed a person in group g to group g.
– Each group g is a connected sub-graph.
– Identification of ψ: drop one firm effect from each group g.
– Identification of θ: impose one linear restriction.
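A minimal sketch of this grouping step in Python, assuming the job-level data are available as (person_id, firm_id) pairs; it illustrates the connected sub-graph logic with a union-find structure rather than the literal depth-first sweep, and is not the production code used for LEHD data.

def connected_groups(jobs):
    # Assign each firm (and person) of the bipartite person-firm graph
    # to a connected group.  `jobs` is an iterable of (person_id, firm_id).
    parent = {}

    def find(x):
        # Path-compressing find for the union-find structure.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    for person, firm in jobs:
        union(("p", person), ("f", firm))   # each job links a person and a firm

    group_of_root = {}
    firm_group = {}
    for node in parent:                     # label the components g = 1, 2, ...
        root = find(node)
        g = group_of_root.setdefault(root, len(group_of_root) + 1)
        if node[0] == "f":
            firm_group[node[1]] = g
    return firm_group

# Example: firms A and B are connected through person 1; firm C is separate.
print(connected_groups([(1, "A"), (1, "B"), (2, "B"), (3, "C")]))
# {'A': 1, 'B': 1, 'C': 2}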

Connected Sub-graphs of the Labor Market

Normal Equations after Group Blocking
– The normal equations have a sub-matrix with block diagonal components.
– This matrix is of full rank and the solution for the identified effects is unique.

Necessity of Identification Conditions
– For necessity, we want to show that exactly N + J - G person and firm effects are identified (estimable), including the grand mean μ_y.
– Because X and y are expressed in deviations from the mean, all N person effects are included in the equation, but one is redundant because both sides of the equation have a zero mean by construction.
– So the grand mean plus the person effects constitute N effects.
– There are at most N + J - 1 person and firm effects including the grand mean.
– The grouping conditions imply that at most G group means are identified (or, the grand mean plus G - 1 group deviations).
– Within each group g, at most N_g and J_g - 1 person and firm effects are identified.
– Thus the maximum number of identifiable person and firm effects is given by the counting formula below.
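Summing the within-group counts gives

\[ \sum_{g=1}^{G}\left(N_g + J_g - 1\right) = N + J - G. \]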

Sufficiency of Identification Conditions
– For sufficiency, we use an induction proof.
– Consider an economy with J firms and N workers.
– Denote by E[y_it] the projection of worker i's wage at date t on the column space generated by the person and firm identifiers. For simplicity, suppress the effects of observable variables X.
– If the firms are connected into G groups, then all effects θ_i, ψ_j in group g are separately identified up to one linear constraint per group (e.g., setting one firm effect in the group to zero).

Sufficiency of Identification Conditions II
– Suppose G = 1 and J = 2.
– Then, by the grouping condition, at least one person, say person 1, is employed by both firms, and we have the two relations shown below.
– So exactly N + 2 - 1 effects are identified.
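In the projection notation above, the two relations are

\[ E[y_{1t}] = \theta_1 + \psi_1 \quad\text{and}\quad E[y_{1s}] = \theta_1 + \psi_2, \]

for the dates t and s at which person 1 is observed at firms 1 and 2. Their difference identifies ψ_2 − ψ_1, and with one normalization (say ψ_1 = 0) the N person effects and the remaining firm effect are identified: N + 2 − 1 effects in total.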

Sufficiency of Identification Conditions III
– Next, suppose there is a connected group g with J_g firms and exactly J_g - 1 firm effects identified.
– Consider the addition of one more connected firm to such a group.
– Because the new firm is connected to the existing J_g firms in the group, there exists at least one individual, say worker 1, who works for a firm in the identified group, say firm J_g, at date 1 and for the supplementary firm at date 2.
– Then we have the two relations shown below, so exactly J_g firm effects are identified with the new information.
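In the same notation, the two relations are

\[ E[y_{11}] = \theta_1 + \psi_{J_g} \quad\text{and}\quad E[y_{12}] = \theta_1 + \psi_{J_g+1}. \]

Since ψ_{J_g} is already identified within the group, the first relation identifies θ_1 and the second then identifies ψ_{J_g+1}, so the enlarged group of J_g + 1 firms has exactly J_g identified firm effects.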

ESTIMATION BY FIXED-EFFECTS METHODS

Estimation by Direct Solution of Least Squares
– Once the grouping algorithm has identified all estimable effects, we solve for the least squares estimates by direct minimization of the sum of squared residuals.
– This method, widely used in animal breeding and genetics research, produces a unique solution for all estimable effects.
– Software: SAS (GLM); Stata (xtreg); R (lme); cg.

Least Squares Conjugate Gradient Algorithm
– A preconditioning matrix is chosen to precondition the normal equations.
– The data matrices and parameter vectors are redefined as shown.

LSCG (II)
– The goal is to find the parameter vector that solves the least squares problem shown.
– The gradient vector g figures prominently in the equations.
– The initial conditions for the algorithm are shown:
  – e is the vector of residuals
  – d is the direction of the search

LSCG (III)
The loop shown has the following features:
– The search direction d is the current gradient plus a fraction of the old direction.
– The parameter vector is updated by moving a positive amount in the current direction.
– The gradient, g, and residuals, e, are updated.
– The original parameters are recovered from the preconditioning matrix.

LSCG (IV)
Verify that the residuals are uncorrelated with the three components of the model.
– Yes: the LS estimates are calculated as shown.
– No: certain constants in the loop are updated and the next parameter vector is calculated.
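A minimal sketch of a least squares conjugate gradient loop of the kind outlined on these slides, written in Python with dense NumPy arrays purely for illustration; a production implementation would exploit the sparsity of D and F and apply the preconditioning described above.

import numpy as np

def cgls(A, y, tol=1e-10, max_iter=1000):
    # Conjugate-gradient solution of min ||y - A b||^2, where A plays
    # the role of the stacked design matrix [X D F].
    b = np.zeros(A.shape[1])
    e = y - A @ b                  # residual vector e
    g = A.T @ e                    # gradient-type vector g = A'e
    d = g.copy()                   # initial search direction d
    gnorm = g @ g
    for _ in range(max_iter):
        Ad = A @ d
        alpha = gnorm / (Ad @ Ad)  # positive step in the current direction
        b = b + alpha * d          # update the parameter vector
        e = e - alpha * Ad         # update the residuals
        g = A.T @ e                # update the gradient
        gnorm_new = g @ g
        if gnorm_new < tol:        # residuals (numerically) orthogonal to the model columns
            break
        d = g + (gnorm_new / gnorm) * d   # current gradient plus a fraction of the old direction
        gnorm = gnorm_new
    return b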

ESTIMATION BY MIXED-EFFECTS METHODS

Mixed Effects Assumptions
– The assumptions above specify the complete error structure with the firm and person effects random.
– For maximum likelihood or restricted maximum likelihood estimation, assume joint normality.
– Software: ASREML, cgmixed.

Estimation by Mixed Effects Methods
– Solve the mixed effects equations (shown below).
– Techniques: Bayesian EM, Restricted ML.
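The mixed effects equations referred to here are not written out in the transcript; the standard (Henderson) form, for a model y = Xβ + Zu + ε with Var(u) = G and Var(ε) = R (here G denotes the random-effects covariance matrix, not the number of groups), where Z = [D F] stacks the person and firm design matrices and u = (θ′, ψ′)′, is

\[
\begin{pmatrix} X'R^{-1}X & X'R^{-1}Z \\ Z'R^{-1}X & Z'R^{-1}Z + G^{-1} \end{pmatrix}
\begin{pmatrix} \hat\beta \\ \hat u \end{pmatrix}
=
\begin{pmatrix} X'R^{-1}y \\ Z'R^{-1}y \end{pmatrix}.
\]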

Bayesian ECM
– The algorithm is illustrated for the special case of uncorrelated residuals and uncorrelated random effects.
– The initial conditions are taken directly from the LS solution to the fixed effects problem.

Bayesian ECM (II)
– At each loop of the algorithm, the E step is used to compute the parameters.
– The conditional M step is used to update the variances.

Relation Between Fixed and Mixed Effects Models
Under the conditions shown above, the ME estimators of all parameters approach the FE estimators.

Correlated Random Effects vs. Orthogonal Design
– Orthogonal design means that the characteristics, person design and firm design matrices are mutually orthogonal.
– Uncorrelated random effects means that the covariance matrix of the random effects is diagonal.

Software
– SAS: proc mixed
– ASREML
– aML
– SPSS: Linear Mixed Models
– Stata: xtreg, gllamm, xtmixed
– R: the lme() function
– S+: linear mixed models
– Gauss
– Matlab (pcg)
– Genstat: REML
– Grouping (connected sub-graphs)

EXAMPLE

Inter-Industry Wage Differences
From Abowd, Kramarz, Lengermann, Roux, and Schmutte (2012)

APPLICATION TO SAMPLING GRAPHS (WORK OF MCKINNEY)

The Three Labor Market Graphs
– Person-Firm Graph: the core set of connections or relationships
– Firm-to-Firm Graph: projection of the PFG onto the firm nodes
– Person-to-Person Graph: projection of the PFG onto the person nodes (will not discuss today)
Today's talk is non-technical, although a formal mathematical document is available on request.

Person-Firm Graph
G = (V, E); V = nodes (persons and firms); E = edges (jobs)
– The graph is bipartite; the nodes are made up of two disjoint sets (persons and firms).
– Persons cannot employ other persons and firms cannot employ other firms. An edge always contains one node from the set of persons and one node from the set of firms.
– Edges (jobs) are generated when person i is employed at firm j and positive earnings are reported to a participating state UI system.

A Person-Firm Edge in Detail
– Person node (PIK) labels: DOB, sex, race
– Edge (job) labels: quarterly earnings history
– Firm node (SEIN) labels: industry, location, size

LEHD from a Graph Theoretic POV
Person-Firm Graph
– The graph is not complete: not every person has a job at every firm.
– The adjacency matrix is sparse; realized person-firm edges are only ~3/1000ths of one percent of the total possible.
Edge Labels
– The earnings history for each job is stored in the EHF.
– SAS statement to recover the person-firm graph from the EHF:
  proc sort data=ehf (keep=pik sein) out=pfg nodupkey;
    by pik sein;
  run;
Node Labels Stored in Separate Files
– Person – ICF
– Firm – ECF

LEHD Products That Use the Person-Firm Graph
– QWI: sum edges (jobs) where edge and node labels meet certain criteria. For example, sum all edges (jobs) for males in a given age range with positive earnings in 1992:4, working in the construction industry at a firm located in Miami-Dade County, Florida.
– OTM: attaches location information to the person and firm nodes. Plots each active edge in two-dimensional space, where the node locations determine the placement of the edge.
– Annual Job Flows: uses a directed weighted firm graph. We will talk more about Job Flows in subsequent slides.

The Firm Graph
Projection of the person-firm graph onto the firm nodes:
– A worker with j distinct employers generates j(j-1)/2 edges.
– A worker with a single employer generates a loop. If loops are included, the result is a multi-graph.
Shows the connections between firms: workers employed at multiple firms bring skills and knowledge from previous jobs to their current job.
Different versions: undirected versus directed; unweighted versus weighted; loops versus no loops.
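A minimal sketch of this projection in Python, assuming the person-firm graph is given simply as (person, firm) job pairs; illustration only, not the LEHD production code.

from collections import defaultdict
from itertools import combinations

def firm_projection(jobs):
    # Project the bipartite person-firm graph onto the firm nodes.
    # A worker with j distinct employers contributes j*(j-1)/2
    # firm-to-firm edges; a single-employer worker contributes a loop.
    firms_of = defaultdict(set)
    for person, firm in jobs:
        firms_of[person].add(firm)
    weights = defaultdict(int)               # undirected weighted (multi-)graph
    for firms in firms_of.values():
        if len(firms) == 1:
            (f,) = firms
            weights[(f, f)] += 1             # loop
        else:
            for a, b in combinations(sorted(firms), 2):
                weights[(a, b)] += 1         # one edge per firm pair per worker
    return dict(weights)

# Example: worker 1 connects firms A and B; workers 2 and 3 generate loops.
print(firm_projection([(1, "A"), (1, "B"), (2, "B"), (3, "C")]))
# {('A', 'B'): 1, ('B', 'B'): 1, ('C', 'C'): 1}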

Going from Person-Firm to Firm Edges
– Two loops: workers 2 and 3 are employed at only one firm.
– Three firm-to-firm edges: workers 1, 3, and 5 each generate one edge.

Undirected Unweighted Firm-to-Firm Graph with No Loops
(Figure legend: edge weight of 2; edge weight of 1)

Undirected Weighted Firm-to-Firm Graph
(Figure: firms A-G, with edge weights; the largest connected component is indicated.)

Weighted Directed Firm-to-Firm Graph
(Figure: firms A-G; the largest strongly connected component and the weakly connected portion are indicated.)

Data
– Select workers on the EHF with positive earnings at some time during 2000:1 to 2010:4.
  – All states except MA, plus DC.
  – Some states are not available until after 2000:1: AL (2001:1), AR (2002:3), DC (2002:2), MS (2003:3), and NH (2003:1).
– Workers with more than 40 jobs are removed.
  – A PIK with a large number of jobs is unlikely to represent the work history of a single individual.
  – A worker with a large number of jobs generates a large number of edges. Max edges per PIK = 780.
– About 220 million persons, 15 million firms, and 6.3 billion person-firm-year-quarter observations.

Graph Characteristics
– Total number of person-to-firm edges is about 1 billion.
– Total number of firm-to-firm edges is about 4 billion (including loops).
  – There is a direct relationship between the distribution of the number of jobs per person and the number of person-firm edges.
  – There is a direct relationship between the distribution of the number of jobs per person and the number of firm-to-firm edges.
– Firm edges per person = j(j-1)/2, except if j = 1, in which case edges = 1 (a loop).

Distribution of Jobs and Firm Edges by Number of Jobs per Person

Cumulative Distribution of Jobs and Firm Edges by Number of Jobs per Person

Creating the State-Level Firm-to-Firm Graph
– Collapse the firm edges by removing duplicates. For example, multiple workers may have been employed at the same two firms.
– Average multiplicity is approximately 2, but a little over 80% of the edges come from only one PIK (multiplicity of 1).
– Attach state geography labels to each firm node and sum edges within each state pair.
– The result is a new graph with state-to-state edges.

50 Largest Edges (Top 10 in Red)

Largest Edge for Each State (Reciprocal Edges in Red)

APPLICATION TO MODULARITY MODELING (WORK OF SCHMUTTE)

Summary
– The connectedness of the bipartite graph between employers and employees yields the identification conditions for longitudinally linked employer-employee data models at the job level.
– These models can be estimated using either fixed-effects or mixed-effects methods.