Incomplete data: Indirect estimation of migration flows. Modelling approaches



Aim: a synthetic database built by effectively combining data from different sources

Requirements
Data representation: a mathematical model of the ‘complete’ or desired migration data
Data types: the different ways of measuring migration
  – Data on events [relocations] (‘movement data’): migrations
  – Data on changes in status [place of residence]: migrants

Requirements
Typology of missing or incomplete data
  – Related to data types: what is missing?
Typology of available data
  – Related to data types: what is available?
  – Primary data
  – Auxiliary data (e.g. historical migration matrix)
Measure of reliability of available data
Method to infer missing data from available statistical data and ‘soft’ information on migration

Existing approaches
Net migration: residual method
Gross migration flows: spatial interaction models
  – Gravity model
  – Entropy maximisation
  – Information-theoretic approaches
  – Iterative proportional fitting (bi- and multiproportional adjustment [RAS])
Age profile: model migration schedules

The approach
Migration is a manifestation of behavioural processes and random processes (choice and chance)
Describe the processes and get plausible/accurate parameter estimates based on the (incomplete) data and additional information
Apply the model to predict migration flows

Data types
Micro-data
  – Migration data (event data)
      – Occurrence of migration in observation period
      – Time at migration
  – Migrant data (status data; transition data)
      – Current status
      – Status at two or more points in time (panel)
          – Equal interval
          – Unequal interval (e.g. place of birth and place of current residence)
Grouped data

Data types
Micro-data
Grouped data (aggregate data; tabulations)
  – Migrations (events)
  – Migrants (transitions)
Observation in continuous time (e.g. population register)
Observation in discrete time

Types of incompleteness
Non-response
Net migration vs gross flows
Migrants vs migrations (events)
Single migration recorded instead of sequence of migrations (e.g. last migration)
Partially missing data
  – e.g. origin by age or covariates
  – Some information missing for some persons

Solutions to incomplete data
Collect missing information
Use ancillary data and/or information on comparable population
Live with it and minimise distortions caused by missing data
Infer missing data from all the information you can get (combine sources)

Probability models of migration
Migration is a realisation of a Poisson process
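As a sketch of what the Poisson assumption implies: write N_rs for the number of migrations from origin r to destination s during the observation period and λ_rs for its expected value. The symbol λ is an assumption made here for illustration, since the slide's own formula did not survive in the transcript. The probability of observing n_rs migrations would then be:

```latex
\Pr(N_{rs} = n_{rs}) = \frac{\lambda_{rs}^{\,n_{rs}} \exp(-\lambda_{rs})}{n_{rs}!},
\qquad
E[N_{rs}] = \operatorname{Var}[N_{rs}] = \lambda_{rs}
```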

Log-rate model: rate = events/exposure
Gravity model
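One possible way to write these two models out, under assumed notation that is not taken from the slides: PY_r is the exposure (person-years lived in origin r), μ_rs the migration rate, P_r and P_s the population sizes of origin and destination, d_rs the distance between them, and k and β constants. The log-rate model treats the log of exposure as a fixed offset, and taking logs of the gravity model gives a form that is linear in its parameters:

```latex
% Log-rate model: expected events = exposure x rate
E[N_{rs}] = \lambda_{rs} = PY_r \, \mu_{rs}
\quad\Longrightarrow\quad
\log \lambda_{rs} = \log PY_r + \log \mu_{rs}

% Gravity model and its logarithmic form
\lambda_{rs} = k \, \frac{P_r \, P_s}{d_{rs}^{\beta}}
\quad\Longrightarrow\quad
\log \lambda_{rs} = \log k + \log P_r + \log P_s - \beta \log d_{rs}
```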

RAS, Biproportional adjustment, etc.
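A minimal sketch of biproportional (RAS) adjustment, assuming a seed migration matrix (for example a historical matrix) and known totals of out-migration by origin and in-migration by destination. The function name `ras`, the example figures, and the stopping rule are illustrative assumptions, not taken from the slides. The seed must have a non-zero sum in every row and column, and the two sets of totals must add up to the same grand total.

```python
import numpy as np

def ras(seed, row_totals, col_totals, tol=1e-10, max_iter=1000):
    """Scale a seed matrix alternately by rows and columns until it
    reproduces the target row and column totals (hypothetical helper)."""
    table = np.asarray(seed, dtype=float).copy()
    row_totals = np.asarray(row_totals, dtype=float)
    col_totals = np.asarray(col_totals, dtype=float)
    for _ in range(max_iter):
        # Row step: match total out-migration from each origin.
        table *= (row_totals / table.sum(axis=1))[:, None]
        # Column step: match total in-migration to each destination.
        table *= (col_totals / table.sum(axis=0))[None, :]
        # After the column step only the row totals can still be off.
        if np.abs(table.sum(axis=1) - row_totals).max() < tol:
            break
    return table

# Made-up example: 3 regions, no within-region moves (zero diagonal).
seed = [[0, 20, 10],
        [30, 0, 15],
        [5, 25, 0]]
row_totals = [60, 90, 45]   # out-migration by origin
col_totals = [70, 80, 45]   # in-migration by destination
adjusted = ras(seed, row_totals, col_totals)
print(adjusted.round(2))
```

Structural zeros in the seed (here the diagonal) stay zero, because the procedure only multiplies cells by row and column factors.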

Likelihood equations may be written as:
Marginal totals are sufficient statistics
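The slide's own equations are not preserved in the transcript. As a hedged illustration of why the marginal totals are sufficient, consider the simplest case, assumed here: independent Poisson counts N_rs with a log-linear model containing only an overall effect, an origin effect and a destination effect. The parameter names u, u_r^O and u_s^D are illustrative. Setting the derivatives of the log-likelihood to zero gives equations that involve the data only through the row and column totals:

```latex
\log \lambda_{rs} = u + u_r^{O} + u_s^{D}

\ell = \sum_{r}\sum_{s} \left( n_{rs} \log \lambda_{rs} - \lambda_{rs} \right) + \text{const}

\frac{\partial \ell}{\partial u_r^{O}} = \sum_{s} \left( n_{rs} - \lambda_{rs} \right) = 0
\;\Longrightarrow\; \hat{\lambda}_{r+} = n_{r+},
\qquad
\frac{\partial \ell}{\partial u_s^{D}} = \sum_{r} \left( n_{rs} - \lambda_{rs} \right) = 0
\;\Longrightarrow\; \hat{\lambda}_{+s} = n_{+s}
```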

A different way of writing the spatial interaction model:
Link Poisson - Multinomial
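The link can be sketched as follows, again with assumed notation (λ_rs for the Poisson expectation, π_rs for the multinomial probability, and a plus sign for summation over an index). If the counts N_rs are independent Poisson variables, then conditional on the total number of migrations n_++ they follow a multinomial distribution whose probabilities are the shares of the expected flows:

```latex
\Pr\!\left( \{N_{rs} = n_{rs}\} \,\middle|\, N_{++} = n_{++} \right)
= \frac{n_{++}!}{\prod_{r,s} n_{rs}!} \prod_{r,s} \pi_{rs}^{\,n_{rs}},
\qquad
\pi_{rs} = \frac{\lambda_{rs}}{\lambda_{++}}
```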

The gravity model is a log-linear model
The entropy model is a log-linear model
The RAS model is a log-linear (log-rate) model

Parameter estimation
Maximise (log) likelihood function: probability that the model predicts the data
Expectation: predict E[N_rs] = λ_rs given the model and initial parameter estimates
Maximisation: maximise the ‘complete-data’ log-likelihood
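The two steps can be sketched for a deliberately simple case: an origin-by-destination table in which some cells are unobserved, combined with an independence (origin effect plus destination effect) log-linear model. The model choice, the function name `em_independence`, the starting values, and the example table are all assumptions made for illustration; the slides do not specify them.

```python
import numpy as np

def em_independence(observed, tol=1e-10, max_iter=10_000):
    """EM for a flow table with missing cells (marked as NaN) under an
    independence log-linear model. Hypothetical helper for illustration."""
    observed = np.asarray(observed, dtype=float)
    missing = np.isnan(observed)
    # Crude starting values: fill missing cells with the observed mean.
    completed = np.where(missing, np.nanmean(observed), observed)
    for _ in range(max_iter):
        # Maximisation: fit the model to the completed table. For the
        # independence model the fitted values are row x column / total.
        rows = completed.sum(axis=1, keepdims=True)
        cols = completed.sum(axis=0, keepdims=True)
        expected = rows * cols / completed.sum()
        # Expectation: replace missing cells by their expected values.
        updated = np.where(missing, expected, observed)
        converged = np.abs(updated - completed).max() < tol
        completed = updated
        if converged:
            break
    return completed

# Made-up example: flows that follow the independence pattern exactly,
# with one cell (true value 40) treated as unobserved.
flows = np.outer([10, 20, 30], [1, 2, 4]).astype(float)
flows[0, 2] = np.nan
completed = em_independence(flows)
print(completed.round(3))
```

Observed cells are never altered; only the missing cells are updated from one iteration to the next.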

Z_ki: individual k is a member of group i

When  k and  2 are known, then

Conclusion
A unified approach to the prediction of migration from different types of data and different data sources
Approach based on probability theory and the theory of statistical inference (not ad hoc)
The EM algorithm has been studied extensively, and much experience has been gathered
‘Soft’ data (e.g. expert opinions) can be added