Jean-Luc LIPATZ, INSEE - France, 2007/10
Using gridded census data to analyze the socio-spatial structure of French cities
Short history of grids at INSEE


Page 2 EFGS meeting JL.LIPATZ 2009/10
1) The use of gridded data at INSEE
2) The production of gridded data in the complex environment of the French census

Page 3
Starting point: sub-city districts for public action
› A question from the DIV (« Délégation Interministérielle à la Ville »), the ministry responsible for urban social policies (2005)
  – Context: the 2005 urban riots. Are public actions ineffective, or are their target areas badly chosen?
    ‐ Design new ones
    ‐ Check the relevance of existing ones
  – Question: how to check the relevance of the deprived districts designed by local authorities?
› Existing zones cannot be used
  – Existing districts: outdated, partial
  – Existing output areas for statistical products: too large, too much internal heterogeneity
  – No data source was completely usable at point level.
  – => Use more detailed data, but how to transform a set of points into a zone boundary?

Page 4
The tool: an example, what the data say…
Poitiers - Health insurance register
Blue: existing deprived districts
Red: areas of high probability of low income
Grey shade: population density
200 m² grid cells
Table (values not recovered): cluster | area | total count | share of sub-population | x | y, with five clusters each labelled Z

Page 5
… and an effective result
Blue: new deprived districts as defined by local government

Page 6
The tool: how it works. Grids everywhere!
› Probability density estimates using the kernel method
  – -> gridded data instead of individual data
    ‐ Part of the data cannot be fully localized (down to the address)
    ‐ Quicker processing without quality loss
    ‐ Weaker confidentiality issues, allowing use in INSEE's regional delegations
  – Estimate 1: the whole population in the data source
  – Estimate 2: the « deprived » population relative to this data source
› Ratio of probabilities to compute the relative risk
  – -> grid cells as the support of the estimated functions
› Cartography of high estimated risk
› Zones are a selection of contiguous grid cells using an automatic rule
  – A signal, not a design
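
The two kernel estimates and their ratio can be sketched as follows. This is only an illustration under stated assumptions, not the INSEE tool: the Gaussian kernel, the bandwidth, the cell size and the function names `kernel_grid` and `relative_risk` are all hypothetical choices, since the slides do not specify them.

```python
import math

def kernel_grid(points, nx, ny, cell, bandwidth):
    """Gaussian-kernel smoothed counts on an nx-by-ny grid of square cells.

    The Gaussian kernel and the bandwidth are assumptions made for this sketch.
    """
    grid = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        for i in range(nx):
            cx, cy = (i + 0.5) * cell, (j + 0.5) * cell  # cell centre
            for px, py in points:
                d2 = (px - cx) ** 2 + (py - cy) ** 2
                grid[j][i] += math.exp(-d2 / (2 * bandwidth ** 2))
    return grid

def relative_risk(sub, whole):
    """Cell-by-cell ratio of the two surfaces, each normalised to sum to 1."""
    s_tot = sum(map(sum, sub))
    w_tot = sum(map(sum, whole))
    return [[(s / s_tot) / (w / w_tot) if w > 0 else 0.0
             for s, w in zip(s_row, w_row)]
            for s_row, w_row in zip(sub, whole)]

# Toy data: two clusters of 4 people; 3 "deprived" in the first, 1 in the second.
whole_pts = [(100, 100)] * 4 + [(700, 700)] * 4
sub_pts = [(100, 100)] * 3 + [(700, 700)]
whole = kernel_grid(whole_pts, 4, 4, 200, 200)
sub = kernel_grid(sub_pts, 4, 4, 200, 200)
rr = relative_risk(sub, whole)
```

A ratio above 1 marks cells where the sub-population is over-represented relative to the whole population of the source; here that is the cell around the first cluster.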

Page 7
From data to final map
Inputs: sub-population, whole population
Stages: rough data, probability estimates, relative risk estimate
Steps: 1) Simplify the maps 2) Combine the maps 3) Extract the outlines 4) Superpose the maps
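
One possible reading of "extract the outlines" is to group high-risk cells into contiguous zones. The sketch below assumes the high-risk cells have already been flagged in a boolean grid and uses 4-neighbour contiguity; both the flagging rule and the function name `contiguous_zones` are hypothetical, since the automatic rule used in the tool is not described.

```python
def contiguous_zones(mask):
    """Group True cells of a 2-D boolean grid into 4-connected zones.

    Returns a list of zones, each a sorted list of (column, row) cells.
    """
    ny, nx = len(mask), len(mask[0])
    seen, zones = set(), []
    for j in range(ny):
        for i in range(nx):
            if mask[j][i] and (i, j) not in seen:
                stack, zone = [(i, j)], []
                seen.add((i, j))
                while stack:  # depth-first flood fill from the seed cell
                    ci, cj = stack.pop()
                    zone.append((ci, cj))
                    for ni, nj in ((ci + 1, cj), (ci - 1, cj),
                                   (ci, cj + 1), (ci, cj - 1)):
                        if (0 <= ni < nx and 0 <= nj < ny
                                and mask[nj][ni] and (ni, nj) not in seen):
                            seen.add((ni, nj))
                            stack.append((ni, nj))
                zones.append(sorted(zone))
    return zones

high_risk = [[True, True, False],
             [False, False, False],
             [False, False, True]]
zones = contiguous_zones(high_risk)
```

Each zone is a candidate district outline; in the talk's terms it is a signal to be examined, not a final design.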

Page 8
And the census?
› The tool is now used (within INSEE) to describe other phenomena, with every available source
› Using the census
  – Small LAU2s (out of reach for the tool: no detail at small geographical levels, but mainly not urban)
    ‐ Exhaustive
    ‐ Data collection over 5 years (each LAU in one year)
  – Large LAU2s (city cores)
    ‐ 40 % sample
    ‐ Address register maintained for sampling purposes, used as a reference when localizing administrative registers
    ‐ Data collection over 5 years

Page 9
Idea
› Compare census data and administrative data at locations where both are available, to estimate jointly:
  – The administrative bias
  – The time shift

Page 10
Filling the gaps in census collection
Collected census data
Data from an administrative register
An address from the census sampling register
Can we deduce this from that?

Page 11
GWR
› A regression, but not a global one
  – Standard regression gives correlated residuals: the spatial distribution will be biased
  – Regression models with autocorrelated residuals do not seem easy to apply (different variograms for different parts of the city)
=> Local regressions (Geographically Weighted Regression, cf. Fotheringham)

Page 12
GWR: space as an explanatory factor
Estimates
Local subsets for regressions
Weights decreasing with distance
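
A local regression with distance-decaying weights can be sketched as below. This is a minimal illustration with a single explanatory variable, a Gaussian weighting function and a fixed bandwidth, all of which are assumptions for the example; the function name `gwr_fit_at` is hypothetical and the model actually fitted at INSEE is not given in the slides.

```python
import math

def gwr_fit_at(target, coords, x, y, bandwidth):
    """Weighted least squares of y on x, centred on `target`.

    Observations are weighted by a Gaussian function of their distance to
    `target`. Returns the local (intercept, slope).
    """
    w = [math.exp(-0.5 * (math.dist(target, c) / bandwidth) ** 2)
         for c in coords]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Toy data: y = x in the west of the city, y = 3x in the east.
coords = [(0, 0), (1, 0), (2, 0), (100, 0), (101, 0), (102, 0)]
x = [1, 2, 3, 1, 2, 3]
y = [1, 2, 3, 3, 6, 9]
west = gwr_fit_at((1, 0), coords, x, y, bandwidth=5)
east = gwr_fit_at((101, 0), coords, x, y, bandwidth=5)
```

A single global regression on these data would return one compromise slope; the local fits recover a different coefficient in each part of the city, which is what makes space act as an explanatory factor.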

Page 13
Data: grids coming back!
› Two kinds of data
  – Census data + explanatory data (administrative data and dwellings from the address register)
  – Explanatory data only
  – Administrative data not connected to the address register (20 %) are ignored, but the corresponding addresses are used with zeroed administrative data
› …added up to avoid singularity problems in the matrices during estimation
  – -> grids
› Multiplication of cells by intersecting with:
  – Housing type
  – Administrative districts

Page 14
Internals
› Weights
  – The actual weighting function doesn't really matter
  – Classical
  – Added penalty (doubled distance) when cells have different building types (houses vs. apartments)
› Radius
  – Derived from a fixed number of neighbours
  – The actual number of neighbours minimizes the corrected Akaike Information Criterion (AICc)
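
The weighting scheme described above might look like the following sketch. The bisquare kernel is an assumption (the slide only says "classical"), as are the function name `local_weights` and the tuple layout; the choice of `k` by minimizing AICc is not shown.

```python
import math

def local_weights(target, cells, k):
    """Weights of `cells` for a local regression centred on `target`.

    target and cells are (x, y, building_type) tuples. The distance is
    doubled when building types differ; the radius is the penalised
    distance to the k-th nearest cell; weights follow a bisquare kernel
    (an assumed choice, not taken from the slides).
    """
    def penalised_distance(cell):
        d = math.dist(target[:2], cell[:2])
        return 2 * d if cell[2] != target[2] else d

    dists = [penalised_distance(c) for c in cells]
    radius = sorted(dists)[k - 1]  # adaptive radius from k neighbours
    return [(1 - (d / radius) ** 2) ** 2 if d < radius else 0.0
            for d in dists]

target = (0, 0, "house")
cells = [(0, 0, "house"), (100, 0, "house"),
         (100, 0, "apartment"), (300, 0, "house")]
w = local_weights(target, cells, k=3)
```

The two cells 100 m away receive different weights: the apartment cell is treated as if it were 200 m away, so it falls on the edge of the radius and contributes nothing.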

Page 15
Prediction
Small area estimation is:
Unable to compute locally
Non-spatial error term: ignored
Spatial trend

Page 16
Accuracy questions
› The key issue is spatial autocorrelation
  – Behaviour of the local regressions (adjusted R², residuals)
  – Classical LISAs (local Moran…)
› But no local accuracy measure
› Just a trend anyway
› => Validation at global level, where the census gives its own figure (mainly a Horvitz-Thompson estimate)
  – Must include the omitted correction term
  – Theoretically the GWR gives the best results, but there is no estimation of accuracy in either case (simulations are now being developed to produce charts, « abaques »)
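
As an illustration of the LISA check mentioned above, the local Moran statistic can be computed on regression residuals as sketched here. This uses one common form of the statistic with a row-standardised neighbour matrix; the function name `local_moran` and the toy residuals are hypothetical and not taken from the talk.

```python
def local_moran(values, weights):
    """Local Moran statistic I_i = (z_i / m2) * sum_j w_ij * z_j.

    z is the centred variable and m2 its mean squared deviation.
    `weights` is an n-by-n neighbour matrix, assumed row-standardised.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(zi * zi for zi in z) / n
    return [z[i] / m2 * sum(weights[i][j] * z[j] for j in range(n))
            for i in range(n)]

# Four cells in a row; residuals are positive on one side, negative on the other.
residuals = [1.0, 1.0, -1.0, -1.0]
neighbours = [[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]]
moran = local_moran(residuals, neighbours)
```

Large positive values flag cells whose residuals resemble those of their neighbours, that is, places where the local regression still leaves spatially structured error.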

Page 17
Example: young people in Toulouse
› From census: (5 years)
› From estimations
  – Model (1 year of data collection)
  – With fiscal source:
  – With health insurance source: 95440

Page 18
Strasbourg: high-unemployment areas
Estimations with the 2004, 2005 and 2006 census surveys
Final census figures (5-year estimation)
Estimated populations in deprived districts

Page 19
Thank you. Any questions?