TerraPop Goals Lower barriers to conducting interdisciplinary human-environment interactions research by making data with different formats from different.

Slides:



Advertisements
Similar presentations
Counting the Dutch, The Future of the Virtual Census in the Netherlands Presentation at the seminar Counting the 7 Billion 24 February 2012 * Geert Bruinooge.
Advertisements

Aggregate data Also called summary data, tabular data Counts of things for places (e.g. counties) or entities Examples: –census volumes –HSUS –ICPSR files.
Raw Census Microdata from IPUMS IPUMS Data Structure Household record (shaded) followed by a person record for each member of the household Relationship.
Habitat Analysis in ArcGIS Use of Spatial Analysis to characterize used resources Thomas Bonnot
RATIONALE The storage in a smart phone would cost (in 2011 dollars) $7,571 in 2001 $212,040 in 1991 $3,796,800 in 1981 $56,168,800 in 1971 $1,233,179,000.
United Nations Economic Commission for Europe Statistical Division Applying the GSBPM to Business Register Management Steven Vale UNECE
September 18-19, 2006 – Denver, Colorado Sponsored by the U.S. Department of Housing and Urban Development Using Geographic Information Systems (GIS) as.
Grid-based Analysis in GIS
4 April 2007METIS Work Session1 Metadata Standards and Their Support of Data Management Needs Daniel W. Gillman Bureau of Labor Statistics Paul Johanis.
Statistical Coherence: Census Hub Hypercubes and IPUMS Microdata UNECE Expert Group on Population and Housing Censuses Geneva, September 2014 Lara.
BY:- RAVI MALKAT HARSH JAIN JATIN ARORA CIVIL -2 ND YEAR.
Geographic Information System GIS This project is implemented through the CENTRAL EUROPE Programme co-financed by the ERDF GIS Geographic Inf o rmation.
Using IPUMS.org Katie Genadek Minnesota Population Center University of Minnesota The IPUMS projects are funded by the National Science.
Record matching for census purposes in the Netherlands Eric Schulte Nordholt Senior researcher and project leader of the Census Statistics Netherlands.
Emerging methodologies for the census in the UNECE region Paolo Valente United Nations Economic Commission for Europe Statistical Division International.
TerraPop Vision An organizational and technical framework to preserve, integrate, disseminate, and analyze global-scale spatiotemporal data describing.
GIS in Weather and Society Olga Wilhelmi Institute for the Study of Society and Environment National Center for Atmospheric Research.
The Minnesota Data Harmonization Projects Bill & Melinda Gates Foundation Seattle, Washington May 21, 2014 Elizabeth Boyle, Miriam King, Matthew Sobek.
CountrySTAT REGIONAL BASIC ADMINISTRATOR TRAINING for ECO MEMBER STATES Ankara, Turkey, October 2013 CountrySTAT STATISTICS COMPONENT (Concepts,
Population census micro data for research: the case of Slovenia Danilo Dolenc Statistical Office of the Republic of Slovenia Ljubljana, First Regional.
Data Projects at the Minnesota Population Center Resources for Comparative Population and Health Research Seattle, Washington May 22, 2014 Elizabeth Boyle,
New and easier ways of working with aggregate data and geographies from UK censuses Justin Hayes UK Data Service Census Support.
06/21/2012 Aggregations of spatial information for data dissemination.
The availability of Dutch census microdata Eric Schulte Nordholt Senior researcher and project leader of the Census Statistics Netherlands Division Social.
Integrated Public Use Microdata Series IPUMSwww.ipums.org Matt Sobek Minnesota Population Center
1 Overview Importing data from generic raster files Creating surfaces from point samples Mapping contours Calculating summary attributes for polygon features.
INTRODUCTION TO GIS  Used to describe computer facilities which are used to handle data referenced to the spatial domain.  Has the ability to inter-
TerraPop Mission Enabling research, learning, and policy analysis by providing integrated spatiotemporal data describing people and their environment.
U.S. Census Data & TIGER/Line Files
Integration of Geography and Statistics in Europe – State of Progress Based on the GEOSTAT 2 survey Jerker Moström, Statistics Sweden Amelia Wardzińska-Sharif,
U.S. Department of the Interior U.S. Geological Survey Automatic Generation of Parameter Inputs and Visualization of Model Outputs for AGNPS using GIS.
Census Office Fernando Casimiro Geneva, July 2010 Portugal – Census results tailored to user needs «
Integrated Public Use Microdata Series IPUMSwww.ipums.org.
Data access and development: The IPUMS perspective United Nations Commission on Population and Development The data revolution in action: National and.
Lessons Learned from the production of Gridded Population of the World Version 4 (GPW4) Columbia University, CIESIN, USA EFGS October 2014.
Geo-referenced data and DLI aggregate data sources
ZAMBIA CENSUS MAPPING PRESETATION
Data Processing Hollerith 1921
Census developments in the Netherlands
INTRODUCTION TO GEOGRAPHICAL INFORMATION SYSTEM
DataNet Collaboration
Collaboration and Outreach
TerraPop Goals Lower barriers to conducting interdisciplinary human-environment interactions research by making data with different formats from different.
SciDataCon September, 2016 Greg Yetman Kytt MacManus
of the Statistical Office of the Republic of Slovenia
Establishing an Automated Confidentiality Service in Stats NZ
Census Bureau – Fernando Casimiro, Coordinator
CyberGIS: Reston, VA, September 22, 2018
Marja Tammilehto-Luode, Statistics Finland
Meeting of the Working Party
Presentation 2b 2018 Census Products & Services Engagement.
Tabulations and Statistics
Analysis of demographical and economical statistical data on the basis of a trans-national hierarchical grid Igor Kuzma Statistical Office of the Republic.
Terra Populus Data Domains
6.1 Quality improvement Regional Course on
and the Future of Historical Family Demography
CENSUS MICRODATA : THAILAND
Preparing the data for its use in a GIS software
Human Population Characteristics
Expert Expert Group Meeting on Statistical Methodology for Delineating Cities and Rural Areas Iven M. Sikanyiti 28th-30th January 2019 United Nations:
Danilo Dolenc Statistical Office of the Republic of Slovenia
MAKING INCLUSIVE GROWTH HAPPEN IN REGIONS AND CITIES: Present and future developments for the metropolitan database SCORUS conference 16th - 17th June.
Mapping Data Production Processes to the GSBPM
Local Administrative Units
LAMAS Working Group June 2018
Population Data
Geo-enabling the SDG indicators – experiences from the UN Global Geospatial Management and the GEOSTAT 3 project Agenda item 12 Ekkehard PETRI – Eurostat,
Stephanie Hirner ESTP ”Administrative data and censuses
Merging statistics and geospatial information Grants 2012
Presentation transcript:

TerraPop Goals Lower barriers to conducting interdisciplinary human-environment interactions research by making data with different formats from different scientific domains easily interoperable Provide an organizational and technical framework to preserve, integrate, disseminate, and analyze global-scale spatiotemporal data describing population and the environment.

Source Data Domains & Formats Population Microdata Area-Level Data

Terra Populus Data Domains Microdata Land cover Individuals and households Population Environment Land use Disparate scientific domains – interrelated processes Multiple data formats Areal Data Climate

Trent Alexander, CIC Conference Presentation on IPUMS, 10/11/07 Age Birthplace Sex Mother’s birthplace Relationship Race Occupation PopulationMicrodata Structure Geographic and housing characteristics Rows Household records Person records within households Columns Variables Key points – Preserve richness of microdata Location based on administrative units

Microdata Availability Thanks to attending partners: Czech Republic* Mexico* Brazil University of Barcelona Slovenia* Netherlands* Spain* Italy* Poland* *anticipating 2010 or 2011 data Croatia – legislation approved? Norway & Sweden NAPP only Bulgaria no data provided Finland, Slovakia, Kosovo Denmark?

Area-level Data Sources Census tables, especially where microdata is unavailable Other types of surveys, data Agricultural surveys Economic surveys, data Election data Legal information

Environmental Data (Rasters) TerraPop Prototype Land cover data from satellite images (Global Land Cover 2000) Agricultural land use data from satellites and government records (Global Landscapes Initiative) Climate data from weather stations (WorldClim)

Location-Based Integration Microdata  Area-level  Raster

Location-Based Integration Microdata Integration across domains, formats hinges on geography Users get any type of data in format useful to them Requires boundary files, boundaries harmonized over time Rasters Area-level data

Location-Based Integration Microdata Individuals and households with their environmental and social context Integration across domains, formats hinges on geography Users get any type of data in format useful to them Requires boundary files, boundaries harmonized over time Rasters Area-level data

Location-Based Integration Microdata Summarized environmental and population County ID G17003100001 G17003100002 G17003100003 G17003100004 G17003100005 G17003100006 G17003100007 County ID Mean Ann. Temp. Max. Ann. Precip. Rent, Rural Rent, Urban Own, Rural Own, Urban G17003100001 21.2 768 3129 1063 637 365 G17003100002 23.4 589 2949 1075 1469 717 G17003100003 24.3 867 3418 1589 1108 617 G17003100004 21.5 943 1882 425 202 142 G17003100005 24.1 2416 572 426 197 G17003100006 24.4 697 2560 934 950 563 G17003100007 25.6 701 2126 653 321 215 County ID Mean Ann. Temp. Max. Ann. Precip. G17003100001 21.2 768 G17003100002 23.4 589 G17003100003 24.3 867 G17003100004 21.5 943 G17003100005 24.1 G17003100006 24.4 697 G17003100007 25.6 701 characteristics for administrative districts Integration across domains, formats hinges on geography Users get any type of data in format useful to them Requires boundary files, boundaries harmonized over time Rasters Area-level data

Location-Based Integration Microdata Rasters of population and environment data Integration across domains, formats hinges on geography Users get any type of data in format useful to them Requires boundary files, boundaries harmonized over time Rasters Area-level data

Boundaries are Key Linkages across data formats rely on administrative unit boundaries Particular needs Lower level boundaries Historical boundaries

Administrative Unit Boundary Processing Obtaining Linking to microdata Temporal harmonization regionalization

Obtaining Boundary Data Potential sources of digital data National Statistical Offices Global Administrative Areas data (e.g. SALB, GAUL) Digitizing from images or paper maps Challenges Lower level and historical data Date mismatches with census data Code matching to microdata

Digitizing Boundaries Leveraging available digital data Script input Existing digital data Rough digitized boundaries Script output Relevant boundaries from digital data Relationship between digital and digitized units Advantages Preserve accuracy and detail Flag areas needing more work

Code Matching Codes link boundaries to microdata records, connect people to places Boundary data may or may not include codes Approach Name matching, when possible Map observations – digitizing script captures codes Research on boundary changes Boundary shape attributes IPUMS microdata

Temporal Harmonization Purpose Create consistent units for time-series analysis Top-down strategy Start with first administrative level units Harmonize 2nd level units within 1st level “containers” Script to create “least common denominator” units Applicable when maps from multiple years are available Creates aggregate units encompassing areas with boundary changes Constructs source-harmonized crosswalk

“Erase” interior boundaries applicable to only one census Apply harmonized codes Also aids in code matching Crosswalk

Regionalization Confidentiality concerns require minimum 20,000 population in each unit disseminated REDCAP tool Constructs regions by combining units Regions meet minimum population threshold Contiguity constrained Combines units that are similar in terms of a selected variable Currently in testing phase REDCAP Algorithms and parameters Optimization variables (e.g., pop. density, education, occupation) Testing on Malawi TAs, Brazil 2000 municipios Guo’s NSF grant number is 0748813, in case they ask

Regionalization - Lilongwe, Malawi Units < 20K combined with neighbors to meet threshold Specific aggregation depends on Optimization variable Algorithm Need to check this map on the projector….may be tough to see

Beyond Administrative Boundaries Arbitrary boundaries rasterization

Arbitrary Boundaries Watersheds, buffers around features, etc. Near-term Summarize rasters to user-supplied boundaries Identify administrative units intersecting user-supplied boundaries Future Reallocation based on uniform distribution assumption Reallocation based on other assumptions

Rasterization Prototype - All cells in unit get the same value Use lowest level units available Rates only, not counts Future – Distribute based on ancillary data Requires research on available methods May provide as service – users select: Ancillary data Weights Spatial distribution parameters