Statistical Peril in the Transportation Planning Polygon Kevin Hathaway, Colin Smith, & John Gliebe May 2013.

Presentation transcript:

Aggregated Data – A Planning Reality & A Planning Problem
- Aggregation units are required since traffic analysis zones (TAZs) are the convenient grouping scheme for regional and statewide transportation planning.
- Zone-level variables are both consumed on their own and used as inputs to travel demand and land use allocation models, with the assumption that the groupings are real and fixed.
- The fundamentals of spatial analysis and statistical sampling error are commonly ignored, which can have undesirable consequences.

Modifiable Areal Unit Problem: The Zone Effect
- The sizes and shapes of planning zones are modifiable and arbitrary (they rarely represent real geographical properties or segment the population in a meaningful way).
- Changing the polygon boundaries can drastically change the zonal statistics (e.g., gerrymandering).
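The zone effect can be sketched with a toy example. The data below are made up for illustration (twelve households along a corridor, 1 = has the attribute of interest), and both boundary schemes are hypothetical. The households never change; only the zone boundaries do.

```python
# Hypothetical data: 1 = household has the attribute, 0 = it does not,
# listed in order along a corridor.
households = [1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0]

def zone_shares(values, boundaries):
    """Share of 1s in each zone; boundaries are (start, end) index pairs."""
    return [sum(values[a:b]) / (b - a) for a, b in boundaries]

# Scheme A: four zones whose boundaries happen to line up with the clusters.
scheme_a = [(0, 3), (3, 6), (6, 9), (9, 12)]
# Scheme B: four zones again, but with the boundaries drawn elsewhere.
scheme_b = [(0, 2), (2, 4), (4, 8), (8, 12)]

print("Regional share:", sum(households) / len(households))
print("Scheme A:", zone_shares(households, scheme_a))
print("Scheme B:", zone_shares(households, scheme_b))
```

Both schemes cover the same twelve households and the same regional share, yet the zone-level picture differs: one set of boundaries suggests sharply segregated zones, the other a gentle gradient.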

Modifiable Areal Unit Problem: The Scale Effect
- The scale of the zones will also change the results.
- As the polygons get bigger and the underlying population grows, variability is washed away.
- As the polygons get smaller and the underlying population shrinks, we are more likely to observe extreme (and perhaps unreliable) values.
- When we mix scales in a planning region, both statistical properties will be present.

Normalizing a Layer’s Attribute - ArcMap

Show a map with New York State Housing Units Block-level: Units per Person

Two Ways to View the Distribution (figure; axes: Housing Units per Person, Population in Block)

Standard Errors for Averages and Proportions
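The slide's own formulas did not survive in this transcript. As a reminder, a minimal sketch of the two textbook standard errors is given below; the function names and the example values (a standard deviation of 10 and a proportion of 0.2) are assumptions chosen for illustration.

```python
import math

def se_mean(sd, n):
    """Standard error of a sample mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)

def se_proportion(p, n):
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

# Illustrative values only: the base size n is the zone's denominator.
for n in (25, 100, 400, 1600):
    print(n, round(se_mean(10.0, n), 3), round(se_proportion(0.2, n), 4))
```

In both cases the base size sits under a square root, so quadrupling the denominator only halves the standard error, and small zones carry much noisier statistics than large ones.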

Start with One Polygon
- Simulated polygon with a population of orange and grey squares.
- Color locations are randomly assigned.
- 20.2% of the zone is orange.
- Cut the polygon up and measure the orange within each smaller polygon.
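A rough version of this experiment can be run in a few lines. The sketch below is not the authors' simulation: the 60-by-60 grid, the 0.2 probability of orange, the seed, and the block sizes are all assumptions chosen for illustration.

```python
import random
import statistics

def make_grid(size, p_orange, seed):
    """Square grid of cells; 1 = orange, 0 = grey, assigned at random."""
    rng = random.Random(seed)
    return [[1 if rng.random() < p_orange else 0 for _ in range(size)]
            for _ in range(size)]

def block_shares(grid, k):
    """Orange share in each k-by-k block (k must divide the grid size)."""
    size = len(grid)
    shares = []
    for r in range(0, size, k):
        for c in range(0, size, k):
            cells = [grid[i][j] for i in range(r, r + k) for j in range(c, c + k)]
            shares.append(sum(cells) / len(cells))
    return shares

grid = make_grid(60, 0.2, seed=2013)
overall = sum(map(sum, grid)) / (60 * 60)
print("Overall orange share:", round(overall, 3))

# Cut the same polygon into smaller and smaller pieces.
for k in (30, 10, 5, 2):
    shares = block_shares(grid, k)
    print(f"{k * k:4d} cells per block, {len(shares):3d} blocks: "
          f"min={min(shares):.2f} max={max(shares):.2f} "
          f"sd={statistics.pstdev(shares):.3f}")
```

Because the colors are assigned at random, any differences between blocks are pure sampling noise, and the spread of block-level shares widens as the blocks shrink.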

Look at size before location
Always plot your statistic against its own denominator. Funnel or cone shapes indicate that a scale effect may be playing a role.
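One way to read such a plot is to overlay the band that sampling noise alone would produce around the regional value. The sketch below assumes a normal approximation (which is rough for very small bases) and uses hypothetical zone counts; the function names and the 1.96 multiplier are illustrative choices, not part of the original slides.

```python
import math

def funnel_limits(p_region, n, z=1.96):
    """Approximate band around the regional proportion for a base size n."""
    half_width = z * math.sqrt(p_region * (1 - p_region) / n)
    return max(0.0, p_region - half_width), min(1.0, p_region + half_width)

def outside_funnel(count, n, p_region, z=1.96):
    """True if a zone's observed share falls outside the band for its size."""
    low, high = funnel_limits(p_region, n, z)
    return not (low <= count / n <= high)

# Hypothetical zones: (name, orange count, base size).
zones = [("A", 3, 8), ("B", 40, 200), ("C", 120, 400)]
p_region = 0.202
for name, count, n in zones:
    low, high = funnel_limits(p_region, n)
    flag = "outside" if outside_funnel(count, n, p_region) else "inside"
    print(f"Zone {name}: share={count / n:.3f}, band=({low:.3f}, {high:.3f}), {flag}")
```

In this made-up data, the small zone has the most extreme share but still sits inside its wide band, while the large zone has a more modest share that falls outside its narrow one.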

More on Scale – Conventional Guidance on TAZ Size
According to AASHTO: “…, it is strongly suggested that TAZs should be delineated with a resident or worker population of 1,200 or greater.”

Land Use Model Inputs (figure; axes: Employment Density (jobs/acre), Non-residential Developed Acres)

Rates of Seatbelt Use Across a State (figure; axis: Road Segment Daily Volume)

What should you do?
1. Resist the temptation to explain all the spatial and temporal variability.
2. For TAZ delineation, optimization routines and explicit testing of varying zone structures have been proposed (Ding, 1998 & Viegas, et al., 2007).
3. Run simulations on your own planning units to explore the severity of the zone and scale effects. The impacts depend on the measures and the specific region under study.

More Tactical Adjustments during Data Exploration
1. Binomial data with small n: methods that follow the Law of Succession (Laplace, Wilson, or Jeffreys) are helpful to improve small-sample statistics.
2. For zone-level means, you can center the distribution by using the regional mean as the expected value.
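The small-n adjustments named above can be sketched as point estimates. The slides only name the methods, so the specific forms below are assumptions: Laplace as (x + 1) / (n + 2), Jeffreys as (x + 0.5) / (n + 1), and Wilson as the midpoint of the Wilson score interval with z = 1.96.

```python
def laplace(x, n):
    """Laplace's rule of succession: add one success and one failure."""
    return (x + 1) / (n + 2)

def jeffreys(x, n):
    """Jeffreys-style estimate: add half a success and half a failure."""
    return (x + 0.5) / (n + 1)

def wilson(x, n, z=1.96):
    """Midpoint of the Wilson score interval (assumed 95% level)."""
    return (x + z * z / 2) / (n + z * z)

# Hypothetical zones: (successes, base size).
for x, n in [(5, 5), (0, 3), (450, 500)]:
    print(f"{x}/{n}: raw={x / n:.3f} laplace={laplace(x, n):.3f} "
          f"jeffreys={jeffreys(x, n):.3f} wilson={wilson(x, n):.3f}")
```

All three pull small-base estimates away from 0% and 100% toward 50%, and the adjustment fades as n grows, so large zones are left essentially unchanged.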

Mapping Polygon Values: 2008 Presidential Election Results (maps: Mark Newman, University of Michigan)

Cartograms in ArcGIS & R

Recap and Final Thoughts
1. Rather than ignoring sampling variation, we should recognize its presence.
2. Rather than only asking whether the observed differences are a function of location or polygon-specific attributes, consider that some or most of the differences could merely be a function of the base size and your zonal delineation.
3. Real variation due to the underlying spatial phenomenon is often blurred by our unit of analysis. Both aggregation and disaggregation create problems; our job is to understand the trade-offs.
4. The least densely populated zones are sometimes the largest. Thematic mapping has the unfortunate consequence of overemphasizing large units and minimizing small ones. Consider alternatives that are more honest in their visual representation.

Sources & Further Reading
1. Statistics for Spatial Data. Noel A. Cressie
2. Spatial Modeling of Regional Variables. Noel Cressie and Ngai H. Chan
3. The Most Dangerous Equation. Howard Wainer
4. Diffusion-based method for producing density-equalizing maps. Michael T. Gastner and M. E. J. Newman
5. Effects of the modifiable areal unit problem on the delineation of traffic analysis zones. Viegas, et al.
6. The GIS-Based Human Interactive TAZ Design Algorithm: Examining the Impacts of Data Aggregation on Transportation Planning Analysis. Ding, C.
7. When 100% Really Isn’t 100%: Improving the Accuracy of Small-Sample Estimates of Completion Rates. James Lewis and Jeff Sauro