The Use of Census Data and Spatial Statistical Tools in GIS to Identify Economically Distressed Areas Presented by: Barbara Gibson & Ty Simmons SCAUG User.


1 The Use of Census Data and Spatial Statistical Tools in GIS to Identify Economically Distressed Areas Presented by: Barbara Gibson & Ty Simmons SCAUG User Group Meeting, Broken Arrow March 2nd, 2010

2 Introduction
- What we were asked to do
- Creating the Economic Conditions Index
- Using Spatial Statistical Tools
- Mapping the Results

3 What we were asked to do
Transportation planning staff approached us to create an economic conditions index (similar to one used in a Florida study) as part of a TIGER grant application.
The index categorizes block groups by their level of distress as measured by 3 factors:
- Unemployment
- Families in poverty
- Substandard housing
The index is based solely on 2000 census data by block group.
TIGER stands for Transportation Investment Generating Economic Recovery, and we were awarded the grant, the only one in the state! The project was a multi-modal bridge on I-244 over the Arkansas River. The bridge would be the first of its kind in Tulsa, built to accommodate highway traffic, high-speed intercity and commuter rail, and pedestrian and bicycle traffic. The high-speed passenger rail component gives the project national significance.
The index score for each census block group was based on comparing its indicator scores to the average scores for the county in which the block group is located, in our case Tulsa County.

4 Unemployment – Tulsa County
Start with unemployment. P43 is the census table reference number for the data.
P43. SEX BY EMPLOYMENT STATUS FOR THE POPULATION 16 YEARS AND OVER [15]
Universe: Population 16 years and over
For males and females separately, the table reports the population in the labor force (in Armed Forces; civilian, employed or unemployed) and not in the labor force.
Total population 16 years and over: 432,088
Unemployed (male and female combined): 13,841
Percent Unemployment: 13,841 / 432,088 = 3.20%

5 Unemployment – Tract 7 BG 2
Census Tract 7 Block Group 2
P43. SEX BY EMPLOYMENT STATUS FOR THE POPULATION 16 YEARS AND OVER [15]
Universe: Population 16 years and over
Total population 16 years and over: 691
Unemployed (male and female combined): 42
Percent Unemployment: 42 / 691 = 6.08%

6 Unemployment Index Value
Unemployment Index Value for Census Tract 7 Block Group 2:
6.08 / 3.20 = 1.9
(block group percent unemployment divided by county percent unemployment)
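The ratio on this slide can be sketched in a few lines of Python. This is only an illustration of the arithmetic shown on slides 4 to 6; the function names `percent` and `index_value` are hypothetical, and the rounding to two decimals mirrors how the percentages appear on the slides.

```python
def percent(part, total):
    """Share of `part` in `total`, expressed as a percentage."""
    return 100.0 * part / total


def index_value(block_group_pct, county_pct):
    """Indicator score: block group percentage relative to the county percentage."""
    return block_group_pct / county_pct


# Counts taken from slides 4 and 5 (unemployed / population 16 years and over).
county_pct = round(percent(13841, 432088), 2)      # Tulsa County
block_group_pct = round(percent(42, 691), 2)       # Census Tract 7, Block Group 2

unemployment_index = round(index_value(block_group_pct, county_pct), 2)
print(county_pct, block_group_pct, unemployment_index)
```

A value above 1 means the block group is worse off than the county average on that indicator; the same ratio is applied to every other indicator in the index.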

7 Families in Poverty – Tulsa County
P90 is the census table reference number for the data.
P90. POVERTY STATUS IN 1999 OF FAMILIES BY FAMILY TYPE BY PRESENCE OF RELATED CHILDREN UNDER 18 YEARS BY AGE OF RELATED CHILDREN [41]
Universe: Families
Total families: 148,189
Income in 1999 below poverty level: 12,962
Percent Families in Poverty: 12,962 / 148,189 = 8.75%

8 Families in Poverty – Tract 7 BG 2
Census Tract 7 Block Group 2
P90. POVERTY STATUS IN 1999 OF FAMILIES BY FAMILY TYPE BY PRESENCE OF RELATED CHILDREN UNDER 18 YEARS BY AGE OF RELATED CHILDREN [41]
Universe: Families
Percent Families in Poverty: 37 / 207 = 18.14%

9 Family Poverty Index Value
Family Poverty Index Value for Census Tract 7 Block Group 2:
18.14 / 8.75 = 2.07

10 Substandard Housing Index
The Housing Index is based on 3 variables:
- Housing units lacking complete plumbing facilities
- Home value for all owner-occupied housing units
- Year structure built for all housing units
To create the substandard housing index by block group, we found an index for each variable and then took the average of the 3 to get a composite index score.

11 Plumbing Facilities – Tulsa County
H47. PLUMBING FACILITIES [3]
Universe: Housing units
Total housing units: 243,953
Lacking complete plumbing facilities: 1,549
Percent Housing Units Lacking Complete Plumbing: 1,549 / 243,953 = 0.63%

12 Plumbing Facilities – Tract 7 BG 2
Census Tract 7 Block Group 2
H47. PLUMBING FACILITIES [3]
Universe: Housing units
Total housing units: 367
Lacking complete plumbing facilities: 22
Percent Housing Units Lacking Complete Plumbing: 22 / 367 = 5.99%

13 Lacking Complete Plumbing Index Value
Lacking Complete Plumbing Index Value for Census Tract 7 Block Group 2:
5.99 / 0.63 = 9.51

14 Home Value – Tulsa County
H84. VALUE FOR ALL OWNER-OCCUPIED HOUSING UNITS [25]
Universe: Owner-occupied housing units
The table reports counts for 24 value brackets, from "Less than $10,000" to "$1,000,000 or more".
Total owner-occupied housing units: 140,131
Tulsa County median value for all owner-occupied housing units = $85,000
Total number of owner-occupied housing units with a value < $90,000: 76,700
Percent of owner-occupied housing units with a value < $90,000: 76,700 / 140,131 = 55%
For home value and year structure built, we first have to look at the median for the county. In our case, Tulsa County's median home value was $85,000. Because the census values are grouped into brackets, you have to choose a bracket boundary as the cutoff, and we went with all owner-occupied housing units with a value less than $90,000.

15 Home Value – Tract 7 BG 2
Census Tract 7 Block Group 2
H84. VALUE FOR ALL OWNER-OCCUPIED HOUSING UNITS [25]
Universe: Owner-occupied housing units
Total owner-occupied housing units: 192
Total number of owner-occupied housing units with a value < $90,000: 172
Percent of owner-occupied housing units with a value < $90,000: 172 / 192 = 89.58%

16 Home Value Index
Home Value Index for Census Tract 7 Block Group 2:
89.58 / 55 = 1.63

17 Year Built – Tulsa County
H34. YEAR STRUCTURE BUILT [10]
Universe: Housing units
The table reports counts for nine construction periods, from "Built 1939 or earlier" to "Built 1999 to March 2000".
Total housing units: 243,953
Tulsa County median year structure built for all housing units = 1972
Total number of housing units built before 1970: 111,807
Percent of housing units built before 1970: 111,807 / 243,953 = 46%
For year built we follow the same principle. For Tulsa County the median year built was 1972, so we went with all housing units built before 1970.

18 Year Built – Tract 7 BG 2
Census Tract 7 Block Group 2
H34. YEAR STRUCTURE BUILT [10]
Universe: Housing units
Total housing units: 367
Total number of housing units built before 1970: 302
Percent of housing units built before 1970: 302 / 367 = 82.29%

19 Year Built Index
Year Structure Built Index for Census Tract 7 Block Group 2:
82.29 / 46 = 1.79

20 Substandard Housing Index
Now that we have the indicator scores for the 3 housing variables, we can develop our composite index. Indicator scores for each block group were summed and averaged to provide an overall substandard housing index score. The scores below are for our sample block group, Census Tract 7 Block Group 2.
Lacking complete plumbing index score = 9.51
Home Value index score = 1.63
Year Structure Built index score = 1.79
Substandard Housing Index Score for Census Tract 7 Block Group 2 = (9.51 + 1.63 + 1.79) / 3 = 4.31

21 Economic Conditions Index
Now that we have our composite index score for substandard housing, we use the same formula for our overall economic conditions score. Indicator scores for each census block group are summed and averaged to provide an overall economic conditions index score.
Unemployment Index = 1.9
Family Poverty Index = 2.07
Substandard Housing Index = 4.31
Index score for Census Tract 7 Block Group 2 = (1.9 + 2.07 + 4.31) / 3 = 2.76
An index score was created for each of the 410 block groups within Tulsa County.
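The whole calculation for the sample block group can be put together as a short Python sketch. The percentages are the ones shown on the preceding slides; the dictionary keys and the function name `economic_conditions_index` are hypothetical, and rounding each intermediate score to two decimals is an assumption made so the output matches the slide values.

```python
from statistics import mean

# Percentages from the slides (Tulsa County vs. Census Tract 7, Block Group 2).
COUNTY = {"unemployment": 3.20, "poverty": 8.75,
          "plumbing": 0.63, "home_value": 55.0, "year_built": 46.0}
TRACT7_BG2 = {"unemployment": 6.08, "poverty": 18.14,
              "plumbing": 5.99, "home_value": 89.58, "year_built": 82.29}

HOUSING_VARIABLES = ("plumbing", "home_value", "year_built")


def economic_conditions_index(block_group, county):
    """Average of the unemployment, family poverty, and substandard housing scores."""
    # Each indicator score is the block group percentage divided by the county percentage.
    score = {name: round(block_group[name] / county[name], 2) for name in county}
    # The substandard housing score is itself the average of three indicator scores.
    housing = round(mean(score[name] for name in HOUSING_VARIABLES), 2)
    overall = round(mean([score["unemployment"], score["poverty"], housing]), 2)
    return score, housing, overall


scores, housing_index, overall_index = economic_conditions_index(TRACT7_BG2, COUNTY)
print(scores, housing_index, overall_index)
```

Because housing enters as a single averaged score, its three variables together carry the same weight as unemployment or family poverty alone.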

22 Index Scores Mapped The results were mapped based on the natural breaks method using five classes. We knew north Tulsa was an economically distressed area from the beginning and our analysis illustrates that. We wanted to somehow emphasize the clustering of the high index score values around the I-244 bridge relative to the rest of Tulsa County, so we began exploring the spatial statistical tools in ArcGIS.
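For readers unfamiliar with natural breaks, the idea is to pick class boundaries that minimize the variation within each class. The sketch below is a small exact Fisher-Jenks style implementation in plain Python, not the ArcGIS classifier itself, whose internals may differ. The sample scores are hypothetical and three classes are used to keep the example short; the map on this slide used five.

```python
def natural_breaks(values, k):
    """Return the upper bound of each of k classes that minimize within-class variance."""
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of values")

    # Prefix sums give the sum of squared deviations of any slice in constant time.
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(xs):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def ssd(i, j):
        """Sum of squared deviations from the mean for xs[i:j] (requires j > i)."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[c][j]: lowest total within-class SSD when xs[:j] is split into c classes.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                cost = best[c - 1][i] + ssd(i, j)
                if cost < best[c][j]:
                    best[c][j] = cost
                    cut[c][j] = i

    # Walk back through the recorded cuts to recover each class's upper bound.
    bounds = []
    j = n
    for c in range(k, 0, -1):
        bounds.append(xs[j - 1])
        j = cut[c][j]
    return bounds[::-1]


# Hypothetical scores with three obvious groups.
sample_scores = [1, 2, 3, 10, 11, 12, 50, 51, 52]
print(natural_breaks(sample_scores, 3))
```

This brute-force version is quadratic in the number of values per class, which is fine for a few hundred block groups but not for very large datasets.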

23 Using Spatial Statistical Tools
The Spatial Statistical Tools are found in ArcToolbox. Toolsets include:
- Analyzing Patterns
- Mapping Clusters (includes the Cluster and Outlier Analysis tool)
- Measuring Geographic Distributions
- Modeling Spatial Relationships
The Spatial Statistical Tools come standard with ArcGIS; they are not an extension. The Modeling Spatial Relationships toolset is not available with ArcView.

24 Using Spatial Statistical Tools
Spatial Statistics: The love child of Geography and Statistics
- Were developed specifically for use with geographic data
- Incorporate space (proximity, area, and connectivity) into the statistical process
- Allow you to analyze spatial distributions, patterns, processes, and relationships
- Differ from traditional statistics in that you are not making inferences about the data; rather, you typically are dealing with all the available data in your study area
Traditional statistics typically works with a random sample, and you try to determine whether your sample is a good representation of the population at large. For example, what are the chances that the results from my exit poll will reflect the final election results? On the other hand, when you compute a statistic for the entire population with spatial statistics, you do not have an estimate but a fact, since you are dealing with all the possible data.

25 Using Spatial Statistical Tools
The Statistics behind it all
The randomization null hypothesis is used by many of the tools in the spatial statistics toolbox for statistical significance testing. It postulates that there is no spatial pattern among the features, or among the values associated with those features, in the study area.
Most statistical tests begin by identifying a null hypothesis, which is a statement of no effect or no difference. The cluster and outlier analysis tool, like many of the spatial statistical tools, uses the randomization null hypothesis. If I were to pick up the index score values and throw them into the block groups, I would have one possible spatial arrangement. Under the randomization null hypothesis, if I did this an infinite number of times, most of the time the pattern would be different from the observed pattern (what our actual index score map looks like). In other words, the randomization null hypothesis states that your data is one of many, many possible versions of complete spatial randomness. The data values are fixed; only their spatial arrangement could vary.
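The "pick up the values and throw them back down" idea can be illustrated with a small permutation test. This sketch uses the global Moran's I as a simple pattern statistic, which is a swapped-in stand-in and not the Local Moran's I computed by the Cluster and Outlier Analysis tool. The eight features arranged in a row, their values, and the function names are all hypothetical.

```python
import random


def morans_i(values, neighbors):
    """Global Moran's I with binary weights; neighbors[i] lists the neighbors of feature i."""
    n = len(values)
    avg = sum(values) / n
    dev = [v - avg for v in values]
    cross = sum(dev[i] * dev[j] for i in range(n) for j in neighbors[i])
    weight_total = sum(len(nb) for nb in neighbors)
    return (n / weight_total) * (cross / sum(d * d for d in dev))


def permutation_p_value(values, neighbors, n_perm=999, seed=0):
    """Share of random rearrangements at least as clustered as the observed one."""
    rng = random.Random(seed)
    observed = morans_i(values, neighbors)
    shuffled = list(values)
    at_least_as_clustered = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # same values, new spatial arrangement
        if morans_i(shuffled, neighbors) >= observed:
            at_least_as_clustered += 1
    # The observed arrangement counts as one of the possible arrangements.
    return observed, (at_least_as_clustered + 1) / (n_perm + 1)


# Hypothetical study area: eight features in a row, each adjacent to the next.
n_features = 8
CHAIN = [[j for j in (i - 1, i + 1) if 0 <= j < n_features] for i in range(n_features)]
index_scores = [1, 1, 2, 2, 8, 9, 9, 10]  # low values at one end, high values at the other

observed_i, p_value = permutation_p_value(index_scores, CHAIN)
print(observed_i, p_value)
```

The values never change between rearrangements, only their locations do, which is exactly the sense in which the data values are fixed under this null hypothesis.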

26 Using Spatial Statistical Tools
The Statistics behind it all
- Z score: a test of statistical significance that helps you decide whether or not to reject the null hypothesis. Z scores tell us how many standard deviations our index scores are from the mean, and in what direction.
- P-value: the probability of seeing a pattern at least as extreme as the observed one if the null hypothesis were true; loosely, the risk that you have falsely rejected the null hypothesis. The smaller the p-value, the stronger the evidence against the null hypothesis.
In order to determine whether or not to reject the null hypothesis, you have to derive a Z score and a p-value. Together they help us determine whether the clustering we see on our map is actually statistically significant.

27 Using Spatial Statistical Tools
The Statistics behind it all
- Both the Z score and the p-value are associated with the standard normal distribution, which relates standard deviations to probabilities and allows significance and confidence to be attached to Z scores and p-values
- Very high or very low (negative) Z scores, with very small p-values, are found in the tails of the normal distribution
- Using a 95% confidence level, the critical Z scores are -1.96 and +1.96 and the corresponding p-value is 0.05; a Z score beyond either value means you can reject the null hypothesis
[Figure: normal distribution curve with the central 95% between the two critical Z scores and 2.5% in each tail]
When you perform a feature pattern analysis, such as cluster/outlier analysis, and it yields small p-values and either a very high or very low (negative) Z score, this indicates it is very unlikely that the observed pattern is some version of the theoretical spatial random pattern represented by your null hypothesis. Thus you can reject the null hypothesis.

28 Using Spatial Statistical Tools
The Statistics behind it all
On the other hand, a Z score between -1.96 and +1.96 means the p-value will be larger than 0.05, and thus the null hypothesis cannot be rejected.
[Figure: normal distribution curve with the central 95% between the two critical Z scores and 2.5% in each tail]
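The link between a Z score and its two-tailed p-value under the standard normal distribution can be checked with Python's standard library. This is a minimal sketch; the function names are hypothetical, and the 0.05 threshold corresponds to the 95% confidence level used on these slides.

```python
import math


def two_tailed_p_value(z):
    """Probability of a standard normal value at least as far from 0 as z, in either tail."""
    return math.erfc(abs(z) / math.sqrt(2))


def can_reject_null(z, alpha=0.05):
    """True when the Z score falls in the tails beyond the chosen significance level."""
    return two_tailed_p_value(z) < alpha


for z in (-2.5, -1.96, 0.0, 1.0, 1.96, 2.5):
    print(z, round(two_tailed_p_value(z), 4), can_reject_null(z))
```

Z scores near 0 give p-values near 1, and the p-value shrinks quickly as the Z score moves into either tail.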

29 Using Spatial Statistical Tools
Cluster and Outlier Analysis
Identifies clusters of features with similar magnitudes, as well as spatial outliers. It does this by calculating, for each feature:
- Local Moran's I value
- Z score
- P-value
- COType field
Interpretation:
- A positive I value indicates a cluster
- A negative I value indicates an outlier
- The COType field distinguishes between statistically significant
  - clusters of high values (HH)
  - clusters of low values (LL)
  - outliers with a high value surrounded primarily by low values (HL)
  - outliers with a low value surrounded primarily by high values (LH)
It is important to note that the Cluster and Outlier Analysis tool requires projected data to accurately measure distance. The Local Moran's index evaluates whether the pattern expressed is clustered, dispersed, or random. It can only be interpreted within the context of the computed Z score or p-value. The COType field gives you an alpha code of HH, LL, HL, or LH for statistically significant features.
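The four COType codes can be illustrated with a simplified rule in Python. This is a sketch of the interpretation described above, not the tool's own algorithm: it assumes the p-value has already been computed, compares a feature and the average of its neighbors against the study-area mean, and uses a hypothetical function name and a hypothetical 0.05 significance threshold.

```python
def cotype(value, neighbor_mean, study_area_mean, p_value, alpha=0.05):
    """Return HH, LL, HL, or LH for a significant feature, or '' when not significant."""
    if p_value >= alpha:
        return ""  # not statistically significant, so no cluster or outlier code
    feature_is_high = value > study_area_mean
    neighbors_are_high = neighbor_mean > study_area_mean
    if feature_is_high and neighbors_are_high:
        return "HH"  # high value among high values: cluster of high values
    if not feature_is_high and not neighbors_are_high:
        return "LL"  # low value among low values: cluster of low values
    # The feature differs from its surroundings: a spatial outlier.
    return "HL" if feature_is_high else "LH"


# Hypothetical index scores with a study-area mean of 1.0.
print(cotype(2.76, 2.4, 1.0, 0.001))  # high score surrounded by high scores
print(cotype(2.76, 0.6, 1.0, 0.001))  # high score surrounded by low scores
print(cotype(2.76, 2.4, 1.0, 0.30))   # same pattern, but not significant
```

The first two cases match the sign rule on the slide: a feature that resembles its neighbors yields a positive Local Moran's I (a cluster), while one that differs from them yields a negative value (an outlier).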

30 Using Spatial Statistical Tools
Cluster and Outlier Analysis

31 Mapping the Results

32 Mapping the Results
Census Tract 7, Block Group 2
Local Moran's I Value =
Z score =
P-value = 0
COType field = HH
A positive Local Moran's I value means that this block group is part of a cluster. The large Z score means that the result for this block group is statistically significant. The very small (effectively zero) p-value means we can safely reject our null hypothesis. And the cluster type field reveals that this block group is part of a cluster of high economic distress values.

33 Other Applications
- Brownfield identification funding
- Kendall-Whittier Tulsa Community Foundation
- Anticipate using for environmental justice maps for 2035 regional transportation plan

34 Questions? Contact Information: Ty Simmons – Barbara Gibson – Phone:

