Geog 463 GIS Workshop April 17, 2006
Outline
  Data Acquisition
    – Acquiring spatial data
    – Metadata
    – Spatial data quality
    – Determining fitness-for-use of data
  Spatial Data Infrastructure (SDI)
    – Concepts of SDI
    – What constitutes SDI?
    – How can SDI be characterized?
Part I. Data Acquisition
  Evaluating the applicability of data is one of the essential skills for GIS professionals
Acquiring spatial data
  Use a data download service
    – USGS National Map Seamless Data Distribution System
    – USGS EROS Data Center
    – Microsoft’s Terraserver
    – TIGER/Line from the Census Bureau or ESRI
    – Subnational GIS clearinghouse (e.g. WAGDA)…
  Use a data catalogue service (or spatial portal)
    – Geospatial One-Stop
    – ESRI geographynetwork.com
Tips for spatial & non-spatial data acquisition
  By geographic scale
    – Data resolution is often related to the geographic scale of the agency providing the data
    – Federal data sources tend to have lower resolution with wider geographic coverage (e.g. LU/LC in EROS Data Center)
    – Parcel data can be found at the local level (e.g. City of Seattle)
  By related agencies and organizations
    – Best data about housing can be found at HUD…
    – Best data about transportation can be found at BTS…
    – Best data about education can be found at NCES…
    – Best data about justice can be found at BJS…
  By theme
    – Talk to resource persons in the area; they have probably gone through the data search process
  Also read this if you’re not familiar with the UW library system: http://courses.washington.edu/geog360a/dataatlibs2003.ppt
Metadata
  Describes the content and characteristics of data
  Helps determine fitness for use
    – Is the data suitable for the application?
  Is metadata always available?
    – No (widely shared data is more likely to be published with metadata, e.g. USGS public domain data)
  What if metadata is not available?
    – Look for a data dictionary at least, or contact the persons in charge
  Metadata standard for public data in the U.S.
    – FGDC metadata content standard (
Reading FGDC metadata
  Want to know…?                    Section in FGDC metadata
  Map scale or resolution           Data Quality - Lineage
  How current?                      Identification - Time Period
  Which area is covered?            Identification - Spatial Domain
  How is data processed?            Data Quality - Lineage
  How accurate?                     Data Quality - Accuracy
  Datum, map projection             Spatial Reference
  Data structure {vector, raster}   Spatial Data Organization
  Attributes                        Entity and Attribute
  Never miss reading the abstract and purpose!
  Example:
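
As a hypothetical illustration of the question-to-section lookup above, here is a minimal Python sketch that pulls a few answers out of an FGDC record stored as XML. The sample record is invented and heavily trimmed, and the helper name `read_fgdc` and the `QUESTIONS` table are assumptions made for this sketch. The element short names (`idinfo`, `timeperd`, `dataqual`, `lineage`, and so on) follow the FGDC content standard's tag names as I understand them; check them against a real record before relying on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed FGDC record, invented for illustration only.
SAMPLE = """<metadata>
  <idinfo>
    <descript>
      <abstract>Street centerlines for a fictional city.</abstract>
      <purpose>General-purpose base mapping.</purpose>
    </descript>
    <timeperd>
      <timeinfo><sngdate><caldate>20050601</caldate></sngdate></timeinfo>
    </timeperd>
  </idinfo>
  <dataqual>
    <posacc><horizpa><horizpar>Within 5 meters of true position.</horizpar></horizpa></posacc>
    <lineage>
      <srcinfo><srcscale>24000</srcscale></srcinfo>
      <procstep><procdesc>Digitized from orthophotos.</procdesc></procstep>
    </lineage>
  </dataqual>
  <spdoinfo><direct>Vector</direct></spdoinfo>
</metadata>"""

# Question -> path of the leaf element that answers it (relative to <metadata>).
QUESTIONS = {
    "Abstract": "idinfo/descript/abstract",
    "Purpose": "idinfo/descript/purpose",
    "Map scale (source scale denominator)": "dataqual/lineage/srcinfo/srcscale",
    "How current?": "idinfo/timeperd/timeinfo/sngdate/caldate",
    "How is data processed?": "dataqual/lineage/procstep/procdesc",
    "How accurate?": "dataqual/posacc/horizpa/horizpar",
    "Datum": "spref/horizsys/geodetic/horizdn",
    "Data structure": "spdoinfo/direct",
}


def read_fgdc(xml_text):
    """Return {question: text or None} for each entry in QUESTIONS."""
    root = ET.fromstring(xml_text)
    answers = {}
    for question, path in QUESTIONS.items():
        node = root.find(path)
        # None marks a section the record does not provide.
        answers[question] = node.text.strip() if node is not None and node.text else None
    return answers


if __name__ == "__main__":
    for question, answer in read_fgdc(SAMPLE).items():
        print(f"{question}: {answer if answer is not None else '(not stated)'}")
```

A `None` answer is itself informative: a record with no Spatial Reference or Data Quality content leaves the corresponding fitness-for-use question open.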
Creating metadata
  How do I create metadata?
    – Use a metadata creation/editing tool
        ArcCatalog from ESRI
        tkme from
  How do I check whether the metadata conforms to the FGDC Content Standard?
    – Use a metadata validation tool
        Install program mp from
        Use web service at
Spatial data quality
  Data quality matrix, where
    – Columns: components of geographic information (Space, Time, Attribute)
    – Rows: components of data quality (Accuracy, Consistency, Completeness)
  Accuracy: lack of discrepancy between a measurement and the value considered true (e.g. is this location near its true value?)
  Consistency: whether given components conform to logical rules (e.g. any digitizing errors?)
  Completeness: whether what is required is encoded in the data (i.e. is anything missing?)
  FGDC metadata terms
    – Accuracy: Positional accuracy, Attribute accuracy
    – Consistency: Logical consistency
    – Completeness: Completeness
  How is spatial data quality related to fitness for use of data?
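
The definition of accuracy above (discrepancy between a measurement and the value considered true) can be made concrete with a small calculation. The sketch below summarizes horizontal positional error as a root-mean-square error (RMSE) over check points. RMSE is one common summary, chosen here for illustration; the slide does not prescribe it. The function name and coordinates are hypothetical.

```python
import math


def horizontal_rmse(measured, reference):
    """RMSE of horizontal distances between paired (x, y) points.

    measured  -- coordinates taken from the dataset being evaluated
    reference -- coordinates of the same points from a source considered true
    Both lists must be in the same projected coordinate system and units.
    """
    if not measured or len(measured) != len(reference):
        raise ValueError("need two non-empty point lists of equal length")
    squared_errors = [
        (mx - rx) ** 2 + (my - ry) ** 2
        for (mx, my), (rx, ry) in zip(measured, reference)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Hypothetical check points, in meters.
    dataset_points = [(3.0, 4.0), (10.0, 10.0)]
    survey_points = [(0.0, 0.0), (13.0, 14.0)]
    print(horizontal_rmse(dataset_points, survey_points))
```

The result is in the units of the coordinates, so it can be compared directly with the error tolerance of the application.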
Determining fitness for use
  Does the map scale or resolution of the data provide the level of detail required by the application?
    – Using a low-resolution satellite image for a street-level survey is not acceptable
    – Were any generalization algorithms used?
  Is the data current enough to support the needs identified in P1?
    – Using outdated data to replace an old map is not acceptable
  Are specific characteristics of the data useful for the application?
    – Topology for routing operations
    – Multispectral imagery for land use detection
    – Non-planar representation for 3D visualization
  Are any processing steps linked to the usefulness of the data for specific applications?
    – Some processing steps have irreversible effects on data (e.g. unknown algorithm parameters)
  *Questions shown in this lecture note are not intended to be exhaustive
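
Some of the questions above can be turned into a first-pass screening once the application's requirements are written down. The sketch below is only an illustration under assumed names (`DatasetInfo`, `Requirements`, `screen`) and covers three of the checks: scale, currency, and topology. A real assessment still needs the judgment the questions call for.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DatasetInfo:
    scale_denominator: int  # e.g. 24000 for a 1:24,000 source
    content_date: date      # date the data content refers to
    has_topology: bool


@dataclass
class Requirements:
    max_scale_denominator: int  # coarsest scale the application tolerates
    oldest_acceptable: date
    needs_topology: bool        # e.g. True for routing


def screen(dataset, req):
    """Return a list of reasons the dataset fails the stated requirements."""
    problems = []
    # A larger denominator means a smaller scale, i.e. less detail.
    if dataset.scale_denominator > req.max_scale_denominator:
        problems.append("scale too coarse")
    if dataset.content_date < req.oldest_acceptable:
        problems.append("data too old")
    if req.needs_topology and not dataset.has_topology:
        problems.append("topology missing")
    return problems


if __name__ == "__main__":
    # Hypothetical dataset and routing application.
    streets = DatasetInfo(100000, date(1998, 1, 1), has_topology=False)
    routing = Requirements(24000, date(2004, 1, 1), needs_topology=True)
    print(screen(streets, routing))
```

An empty list means only that the dataset passed these three checks, not that it is fit for use in every respect.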
Determining fitness for use
  Is the stated level of accuracy sufficient given the error tolerance?
    – Requirements for accuracy vary widely by application
    – Required types of accuracy vary by need-to-know questions or research questions (e.g. measuring parcel size requires relative accuracy, while surveying requires absolute accuracy)
  Is the stated level of completeness of features or attributes adequate for the need-to-know question?
    – Some entities and attributes are required rather than optional
  Is the data logically consistent?
    – Does the data fail to conform to logical rules? (e.g. are identifiers generated properly? Does the data have too many slivers?)
    – Does the metadata indicate that the agency put any effort into quality control? (e.g. a lack of information in the data quality section suggests not)
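
The parcel example above (relative versus absolute accuracy) can be checked with a short calculation. The sketch below shifts a hypothetical rectangular parcel by a constant offset, which degrades its absolute position but leaves its shape intact, and computes the area with the shoelace formula before and after. The coordinates and function name are made up for illustration.

```python
import math


def polygon_area(vertices):
    """Area of a simple polygon from its (x, y) vertices (shoelace formula)."""
    total = 0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


# Hypothetical 30 m x 20 m parcel, coordinates in meters.
parcel = [(0, 0), (30, 0), (30, 20), (0, 20)]

# The same parcel with a constant positional error of (+5 m, -3 m).
shifted = [(x + 5, y - 3) for x, y in parcel]

if __name__ == "__main__":
    print("absolute offset (m):", math.hypot(5, -3))
    print("area before (m^2):", polygon_area(parcel))
    print("area after  (m^2):", polygon_area(shifted))
```

Both areas come out the same because a uniform shift preserves the relative positions of the vertices. For a parcel-size question the shifted data would still be usable, whereas for a survey that needs true ground positions it would not.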
Part II. Spatial Data Infrastructure
  Searching for the day when data acquisition is less painful
Imagine the future when information is extracted from data upon request (maybe the future is now)
  In the future, data is right there, and different data are integrated seamlessly so that value-added products can be generated in a timely fashion
  What are the barriers to getting there?
  Are we getting there?
  What are the steps towards making the best use of spatial data?
Role of geographic information
  Statistics show that 80% of government-related activities require locational information
  Business demand exists for analyzing customers’ needs on a locational basis
  There are overriding concerns for understanding the complexity of the human and natural environments and their interaction
  A locational framework can act as a glue that holds related themes together
    – Sustainability can be understood by examining the relationships among all related themes, not by examining one theme in isolation
  Sustainability has been widely acknowledged as a future agenda across varying organizational structures
Spatial data as commodity
  Thus spatial data is increasingly seen as an asset that promotes good governance, economic development, and improved environmental sustainability, as attention to holistic approaches grows
  It is also seen as a push towards an information society
  In addition, many sustainability concerns cannot be addressed without cutting across multiple jurisdictions
  Access to applicable spatial data is essential to this endeavor
  A spatial data infrastructure provides an enabling environment for a spatially enabled society
Spatial data & integration
  Thematic integration
  Geographic integration
SDI: Reinventing the meaning of GIS
  The larger the scope of a GIS project, the more base data is shared, and thus the greater the benefit
  What about national or global efforts to share spatial data? (e.g. NSDI, Global Map)
  Spatial data as an infrastructure
    – An airplane doesn’t fly, but an airline does
    – A highway alone doesn’t take you there, but all the related technical and institutional arrangements do
    – Spatial data alone doesn’t meet your needs, but the multiple aspects related to the creation, maintenance, extraction, and dissemination of spatial data do
What constitutes SDI?
  What is needed to facilitate data sharing? Or, conversely, what are the barriers to data sharing?
    – What if there is no metadata that describes the data?
    – What if there are no people who know the characteristics and constraints attached to spatial data?
    – What if there is no website for data dissemination?
    – What if there are no standards that promote interoperability (e.g. FGDC metadata content standard)?
    – What if there is no coordination between agencies?
    – What if there is no willingness to share data?
  SDI = spatial data + people + technology + standards + policy
How can SDI be characterized?
  SDIs are shared
    – as they seek to make available, expensive, geo-referenced spatial data digitally to a variety of users for diverse application needs (for example, biodiversity, utilities, and health) based on an integrated approach.
  SDIs are open
    – as no pre-defined boundaries limiting the user groups are made, and typically various government departments, citizens, and private sector are expected to draw upon them.
  SDIs are inherently enabling
    – as they are not pre-configured to a particular application and can potentially be used by different entities to design their own applications.
  Groot and McLaughlin 2000