Preservation of State and Local Government Digital Geospatial Data: The North Carolina Geospatial Data Archiving Project
Steven P. Morris, James Tuttle, and Robert Farrell
North Carolina State University Libraries
IS&T Archiving 2006
May 24, 2006
Geospatial data types: Vector data
- Time series: Parcel Boundary Changes 2001-2004, North Raleigh, NC
Note: Percentages based on the actual number of respondents to each question
Geospatial data types: Aerial imagery
- 85+ NC counties with orthophotos
- 1-5 flights per county
- 30-200 GB per flight
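The three figures above imply a rough statewide storage range. The sketch below multiplies them out; the use of decimal units (1 TB = 1000 GB) and the function name are assumptions made for illustration, and because the slide says "85+" counties, the county count is a floor rather than an exact figure.

```python
# Back-of-envelope storage range for statewide orthophoto holdings.
# Inputs are the slide's figures; decimal units (1 TB = 1000 GB) are an assumption.
COUNTIES = 85                  # slide says "85+", so treat this as a floor
FLIGHTS_PER_COUNTY = (1, 5)    # (fewest, most) flights per county
GB_PER_FLIGHT = (30, 200)      # (smallest, largest) flight size in GB


def storage_range_gb(counties, flights, gb_per_flight):
    """Return (low, high) total size in GB for the given ranges."""
    low = counties * flights[0] * gb_per_flight[0]
    high = counties * flights[1] * gb_per_flight[1]
    return low, high


low_gb, high_gb = storage_range_gb(COUNTIES, FLIGHTS_PER_COUNTY, GB_PER_FLIGHT)
print(f"{low_gb / 1000:.2f} TB to {high_gb / 1000:.1f} TB")
```

The high end assumes every county has the maximum number of flights at the maximum size, which is unlikely in practice, so the real total probably sits well inside the range.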
Geospatial data types: Tabular data (w/vector)
- Economic, infrastructure, and ethnographic data
Today’s geospatial data as tomorrow’s cultural heritage
- Future uses of data are difficult to anticipate (as with Sanborn Maps).
Geospatial Data: Risks
- Producer focus on current data
- Future support of data formats in question
- Shift to web services- and API-based access
- Inadequate or nonexistent metadata
- Increasing use of spatial databases for data management
- Many digital archiving challenges …
Challenge: Vector Data Formats
- No widely supported, open vector formats for geospatial data
  - Spatial Data Transfer Standard (SDTS) not widely supported
  - Geography Markup Language (GML): diversity of application schemas and profiles threatens permanent access
- Spatial Databases
  - The whole is more than the sum of the parts, and the whole is very difficult to preserve
  - Can export individual data layers for curation
  - Some are considering using the spatial database as the primary archival platform
Challenge: Cartographic Representation
- Counterpart to the map is not just the dataset but also models, symbolization, classification, annotation, etc.
Challenge: Geospatial Web Services
- How to capture records from decision-making processes?
- Possible: Atlas collections from automated image capture
- Web 2.0 impact: Emerging tiling and caching schemes (archive target?)
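One way automated image capture might work is to walk a grid over an area of interest and request a static map image for each cell from a web map service. The sketch below only builds the request URLs, using the standard WMS 1.1.1 GetMap parameters; it does not fetch anything. The endpoint, layer name, grid size, and helper names are all hypothetical, and nothing here describes the project's actual capture tooling.

```python
from urllib.parse import urlencode


def tile_bboxes(minx, miny, maxx, maxy, cols, rows):
    """Split a bounding box into a cols x rows grid of (minx, miny, maxx, maxy) tuples."""
    dx = (maxx - minx) / cols
    dy = (maxy - miny) / rows
    return [
        (minx + c * dx, miny + r * dy, minx + (c + 1) * dx, miny + (r + 1) * dy)
        for r in range(rows)
        for c in range(cols)
    ]


def getmap_url(base_url, layer, bbox, width=1024, height=1024):
    """Build a WMS 1.1.1 GetMap URL for one grid cell (longitude/latitude order)."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",
        "SRS": "EPSG:4326",
        "BBOX": ",".join(f"{v:.6f}" for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and layer; a 2 x 2 grid over an arbitrary extent.
BASE_URL = "https://example.org/wms"
urls = [
    getmap_url(BASE_URL, "parcels", bbox)
    for bbox in tile_bboxes(-79.0, 35.5, -78.0, 36.0, cols=2, rows=2)
]
for url in urls:
    print(url)
```

Each captured image would still need a date stamp and a record of the request parameters to be useful as an archival atlas page.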
NC Geospatial Data Archiving Project
- Partnership between a university library (NCSU) and a state agency (NCCGIA), with the Library of Congress, under the National Digital Information Infrastructure and Preservation Program (NDIIPP)
- One of 8 initial NDIIPP partnerships
- Focus on state and local geospatial content in North Carolina (state demonstration)
- Tied to the NC OneMap initiative, which provides for seamless access to data, metadata, and inventories
- Objective: engage existing state/federal geospatial data infrastructures in preservation
- Serve as a catalyst for discussion within the industry
Different Ways to Approach Preservation
- Technical solutions: How do we archive acquired content over the long term?
- Cultural/Organizational solutions: How do we make the data more preservable, and more likely to be archived, from the point of production?
Technical Approaches
- Receive data as is, through a variety of distribution methods
- Migration of some at-risk formats
- Metadata remediation, standardization, and synchronization
- Distilling complex objects into repository ingest items (not easy)
- Using DSpace for demonstration purposes
- In development: use a METS record as a dormant item “brain” within the repository
- Some unsustainable activities, undertaken as a learning experience
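To make the idea of a METS record describing a complex object more concrete, here is a minimal sketch that wraps the component files of a single shapefile in a bare-bones METS document. It is not the project's METS profile: the choice of elements (only `fileSec` and `structMap`), the `USE` and `TYPE` values, the placeholder MIME type, the file contents, and the `build_mets` helper are all assumptions for illustration.

```python
import hashlib
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def q(ns, tag):
    """Return a namespace-qualified tag or attribute name for ElementTree."""
    return f"{{{ns}}}{tag}"


def build_mets(object_id, files):
    """Build a minimal METS tree.

    files: list of (file_id, href, mimetype, content_bytes) tuples.
    """
    mets = ET.Element(q(METS_NS, "mets"), {"OBJID": object_id})
    file_sec = ET.SubElement(mets, q(METS_NS, "fileSec"))
    group = ET.SubElement(file_sec, q(METS_NS, "fileGrp"), {"USE": "original"})
    struct_map = ET.SubElement(mets, q(METS_NS, "structMap"), {"TYPE": "physical"})
    div = ET.SubElement(struct_map, q(METS_NS, "div"), {"LABEL": object_id})
    for file_id, href, mimetype, content in files:
        # Record a fixity value alongside each component file.
        file_el = ET.SubElement(group, q(METS_NS, "file"), {
            "ID": file_id,
            "MIMETYPE": mimetype,
            "CHECKSUM": hashlib.sha1(content).hexdigest(),
            "CHECKSUMTYPE": "SHA-1",
        })
        ET.SubElement(file_el, q(METS_NS, "FLocat"), {
            "LOCTYPE": "URL",
            q(XLINK_NS, "href"): href,
        })
        # Tie the file into the structural map so the object stays whole.
        ET.SubElement(div, q(METS_NS, "fptr"), {"FILEID": file_id})
    return mets


# Hypothetical shapefile components with stand-in byte content.
components = [
    ("f1", "parcels.shp", "application/octet-stream", b"geometry bytes"),
    ("f2", "parcels.shx", "application/octet-stream", b"index bytes"),
    ("f3", "parcels.dbf", "application/octet-stream", b"attribute bytes"),
    ("f4", "parcels.prj", "text/plain", b"projection text"),
]
record = build_mets("parcels_2004", components)
print(ET.tostring(record, encoding="unicode"))
```

A real record would also carry descriptive and administrative metadata sections; they are left out here to keep the structural idea visible.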
Cultural/Organizational Approaches
- Feedback to metadata outreach program
- Feedback to coordinating bodies on adherence to content standards
- Engage existing spatial data infrastructure in archiving and preservation
- Engage software vendors and standards community
- Cross-fertilize with other national archiving efforts
- Current use and data sharing requirements, not archiving needs, drive improved preservability of content and improvement of metadata
Cultivating a commercial market for older data.
Project Status
Questions?
http://www.lib.ncsu.edu/ncgdap
Contact: Steve Morris, Head, Digital Library Initiatives, NCSU Libraries
ph: (919) 515-1361
Steven_Morris@ncsu.edu