Putting time into the GeoWeb: Data persistence in a web services environment Steve Morris NCSU Libraries July 23, 2008.


1 Putting time into the GeoWeb: Data persistence in a web services environment Steve Morris NCSU Libraries July 23, 2008

2 Overview Background to the digital preservation problem Problems –Temporal data access issues –Capturing data state in a services or API context –Making the business case for older data Preservation approaches Future directions

3 Project background: North Carolina Geospatial Data Archiving Project Partnership between university library (NCSU) and state agency (NCCGIA) Under cooperative agreement with Library of Congress in NDIIPP national preservation program Focus on state and local geospatial content in North Carolina (state demonstration) Tied to NC OneMap initiative, which provides for seamless access to data, metadata, and inventories Goal: Engage spatial data infrastructure (SDI) in data preservation and archiving Demonstration repository as catalyst for an industry conversation

4 SDI role in data preservation Data inventories support content identification Metadata standards support discoverability and use Content standards support data interoperability over time and help eliminate semantic confusion Data exchange networks: –Minimize need to make contact –Add technical, administrative, descriptive metadata –Establish rights and provenance

5 Project roots: NCSU Libraries data directories Tracking data, map servers, and web services since 2000 Ranked 3rd in traffic among entry points to entire library website Persistent identifiers –usage tracking –ID links used in other sites Community help in site maintenance

6 County map and data services in NC 100 Counties in North Carolina

7 Carrboro, NC: Population 17,797 (2005 est.) 24 downloadable GIS data layers 4 WMS data layers 6 web mapping applications 9 downloadable PDF map layers

8 Downtown Raleigh Near State Capitol 1914 Sanborn Map

9 Downtown Raleigh Near State Capitol 1993 DOQQ

10 Downtown Raleigh Near State Capitol 1999 Wake County Ortho

11 Downtown Raleigh Near State Capitol 2005 Wake County Ortho

12 Downtown Raleigh Near State Capitol 2005 Wake County Ortho Imagery = Durable –Static –Simple structure –Mostly open formats Vector data = Volatile –Frequent update –Complex structure –Mostly commercial formats

13 Data preservation points of failure Data is not saved, or … can’t be found, or … media is obsolete, or … media is corrupt, or … format is obsolete, or … file is corrupt, or … meaning is lost Solutions: Migration, Emulation, Encapsulation, XML

14 Problem: Data state in a web services or API-driven environment How to capture records from decision-making processes? How to capture data state as well as service state?
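One way to picture "data state as well as service state" is a record that pairs the request sent to a service with a fingerprint of the bytes it returned at that moment. The following is a minimal sketch only, not the project's method: the function name `capture_state`, the record fields, and the example URL are all hypothetical, and the response body is passed in as bytes rather than fetched from a live service.

```python
import hashlib
from datetime import datetime, timezone
from typing import Optional


def capture_state(request_url: str, response_body: bytes,
                  captured_at: Optional[datetime] = None) -> dict:
    """Pair a service request with a fingerprint of the data it returned."""
    if captured_at is None:
        captured_at = datetime.now(timezone.utc)
    return {
        "request_url": request_url,                # service state: what was asked
        "captured_at": captured_at.isoformat(),    # when it was asked
        "sha256": hashlib.sha256(response_body).hexdigest(),  # data state
        "size_bytes": len(response_body),
    }


# Hypothetical request and stand-in response bytes.
record = capture_state(
    "http://example.org/wms?REQUEST=GetMap&LAYERS=parcels",
    b"<fake map bytes>",
    datetime(2008, 7, 23, tzinfo=timezone.utc),
)
print(record["captured_at"], record["sha256"][:12])
```

A record like this says what was returned, not why a decision was made from it, so it addresses only part of the question the slide raises.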

15 Problem: Temporal data unavailability Industry focus on “latest and greatest” data “Kill and fill” as a common approach to data management (past versions of vector data lost) Not just data loss, also: Loss of memory about data Of superseded county orthophoto flights in NC, only 22% are recorded in the state’s GIS inventory Some older inventories only available through Internet Archive

16 Availability of older orthoimagery on county map servers in NC Only 30% of superseded digital ortho flights accessible through county map servers

17 Availability of older orthoimagery on county map servers in NC 23 Counties in NC publish ortho WMS services 0 Counties in NC publish superseded orthos as WMS services

18 Problem: Making the business case for archiving Use case: Land use and impervious surface change analysis (imagery years shown: 1993, 1998, 1999, 2002, 2005)

19 Building the preservation business case Land use change analysis Site location analysis Real estate trends analysis Disaster response Resolution of legal challenges Impervious surface change mapping

20 Planned 2008 NC business case survey Case description Resources/Scope of effort Benefits and results Fiscal assessment Based on previous experience, pending projects, examples of when a project could have been served better if archival data were available

21 Geospatial data preservation challenges Producer focus on current data Future support of data formats in question Inadequate or nonexistent metadata Spatial databases Complex data objects (multi-file, multi-format) Shift to web services-based access (ephemeral data) Difficult to capture data state at point of decision-making

22 Preservation approaches: Temporal data snapshots Issue: How frequently should county and municipal vector data layers be captured in archives? Parcels, centerlines, jurisdictions, zoning, … Parcel Boundary Changes 2001-2004, North Raleigh, NC
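The capture-frequency question can be made concrete with a small sketch. The policy below is an illustrative assumption, not a recommendation from the project: instead of capturing on a fixed schedule, it records a new snapshot only when a layer's content has changed since the last capture. The names `snapshot_if_changed` and `archive`, and the layer contents, are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple

# Maps a layer name to its capture history: (capture date, SHA-256 digest).
Archive = Dict[str, List[Tuple[str, str]]]


def snapshot_if_changed(archive: Archive, layer: str,
                        data: bytes, date: str) -> bool:
    """Record a snapshot only if the layer differs from the last capture."""
    digest = hashlib.sha256(data).hexdigest()
    history = archive.setdefault(layer, [])
    if history and history[-1][1] == digest:
        return False  # unchanged since the previous capture, nothing to keep
    history.append((date, digest))
    return True


archive: Archive = {}
snapshot_if_changed(archive, "parcels", b"parcel boundaries v1", "2001-01-01")
snapshot_if_changed(archive, "parcels", b"parcel boundaries v1", "2002-01-01")
snapshot_if_changed(archive, "parcels", b"parcel boundaries v2", "2003-01-01")
print([date for date, _ in archive["parcels"]])
```

A change-driven policy still needs a polling interval, so it narrows the frequency question rather than answering it. For a layer such as parcels that changes almost daily, it would capture nearly every poll.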

23 NC frequency of data capture surveys How often should continually changing vector datasets be captured? Tap into data custodian understanding of production patterns and uses Tap into local innovation Learn about local business drivers for data archiving –2006 and 2008 surveys of NC cities and counties –2008 survey of archival practice in state agencies in NC –Planned survey of data users in NC http://www.nconemap.com/AboutNCOneMap/tabid/289/Default.aspx#preservation

24 Preservation approaches: Desiccated data Complex data representations can be made more preservable (and less useful) through simplification

25 Preservation approaches: Desiccated data Complex documents may be very hard to preserve over time –GIS project files –Layer definitions –Web services or API interactions Image outputs capture some sense of final product, but lose underlying data intelligence

26 Desiccated data: PDF and GeoPDF Cartographic outputs – analogous to the old paper maps Combined datasets, with data models, classification, symbolization, annotation More data intelligence than in images

27 Desiccated data: Geospatial PDF Explosion of geospatial PDF content in past few years Standards issues –GeoPDF: proprietary TerraGo technology –PDF: an open ISO standard –Open PDF variants created through ISO standards process (PDF/E, PDF/X, PDF/A, …) PDF content retained in addition to, NOT instead of, data

28 Preservation approaches: Historical WMS tile caches? No market for archived tiles without standard way to describe tiles and without commonly used tiling schemes

29 Preservation approaches: Historical WMS tile caches? Tile cache systems developed for more responsive WMS or mapping systems –WMS Tile Caching (WMS-C) incubated by OSGeo –WMTS (Web Map Tiling) OGC white paper No explicit temporal component in WMS-C or WMTS To what extent do temporal geospatial systems become video-like?
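For contrast with the tile-caching schemes, plain WMS does define an optional TIME parameter on GetMap for layers that advertise a time dimension. The sketch below only builds such a request URL and sends nothing. The endpoint, layer name, and bounding box are hypothetical, and it assumes a WMS 1.3.0 server that actually publishes dated orthoimagery, which the slides report no NC county did at the time.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def getmap_url(endpoint, layer, bbox, time, width=512, height=512):
    """Build a WMS 1.3.0 GetMap URL that asks for a specific point in time."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",
        "CRS": "CRS:84",  # longitude/latitude axis order
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
        "TIME": time,     # optional time dimension value
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and layer; bbox roughly covers downtown Raleigh.
url = getmap_url("http://example.org/wms", "wake_ortho",
                 (-78.65, 35.77, -78.63, 35.79), "2005")
query = parse_qs(urlsplit(url).query)
print(query["TIME"], query["BBOX"])
```

A tile cache keyed only by layer, zoom, row, and column has nowhere to carry this value, which is the gap the slide points to.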

30 Use Sanborn map slide or replacement Pronounced local agency interest in archiving, digitizing, and geo-referencing older analog products Old maps coming into the GeoWeb …

31 New archiving interest: Location-based content Present-day value in location-based services and mobile applications Street Views Oblique Imagery 3D Images

32 New archiving interest: Location-based content Future value of non-spatial place-based imagery as cultural heritage resource More descriptive of place and function than spatial imagery

33 Moving forward GICC Archival and Long-Term Access Committee Geo Multistate Archive and Preservation Partnership (GeoMAPP) OGC Data Preservation Working Group

34 Community response to data archiving challenge Nov. 2007: NC Geographic Information Coordinating Council (GICC): Ten Recommendations in Support of Geospatial Data Sharing released –Recommendation: “Establish archive and long term data access strategies” –Suggested best practices include: “Establish a policy and procedure for the provision of access to historic data, especially for framework data layers.”

35 GICC Archival and Long-Term Access Committee Initiated in response to agency requests for guidance on temporal data management Federal, state, regional, and local agency representation Key focus –Best practices for data snapshots and retention –State Archives processes: appraisal, selection, retention schedules, etc. –Who, What, Why, When, Where, How

36 Geo Multistate Archive and Preservation Partnership (GeoMAPP) Lead organizations: North Carolina Center for Geographic Information & Analysis (NCCGIA), State Archives of NC, with Library of Congress Partners: –State geospatial organizations of Kentucky and Utah –State Archives of Kentucky and Utah –NCSU Libraries in catalytic/advisory role State-to-state and geo-to-Archives collaboration 2 year project: Nov. 2007-Dec. 2009 Archives as part of Spatial Data Infrastructure

37 OGC Data Preservation Working Group Formed Dec. 2006 Engage archival community Find points of intersection with other OGC activities: –GML for archiving –Content packaging –Large scale data transfers –Time in decision support

38 The Content Packaging Problem Files: –Multi-file dataset –Georeferencing –Metadata file –Symbols file –Additional documentation –License –Disclaimer More metadata: –ISO/FGDC –Acquisition metadata –Transfer metadata –Ingest metadata –Archive rights –Archive processes –Collection metadata –Series metadata Metadata Exchange Format (MEF) in GeoNetwork is a form of content packaging
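The packaging problem can be illustrated with a small sketch that bundles a multi-file dataset and a checksum manifest into one archive. This is not the MEF format and not a method proposed in the talk. The layout (`data/` plus `manifest.json`), the function names, and the file names are assumptions made for illustration.

```python
import hashlib
import io
import json
import zipfile
from typing import Dict


def build_package(files: Dict[str, bytes]) -> bytes:
    """Zip the dataset files together with a SHA-256 manifest."""
    manifest = {name: hashlib.sha256(data).hexdigest()
                for name, data in files.items()}
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in sorted(files.items()):
            zf.writestr("data/" + name, data)
        zf.writestr("manifest.json", json.dumps(manifest, sort_keys=True))
    return buf.getvalue()


def verify_package(package: bytes) -> bool:
    """Check every packaged file against the digest in the manifest."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        manifest = json.loads(zf.read("manifest.json"))
        return all(
            hashlib.sha256(zf.read("data/" + name)).hexdigest() == digest
            for name, digest in manifest.items()
        )


# Hypothetical dataset: geometry, metadata, and license in one package.
package = build_package({
    "parcels.shp": b"stand-in geometry bytes",
    "parcels.xml": b"<metadata/>",
    "LICENSE.txt": b"stand-in license text",
})
print(verify_package(package))
```

A manifest of this kind covers only fixity. The slide's longer list (acquisition, transfer, ingest, rights, collection, and series metadata) would each need their own place in a real package.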

39 Questions? Contact: Steve Morris Head, Digital Library Initiatives NCSU Libraries Steven_Morris@ncsu.edu NCGDAP site: http://www.lib.ncsu.edu/ncgdap/
