Long-term preservation of digital geospatial data: challenges for ensuring access and encouraging reuse Anne Robertson, EDINA & Steve Morris, NCSU Libraries.

Presentation transcript:

Long-term preservation of digital geospatial data: challenges for ensuring access and encouraging reuse
Anne Robertson, EDINA & Steve Morris, NCSU Libraries
EDINA National Data Centre, University of Edinburgh
North Carolina State University Libraries
NCGDAP Architecture Working Group OGC TC/PC Meeting
Bonn, 9th November 2005

Objectives
Why we’re here:
  – Introduce preservation and access use cases to OGC
  – Find points of intersection with OGC initiatives
  – Flesh out research agenda for preservation of geospatial digital data
  – “Permanent access and reuse”, not just preservation

North Carolina Preservation Partners
North Carolina State University Libraries
  – University-wide GIS services since 1992
  – New focus on publishing WMS services for use by external clients or service aggregators
  – Archiving local agency geospatial data since 2000
NC Center for Geographic Information & Analysis
  – State government GIS agency
  – Maintains state’s Corporate Geographic Database
  – Coordinates many SDI initiatives, including NC OneMap
NC OneMap
  – Seamless access to local, state, and federal data; component part of National Map
  – WMS services available individually from sources or through aggregator viewer
  – Focus on standards, best practices, data sharing agreements, inventories, and metadata outreach

NC Geospatial Data Archiving Project
Cooperative project with Library of Congress under the National Digital Information Infrastructure and Preservation Program (NDIIPP)
  – One of 8 NDIIPP partnership projects, others focusing on web pages, numeric data, video, business records, etc.
  – Focus on developing a network of partners, identifying preservation issues in various domain areas
NCGDAP: 3-year project focused on preservation of state and local agency digital geospatial data
  – Identify and acquire data
  – Develop digital repository; ingest and manage content
Objective: engage existing spatial data infrastructures in the process of data preservation

NCGDAP Project Phases
Content Identification and Selection
  – Work from existing inventory processes
  – Select from among “early”, “middle”, and “late” stage information products
Content Acquisition
  – Acquire state and local agency content
  – Investigate methods of automating archive development
Partnership Building
  – Work within NC OneMap framework (infrastructure)
  – Several other emerging geo-preservation projects
Content Retention and Transfer
  – Metadata and ingest workflow
  – Emphasis on repository-agnostic approach, avoid “imprinting” one environment
  – Initially using DSpace open source software, re-ingest into a different environment later

Common Themes – Cartographic Representation
The counterpart to the map is not just the dataset but also models, symbology, interpretation.
These key elements give real meaning. How are these captured for reuse?

Common Themes – GML for archiving?
Interest in alternative to proprietary vector file formats
“Permanent access” requirements:
  – Profiles and application schemas widely understood and supported, avoid requiring “digital archaeology”
  – Role of GML Simple Features Specification?
Assessing formats for preservation: sustainability factors, quality & functionality factors
Planned environmental scan of existing GML profiles and application schemas
  – Collaboration with National Archives and Records Administration and FGDC Historical Data Working Group
  – Vendor support? Official status? Stability over time?
How to handle proprietary formats?
  – UC Santa Barbara/Stanford NDIIPP project working on format registry
  – Spatial databases pose special challenges

Common Themes – Content replication
Need efficient means to replicate content to archive
  – North Carolina: 100 counties and 140 municipalities
Content replication also needed for:
  – Disaster preparedness
  – State and federal data improvement projects
  – Aggregation by regional geospatial web service providers
WFS, e.g.: efficiency in complete content transfer?
Rsync-like function, plus: rights management, inventory processes, metadata management, informed by data update cycles
Archiving delta files vs. complete replication – need to avoid requiring “digital archaeology” in the future
Other models: LOCKSS (Lots of Copies Keeps Stuff Safe)
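To make the "rsync-like function" idea concrete, here is a minimal sketch, not the NCGDAP design. It assumes each side can produce a manifest of file checksums, and it compares the two manifests to decide what a delta transfer would move. The function names and the choice of SHA-256 are illustrative assumptions. Rights management, inventory and metadata handling are deliberately left out.

```python
import hashlib
from pathlib import Path


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def plan_transfer(source: dict[str, str], archive: dict[str, str]) -> dict[str, list[str]]:
    """Compare two manifests and list what a delta transfer would need to handle."""
    return {
        # Present at the source agency but not yet archived.
        "new": sorted(p for p in source if p not in archive),
        # Present on both sides but with different content.
        "changed": sorted(p for p in source if p in archive and source[p] != archive[p]),
        # Archived earlier but no longer published at the source.
        "withdrawn": sorted(p for p in archive if p not in source),
    }
```

A complete-replication policy would simply copy everything in the source manifest. The delta plan saves bandwidth, but the archive must then keep enough information to reassemble a full snapshot later, which is the "digital archaeology" risk noted on the slide.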

Common Themes – Time versioning
How to manage datasets that change over time?
  – Versions will live in different repositories, must handle relationships outside of the individual repository
Industry focus on most current data … but increased demand for temporal data
  – e.g., land use change detection, business trends analysis
  – Much older data lost: “Digital dark age”
Draft NCGDAP approach: manage information for “serial objects” separately, link to serial entity via persistent identifier (Handle)
  – Support “get current data/metadata/DRM” operations
  – Avoid managing volatile information (e.g., service connections) in individual static metadata records
  – Other technologies: OpenURL for service connections?
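The "serial object" approach can be illustrated with a small data structure. This is a sketch under stated assumptions, not the project's schema. The class and field names are hypothetical, and the handle value in the usage example is a placeholder. The point is that the serial entity, keyed by a persistent identifier, holds the volatile information and the links to dated snapshots, which may live in different repositories.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Snapshot:
    """One dated capture of a dataset, stored in some repository."""
    captured: date
    repository: str
    location: str


@dataclass
class SerialObject:
    """Serial entity that ties snapshots together via a persistent identifier."""
    handle: str
    title: str
    # Volatile information lives here, not in each static metadata record.
    service_url: Optional[str] = None
    snapshots: list = field(default_factory=list)

    def add_snapshot(self, snapshot: Snapshot) -> None:
        self.snapshots.append(snapshot)

    def current(self) -> Snapshot:
        """Return the most recent snapshot ("get current data")."""
        if not self.snapshots:
            raise ValueError("no snapshots recorded for " + self.handle)
        return max(self.snapshots, key=lambda s: s.captured)

    def as_of(self, when: date) -> Optional[Snapshot]:
        """Return the latest snapshot captured on or before the given date."""
        candidates = [s for s in self.snapshots if s.captured <= when]
        return max(candidates, key=lambda s: s.captured) if candidates else None
```

For example, a `SerialObject` with the placeholder handle `"0000/example-parcels"` and snapshots from 2004 and 2005 would answer `current()` with the 2005 capture and `as_of(date(2004, 12, 31))` with the 2004 one, regardless of which repository holds each.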

EDINA
A National Data Centre for Tertiary Education since 1995
  – Based at the University of Edinburgh Data Library
Our mission: to enhance the productivity of research, learning and teaching in UK higher and further education
GeoServices team: provide SDI components to UK academic sector
Substantial experience in handling and delivering key geospatial data and geo-referenced information
OGC members since 1999
Strategic move toward interoperability & shared services role – use of OGC interface specifications in our projects and services

GRADE project introduction
According to the OECD Follow up Group on Issues of Access to Publicly Funded Research Data (Interim Report, 20 October 2002): “More widespread and efficient access to and sharing of research data will have substantial benefits for most areas of scientific research.”
Evidence of re-use of data within UK data centres is low:
  – “Level of re-use of data held in the AHDS and ESRC archives has been disappointingly low” (Alison Allden, 2003)
  – “NERC spends about £5 million per annum on data management, but unclear what benefit it derives from this. More research is needed to establish benefits and value of data re-use” (Mark Thorley, 2003)
  – Qualidata survey of qualitative data re-use (2000): 44% of respondents used a colleague's data rather than acquiring archived data via a dissemination service (33%)

GRADE project introduction
Within UK academia there is a focus on the potential use of digital repositories to assist with a variety of facets of digital asset management, including encouraging reuse of research data
GRADE will investigate and report on the technical and cultural issues around the reuse of geospatial data within the context of discipline-based repositories
Particular focus on sharing and reuse of derived geospatial data
EDINA leading GRADE with consortium partners:
  – AHRC Research Centre for Studies in Intellectual Property and Technology Law, School of Law, Edinburgh University
  – National Oceanography Centre, Southampton University
  – Variety of other associate partners including NCGDAP, British Atmospheric Data Centre, Ordnance Survey

Common Themes – Digital Rights
UK environment, a complex one
  – Dominant provider of base vector geospatial data
  – Array of space-borne survey data available, much free for non-commercial use
  – Stakeholder interest from research funders (research councils) and research hosts (institutions)
When we consider the reuse of derived geospatial data, concerns over data ownership, IPR and copyright often suppress any initial enthusiasm
We can offer the geoDRM discussion real scenarios of
  – IPR issues for derived geospatial data and
  – Geospatial data reuse/sharing use cases

Derived Data Example
[Workflow diagram with columns Input, Processing, Output. Labels: OS Landline; Ground survey; Historic OS Maps; 2001 Orthophotos; Scan; Geo-reference; Digitise coastline positions; Accuracy assessment; Planimetric correction; GPS survey; Calculation of cliff retreat. Output: ESRI Shapefile and tables of retreat.]
Source: Use case provision of derived geospatial data as part of the GRADE project in scoping digital repositories (draft report)

Common Themes – Content Packaging
Consider a geospatial data asset deposited into a repository. It is more than one file:
  – GML and associated schema!
  – Proprietary vector format plus cartographic representation detail
  – Geodatabase
  – Raster with header file
  – Data set metadata and IPR info
What is the best method to package data?
In the eLibrary world: the Metadata Encoding and Transmission Standard (METS), IMS content package (IMS CP) and MPEG-21 DIDL for repository objects
“Interoperable repositories need to encode, exchange and describe complex objects in agreed ways”
What direction is the GI industry taking with content packaging?
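As a rough illustration of why a deposited asset needs packaging, the sketch below builds a plain JSON manifest that lists every component file with a role and a checksum. It is not METS, IMS CP or MPEG-21 DIDL. The manifest layout, the role labels and the function name are all assumptions made for this example.

```python
import hashlib
import json


def describe_package(package_id, components):
    """Build a JSON manifest for a multi-file geospatial asset.

    components: iterable of (filename, role, content_bytes) tuples, where
    role is a free-text label such as "data", "schema" or "metadata".
    """
    entries = []
    for filename, role, content in components:
        entries.append({
            "file": filename,
            "role": role,
            "bytes": len(content),
            # Checksum lets a receiving repository verify each component.
            "sha256": hashlib.sha256(content).hexdigest(),
        })
    manifest = {
        "package": package_id,
        "components": sorted(entries, key=lambda e: e["file"]),
    }
    return json.dumps(manifest, indent=2)


if __name__ == "__main__":
    # Placeholder contents stand in for real files.
    print(describe_package("example-001", [
        ("coast.gml", "data", b"<gml/>"),
        ("coast.xsd", "schema", b"<xsd/>"),
        ("metadata.xml", "metadata", b"<metadata/>"),
        ("rights.txt", "ipr", b"terms of use"),
    ]))
```

An agreed packaging standard would replace this ad hoc layout, but the same information (which files belong together, what each one is for, and how to verify it) has to be carried either way.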

Common Themes – Persistent Identifiers
Once a geospatial data asset is deposited within a repository, there is a need to be able to persistently identify this asset
Particular repository software packages use particular schemes, e.g. Fedora uses the ‘info’ URI scheme
Requirement to ensure identifier is actionable
We are thinking about OpenURL Resolvers and perhaps Digital Object Identifier (DOI) for handle schemes
What direction is the GI industry taking with persistent identifiers?
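One way to read "actionable" is that an identifier can be turned into a URL that a resolver will redirect to the asset's current location. The sketch below does this for Handle and DOI identifiers using the public resolvers at hdl.handle.net and doi.org. The `hdl:` and `doi:` prefix notation for input, and the function name, are assumptions for this example. Identifiers with no known resolver, such as an `info:` URI, are rejected rather than guessed at.

```python
from urllib.parse import quote

# Public HTTP resolvers for two identifier schemes.
RESOLVERS = {
    "hdl": "https://hdl.handle.net/",
    "doi": "https://doi.org/",
}


def to_actionable_url(identifier: str) -> str:
    """Turn 'hdl:<prefix>/<suffix>' or 'doi:<prefix>/<suffix>' into a resolver URL."""
    scheme, sep, value = identifier.partition(":")
    scheme = scheme.lower()
    if not sep or not value or scheme not in RESOLVERS:
        raise ValueError("no resolver known for identifier: " + identifier)
    # Percent-encode characters that are unsafe in a URL path; keep '/'.
    return RESOLVERS[scheme] + quote(value, safe="/")


if __name__ == "__main__":
    print(to_actionable_url("doi:10.1000/182"))
    print(to_actionable_url("hdl:0000/example-asset"))  # placeholder handle
```

Keeping the resolver base separate from the stored identifier means the identifier itself stays stable even if the resolver service address changes.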

Common Themes – ‘data plus services’ model
National Library of New Zealand

Conclusions
Aim is to flesh out research agenda
Presented 7 common themes from our work
Shift to web services consumption poses threat to secondary archive development … but can geospatial web services be put to use in preservation processes?
Encourage GI community to connect with these issues, or outcome may be that archive community will fail to take account of OGC work
Where to from here?

Contact details
Anne Robertson, GRADE Project Manager, EDINA National Data Centre
GRADE web site:
Steve Morris, Head of Digital Library Initiatives, North Carolina State University Libraries
NCGDAP web site:
Questions?