Published by Hilary O’Brien. Modified over 9 years ago.
1
Catherine Masi, National Geospatial Digital Archive, May 16, 2005
NGDA Format Registry
Why do we need a format registry (FR)?
- We are designing with long-term storage in mind (> 100 years)
- We cannot depend on the format spec being available via URL, or even on a format registry that may no longer be up to date or in existence
- Thus the semantic definition of a format must be archived with the object itself
- This semantic definition must be comprehensive, so that the format can be accessed even if current access mechanisms no longer exist
2
NGDA Format Registry
Two major tasks
- Analyze and define spatial data formats (Meredith Williams)
- Develop local format registry with programmatic interface to existing authoritative/collaborative FR (Catherine Masi)
3
Analyze and define spatial data formats
- Is there a comprehensive list of geospatial formats? Are they defined? How?
- List of Spatial Data Formats (MW)
  - Digital Map Formats
  - Vector File Formats
  - Raster File Formats
  - Other categories: TIN, ASCII, 3D, Tabular Databases
  - Unacceptable Formats
4
Analyze and define spatial data formats
- What formats do we have in ADL? How do we define them?
- ADL format documentation
  - ADL website: http://www.alexandria.ucsb.edu/adl/Collection%20Development/BucketDescrip.htm
  - MIME types: http://www.iana.org/assignments/media-types/
  - ADL literature/presentations: Format type
    - hierarchical vocabulary: ADL Object Format Thesaurus, loosely based on MIME
    - multiple values: union
    - compare: DC.Format
  - ADL Webclient list: http://webclient.alexandria.ucsb.edu/mw/index.jsp
5
Analyze and define spatial data formats
- What are our preferred formats for NGDA, if any?
- MW tested three geospatial formats using a Sustainability Test derived from LCDF
- GJ: "we can ingest anything if we have the definition representation information"
- Decided to limit allowed formats to a few the first year: CASIL test suite (geotiff, shapefile)
- What if there is free proprietary software, such as from ESRI, that allows one to look at the files? Should we request and archive that as well?
  - No (UCSB)
6
Analyze and define spatial data formats
- How will we define our formats?
  - Using Meredith's list of Spatial Data Formats
  - Begin defining using LoC Digital Formats as an example
- How do we know that we have sufficient semantic information to define each geospatial format?
  - What information is required to make the format usable? Ask the users.
  - What information is required to programmatically access the format if current access mechanisms become obsolete?
- Prioritize and start with the most important/ubiquitous formats for our archive
- Coordinate with format definitions in JHOVE
7
Develop local format registry with programmatic interface to existing authoritative/collaborative FR
- What format registries are out there?
  - Library of Congress Digital Formats (LCDF)
  - Global Digital Format Registry (GDFR) - Harvard
    - Global Digital Format Registry Description
  - Ockerbloom's Format Registry Demonstrator (FRED)
  - PRONOM - File format registry - UK archives
    - Practical, in use, not geospatial
8
Develop local format registry with programmatic interface to existing authoritative/collaborative FR
- Coordinate our efforts with the LCDF, GDFR, FRED, TOM
- NH initiated contact (Stephen Abrams, John Ockerbloom, Steve Morris, etc.) at DLF
- Questions for DLF meeting to get discussion started
- The questions we formulated showed that we have to solve many of these problems on our own, especially the technical aspects of building a FR and the interaction mechanisms between LC, GDFR and our local FR
9
Develop local format registry with programmatic interface to existing authoritative/collaborative FR
- Do the existing format registries contain geospatial formats?
  - No. In the future we will contribute geospatial formats to an existing registry effort such as LCDF or GDFR
10
Develop local format registry with programmatic interface to existing authoritative/collaborative FR
- Do the existing format registries support access and contribution mechanisms?
  - No.
11
Develop local format registry with programmatic interface to existing authoritative/collaborative FR
- How are Library of Congress Digital Formats stored internally? Database? XML? Directory structure?
  - In MS Word files
12
Develop local format registry with programmatic interface to existing authoritative/collaborative FR
- Is there a data dictionary or other mechanism for defining fields in LCDF?
  - FDD
13
Develop local format registry with programmatic interface to existing authoritative/collaborative FR
- CM contacted Steve Morris (NCSU - NDIIPP), Stephen Abrams (Harvard - GDFR) and John Mark Ockerbloom (Penn - FRED) to open up a discussion on the technical aspects of developing a geospatial format registry.
- S. Abrams responded that GDFR is still only an idea rather than a reality, and that a technical discussion of how our GIS formats should be managed in a GDFR-conformant way is a bit premature
14
Develop local format registry with programmatic interface to existing authoritative/collaborative FR
- What are the requirements for the NGDA Format Registry?
  - independent
  - contains sufficient semantic information to programmatically access format (UCSB)
  - contains geospatial reference information
  - definitions exist in simple documented format in simple directory structure
  - access/search mechanism not necessary for access
  - interfaces with collaborative authoritative FR for updates and contributions
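The requirements above (simple documented format, simple directory structure, no search mechanism needed) might be sketched as follows. This is only an illustration under stated assumptions: the field names, the element names and the `record.xml` file name are hypothetical, since the slides do not give the actual NGDA record layout.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from pathlib import Path


@dataclass
class FormatRecord:
    # All field names are hypothetical; the slides do not specify the layout.
    name: str               # e.g. "geotiff"
    spec_path: str          # relative path to the locally archived spec
    spatial_reference: str  # how the format carries georeferencing


def record_to_xml(record: FormatRecord) -> str:
    """Serialize a record as a small plain XML document."""
    root = ET.Element("format")
    ET.SubElement(root, "name").text = record.name
    ET.SubElement(root, "spec").text = record.spec_path
    ET.SubElement(root, "spatialReference").text = record.spatial_reference
    return ET.tostring(root, encoding="unicode")


def record_from_xml(text: str) -> FormatRecord:
    """Rebuild a record from the XML produced by record_to_xml."""
    root = ET.fromstring(text)
    return FormatRecord(
        name=root.findtext("name", default=""),
        spec_path=root.findtext("spec", default=""),
        spatial_reference=root.findtext("spatialReference", default=""),
    )


def save_record(registry_root: Path, record: FormatRecord) -> Path:
    """Write the record to <registry_root>/<format name>/record.xml.

    The path alone locates a definition, so no index or search
    mechanism is required to read it back.
    """
    format_dir = registry_root / record.name
    format_dir.mkdir(parents=True, exist_ok=True)
    path = format_dir / "record.xml"
    path.write_text(record_to_xml(record), encoding="utf-8")
    return path
```

A record saved this way can be read with any XML parser, or by eye, without the registry software itself.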
15
First steps:
- CM began prototyping the physical structure of the format registry using 2 CASIL formats, geotiff and shapefile.
- Created directory-based registry.
- Incorporated info from MW's documents Spatial Data Formats and Sustainability Test
- Created record layout loosely based on Library of Congress Digital Formats but including spatial reference information.
- Included format spec as local website (in the case of geotiff) and as local pdf file (in the case of shapefile).
- All links on record referred to local copies of format information.
- All documentation about the format is located locally in that format's directory
- Entries are not complete. This is just a first pass at what the html-rendered format entries will look like. Focus here is on physical structure rather than content.
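The rule that all links on a record refer to local copies can be checked mechanically. The sketch below is an assumption about how such a check might look, not part of the NGDA prototype; the function name `non_local_links` is hypothetical. It treats any link with a URL scheme or a network location as non-local and relative paths as local.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urlparse


class _LinkCollector(HTMLParser):
    """Collect the values of href and src attributes from an HTML page."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def non_local_links(html_text: str) -> List[str]:
    """Return links that point outside the format's own directory.

    A link counts as non-local if it has a scheme (http:, ftp:, ...)
    or a network location; relative paths are treated as local.
    """
    collector = _LinkCollector()
    collector.feed(html_text)
    flagged = []
    for link in collector.links:
        parts = urlparse(link)
        if parts.scheme or parts.netloc:
            flagged.append(link)
    return flagged
```

Running this over each html-rendered entry would list any reference that still depends on an outside site, which is exactly the dependency the registry is meant to avoid.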
16
First steps:
- Refining content using input from DV, MW and from actual data users as to what is needed to adequately define a format.
- Determine sufficient semantic info to define geospatial formats
- Review CASIL formats. Began to flesh out sufficient semantic info. Started with geotiff, shapefile.
- Review record layout and add, change and delete fields.
17
Next steps
- Make sure format spec is complete and all information is located locally where possible.
- Determine where we draw the line between format registry information, policy and higher-level descriptive metadata. The format registry will stick to the format spec and a few other important fields only.
- Develop XML stylesheet of record layout. Decided that HTML, XML and PDF are acceptable archivable formats for format registry information.
- Flatten the directory structure (hierarchy) because tfw, for example, is not a subtype of geotiff but can be attached to a tiff or another format.
- Work more on finding a sensible organization for the files in our FR
- Link to other parts of the Archive (Descriptive Metadata) from within the FR
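One way to picture the flattening step: every format becomes a top-level entry, and the link between tfw and tiff is recorded as a relation rather than as a parent/child directory. This is a sketch only; the `accompanies` key and the function name are hypothetical, and the entries shown are limited to the formats named on the slide.

```python
from typing import Dict, List

# Flat registry: no format is nested under another one.
# "accompanies" (a hypothetical field) lists the formats a file of this
# type can be attached to.
REGISTRY: Dict[str, Dict[str, List[str]]] = {
    "tiff": {"accompanies": []},
    "geotiff": {"accompanies": []},
    "tfw": {"accompanies": ["tiff"]},  # attached to a tiff, not a subtype of geotiff
}


def companions_of(fmt: str, registry: Dict[str, Dict[str, List[str]]]) -> List[str]:
    """Return the formats that can be attached to files of type fmt."""
    return sorted(
        name for name, entry in registry.items() if fmt in entry["accompanies"]
    )
```

Adding another format that tfw can accompany then means appending one name to a list, with no directories to move.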
18
Later
- Develop method of search, retrieval, update
- Begin to develop programmatic interface to LoC Digital Formats or other authoritative/collaborative format registry