The National Archives Washington DC July 10, 2008


GDFR Pilot Discussion The National Archives Washington DC July 10, 2008

Agenda
Introductions – (All)
Purpose of meeting – (Dale)
Roles – (Dale, Richard)
Background/history – (Stephen)
GDFR Governance Workshop – (Richard, Robert)
Architecture – (Stephen)
Current state – (Andrea)
Relationship to PRONOM – (Andrea)
Issues and observations – (Dale)
Use cases – (Andrea)
Discussion of pilot – (All)
Review next steps from GDFR Governance Workshop Report – (Richard, Robert)
Outreach to other interested parties – (All)
Next steps – (All)

Introductions All

Purpose of the meeting Dale Flecker

Roles
Harvard – Dale Flecker
NARA – Richard Steinbacher

Background/History Stephen Abrams

Background/History
Format is the key piece of representation information that permits preservation activities to be focused on interpretable/renderable content, not just opaque bit strings
ffd8ffe000104a46494600010201
008300830000ffed0fb050686f74
6f73686f7020332e30003842494d
03e90a5072696e7420496e666f00
0000007800000000004800480000
000002f40240ffeeffee03060252
0347052803fc0002000000480048
0000000002d80228000100000064
000000010003030300000001270f
0001000100000000000000000000
000060080019019000000000 ...
SOI APP0 JFIF 1.2 APP13 IPTC APP2 ICC DQT SOF0 183x512 DRI DHT SOS ECS0 ...
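A minimal sketch of how the hex dump on this slide can be read as a sequence of JPEG markers once the format is known. The function name `list_markers` and the short marker table are illustrative, not part of GDFR; the sample bytes are the first three rows of the dump, so the walk stops at APP13, whose declared length runs past the excerpt.

```python
# Marker codes from the JPEG/JFIF specifications (only those named on the slide).
MARKER_NAMES = {
    0xD8: "SOI", 0xE0: "APP0", 0xE2: "APP2", 0xED: "APP13",
    0xDB: "DQT", 0xC0: "SOF0", 0xDD: "DRI", 0xC4: "DHT", 0xDA: "SOS",
}

# First three rows of the hex dump shown on the slide.
SAMPLE = bytes.fromhex(
    "ffd8ffe000104a46494600010201"
    "008300830000ffed0fb050686f74"
    "6f73686f7020332e30003842494d"
)


def list_markers(data: bytes) -> list[str]:
    """Walk the segment headers of a JPEG stream and name each marker."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream: missing SOI marker")
    found = ["SOI"]
    pos = 2
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        code = data[pos + 1]
        # The 2-byte big-endian length counts itself but not the marker.
        length = int.from_bytes(data[pos + 2:pos + 4], "big")
        payload = data[pos + 4:pos + 2 + length]
        name = MARKER_NAMES.get(code, f"0x{code:02X}")
        if code == 0xE0 and payload[:5] == b"JFIF\x00" and len(payload) >= 7:
            name += f" JFIF {payload[5]}.{payload[6]}"
        found.append(name)
        if code == 0xDA:  # entropy-coded data follows; stop the header walk
            break
        pos += 2 + length
    return found


print(list_markers(SAMPLE))
```

Without the format knowledge encoded in the marker table and the length rule, the same bytes are only an opaque string, which is the point the slide makes.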

Background/History
Traditional methods of managing format information, e.g. the IANA MIME registry, are insufficiently descriptive and granular for effective preservation planning and intervention
The application/msword format is essentially defined as anything produced by the Word application
TIFF 6.0, TIFF/IT, TIFF/EP, GeoTIFF, … → image/tiff
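The granularity problem can be illustrated with a small sketch that groups the TIFF variants named on the slide by MIME type. The table `VARIANT_TO_MIME` and the helper `variants_by_mime` are illustrative names, not part of any registry API.

```python
from collections import defaultdict

# Variants named on the slide, each paired with the MIME type it is served under.
VARIANT_TO_MIME = {
    "TIFF 6.0": "image/tiff",
    "TIFF/IT": "image/tiff",
    "TIFF/EP": "image/tiff",
    "GeoTIFF": "image/tiff",
}


def variants_by_mime(mapping: dict[str, str]) -> dict[str, list[str]]:
    """Group format variants by the MIME type they collapse into."""
    grouped = defaultdict(list)
    for variant, mime in mapping.items():
        grouped[mime].append(variant)
    return dict(grouped)


for mime, variants in variants_by_mime(VARIANT_TO_MIME).items():
    print(f"{mime}: {len(variants)} distinct variants ({', '.join(variants)})")
```

All four variants land under a single key, so a system that records only the MIME type cannot tell them apart when planning a preservation action.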

Background/History
Two DLF-sponsored invitational workshops
  Univ. Pennsylvania, January 2003
  Washington, March 2003
Two independent demonstration projects
  FRED, John Ockerbloom, Univ. Pennsylvania
  FOCUS, Joseph JaJa, Univ. Maryland

Background/History
Evolving consensus on scope
  A forum for documenting normative definitions of format syntax and semantics
  A common facility to pool and share scarce technical expertise on a global basis
  A channel for the distribution of that expertise to the international community of preservation practitioners
  A foundation for additional value-added services requiring detailed knowledge of digital formats

Background/History Peer-to-peer network of independent, but cooperating registries

Background/History
Harvard University Library (HUL) funded for 2 years by the Andrew W. Mellon Foundation
Technical deliverables only; no funded governance/policy activity
Staffing and technical work subcontracted to OCLC (July 2006)

NARA Governance Workshop Richard Steinbacher Robert Chadduck

Architecture Stephen Abrams

Architecture
A generic distributed registry framework, specialized for the GDFR application
Based on well-known products and protocols
Human and machine interfaces
Full information content expressible in XML form; can be re-instantiated from that expression
Platform independence
Globally fault tolerant
Open source

Architecture Data model is an extension of PRONOM 4

Architecture Based on the OCLC IWSA/RFA framework

Architecture
Java, Apache/Tomcat, Berkeley DB XML
GNU LGPL license
Includes technology newly developed for the project and pre-existing OCLC technology

Current state Andrea Goethals

Current state: schedule
July 31, 2008
  Contract with OCLC ends
  GDFR source node at Harvard goes public in beta mode
August 2008 up to August 2010
  Harvard maintains GDFR software, website and source node

Current state: GDFR Home website
It moved!
  Old GDFR Home: http://www.formatregistry.org
  New GDFR Home: http://www.gdfr.info
  All existing GDFR docs migrated from the old GDFR Home website
Over the next month
  Updated documentation!
  Demo source node?

Current state: architecture
Currently:
  One GDFR source node
    Where all data additions and edits are performed
  Many GDFR mirror nodes
    Replicated data
Future?
  Multiple GDFR source nodes?
  Multiple interoperable format registry source nodes?
  “Discoverable” from GDFR Home website
Each node has 2 interfaces
  For humans: user interface
  For machines: web service interface

Current state: GDFR source node
Housed by Harvard for now
  http://www.formatregistry.org/registry
Populated with test data: ~2000 formats from the Magic database
Need authorized account to add/edit data

Current state: GDFR mirror nodes
Test mirror nodes at OCLC and Harvard
Anyone can run a mirror node
  Synchronize data with the source node
  Can brand your mirror node
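A minimal sketch of the one-way replication pattern the source and mirror slides describe: all edits happen on a single source node, and mirrors pull whatever changed since their last synchronization. The classes, the revision counter, and the record identifiers are assumptions made for illustration; the slides do not describe the actual GDFR synchronization protocol.

```python
from dataclasses import dataclass, field


@dataclass
class SourceNode:
    """Hypothetical source node: the only place records are added or edited."""
    revision: int = 0
    records: dict = field(default_factory=dict)  # record id -> (revision, data)

    def put(self, record_id: str, data: dict) -> None:
        self.revision += 1
        self.records[record_id] = (self.revision, data)

    def changes_since(self, revision: int) -> dict:
        return {rid: entry for rid, entry in self.records.items()
                if entry[0] > revision}


@dataclass
class MirrorNode:
    """Hypothetical mirror node: read-only copy kept current by pulling changes."""
    last_synced: int = 0
    records: dict = field(default_factory=dict)  # record id -> data

    def synchronize(self, source: SourceNode) -> int:
        changes = source.changes_since(self.last_synced)
        for record_id, (_, data) in changes.items():
            self.records[record_id] = data
        self.last_synced = source.revision
        return len(changes)  # number of records refreshed


source, mirror = SourceNode(), MirrorNode()
source.put("rec-1", {"name": "TIFF 6.0"})
source.put("rec-2", {"name": "GeoTIFF"})
print(mirror.synchronize(source))  # pulls both records
source.put("rec-1", {"name": "TIFF 6.0", "note": "edited"})
print(mirror.synchronize(source))  # pulls only the edited record
```

Pulling by revision keeps mirrors simple and lets any number of them run without coordinating with each other, at the cost of routing every edit through the single source node.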

Current state: Mirror node set-up
Dependencies
  Apache 2 (mod_rewrite, mod_jk, mod_perl2)
  Tomcat 5.5.x
  Berkeley DBXML 2.3.10
  Perl 5.8.x
  Java 1.5
Installation & configuration – half day

User interface
Mirror node
  Search, browse, lookup/retrieve, export, manage node
Source node
  Same as mirror node
  Plus: add, edit
Sneak preview

Current state: machine interface
Web services using SRU
Can do everything supported by the human user interface
  Except browsing
  Plus mirror-to-source node synchronization
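As a rough illustration of what an SRU request looks like, the sketch below builds a `searchRetrieve` URL using the standard SRU 1.1 parameter names. The endpoint `http://registry.example.org/sru` is a placeholder, not a real GDFR address, and the slides do not say which CQL indexes the registry supports, so the query is a bare search term.

```python
from urllib.parse import urlencode


def sru_search_url(base_url: str, cql_query: str,
                   start_record: int = 1, maximum_records: int = 10) -> str:
    """Build an SRU 1.1 searchRetrieve request URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "startRecord": start_record,
        "maximumRecords": maximum_records,
    }
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint (assumption); a bare term leaves index choice to the server.
print(sru_search_url("http://registry.example.org/sru", '"TIFF 6.0"'))
```

Because the request is an ordinary HTTP GET, any client that can fetch a URL and parse the XML response can search a node without using the human user interface.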

Relationship to PRONOM Andrea Goethals

Relationship to PRONOM – what’s the problem?
Two different “format” registries
  Overlapping but diverging data model
  No common format model
  No mechanism to exchange data
PRONOM is in production, GDFR is not yet
  PRONOM has been publicly available for over 4 years and is used by some preservation repositories
  Interoperates with DROID
  Basis for the PLANETS project
How many format registries does the digital preservation community need?
  Depends on how different they are…

Relationship to PRONOM – core differences
Who governs the registry and makes policy, scope and enhancement decisions?
  PRONOM: TNA
  GDFR: community-based
Who adds and edits format information?
  PRONOM: TNA (does take addition requests)
Where is the format information physically located?
  PRONOM: at TNA
  GDFR: replicated in different geographic locations

Relationship to PRONOM – what’s the solution?
Recognize there is a problem – DONE
Mutual willingness to resolve
  TNA desire to participate in a GDFR pilot
Common web service API across the registries?
  PRONOM could become a GDFR node
  PRONOM and GDFR could each support a new web service API
Cross-walk PRONOM PUIDs and GDFR GFIDs?
Use common format identification tools (DROID, JHOVE, etc.) with either registry
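One way to picture the PUID/GFID cross-walk raised on this slide is a two-way lookup table. This is a sketch only: the `Crosswalk` class is hypothetical, and the identifiers below are placeholders rather than real PUIDs or GFIDs, since the slides do not give the GFID syntax or any actual pairings.

```python
from typing import Iterable, Optional


class Crosswalk:
    """Hypothetical one-to-one mapping between PRONOM PUIDs and GDFR GFIDs."""

    def __init__(self, pairs: Iterable[tuple[str, str]]) -> None:
        self._to_gfid: dict[str, str] = {}
        self._to_puid: dict[str, str] = {}
        for puid, gfid in pairs:
            # Reject anything that would make the mapping ambiguous.
            if puid in self._to_gfid or gfid in self._to_puid:
                raise ValueError(f"duplicate identifier in pair ({puid!r}, {gfid!r})")
            self._to_gfid[puid] = gfid
            self._to_puid[gfid] = puid

    def gfid_for(self, puid: str) -> Optional[str]:
        return self._to_gfid.get(puid)

    def puid_for(self, gfid: str) -> Optional[str]:
        return self._to_puid.get(gfid)


# Placeholder identifiers, not real registry entries.
crosswalk = Crosswalk([
    ("puid-placeholder-1", "gfid-placeholder-1"),
    ("puid-placeholder-2", "gfid-placeholder-2"),
])
print(crosswalk.gfid_for("puid-placeholder-1"))
print(crosswalk.puid_for("gfid-placeholder-2"))
```

A strict one-to-one table is the simplest case. If the two data models differ in granularity, as the "overlapping" data-model point on the earlier slide suggests they might, a real cross-walk would likely need one-to-many entries.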

Issues and Observations Dale Flecker

Use cases Andrea Goethals

Use cases – 3 sets (see handout)
Higher-level use cases submitted by many institutions (early 2003)
Lower-level use case model created for the software design (2006-7)
Use cases arising from informal talks and meetings

Key use cases – discussed but not supported
Determine duplicates
Notifications/warnings
Determine migration/emulation pathways
Determine at-risk formats (machine-actionable risk assessments)
Support the registry & discovery of GDFR nodes
Authentication of nodes and users (outside the UI)
Storage of local profiles separate from central formats
Synchronizations based on vetted or non-vetted data
Determine “quality” of format information
Multiple source nodes

Use cases – common issues
How evaluative should GDFR be?
  Neutral vs judgmental
Are services in the scope of GDFR?
  Should GDFR provide services directly (notifications, validation, etc.) or should GDFR be a reference that can be used by external services?

Discussion of pilot All

Discussion of pilot Purposes

Discussion of pilot Pilot use cases

Discussion of pilot Process

Discussion of pilot Participants

Review next steps from the GDFR Governance Workshop Report Richard Steinbacher Robert Chadduck

Outreach to other interested parties All

Next steps? All