Global Digital Format Registry Progress
Andrea Goethals, Harvard University Library
NDIIPP Digital Preservation Partners’ Meeting
Arlington, VA
July 9, 2008
Agenda
1. GDFR?
2. History & Context
3. Relationship to PRONOM
4. Current state
5. Upcoming plans
6. Challenges ahead
7. Questions
GDFR?
What?
- “Global Digital Format Registry”
- A pooled body of information about digital formats
- Both common and obscure formats: XML to Quake II 3D Model File
- Information: format-specific + associated “agents”, technologies, specifications, assessments
Why?
- Reference for digital preservation activities
GDFR History
GDFR History in Context of PRONOM
GDFR-PRONOM Relationship
- Two different “format” registries
- How many format registries does the digital preservation community need?
- Depends on how different they are…
GDFR-PRONOM core differences
1. Who governs the registry and makes policy, scope and enhancement decisions?
   PRONOM: TNA
   GDFR: community-based
2. Who adds and edits format information?
   PRONOM: TNA (accepts requests)
   GDFR: community-based
3. Where is the format information physically located?
   PRONOM: at TNA
   GDFR: replicated in different geographic locations
Sufficient differences to continue with GDFR
Current state
GDFR Home website
It moved!
- Old GDFR Home:
- New GDFR Home:
- All existing GDFR docs migrated from the old GDFR Home website
Over the next month
- Updated documentation!
- Demo source node?
Architecture
Currently:
- One GDFR source node: where all data additions and edits are performed
- Many GDFR mirror nodes: replicated data
Future?
- Multiple GDFR source nodes?
- Multiple interoperable format registry source nodes?
“Discoverable” from GDFR Home website
Each node has 2 interfaces
- For humans: user interface
- For machines: web service interface
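The current one-source, many-mirrors arrangement can be pictured with a small sketch. This is an illustration only, not the GDFR software: the class names, the version counter, and the `sync` method are all assumptions made for the example, and the real replication mechanism is not described on these slides.

```python
from dataclasses import dataclass, field


@dataclass
class SourceNode:
    """Hypothetical source node: the only place records are added or edited."""
    records: dict = field(default_factory=dict)
    version: int = 0  # assumed counter, bumped on every change

    def put(self, format_id: str, description: str) -> None:
        """Add or edit a format record and advance the version."""
        self.records[format_id] = description
        self.version += 1


@dataclass
class MirrorNode:
    """Hypothetical mirror node: holds a replica and never edits it locally."""
    records: dict = field(default_factory=dict)
    version: int = 0

    def sync(self, source: SourceNode) -> bool:
        """Copy the source's records if the mirror is behind; return True if updated."""
        if self.version == source.version:
            return False
        self.records = dict(source.records)
        self.version = source.version
        return True

    def get(self, format_id: str):
        """Read a record from the local replica, or None if it is absent."""
        return self.records.get(format_id)
```

In this sketch a mirror only sees an edit after its next `sync`, which is the trade-off of keeping all writes on a single source node.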
GDFR source node
- Housed by Harvard for now
- In test mode until August 1, then publicly available in beta mode
- Populated with test data: ~2000 formats from Magic database
- Will need an authorized account to add/edit data
GDFR mirror nodes
- Test mirror nodes at OCLC and Harvard
- Anyone will be able to run a mirror node
- Software available for download August 1 from the GDFR Home website
- Installation & configuration: half day
- Can brand your mirror node
User interface
Mirror node
- Search, browse, read, export, manage node
Source node
- Same as mirror node
- Plus: add, edit
Sneak preview
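The split between mirror-node and source-node actions can be written down as a simple capability check. This is a sketch under stated assumptions: the function name `allowed` and the `authorized` flag are hypothetical, and the rule that add/edit needs an authorized account is taken from the source-node slide rather than from any published GDFR interface.

```python
# Actions listed on the slide for each node type.
MIRROR_ACTIONS = frozenset({"search", "browse", "read", "export", "manage node"})
WRITE_ACTIONS = frozenset({"add", "edit"})
SOURCE_ACTIONS = MIRROR_ACTIONS | WRITE_ACTIONS  # same as mirror, plus add and edit


def allowed(node_type: str, action: str, authorized: bool = False) -> bool:
    """Return True if the action is available on the given node type.

    Assumption: add/edit on a source node also requires an authorized account.
    """
    if node_type == "mirror":
        return action in MIRROR_ACTIONS
    if node_type == "source":
        if action in WRITE_ACTIONS:
            return authorized
        return action in SOURCE_ACTIONS
    raise ValueError(f"unknown node type: {node_type!r}")
```

For example, `allowed("mirror", "edit")` is false for every user, while `allowed("source", "edit", authorized=True)` is true.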
Upcoming plans
Tomorrow: pilot planning meeting
Pilot purposes
- Create process for building the registry
- Integrate GDFR with tools, repository software and workflows
- Governance questions (headed by NARA)
Challenges ahead
- Resolving relationship to PRONOM
- Quality or quantity?
- Is there a large enough community of format contributors?
- Do we need a simpler data entry interface?
- Integrating tools and services
- How to handle documentation
- Preservation
- Proprietary formats
- Governance
Questions?