Building a Repository for Digital Objects
Allen Mullen
Texas State Library & Archives Commission

Overview
- Considerations for Developing a Permanent Repository for Digital Objects
- Texas State Library’s Electronic Publications Repository project

Background
- Publication Repositories vs. other Digital Object projects
- Standards in use

Types of Digital Object projects
- Records management preservation
- Digital image access/preservation projects
- Electronic journal access/preservation projects

Government Publication Repositories
- National Library of Australia
- NEDLIB
- National Library of Canada
- GPO and U.S. States

Standards and models
- Open Archival Information System (OAIS)
- Document Object Model
- Preservation metadata
- XML framework

OAIS
- NASA developed; proposed ISO standard
- Framework for archive architecture; establishes concepts and processes
- Processes: Ingest, Data Management, Archival Storage, Access, and Administration
- A Data Object, interpreted using its Representation Information, yields an Information Object

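The relationship between a Data Object, its Representation Information, and the resulting Information Object can be sketched as follows. This is an illustrative sketch only: the class and function names are assumptions made for this example, not an OAIS-defined API, and a text encoding stands in for Representation Information in general.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataObject:
    """The raw bit stream held in archival storage."""
    payload: bytes


@dataclass(frozen=True)
class RepresentationInformation:
    """What is needed to interpret the bits (here, only a text encoding)."""
    media_type: str
    encoding: str


@dataclass(frozen=True)
class InformationObject:
    """A Data Object together with the interpretation of its bits."""
    data: DataObject
    representation: RepresentationInformation
    content: str


def interpret(data: DataObject, rep: RepresentationInformation) -> InformationObject:
    # Decoding stands in for the general act of applying Representation Information.
    return InformationObject(data, rep, data.payload.decode(rep.encoding))


bits = DataObject("Annual Report 2000".encode("utf-8"))
rep = RepresentationInformation(media_type="text/plain", encoding="utf-8")
info = interpret(bits, rep)
print(info.content)
```

Without the Representation Information the payload is only bytes, which is why an archive has to preserve both together.
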
Document Object Model
- W3C Recommendation: an open standard for dynamically accessing and updating documents
- Potential for preservation: work on the basis of collections sharing common characteristics

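A minimal sketch of "access and update" through the DOM, using Python's standard `xml.dom.minidom` module. The sample document and the element names (`publication`, `title`, `agency`) are invented for illustration.

```python
from xml.dom.minidom import parseString

source = "<publication><title>Draft Report</title></publication>"
doc = parseString(source)

# Access: locate the title element and read its text node.
title = doc.getElementsByTagName("title")[0]
old_text = title.firstChild.data

# Update: change the text in place and append a new element.
title.firstChild.data = "Final Report"
agency = doc.createElement("agency")
agency.appendChild(doc.createTextNode("Example Agency"))
doc.documentElement.appendChild(agency)

updated = doc.documentElement.toxml()
print(updated)
```

Because every document in a collection that shares the same structure can be read and changed through the same calls, processing can be written once per collection rather than once per document.
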
XML Framework
- The platform and the content must both be “persistent” for preservation
- Collection context: “infrastructure independent” platform for data exchange
- Presentation context: XML-based objects for platform-independent migration

Preservation Metadata Standards
- National Library of Australia, CEDARS, NEDLIB and Library of Congress
- Facets: descriptive, access, administrative, technical, legal
- OCLC/RLG work

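One way to picture the five facets is as sections of a single record. The sketch below is an assumption made for illustration, not the schema of any of the initiatives named above; the field contents are hypothetical.

```python
from dataclasses import dataclass, field, fields


@dataclass
class PreservationMetadata:
    """One dictionary per facet named on the slide."""
    descriptive: dict = field(default_factory=dict)
    access: dict = field(default_factory=dict)
    administrative: dict = field(default_factory=dict)
    technical: dict = field(default_factory=dict)
    legal: dict = field(default_factory=dict)

    def missing_facets(self) -> list:
        """Return the names of facets that have no entries yet."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


record = PreservationMetadata(
    descriptive={"title": "Annual Report 2000"},
    technical={"format": "application/pdf"},
)
print(record.missing_facets())
```

A check like `missing_facets()` could be run at ingest to flag records that are not yet complete enough to support preservation.
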
Other considerations
- Migration or emulation?
- Authenticity
- Version control
- Persistent locator

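Authenticity, version control, and persistent locators interact: each deposited version needs a stable name and a way to show it has not changed. The sketch below assumes a SHA-256 checksum as the authenticity check and a made-up `urn:x-example:` locator scheme; both are illustrative choices, not decisions taken from the project.

```python
import hashlib


def checksum(content: bytes) -> str:
    """Fixity value recorded when a version is deposited."""
    return hashlib.sha256(content).hexdigest()


def locator(publication_id: str, version: int) -> str:
    # Hypothetical URN-style scheme; a real repository would choose its own.
    return f"urn:x-example:{publication_id}:v{version}"


class VersionRegistry:
    def __init__(self) -> None:
        self._versions = {}   # publication_id -> number of versions deposited
        self._checksums = {}  # locator -> checksum recorded at deposit

    def deposit(self, publication_id: str, content: bytes) -> str:
        """Record a new version and return its persistent locator."""
        version = self._versions.get(publication_id, 0) + 1
        self._versions[publication_id] = version
        key = locator(publication_id, version)
        self._checksums[key] = checksum(content)
        return key

    def is_authentic(self, key: str, content: bytes) -> bool:
        """True if the content still matches the checksum stored for this locator."""
        return self._checksums.get(key) == checksum(content)


registry = VersionRegistry()
first = registry.deposit("annual-report", b"2000 edition")
second = registry.deposit("annual-report", b"2000 edition, revised")
print(first, second)
print(registry.is_authentic(first, b"2000 edition"))
```

Earlier versions keep their own locators, so a citation to the first version still resolves after a revision is deposited.
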
Texas Government Information Repository
- Background
- Key considerations
- Key decisions
- Project progress

Repository Background
- Migration of Government Publishing
- TRAIL and Dublin Core
- Print Publication Depository system

Key Considerations
- What to collect?
- How to collect?
- How to process?
- How to store?
- How to preserve?
- How to access?

Challenges of Publication Framework
- Multiple formats for publications
- Multiple files comprising single publications
- Multiple versions of publications
- Relative vs. Dynamic linking
- Difficult formats

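The linking challenge can be illustrated by sorting the links found in a harvested page. This sketch assumes one reading of the slide: relative links can travel with the publication's files, absolute links point to a specific host, and links carrying a query string are generated dynamically and may not survive capture. The function name and the three labels are hypothetical.

```python
from urllib.parse import urlsplit


def classify_link(href: str) -> str:
    """Label a link as 'dynamic', 'absolute', or 'relative'."""
    parts = urlsplit(href)
    if parts.query:
        # A query string suggests the target is generated on request.
        return "dynamic"
    if parts.scheme or parts.netloc:
        return "absolute"
    return "relative"


links = [
    "chapter2.html",
    "http://www.example.gov/report/index.html",
    "/cgi-bin/report?id=42",
]
for href in links:
    print(href, classify_link(href))
```
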
Key Decisions
- Collection Development Policy
- Manual harvest
- Collections-based processing; conversion to XHTML (or other XML)
- Archives and redundant depository
- Migration or emulation?
- Enhanced Dublin Core

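As a rough illustration of Dublin Core description, the sketch below builds a small XML record using the standard Dublin Core element namespace. The wrapper element `record`, the helper name, and the sample values are assumptions for this example; the project's "enhanced" elements are not specified on the slide and are not shown.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(values: dict) -> ET.Element:
    """Wrap each name/value pair in a dc: element under a hypothetical <record> root."""
    root = ET.Element("record")
    for name, value in values.items():
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = value
    return root


record = dublin_core_record({
    "title": "Annual Report 2000",
    "creator": "Example Agency",
    "date": "2001",
    "format": "application/xhtml+xml",
})
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```
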
Project progress
- Research phase completed
- RFIs issued and evaluated
- Integration of digital images projects
- Purchasing/implementation in Autumn 2001
- Integration via Z39.50
- Future work to be done