Identifying Barriers To File Rendering In Bit-level Preservation Repositories A Preliminary Approach Kyle R. Rimkus, University Library Scott D. Witmer,


1 Identifying Barriers To File Rendering In Bit-level Preservation Repositories
A Preliminary Approach
Kyle R. Rimkus, University Library
Scott D. Witmer, School of Information Sciences

Medusa Preservation Repository
As of November, 2016:
30,000,000 files
86+ TB of storage (replicated 2x+1 backup)

Sources of Medusa content:
Digitization: in-house and with external vendors
Books, newspapers, documents
Manuscripts, photographs, maps
Audio and video
Born digital electronic records
Self-deposit of scholarly materials in IDEALS institutional repository

2 Digital Preservation Challenge: Identifying and evaluating trusted file formats

3 File Rendering Profile
Testing Random Samples against profile
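
The slide does not say how the random samples were drawn. As a minimal sketch, assuming a simple random sample of a fixed size per file extension, the selection step might look like the following. The function name, the `per_format` parameter, and the example paths are all hypothetical and are not taken from the presentation.

```python
import random
from collections import defaultdict
from pathlib import PurePosixPath

def sample_by_extension(paths, per_format, seed=0):
    """Draw up to `per_format` random paths for each file extension."""
    rng = random.Random(seed)  # fixed seed so the same sample can be redrawn
    groups = defaultdict(list)
    for p in paths:
        # Files without an extension are grouped under a placeholder key.
        ext = PurePosixPath(p).suffix.lower() or "(none)"
        groups[ext].append(p)
    return {
        ext: rng.sample(files, min(per_format, len(files)))
        for ext, files in sorted(groups.items())
    }

# Hypothetical inventory; a real run would list the repository's files.
inventory = (
    [f"batch1/page_{i:04d}.tif" for i in range(50)]
    + [f"batch2/scan_{i:04d}.jp2" for i in range(30)]
    + ["records/README"]
)
sample = sample_by_extension(inventory, per_format=5)
for ext, files in sample.items():
    print(ext, len(files))
```

Each sampled file would then be opened in whatever software the rendering profile assigns to its format, and the outcome recorded as pass or fail.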

4
Reasons | Reason Type | Total
System file not within scope of current testing | out of scope | 48
Auxiliary file created and used by a software program, not meant to be opened as individual file | out of scope | 12
Not meant to be opened - Mac system file with underscore in name | out of scope | 9
Not a file - artifact of disk formatting | out of scope | 5
Software available on market, but testers have not yet acquired it | out of scope | 2
Not meant to be opened - software system file symbol in name | out of scope | 1
Not meant to be opened - temporary file with ~$ in name | out of scope |
TOTAL OUT OF SCOPE | | 78
No file extension | file management | 16
Despite file extension, file is in a folder designating it for another system purpose | file management | 14
Not a file extension | file management |
Saved with incorrect extension | file management |
TOTAL FILE MANAGEMENT | | 34
Software considers file invalid | problematic file | 13
File does not render in software | problematic file | 3
Software unavailable | problematic file |
Software attempts to convert file to new version of format and fails. | problematic file |
TOTAL PROBLEMATIC FILE | | 18
TOTAL ALL CATEGORIES | | 130

Testing Profile | Pass | Fail | Total Tested
TIFF | 1276 | 1 | 1277
JPEG | 1124 | 13 | 1137
JPEG2000 | 325 | 434 | 759
XML | 540 | 2 | 542
Rows with incomplete columns in the transcript: PDF 402; GIF 192 3; HTML 130; TXT 114; EMLX 81; DOC 37 39
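
To compare formats, the pass and fail counts can be turned into failure rates. The sketch below uses only the four Testing Profile rows that list pass, fail, and total (TIFF, JPEG, JPEG2000, XML). The helper name `failure_rate` is an illustration and not part of the presentation.

```python
# Pass/fail counts copied from the Testing Profile rows that list
# all three columns (pass, fail, total tested).
results = {
    "TIFF": (1276, 1),
    "JPEG": (1124, 13),
    "JPEG2000": (325, 434),
    "XML": (540, 2),
}

def failure_rate(passed: int, failed: int) -> float:
    """Return the share of tested files that failed to render."""
    return failed / (passed + failed)

rates = {fmt: failure_rate(p, f) for fmt, (p, f) in results.items()}

# Print formats from highest to lowest failure rate.
for fmt, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{fmt:<9} {rate:6.1%}")
```

JPEG2000 stands apart: 434 of 759 tested files failed, a little over half, while each of the other three formats failed in roughly 1% of tests or fewer.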

5 Conclusion and Next Steps
Revisit JPEG 2000 Policy
Remediate problem files to TIFF format
Use TIFF for preservation master files
Explore born digital electronic records

Based on these results, we recommend that the batches of problematic JPEG 2000 files in Medusa be isolated and remediated to TIFF format. The failure rate of this subpopulation indicates that there may be around 700,000 files whose structure is unreadable by the Library’s image rendering software. These files don’t represent an immediate preservation risk. They can be reformatted. However, they do represent an access barrier to users.

UIUC Library has also shifted its policy to using TIFF for preservation master files. JPEG 2000 will still be used, but in a more limited scope. Both online web applications and back-end image presentation systems render JPEG 2000 files quickly and efficiently, ensuring its ongoing value to the UIUC Library.

The next stage of research involves the development of an improved methodology for testing collections of born digital electronic records. Born digital records make up only 2 out of 60 TB of Medusa storage, but they are disproportionately represented in failed tests. 52 of the non-JPEG 2000 failures originated from born digital collections. This failure rate demands closer attention from Medusa’s preservation managers and perhaps a reconsideration of the appraisal and curation requirements of born digital records.

In conclusion, our analysis revealed some of the access challenges curators and patrons may face when attempting to open files stewarded in the Medusa repository. While we don’t know if these local challenges are shared by other institutions, we hope that the testing method and evidence-based approach demonstrated here may be useful to others in assessing their own file rendering issues.
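
The talk does not show how the estimate of around 700,000 files was derived. As a hedged illustration of the arithmetic behind this kind of extrapolation, the sketch below takes the JPEG 2000 sample result from slide 4 (434 failures in 759 tests), attaches a 95% Wilson score interval to the failure rate, and back-computes the population size that a 700,000-file estimate would imply if the sample rate applied uniformly. The Wilson interval is a swapped-in standard technique, not the authors' stated method, and the implied population figure is an assumption-driven calculation that the presentation does not report.

```python
import math

def wilson_interval(failures: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for 95%)."""
    p = failures / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

failed, tested = 434, 759  # JPEG 2000 sample result from slide 4
rate = failed / tested
low, high = wilson_interval(failed, tested)

# ASSUMPTION: the sample failure rate applies uniformly to the affected
# JPEG 2000 population. The talk gives the ~700,000 estimate but not the
# population size, so this value is only what that estimate would imply.
estimated_unreadable = 700_000
implied_population = estimated_unreadable / rate

print(f"failure rate: {rate:.1%} (95% interval {low:.1%} to {high:.1%})")
print(f"implied population: {implied_population:,.0f} files")
```

The interval matters because the extrapolated count scales directly with the rate: a few percentage points of sampling uncertainty translate into tens of thousands of files at this population size.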

