Published by Annice Ella Mitchell. Modified over 9 years ago.
1
The Evolving Process to Add Preservation Support for New Formats at Harvard Library
IS&T Archiving 2015
Andrea Goethals, Franziska Frey, and David Ackerman
2
Digital Repository Service (DRS), Harvard Library’s Digital Preservation Repository
3
DRS “Support”
– Allowable in at least one DRS “content model”
– Repository tools “know” the format
– Usable now (e.g. through delivery services)
– Preservation staff reasonably certain it can be made usable on an ongoing basis via interventions
4
Formats Supported Per Year (chart labels): Text, XML, Target Images, Kodak PhotoCD, GIF, RealAudio, TIFF, JPEG, JP2, ICC color profiles, AIFF, ESRI world files, WAV, SMIL playlists, Web harvests, GZIP containers, ZIP containers, PDF documents, Email
5
55 Harvard Units Using the DRS
6
Born Digital Formats in Harvard Libraries (chart)
Axis: Number of Libraries (out of 21 that answered)
Series: Already have; Will have in 3 years
Source: HL Preservation Needs Assessment (2013)
7
DRS Format Requests (2004 -) Chart last updated: 12/23/2013 (39 requests for 53 formats)
8
DRS Format Requests (2004 -)
9
Support Gap: Additional audio, Video, Vector graphics, PDF, Databases, E-articles, Datasets, DNG, Word processing docs, Spreadsheets, Email, Software, CAD, Presentations, 3D Models, E-books, E-newspapers, Python notebooks, Shapefiles, Disk images
10
2008: Stop-Gap Solution
“Opaque objects and containers”
Any format, BUT...
– Only bit-level preservation
– No delivery
– Very coarse description
– Less attention by preservation staff
Moderate uptake: < 20,000 Zip files
11
Adding Format Support – Old Workflow
All analysis & development done in-house
– by existing staff
– concurrently with other projects / operations
– intermittently (requiring re-familiarization)
Sometimes stalled by lack of expertise
Ad-hoc, undocumented process
12
Fast-Tracking Experiment
3-year project enabled by the Arcadia Foundation
Formats:
– video
– word processing
– vector graphics
– 3D graphics
– disk images
– image stacks
– spreadsheets
– presentations
Goal: create a faster format support workflow that can be repeated
13
Old way: sequential development, one after the other, in between other work
(diagram: Analysis & Development for New Format A, then Analysis & Development for New Format B)
14
New way: 1.) split into 2 sub-projects
(diagram: Analysis & Development for New Format A becomes Analysis for New Format A and Development for New Format A)
15
New way: 2.) hire consultants to help with analysis
(diagram: Analysis for New Format A, Development for New Format A)
16
New way: 3.) schedule independently and in parallel as expertise and resources become available
(diagram: Analysis for New Formats A, B, C, D, E, F; Development for New Formats A, B, C, D, etc...)
Specifications & guidelines are ready in advance for developers
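The scheduling rule on this slide can be sketched as a small dependency check: analysis for any format may start whenever resources allow, and development for a format waits only on that same format's analysis. This is an illustrative sketch, not the project's actual tooling; the names `Task` and `ready_tasks` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    fmt: str    # format label, e.g. "A"
    phase: str  # "analysis" or "development"


def ready_tasks(formats, done):
    """Return the tasks that could start now, given the set of finished tasks.

    Assumption for this sketch: analysis never waits on another format, and
    development waits only on the analysis of its own format.
    """
    ready = []
    for fmt in formats:
        analysis = Task(fmt, "analysis")
        development = Task(fmt, "development")
        if analysis not in done:
            ready.append(analysis)
        elif development not in done:
            ready.append(development)
    return ready
```

With analysis for format A finished, `ready_tasks(["A", "B", "C"], {Task("A", "analysis")})` offers development for A alongside analysis for B and C, which is the parallelism the slide describes.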
17
Format Expert Consultants
– AVPreserve (video, disk images, image stacks)
– Paul Wheatley Consulting (word processing documents, spreadsheets, presentations)
– Applied Informatics Group (AIG) at the College of Computing and Informatics (CCI), Drexel University (vector graphics, 3D formats)
– Tarkus Imaging Inc. (camera raw images)
18
Analysis Tasks
1. Divide up analysis responsibilities
2. Determine format analysis criteria
3. Analyze formats
4. Create format profiles
5. Determine preservation strategy
6. Analyze metadata
7. Design DRS content model
8. Analyze tools
19
Collaboration of Internal & External Experts
Analysis tasks (table columns): Divide respons. | Format criteria | Format analysis | Format profiles | Preserv. strategy | Metad. analysis | DRS content model | Tool analysis
Format groups (values in slide order; column alignment was lost in extraction):
– Video: Internal | External | Combo | External
– Word Processing: Internal | Combo | External | Combo | External | Combo | External
– 2D vector: Internal | Combo | External | Combo | External
– 3D formats: Internal | Combo | External | Combo | External
– Camera raw: Internal | Combo | External | Combo | External
– Image stacks: Internal | Combo | External | Combo | External | Combo | External
– Disk images: Internal | Combo | External | Combo | External | Combo | External
20
Ex. – Video – Format Criteria
Generic criteria, prioritized
– (9) Very important (ex: dependency on a single organization or company)
– (9) Somewhat important (ex: standardized)
– (10) Not very important (ex: descriptive metadata support)
(7) Format-specific criteria, examples:
– Ability to encode in true lossless compression
– Max resolution
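One way to picture how prioritized criteria could feed a comparison between candidate formats is a weighted score. The slide gives only the three priority tiers, so everything else here is an assumption made for illustration: the numeric weights, the function name `score_format`, and the idea of combining criteria into a single number are not described in the presentation.

```python
# Hypothetical tier weights; the slide names the tiers but assigns no numbers.
TIER_WEIGHTS = {"very": 3, "somewhat": 2, "not_very": 1}


def score_format(criteria_tiers, met):
    """Return the weighted fraction of criteria that a format satisfies.

    criteria_tiers: dict mapping criterion name -> tier key in TIER_WEIGHTS
    met: set of criterion names the format satisfies
    """
    total = sum(TIER_WEIGHTS[tier] for tier in criteria_tiers.values())
    earned = sum(
        TIER_WEIGHTS[tier]
        for name, tier in criteria_tiers.items()
        if name in met
    )
    return earned / total if total else 0.0


# Example criteria taken from the slide, one per tier.
criteria = {
    "independent of a single organization": "very",
    "standardized": "somewhat",
    "descriptive metadata support": "not_very",
}
```

A format that is standardized and supports descriptive metadata but depends on a single company would earn 3 of 6 weighted points under these assumed weights.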
21
Ex. – Video – Format Analysis
22
Ex. – Video – Preservation Strategy
Prefer several formats as archival
– uncompressed, JPEG 2000, MPEG-2 and DV (for DV tape)
– provide a video reformatting service for these
Accept a few popular proprietary formats but expect to fast-track migrations for them
– DNxHD, ProRes
Few wrapper formats (QT, MXF)
One delivery format (H.264)
23
Ex. – Video – Metadata Analysis
Technical metadata
– EBU Core 1.5 (aligns well with AES-60, structure mirrors MediaInfo’s output)
Source metadata
– A revised UTVideoSrc (native suitability to physical media, right amount of detail)
Process history
– A revised reVTMD (specific, simple, sufficient)
24
Ex. – Video – DRS Content Model
VIDEO OBJECT = 1 Object Descriptor, 1..n Video Files, 0..n Video Files (shown twice in the diagram, with a HAS_SOURCE relationship)
1 metadata file and 1 or more derivative video files
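The cardinalities on this slide can be read as simple validation rules. The sketch below is not the DRS implementation: the class name `VideoObject`, its field names, and the example file names are hypothetical. It models only the single object descriptor, the 1..n video files, and the HAS_SOURCE link; the slide's second, 0..n file entry is left out because its label is ambiguous in this transcript.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VideoObject:
    descriptor: str                                        # exactly 1 object descriptor
    video_files: List[str] = field(default_factory=list)   # 1..n video files
    has_source: Optional["VideoObject"] = None              # HAS_SOURCE relationship

    def validate(self) -> List[str]:
        """Return a list of cardinality violations (empty when valid)."""
        errors = []
        if not self.descriptor:
            errors.append("missing object descriptor")
        if len(self.video_files) < 1:
            errors.append("at least one video file is required")
        return errors


# Hypothetical usage: a derivative object pointing back to its source object.
master = VideoObject("master.xml", ["master.mov"])
derivative = VideoObject("derivative.xml", ["derivative.mp4"], has_source=master)
```

Returning a list of violations rather than raising on the first one lets a deposit tool report every problem with an object at once.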
25
Diagram of related objects
Object types: VIDEO OBJECT, VIDEO EDIT DECISION LIST OBJECT, DOUBLE SYSTEM AUDIO OBJECT (shown twice), CLOSED CAPTION DATA OBJECT, SUBTITLE DATA OBJECT, POSTER FRAME OBJECT, DISK IMAGE OBJECT
Relationships: HAS_DOCUMENTATION, HAS_LARGER_CONTEXT, HAS_SUPPLEMENT
26
Ex. – Video – Tool Analysis
– Incorporate MediaInfo into FITS (fitstool.info)
– Make FITS track-aware
27
Models for Obtaining Format Expertise
Our old model
– build all expertise internally (slow, inefficient)
Our new model
– build a network of external experts to back up internal experts
Other potential models
– Rely completely on external experts (risky?)
– Digital preservation institutions form a network of experts; declare areas of expertise (NDSA idea)
28
Thank you!