The Evolving Process to Add Preservation Support for New Formats at Harvard Library. IS&T Archiving 2015. Andrea Goethals, Franziska Frey and David Ackerman.

1 The Evolving Process to Add Preservation Support for New Formats at Harvard Library. IS&T Archiving 2015. Andrea Goethals, Franziska Frey and David Ackerman

2 Digital Repository Service (DRS), Harvard Library’s Digital Preservation Repository

3 DRS “Support” – Allowable in at least one DRS “content model” – Repository tools “know” the format – Usable now (e.g. through delivery services) – Preservation staff reasonably certain it can be made usable on an ongoing basis via interventions

4 Formats Supported Per Year [chart; formats in the order shown]: Text, XML, Target Images, Kodak PhotoCD, GIF, RealAudio, TIFF, JPEG, JP2, ICC color profiles, AIFF, ESRI world files, WAV, SMIL playlists, Web harvests, GZIP containers, ZIP containers, PDF documents, Email

5 55 Harvard Units Using the DRS

6 Born Digital Formats in Harvard Libraries [chart: number of libraries (out of 21 that answered); series: "Already have" and "Will have in 3 years"]. Source: HL Preservation Needs Assessment (2013)

7 DRS Format Requests (2004 -) Chart last updated: 12/23/2013 (39 requests for 53 formats)

8 DRS Format Requests (2004 -)

9 Support Gap: Additional audio, Video, Vector graphics, PDF, Databases, E-articles, Datasets, DNG, Word processing docs, Spreadsheets, Email, Software, CAD, Presentations, 3D Models, E-books, E-newspapers, Python notebooks, Shapefiles, Disk images

10 2008: Stop-Gap Solution “Opaque objects and containers” Any format, BUT... – Only bit-level preservation – No delivery – Very coarse description – Less attention by preservation staff Moderate uptake - < 20,000 Zip files

11 Adding Format Support – Old Workflow All analysis & development done in-house – by existing staff – concurrently with other projects / operations – intermittently (requiring re-familiarization) Sometimes stalled by lack of expertise Ad-hoc, undocumented process

12 Fast-Tracking Experiment 3 year project enabled by the Arcadia Foundation Formats: – video – word processing – vector graphics – 3D graphics – disk images – image stacks – spreadsheets – presentations Goal: create a faster format support workflow that can be repeated

13 old way: sequential development, one after the other, in between other work (Analysis & Development for New Format A, then Analysis & Development for New Format B)

14 new way: 1.) split into 2 sub-projects ("Analysis & Development for New Format A" becomes "Analysis for New Format A" and "Development for New Format A")

15 new way: 2.) hire consultants to help with analysis (Analysis for New Format A; Development for New Format A)

16 new way: 3.) schedule independently and in parallel as expertise and resources become available. Analysis: New Formats A, B, C, D, E, F. Development: New Formats A, B, C, D, etc. Specifications & guidelines are ready in advance for developers

17 Format Expert Consultants – AVPreserve (video, disk images, image stacks) – Paul Wheatley Consulting (word processing documents, spreadsheets, presentations) – Applied Informatics Group (AIG) at the College of Computing and Informatics (CCI), Drexel University (vector graphics, 3D formats) – Tarkus Imaging Inc. (camera raw images)

18 Analysis Tasks 1. Divide up analysis responsibilities 2. Determine format analysis criteria 3. Analyze formats 4. Create format profiles 5. Determine preservation strategy 6. Analyze metadata 7. Design DRS content model 8. Analyze tools

19 Collaboration of Internal & External Experts
Analysis tasks (table columns, left to right): Divide respons. | Format criteria | Format analysis | Format profiles | Preserv. strategy | Metad. analysis | DRS content model | Tool analysis
Cell values per format group, left to right (some cells span several columns in the original table, so rows list fewer than eight values):
Video: Internal, External, Combo, External
Word Processing: Internal, Combo, External, Combo, External, Combo, External
2D vector: Internal, Combo, External, Combo, External
3D formats: Internal, Combo, External, Combo, External
Camera raw: Internal, Combo, External, Combo, External
Image stacks: Internal, Combo, External, Combo, External, Combo, External
Disk images: Internal, Combo, External, Combo, External, Combo, External

20 Ex. – Video – Format Criteria Generic criteria, prioritized – (9) Very important (ex: Dependency on a single organization or company) – (9) Somewhat important (ex: standardized) – (10) Not very important (ex: descriptive metadata support) (7) Format-specific criteria, examples: – Ability to encode in true lossless compression – Max resolution
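
One way to picture how prioritized criteria could be applied when comparing candidate formats is a weighted score. The sketch below is illustrative only: the slide names three priority tiers but does not say how they were combined, so the tier weights, the 0-to-1 rating scale, and the names `TIER_WEIGHTS`, `Criterion`, and `score_format` are all assumptions.

```python
from dataclasses import dataclass

# Assumed weights for the three priority tiers named on the slide.
TIER_WEIGHTS = {"very": 3, "somewhat": 2, "not_very": 1}


@dataclass(frozen=True)
class Criterion:
    name: str
    tier: str  # one of the keys of TIER_WEIGHTS


def score_format(ratings, criteria):
    """Return the weighted mean of 0..1 ratings; a missing rating counts as 0."""
    total_weight = sum(TIER_WEIGHTS[c.tier] for c in criteria)
    weighted = sum(TIER_WEIGHTS[c.tier] * ratings.get(c.name, 0.0) for c in criteria)
    return weighted / total_weight


# One example criterion per tier, taken from the slide's examples.
criteria = [
    Criterion("independent_of_single_vendor", "very"),
    Criterion("standardized", "somewhat"),
    Criterion("descriptive_metadata_support", "not_very"),
]

# Hypothetical ratings for an imaginary format (not real analysis results).
example_ratings = {
    "independent_of_single_vendor": 1.0,
    "standardized": 1.0,
    "descriptive_metadata_support": 0.0,
}
example_score = score_format(example_ratings, criteria)
```

With these made-up ratings the two higher tiers contribute 3 + 2 out of a possible 6, so a weakness in a "not very important" criterion costs relatively little.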

21 Ex. – Video – Format Analysis

22 Ex. - Video Preservation Strategy Prefer several formats as archival – uncompressed, JPEG 2000, MPEG-2 and DV (for DV tape) – provide a video reformatting service for these Accept a few popular proprietary formats but expect to fast-track migrations for them – DNxHD, ProRes Few wrapper formats (QT, MXF) One delivery format (H.264)
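
The strategy on this slide amounts to sorting encodings and wrappers into a few roles. A minimal sketch of that lookup follows, assuming simple lowercase string labels; the set names and the functions `classify_encoding` and `wrapper_allowed` are hypothetical and are not part of the DRS.

```python
# Roles taken from the slide; the string labels themselves are assumptions.
ARCHIVAL = {"uncompressed", "jpeg 2000", "mpeg-2", "dv"}  # DV only for DV tape sources
ACCEPTED_PROPRIETARY = {"dnxhd", "prores"}  # accepted, migration expected to be fast-tracked
DELIVERY = {"h.264"}
WRAPPERS = {"qt", "mxf"}


def classify_encoding(encoding):
    """Return the role an encoding plays under the strategy, or 'unsupported'."""
    key = encoding.strip().lower()
    if key in ARCHIVAL:
        return "archival"
    if key in ACCEPTED_PROPRIETARY:
        return "accepted_fast_track_migration"
    if key in DELIVERY:
        return "delivery"
    return "unsupported"


def wrapper_allowed(wrapper):
    """Return True if the wrapper is one of the few listed on the slide."""
    return wrapper.strip().lower() in WRAPPERS
```

For example, `classify_encoding("ProRes")` returns `"accepted_fast_track_migration"`, while an encoding not named on the slide returns `"unsupported"`.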

23 Ex. Video – Metadata Analysis Technical metadata – EBU Core 1.5 (aligns well with AES-60, structure mirrors MediaInfo’s output) Source metadata – A revised UTVideoSrc (native suitability to physical media, right amount of detail) Process history – A revised reVTMD (specific, simple, sufficient)

24 Ex. – Video – DRS Content Model [diagram] VIDEO OBJECT = 1 Object Descriptor, 1..n Video Files, 0..n Video Files; VIDEO OBJECT = 1 Object Descriptor, 1..n Video Files, 0..n Video Files; annotation: 1 metadata file and 1 or more derivative video files; relationship: HAS_SOURCE
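
The cardinalities in the content model (1, 1..n, 0..n) can be read as a small validation rule set. The sketch below checks a list of file roles against such a rule set. The slide does not name the two groups of video files, so the role labels `video_file_required` and `video_file_optional` are placeholders, and `validate` is a hypothetical helper rather than a DRS function.

```python
from collections import Counter

# (role, minimum, maximum); None means unbounded. Role names are placeholders.
VIDEO_OBJECT_MODEL = [
    ("object_descriptor", 1, 1),
    ("video_file_required", 1, None),
    ("video_file_optional", 0, None),
]


def validate(file_roles, model=VIDEO_OBJECT_MODEL):
    """Return a list of cardinality problems; an empty list means the object fits."""
    counts = Counter(file_roles)
    known = {role for role, _, _ in model}
    problems = []
    for role, minimum, maximum in model:
        found = counts.get(role, 0)
        if found < minimum:
            problems.append(f"{role}: need at least {minimum}, found {found}")
        if maximum is not None and found > maximum:
            problems.append(f"{role}: at most {maximum} allowed, found {found}")
    for role in counts:
        if role not in known:
            problems.append(f"{role}: not part of the model")
    return problems
```

An object with one descriptor and one required video file passes with no problems; leaving out the descriptor produces a single "need at least 1" message.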

25 [diagram] Objects: VIDEO OBJECT, VIDEO EDIT DECISION LIST OBJECT, DOUBLE SYSTEM AUDIO OBJECT, CLOSED CAPTION DATA OBJECT, SUBTITLE DATA OBJECT, POSTER FRAME OBJECT, DISK IMAGE OBJECT. Relationships: HAS_DOCUMENTATION, HAS_LARGER_CONTEXT, HAS_SUPPLEMENT

26 Ex. – Video – Tool Analysis Incorporate MediaInfo into FITS (fitstool.info) Make FITS track-aware

27 Models for Obtaining Format Expertise Our old model – build all expertise internally (slow, inefficient) Our new model – build a network of external experts to back up internal experts Other potential models – Rely completely on external experts (risky?) – Digital preservation institutions form a network of experts; declare areas of expertise (NDSA idea)

28 Thank you!

