Digital Preservation Steps 1 & 2: Identify & Select
Steps
Identify - what digital content do you have?
Select - what portion of that content will be preserved?
Store - what issues are there for long-term storage?
Protect - what steps are needed to protect your digital content?
Manage - what provisions are needed for long-term management?
Provide - what considerations are there for long-term access?
DPOE Baseline Modules: Identify, version 2.0, Nov 2011
DPOE Baseline Concepts: Identify, Select, Store, Protect, Manage, Provide
Problem Summary
Preservation is a resource commitment, and not all digital content will be preserved, so we need to plan effectively for our current and future preservation needs.
Solution: An explicit inventory is the best way to identify content.
DPOE Baseline Modules: Identify, version 3.0
How will an inventory help?
Good preservation decisions are based on a deep understanding of the possible content to be preserved.
Diagram labels: All Content; Possible to preserve; Actually preserved
Inventory Considerations
An inventory’s content is more important than its style and format. However, inventory results should preferably be:
- Scalable: content will be added during Select
- Available: accessible to team, managers, others
- Usable: simple format to sort, list, etc.
- Current: update periodically
- Electronic: needs to be in a dynamic format
- Documented: an inventory needs to be captured
Inventory Tips
Use available, familiar software to get started
- What software or tools do you already have?
- What free or open source tools might be useful?
Be consistent, comprehensive, and concise
Inventory Scope
- What content are we already preserving?
- What other digital content do we have?
- What content do/will our producers create?
- What content are we required to keep?
- What content do we need to review?
Exercise
Where is all our content?
At your tables, think about the digital content at your library, where it’s located, and what kinds of files there are.
Level of Detail
Inventories can range from general to detailed.
Determine the appropriate level of detail for you.
Factors in determining level of detail:
- Extent of content to be inventoried
- Nature and location of content to be inventoried
- Resources available to complete inventory
- Timeframe, deadlines for completing inventory
Content Categories
Inventories should include all relevant categories, e.g.:
- Institutional records
- Special collections & archives
- Scholarly content - licensed and open
- Research data
- Web content
Format Types
An inventory should identify format types within categories of content. Indicate the range of file types when possible.
Examples:
- Images
- Video
- Audio
- Text
- Maps/geospatial
- Drawings
- Web content
- Structured data
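One possible way to indicate the range of file types is to tally file extensions against the format types listed above. This is only a sketch: the function name `tally_formats`, the extension-to-type mapping, and the sample file names are assumptions made for illustration, not part of the DPOE module, and a file extension is only a rough proxy for the actual format.

```python
from collections import Counter
from pathlib import PurePath

# Hypothetical mapping from file extension to the format types on the slide.
# Extend or replace it to suit your own holdings.
FORMAT_TYPES = {
    ".tif": "Images", ".tiff": "Images", ".jpg": "Images",
    ".mp4": "Video", ".mov": "Video",
    ".wav": "Audio", ".mp3": "Audio",
    ".txt": "Text", ".pdf": "Text",
    ".shp": "Maps/geospatial",
    ".dwg": "Drawings",
    ".html": "Web content",
    ".csv": "Structured data", ".xml": "Structured data",
}

def tally_formats(file_names):
    """Count files per format type; unknown extensions become 'Unidentified'."""
    counts = Counter()
    for name in file_names:
        ext = PurePath(name).suffix.lower()
        counts[FORMAT_TYPES.get(ext, "Unidentified")] += 1
    return dict(counts)

# Made-up file names, used only to show the shape of the result.
sample = ["circus_001.TIF", "circus_002.tif", "interview.wav",
          "notes.txt", "unknown.xyz"]
print(tally_formats(sample))
```

Anything that lands in "Unidentified" is a candidate for a closer look before the inventory is considered complete.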
Date Considerations
Inventories should note:
- Date of inventory, and updates to it
- Date of files, when possible
- Dates covered in content, even approximate
- Date created/received, if relevant and possible
Location Issues
Locations of content are important. Consider:
- Method to specify online/offline location
- General location, e.g., with us, with creator
- Ability to change locations as content moves
- Method storage systems use to note location
Be clear enough without going to extremes…
Location Cloud Platform
Sample Basic Inventory
Category: Special Collections - Slides
Title/Description: Circus photographs
Type: Images, digitized
Format: TIFF
Extent: 242 GB, 2250 images
Location: Server (Systems), CDs (Digital Center)
Coverage dates: early 1950s
Creation date:
Inventoried: by Andrew Huot, November 2013
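A record like the sample above could be kept in a plain CSV file, which is electronic, easy to sort and list, and opens in familiar spreadsheet software. The following is a minimal sketch under that assumption: the `FIELDS` list and the `to_csv` helper are hypothetical names introduced here, and the values are copied from the sample record.

```python
import csv
import io

# Column names follow the sample record; the list itself is an assumption.
FIELDS = ["Category", "Title/Description", "Type", "Format", "Extent",
          "Location", "Coverage dates", "Creation date", "Inventoried"]

record = {
    "Category": "Special Collections - Slides",
    "Title/Description": "Circus photographs",
    "Type": "Images, digitized",
    "Format": "TIFF",
    "Extent": "242 GB, 2250 images",
    "Location": "Server (Systems), CDs (Digital Center)",
    "Coverage dates": "early 1950s",
    "Creation date": "",  # blank in the sample record
    "Inventoried": "Andrew Huot, November 2013",
}

def to_csv(records):
    """Serialize inventory records to CSV text with one header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()

csv_text = to_csv([record])
print(csv_text)
```

Values that contain commas, such as the extent and location, are quoted automatically by the `csv` module, so the file still opens as one row per record.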
Steps
Identify - what digital content do you have?
Select - what portion of that content will be preserved?
Store - what issues are there for long-term storage?
Protect - what steps are needed to protect your digital content?
Manage - what provisions are needed for long-term management?
Provide - what considerations are there for long-term access?
DPOE Baseline Modules: Select, version 2.0, Nov 2011
Why be selective?
Storage may be cheap, management is not … especially over time
TB hard drive = $100
IT department = $100/hour
Why be selective?
Quality of content
DPOE Baseline Modules: Select, version 3.0
Why be selective?
Discovery and dissemination services … scale, scope, performance, sustainability
Why be selective?
Match mission to content: What kind of content would this organization preserve?
Cottonwood Foundation, a charitable grant-making organization, is dedicated to promoting empowerment of people, protection of the environment, and respect for cultural diversity.
Terminology for Select
Different terms in different domains:
- Archives - appraisal and scheduling
- Libraries - e.g., selection
- Museums - e.g., acquisition
- Records Management - vital and non-vital
- Commercial media - channelization
Steps to Select
- Review your potential digital content
- Define and apply selection criteria
- Document (and preserve) selection decisions
- Implement your decisions
Tons of Review Priorities
- Most significant (producer, content)
- Most extensive
- Most requested
- Easiest (e.g., most familiar)
- Oldest (possible historical importance)
- Newest (possible immediate interest)
- Mandate (local, legislation, etc.)
Another layer: Audience / Stakeholders
Quick Review Tool
Stop if or when the answer is ‘no’…
1. Content
- does the content have value (consider stakeholders)?
- does it fit your scope?
2. Technical
- is it feasible for you to preserve the content?
3. Access
- is it possible to make the content available?
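The stop-at-the-first-"no" logic of the quick review tool can be sketched as a short function. The question keys, the `quick_review` name, and the two sample answer sets below are hypothetical and exist only to illustrate the order of the checks; a missing answer is treated as "no" in this sketch.

```python
# Questions in the order given above, as (stage, key) pairs.
# The keys are hypothetical labels chosen for this sketch.
QUESTIONS = [
    ("Content", "has_value"),
    ("Content", "fits_scope"),
    ("Technical", "feasible_to_preserve"),
    ("Access", "can_be_made_available"),
]

def quick_review(answers):
    """Walk the questions in order and stop at the first 'no'.

    Returns (True, None) if every answer is yes, otherwise
    (False, (stage, key)) naming the question that stopped the review.
    """
    for stage, key in QUESTIONS:
        if not answers.get(key, False):
            return False, (stage, key)
    return True, None

# Made-up answer sets for two candidate collections.
collection_a = {"has_value": True, "fits_scope": True,
                "feasible_to_preserve": True, "can_be_made_available": True}
collection_b = {"has_value": True, "fits_scope": True,
                "feasible_to_preserve": False, "can_be_made_available": True}

print(quick_review(collection_a))
print(quick_review(collection_b))
```

Returning the question that stopped the review, rather than a bare yes or no, gives you something concrete to record when documenting the selection decision.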
To Select or Not to Select?
Stakeholders
- Archive researchers
- Television viewers
- Stockholders
- Web site users
Policy questions
- Is it held in other archives?
- Does it have use value beyond this one project?
- Does it fit our policy?
Selection can
Project Management
Treat selection as an ongoing structured project to plan and coordinate the process
Augment Inventory
Add Descriptions - more granular
- Not item level, but enough to specify categories (additional information from the creator)
"Oh by the way, that whole part about me being Italian in my diary is only half true."
Augment Inventory
Supplement inventory from Identify
Extent
- How much content is there/will there be? (number of files, megabytes, number of subfolders)
- When will content no longer be active/disposition?
A file full of footage has several extents: number of files, length of footage, number of gigabytes, number of reels digitized, number of subseries, etc.
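Some of these extent figures (number of files, number of subfolders, total size) can be gathered automatically for content that sits on a file system. A minimal sketch using Python's standard `os.walk`; the function name `measure_extent` and the shape of its result are assumptions, and figures such as length of footage or reels digitized would still have to come from other sources.

```python
import os

def measure_extent(root):
    """Return the number of files, number of subfolders, and total bytes under root."""
    files = subfolders = total_bytes = 0
    for dirpath, dirnames, filenames in os.walk(root):
        subfolders += len(dirnames)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):  # skips broken symbolic links
                files += 1
                total_bytes += os.path.getsize(path)
    return {"files": files, "subfolders": subfolders, "bytes": total_bytes}
```

Calling `measure_extent` on the top-level folder of a collection gives figures that can be copied into the Extent field of the inventory and refreshed whenever the inventory is updated.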
Outcomes for identifying & selecting content to preserve
- Expand inventories of content.
- Permit agreements with producers, such as retention schedules, acquisition lists, and submission agreements.
Objectives
- Gain control of possible content for planning.
- Develop a sustainable program.
- Identify potential digital content you may need to preserve.
- Treat the inventory as a management tool that grows as your program grows.
- Use it as a planning tool to prepare, e.g., for staff, training, and annual growth.
- Use it as a basis for acquiring content and for defining submission agreements and plans.