Page 1 Content from the Library of Congress DPOE Baseline Modules version 2.0, Nov 2011 Jody L. DeRidder University of Alabama Libraries

Presentation transcript:

Page 1 Content from the Library of Congress DPOE Baseline Modules version 2.0, Nov 2011 Jody L. DeRidder University of Alabama Libraries July 16, 2012 An Introduction to Digital Preservation

Page 2 DPOE Modules Identify - what digital content do you have? Select - what portion of that content will be preserved? Store - what issues are there for long-term storage? Protect - what steps are needed to protect your digital content? Manage - what provisions are needed for long-term management? Provide - what considerations are there for long-term access? DPOE Baseline Modules: Intro, version 2.0, Nov 2011

Page 3 Managing Content Over Time: identify, select, store, protect, manage, provide DPOE Baseline Modules: Intro, version 2.0, Nov 2011

Page 4 Why do we identify content? SCOPE! Preservation requires an explicit commitment of resources. Effective planning is based on knowing the extent of what will be preserved. Identifying content is a first step to planning for current and future preservation needs. An explicit inventory is the best way to identify content. DPOE Baseline Modules: Identify, version 2.0, Nov 2011

Page 5 Content Categories Inventories should include all relevant material: institutional records; special collections; scholarly content (licensed and open); research data; web content; digitized collections. DPOE Baseline Modules: Identify, version 2.0, Nov 2011

Page 6 Example entry:
Category: Special Collections
Title/Description: Railroad Photographs, SE U.S.
Type: images, digitized
Format: TIFF
Extent: 242 GB; 2,250 images
Location: archival server in Room A, Central IT
Coverage Dates: early 1900s
Creation date: January-June 2006
Inventoried: 12/15/2011, by Fred Jones
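An inventory entry like the one above could be kept as a structured record rather than free text. The following is only a sketch under assumptions: the class name, the field names, the ISO-style date, and the 1 GB = 1024 MB conversion are illustrative choices, not part of the DPOE modules.

```python
from dataclasses import dataclass, asdict
import json


@dataclass
class InventoryEntry:
    """One inventory row; field names are hypothetical and mirror the example entry."""
    category: str
    title: str
    content_type: str
    file_format: str
    extent_gb: float
    item_count: int
    location: str
    coverage_dates: str
    creation_date: str
    inventoried: str
    inventoried_by: str

    def average_item_mb(self) -> float:
        """Rough average size per item in MB (assumes 1 GB = 1024 MB)."""
        return self.extent_gb * 1024 / self.item_count


entry = InventoryEntry(
    category="Special Collections",
    title="Railroad Photographs, SE U.S.",
    content_type="images, digitized",
    file_format="TIFF",
    extent_gb=242,
    item_count=2250,
    location="archival server in Room A, Central IT",
    coverage_dates="early 1900s",
    creation_date="January-June 2006",
    inventoried="2011-12-15",
    inventoried_by="Fred Jones",
)

# Serialize the record so it can be stored alongside other inventory rows.
print(json.dumps(asdict(entry), indent=2))
print(f"{entry.average_item_mb():.1f} MB per image on average")
```

A structured record makes it easier to total extents across categories when estimating scope.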

Page 7 Selecting Content for Preservation: Why do it? Storage may be cheap, but management is not … especially over time. Sustaining the quality of content takes effort. Continually changing discovery and dissemination services will be needed as hardware and software change … think scale, scope, performance, sustainability. DPOE Baseline Modules: Select, version 2.0, Nov 2011

Page 8 Selection Criteria: matching mission to content… Acquisition or collection development policy Departmental criteria (priorities, precedents) Research criteria (interests, significance) Uniqueness (only source) Value (historical, evidential, can’t reproduce) DPOE Baseline Modules: Select, version 2.0, Nov 2011

Page 9 Practical Considerations Stop if or when the answer is ‘no’… 1. Content – does the content have value? – does it fit your scope? 2. Technical – is it feasible for you to preserve the content? 3. Access – is it possible to make the content available? DPOE Baseline Modules: Select, version 2.0, Nov 2011
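The "stop when the answer is no" sequence can be sketched as an ordered list of checks that ends at the first failure. This is an illustration only: the dictionary keys and the function name are hypothetical, and answering each question remains a human judgment.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Each check pairs a label (taken from the slide's questions) with a function
# that answers it for one candidate item. The item keys are hypothetical.
Check = Tuple[str, Callable[[Dict[str, bool]], bool]]

CHECKS: List[Check] = [
    ("content: has value", lambda item: item["has_value"]),
    ("content: fits scope", lambda item: item["in_scope"]),
    ("technical: feasible to preserve", lambda item: item["can_preserve"]),
    ("access: can be made available", lambda item: item["can_provide_access"]),
]


def first_failed_check(item: Dict[str, bool], checks: List[Check]) -> Optional[str]:
    """Return the label of the first check answered 'no', or None if all pass."""
    for label, question in checks:
        if not question(item):
            return label  # stop here; later questions are not evaluated
    return None


candidate = {
    "has_value": True,
    "in_scope": True,
    "can_preserve": False,
    "can_provide_access": True,
}
failed = first_failed_check(candidate, CHECKS)
print("select for preservation" if failed is None else f"stop at: {failed}")
```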

Page 10 Selection starts at the beginning … Contact content creators (as needed) –Arrange a convenient time for them –Prepare brief statement of outcomes –Identify list of materials to review with them –Send a reminder before the meeting –Document the results and send them a copy DPOE Baseline Modules: Select, version 2.0, Nov 2011 Prevent later headaches!

Page 11 STORAGE involves… What you store –File Formats –Metadata How you store it –Number of copies –Storage media –Repository selection

Page 12 What are storage needs? Archival Storage manages content as objects (files + metadata = object). Digital content: May include any types – e.g., images, text, sound, video, maps. Requires some identification and description – captured as metadata. DPOE Baseline Modules: Store, version 2.0, Nov 2011

Page 13 Selecting File Formats for Text “…the agency must clearly define the purpose and the requirements for preservation… The appropriate answer will depend on: the mission of the agency; the kind of information to be preserved; the uses to which the objects may be put in the future; the expectations of current and future users; and how far into the future the objects are intended to remain useful.” CENDI Digital Preservation Task Group. “Formats for Digital Preservation: A Review of Alternatives and Issues”, Revised Mar. 1, p ts_WhitePaper_ pdf For text: TIFF, XML, PDF/A

Page 14 Selecting File Formats for Images Sustainability factors: disclosure; adoption; transparency; self-documentation; external dependencies; impact of patents; technical protection mechanisms. Bill LeFurgy, October 12, “Digital Preservation-Friendly File Formats for Scanned Images” al-preservation-file-formats-for-scanned-images/ For images: TIFF, JPEG 2000, PDF/A

Page 15 Which Formats Are Best? Sustainability of Digital Formats Planning for Library of Congress Collections

Page 16 Importance of Metadata How do you know what an object is? – Metadata uniquely identifies digital objects. How do you use content in the future? – Metadata makes digital objects understandable. How do you know an object is authentic? – Metadata allows objects to be traced over time. Metadata enables long-term preservation. DPOE Baseline Modules: Store, version 2.0, Nov 2011

Page 17 Object-level Metadata Preservation Metadata: Content (what), Fixity (unchanged), Provenance (life story), Reference (this thing), Context (relationships); Administrative (manage); Structural (understand, use); Descriptive (find, use). Diagram courtesy DPM Workshops. DPOE Baseline Modules: Store, version 2.0, Nov 2011

Page 18 Object Metadata Characteristics Content: preserve the substance. Fixity: demonstrate content is unchanged. Reference: identify as this content and no other. Provenance: trace to its origin (or to deposit). Context: preserve linkages with other objects. Original source: Preserving Digital Information Report, 1996. DPOE Baseline Modules: Store, version 2.0, Nov 2011
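Fixity is commonly demonstrated by recording a checksum when content is stored and recomputing it later. The sketch below assumes SHA-256 via Python's standard hashlib module; the slides do not name an algorithm, and the file name and function names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute a SHA-256 hex digest, reading in chunks so large files fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: Path, recorded_digest: str) -> bool:
    """Fixity check: True when the file still matches the digest recorded earlier."""
    return sha256_of(path) == recorded_digest


with tempfile.TemporaryDirectory() as workdir:
    sample = Path(workdir) / "sample.tif"  # stand-in file, not real image data
    sample.write_bytes(b"stand-in for image data")
    recorded = sha256_of(sample)  # would be stored as fixity metadata

    intact_before = is_unchanged(sample, recorded)
    sample.write_bytes(b"stand-in for image data, altered")  # simulate damage
    intact_after = is_unchanged(sample, recorded)

print(f"before change: {intact_before}, after change: {intact_after}")
```

Storing the digest with the object's metadata, and rechecking on a schedule, is one way to detect accidental or intentional change.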

Page 19 Number of Copies How many copies are enough for you? Minimum: two (2) copies in two locations. Optimum: six (6) copies. Examples of storage factors: video files are too large to store 6 copies; possible legal restrictions (e.g., storage locations); types of media used for storing the content. DPOE Baseline Modules: Store, version 2.0, Nov 2011
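The number of copies multiplies the storage that must be managed. As a rough illustration only, the arithmetic below reuses the 242 GB extent from the example inventory entry; the function name is hypothetical and no overhead for metadata or file systems is included.

```python
def total_storage_gb(collection_gb: float, copies: int) -> float:
    """Raw storage needed to hold the given number of full copies."""
    if copies < 1:
        raise ValueError("at least one copy is required")
    return collection_gb * copies


collection_gb = 242  # extent of the example railroad photograph collection

minimum_gb = total_storage_gb(collection_gb, 2)  # two copies in two locations
optimum_gb = total_storage_gb(collection_gb, 6)  # six copies

print(f"minimum (2 copies): {minimum_gb} GB")
print(f"optimum (6 copies): {optimum_gb} GB")
```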

Page 20 Storage Media Options Content (objects) are kept on storage media. Options include: online, near-line, offline. Factors for choosing options include: cost (available resources for preservation); quantity (size and number of files); expertise (skills required to manage); partners (achieving geographic distribution); services (outsourcing). DPOE Baseline Modules: Store, version 2.0, Nov 2011

Page 21 Storage Considerations Multiple, geographically distributed copies. Storage partners or hosted services. Services and collaborations can make it easier for organizations to manage content over time. DPOE Baseline Modules: Store, version 2.0, Nov 2011

Page 22 Repository Selection Range of types to consider: general (any content) to special (format-specific); open source to proprietary; unified to distributed; easy to advanced installation and management. Each option has pros and cons. No system is fully compliant with standards. Select the best option for your content – for now. DPOE Baseline Modules: Store, version 2.0, Nov 2011

Page 23 What are we protecting content from? Change and loss – accidental and intentional. Obsolescence – as technology evolves. Inappropriate access – e.g., confidential data. Non-compliance – standards and requirements. Disasters – emergencies of all kinds. DPOE Baseline Modules: Protect, version 2.0, Nov 2011

Page 24 Obsolescence Prevention Terminology Refresh: moving content to newer media. Migrate: moving content to newer formats that can be accessed with current hardware and software. Normalize: migrate to archival formats that meet your specifications. Emulate: attempting to provide the original look and feel of the content with newer software.

Page 25 Readiness Proper planning should allow you to: Prevent – undesirable outcomes; Predict – most likely risks and threats; Detect – errors, problems, damage; Respond – with appropriate measures; Repair – damage or possible loss. DPOE Baseline Modules: Protect, version 2.0, Nov 2011

Page 26 Everyday Protection Know where your content is located – onsite and offsite; online and offline. Know who can have access to it – DP staff, IT staff, others? Manage authentication information – for staff, depositors, users. Track and review usage, then adjust practices – web use, internal use and activities, maintenance. DPOE Baseline Modules: Protect, version 2.0, Nov 2011

Page 27 Emergency Protection Engage in ongoing disaster planning –Establish committee and share information –Develop and maintain documents Identify possible outcomes and prepare –e.g., server goes down, media is damaged DPOE Baseline Modules: Protect, version 2.0, Nov 2011

Page 28 Disaster Planning Resources DPOE Baseline Modules: Protect, version 2.0, Nov 2011

Page 29 Why do we emphasize management? Preserving Digital Information (PDI), 1996 Rapid technological obsolescence; media fragility; legal and organizational environment in flux; complex practical issues; lack of clarity as to procedures and responsibilities; multiplicity of types of content in growing number of formats; massive amounts of content. DPOE Baseline Modules: Manage, version 2.0, Nov 2011

Page 30 Balanced Management An effective approach will address: Organizational requirements and objectives Technological opportunities and change Resources – funding, staff, equipment, etc. DPOE Baseline Modules: Manage, version 2.0, Nov 2011 Kenney and McGovern, “The Five Organizational Stages of Digital Preservation”

Page 31 Organizational Requirements: Planning Preservation Planning (ongoing); Self-assessment (internal process); Audit (external review by peers); Business Continuity; Disaster Planning. DPOE Baseline Modules: Manage, version 2.0, Nov 2011

Page 32 Organizational Objectives: DP Standards Standards emerging since 1996 report: Trusted Digital Repositories, 2002; Open Archival Information Systems (OAIS) Reference Model, 2003 and 2009 revision; Preservation Metadata Implementation Strategies (PREMIS), 2005 plus updates; Trustworthy Repositories Audit and Certification (TRAC), 2011. Common practices are emerging and evolving. DPOE Baseline Modules: Manage, version 2.0, Nov 2011

Page 33 Trusted Digital Repository A TDR should have these characteristics: community standards (OAIS Compliance); commitment (Administrative Responsibility); management (Organizational Viability); resources (Financial Sustainability); infrastructure (Technological … Suitability); protection and control (System Security); documentation (Procedural Accountability). DPOE Baseline Modules: Manage, version 2.0, Nov 2011

Page 34 Community Expectations: Ten Principles (available on the CRL website) 1) Demonstrates organizational fitness (including financial, staffing, and processes) to fulfill its commitment. 2) Acquires and maintains requisite contractual and legal rights and fulfills responsibilities. 3) Has an effective and efficient policy framework. 4) Acquires and ingests digital objects based upon stated criteria that correspond to its commitments and capabilities. DPOE Baseline Modules: Manage, version 2.0, Nov 2011

Page 35 Community Expectations: Ten Principles (available on the CRL website) 5) Maintains/ensures the integrity, authenticity and usability of digital objects it holds over time. 6) Creates and maintains requisite metadata about actions taken on digital objects during preservation, and about relevant contexts before preservation: production, access, usage. DPOE Baseline Modules: Manage, version 2.0, Nov 2011

Page 36 Community Expectations: Ten Principles (available on the CRL website) 7) The repository commits to continuing maintenance of digital objects for identified community/communities. 8) Fulfills requisite dissemination requirements. 9) Has a strategic program for preservation planning and action. 10) Has technical infrastructure adequate to continuing maintenance and security of its digital objects. DPOE Baseline Modules: Manage, version 2.0, Nov 2011

Page 37 Technological Opportunities: Investing in Technology Prioritize: weigh requirements to be met. Assess: define criteria to select appropriate. Sequence: identify steps to meet goals. Fund: decide when to own/join/share. Anticipate: look ahead, be prepared. Evaluate: measure outcomes and success. DPOE Baseline Modules: Manage, version 2.0, Nov 2011

Page 38 Technological Opportunities: Adopting Technologies Characteristics of sound software: written in a well-documented language; usable on a wide variety of platforms; sustained support by creators/developers; modular in design; supports batch processing and workflows; licenses support secondary use. DPOE Baseline Modules: Manage, version 2.0, Nov 2011

Page 39 Resources: Designated Funding Funds set aside for digital preservation Measurable indication of intent to preserve Challenging to do, but important Over time, contributes to track record May not be explicit (e.g., budget line item) … but must be able to make a compelling case DPOE Baseline Modules: Manage, version 2.0, Nov 2011

Page 40 Resources: Sustainable Access Effective and sustainable DP programs address: Value – understand and stress content value. Roles – identify stakeholders and involve them. Incentives – identify “carrots” for preserving. Identify and address costs across life cycle. See: Blue Ribbon Task Force Report on Sustainable Preservation and Access. DPOE Baseline Modules: Provide, version 2.0, Nov 2011

Page 41 What is Long-term Access? Preservation: relies upon proven technologies to preserve digital objects across generations of technology; accumulates metadata over the life cycle to trace and preserve content. Access: relies on cutting edge technologies to provide best and fastest access at a point in time; selects metadata needed to use and understand content. Preservation makes long-term access possible… DPOE Baseline Modules: Provide, version 2.0, Nov 2011

Page 42 What is Long-term Access? Preservation: preservation systems create new versions of digital objects for access to deliver as needs change over time; purpose: ensure long-term access; focus: future users. Access: access systems deliver objects with user-oriented services to make the objects; purpose: provide content to users; focus: current users. Preservation makes long-term access possible… DPOE Baseline Modules: Provide, version 2.0, Nov 2011

Page 43 Understand Users Who are your users? Track and respond to them. User expectations will change over time, and must be monitored. Preservation provides a pathway from one generation of technology to the next. Digital content will need to be packaged in new ways for delivery over time. DPOE Baseline Modules: Provide, version 2.0, Nov 2011

Page 44 Access Policies: Issues Who is allowed to have access to content? Are access policies equal for all content? If not, how are categories managed? How are exceptions/special requests handled? How do users request/get access? What options (if any) do users have? Consider using FAQs as a step to develop policies. DPOE Baseline Modules: Provide, version 2.0, Nov 2011

Page 45 Managing Life Cycle Legal Issues Legal issues include copyright, but copyright is only a portion of legal issues in DP. Legal questions emerge throughout lifecycle … and most of us are not lawyers. Access raises legal issues, but manage from submission (or before) throughout lifecycle. DP requires well-formed, valid documentation – agreements, contracts, licenses, policies, etc. Good legal advice should enable well-formed evidential documentation and transparency. DPOE Baseline Modules: Provide, version 2.0, Nov 2011

Page 46 Create and use Permissions Agreements The Donor grants […] and its agents the right to: digitize all submitted content, and create derivative representations for web access; reproduce and distribute reprints or derivative representations for noncommercial scholarly purposes; augment or create metadata to enhance accessibility and management of content; electronically view, present and display the full digital content to others, including providing open access via the web; electronically store, archive, copy and/or convert the digitized content for preservation and access purposes.

Page 47 DPOE Baseline Principles (1-2) 1. Define the digital content within your scope of responsibility [Identify] 2. Specify the digital content you need/want to preserve [Select] DPOE Baseline Modules: Wrap Up, version 2.0, Nov 2011

Page 48 DPOE Baseline Principles (3-6) 3. Establish requirements for storing files in preservation formats [Store] 4. Determine (and review) your best option for storing your content [Store] 5. Ensure that your content is secure during day-to-day activities [Protect] 6. Work to ensure that your content is prepared for an emergency [Protect] DPOE Baseline Modules: Wrap Up, version 2.0, Nov 2011

Page 49 DPOE Baseline Principles (7-10) 7. Develop (and review) plans for managing content over time [Manage] 8. Use policies to contain and develop your preservation program [Manage] 9. Remember that long-term access is the purpose of preservation [Provide] 10. Make sure the means to deliver content to users remains current [Provide] DPOE Baseline Modules: Wrap Up, version 2.0, Nov 2011 ©iStockphoto.com/CGinspiration

Page 50 Resources “Digital Preservation Management: Implementing Short-Term Strategies for Long-Term Problems” Online Tutorial: Survey of Institutional Readiness: "Planning for Digital Preservation: 20 Questions for Providers of Digital Storage Services," Bernard Reilly, Center for Research Libraries ofDigitalStoragefinal.pdf "Digital Preservation Metadata Standards," Angela Dappert and Marcus Enders, Information Standards Quarterly, Spring 2010, Volume 22, Issue 2 qv22no2.pdf

Page 51 More Resources ICPSR Digital Curation: Trustworthy Repositories Audit & Certification (TRAC): Criteria and Checklist (2007): [NOTE: ISO version of TRAC approved fall 2011] Center for Research Libraries Reports on Digital Archives and Repositories: archives/digital-archive-reports “Digital Preservation Outreach and Education,” Library of Congress.