Published by Leslie McDonald. Modified over 9 years ago
1
Page 1 An Introduction to Digital Preservation
Jody L. DeRidder, University of Alabama Libraries, jlderidder@ua.edu
July 16, 2012
Content from the Library of Congress DPOE Baseline Modules, version 2.0, Nov 2011
2
Page 2 DPOE Modules
Identify - what digital content do you have?
Select - what portion of that content will be preserved?
Store - what issues are there for long term storage?
Protect - what steps are needed to protect your digital content?
Manage - what provisions are needed for long-term management?
Provide - what considerations are there for long-term access?
DPOE Baseline Modules: Intro, version 2.0, Nov 2011
3
Page 3 Managing Content Over Time
identify, select, store, protect, manage, provide
DPOE Baseline Modules: Intro, version 2.0, Nov 2011
4
Page 4 Why do we identify content? SCOPE!
Preservation requires an explicit commitment of resources
Effective planning is based on knowing the extent of what will be preserved
Identifying content is a first step to planning for current and future preservation needs
An explicit inventory is the best way to identify content
Image from: http://en.wikipedia.org/wiki/Telescopic_sight
DPOE Baseline Modules: Identify, version 2.0, Nov 2011
5
Page 5 Content Categories
Inventories should include all relevant material:
Institutional records
Special collections
Scholarly content – licensed and open
Research data
Web content
Digitized collections
DPOE Baseline Modules: Identify, version 2.0, Nov 2011
6
Page 6 Example entry:
Category: Special Collections
Title/Description: Railroad Photographs, SE U.S.
Type: images, digitized
Format: TIFF
Extent: 242 GB; 2,250 images
Location: archival server in Room A, Central IT
Coverage Dates: early 1900s
Creation date: January-June 2006
Inventoried: 12/15/2011, by Fred Jones
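The example entry above could also be kept as a structured record, so that an inventory can be sorted, totalled, and exported. The following is only a sketch: the class name `InventoryEntry`, its field names, the helper `average_item_mb`, and the ISO date format are assumptions made for illustration and are not part of the DPOE modules.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class InventoryEntry:
    """One line of a content inventory (field names are hypothetical)."""
    category: str
    title: str
    content_type: str
    file_format: str
    extent_gb: float
    item_count: int
    location: str
    coverage_dates: str
    creation_date: str
    inventoried_on: str
    inventoried_by: str


def average_item_mb(entry: InventoryEntry) -> float:
    """Average size per item in MB, taking 1 GB as 1024 MB."""
    return entry.extent_gb * 1024 / entry.item_count


# Values copied from the example entry on the slide.
entry = InventoryEntry(
    category="Special Collections",
    title="Railroad Photographs, SE U.S.",
    content_type="images, digitized",
    file_format="TIFF",
    extent_gb=242,
    item_count=2250,
    location="archival server in Room A, Central IT",
    coverage_dates="early 1900s",
    creation_date="January-June 2006",
    inventoried_on="2011-12-15",
    inventoried_by="Fred Jones",
)

# Serialize the entry so it can be stored alongside other inventory records.
print(json.dumps(asdict(entry), indent=2))
```

Keeping extent and item count as numbers rather than free text makes it straightforward to total storage needs across an inventory later.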
7
Page 7 Selecting Content for Preservation: Why do it?
Storage may be cheap, but management is not … especially over time
Sustaining the quality of content takes effort
Continually changing discovery and dissemination services will be needed as hardware and software change
… think scale, scope, performance, sustainability
DPOE Baseline Modules: Select, version 2.0, Nov 2011
8
Page 8 Selection Criteria: matching mission to content… Acquisition or collection development policy Departmental criteria (priorities, precedents) Research criteria (interests, significance) Uniqueness (only source) Value (historical, evidential, can’t reproduce) DPOE Baseline Modules: Select, version 2.0, Nov 2011
9
Page 9 Practical Considerations
Stop if or when the answer is ‘no’…
1. Content
– does the content have value?
– does it fit your scope?
2. Technical
– is it feasible for you to preserve the content?
3. Access
– is it possible to make the content available?
DPOE Baseline Modules: Select, version 2.0, Nov 2011
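The "stop if or when the answer is 'no'" rule above can be sketched as an ordered series of checks that ends at the first failure. The function name `select_for_preservation` and its boolean parameters are hypothetical, chosen only to mirror the four questions on the slide.

```python
def select_for_preservation(has_value, fits_scope,
                            technically_feasible, can_provide_access):
    """Walk the questions in slide order and stop at the first 'no'.

    Returns (selected, failed_question). failed_question is None
    when every answer is 'yes'.
    """
    checks = [
        ("content: has value", has_value),
        ("content: fits scope", fits_scope),
        ("technical: feasible to preserve", technically_feasible),
        ("access: can be made available", can_provide_access),
    ]
    for question, answer in checks:
        if not answer:
            # Later questions are not considered once one fails.
            return False, question
    return True, None


# A collection that is valuable and in scope but cannot be preserved locally.
decision = select_for_preservation(True, True, False, True)
print(decision)
```

Recording which question failed, and not only the final decision, gives a documented reason that can be shared with content creators.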
10
Page 10 Selection starts at the beginning …
Contact content creators (as needed)
– Arrange a convenient time for them
– Prepare brief statement of outcomes
– Identify list of materials to review with them
– Send a reminder before the meeting
– Document the results and send them a copy
Prevent later headaches!
DPOE Baseline Modules: Select, version 2.0, Nov 2011
11
Page 11 STORAGE involves…
What you store
– File Formats
– Metadata
How you store it
– Number of copies
– Storage media
– Repository selection
12
Page 12 What are storage needs?
Archival Storage manages content as objects (files + metadata = object)
Digital content (files + metadata = object):
May include any types
– e.g., images, text, sound, video, maps
Requires some identification and description
– Captured as metadata
DPOE Baseline Modules: Store, version 2.0, Nov 2011
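As a rough illustration of "files + metadata = object", the sketch below bundles file paths with descriptive metadata under one identifier. The class `DigitalObject`, its fields, and the sample identifier and file names are assumptions for illustration, not a repository's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """A stored object: one or more files plus the metadata describing them."""
    identifier: str
    files: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)

    def is_complete(self) -> bool:
        # An object needs an identifier, at least one file, and some description.
        return bool(self.identifier) and bool(self.files) and bool(self.metadata)


# Hypothetical identifier and file names.
photo = DigitalObject(
    identifier="railroad-0001",
    files=["railroad-0001.tif"],
    metadata={"title": "Railroad Photographs, SE U.S.", "format": "TIFF"},
)
orphan = DigitalObject(identifier="railroad-0002", files=["railroad-0002.tif"])

print(photo.is_complete(), orphan.is_complete())
```

A file without metadata, like `orphan` here, fails the check, which matches the slide's point that content requires some identification and description.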
13
Page 13 Selecting File Formats for Text
“…the agency must clearly define the purpose and the requirements for preservation… The appropriate answer will depend on:
the mission of the agency
the kind of information to be preserved
the uses to which the objects may be put in the future
the expectations of current and future users, and
how far into the future the objects are intended to remain useful.”
CENDI Digital Preservation Task Group. “Formats for Digital Preservation: A Review of Alternatives and Issues”, Revised Mar. 1, 2007. p.22. http://www.cendi.gov/publications/CENDI_PresFormats_WhitePaper_03092007.pdf
For text: TIFF, XML, PDF/A
14
Page 14 Selecting File Formats for Images
Sustainability factors:
Disclosure
Adoption
Transparency
Self-documentation
External dependencies
Impact of patents
Technical protection mechanisms
Bill LeFurgy, October 12, 2011. “Digital Preservation-Friendly File Formats for Scanned Images” http://blogs.loc.gov/digitalpreservation/2011/10/digital-preservation-file-formats-for-scanned-images/
For images: TIFF, JPEG 2000, PDF/A
15
Page 15 Which Formats Are Best?
Sustainability of Digital Formats: Planning for Library of Congress Collections
http://www.digitalpreservation.gov/formats/
16
Page 16 Importance of Metadata
How do you know what an object is?
– Metadata uniquely identifies digital objects
How do you use content in the future?
– Metadata makes digital objects understandable
How do you know an object is authentic?
– Metadata allows objects to be traced over time
Metadata enables long-term preservation
DPOE Baseline Modules: Store, version 2.0, Nov 2011
17
Page 17 Preservation Metadata
Content (what), Fixity (unchanged), Provenance (life story), Reference (this thing), Context (relationships)
Administrative (manage)
Structural (understand, use)
Descriptive (find, use)
Object-level Metadata
Diagram courtesy DPM Workshops
DPOE Baseline Modules: Store, version 2.0, Nov 2011
18
Page 18 Object Metadata Characteristics
Content: preserve the substance
Fixity: demonstrate content is unchanged
Reference: identify as this content and no other
Provenance: trace to its origin (or to deposit)
Context: preserve linkages with other objects
Original source: Preserving Digital Information Report, 1996
DPOE Baseline Modules: Store, version 2.0, Nov 2011
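Fixity ("demonstrate content is unchanged") is commonly checked by recording a checksum when content is deposited and recomputing it later. The sketch below uses SHA-256 from Python's standard `hashlib`. The choice of algorithm, the helper names `sha256_of` and `fixity_ok`, and the sample file are assumptions for illustration; the slides do not prescribe a method.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path, recorded_checksum):
    """True when the file still matches the checksum recorded at deposit."""
    return sha256_of(path) == recorded_checksum.lower()


# Demonstration on a throwaway file (the file name is hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.tif")
    with open(sample, "wb") as handle:
        handle.write(b"abc")

    recorded = sha256_of(sample)           # stored as fixity metadata at deposit
    unchanged = fixity_ok(sample, recorded)

    with open(sample, "ab") as handle:     # simulate accidental change
        handle.write(b"!")
    change_detected = not fixity_ok(sample, recorded)

print(unchanged, change_detected)
```

Reading in chunks keeps memory use flat for large files such as video, and the recorded checksum belongs with the object's metadata so that later audits can repeat the comparison.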
19
Page 19 Number of Copies
How many copies are enough for you?
Minimum: two (2) copies in two locations
Optimum: six (6) copies
Examples of storage factors:
Video files are too large to store 6 copies
Possible legal restrictions (e.g., storage locations)
Types of media used for storing the content
DPOE Baseline Modules: Store, version 2.0, Nov 2011
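To see how the number of copies drives storage needs, the sketch below multiplies a collection's size by the copy count, using the 242 GB extent from the example inventory entry on Page 6. The function name `total_storage_gb` is hypothetical, and the estimate ignores metadata, derivatives, and filesystem overhead.

```python
def total_storage_gb(collection_gb, copies):
    """Raw storage needed to hold `copies` full copies of a collection."""
    if copies < 1:
        raise ValueError("at least one copy is required")
    return collection_gb * copies


collection_gb = 242  # extent of the example inventory entry

minimum_gb = total_storage_gb(collection_gb, 2)  # two copies in two locations
optimum_gb = total_storage_gb(collection_gb, 6)  # six copies

print(f"minimum (2 copies): {minimum_gb} GB")
print(f"optimum (6 copies): {optimum_gb} GB")
```

Running the same arithmetic for a large video collection shows quickly why six copies may be out of reach for some content.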
20
Page 20 Storage Media Options
Content (objects) are kept on storage media
Options include: online, near-line, offline
Factors for choosing options include
– Cost (available resources for preservation)
– Quantity (size and number of files)
– Expertise (skills required to manage)
– Partners (achieving geographic distribution)
– Services (outsourcing)
DPOE Baseline Modules: Store, version 2.0, Nov 2011
21
Page 21 Storage Considerations
Multiple, geographically distributed copies
Storage Partners or Hosted Services
Services and collaborations can make it easier for organizations to manage content over time
DPOE Baseline Modules: Store, version 2.0, Nov 2011
22
Page 22 Repository Selection
Range of types to consider:
– general (any content) to special (format-specific)
– open source to proprietary
– unified to distributed
– easy to advanced installation and management
Each option has pros and cons
No system is fully compliant with standards
Select best option for your content – for now
DPOE Baseline Modules: Store, version 2.0, Nov 2011
23
Page 23 What are we protecting content from?
Change and loss – accidental and intentional
Obsolescence – as technology evolves
Inappropriate access – e.g., confidential data
Non-compliance – standards and requirements
Disasters – emergencies of all kinds
DPOE Baseline Modules: Protect, version 2.0, Nov 2011
24
Page 24 Obsolescence Prevention Terminology
Refresh: moving content to newer media
Migrate: moving content to newer formats that can be accessed with current hardware and software
Normalize: migrate to archival formats that meet your specifications
Emulate: attempting to provide the original look and feel of the content with newer software
25
Page 25 Readiness
Proper planning should allow you to:
Prevent – undesirable outcomes
Predict – most likely risks and threats
Detect – errors, problems, damage
Respond – with appropriate measures
Repair – damage or possible loss
DPOE Baseline Modules: Protect, version 2.0, Nov 2011
26
Page 26 Everyday Protection
Know where your content is located
– Onsite and offsite; online and offline
Know who can have access to it
– DP staff, IT staff, others?
Manage authentication information
– For staff, depositors, users
Track and review usage then adjust practices
– Web use, internal use and activities, maintenance
DPOE Baseline Modules: Protect, version 2.0, Nov 2011
27
Page 27 Emergency Protection
Engage in ongoing disaster planning
– Establish committee and share information
– Develop and maintain documents
Identify possible outcomes and prepare
– e.g., server goes down, media is damaged
DPOE Baseline Modules: Protect, version 2.0, Nov 2011
28
Page 28 Disaster Planning Resources DPOE Baseline Modules: Protect, version 2.0, Nov 2011
29
Page 29 Why do we emphasize management?
Rapid technological obsolescence
Media fragility
Legal and organizational environment in flux
Complex practical issues
Lack of clarity as to procedures and responsibilities
Multiplicity of types of content in growing number of formats
Massive amounts of content
Preserving Digital Information (PDI), 1996
DPOE Baseline Modules: Manage, version 2.0, Nov 2011
30
Page 30 Balanced Management
An effective approach will address:
Organizational requirements and objectives
Technological opportunities and change
Resources – funding, staff, equipment, etc.
Kenney and McGovern, 2003. “The Five Organizational Stages of Digital Preservation” http://www.dpworkshop.org/
DPOE Baseline Modules: Manage, version 2.0, Nov 2011
31
Page 31 Organizational Requirements: Planning
Preservation Planning (ongoing)
Self-assessment (internal process)
Audit (external review by peers)
Business Continuity
Disaster Planning
DPOE Baseline Modules: Manage, version 2.0, Nov 2011
32
Page 32 Organizational Objectives: DP Standards
Standards emerging since 1996 report:
Trusted Digital Repositories, 2002
Open Archival Information Systems (OAIS) Reference Model, 2003 and 2009 revision
Preservation Metadata Implementation Strategies, 2005 plus updates
Trustworthy Repositories Audit and Certification (TRAC), 2011
Common practices are emerging and evolving
DPOE Baseline Modules: Manage, version 2.0, Nov 2011
33
Page 33 Trusted Digital Repository
A TDR should have these characteristics:
community standards (OAIS Compliance)
commitment (Administrative Responsibility)
management (Organizational Viability)
resources (Financial Sustainability)
infrastructure (Technological … Suitability)
protection and control (System Security)
documentation (Procedural Accountability)
DPOE Baseline Modules: Manage, version 2.0, Nov 2011
34
Page 34 Community Expectations: Ten Principles
1) Demonstrates organizational fitness (including financial, staffing, and processes) to fulfill its commitment.
2) Acquires and maintains requisite contractual and legal rights and fulfills responsibilities.
3) Has an effective and efficient policy framework.
4) Acquires and ingests digital objects based upon stated criteria that correspond to its commitments and capabilities.
Available on the CRL website
DPOE Baseline Modules: Manage, version 2.0, Nov 2011
35
Page 35 Community Expectations: Ten Principles
5) Maintains/ensures the integrity, authenticity and usability of digital objects it holds over time.
6) Creates and maintains requisite metadata about:
– actions taken on digital objects during preservation
– relevant contexts before preservation: production, access, usage
Available on the CRL website
DPOE Baseline Modules: Manage, version 2.0, Nov 2011
36
Page 36 Community Expectations: Ten Principles
7) The repository commits to continuing maintenance of digital objects for identified community/communities.
8) Fulfills requisite dissemination requirements.
9) Has a strategic program for preservation planning and action.
10) Has technical infrastructure adequate to continuing maintenance and security of its digital objects.
Available on the CRL website
DPOE Baseline Modules: Manage, version 2.0, Nov 2011
37
Page 37 Technological Opportunities: Investing in Technology
Prioritize: weigh requirements to be met
Assess: define criteria to select appropriate
Sequence: identify steps to meet goals
Fund: decide when to own/join/share
Anticipate: look ahead, be prepared
Evaluate: measure outcomes and success
DPOE Baseline Modules: Manage, version 2.0, Nov 2011
38
Page 38 Technological Opportunities: Adopting Technologies Characteristics of sound software: written in a well-documented language usable on a wide variety of platforms sustained support by creators/developers modular in design supports batch processing and workflows licenses support secondary use DPOE Baseline Modules: Manage, version 2.0, Nov 2011
39
Page 39 Resources: Designated Funding Funds set aside for digital preservation Measurable indication of intent to preserve Challenging to do, but important Over time, contributes to track record May not be explicit (e.g., budget line item) … but must be able to make a compelling case DPOE Baseline Modules: Manage, version 2.0, Nov 2011
40
Page 40 Resources: Sustainable Access
Effective and sustainable DP programs address:
Value – understand and stress content value
Roles – identify stakeholders and involve them
Incentives – identify “carrots” for preserving
Identify and address costs across life cycle
See: Report of the Blue Ribbon Task Force on Sustainable Digital Preservation and Access
DPOE Baseline Modules: Provide, version 2.0, Nov 2011
41
Page 41 What is Long-term Access?
Preservation
– relies upon proven technologies to preserve digital objects across generations of technology
– accumulates metadata over the life cycle to trace and preserve content
Access
– relies on cutting edge technologies to provide best and fastest access at a point in time
– selects metadata needed to use and understand content
Preservation makes long-term access possible…
DPOE Baseline Modules: Provide, version 2.0, Nov 2011
42
Page 42 What is Long-term Access?
Preservation
– preservation systems create new versions of digital objects for access to deliver as needs change over time
– purpose: ensure long-term access
– focus: future users
Access
– access systems deliver objects with user-oriented services to make the objects
– purpose: provide content to users
– focus: current users
Preservation makes long-term access possible…
DPOE Baseline Modules: Provide, version 2.0, Nov 2011
43
Page 43 Understand Users Who are your users? Track and respond to them. User expectations will change over time, and must be monitored. Preservation provides pathway from one generation of technology to the next Digital content will need to be packaged in new ways for delivery over time. DPOE Baseline Modules: Provide, version 2.0, Nov 2011
44
Page 44 Access Policies: Issues
Who is allowed to have access to content?
Are access policies equal for all content?
If not, how are categories managed?
How are exceptions/special requests handled?
How do users request/get access?
What options (if any) do users have?
Consider using FAQs as a step to develop policies
DPOE Baseline Modules: Provide, version 2.0, Nov 2011
45
Page 45 Managing Life Cycle Legal Issues
Legal issues include copyright, but copyright is only a portion of legal issues in DP
Legal questions emerge throughout lifecycle … and most of us are not lawyers
Access raises legal issues, but manage from submission (or before) throughout lifecycle
DP requires well-formed, valid documentation
– agreements, contracts, licenses, policies, etc.
Good legal advice should enable well-formed evidential documentation and transparency
DPOE Baseline Modules: Provide, version 2.0, Nov 2011
46
Page 46 Create and use Permissions Agreements
http://www.lib.ua.edu/wiki/digcoll/index.php/Digital_Services_Permission_Agreement
The Donor grants […] and its agents the right to:
Digitize all submitted content, and create derivative representations for web access
Reproduce and distribute reprints or derivative representations for noncommercial scholarly purposes
Augment or create metadata to enhance accessibility and management of content
Electronically view, present and display the full digital content to others, including providing open access via the web
Electronically store, archive, copy and/or convert the digitized content for preservation and access purposes
47
Page 47 DPOE Baseline Principles (1-2)
1. Define the digital content within your scope of responsibility [Identify]
2. Specify the digital content you need/want to preserve [Select]
DPOE Baseline Modules: Wrap Up, version 2.0, Nov 2011
48
Page 48 DPOE Baseline Principles (3-6)
3. Establish requirements for storing files in preservation formats [Store]
4. Determine (and review) your best option for storing your content [Store]
5. Ensure that your content is secure during day-to-day activities [Protect]
6. Work to ensure that your content is prepared for an emergency [Protect]
DPOE Baseline Modules: Wrap Up, version 2.0, Nov 2011
49
Page 49 DPOE Baseline Principles (7-10)
7. Develop (and review) plans for managing content over time [Manage]
8. Use policies to contain and develop your preservation program [Manage]
9. Remember that long-term access is the purpose of preservation [Provide]
10. Make sure the means to deliver content to users remains current [Provide]
DPOE Baseline Modules: Wrap Up, version 2.0, Nov 2011
©iStockphoto.com/CGinspiration
50
Page 50 Resources
“Digital Preservation Management: Implementing Short-Term Strategies for Long-Term Problems” Online Tutorial: http://www.dpworkshop.org/dpm-eng/eng_index.html
Survey of Institutional Readiness: http://www.dpworkshop.org/
“Planning for Digital Preservation: 20 Questions for Providers of Digital Storage Services,” Bernard Reilly, Center for Research Libraries http://www.nedcc.org/resources/digital/downloads/QuestionstoAskProvidersofDigitalStoragefinal.pdf
“Digital Preservation Metadata Standards,” Angela Dappert and Marcus Enders, Information Standards Quarterly, Spring 2010, Volume 22, Issue 2 http://www.loc.gov/standards/premis/FE_Dappert_Enders_MetadataStds_isqv22no2.pdf
51
Page 51 More Resources
ICPSR Digital Curation: http://www.icpsr.umich.edu/icpsrweb/ICPSR/curation/
Trustworthy Repositories Audit & Certification (TRAC): Criteria and Checklist (2007): http://www.crl.edu/sites/default/files/attachments/pages/trac_0.pdf [NOTE: ISO 16363 version of TRAC approved fall 2011]
Center for Research Libraries Reports on Digital Archives and Repositories: http://www.crl.edu/archiving-preservation/digital-archives/digital-archive-reports
“Digital Preservation Outreach and Education,” Library of Congress. http://www.digitalpreservation.gov/education/