Digital Preservation Andrea Goethals Wendy Gogel From Harvard University Library NELA 18 October 2010.

Presentation transcript:

Digital Preservation Andrea Goethals Wendy Gogel From Harvard University Library NELA 18 October 2010

Digital Preservation 1. Why digital preservation? 2. What’s the problem? 3. What’s being done? 4. What can you do? 5. Questions?

1. Why digital preservation?

Everything is digital  1957 first digital image  1969 ARPAnet  1971 first email sent  1972 first video game  1998 first digital theatrical release

Digital content may be… …historically significant.  “… after 12:00 noon January 20, 2001, the National Archives and Records Administration ("NARA") shall have sole legal custody of all Clinton-Gore Administration electronic mail records that are governed by the Presidential Records Act ("PRA"), 44 U.S.C. 2201,…” Memorandum of Understanding between NARA and The Executive Office of the President, dated January 11, 2001 accessed Oct at:

Digital content may be… …your favorite song, or your favorite movie.

Digital content may be… …the only version. Harvard Magazine May/June 2009

Digital content may be… …a work of art. Doug Aitken. (American, born 1968). sleepwalkers Six-channel video (color, sound), seven monitors, 12:57 min. The Dunn Bequest. © 2008 Doug Aitken. Photo: Fred Charles.

Digital content may be… …important to scholarship.

Who cares?  Cultural resource institutions: museums and historical societies (MoMA’s Matters in Media Art); libraries, archives, special collections; academic institutions  Governments: National Library of New Zealand’s NDHA; NARA’s ERA  The entertainment industry: AFI Digital Preservation Project

Who cares?  You and me, personally!

2. What’s the problem?

Digital content is…  Transient  Fragile  Hidden 2400 B.C.E C.E.

Digital content is transient  The average lifespan of a web site is between 44 and 100 days  Captured April 8, 2009; visited October 13, 2010

Digital content is fragile  Digital things are amazingly easy to destroy: bad people, software or hardware failure, human mistakes  The slip of a finger or an unnoticed consequence of a change can happen easily and is potentially catastrophic  “Help! Accidental deletion. I accidentally deleted 62 images… can you please recover them from backups?”

Digital content is hidden  Loss is not always apparent  Is either of these corrupt?

Digital content is hidden  Loss is not always apparent  Both are corrupt! Use helps, but it’s not enough

Even if it’s safe, is it usable?  It’s not enough to preserve the bits if the format of the bits is obsolete! WordStar? AppleWorks? Excel 1.0?  To use digital content, we depend on software that can understand the format…

The importance of format  Understanding formats is fundamental to preservation ffd8ffe000104a ffed0fb050686f74 6f73686f e d 03e90a e e666f f40240ffeeffee fc d f d03ed0a f6c f 6e a

The importance of format  Understanding formats is fundamental to preservation ffd8ffe000104a ffed0fb050686f74 6f73686f e d 03e90a e e666f f40240ffeeffee fc d f d03ed0a f6c f 6e a SOI APP0 JFIF 1.2 APP13 IPTC APP2 ICC DQT SOF0 183x512 DRI DHT SOS ECS0 RST0 ECS1 RST1 ECS2...
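
The slide's point is that the same bytes only become meaningful once the format's structure (SOI, APP0, DQT, ...) is known. A minimal sketch of that idea, assuming a baseline JPEG whose header segments each carry a two-byte big-endian length; the function name and marker table are illustrative and are not taken from the presentation:

```python
import struct

# Names for a few common JPEG markers (the byte that follows 0xFF).
MARKER_NAMES = {
    0xD8: "SOI", 0xE0: "APP0", 0xE2: "APP2", 0xED: "APP13",
    0xDB: "DQT", 0xC0: "SOF0", 0xDD: "DRI", 0xC4: "DHT",
    0xDA: "SOS", 0xD9: "EOI",
}


def list_jpeg_segments(data: bytes) -> list[str]:
    """Return marker names from the start of a JPEG up to and including SOS.

    Simplification (assumption): optional 0xFF fill bytes between segments
    are not handled, and the entropy-coded data after SOS is not parsed.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    names = ["SOI"]
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        code = data[pos + 1]
        names.append(MARKER_NAMES.get(code, f"0x{code:02X}"))
        if code == 0xDA:  # start of scan: compressed image data follows
            break
        # The length field counts itself (2 bytes) plus the payload.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        pos += 2 + length
    return names
```

Calling `list_jpeg_segments(open("photo.jpg", "rb").read())` on a typical file (the file name is hypothetical) would yield a list in the spirit of the annotated slide, such as SOI, APP0, DQT and so on. Without knowledge of the format, the same file is just the hex string.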


Using information content  [Diagram: an analog book allows unmediated use (information content, language, symbols, HW (paper)); a digital book requires technology-mediated use (information content, formats, bits, SW, HW)]

Formats are key to determining usability  [Diagram labels: information content, bits, formats, SW, HW; groupings: digital content, supporting technologies]  Formats are the bridge between the content we want to preserve and supporting technologies

Dependence on fleeting technology  We are dependent on technology to interpret digital content...  Technologies must understand the format of the content  Technologies age and disappear!

3. What’s being done?

Primary goals of digital preservation 1. Keep the bits safe 2. Keep the bits useful to people

1. Keep the bits safe  Infrastructure, processes, policies and professional staff to counter risks High quality storage Redundancy (multiple copies, multiple locations) Media refreshing (replacing) Integrity monitoring (check for corruption) Security and access management Content recovery
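
Integrity monitoring is commonly done by recording a checksum for every file and recomputing it later. A minimal sketch, assuming SHA-256 and an in-memory manifest; the function names are hypothetical, and a real repository would also store the manifest safely and schedule the checks:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def check_fixity(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the files under root with an earlier manifest."""
    current = build_manifest(root)
    return {
        "missing": sorted(set(manifest) - set(current)),
        "changed": sorted(
            name for name in manifest
            if name in current and current[name] != manifest[name]
        ),
        "new": sorted(set(current) - set(manifest)),
    }
```

A file listed under "changed" that nobody edited is a candidate for silent corruption, and is the cue to restore it from one of the redundant copies.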

2. Keep the bits useful  Provide ways for people to find it  Provide ways to manage it  Keep records of history and significant events  Know what formats you have  Make sure there’s technology to support the formats (“technology watch”)  And if there isn’t, intervene so that the content can be used again (migration, emulation, creation of viewing software)
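
"Know what formats you have" usually starts with identifying files by their leading bytes rather than by file extension. A minimal sketch of signature-based identification, assuming a tiny hand-picked signature table; registries and tools such as PRONOM, JHOVE and FITS cover far more formats and validate much more deeply:

```python
from collections import Counter

# A few well-known leading byte signatures ("magic numbers").
SIGNATURES = [
    (b"\xff\xd8\xff", "JPEG image"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"%PDF-", "PDF document"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"II*\x00", "TIFF image (little-endian)"),
    (b"MM\x00*", "TIFF image (big-endian)"),
]


def identify_format(header: bytes) -> str:
    """Guess a format from the first bytes of a file."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def tally_formats(headers) -> Counter:
    """Count how many items of each identified format are present."""
    return Counter(identify_format(h) for h in headers)
```

A large "unknown" count in the tally is itself useful information: those are the files most at risk of becoming unusable without anyone noticing.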

Degrees of preservation  “Passive preservation” aka “bit-level preservation”: better understood & less costly; will not ensure long-term usability, ensures current and near-term usability  “Active preservation” aka “full preservation” aka “logical preservation”: more complex, challenging & costly; requires more expertise but better ensures very long-term usability; requires passive preservation

Degrees of preservation “passive preservation” aka “bit-level preservation” “active preservation” aka “full preservation” aka “logical preservation”  Store  Secure  Maintain  Prevent  Migrate  Re-engineer software  Emulate  Digital archaeology  Monitor  Restore  Add value

Strategic thinking  The least expensive and most effective preservation measure is to think about the future when digital content is created!  Content production matters!  It makes good sense to try to influence the content creation process

Preservation lifecycle  Create or acquire digital content  Ingest into a preservation repository Continuous cycle of:  Monitoring  Planning  Intervention Subject to collection management decisions  Transfer to next generation of the repository or to a different repository A series of hand-offs over time

Ongoing commitment  Requires a continual, proactive program: you can’t just stop and start, and time frames are MUCH shorter than for preservation of physical collections  Requires ongoing investment in both technology and staffing

Can’t do it alone  More than any other library activity, preservation responsibility must be shared across institutions  Even collectively we do not have adequate resources or understanding

Preservation community efforts  Collaborative organizations (NDSA, IIPC, OPF)  Collaborative projects (AIHT, TIPR)  Standards and metadata: technical metadata for still images, audio, documents; METS (package for metadata and digital objects); PREMIS (preservation metadata); “preservable formats” (PDF/A); repository certification  Infrastructure: format registries (UDFR, PRONOM); repository software (Fedora, DAITSS, LOCKSS, etc.); tools (JHOVE, FITS, etc.)

4. What can you do?

First steps  Inventory your content  Identify where it is all kept: web locations, computer hard drive, removable media (CDs, etc.)  Select: decide what is worth keeping. Given a choice, keep the highest-quality version. Is someone else already preserving it? Consider deleting content that's not needed.
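
For content on a hard drive, the inventory step can be partly automated. A minimal sketch, assuming the content sits under a single folder; the column names and function names are hypothetical choices for illustration:

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

FIELDS = ["path", "extension", "bytes", "modified"]


def inventory(root: Path) -> list[dict]:
    """List every file under root with its size and last-modified time."""
    rows = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            info = path.stat()
            modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
            rows.append({
                "path": path.relative_to(root).as_posix(),
                "extension": path.suffix.lower(),
                "bytes": info.st_size,
                "modified": modified.isoformat(),
            })
    return rows


def write_inventory_csv(rows: list[dict], out_path: Path) -> None:
    """Save the inventory as a CSV file that any spreadsheet can open."""
    with out_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

Sorting the resulting spreadsheet by extension or size gives a quick view of what is there, which helps with the selection step. Web locations and removable media still need to be listed by hand.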

Second steps  Organize your digital content Create a logical directory/folder structure for the content Give descriptive names to the files If possible tag or embed with descriptions Catalog your content  Draft a summary description Keep your inventory and a summary description of the content and how you have it organized in a secure location

Third steps  Make multiple copies of your content Use formats that are amenable to long-term survival Use open formats when possible  Store on durable media  Store in multiple locations Preferably in different disaster zones.  Use it! Periodically check that you can access the content  Migrate to new media over time.

Fourth steps  Keep informed: LC's website; research, training and outreach (DCC, DPC, JISC, IIPC, NEDCC); professional organizations (ALA, SAA); conference proceedings (iPRES, IS&T Archiving, DLF)  How to preserve your own digital materials (LC):  Basic characteristics of digital preservation repositories (CRL website): archives/metrics-assessing-and-certifying/core-re

Image Credits  First digital image: Pong: First theatrically released: iPod ad: Avatar: Cuneiform 2400 BC: Book of Hours in French and Latin: Server: Sleepwalkers at MOMA: PRS data sets: Corrupt images: New Yorker Cover, June 8 and 15, 2009 and October 18, 2010

5. Questions?