Katherine Skinner, Emory University
Gail McMillan, Virginia Tech
NDIIPP Annual Partners Meeting
June 24, 2009

Central aim: to better understand the terrain of the emergent field of digital curation.
How emergent is it?
What trends are beginning to emerge within it?
MetaArchive 2009

ETD: December 2007-April 2008
Universities and Colleges
96 Respondents
Five Listservs:
▪ Association of Research Libraries, Association of Southeastern Research Libraries, Council of Graduate Schools, Digital Library Federation, and Electronic Theses and Dissertations

Two surveys, 158 participants
Cultural Memory: March 2009
Archives, Museums, Libraries, Historical Societies, Government Agencies
62 Respondents
Three Listservs:
▪ H-Museum, A&A-L (Society of American Archivists), and ERECS-L (Electronic Records Managers)

Who is collecting digital materials, what are they collecting, and how are they storing these materials?
Who seeks to preserve their digital collections and how do they want to preserve them?
What are the biggest barriers to preservation?
What are the most desired offerings in preservation?

Cultural Memory: 98.4% are collecting
Range: 1 GB-20 TB, average 2 TB
Average Growth: 540 GB/year
Formats/Genres include: text (83%), video (76%), audio (75%), (47%), databases (48%), websites (41%), and GIS material (36%) + scads more
Repository structures include: home-grown (65%), CONTENTdm (17%), Fedora (9%), DSpace (7%), Access/Excel (6%), plus SRB, FileMaker, and 10 others

ETDs: 80% accept ETDs; 40% only accept ETDs
Range: GB, average 41 GB
Average Growth: 4.5 GB/year
Formats/Genres include: images (92%), applications (89%), audio (79%), text (64%), video (52%), and other (15%)
Repository structures include: DSpace (31%), ETD-db (15%), Fedora (5%), Eprints (2%), as well as locally developed solutions (34%) and vendor-based solutions: bepress (6%), DigiTool (6%), ProQuest (6%), and CONTENTdm (6%)

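The average sizes and growth rates reported for the two surveys can be turned into a rough storage projection. The sketch below assumes growth stays linear at the reported yearly rate, which is an assumption for illustration and not a survey finding. The function name `project_size_gb` is hypothetical, and 1 TB is taken as 1000 GB.

```python
def project_size_gb(current_gb: float, growth_gb_per_year: float, years: int) -> float:
    """Linear projection: current size plus a constant yearly increment."""
    if years < 0:
        raise ValueError("years must be non-negative")
    return current_gb + growth_gb_per_year * years


# Survey averages from the slides; 1 TB = 1000 GB is an assumption.
CULTURAL_MEMORY = {"current_gb": 2000.0, "growth_gb_per_year": 540.0}
ETD = {"current_gb": 41.0, "growth_gb_per_year": 4.5}

for label, avg in (("Cultural Memory", CULTURAL_MEMORY), ("ETD", ETD)):
    for years in (1, 5):
        size = project_size_gb(avg["current_gb"], avg["growth_gb_per_year"], years)
        print(f"{label}: ~{size:.1f} GB after {years} year(s)")
```

Under these assumptions the average Cultural Memory collection more than doubles within five years, while the average ETD collection grows by roughly half.
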
Formats (ETD & Cultural Memory)
ETD: .ppt, .qt, .tif, .xml, .wav, .png, .pdf, .mpg, .mp3, .aif, .avi, .doc, .gif, .html, .jpg, .mov, .dwt, .xls, .csv, .zip, .mix, .snd, .tex, .txt, .midi, .exe, .jar
Cultural Memory: Textual documents, Databases, Still images, Video, Audio, GIS, Websites, Computer games, Science data, Publications, Presentation materials
JP2, .ps

Platforms (ETD & Cultural Mem.)
ETD-db, Eprints, Fedora, DSpace, Archimede, bepress/Digital Commons, CONTENTdm, Cybertesis, Dias, DigiTool, DLXS, ProQuest
MS Access, Excel, SRB, ResCarta, Augias-data, Cumulus, CollectiveAccess, Windows Explorer, iRODS, Filesystem, ArchivalWare, FileMaker Pro, iTunes, Documentum, Fez, Millennium Online Catalog, OhioLINK, Oracle, Sesame, VTLS Vital, PastPerfect, ANCS, MINISIS, CDs/DVDs, In House

Structure (ETD & Cultural Mem)
Cultural Memory
  subject (33%)
  collection (35%)
  format (21%)
  date (10%)
  department (10%)
  creator (8%)
  funder (4%)
  *some Cultural Memory respondents selected multiple ways
ETD
  All in one directory (28%)
  Date (26%)
  Departments, Authors, or Disciplines (26%)
  Access-level labels (7%)
  Don’t know (13%)

Variation is the theme
  Infrastructures
  Data Structures
Presents preservation challenges, to be sure!

Who seeks preservation and how do they want to preserve?
Readiness is low
  Most institutions are not even backing up
  Dearth of preservation plans and policies
Desire is high
  Want training
  Want independent assessments
  Want to manage their own digital preservation solutions

Cultural Memory:
  Only 50% back up 100% of their digital holdings
  Only 19% report having in-house “expert” knowledge in digital preservation
  79% have NO preservation plan
  55% have NO written policies
ETDs:
  95% are engaging SOME backup strategies
  72% have NO preservation plan

Cultural Memory
  83% will develop policies in the next 3 years
  90% cited interest in participating in a community-based digital preservation solution
  Only 30% cited interest in third-party vendor offerings, even at a reasonable cost
ETDs
  70% have experience with/knowledge of LOCKSS
  92% cited interest in participating in an NDLTD-supported LOCKSS-based ETD archive

CMOs are engaging actively with the idea of digital preservation
High level of knowledge about community-based approaches to digital preservation
Outsourcing is not the top choice of institutions as they pursue digital preservation; they would rather participate in it themselves

What are the biggest barriers to preservation?
Growth of digital collection
Backups. NOT
File formats
Platforms
Structures. NOT
Lack of documented policies, procedures

What are the threats identified by our survey respondents?

What are the most desired preservation offerings?
1. Training provided by professional organizations
2. Independent study/assessment
3. Local courses in computer or digital technology
4. Hire staff with digital knowledge experience
5. Hire consultants
6. Training provided by vendors

The MetaArchive Cooperative
The most effective preservation strategies incorporate
  replication of content
  geographically distributed secure locations
  private network of trusted partners

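The replication strategy on this slide can be illustrated with a small fixity check: compute a checksum for each copy held at a partner site and flag any copy that disagrees with the most common checksum. This is a simplified sketch and not the LOCKSS protocol or the MetaArchive implementation. The function names and site labels are hypothetical.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 checksum of a copy as a hex string."""
    return hashlib.sha256(data).hexdigest()


def find_damaged_replicas(replicas: Dict[str, bytes]) -> List[str]:
    """Return the sites whose copy differs from the most common checksum.

    With an even split there is no true majority; this sketch then
    treats the checksum seen first as the reference.
    """
    if not replicas:
        return []
    digests = {site: sha256_hex(content) for site, content in replicas.items()}
    reference, _count = Counter(digests.values()).most_common(1)[0]
    return sorted(site for site, digest in digests.items() if digest != reference)


# Three hypothetical partner sites holding the same object.
copies = {
    "site_a": b"thesis contents",
    "site_b": b"thesis contents",
    "site_c": b"thesis c0ntents",  # simulated damage at one site
}
print(find_damaged_replicas(copies))
```

A single copy gives nothing to compare against. Several independent copies allow a damaged one to be detected and then repaired from the others.
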
Desirable Preservation Service
1. Cooperative preservation network
2. Standards
3. Training: Best practices, inc. technical
4. Model policies
5. Conversion or migration services
6. Preservation services provided by third party vendors
7. Access services

Conclusion
Calf-Path Syndrome
Idiosyncratic, ad-hoc data storage structures
Increasingly difficult remediation
MASH: triage
Survey documented narratives
Outreach
Offer help to those adrift in cyberspace
Through collaboration there are cost-effective and strong strategies that can protect cultural memories

Katherine Skinner
Gail McMillan