

Curation Practices for Born-Digital and Digitized Newspaper Collections
Katherine Skinner, Executive Director, Educopia Institute
Martin Halbert, Dean of Libraries, University of North Texas
Tyler Walters, Dean of Libraries, Virginia Tech
CNI 2012 Spring Membership Meeting, Baltimore, MD, April 3, 2012

• Chronicles Project background
• State of the Field report
• Early Findings
Skinner, Halbert, and Walters 2012

One day, through the primeval wood,
A calf walked home, as good calves should;
But made a trail all bent askew,
A crooked trail, as all calves do.

Since then three hundred years have fled,
And, I infer, the calf is dead.
But still he left behind his trail,
And thereby hangs my moral tale.

The trail was taken up next day
By a lone dog that passed that way;
And then a wise bellwether sheep
Pursued the trail o’er vale and steep,
And drew the flock behind him, too,
As good bellwethers always do.

And from that day, o’er hill and glade,
Through those old woods a path was made,
And many men wound in and out,
And dodged and turned and bent about,
And uttered words of righteous wrath
Because ’twas such a crooked path;
But still they followed, do not laugh,
The first migrations of that calf.

This forest path became a lane,
That bent, and turned, and turned again.
This crooked lane became a road,
Where many a poor horse with his load,
Toiled on beneath the burning sun,
And traveled some three miles in one.
And thus a century and a half,
They trod the footsteps of that calf.

The years passed on in swiftness fleet,
The road became a village street;
And this, before men were aware,
A city’s crowded thoroughfare;
And soon the central street was this,
Of a renowned metropolis;
And men two centuries and a half,
Trod the footsteps of that calf.

Each day a hundred thousand men were led
By one calf near three centuries dead.
They follow still his crooked way,
And lose one hundred years a day,
For thus such reverence is lent
To well-established precedent.

by Sam Walter Foss

Educopia Institute-led partnership, comprising the following:
• Preservation groups
  ▪ MetaArchive (LOCKSS)
  ▪ Chronopolis (iRODS)
  ▪ University of North Texas (CODA)
• Content curators
  ▪ Penn State
  ▪ Virginia Tech
  ▪ University of Utah
  ▪ Georgia Tech
  ▪ Boston College
  ▪ Clemson University
  ▪ University of Kentucky
Funded by: [funder logo not captured in transcript]

To study, document, and model the use of data preparation practices and distributed digital preservation frameworks to collaboratively preserve digitized and born-digital newspaper collections.

• MetaArchive
  ▪ Founded 2004, 50+ members in 3 countries
  ▪ Multi-node, wide distribution of content
• Chronopolis
  ▪ 3-node system (SDSC, NCAR, UMIACS)
• CODA
  ▪ Developing multi-node framework based on a micro-services approach

[Figure panels: Born Digital; Digitized]

• How can curators effectively and efficiently prepare their existing digitized and born-digital newspaper collections for preservation?
• How can curators ingest preservation-ready newspaper content into existing DDP solutions?
• What are the strengths and challenges of three leading DDP solutions when used to preserve digital newspaper content?

• Guidelines to Digital Preservation Readiness
• Interoperability Tools
• Comparative Analysis of DDP Frameworks

• Early findings based on the following surveys:
  ▪ 2008 ETD Preservation Survey (VT-NDLTD)
  ▪ 2009 Digital Preservation Needs Survey (NHPRC)
  ▪ 2011 Digital Preservation SPEC Kit 325 (ARL)
  ▪ Chronicles Survey (8 academic libraries)

ETD and NHPRC surveys
• Readiness is low. Desire is high.
  ▪ >70% had NO preservation plan.
  ▪ >25% were not even backing up.
  ▪ Almost none engaged in active preservation.

[Figure: survey results]

• SPEC Kit #325: Digital Preservation (ARL)
• Types of content
  ▪ ~100% ETDs, images, special collections
• 80% preserve some now; all but 4% plan to.
• Top barriers?
  ▪ Lack of experienced staff
  ▪ Lack of funding
  ▪ Institutional policies and strategies

• Chronicles Project Survey
• Type
  ▪ NDNP: 18; non-NDNP: 459; born digital: 19
• Image formats
  ▪ TIFF, JP2, PDF, HTML, TXT, XML
• Metadata formats
  ▪ METS/ALTO, MIX, MODS, PREMIS
• OCR formats
  ▪ METS, ALTO, PDF, ABBYY, XML, PRIME OCR.pro

• Chronicles Project Survey (cont.)
• Object identifier schemes
  ▪ Fedora PIDs, Handles, Veridian and CONTENTdm custom URLs, ARKs
  ▪ All but two are internal to the repository system
• Validation
  ▪ Half use JHOVE for at least some content
• Versioning
  ▪ Only one institution

• Chronicles Project Survey – Findings (cont.)
• Access and storage systems
  ▪ Access: local, hosted, open, and proprietary
  ▪ e.g., Fedora, DSpace, Olive, Veridian, CODA, web server
  ▪ Masters: e.g., SAN, tape, hard drive
• Preferred ingest mechanisms
  ▪ Secure FTP or “Frisbee-net”

• VA Tech - starting with the essential
• Well entrenched in the calf-path
• “Diverse and un-normalized legacy” collections
• The “born-digital dilemma” institution
• Extensive data wrangling experience
• Hosting e-news since 1997
  ▪ HTML 4.0, PDF 1.1
  ▪ Metadata?
• Outside NDNP recommendations


• What strategies help to improve and optimize newspaper digitization workflows?
• Avoiding the calf-path requires a willingness to re-examine workflow and impose discipline
• Normalization is required for all incoming content, including newspapers
• Digitizing and preserving to current standards, using local flavors
• Builds off NDNP foundations

• Relatively large-scale and streamlined state digitization project (2.5M files, 186K serials/titles, now used 275K times/month)
• Digitizes content from 220 libraries and museums across Texas
• Strong ties to state educational groups and learning standards
• Much of the portal was created through NDNP funding streams
• Part of the much larger UNT Digital Library
• Micro-services modular system architecture based on open standards

• Back-up vs. preservation
• Adoption of existing standards is low
  ▪ e.g., OCR, metadata
• Lack of standards
  ▪ e.g., file structures, naming conventions, and object identifier schemes
• Diverse array of expectations for access and recovery
  ▪ Very institution-specific
• Versioning processes will be necessary
  ▪ e.g., for growing, changing, and/or remediated projects

Martin Halbert, Katherine Skinner, Tyler Walters