Archive-It and CINCH tool: Using web harvesting to facilitate born- digital preservation Kathleen Kenney Archive-It Partners Meeting 2012.

Slides:



Advertisements
Similar presentations
Digital Preservation A Matter of Trust. Context * As of March 5, 2011.
Advertisements

DOCUMENT TYPES. Digital Documents Converting documents to an electronic format will preserve those documents, but how would such a process be organized?
OCLC Digital Archive Overview Judith Cobb LIPA Meeting July 2006.
1 What is the Internet Archive We are a Digital Library Mission Statement: Universal access to human knowledge Founded in 1996 by Brewster Kahle in San.
Digital & Preservation Resources Managing the digital collection life cycle.
Looking Ahead Archive-It Partner Meeting November 18, 2014.
The PeDALS approach.  Pete Watters Arizona State Library, project coordinator  Richard Pearce-Moses Clayton State University, Georgia,
Applying Theoretical Archival Principles and Policies to Actual Born Digital Collections LEIGH ROSIN | Digital Archivist | National Library of New Zealand.
PREMIS in Thought: Data Center for LC Digital Holdings Ardys Kozbial, Arwen Hutt, David Minor February 11, 2008.
Sustainable Preservation Services for Archivists through Distributed Custody Caryn Wojcik State of Michigan Records Management Services.
Providing Access to Wisconsin State Government Documents By Abby Swanton, Librarian Dept. of Public Instruction, Reference and Loan Library Minnesota Capitol.
Providing Online Access to the HKUST University Archives: EAD to INNOPAC Sintra Tsang and K.T. Lam The Hong Kong University of Science and Technology 7th.
DCAPE Project Update Richard MarcianoChien-Yi Hou Caryn Wojcik University of University of State of Michigan North Carolina North Carolina Records Management.
Richard MARCIANO Chien-Yi HOU School of Information and Library Science (SILS) Sustainable Archives & Leveraging Technologies Group (SALT) University of.
APSR Forum on Long-Term Repositories National Library of Australia, 31 August – 1 September, Trust and the Web: Can the audit criteria apply to.
THE RUTGERS WORKFLOW MANAGEMENT SYSTEM Mary Beth Weber Cataloging and Metadata Services Rutgers University Libraries August 3, 2007.
Richard MARCIANO Chien-Yi HOU School of Information and Library Science (SILS) Sustainable Archives & Leveraging Technologies Group (SALT) University of.
Persistent Digital Archives and Library System (PeDALS) South Carolina Department of Archives and History.
1 Archiving and Preserving the Web Kristine Hanna Internet Archive April 2006.
Persistent Digital Archives and Library System (PeDALS) A Guide for Wisconsin State Agencies.
1 Archive-It Training University of Maryland July 12, 2007.
Variations On Video project update DLF Fall Forum 2010 Jon Dunn, Indiana University Claire Stewart, Northwestern University November 2, 2010.
NCSU Libraries Ingest Workflow Issues: Metadata North Carolina Geospatial Data Archiving Project Steve Morris North Carolina State University Libraries.
MIRA to TDIL Workflows Alicia Morris October 2, 2014.
1 Archiving and Preserving the Web Dan Avery Kristine Hanna Merrilee Proffitt Internet Archive RLG April 2006.
PeDALS Persistent Digital Archives & Library System Richard Pearce-Moses Deputy Director for Technology & Information Resources Arizona State Library,
Describing Collections So Visitors Can Find Them: A sampling of ways to get materials on-line Amanda Focke, Rice University
Plan for the preservation of digital content and archives in THUL Jiang Airong, Dong Li Tsinghua University Library EMANI Meeting GRENOBLE – 16 Oct, 2006.
Finding a New Way Richard Pearce-Moses Deputy Director for Technology & Information Resources Arizona State Library, Archives and Public Records Using.
Persistent Digital Archives and Library System (PeDALS) SC Department of Archives and History.
Open Access Symposium 2015 Open Access, the Law, and Public Information Mary Alice Baish UNT Dallas College of Law May 19, 2015 National Plan for Access.
1 Archive-It: Archiving and Preserving Born Digital Content NDIIPP June 2009 Molly Bragg Partner Specialist Internet Archive.
Richard MarcianoChien-Yi Hou Caryn Wojcik University of University of State of Michigan North Carolina North Carolina Records Management ServicesSALT DCAPE.
The Real At Risk E-Content: University Web Resources EDUCAUSE Joanne Kaczmarek University of Illinois at Urbana-Champaign Taylor Surface OCLC October 12,
Can we be doing more? Beth Tillinghast University of Hawaii at Manoa October 19, 2011 Archive-It Partner Meeting ACCESS TO OUR ARCHIVED WEBSITE COLLECTIONS.
Preserving Digital Culture: Tools & Strategies for Building Web Archives : Tools and Strategies for Building Web Archives Internet Librarian 2009 Tracy.
0 GPO’s Federal Digital System October 21, 2008 Selene Dalecky – Lisa LaPlant – Blake Edwards U.S. Government Printing Office.
Omnia Mutantur, Nihil Interit: Archiving state government social media in North Carolina and beyond North Carolina Department of Cultural Resources Lawrence.
The FCLA Digital Archive Joint Meeting of CSUL Committees, 2005.
BUILDING ON COMMON GROUND: EXPLORING THE INTERSECTION OF ARCHIVES AND DATA CURATION Lizzy Rolando & Wendy Hagenmaier 6/3/2015IASSIST 2015.
From Your Archive to the Web: Managing the Project The digitization of the Historic Photograph Collection of the Public Library of Brookline Digital Commonwealth/
Promoting Indian Institutional Repositories for Scholarly Communication: DESIDOC/DRDO Initiatives Dr Rajeev Vij & Navin K Soni Institute of Nuclear Medicine.
Digital Accountability: The Line Between Producing and Preserving Digital Government Information Mary Alice Baish Superintendent of Documents Indiana State.
Persistent Digital Archives and Library System (PeDALS)
OAIS: From Requirements to Reality at OCLC FLICC / CENDI Symposium, Dec Pam Kircher Product Manager, Digital Archive OCLC Digital & Preservation.
Selene Dalecky March 20, 2007 FDsys: GPO’s Digital Content System.
The Web-at-Risk NDIIPP Sponsored Project Partners include: California Digital Library – project lead University of North Texas New York University California.
Carcanet Case Study Fran Baker, John Rylands University Library University of Manchester SPRUCE event 19 January 2012.
Preservation Program Digital Preservation Program Digital Preservation Services: Extending tools to meet campus needs Patricia Cruse, Director, Digital.
Digital Archiving at Elsevier Joep Verheggen, ScienceDirect ICSTI Conference, London, 17 May 2004.
DAITSS: Dark Archive in the Sunshine State Priscilla Caplan Florida Center for Library Automation (FCLA)
Chronopolis – MetaArchive Improving and Strengthening Inter-Institutional Preservation.
The Project Three-year grant from the National Historical Publications and Records Commission (NHPRC), April 2010-March 2013 Develop electronic records.
Managing Access at the University of Oregon : a Case Study of Scholars’ Bank by Carol Hixson Head, Metadata and Digital Library Services
SEDAC Long-Term Archive Development Robert R. Downs Socioeconomic Data and Applications Center Center for International Earth Science Information Network.
@ulccwww.ulcc.ac.uk IRMS Cymru October 2015 From EDRMS to digital archive: a wish-list for ways to preserve digital records.
Repository-specific Spoke Scripts Content Repository JSR-170/283 Content Repository for Java Technology API Normalized H&S METS Files METS Import/ExportMETS.
Grant Writing for Digital Projects September 2012 IODE Project Office IODE Project Office Oostende, Belgium Oostende, Belgium Sustainability and.
Digitalcommons.unl.edu Archiving Department Records.
Government Printing Office Future Digital System (FDsys) Special Library Association Open Access and Public Access: New Models for Information Access June.
Discover ScholarSphere A repository service collaboration between the University Libraries and ITS.
13 July 2005 Archives Hub day conference The Paradigm Project: The University of Oxford & The University of Manchester
Digitization Workflows From the Digital Projects Unit University of North Texas Libraries Mark E. Phillips Jeremy D. Moore February 12, 2009.
7th Annual Hong Kong Innovative Users Group Meeting
Building A Repository for Digital Objects
Statewide Digitization and the FCLA Digital Archive
Implementing an Institutional Repository: Part II
Technical Issues in Sustainability
Implementing an Institutional Repository: Part II
The Bentley Digital Media Library
Presentation transcript:

Archive-It and CINCH tool: Using web harvesting to facilitate born- digital preservation Kathleen Kenney Archive-It Partners Meeting 2012

State Library of North Carolina Collection: State government, genealogy, North Caroliniana Support: All North Carolina libraries Services & Materials: for those with visual/physical disabilities

North Carolina General Statute 125 The State Library shall be the official, complete, and permanent depository for all State publications… State Library of North Carolina Collection: State government, genealogy, North Caroliniana Support: All North Carolina libraries Services & Materials: for those with visual/physical disabilities

North Carolina General Statute 125 The State Library shall be the official, complete, and permanent depository for all State publications… Session laws Annual reports Technical reports Newsletters Websites and more …

Challenges Manual collection We’re not getting it all The ingested object may not be “authentic” We have to badger encourage contributors Website archiving A web archive is hard for users to understand We can’t provide the continuity from digitized to digital We have tons o’ data

North Carolina State Government web presence Preservation Storage How can we extract, use, & preserve the publications discovered by our web archiving project, in an automated and preservation-responsible way? Our Use Case

CINCH (Capture INgest CHecksum) is a tool that automates the transfer of online content to a repository, using ingest technologies appropriate for digital preservation. More familiarly, CINCH grabs freely available online content, authenticates it, extracts metadata, and readies it for repository ingest.

Modular Flexible Easy to use Repository-neutral Open source Hosted (North Carolina institutions)

The Web CINCH Checksums De-duplication Virus scan Metadata extraction Audit trail Descriptive & administrative metadata Audit trail Error list What you get File Types gif, jpg, png Microsoft Excel Powerpoint Word mp3 pdf txt, csv

Stop Words Application Memo Contract Poster FAQ Example Draft Worksheet Checklist Chart Instruction Deadline Invitation Schedule Flyer Survey Letter Program Rules Procedure Meeting Minutes Schedule Comment proposal Agenda Calendar Form Press release

Metadata, audit trail Files Problem files

URL original file name

Where can I get it? slnc-dimp.github.com/Cinch/ I want more information! cinch.nclive.org

Feedback welcome! Funding for the CINCH: Capture, INgest, & CHecksum tool made possible through an IMLS Sparks! Ignition grant.