© 2008, IDEALS This work is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. To view a copy of this license, visit


Illinois Digital Environment for Access to Learning and Scholarship. Day-to-Day Digital Preservation: A Case Study. GSLIS Data Curation Institute, June 4, 2008. Tim Donohue, IDEALS Technical Lead.

Outline Intro to IDEALS How we got started... Our preservation strategies Current / future data curation work Next steps...

What is IDEALS? Digital repository for the scholarship and research of the faculty, students, and staff of the University of Illinois at Urbana-Champaign. Dissemination Persistent Access Preservation A joint initiative between the University Library and CITES with support from the Office of the Provost.

What type of materials? Also audio and video

IDEALS goals… Help increase access to published and unpublished research Help increase the impact of published and unpublished research Provide a persistent, permanent URL for citing research Preserve research for long term access and use

The IDEALS “service model” Not just a repository… Set of services to collect, disseminate, and provide persistent and reliable access to the research and scholarship of faculty, staff, and students at the University of Illinois at Urbana-Champaign.

Other services we offer… Consultation on copyright issues Access restrictions / Embargo of items Statistics on number of downloads (working on departmental monthly reports…) Pilot service to deposit research into PubMed Central and ‘disciplinary’ repositories

IDEALS: the beginning

In the beginning: Promises precede us - Can we really commit to preserving everything? - What does it really mean to preserve this stuff? - What kind of staff expertise do we need? - What kind of resources do we need? - What kind of technical infrastructure do we need?

Getting our act together Got our Preservation Librarian involved Training and self education Cornell’s Digital Preservation Management Workshop and Online Tutorial Understanding Open Archival Information System (OAIS) conceptual model Trustworthy Repositories Audit & Certification (TRAC)

The Digital Preservation Platform Borrowed from the ICPSR Digital Preservation Tutorial:

Preservation Takeaways: Be explicit about what you will do and what you won’t do. You don’t have to preserve everything if you say you aren’t. Digital preservation management is not just about the technology. Photo borrowed from:

Getting our act together, cont. Not Really Our Server Room! Backup tapes stored next to the server! Photo borrowed from:

Looking forward to production: Digital Preservation White Paper Laid out for the Library and CITES administration what supporting a digital preservation management program would mean: Commitment on the part of both organizations Resources in terms of funding and staff are specifically allocated Processes, policies, and the institutional commitment are documented and as transparent as possible. The technical infrastructure is developed using community standards. Commitment of resources for planning and community standards building.

IDEALS Preservation Policy: Operating Principles Adherence To: OAIS Reference Model Community standards for preserving digital content Hardware, software, and storage media best practices. Intellectual property, copyright, and ownership rights of all content Commitment To: Interoperable, scalable digital archive Clear, openly documented policies & procedures Archival requirements for provenance, custody, authenticity, integrity Goal: A Certified, Trustworthy Repository

What resources do we need? Funding Currently from the Office of the Provost Designated staff Built into our job descriptions Technology infrastructure Move from Library to CITES Better environment Better security Distributes support for the tech infrastructure

Risks and Challenges Technological Change Sustainability Partnership between the University Library and CITES Identifying an Exit Strategy

How IDEALS supports data (files)

What have others done? Michigan’s Deep Blue – format support policy Florida Digital Archive – format “action plans” LC: Sustainability of Digital Formats Australian Partnership for Sustainable Repositories (APSR)

Digital Preservation Support: Format-based Categories of Support - High Confidence: Full Support (including migration) - Medium Confidence: No migration promised - Low Confidence: “Bit-level” support only. Diagram labels: Low Confidence (gray area); (size ≠ weight)
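The three categories above can be read as a lookup from a file's format to the level of support promised for it. The following is a minimal sketch of that idea, not the IDEALS implementation: the `Support` enum, the `EXAMPLE_REGISTRY` entries, and the rule that unlisted formats fall back to bit-level support are all assumptions made for illustration.

```python
import os
from enum import Enum


class Support(Enum):
    """Support categories, worded as on the slide."""
    HIGH = "full support (including migration)"
    MEDIUM = "no migration promised"
    LOW = "bit-level support only"


# Hypothetical registry: these extension assignments are illustrative only
# and are not taken from the IDEALS format policy.
EXAMPLE_REGISTRY = {
    ".tif": Support.HIGH,
    ".doc": Support.MEDIUM,
}


def support_for(filename, registry=EXAMPLE_REGISTRY):
    """Return the support level for a file, judged by its extension.

    Assumption: any format missing from the registry gets bit-level
    support only, the most cautious of the three promises.
    """
    extension = os.path.splitext(filename)[1].lower()
    return registry.get(extension, Support.LOW)
```

Keeping the promise text on the enum itself means a deposit form or policy page could show `support_for(name).value` directly, so the wording shown to depositors stays in one place.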

Format Support Matrix
Compilation of “known” formats; concentration on textual formats. Each row pairs two contrasting traits, with example formats in parentheses.
Proprietary (Microsoft Office) vs. Open (OpenOffice.org, HTML)
Limited Adoption (OpenOffice.org) vs. Widely Adopted (Microsoft Office, HTML)
Limited Support (Microsoft Office) vs. Widely Supported (Adobe PDF, HTML)
Embedded Content / DRM (MS Powerpoint w/ Audio or Video) vs. Nothing Embedded (MS Powerpoint)
Lossy Compression (JPEG) vs. No/Lossless Compression (TIFF, JPEG 2000)

Format Recommendations
Textual: CSV, Text, PDF/A, XML; Open Document Format; RTF, MS Office, PDF, HTML
Audio: AIFF, WAVE, Ogg Vorbis, FLAC; AAC, MP3, Real, WMA
Images: TIFF, JPEG 2000; GIF, JPEG, PNG
Video: AVI, Motion JPEG 2000; MP2, MP4, Quicktime, WMV
Legend: High Confidence / Preference; Medium Confidence / Preference
Data Concentration

What we are doing: Basic Activities (All Items) - Regular Virus Scans, Checksum verification - Nightly off-campus backups - Refresh storage media - Preservation Metadata (extremely minimal): Format, checksum, file size, etc. - Permanent Identifiers (Handles) - Always keep the original file(s) - Monitoring and reassessment of formats Very minimal/infrequent for
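The checksum verification and minimal preservation metadata listed above (format, checksum, file size) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code IDEALS runs: the function names `describe_file` and `verify_file`, the choice of SHA-256, and the use of the file extension to guess the format are all assumptions.

```python
import hashlib
import mimetypes
import os


def describe_file(path, algorithm="sha256"):
    """Record minimal preservation metadata for one file."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    # Assumption: the format is guessed from the file name only.
    guessed_type, _ = mimetypes.guess_type(path)
    return {
        "path": path,
        "format": guessed_type or "application/octet-stream",
        "size_bytes": os.path.getsize(path),
        "checksum_algorithm": algorithm,
        "checksum": digest.hexdigest(),
    }


def verify_file(record):
    """Recompute the checksum and size and compare with the stored record."""
    current = describe_file(record["path"], record["checksum_algorithm"])
    return (
        current["checksum"] == record["checksum"]
        and current["size_bytes"] == record["size_bytes"]
    )
```

A scheduled job could call `describe_file` once at deposit, store the record, and call `verify_file` on later runs; a `False` result signals that the stored bits have changed and the copy should be restored from backup.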

What we are doing: Intermediate Activities - Automated nightly “access copies” generated for major formats - When possible, attempt to migrate formats to preserve content and style (hopefully) - No promises that functionality will be preserved - Examples: MS Excel → CSV (possible functionality loss); MS Word → PDF (possible style / font loss)

What we are doing: Full Support Activities - When necessary, migrate document to successive format. - Attempt to preserve content, style and functionality - Example: OpenOffice.org 2.x → OpenOffice.org 3.x

What we are NOT doing - Checking every file for content problems (character encodings, DRM, embedded content) - Verifying ALL automated migrations are “successful” - Checking validity of format (e.g. JHOVE) - Removing/modifying/replacing original file (exceptions: viruses found or OCR necessary)

Making data available in IDEALS

Data in IDEALS: still early days Mostly ‘simplistic’ data sets Data in spreadsheets, text, XML formats i.e. “familiar” formats Problem: capturing relationships (between datasets, procedures, papers, etc.) in DSpace

Vole Demographic Data 25 years of data Data in 20 Excel files HTML Explanations PDF Manuscripts HTML “Sitemap” organizes data/files Lowell Getz (Dept of Animal Biology)

Illinois long-term selection exp. for oil & protein in corn Data since 1896 (ongoing) SAS Statistical System files (text) ReadMe describes experiments Tech Reports in PDF Collection description organizes data/files Department of Crop Sciences

Soon: Crystallography Data Processed data in CIF (Crystallographic Info File) Transform to CML (Chemical Markup Lang.) Original unprocessed data kept on server in Clark X-Ray Facility (size concerns) Data available to SPECTRa Search tools… Scott Wilson, School of Chemical Sciences Borrowed from Jmol website:

Future: Morrow Plots Data Oldest continuous agricultural research fields in USA (est. 1876), 2nd in world National Historic Landmark Data ongoing Likely require digitization of lab notebooks, etc. Borrowed from Agronomy Day 2001 website:

Gaps – What we are NOT doing - Providing unique “visualizations” of data (some will be coming, for crystallography data) - Making data “useable” directly from IDEALS - Making data itself “searchable” (besides metadata) Photo borrowed from:

Sustainability issues… How do we preserve *more* than just files? How do we keep data understandable? Disk space concerns… versus access Photo borrowed from:

A more “ideal” future When possible, finer grained access to data Better ways to show relationships between data and results/papers Begin to develop models for talking to faculty about their data Photo borrowed from:

For More Information: Sarah Shreeves, IDEALS Coordinator; Tim Donohue, Technical Lead. Policies / Documentation: