kopal - a Co-operative Approach to develop a Long-Term Digital Information Archive
ICOLC 2006, Rome
Dr. Thomas Wollschläger, German National Library (GNL)



2 Agenda
1. Challenges for long-term preservation
2. The role and features of the kopal initiative
3. Planned & present data ingest
4. Future challenges

3 The problem of the digital age
* 196 b.c. - † not yet
* † 2005 (?)

4 Preservation challenges at GNL
- German online publications are being delivered in numerous file formats
- Innovative file formats have been encouraged over the years
  - 3-D images & simulations
  - Embedded audio and video
  - Executables
- The first file types are already no longer accessible
- Unsatisfactory document server architecture up to now
- Advantage: Excellent metadata format (for ETDs) throughout Germany, trusted workflows for ETD delivery from universities

5 Challenges of a digital long-term archive
- Rapid technology changes hinder access to older file formats
- Problem 1: Conservation of binary data (0 and 1)
  - No existing data carrier lasts forever
  - Solution: Regular bitstream preservation
- Problem 2: Access to the content
  - Numerous formats; new ones keep appearing; old ones vanish
  - Dependencies on current software and hardware
  - Solutions: Migration (regular conversion), Emulation (re-enacting the systems originally used)
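Regular bitstream preservation usually rests on fixity checks: a checksum is stored at ingest and recomputed later for every copy. The following is a minimal sketch of that idea, not the mechanism used in kopal or DIAS; the function names and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copies(copies: list[Path], expected: str) -> dict[str, bool]:
    """Map each copy's path to True if its digest matches the stored digest."""
    return {str(copy): sha256_of(copy) == expected for copy in copies}
```

A copy that no longer matches the stored digest would be replaced from an intact copy; how often such checks run is a policy decision the slides do not specify.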

6 German national initiative "kopal"
- Co-operative development of a long-term digital information archive
- Funded by the Federal Ministry of Education and Research
- Financial volume: EUR 4.2 million + self-financed activities of all partners; duration: – (+ X)
- Task: Development of a standardized long-term preservation solution to facilitate long-term preservation for other libraries / industries
- Solution as a facilitator for co-operation between libraries and other institutions / companies

7 kopal: Concept and background
- Basis: DIAS (Digital Information and Archiving System) of the Royal Dutch Library, The Hague
  - Developed by IBM
  - Reliable standard components (CM, TSM, …)
  - Implementation of the OAIS standard
  - Further development of a suitable long-term preservation component (emulation, migration)
  - Starting point for preservation planning
- What we've missed:
  - Enhancement for co-operative usage
  - Hosting outside the library (remote access)
  - Development of a universal object scheme
  - A more generic approach
- Conclusion:
  - Extension of DIAS-Core and development of peripheral open-source based software tools to broaden its usability

8 kopal: Partners
- German National Library (GNL, leader)
- State and University Library Göttingen
- International Business Machines (IBM) Germany
- Society for Scientific Data Processing Göttingen (GWDG)
Working relationship:
- Royal Dutch Library, The Netherlands

9 kopal storage structure in Germany

10 kopal: Structure & concept
[Diagram labels: GWDG (Göttingen); DIAS by IBM; Account 1; Account 2; SUB Göttingen; GNL (Frankfurt); Partners; Local software (four instances)]

11 [Diagram labels]
- koLibRI Retrieval Component: Selection, Collection, Cache
- koLibRI Ingest Component: Metadata Extraction, Metadata Generation (JHOVE), UOF Creation (SIP with METS)
- Presentation components, User
- Data flow: XML + Data (OAIS compliant), UOF (SIP), UOF (DIP)
- DIAS: Ingest, Archival Storage, Data Management, Preservation, Access, Admin

12 Packaging
[Diagram: Submission Information Package in the Universal Object Format, consisting of the object and a Mets.xml file]
- Standards: METS 1.4; LMER 1.2 (Long-term preservation Metadata for Electronic Resources)
- Mets.xml sections: Header, dmdSec, amdSec, File Section, Structural Map
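The five sections named on the packaging slide correspond to the top-level METS elements metsHdr, dmdSec, amdSec, fileSec and structMap. The sketch below builds an empty skeleton with those sections using Python's standard library. It is only an illustration of the section layout, not the Universal Object Format profile itself: the function name, the ID values and the example identifier are assumptions, and no LMER metadata is included.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag: str) -> str:
    """Return the namespace-qualified name of a METS element."""
    return f"{{{METS_NS}}}{tag}"


def build_mets_skeleton(object_id: str) -> ET.Element:
    """Build an empty METS document with the five sections from the slide."""
    root = ET.Element(q("mets"), {"OBJID": object_id})
    ET.SubElement(root, q("metsHdr"))                  # Header
    ET.SubElement(root, q("dmdSec"), {"ID": "DMD1"})   # descriptive metadata
    ET.SubElement(root, q("amdSec"), {"ID": "AMD1"})   # administrative metadata
    file_sec = ET.SubElement(root, q("fileSec"))       # File Section
    ET.SubElement(file_sec, q("fileGrp"))
    struct_map = ET.SubElement(root, q("structMap"))   # Structural Map
    ET.SubElement(struct_map, q("div"))
    return root


# Hypothetical identifier, used only for illustration.
skeleton = build_mets_skeleton("urn:example:object-1")
print(ET.tostring(skeleton, encoding="unicode"))
```

A real package would fill these sections with descriptive and technical metadata and with references to the packaged files.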

13 Example for mets.xml in kopal

14 XMetaDiss Example for an ETD

15 kopal preservation strategy
- Migrate object with URN xxx into new format yyy
- Migrate all objects
  - of format xxx and/or
  - that have been ingested before a certain date and/or
  - that are larger than zzz MB
  into new format xyz (e.g. from TIFF to PNG)
- Implementation of emulation view paths
- No restriction on file size or file format / type: all known and unknown file formats are accepted (text, pictures, video, audio, executables, etc.)
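The selection criteria on this slide (format, ingest date, size) can be read as filters over the archive's object records. The sketch below shows one way to express that, assuming a hypothetical `ArchivedObject` record and function name; it is not the DIAS or koLibRI interface. Criteria passed together are combined with "and"; an "or" combination can be obtained by merging the results of separate calls.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass(frozen=True)
class ArchivedObject:
    """Hypothetical record describing one archived object."""
    urn: str
    file_format: str
    ingested_on: date
    size_mb: float


def select_for_migration(
    objects: Iterable[ArchivedObject],
    file_format: Optional[str] = None,
    ingested_before: Optional[date] = None,
    larger_than_mb: Optional[float] = None,
) -> list[ArchivedObject]:
    """Return the objects that satisfy every criterion that was supplied."""
    selected = []
    for obj in objects:
        if file_format is not None and obj.file_format != file_format:
            continue
        if ingested_before is not None and obj.ingested_on >= ingested_before:
            continue
        if larger_than_mb is not None and obj.size_mb <= larger_than_mb:
            continue
        selected.append(obj)
    return selected
```

For example, `select_for_migration(objects, file_format="TIFF", ingested_before=date(2006, 1, 1))` would list the TIFF objects ingested before 2006 as candidates for conversion to PNG.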

16 Data for ingest
- Online theses and dissertations at GNL
  - Number: ~ at present; data amount: ~ 350 GB
  - Most used digital collection of GNL (> access cases/month)
- Electronic journals & serials
  - Data amount: ~ 300 GB
- CD-ROM images
  - Number: ~ to ; data amount: ~ to GB
- Digitised materials:
  - Exil Press Digital (from GNL): ~ 150 GB
  - External digital collections: ~ to ~ GB
  - Digitised books from (GNL): ~ GB (for starters)
  - Digital audio from German Music Archive (GNL): ~ GB

17 Present ingest
- Production system was installed and made available to SUB and DNB in June 2006
- Several tests conducted (same tests as on the ATE)
- Production ingests of dissertations with a URN started in early August 2006
- About dissertations processed
  - Over ingested successfully
  - Rest was separated before ingest for validation and review (as yet unsupported file types, etc.)
  - Everything ingested into DIAS was processed correctly
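The separation step described on this slide, where objects with as yet unsupported file types are set aside before ingest, can be sketched as a simple triage. The code below is an illustration only: it decides by file extension, whereas a real workflow would rely on format identification (the ingest diagram names JHOVE), and the set of supported extensions is a hypothetical placeholder.

```python
from pathlib import PurePath

# Hypothetical list of supported types, for illustration only.
SUPPORTED_EXTENSIONS = frozenset({".pdf", ".xml", ".tif"})


def triage(filenames, supported=SUPPORTED_EXTENSIONS):
    """Split file names into (ready for ingest, held back for review)."""
    ready, held_back = [], []
    for name in filenames:
        suffix = PurePath(name).suffix.lower()
        if suffix in supported:
            ready.append(name)
        else:
            held_back.append(name)
    return ready, held_back
```

Files in the second list would go to manual validation and review rather than being rejected outright.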

18

19 Data ingest for kopal, starting with ETDs

20 Challenge: Preservation Planning + Access
- In the face of rising data volumes and large single objects (e.g. digitised DVD-ROM images with ~8 GB):
  - Guarantee sufficient system performance
  - Implementation of suitable access systems
  - Fast Internet connections, user support
- Implementation of a functioning Preservation Planning mechanism
  - Functioning international File Format Registry
  - High-performance migration of large data volumes
  - Successful implementation of emulation mechanisms
- Information, support & encouragement of ETD producers towards format & preservation awareness

21 Information on kopal
- The kopal project, standards used and downloads of documentation:
- Questions to the kopal team at German National Library:
- Thanks for your patience and attention!