The KB e-Depot: long-term preservation of scientific publications in practice
Marcel Ras, National Library of the Netherlands
Libraries: traditional or digital
What is the problem with digital?
Digital information needs an intermediary to be interpreted
–Hardware, operating system, software
Physical carriers can be damaged; digital carriers can too, and they can also become obsolete
There is an awful lot of digital information
–annual growth of 500 million TB
–2008: 490 billion GB of information produced on the internet (the equivalent of 30 billion iPods)
We rely on digital information
Much information is digital only (no paper equivalent)
The KB
The KB is the national library of the Netherlands
The task of a national library is to collect, describe and preserve the national imprint
Paper, but also digital
National deposit since 1974 (but no deposit legislation)
49 km books (2.5 million)
–annual growth: km journals
–annual growth: over 1 km microfilm
13 million digital scientific articles
–annual growth: 3 million
The KB e-Depot (2)
Digital version of the traditional depot
Operational since 2003
No legal deposit legislation
Based on agreements with publishers
International focus
International scope of scientific output
Safe Places Network
History of the KB e-Depot
1994: e-publications become part of the deposit: need for an infrastructure
1995: Experiments with Elsevier and the Dutch Publishers Association
2002: Landmark archiving agreement with Elsevier
2003: e-Depot system operational
2006: Start of project with academic repositories
2007: Archiving agreements with 19 major STM publishers
2007: Start of development of ingest procedures for other materials
2008: Co-operation with Open Access communities (DARe / DOAJ)
What do we preserve?
Why should we bother?
Accessibility of scientific information and knowledge is in danger
Next generations will not have access to the information of our era (digital dark ages)
E-only and e-communication only
Digital information is extremely fragile
–Threat to cultural heritage
–Threat to scientific research
–Financial consequences
Should we bother?
BBC Domesday project
1086: William the Conqueror
1986: information on British society stored on state-of-the-art laser discs
After 20 years the carrier and hardware were useless
What can we do?
Digital Preservation
Long-term and safe storage
Registration
Tools for permanent access
[Diagram: storage, preservation metadata and permanent access (time machine) combined]
Digital archive
Registration
Time machine
Preservation research
Research on file formats and tools
–Characterization (JHOVE, DROID)
–Validation
Preservation strategies
–Migration
–Emulation
Preservation metadata
–PREMIS
Preservation planning
Storage
R&D results are implemented directly in the e-Depot infrastructure
Quality improvement is a continuous process
International collaboration
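To make the characterization step concrete, here is a minimal sketch of format identification by leading bytes ("magic numbers"). It is an illustration only and not how JHOVE or DROID are implemented; the names `SIGNATURES`, `identify_format` and `identify_file` are hypothetical, and the table covers only a handful of well-known signatures.

```python
# Hypothetical, deliberately tiny signature table (assumption: real tools
# rely on far larger, curated signature registries).
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"II*\x00", "TIFF"),  # little-endian TIFF
    (b"MM\x00*", "TIFF"),  # big-endian TIFF
]


def identify_format(data: bytes) -> str:
    """Return a format label based on the leading bytes, or 'unknown'."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"


def identify_file(path: str) -> str:
    """Read the first bytes of a file and identify its format."""
    with open(path, "rb") as handle:
        return identify_format(handle.read(16))
```

Identification of this kind only says what a file claims to be. Validation, the second item on the slide, would go further and check that the whole file conforms to the format specification.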
What does it cost?
Initial costs for development (2000–2003)
Annual costs, of which
–Operational staffing: 30 %
–Project staffing & development: 25 %
–Maintenance + hardware and software licenses: 25 %
–Storage: 20 %
But how to calculate the costs of
–Preservation management
–Preservation actions
Annual costs: about 4 million euro
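As a rough worked example, the percentages above can be applied to the annual total of about 4 million euro. This is a sketch under the assumption that the total is exactly 4,000,000 euro; the slide gives only an approximate figure, so the resulting amounts are indicative. The names `ANNUAL_COSTS_EUR`, `SHARES_PERCENT` and `cost_breakdown` are hypothetical.

```python
# Assumption: "about 4 million euro" is treated as exactly 4,000,000.
ANNUAL_COSTS_EUR = 4_000_000

# Shares as listed on the slide, in percent.
SHARES_PERCENT = {
    "Operational staffing": 30,
    "Project staffing & development": 25,
    "Maintenance + hardware and software licenses": 25,
    "Storage": 20,
}


def cost_breakdown(total_eur: int, shares_percent: dict) -> dict:
    """Split a total into per-category amounts using integer arithmetic."""
    if sum(shares_percent.values()) != 100:
        raise ValueError("shares must add up to 100 percent")
    return {name: total_eur * pct // 100 for name, pct in shares_percent.items()}


for name, amount in cost_breakdown(ANNUAL_COSTS_EUR, SHARES_PERCENT).items():
    print(f"{name}: about {amount:,} euro")
```

Under that assumption, operational staffing comes to about 1.2 million euro, project staffing and maintenance to about 1 million euro each, and storage to about 0.8 million euro per year.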
DP is not only a technical issue
[Diagram labels: e-Depot system, e-Depot infrastructure, Producer, Designated community]
Day-to-day operations: e-Depot department (Acquisitions and Processing Division)
–Functional owner, acquisitions, analysis, quality control, ingest, data management, publisher contact, guidelines
Research & Development: DP department (Research & Development Division)
–Research, projects, European research projects, guidelines, development, DP policy
Technical management: IT department, IBM (IT Department)
–Daily maintenance, storage, IT infrastructure, coordinating technical improvements
Access: Online Services department (User Services Division)
–User services, ILL, user interfaces, user survey, access policy
New challenges
New collections to be preserved
–Websites, digitized materials, institutional repositories
New content types
–e-books, image files, AV, multimedia, complex objects
Hybrid collections
–Websites
–Publications & research data
–Analog & digital combined
–“liquid publications”
–Compound objects
Growing storage capacity
–From 11 TB to 500 TB
Conclusions
Preservation is not just storage and technology
It demands long-term organizational commitment
It requires continuous research
It requires substantial investments in infrastructure and in up-to-date expertise and skills
It brings organizational changes
–Process innovation: from traditional library to digital library
–e-Depot brought a new set of skills to the “traditional” library
–Organizational change
It asks for constant rethinking: next-generation LTP solution
QUESTIONS?