
Long term preservation: an overview
Michael Day
Digital Curation Centre
UKOLN, University of Bath
Joint Workshop on Electronic Publishing, Lund, Sweden, 15 April 2005
Digital Curation Centre: a centre of expertise in data curation and preservation

2 Session overview
– Quick introduction
– A fifteen year view
– Overview of current issues

3 What is digital preservation?
– Dealing with the potential technical problems that impede continued access to digital resources (of all types)
– No longer possible to place physical artefact on a shelf and ignore for 100+ years
– Not just a technical problem: "... The planning, resource allocation, and application of preservation methods and technologies to ensure that digital information of continuing value remains accessible and usable" - Margaret Hedstrom (1998)

4 What is digital curation?
– New(ish) term, from science data world (e.g. bioinformatics)
– Reflects those extra things that need to be done to facilitate access and reuse
– "The activity of managing and promoting the use of data from its point of creation, to ensure it is fit for contemporary purpose, and available for discovery and reuse" - Philip Lord, et al. (2004)

5 Why is it a problem? (1)
– An increasing flood of 'born-digital' data
  – The World Wide Web
    – Comprises billions of pages + "deep Web"
    – Internet Archive = >1 petabyte, and 20 TB per month
  – Data deluge in science and engineering
    – Petabytes generated by high throughput instruments, streamed from sensors and satellites, etc.
  – 5 exabytes of new information created in 2002:
    – much-info-2003/

6 Why is it a problem? (2)
– Need for (open) access to this data
  – Results in added scientific value
  – New analytic techniques
  – OECD member states endorsed the principle that publicly funded research data should be openly available to the maximum extent possible

7 Technical problems
– Media longevity
  – Estimated lifetimes are short compared to paper or good quality microform
  – Solutions: more durable media, 'refreshing' regimes
– Hardware and software obsolescence
  – Relatively short obsolescence cycles for hardware, peripherals, media, and software
  – For example, BBC Domesday Project (1986) - hybrid videodisc
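
A 'refreshing' regime of the kind mentioned on this slide can be sketched as a copy to new media followed by a bit-level check. This is only an illustration, not a method described in the talk: the function names `sha256_of` and `refresh` are hypothetical, and the choice of SHA-256 as the checksum is an assumption.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 64 KiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh(source: Path, target: Path) -> str:
    """Copy source to target (standing in for new media) and verify the copy.

    Hypothetical helper: raises OSError if the copy is not bit-identical,
    otherwise returns the checksum so it can be stored for later audits.
    """
    before = sha256_of(source)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)
    after = sha256_of(target)
    if before != after:
        raise OSError(f"fixity mismatch after refreshing {source}")
    return after
```

Refreshing of this kind addresses media longevity only; it does nothing about hardware and software obsolescence, which is why the slides that follow turn to other strategies.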

8 Preservation strategies (1)
– Technology preservation
  – The preservation of an information object together with all of the hardware and software needed to interpret it
    – But will lead to museums of "ageing and incompatible computer hardware" - Mary Feeney (1999)
    – Has key role in the rescue of digital objects (digital archaeology)
– Emulation
  – The preservation of original application software, which is then run on emulators that mimic the behaviour of obsolete hardware and operating systems
    – Development of ‘virtual machines’ that will be migrated to work on different platforms (Jeff Rothenberg, 1998)
    – Universal Virtual Computer (UVC) concept

9 Preservation strategies (2)
– Migration
  – Managed transformations
  – The periodic transfer of digital information from one hardware and software configuration to another, or from one generation of computer technology to a subsequent one - CPA/RLG report (1996)
  – Widely used strategy, e.g. on ingest into a repository
  – Problems with preserving the integrity of an object
– Encapsulation
  – Self-describing objects, e.g. information package in OAIS model, METS, Buckets, Universal Preservation Format
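
The integrity problem noted under migration can be illustrated with a small sketch. It assumes, purely for illustration, that the migration is a character-encoding change from Latin-1 to UTF-8; `MigrationEvent` and `migrate_text` are hypothetical names, not part of any tool cited in the slides. The point is that the bytes (and so the checksum) change, so integrity has to be checked at the level of content rather than bits.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class MigrationEvent:
    """Hypothetical record of one managed transformation."""
    source_encoding: str
    target_encoding: str
    source_sha256: str
    target_sha256: str


def migrate_text(data: bytes,
                 source_encoding: str = "latin-1",
                 target_encoding: str = "utf-8") -> tuple[bytes, MigrationEvent]:
    """Re-encode a text object and record checksums before and after."""
    text = data.decode(source_encoding)
    migrated = text.encode(target_encoding)
    # Content-level check: the migrated bytes must decode to the same text.
    if migrated.decode(target_encoding) != text:
        raise ValueError("migration changed the textual content")
    event = MigrationEvent(
        source_encoding=source_encoding,
        target_encoding=target_encoding,
        source_sha256=hashlib.sha256(data).hexdigest(),
        target_sha256=hashlib.sha256(migrated).hexdigest(),
    )
    return migrated, event
```

Keeping the event alongside the migrated object is one simple way of documenting the transformation, in the spirit of the self-describing packages listed under encapsulation.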

10 Preservation strategies (3)
– Metadata and documentation
  – All digital preservation strategies depend - to some extent - on the creation, capture and maintenance of metadata
  – "Preserving the right metadata is key to preserving digital objects" (ERPANET Briefing Paper, 2003)
  – The various types of data that will allow the re-creation and interpretation of the structure and content of digital data over time (Ludäscher, Marciano & Moore, 2001)
  – Reference Model for an Open Archival Information System (OAIS) - ISO 14721:2003
  – PREMIS working group
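
As a rough illustration of the kind of metadata these strategies depend on, the sketch below builds a minimal record for one digital object. The field names are invented for this example and are not taken from OAIS or from the PREMIS working group's output; `describe` is a hypothetical helper.

```python
import hashlib
import json
from datetime import datetime, timezone


def describe(name: str, content: bytes, media_type: str, created_with: str) -> dict:
    """Build a minimal, illustrative preservation metadata record.

    All keys are assumptions made for this sketch, not a standard schema.
    """
    return {
        "identifier": name,
        "size_bytes": len(content),
        "fixity": {
            "algorithm": "SHA-256",
            "digest": hashlib.sha256(content).hexdigest(),
        },
        "format": media_type,
        "creating_application": created_with,
        "events": [
            {"type": "ingest", "date": datetime.now(timezone.utc).isoformat()},
        ],
    }


if __name__ == "__main__":
    record = describe("dissertation.txt", b"example text", "text/plain", "Galaxy")
    print(json.dumps(record, indent=2))
```

Even a record this small covers fixity, format, the creating software, and a first provenance event, the sort of information needed to interpret the object later.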

11 A fifteen year retrospective
– Based on my dissertation: "Preservation problems of electronic text and data" - Loughborough University (1989)
  – Overview of the state of the art in digital preservation in the late 1980s
  – Hardware and software used = IBM PC XT, MS DOS, 5¼" floppy disks, shareware word processing program (Galaxy)

12 The 1980s - contexts
– Still faith in the "paperless" future
– Electronic publishing in its infancy
  – Online databases (mainly bibliographic)
  – Viewdata systems (e.g., Minitel, Prestel)
  – Experiments with electronic journals (e.g. BLEND, Project Quartet) and electronic document delivery systems (ADONIS)
  – CD-ROM databases

13 The 1980s - issues (1)
– Digital preservation issues:
  – Major focus on the longevity of media
    – e.g., BNB Research Fund funded comparison of microfilm, magnetic media, and optical disks for archival storage (1983)
    – Interest in the potential value of new types of optical media, e.g. videodisc, Compact Disc (CD-ROM, CD-R)
    – No promising results from initial research

14 The 1980s - issues (2)
– Knowledge that media longevity was not the only issue
  – "The problem with machine-readable records is the long term availability of the machines rather than the physical decay of the recording mechanism" - John Mallinson (1986)
  – Brief consideration of COM (microform) for long-term storage

15 The 1980s - experiences (1)
– National archives:
  – A focus in some countries on machine-readable records from the 1960s
  – The principle that machine-readable records should be treated in the same manner as conventional records was established very early on, e.g. by Meyer Fishbein (1972)
  – Also, there was an early recognition of the importance of documentation and economic factors

16 The 1980s - experiences (2)
– Data archives:
  – Storage of social science survey data started in the punched-card era (1940s)
  – ESRC Data Archive established 1967
    – Recognised the importance of developing procedures to manage data (e.g., migration on ingest) and of standardised descriptions (metadata)
– National libraries:
  – Were considering legal deposit obligations

17 The 1980s - summing up
– Some differences with the position today, e.g.:
  – General lack of awareness
  – Focus on media longevity, 'refreshing' strategies
  – Little practical experience (except for data archives)
– Some continuity, e.g. it was recognised:
  – That the obsolescence of hardware (and software environments) was a serious problem
  – That data management strategies and documentation/metadata were important
  – That digital resources were not conceptually different to non-digital ones

18 The current context (1)
– The World Wide Web
– Changes in scholarly communication, e.g.:
  – Increased use of electronic journals, e-print repositories
  – Changes in scientific practice: data-intensive science, Grid computing, petabyte-scale storage, e-research
  – Current focus on open access
– Similar developments elsewhere, e.g.:
  – Broadcasting, e-commerce, e-government,...

19 The current context (2)
– Task Force on Archiving of Digital Information (1996): in the UK, led to influential research projects like Cedars, and eventually to the Digital Preservation Coalition (DPC)
– Major current initiatives:
  – US National Digital Information Infrastructure and Preservation Program (NDIIPP)
  – ERPANET, NESTOR, KB's e-Depot, etc.
  – UK Digital Curation Centre

20 Digital Curation Centre (1)
– Funded from 2004 for three years by the JISC and the e-Science Core Programme
– Main aim: "continuing improvement in the quality of data curation and digital preservation"
– Will focus on all aspects of the research process, e.g. from data creation to publication and beyond, also on the work of repositories and data archives
– Not itself a digital repository, but offering outreach and practical services to assist those who curate data …

21 Digital Curation Centre (2)
– Main activities:
  – Advisory services and outreach
  – Development
    – Registries of Representation Information, testing of tools, …
  – Research programme
    – Role of annotation, legal and socioeconomic issues, …
  – Collaborative network of associates
– Partners: Universities of Edinburgh (lead), Glasgow and Bath (UKOLN), CCLRC

22 Key developments (1)
– Greater awareness of the issues
  – Digital preservation now beginning to be taken seriously by governments and NGOs (e.g. Unesco Charter on the Preservation of Digital Heritage, World Summit on the Information Society)
– More experience with developing systems and tools, e.g.:
  – DIAS (IBM), DSpace, Fedora, Internet Archive, LOCKSS, OCLC Digital Archive, PANDAS, PubMed Central, Storage Resource Broker, etc.
  – Journal publishers co-operating with KB on e-Depot

23 Key developments (2)
– Standards
  – Reference Model for an Open Archival Information System (OAIS) - ISO 14721:2003
    – A reference model, not a blueprint - but increasingly influential
  – Preservation metadata
    – Current focus on PREMIS working group, supported by OCLC and Research Libraries Group
    – Other activity ongoing, e.g. in scientific research domains

24 Research (1)
Some key requirements identified in:
– It's about time: research challenges in digital archiving and long-term preservation, National Science Foundation and Library of Congress (2003): pdf
– Invest to Save: report and recommendations of the NSF-DELOS Working Group on Digital Archiving and Preservation (2003): WGs/digitalarchiving/Digitalarchiving.pdf

25 Research (2)
– DELOS preservation cluster:
  – Frameworks for the analysis of preservation strategies
  – Building preservation functionality into digital libraries
  – File formats and metadata
  – Workshop on Digital repositories: interoperability and common services, Crete, May 2005

26 Research (3)
– Current JISC research programmes:
  – Supporting Digital Preservation and Asset Management in Institutions
    – Relatively small-scale projects: assessment tools, training, user guides, etc.
  – Digital Repositories (deadline last week)
    – Building on Focus on Access to Institutional Resources (FAIR) programme

27 Some issues (1)
– Open access repositories and preservation:
  – Exact role of repositories still evolving:
    – Some advocates of open access treat digital preservation concerns as a distraction from the primary task of "filling up the archives"
    – But the recent National Institutes of Health public access policy requests grantees to submit publications to PubMed Central - emphasising its role for permanent preservation
  – Disaggregated model proposed, whereby not all repositories will have preservation responsibilities
    – Possible need for mechanisms for transferring content to third parties, e.g. national libraries

28 Some issues (2)
– Trusted repositories:
  – Attributes and responsibilities of 'trusted repositories' defined by RLG and OCLC working group (2002)
    – Builds on 1996 Task Force report and OAIS model
    – Attributes include the viability and financial sustainability of the organisation, and the need for accountability
    – Question whether these (and other criteria) could be used as a basis for certification is being explored by the Task Force on Digital Repository Certification, supported by RLG and the National Archives and Records Administration (NARA)

29 Some issues (3)
– Collection development:
  – Selection/appraisal, storage, access, 'de-selection'
  – Preservation issues need to be considered early in an object's life-cycle (the traditional 'transfer to repository' model will not work)
    – Rethinking concept of 'custody'
  – Cannot be done in isolation
    – Sharing responsibilities across repositories while maintaining useful redundancy

30 Some issues (4)
– Legal issues:
  – Repositories need the legal right to copy, migrate, reverse engineer software, etc.
  – Problems with identifying rights holders
  – Access - are "dark archives" the answer?

31 Some issues (5)
– Economic issues:
  – Still very little known about costs over the long term
  – No widely used economic models
  – Research-type funding is not long-term
    – Recent draft report for National Science Foundation asks whether digital collections should be treated like scientific facilities

32 Summing up (1)
– Major differences from the late 1980s
  – Problem has grown, but awareness of it is now much higher
  – Many research projects, vendors, services, etc. now investigating this problem - not always particularly co-ordinated
  – Encouraging signs in funding of NDIIPP, DCC and other recent initiatives

33 Summing up (2)
– Co-operation is essential
  – Some progress, e.g. DPC, ERPANET
  – Need to work out how trusted repositories will work together in a distributed network
  – Need for training
– Many problems remain to be resolved
  – Research (e.g. into provenance of data, the role of file format registries)
  – Development of tools
  – Integrating existing work

34 More information
– National Library of Australia's Preserving Access to Digital Information (PADI) gateway
– Joint DPC and PADI bulletin What's New in Digital Preservation
– UK Digital Curation Centre

35 Acknowledgements
The Digital Curation Centre is funded by the Joint Information Systems Committee (JISC) of the UK higher and further education funding councils and the Core e-Science Programme of the UK research councils. The consortium comprises the University of Edinburgh (lead partner), the University of Glasgow, the Council for the Central Laboratory of the Research Councils, and the University of Bath (UKOLN).
UKOLN is funded by the Council for Museums, Libraries and Archives (MLA) and the JISC, as well as by project funding from the JISC, the European Union and other sources. UKOLN also receives support from the University of Bath, where it is based.