a centre of expertise in data curation and preservation
CILIPs Branch/Group Day :: 27 September 2006 :: Dundee

From Digital Creation to Digital Curation
Managing Digital Cultural Heritage Resources
Maureen Pennock
Digital Curation Centre, UKOLN, University of Bath

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5 UK: Scotland License. To view a copy of this license, visit sa/2.5/scotland/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.

Today's Talk
  Introductions
  The UK Digital Curation Centre
  Curation and the digital life-cycle
  Issues in developing and managing digital collections
  Helpful projects and initiatives
  Discussion

The UK Digital Curation Centre

Digital Curation
  Digital curation, broadly interpreted, is about maintaining and adding value to a trusted body of digital information for current and future use
  The active management and appraisal of data over the entire life-cycle

The DCC
  Launched in 2004
  Established to help solve the extensive challenges of digital preservation and curation, and to provide research, advice and support services to UK institutions
  Consortium project with 4 main partners
  4 main teams distributed across the 4 UK locations
  Funded by JISC & the e-Science Core Programme

Organisation to Engage & Collaborate
  [Diagram of the DCC organisation. Labels: industry; research collaborators; standards bodies; testbeds & tools; communities of practice: users; community support & outreach; research; development; co-ordination; service definition & delivery; management & admin support; Collaborative Associates Network of Data Organisations; curation organisations, e.g. DPC]

DCC Outreach
  Raising Awareness and Dissemination
    Website
    International Journal of Digital Curation
    Annual International Conference
  Understanding Users and their Needs
    Requirements gathering
    Associates Network
    DCC Forum

DCC Services
  Information Services
    Community-developed Digital Curation Manual
    Briefing Papers & FAQs
    Technology Watch, Standards Watch, Legal Watch
    Case Studies
    Best Practice Checklists
  Advisory Services
    Events: information days, workshops, training
    Helpdesk
    Audit and Certification Services

DCC Research
  Annotation in Databases
  Data archiving
  Socio-economic and legal issues
  Metadata extraction and curation
  Ontologies and data dictionaries
  Provenance and databases
  Data transformation, integration and publishing
  Supporting technologies
  Networks of trusted digital repositories
  Organisational and cultural challenges to digital curation

DCC Development
  DCC Approach to Digital Curation (white paper) – sets out the path for development activities:
    Monitoring international standards
    Creating testbeds for digital curation tools
    Development of recommendations for tools and methods for generating Representation Information
    Development of a Representation Information Registry (DCC RIR)

Digital Curation and the Life-Cycle

Why a life-cycle approach?
  Curation is a life-cycle approach to the management and preservation of digital objects, necessary because:
    Digital materials are fragile & susceptible to change from technological advances throughout their life-cycle
    Each stage can impact on subsequent stages
    Traditional management processes may need adapting for digital materials with different requirements
  The life-cycle approach enables continuity and provenance despite technological and organisational contextual change
  Maximises investments and potential

Life-Cycle model
  Digital Object Life-cycle model differs slightly depending on the context (e.g. libraries/archives/museums)
  This generic model addresses libraries

From Creation to Curation
  Life-cycle approach facilitates continuity and control over the different stages
  Each stage can impact on the following one:
    Creation impacts on many stages, as the way a resource is created affects the way it can be curated and its sustainability
    Creation is problematic in a digital heritage context, as you may not have control over the way resources are created

Issues in Developing and Managing Digital Collections

The Digital Library: Discuss
  What exactly is a digital library?
    A library accessible over the internet? (but to what extent?)
    A library with (only?) digital holdings?
    A cutting-edge institution that maximises IT potential? (can be achieved in many ways)
    An added-value service?
  Professional disagreement over the definition (especially the difference between this and a digital archive)
  More than just a search engine and an access mechanism – more than just the Internet!

Potential digital library resources
  Digitised
    Maps and Posters
    Photographs
    Original texts – books, manuscripts, newspapers, journals
    Audio-visual material
    Microfilm
  Born Digital
    Maps and Posters
    Photographs
    E-Publications
    Audio-visual material
    Websites (which will invariably contain multi-media objects)
    Cataloguing data?

Issues
  Range across the life-cycle
  Involve different stakeholders at each stage
  Communication essential
  Types of issue: Technical, Preservation, Organisational, Legal, Financial, Cultural

Technical issues (1)
  Harvesting & Accession
  Storage – which model to implement?
  Metadata – what metadata are needed?
  Security – protection from unauthorised or malicious access
  User access – what tools are needed?

Technical issues (2)
  Preservation
    Objects are highly dependent on their technical environment
    Software/hardware changes many times during the lifetime of the records – every five years?
    Content may be altered if action is undertaken
    Content will become inaccessible if action is not taken
    Preservation strategies & tools
    Fragility of storage media
    Media obsolescence
    File deterioration
    Hardware & software obsolescence
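
The slide does not name a technique for spotting file deterioration. One widely used approach is a periodic checksum (fixity) check: record a digest for each file at accession and recompute it later. The sketch below assumes SHA-256 and a hypothetical manifest that maps relative file paths to digests; the function names are illustrative only.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(manifest: dict[str, str], root: Path) -> dict[str, str]:
    """Compare stored digests with freshly computed ones.

    `manifest` is a hypothetical mapping of relative path -> expected digest.
    Returns relative path -> "ok", "changed" or "missing".
    """
    report = {}
    for relative_path, expected in manifest.items():
        target = root / relative_path
        if not target.is_file():
            report[relative_path] = "missing"
        elif sha256_of(target) == expected:
            report[relative_path] = "ok"
        else:
            report[relative_path] = "changed"
    return report
```

A check like this only detects change. Repair still depends on having another intact copy to restore from.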

Organisational and Cultural issues
  Organisational and cultural infrastructure not usually geared towards digital longevity
  Digital cultural heritage resources are often primarily recognised as resources for the here and now
  Here-and-now access practices do not ensure longevity!
  Preservation issues not recognised/regarded
  Staffing – expansion of duties or new staff?
  Need for senior managerial support, e.g. policy, finances…

Financial issues
  Not just a one-off digitising or collecting cost
  Preservation activity can require ongoing financial commitment
  Who will pay – now and in the future?
  What are the cost benefits?
  Where's the business model?
  Will access be payment-restricted?

Legal issues
  Meeting legal obligations: data protection, copyright, database right…
  Who is responsible?
  Copyright particularly relevant, as copying can be a vital act in preservation and access
  Impact of DRM on copying abilities
  A new definition of copying needed?

Addressing the issues
  Follow progress in national initiatives
  Collaborate & communicate
  Engage the consumer
  Success requires commitment:
    At a policy level (integrated)
    At a managerial level (support/backing)
    At a staffing level (actions/activities)

Strategy (1)
  A written policy and strategy to support activities and help secure resources
  Take a life-cycle approach to support curation and preservation planning
  If creating resources, provide good practice guidance for sustainability (e.g. when digitising or accepting digitised resources)
  Assess collection/selection criteria – are they still valid? Do they need expanding?
  Identify possible resources
    Digital resources can complement & enhance physical ones
    Be aware of externally produced digital resources (e.g. websites); check other heritage collections before gathering!

Strategy (2)
  Identify legal restraints in collection/management/access
  Can value be added to resources during acquisition?
  Store objects in a secure environment
  Plan for preservation activities to maintain access to authentic resources over time and avoid incurring extra costs
  Determine access and user requirements
  Implement an integrated approach to collection accessibility
  Adapt and learn from national and other leading activities

Helpful projects and initiatives for preservation and accessibility

National Library of Scotland
  Developed several digital and web-accessible themed collections:
    Propaganda: A weapon of war (posters/images)
    Maps
    First Scottish books
    Robert Louis Stevenson (letters, sketches, photos)
    Muriel Spark – the story
    Churchill: The evidence (contains school resources)
  Trusted Digital Repository
  Part of the UK Web Archiving Consortium (UKWAC)
    Selection and collection criteria for Scottish web sites
    Archiving the UK General Election 2005

UKWAC
  UK Web Archiving Consortium (6 members)
    British Library, National Library of Scotland, National Library of Wales, The National Archives, Wellcome Library, JISC
  Collects Web content selectively
  Uses modified PANDAS collection/harvesting software developed by the National Library of Australia
    Underlying harvesting program is currently HTTrack
  Permission is sought from site owners in advance
  Persistent Identifier URLs
  Single partner assumes responsibility for each site
  Central repository of metadata
  The collections are publicly accessible

Internet Archive
  Non-profit organisation, based in the U.S.
  Wants to offer permanent access to digital online materials of all types
  Founded in 1996, has been collecting since then … much content donated by Alexa Internet
  Collects sites by crawling and harvesting web sites
  Sites can 'opt out' by way of a robots.txt file on the web server
  Most content is freely available to the public, e.g. through the Wayback Machine
  Interface issues: only the URL indicates that the page is archived
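
The robots.txt opt-out mentioned above can be illustrated with Python's standard `urllib.robotparser` module. This is a minimal sketch of how a well-behaved crawler might consult the file before harvesting. The robots.txt content, the example URLs and the `may_crawl` helper are hypothetical, and the `ia_archiver` user-agent name is an assumption not stated on the slide.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one crawler is excluded from the whole site,
# every other crawler is excluded only from /private/.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ia_archiver", "OtherBot"):
        allowed = may_crawl(ROBOTS_TXT, agent, "http://example.org/page.html")
        print(agent, "may crawl" if allowed else "is opted out")
```

The opt-out works only because the crawler chooses to honour it. The file is a published request, not an access control.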

IIPC (1)
  International Internet Preservation Consortium
  Builds co-operation between the Internet Archive and national and research libraries
  Co-ordinated by the Bibliothèque nationale de France
  The British Library is the only current UK member; other national library partners include the Library of Congress, Library and Archives Canada and the national libraries of Australia, Denmark, Finland, Iceland, Italy, Norway and Sweden
  Reflects those with current experience of Web archiving
  Both working-groups and tool development
  Phase II will enable new partners to join the consortium

IIPC (2)*
  Phase I - developing the IIPC toolkit
  Standards and tools for supporting:
    Acquisition - archival quality crawler (Heritrix); portable database extraction and migration tool for database-driven deep web sites (DeepARC)
    Managing collections - analytical and prioritization tools for automatically focusing harvesting; curation tools to provide a non-technical interface for selecting, monitoring and verifying archived web sites
    Collection storage and maintenance - tools for manipulating formats; a standardised storage format (WARC); standards for metadata
    Access and finding aids - browse interfaces (WERA) and search facilities (NutchWAX)
  * Michael Day, IWMW 2006

LOCKSS (1)
  Lots of Copies Keeps Stuff Safe (LOCKSS)
  An easy and inexpensive way for libraries to collect, store, preserve, and provide access to their own, local copy of authorised content they purchase (LOCKSS website)
  E-Journal collection and preservation system
  Open Source Software
  Runs on standard desktop hardware
  Requires very little technical administration

LOCKSS (2)
  Trial and pilot projects underway
  DCC support available through helpdesk and dedicated Advisory post
  Current trial suitable only for certain titles (due to licensing arrangements with publishers)
  Private networks can be developed:
    Requires technical development
    Minimum of six machines necessary to achieve desired redundancy
    Suitable for, e.g., online course material
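
To illustrate why several independent copies help, here is a deliberately simplified sketch in which replicas compare content digests and the majority decides which copies are damaged. It is not the actual LOCKSS polling protocol. The function name, the machine names and the use of SHA-256 are assumptions made for the example.

```python
import hashlib
from collections import Counter


def find_damaged_replicas(replicas: dict[str, bytes]) -> list[str]:
    """Return the names of replicas whose content disagrees with the majority.

    `replicas` maps a hypothetical machine name to the bytes it holds.
    Raises ValueError if no strict majority of identical copies exists.
    """
    if not replicas:
        raise ValueError("no replicas supplied")
    digests = {
        name: hashlib.sha256(data).hexdigest() for name, data in replicas.items()
    }
    majority_digest, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(replicas) / 2:
        raise ValueError("no majority: cannot tell which copies are intact")
    return sorted(
        name for name, digest in digests.items() if digest != majority_digest
    )


if __name__ == "__main__":
    # Six machines, one of which holds a corrupted copy.
    copies = {f"box{i}": b"journal issue 12" for i in range(1, 7)}
    copies["box4"] = b"journal issue 1?"
    print(find_damaged_replicas(copies))
```

With only two copies, a disagreement cannot be resolved. More copies make it increasingly unlikely that damage goes undetected or that the damaged version wins the vote.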

Further resources
  National Library of Scotland
  National Library of Wales
  British Library
  DCC website
  UKOLN website
  SLAINTE website
  Digital Archives Regional Pilot (DARP) project
  Building and Sustaining Digital Collections, Abbey Smith

Thank You & Discussion
  Maureen Pennock
  Join the DCC Associates Network (it's free!)