The International Internet Preservation Consortium (IIPC)
IFLA - CDNL, August 2005

Synopsis
The IIPC - what is it?
Background
IIPC goals and organisation
IIPC issues
IIPC future?
Concluding remarks

The IIPC - What is it?
International collaboration for preserving Internet content
Mission: Acquire, preserve and make accessible Internet (WWW) content for future generations
12 participating institutions
– National libraries of Australia, Canada, Denmark, Finland, France, Iceland, Italy, Norway and Sweden; the British Library (UK), the Library of Congress (USA) and the Internet Archive (USA)
Chartered in Paris in July 2003; agreement in effect for 3 years
Future not decided, but IIPC seeks to involve national libraries
IIPC welcomes inquiries about future membership

Background
The Internet is a specific medium with attributes of:
– Books, journals, radio, images, video
Characterised by:
– Exponential growth since 1994
– Proliferation
– Immense volume
– Anybody can publish
– Accessible everywhere

Archiving the Web – WHY - Who
Presently and in the future, a large and significant part of our culture will exist ONLY on the Internet
If Web pages are not collected in an orderly and continuous manner they will disappear, and with them an important part of the world's cultural and intellectual heritage
Therefore we should:
Preserve material that is only available on the Web
Preserve scholarly data and secure access to it because it is:
– Important and valuable
– Cited
– Difficult to find and locate
A logical extension of national libraries' mission and goals
LEGAL DEPOSIT LAW

Evolution of Legal Deposit Law in Iceland
[Diagram; surviving label: WWW]

Pre IIPC Development
Internet Archive, Sweden, Australia
1998 – Nordic co-operation
– LoC, BnF, UK, Austria, Slovenia, Czech Republic, Lithuania, Canada
IFLA 2002: Brewster Kahle presents the IA and Web archiving
September 2002 – IA proposes a project with a few libraries
September 2002 – Meeting in Rome (during ECDL)
January 2003 – Meeting in Paris (COBRA+)
July 2003 – IIPC incorporated

IIPC Goals
To build a virtual global distributed collection to ensure that the distributed and linked nature of the original web material is not lost forever
Find a new way of collaborating among national heritage institutions
– In order to create a network of heritage institutions
– That can build and preserve the global distributed collection
[Diagram labels: Global information space of the Internet; Global Distributed Collection]

IIPC Organisation
Steering group: one person from each institution
Working groups
– Access
– Content Management
– Deep Web
– Framework
– Metrics and Testbed
– Researchers Requirements

IIPC Objectives
Collaborative work, within each country's legislative framework, to identify, develop and facilitate implementation of solutions for selecting, collecting, preserving and providing access to internet content
Facilitate international coverage of internet content archive collections within national legal frameworks, in accordance with national collection policies
International advocacy for initiatives that encourage the collection, preservation and access to internet content
Provide a forum for sharing knowledge about internet content archiving both within the Consortium and beyond
Develop and recommend standards
Develop interoperable tools and techniques to acquire, archive and provide access to web sites
Raise awareness of internet preservation issues and initiatives through conferences, workshops, training events and publications

IIPC Results: Intangible
Common understanding and clarification of issues
Definition of the overall architecture for web archiving with system interface specifications
Proposed standards for Web Archive file format and Metadata
Access requirements with use cases illustrating common understanding of the functionality of a web archive
Identification and requirement specification of new access tools
Curator tool for controlling and scheduling the collection of web content
Definition of the WARC (Web ARChive) file format to store information blocks harvested by web crawlers

IIPC Results: Tangible
Heritrix Crawler/Harvester
– Smart crawling
– Continuous harvesting
Full Text Indexer/Search Engine
– Searching/browsing the content of a Web Archive
Extract data from an archived database
ARC file manipulation tool

IIPC Future - Issues
Collection building
– Broad scope: representative collection of the Web
– Narrow scope: in-depth collection of selected sites
Registration
– Cataloguing is not possible
– Indexing of text (with time element)
Access
– Direct, using a URL
– Search Engine (Google type)
– Data Mining (analytical and statistical methods)
Long-term preservation of a web archive: a conscious omission

IIPC Future
Current IIPC charter ends in July 2006
Proposals for continuation will be discussed at the next meeting in late October 2005
Challenge is to keep the work focused and effective
Many unsolved problems remain, and hopefully new members can help

Concluding remarks
Creating and accessing a Web Archive is:
– Very complex, challenging and exciting - neither a problem nor a burden
Collection – Preservation – Access
The first phase has started
– Our knowledge of the Web and its contents is incomplete
– All present software and tools must be improved
International cooperation needed to:
– Define and develop standards, techniques and methods
– Create national and even global Web Archives
– Provide access to the archives

National Bibliography - from Print to Digital
[Diagram. Recoverable labels: Books, Journals, Sound Rec., Video, Micro, CDs, Manuscr., Internet, Films; Index; Present National Bibliography; National Bibliography reflecting new law; Bibliography of National Cultural Heritage (Gallery, Archive, Museum, National)]