Danish Legal Deposit on the Internet: Current Solutions and Approaches for the Future ECDL, September 2001 by Birgit N. Henriksen Head of Digitization.


Danish Legal Deposit on the Internet: Current Solutions and Approaches for the Future ECDL, September 2001 by Birgit N. Henriksen Head of Digitization and Web Department The Royal Library, Denmark

Presentation outline
Three different initiatives:
- Since 1998: selection-based archiving (production)
- netarchive.dk (new project, multiple archiving strategies)
- Nordic Web Archive (project, access to web archives)

The Danish Legal Deposit Law
- 1697: All printers in royal and ducal lands must deposit
- 1703: Only printers in Copenhagen have to deposit
- 1781: All printers in royal and ducal lands must deposit
- 1902: All printed materials to be deposited
- 1927: Posters and some types of ephemera excluded
- 1997: All published works to be deposited

The law from 1997 covers any work published in Denmark regardless of medium
- “work”: a delimited quantity of information which must be considered a final and independent unit
- “published”: when … copies of the work have been placed on sale or otherwise distributed to the public

Types of Net Publications
- Static, included (only periodically updated): monographs, periodicals
- Dynamic, excluded (continuously updated): databases, homepages

How do we get the material?
Download based on notification
NOT:
- Harvesting the Danish domain
- Delivery of works (a collection of files) from the individual publishers

Registration
- WHO: the person in charge of the technical completion of the digital copy
- HOW: by filling out a form at

Registration Form - Monographs

Download - workflow
The staff at the Danish Department, The Royal Library:
- determines whether a publication is covered by the law
- if yes, downloads all files belonging to the work
- checks the downloaded work
- catalogues and classifies the work in the OPAC (only periodicals)
- transfers the work to the archival server (server mirrored nightly to State and University Library, Århus)
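The workflow steps above can be sketched in a few lines of Python. This is only an illustration, not the Royal Library's actual system: the `Notification` fields, the `process` function, the "static means covered" rule and the non-empty-file check are all hypothetical simplifications.

```python
from dataclasses import dataclass, field


@dataclass
class Notification:
    """A publisher's notification (all field names are hypothetical)."""
    title: str
    kind: str          # "monograph" or "periodical"
    is_static: bool    # assumption: static works are covered, dynamic ones are not
    file_urls: list = field(default_factory=list)


def process(notification, fetch, archive, opac):
    """Run one notification through the steps listed on the slide."""
    # 1. Determine whether the publication is covered by the law.
    if not notification.is_static:
        return "rejected"
    # 2. Download all files belonging to the work.
    files = {url: fetch(url) for url in notification.file_urls}
    # 3. Check the downloaded work (assumed check: every file is non-empty).
    if not files or any(not body for body in files.values()):
        return "failed-check"
    # 4. Catalogue and classify in the OPAC (only periodicals).
    if notification.kind == "periodical":
        opac.append(notification.title)
    # 5. Transfer the work to the archival server (a dict stands in for it).
    archive[notification.title] = files
    return "archived"


archive, opac = {}, []
fake_fetch = lambda url: b"<html>...</html>"   # stand-in for a real download
issue = Notification("Example Journal 2001/1", "periodical", True,
                     ["http://example.invalid/issue1.html"])
homepage = Notification("Example homepage", "monograph", False,
                        ["http://example.invalid/"])
issue_status = process(issue, fake_fetch, archive, opac)
homepage_status = process(homepage, fake_fetch, archive, opac)
```

Passing `fetch`, `archive` and `opac` in as arguments keeps the sketch free of network access, so each step can be tried out in isolation.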

Plug-ins

System Environment

Domain names in the .dk domain
Columns: # of sub-domains; registered in .dk, May 12th; registered in .dk, June 12th; represented in archive, June 12th 2001
Surviving table entry: < 1000

Volume in archived material
                       June 1999    June 2000    June 2001
# net publications
# representations
Repr./net pub.
# files, total
Files/net pub.
# bytes, total         1,66 Gbyte   12,0 Gbyte   18,2 Gbyte
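As a small worked example, the year-on-year growth of the archive can be computed from the three byte totals reported above. The function name and the dictionary layout are illustrative choices, and the Danish decimal commas have been converted to decimal points.

```python
# Total archive size in Gbyte, as reported on the slide.
volume_gbyte = {"June 1999": 1.66, "June 2000": 12.0, "June 2001": 18.2}


def growth_factors(series):
    """Return the ratio between each consecutive pair of measurements."""
    labels = list(series)
    return {
        f"{prev} -> {cur}": series[cur] / series[prev]
        for prev, cur in zip(labels, labels[1:])
    }


factors = growth_factors(volume_gbyte)
for period, factor in factors.items():
    print(f"{period}: x{factor:.1f}")
```

The archive grew roughly sevenfold in the first year and by about half in the second.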

Monographs vs Periodicals
Columns: before July 1st 1999, before July 1st 2000, before July 1st 2001 (each as # and %)
Rows: Monographs; Periodicals (issues)

Public vs. Private Publishers
Columns: June 1999, June 2000, June 2001 (each as # and %)
Public ,5
Private ,5

Staff resources
Year   Man years   Paid hours per publication   Comments
1998   2,3         12,75                        System being developed and set up
1999   1,9         1,2                          Downloading, cataloguing and classifying all publications
2000   1,3         0,6                          Downloading all, cataloguing and classifying periodicals

MimeType Statistics – % of collected files
                         June 1999   June 2000   June 2001
TEXT/HTML                56,0 %      58,6 %      59,3 %
Image (GIF, JPEG, PNG)   41,8 %      38,4 %      37,9 %
PDF                      1,3 %       1,6 %       1,7 %
Other formats            0,9 %       1,4 %       1,1 %

Three generations using the internet
Generations compared: 1st (age 74), 2nd (age 40), 3rd (age 10-15)
Professional life (work/school related): Professional online periodicals/portals; Product information; Institutions and organisations; Newsgroups; Uncritical all available material
Entertainment: Just surfing around; Auctions; Game services; Bizarre websites; Newsgroups; Events; Game services; Gimmicks; Chat services
Searching for information: Search engines; News; Municipal sites; Search engines (including cached web pages); News and media/portals; State and municipal sites; Product databases; Search engines
Special interests: Homebanking; Stock exchange; Homebanking and info related to family economy; E-commerce; Organisations; Seasonal interests; Sport clubs (results); Live role play

The modifications from 1902
Brochures and advertisements; Catalogues; Election campaign material; Club/organisation magazines; Songs; Scouting magazines, church newsletters; Maps; Portraits; Art prints
Brochures; Online services like krak.dk; Organisation websites; Newsletters/minutes on websites; Product databases/portals; Net Art

Problems related to the notification concept
- Lack of notification of multiple representations of a publication
- Lack of notification of new versions

Problems related to technical issues
- Errors or inconsistencies in the published files
- Java applets: no solution at the moment
Solutions found for earlier problems:
- Documents with JavaScript
- Data behind forms
- Data behind username/password logins
- Cookie-based session handling
- SSL encryption

Gains if harvesting is used
- Better coverage of Denmark outside the public sphere
- Updated versions, also for static publications
- New trends on the net as soon as they appear

Why not only harvesting?
- Programs and plug-ins are difficult to keep track of
- Harvesting is not always possible (e.g. streamed and webcast material)
- Harvesting may not give a useful result
  - technical problems (Java, interactive sites)
  - personalised sites
- Harvesting may produce a collection of documents that have never existed on the net
- Harvesting may not always give the best format for long-term preservation

Net Art

Home banking

Searching the catalogue

Collections made by harvesting
- Are not complete (see previous slides)
- No robot will ever be able to make a ’true’ snapshot: the snapshot contains a mix of documents that have never been published together at the same time, a ’fake’
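The ’fake’ snapshot effect can be illustrated with a toy simulation. The site, its two pages and the one-page-per-time-step robot below are invented for the example; real harvesters and real sites are far more complex.

```python
def site_state(t):
    """Hypothetical site: both pages are republished together at t = 1."""
    version = 1 if t < 1 else 2
    return {
        "index.html": f"issue {version} front page",
        "article.html": f"issue {version} article",
    }


def crawl(pages, start=0):
    """Fetch one page per time step, as a simple robot would."""
    return {page: site_state(start + step)[page]
            for step, page in enumerate(pages)}


snapshot = crawl(["index.html", "article.html"])
published_states = [site_state(t) for t in range(3)]

print(snapshot)
# True when the harvested mix never existed on the site at any single moment.
print(snapshot not in published_states)
```

Here the robot stores the front page of issue 1 next to the article of issue 2, a combination the publisher never had online at the same time.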

Archive for Danish Literature from 1. October
- All full texts are structured in XML on work level
- The XML is loaded into a database
- The database performs the web publishing in well-formed HTML on a page level
What do we prefer to archive and for what purpose?
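A minimal sketch of that XML-to-database-to-HTML chain, using only Python's standard library. The element names (`work`, `page`), the attributes and the in-memory dict standing in for the database are assumptions made for illustration; the slide does not describe the archive's actual schema or software.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical work-level XML; the real schema is not given in the slides.
WORK_XML = """<work title="Example Poems">
  <page n="1">First page text</page>
  <page n="2">Second page &amp; more</page>
</work>"""


def load_work(xml_text):
    """Parse a work-level XML document into a page 'database' (a dict)."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.get("title"),
        "pages": {int(p.get("n")): (p.text or "") for p in root.iter("page")},
    }


def render_page(db, n):
    """Publish one page of the stored work as a small well-formed HTML document."""
    title = escape(db["title"])
    body = escape(db["pages"][n])
    return (f"<html><head><title>{title}, page {n}</title></head>"
            f"<body><p>{body}</p></body></html>")


db = load_work(WORK_XML)
html_page = render_page(db, 2)
ET.fromstring(html_page)  # raises ParseError if the output is not well-formed
```

The sketch makes the slide's closing question concrete: one could archive the work-level XML, the database, or the page-level HTML that users actually see.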

Birte Christensen-Dalsgaard: Archive Experience, not Data

Web Archiving Conference, CPH June 2001
Focus: User expectations of web archiving in DK
Brought together:
- members of the user community, scholars as well as scientists
- members from the organisations traditionally in charge of preserving oral and written material
- members with technical knowledge
Proceedings (UK) – netarchive.dk

Web Archiving Conference, CPH June 2001
Scholars & scientists:
- Archive the dynamic part of the web
- Focus on archiving the content, the context, the evidence of use
Archivists:
- Use different archiving approaches
- New methods for archiving dynamic material
- Budgets for making snapshots and making selective collections are comparable

Birte Christensen-Dalsgaard: 3 dimensions - duration
Axis: Signal lifetime (real time dialog; hourly update; published, static)
Examples: book-like publications, scientific journals, news sites, chat

Birte Christensen-Dalsgaard: 3 dimensions - Permanent value
Axis: Permanent value (transient to persistent)
What is worth preserving? Quality vs. representative

Birte Christensen-Dalsgaard: Background - Nature of Information
Axes: Interactivity (static to dynamic); Permanent value (transient to persistent); Signal lifetime (real time dialog; hourly update; published, static)

Birte Christensen-Dalsgaard: Domain of different harvesting methods
Axes: Interactivity (static to dynamic); Permanent value (transient to persistent); Signal lifetime (real time dialog; hourly update; published, static)
Methods placed in the diagram: Legal Deposit, DK; accumulative harvesting; snapshot

Birte Christensen-Dalsgaard: What is missing?
Axes: Interactivity (static to dynamic); Permanent value (transient to persistent); Signal lifetime (real time dialog; hourly update; published, static)
Methods placed in the diagram: Legal Deposit, DK; accumulative harvesting; snapshot

netarchive.dk (1)
Axes: Interactivity (static to dynamic); Permanent value (transient to persistent); Signal lifetime (real time dialog; published, static)
Methods placed in the diagram: accumulative; snapshot; process
Test different archival approaches and the subsequent usability of the archived material for research

netarchive.dk (2)
Pilot project testing different archival approaches and the subsequent usability of the archived material for research
Project partners:
- State and University Library, Aarhus
- Centre for Internet Research
- The Royal Library
With financial support from the Danish Electronic Research Library (DEF)
Period: August 2001 – July 2002
Case: Danish municipal elections, November 2001

netarchive.dk (3)
- Which materials, with what frequency?
- Collection method? Which software?
- How should the collection of materials be organized and how should it be stored?
- How should obsolescence of data formats be dealt with?
- How should access be given?
- Budgets for collecting and storing

netarchive.dk (4)
Net material covered by netarchive.dk:
- net activities from existing news media (newspapers, radio, TV (both national, regional and local media))
- political parties' official pages, national and local
- individual politicians’ personal pages
- official (county) municipal pages
- voters’ personal pages
- »local themes« pages
- special interest organisations
- portals in the broadest sense
- opinion polling firms
- public s/ press releases
- news groups / usenet
- net-conferences and chat

How do we catch the missing part?
Process rather than material: ‘filming’ the net through a browser
Goal: Catch chronological series of displayed web pages
Tools to take into consideration:
- Business intelligence tools
- Tools used in usability laboratories
- …

Nordic Web Archive (NWA)
Establish a Danish test archive in order to participate in NWA
Software: NEDLIB robot
Status 1/9 2001:
- Archiving started 20/
- mio documents
- 43 GB uncompressed data

Questions?