IIPC GA Curator Tools Fair May 2014 WEB CURATOR TOOL Nicola Bingham Web Archivist.


2 Introduction
Jointly developed by the BL and NLNZ in 2006 under the auspices of the IIPC
WCT manages the selective web harvesting process
Designed for use in libraries by non-technical users
Open source
Uses the Heritrix web crawler

3 What it does and doesn’t do
Appraisal and selection: choosing websites for capture
 – Subject specialists, curators, external agencies
 – BL uses a selection permission tool plugged into WCT
Metadata/Description
 – Basic Dublin Core metadata
 – Titles, description, subject and collection tagging
Scoping and Data Capture
 – Scheduling
 – Crawl parameters, e.g. path depth, size of download
QA and Analysis
 – Heritrix log files
 – Browse tools
 – Recommendations based on indicators
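The crawl parameters named under Scoping and Data Capture (path depth, size of download) can be pictured as a simple scope check. The sketch below is only an illustration: `CrawlScope`, `path_depth`, `in_scope`, the same-host rule and the example values are all hypothetical, and none of it is taken from WCT or Heritrix.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class CrawlScope:
    """Hypothetical crawl limits; not WCT's or Heritrix's actual settings."""
    seed: str
    max_path_depth: int
    max_download_bytes: int


def path_depth(url: str) -> int:
    """Count the non-empty path segments of a URL."""
    return len([seg for seg in urlparse(url).path.split("/") if seg])


def in_scope(scope: CrawlScope, url: str, bytes_so_far: int) -> bool:
    """Return True if the URL is on the seed's host and within both limits."""
    same_host = urlparse(url).netloc == urlparse(scope.seed).netloc
    return (
        same_host
        and path_depth(url) <= scope.max_path_depth
        and bytes_so_far < scope.max_download_bytes
    )


# Assumed example values: a seed, a depth limit of 2, and a 10 MB download cap.
scope = CrawlScope("http://example.org/", max_path_depth=2, max_download_bytes=10_000_000)
print(in_scope(scope, "http://example.org/news/2014", 0))
print(in_scope(scope, "http://example.org/news/2014/05/item", 0))
```

A real crawler applies many more rules than this; the point is only that path depth and download size act as independent limits on what a scheduled harvest collects.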

4 What it does and doesn’t do (continued)
Storage and Organisation
 – WARC files created in WCT
 – Passed out of WCT for indexing and long-term storage
Access/Use/Reuse
 – Wayback is plugged in as the access tool
 – Harvested sites can be viewed within the tool
Risk Management
 – Harvest Authorisation module, rights metadata
 – Records the outcome of publisher communications
 – Control the display of Targets
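Wayback-style access tools conventionally address a capture by a 14-digit UTC timestamp followed by the original URL. As a rough sketch of that convention, the snippet below builds such a replay URL. The `replay_url` helper and the base address `http://wayback.example.org/wayback/` are hypothetical, and how WCT itself links to Wayback is not described in the slides.

```python
from datetime import datetime, timezone


def replay_url(base: str, captured_at: datetime, original_url: str) -> str:
    """Build a Wayback-style replay URL: <base>/<YYYYMMDDhhmmss>/<original URL>."""
    stamp = captured_at.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"{base.rstrip('/')}/{stamp}/{original_url}"


# Assumed example capture time and a placeholder Wayback base address.
captured = datetime(2014, 5, 19, 9, 30, 0, tzinfo=timezone.utc)
print(replay_url("http://wayback.example.org/wayback/", captured, "http://www.bl.uk/"))
```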

5 Development
Latest version available now
UI new features and improvements (x17), including…
 – Date pickers for date fields
 – Scheduling heat map
 – Harvest optimisation
Bug fixes (x11)
Development related, e.g.
 – No longer need to install Apache Tomcat server or database etc.
NLNZ budgeted NZD 50,000 for …
Open development process up to all WCT users
 – WCT pages
 – Wiki (code, support, mailing lists, bug tracker)

6 Thank-you. UK Web Archive