Separating the wheat from the chaff: Identifying key elements in the NLA .au domain harvest

Preservation for Ongoing Accessibility: research group
Professor Ross Harvey
Dr Bob Pymm
Dr Anne Lloyd
Geoff Fellows
Jake Wallis

Pandora - NLA solution to website preservation
– archive of over 1.7 terabytes of data
– selective - identifies specific sites for harvest and gains permission to archive

Internet Archive - Automated
– harvests ‘the web’
– issues?
  – cost
  – reliability of the crawl, e.g. deep web

.au Harvest
– first run by the Internet Archive, producing 6.9 terabytes of data and 185 million unique files
– issues?
  – difficulties with certain file types
  – password-protected sites
  – difficulty in accessing the ‘deep’ web

.au Harvest
– September 2006 – more sophisticated crawl
– 19 terabytes of data, 596 million files
– predominant dataset for POA group

Research potential?
– digital preservation
– Australian digital culture

3 broad questions
– What are the contents of the harvests?
– How can access be provided to this content?
– What is the value of the domain harvests in relation to the NLA’s overall web preservation interests?

Blogs
– low skill threshold technology as barometer of engagement
– social space
– catalyst for online community
– a new and important collecting point for digital cultural heritage

Archiving and preserving blogs
– how to identify Australian-specific material?
– what to capture
  – selection criteria?
  – linked material?
– frequency of capture to ensure accurate representation
– provision of access to harvested blog content

Aspirations
– a conceptual framework for studies in digital anthropology
– a broadening of voices within the Australian public sphere

Questions/comments?