
1 Netarchive Plans for the next year

2 Netarchive – Plans for the next year
• 4 broad crawls
• One broad crawl lasts less than 55 days
• We are able to fulfill our task of doing 4 broad crawls a year.
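The capacity claim on this slide can be checked with simple arithmetic. This is a minimal sketch, assuming the four crawls run one after another and treating 55 days as an upper bound per crawl; the constant and function names are illustrative and not from the presentation.

```python
# Figures taken from the slide; sequential scheduling is an assumption.
CRAWLS_PER_YEAR = 4
MAX_DAYS_PER_CRAWL = 55  # each broad crawl lasts less than this
DAYS_IN_YEAR = 365


def max_crawl_days(crawls: int, max_days: int) -> int:
    """Upper bound on the total number of days spent crawling in a year."""
    return crawls * max_days


total = max_crawl_days(CRAWLS_PER_YEAR, MAX_DAYS_PER_CRAWL)
slack = DAYS_IN_YEAR - total
print(f"at most {total} crawl days, at least {slack} days of slack")
```

Even at the upper bound, four back-to-back crawls leave room in the calendar year, which is consistent with the statement that the task can be fulfilled.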

3 Netarchive – Plans for the next year
• The never-ending story: Facebook and YouTube
• Adapt the harvest template to Facebook's ongoing changes.
• We have harvested about 40,000 YouTube videos. What is the next step?
• How can we implement a setup for, e.g., event harvests?

4 Netarchive – Plans for the next year
• Focus on harvesting e-books
• Pilot project:
• Museum Tusculanum: http//… harvested with a template with xml-extractor
• Publizon: ftp… with a PuTTY login on kb-prod-udv-001.kb.dk (password protected)
• What is the next step?
• Challenge: According to the legal deposit law we have to collect e-books, but we are not allowed to show them to anybody. How do we do this?

5 Netarchive – Plans for the next year
• Selective crawls: focus on harvesting password-protected content.
• News sites in particular are moving more and more of their content behind passwords.
• The easiest way to capture this content is via IP validation.
• Not all publishers will give us IP access.
• http-password: has to be implemented in the harvest template.
• html-password: Netarchive does not support html-password at the moment.
• Is there an easy way to get the password-protected content?
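For the http-password case, one widely used mechanism is HTTP Basic authentication, where the client sends the credentials in an `Authorization` header. The sketch below assumes that this is what "http-password" refers to on the slide; the function name and the credentials are hypothetical, and this is not Netarchive's actual harvest-template configuration.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict[str, str]:
    """Build the Authorization header used by HTTP Basic authentication."""
    # The scheme sends base64("username:password") after the word "Basic".
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Hypothetical credentials, for illustration only.
headers = basic_auth_header("archive-user", "secret")
print(headers)
```

A crawler that supports this scheme would attach such a header to each request for the protected site, which is why the credentials have to be configured per harvest.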

6 Netarchive – Plans for the next year
• Improvement of user access (DigHumLab project)
• DigHumLab is a digital research infrastructure project.
• Improvement of Wayback for Netarchive in cooperation with our users/media researchers.
• A grant from a fund made it possible to employ a developer: the goal is to offer ways of accessing archived content other than browsing URLs.
• Implementation of Solr?

7 Netarchive – Plans for the next year
• Automation of screencasts
• Development of a kind of on/off switch for a screencast tool, in conjunction with a research project on cross-mediality.
• We hope it will be useful for event harvests (as we are not able to capture streaming).

8 Netarchive – Plans for the next year
• TwitterVane
• Learn about and test the IIPC tool:
• "The TwitterVane tool will allow capture of URL's related to a specific topic of interest in a collection. It extracts and analyses URLs embedded in the tweet to allow reporting on top URLs and domains for a given collection. This is a prototype application and not yet fully developed. You are free to download the code for your own use but at your own risk." (http://www.netpreserve.org/projects/twittervane)
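The idea described in the quote, extracting URLs embedded in tweets and reporting the top URLs and domains, can be sketched in a few lines. This is not TwitterVane's code: the regular expression, the function names, and the sample tweets are all assumptions made for illustration, and real tweets usually carry shortened links that would need to be resolved first.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Simplified pattern (an assumption): matches http/https links up to whitespace.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_urls(tweets):
    """Return all URLs found in the tweet texts, in order of appearance."""
    urls = []
    for text in tweets:
        for match in URL_PATTERN.findall(text):
            # Drop punctuation that merely ends the sentence.
            urls.append(match.rstrip(".,;:!?)"))
    return urls


def top_domains(urls, n=3):
    """Count how often each host name occurs and return the n most common."""
    domains = Counter(urlparse(url).netloc.lower() for url in urls)
    return domains.most_common(n)


# Made-up sample data standing in for a collected set of tweets.
sample_tweets = [
    "Election night coverage: http://example.org/live and http://example.com/results.",
    "More at http://example.org/candidates",
    "No links in this one",
]

urls = extract_urls(sample_tweets)
print(Counter(urls).most_common(2))
print(top_domains(urls))
```

The resulting list of URLs could then serve as seeds for an event harvest, with the domain counts indicating which sites matter most for the topic.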
