1 Advanced Archive-It Application Training: Quality Assurance October 17, 2013.

Presentation transcript:

1 Advanced Archive-It Application Training: Quality Assurance October 17, 2013

2 Goals
Use the tools within the Archive-It web application effectively to get the best-quality capture possible of your archived content, including the embedded resources necessary to the display and functionality of all in-scope content.
See the recorded training videos for more detailed information about crawl scoping.

3 Quality Assurance Tips
1. Prioritize crawls and websites within your collection to use your time effectively
2. Review reports, including the QA report
3. Browse your websites
– Wayback QA
– Proxy Mode

4 Reviewing Reports
How to make the most of your time reviewing reports:
– Review high-level reports first (Seed Status and Seed Source) for seed-level issues
– Then review more detailed reports (Hosts report and file-type-specific reports)
– Run a QA Report to see if any embedded content on your seed pages was not captured

5 Seed Status Report
Are there any seeds not being crawled?
– Double-check that your seed URLs are correct
– Ignore robots.txt

6 Seed Source Report
Are there any seeds that are capturing far fewer or far more URLs than others?
– Fewer: Was the seed “Not Crawled” in the Seed Status report?
– More: Check the Hosts report for any obvious area to limit your crawl
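The "far fewer or far more" check above can be sketched as a comparison against the median count across seeds. This is only an illustration: the seed URLs, the counts, and the `factor` threshold are hypothetical, and the numbers would have to be copied by hand from the Seed Source report, since the slides do not describe an export format.

```python
from statistics import median

def flag_outlier_seeds(url_counts, factor=10):
    """Return (fewer, more): seeds whose URL count is at least `factor`
    times below or above the median count across all seeds.
    `factor` is an arbitrary threshold chosen for this sketch."""
    if not url_counts:
        return [], []
    mid = median(url_counts.values())
    fewer = sorted(s for s, n in url_counts.items()
                   if n < mid and n * factor <= mid)
    more = sorted(s for s, n in url_counts.items()
                  if n > mid and n >= mid * factor)
    return fewer, more

# Hypothetical seeds and counts, for illustration only.
counts = {
    "http://example.org/": 1200,
    "http://example.org/news/": 950,
    "http://example.org/blog/": 3,
    "http://example.org/calendar/": 48000,
    "http://example.org/about/": 1100,
}

fewer, more = flag_outlier_seeds(counts)
print("Far fewer URLs (check Seed Status report):", fewer)
print("Far more URLs (check Hosts report):", more)
```

The median is used rather than the mean so that one very large seed does not hide the others.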

7 Hosts Report
Are there numbers in the “Queued” or “Robots.txt Blocked” column?
– Check the URL lists to see if you want to capture these URLs or not
Are there hosts with fewer or more archived URLs than you expected?
– Fewer: Are any expected URLs “Out of Scope”?
– More: Are there parts of the site or specific URLs you want to block?
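If the Hosts report figures are available as a spreadsheet or CSV file, the first question above can be sketched as a simple filter. The column names (`host`, `archived`, `queued`, `robots_blocked`) and the sample rows are assumptions made for this example, not the actual layout of an Archive-It report.

```python
import csv
import io

def hosts_needing_review(csv_text):
    """Return (host, queued, robots_blocked) for every row with a
    non-zero count in either column. Column names are hypothetical."""
    flagged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        queued = int(row["queued"] or 0)
        blocked = int(row["robots_blocked"] or 0)
        if queued > 0 or blocked > 0:
            flagged.append((row["host"], queued, blocked))
    return flagged

# Hypothetical report rows, for illustration only.
sample = """host,archived,queued,robots_blocked
example.org,1200,0,0
cdn.example.org,310,0,42
calendar.example.org,48000,9500,0
"""

for host, queued, blocked in hosts_needing_review(sample):
    print(f"{host}: queued={queued}, robots.txt blocked={blocked}")
```

Each flagged host still needs a human decision: the URL lists show whether the queued or blocked content is worth capturing.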

8 File Type/PDF/Videos Reports
Are there file types you expected to archive that were not archived?
– Check the “Out of Scope” column of the Hosts report for files not captured

9 QA Report
Is there embedded content on your seed pages that was not captured?
– Run a patch crawl!
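To make "embedded content that was not captured" concrete, here is a minimal sketch that lists the images, scripts, and stylesheets a page references and subtracts a set of captured URLs. It is not how the QA Report is generated; the page, the URLs, and the `missing_embeds` helper are hypothetical and only illustrate the idea.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class EmbeddedResourceParser(HTMLParser):
    """Collect absolute URLs of images, scripts, and stylesheets."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.resources.add(urljoin(self.base_url, attrs["src"]))
        elif (tag == "link" and attrs.get("href")
              and "stylesheet" in (attrs.get("rel") or "").lower()):
            self.resources.add(urljoin(self.base_url, attrs["href"]))

def missing_embeds(page_url, html, captured_urls):
    """Return embedded resource URLs that are not in captured_urls."""
    parser = EmbeddedResourceParser(page_url)
    parser.feed(html)
    return sorted(parser.resources - set(captured_urls))

# Hypothetical seed page and capture list, for illustration only.
page_url = "http://example.org/"
html = (
    '<html><head><link rel="stylesheet" href="/css/site.css">'
    '<script src="js/app.js"></script></head>'
    '<body><img src="http://cdn.example.org/logo.png"></body></html>'
)
captured = {"http://example.org/", "http://example.org/css/site.css"}

for url in missing_embeds(page_url, html, captured):
    print("Not captured:", url)
```

In this made-up case the script and the logo are missing, which is the kind of gap a patch crawl is meant to fill.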

10 QA Report Quickly see from the Reports menu which crawls you have run a QA report for already.

11 Reviewing Archived Websites
How to make the most of your time reviewing archived websites:
– Browse in Proxy Mode
– Use Wayback QA to check for missing URLs that may be important to the display and functioning of the website
– Be sure to check pages that are heavy in JavaScript or video files, to ensure that content was archived

12 Proxy Mode
Why is this helpful?
– Browsing in proxy mode ensures that you are only seeing archived versions of files, and that no content is coming from the live web
– Sometimes sites that are heavy in JavaScript display more fully in proxy mode, so this can help you confirm that content was captured

13 Proxy Mode Live demonstration

14 Wayback QA
Why is this helpful?
– Wayback QA allows you to perform automated quality assurance work as you’re browsing through your archived pages in Wayback.
– Wayback QA will note any missing files from the pages you view and allow you to run a patch crawl in order to capture these files and improve the display of your archived pages.

15 Wayback QA Live Demonstration

16 Wayback QA - Tips
– Browse through all of the sites that you would like to QA before running a patch crawl; you can do one patch crawl across your entire collection.
– Sometimes Wayback QA can be an iterative process.
– Ignoring robots.txt for a patch crawl does not change crawl settings for future crawls.

17 Wayback QA vs. QA Report
Wayback QA:
– Immediately check for missing resources
– Can be conducted on any page
– Occurs while browsing in Wayback
– Patch crawl: selective
QA Report:
– Takes 24 hours to generate after content is in Wayback
– Includes initial seed pages
– Tied to a specific crawl report
– Patch crawl: all or nothing

18 Potential Workflow
1. After the crawl completes, log in to the web application
2. Analyze reports: any surprises?
3. Check pages in Wayback: any surprises?
4. Request a QA Report and run a patch crawl
5. In archive mode, run Wayback QA on necessary seeds, as well as some linked content or pages that may not have archived well. Optional: compare sites in Proxy Mode versus archive mode
6. Run a patch crawl from Wayback QA
7. Check for improvements to archived content
8. Use the “Submit a Question” link to get further help and guidance for difficult-to-archive sites
What is your workflow like?

19 Questions? Please take our quick survey to let us know what you thought about today’s training, and any suggestions or ideas you have for further Archive-It trainings! (see Webex chat for link)