An Arizona Model for Capturing and Describing Documents on the Web Richard Pearce-Moses Director of Digital Government Information Arizona State Library,




Presentation transcript:

An Arizona Model for Capturing and Describing Documents on the Web Richard Pearce-Moses Director of Digital Government Information Arizona State Library, Archives and Public Records rpm at lib.az.us

What Does WWW Stand For?
World Wide Web or Wild, Wild West? They both abbreviate to WWW
Rugged Individualism
Lack of standards ~ Lawlessness
[Collage of Robert Conrad as James West in the Wild, Wild West removed to avoid violation of copyright.]

The Dream
To collect, manage, preserve, and make useful the enormous amount of digital information our culture is now producing

The Reality
Two Approaches
  Bibliocentric (Item-by-Item)
  Tech-centric (Capture-It-All)
Emphasis on Software Tools and Technology
Limited Assistance from Content Providers

Library of Congress & NDIIPP University of Illinois at Urbana-Champaign School of Library Information Science OCLC Content Providers Tufts University Perseus Project Michigan State University Library State libraries: Arizona, Connecticut, Illinois, North Carolina, Wisconsin UIUC partners: NCSA, WILL-AM/FM/TV, Information Management Services

Digital Archives
Libraries
  Artificial collections
  Item Level Control
Archives
  Provenance
  Original Order
  Hierarchy
  Aggregate Control

Websites as Archival Collections
Documents of Common Provenance
Organized into Directories (Archival Series)
Publications v. Records

The Art and Craft of Building a Collection
What we do remains the same
How we do it will change
  Identification/Selection
  Acquisition
  Description
  Reference
  Preservation

Identification: Where Do We Look?
Finding the Forest
  az.gov
  state.az.us
Domain Tool
  Identifies all distinct domains
  Reports new sites since previous spider
  Reports when sites disappear
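The three Domain Tool functions listed above can be illustrated with a small sketch. It is not the Arizona tool itself: the function names `distinct_domains` and `compare_crawls` and the sample host names are hypothetical, chosen only to show one way of reducing spidered URLs to distinct hosts and comparing two spider runs.

```python
from urllib.parse import urlsplit


def distinct_domains(urls):
    """Return the set of distinct host names found in a list of URLs."""
    hosts = set()
    for url in urls:
        host = urlsplit(url).hostname  # lowercased by urlsplit; None if absent
        if host:
            hosts.add(host)
    return hosts


def compare_crawls(previous, current):
    """Report hosts that appeared or disappeared between two spider runs."""
    return {
        "new": sorted(current - previous),
        "disappeared": sorted(previous - current),
    }


# Hypothetical spider output from two runs (host names are illustrative only).
previous = distinct_domains([
    "http://www.az.gov/index.html",
    "http://water.az.gov/reports/",
    "http://old.state.az.us/",
])
current = distinct_domains([
    "http://www.az.gov/index.html",
    "http://WATER.az.gov/reports/2004.pdf",
    "http://drought.az.gov/",
])

report = compare_crawls(previous, current)
print(report)
```

Treating each run as a set makes "new since previous spider" and "disappeared" simple set differences. A production tool would presumably also need to handle redirects and aliases, which this sketch ignores.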

Selection: Which Collections Do We Harvest?
Collection-Level Analysis
  Macro appraisal sets priorities
  Materials appraised as series
Content Providers Taxonomy Tool
  Names
  Administrative history
  Relationships
  Subjects
  Functions

Selection: Which Documents Do We Harvest?
Identify Series
  Aggregate selection
  Set frequency of harvests
Site Analysis Tool
  Display structure
  Harmonize physical, intellectual structure
  Identify inaccessible content
  Show what’s new
  Show significant changes
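As a rough illustration of the "show what's new" and "show changes" functions, the sketch below compares two harvests of a site and groups the differences by directory, treating each directory as a series. Everything here is an assumption made for illustration: the `diff_harvests` name, the in-memory `path -> bytes` representation, and the use of a SHA-256 hash. A hash flags any byte-level change, so deciding which changes are *significant* would need further rules that are not shown.

```python
import hashlib
from collections import defaultdict
from posixpath import dirname


def fingerprint(content: bytes) -> str:
    """Hash the content so two harvests can be compared cheaply."""
    return hashlib.sha256(content).hexdigest()


def diff_harvests(old, new):
    """Compare two harvests (dicts of path -> bytes), grouped by directory.

    Returns {directory: {"new": [...], "changed": [...], "removed": [...]}}.
    """
    changes = defaultdict(lambda: {"new": [], "changed": [], "removed": []})
    for path in sorted(new):
        series = dirname(path)
        if path not in old:
            changes[series]["new"].append(path)
        elif fingerprint(old[path]) != fingerprint(new[path]):
            changes[series]["changed"].append(path)
    for path in sorted(old):
        if path not in new:
            changes[dirname(path)]["removed"].append(path)
    return dict(changes)


# Hypothetical harvests of the same site at two points in time.
old_harvest = {
    "/reports/2003.pdf": b"annual report 2003",
    "/reports/plan.html": b"draft plan",
}
new_harvest = {
    "/reports/2003.pdf": b"annual report 2003",
    "/reports/plan.html": b"final plan",
    "/reports/2004.pdf": b"annual report 2004",
    "/news/index.html": b"news",
}

site_changes = diff_harvests(old_harvest, new_harvest)
for series, detail in sorted(site_changes.items()):
    print(series, detail)
```

Grouping by directory mirrors the idea of appraising and harvesting material as series rather than item by item.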

Description
To be able to locate documents
  when the creator or provenance is known
  when the subject is known
and to aid in selection as to character
Series Description
  Make directory name a meaningful title
  Scope and contents note
  High-level subject headings
  Recorded in site analysis tool database
Document Description
  Creator: taxonomy, internal metadata
  Title: from internal metadata, noun phrases
  Subject: from series metadata, internal metadata
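A minimal sketch of document description along these lines, assuming the documents are HTML pages and that "internal metadata" means the `<title>` element and `<meta name="..." content="...">` tags. The `MetaExtractor` class, the `describe` function and the layout of the series record are hypothetical. Deriving titles from noun phrases is not attempted, and the series-level creator stands in for the taxonomy lookup.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect the <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content"):
            self.meta.setdefault(attrs["name"].lower(), []).append(attrs["content"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def describe(html, series):
    """Build a document record from internal metadata plus series metadata."""
    parser = MetaExtractor()
    parser.feed(html)
    return {
        # Prefer internal metadata; fall back to the series-level value.
        "creator": parser.meta.get("creator") or [series["creator"]],
        "title": parser.title.strip() or series["title"],
        # Subjects are inherited from the series and extended by the document.
        "subject": sorted(set(series["subjects"]) | set(parser.meta.get("subject", []))),
    }


# Hypothetical series record and harvested page.
series = {
    "creator": "Governor's Drought Task Force",
    "title": "Reports",
    "subjects": ["water conservation"],
}
page = (
    "<html><head><title> Drought Plan </title>"
    '<meta name="Subject" content="drought"></head><body></body></html>'
)

record = describe(page, series)
print(record)
```

Inheriting subjects from the series means a document with no internal metadata still receives a usable description.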

Access
Finding Aids
  A valuable bird’s-eye view for archivists
  Of limited value to patrons... unless they’re transformed into topic maps
Full Text Search Engines
  Ranking Algorithms
  Categorization / Packaging Results
    Based on series-level metadata
    Based on autoclassification

Description and Access
Series-Level Description
  name=“Creator”: Governor’s Drought Task Force; Rural Watershed Alliance
  name=“Subject”: reservoirs; ground water
  name=“Subject”: drought; water conservation
  name=“Subject”: potable water; agriculture
  name=“Type”: planning; reports
Categorized Results
  Your search for: water, Phoenix
  Found documents in the following categories
    water (500+)
    water conservation (357)
    Salt River Project (210)
    drought (110)
    flood control (98)
    xeriscape (25)
  Found documents from the following agencies
    Water Resources (135)
    Governor's Drought Task Force (102)
    Phoenix (87)
    Maricopa County (84)
    Corporation Commission (35)
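One way to picture how series-level metadata could drive the categorized results shown on this slide: count the subjects and creators attached to each search hit and list them in descending order. This is only a sketch under assumptions. The `categorize` function and the hit records are hypothetical, and the counts it produces are unrelated to the figures on the slide.

```python
from collections import Counter


def categorize(hits):
    """Count search hits by subject category and by creating agency."""
    subjects = Counter()
    agencies = Counter()
    for hit in hits:
        # Count each subject once per document, even if it is repeated.
        subjects.update(sorted(set(hit["subjects"])))
        agencies[hit["creator"]] += 1
    return subjects.most_common(), agencies.most_common()


# Hypothetical hits, each carrying metadata inherited from its series.
hits = [
    {"creator": "Water Resources", "subjects": ["water", "drought"]},
    {"creator": "Water Resources", "subjects": ["water", "water conservation"]},
    {"creator": "Governor's Drought Task Force", "subjects": ["water", "drought"]},
]

by_subject, by_agency = categorize(hits)
for name, count in by_subject:
    print(f"{name} ({count})")
for name, count in by_agency:
    print(f"{name} ({count})")
```

Because the counts come from series-level metadata, documents that have no description of their own can still be grouped under a category.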

Administration / Curation / Stewardship
Systematic
  Regular Workflows
  Not idiosyncratic
Collaborative
  Consensual, Not Idiosyncratic
  Avoid Redundant Efforts
Quality Control
  Need for Good Metrics
  Need for Regular Audits

Stay Tuned....