Robin Butterhof & Deborah Thomas Library of Congress Leah Weinryb Grohsgal National Endowment for the Humanities Digitized Newspapers & Research DPLAfest.

Presentation transcript:

Robin Butterhof & Deborah Thomas Library of Congress Leah Weinryb Grohsgal National Endowment for the Humanities Digitized Newspapers & Research DPLAfest April 16, 2016

NDNP / Chronicling America p.2 NATIONAL ENDOWMENT FOR THE HUMANITIES LIBRARY OF CONGRESS From … history’s markers

“History’s Rough Draft”
Something for everyone: Crime, Fashion, Travel, Economics, Events, Battles, Tragedy, Politics, Social Activism, Diplomacy, Society, Technology …

Working with U.S. Newspapers
- Many types of users, high demand for access
- Newspaper format challenges
  - Physical characteristics: large, brittle, acid paper, poor ink, light damage
  - Content characteristics: many subjects on a page, small text, hard to identify parts
- No single U.S. collection: 153,000 titles published since 1690 (collected across the country)
- Newspapers = fundamentals of U.S. history

National Digital Newspaper Program (2004- )
- Enhance access to American newspapers
- Develop permanent digital resource including selected historic content from all US states and territories
- Shared resources and cost distribution (LC/NEH/Awardees)
- Shared practices/specifications = community
- Paced scalability
- Plan for technical change and sustainability requirements

National Digital Newspaper Program (2016)
- 10.7 million pages online
- Approx. 70+ TB online, 550+ TB archival storage
- 3.9 million visits in 2015 (chroniclingamerica.loc.gov)
- 40 states and territories participating
- Also received:
  - 1000 newspaper history essays
  - 2000 bibliographic titles (of 153,000 titles published)
  - 10,000 reels of microfilm (duplicate print negative)

Chronicling America: Historic American Newspapers

Finding Our History
- Page Search: full text
  - Search by place, time, keyword
  - Page information: Title, Date, Edition, Section, Page (Image)
  - Visual search results (thumbnail view with hit-highlights)
  - Pan and zoom
  - Full-screen view
- US Newspaper Directory Search
  - Search by place, time, keyword, format, subject, etc. (CONSER/WorldCat data)
  - Keyword search, e.g., “http” (external Web site links) or “times”

PARTNERS: 40 institutions | 10.7 million pages now online

Robin’s section
- See notes for text...
- [Put pretty poster here]

- Halibut prices, Seattle
- Influenza Epidemic
- Civil War Editorials
- Mark Twain
- Great Blizzard of 1888

Genealogy and historical romance research... made visible by a Twitter bot

ChronAm: What’s Available

Digitized page images

OCR
  Mars has atmosphere, seasons, land, y!?H water, storms, clouds and mountains. "H Mars has i-wr. "'o - H only 3,700 miles awa.y and revolves around ?!i it ni seven and a half 'houvs ? phoot- fciji': ing star.

Metadata
  "place_of_publication": "Salt Lake City, Utah",
  "lccn": "sn ",
  "start_year": "1890",
  "place": [ "Utah--Salt Lake--Salt Lake City" ],
  "name": "The Salt Lake tribune.",
  "publisher": "Tribune Pub. Co.",
  "url": " json",
  "end_year": "current",
  "issues": [ { "url": " / /ed-1.json", "date_issued": " "

Newspaper Directory Records
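The metadata excerpt on this slide has the shape of a JSON title record. As a minimal sketch of how such a record might be read, the example below parses a sample with the same field names. The values the slide leaves blank (`lccn`, `url`, `date_issued`) are filled with obviously hypothetical placeholders, and `summarize_title` is an illustrative helper, not part of any Chronicling America tool.

```python
import json

# Sample record mirroring the field names shown on the slide.
# Every "PLACEHOLDER" value is hypothetical: the slide elides the real ones.
SAMPLE = """
{
  "place_of_publication": "Salt Lake City, Utah",
  "lccn": "PLACEHOLDER-LCCN",
  "start_year": "1890",
  "place": ["Utah--Salt Lake--Salt Lake City"],
  "name": "The Salt Lake tribune.",
  "publisher": "Tribune Pub. Co.",
  "url": "PLACEHOLDER-URL.json",
  "end_year": "current",
  "issues": [
    {"url": "PLACEHOLDER-ISSUE-URL/ed-1.json", "date_issued": "PLACEHOLDER-DATE"}
  ]
}
"""


def summarize_title(record: dict) -> str:
    """Return a one-line summary of a title record (illustrative helper)."""
    years = f'{record["start_year"]}-{record["end_year"]}'
    issue_count = len(record.get("issues", []))
    return (
        f'{record["name"]} ({record["place_of_publication"]}, {years}): '
        f"{issue_count} issue(s) listed"
    )


record = json.loads(SAMPLE)
summary = summarize_title(record)
print(summary)
```

Because the record is plain JSON, the standard library is enough to read it; no login or special client is involved.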

Usage: Directory Records
Stanford

Usage: OCR
Northeastern
Georgia Tech

Usage: Digitized Page Images
University of Nebraska-Lincoln

ChronAm: How do we make it available?
- Public website
- Open API, no login required
- Industry standard endpoints, like OpenSearch
- Machine-readable views (like JSON)
  - Easier to play with the stuff
- Stable URLs
  - Added bonus: URLs make sense (title/date/page)
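The "URLs make sense (title/date/page)" point can be illustrated with a small URL builder. This is a sketch under stated assumptions: the host comes from the slides and the `ed-1.json` suffix appears in the metadata excerpt, but the `/lccn/.../seq-N/` path layout, the function names, and the LCCN `sn00000000` are assumptions made for illustration and should be checked against the site's own API documentation.

```python
BASE = "https://chroniclingamerica.loc.gov"  # host named in the slides


def issue_url(lccn: str, date_issued: str, edition: int = 1, as_json: bool = False) -> str:
    """Build an issue URL from title (LCCN), date, and edition.

    The path layout is an assumption, not confirmed by the slides.
    """
    url = f"{BASE}/lccn/{lccn}/{date_issued}/ed-{edition}"
    return f"{url}.json" if as_json else f"{url}/"


def page_url(lccn: str, date_issued: str, page: int, edition: int = 1) -> str:
    """Build a page URL: title / date / edition / page sequence number."""
    return f"{BASE}/lccn/{lccn}/{date_issued}/ed-{edition}/seq-{page}/"


# "sn00000000" is a hypothetical LCCN used only to show the pattern.
example_issue = issue_url("sn00000000", "1905-01-01", as_json=True)
example_page = page_url("sn00000000", "1905-01-01", page=3)
print(example_issue)
print(example_page)
```

Because each URL is composed only of title, date, edition, and page, a script can address a page directly once it knows those four values.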

ChronAm: How do we make it available?
- As pre-fab datasets (OCR bags)

Lessons Learned about the API
- Easiest methods are best
  - CSV file
- People will try to use the API first and contact you as a last resort
- Users may underestimate size of files and downloading time
  - 225,000 pages x 5.2MB = 1.2TB = BAD IDEA
  - To avoid: Ask-a-Librarian!
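The size estimate on this slide is easy to reproduce before starting a bulk download. The sketch below redoes the arithmetic with the slide's figures (225,000 pages at 5.2 MB each) and adds a rough transfer time. The 10 Mbit/s connection speed and the function name `estimate_download` are hypothetical, chosen only for illustration; decimal units (1 TB = 1,000,000 MB) are assumed.

```python
def estimate_download(pages: int, mb_per_page: float, mbit_per_s: float):
    """Return (total size in TB, transfer time in days) for a bulk download.

    Uses decimal units: 1 TB = 1,000,000 MB and 1 MB = 8 Mbit.
    """
    total_mb = pages * mb_per_page
    total_tb = total_mb / 1_000_000
    seconds = total_mb * 8 / mbit_per_s
    days = seconds / 86_400
    return total_tb, days


# Slide figures: 225,000 pages x 5.2 MB. The 10 Mbit/s link is an assumption.
total_tb, days = estimate_download(225_000, 5.2, 10)
print(f"{total_tb:.2f} TB, roughly {days:.1f} days at 10 Mbit/s")
```

The exact product is 1.17 TB, which the slide rounds to 1.2 TB; at the assumed speed the transfer would take well over a week, which is the point of the "BAD IDEA" warning.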

Lessons Learned about the API
- Expect the unexpected.

ChronAm: What Users Tell Us They Want
- More stuff
  - Practical concerns
  - Structure of program
  - Copyright issues
- Better OCR
  - OCR problems are big with newspapers (multicolumn layout, microfilming artifacts like uneven lighting, newspapers in bad condition at time of filming, less contrast than a book, etc.). See UNL visual analysis project.

For Libraries: Challenges and Opportunities
- Mixing full-text search and metadata search is hard
  - Lots of bad OCR versus relatively little clean metadata
- Newspapers are serial objects
  - Secondary concerns for monographs (time, place) are critical to newspapers
- Newspapers are big
  - Compared to a book page or photograph, newspaper pages are huge

For Researchers: Challenges and Opportunities
- Going bigger
- Interdisciplinary
- Ability to scale up a project
- Getting stuff across project borders
- Gaps in dataset

Thank you!
- NDNP Public Web
- NDNP Web Service
- Chronicling America: Historic American Newspapers
- Contact us at