Texas Digital Newspaper Program Data What we gather, and how we use it. By Ana Krahmer & Mark Phillips University of North Texas Libraries 4 February 2014 Start Spreading the News!
Overview I. Target Audiences II. Data Collection III. Data Use IV. Questions
Target Audiences Of the Texas Digital Newspaper Program
Target Audiences: K-12 Students & Educators; Higher Education Researchers, including undergraduate, graduate, and faculty researchers; Librarians; Genealogists; Lay-Historians; Lifelong-Learners
Current Classroom Use Teaching with Primary Sources: National History Day; Texas Junior Historians; Texas History, grades 4 & 7. Texas Tech University, Dr. Ann Hawkins’ Texas Manuscript Cultures Online, undergraduate and graduate Book History and Research Methods courses. University of North Texas, Dr. Andrew Torget’s courses, History of Texas and American History courses. Austin College, Dr. Light Cummins’ Research Methods.
Library Use Texas Digital Newspaper Program Feedback utility Partner institution pages Newspaper Program Traveling Banner Links to share on social networking feeds
Genealogists: Full-text searchability; Search content highlighting; Full metadata records; Faceted links
Lay-Historians: Annual Digital Frontiers Conference; Free and open access to newspaper content; Zoomable views; Permissions to use in research, with citations
Lifelong-Learners: Emeritus College students; Partner public library patrons; Local civic interest groups
Data Collection Where did it all come from?
Data Collection Qualitative: Surveys, Feedback responses, grant report comments, partner communications Quantitative: Analysis of collection usage, geographic origin, contributing partners, annual additions to collection
Data Collection (Qualitative) Grant-funded projects require final reports from TDNP partner institutions. In 2012, Kathleen Murray and Dreanna Belden launched an impact survey for Portal to Texas History users. The Portal feedback database offers years of user questions.
Data Collection (Qualitative) “The newspaper digitization project places our library in position to reach out toward the future. By taking a piece of the past and bringing it with us, we are sure to grow and learn, appreciate and respect what was, what is and what will be.” - Whitaker, L. (2013). Richard S. & Leah Morris Memorial Library Final Grant Project Report to the Tocker Foundation.
Data Collection (Qualitative) The Portal to Texas History Impact Survey was launched in 2012 by Murray and Belden. Of 573 respondents, 36% self-identified as genealogists, 19% as lifelong-learners, 19% as historians, 6% as librarians, 5% as students, and 15% as “other.” 93 individual comments within this survey cited the newspapers as being of especial value.
Data Collection (Quantitative) Descriptive metadata for the Texas Digital Newspaper Program collection is available via OAI-PMH in multiple formats. We used a Python-based harvester, pyoaiharvester, to collect metadata from the TDNP OAI repository endpoint. All years combined total 165,298 metadata records. Phillips then wrote a Python script to parse the records and extract relevant information. The script takes the output of pyoaiharvester as input and returns a tab-delimited file with one newspaper issue per row.
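The parsing step can be illustrated with a minimal sketch. It is not the script used for this project. It assumes the harvested file holds OAI-PMH `record` elements in the `oai_dc` format, that the ARK appears inside a `dc:identifier` value, and that `dc:date` begins with the four-digit issue year. The function names and the sample record are hypothetical, and only four of the eight fields are extracted.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
OAI_RECORD = "{http://www.openarchives.org/OAI/2.0/}record"
NS = {"dc": "http://purl.org/dc/elements/1.1/"}


def record_to_row(record):
    """Return (ark, year, decade, title) for one OAI-PMH <record> element."""
    identifiers = [e.text or "" for e in record.iterfind(".//dc:identifier", NS)]
    # Assumption: the ARK is embedded somewhere in a dc:identifier value.
    ark = next((i[i.index("ark:"):].rstrip("/") for i in identifiers if "ark:" in i), "")
    # Assumption: dc:date starts with the four-digit year of the issue.
    date = record.findtext(".//dc:date", default="", namespaces=NS).strip()
    year = date[:4] if date[:4].isdigit() else ""
    decade = str(int(year) // 10 * 10) if year else ""
    title = record.findtext(".//dc:title", default="", namespaces=NS).strip()
    return ark, year, decade, title


def records_to_tsv(xml_text, out):
    """Write one tab-delimited row per record that carries an ARK."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for record in ET.fromstring(xml_text).iter(OAI_RECORD):
        row = record_to_row(record)
        if row[0]:  # deleted records have no metadata, so no ARK
            writer.writerow(row)


# Hypothetical sample record built from the example values in the field list.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
<ListRecords><record><header/><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title>The Tattler</dc:title>
<dc:date>1934</dc:date>
<dc:identifier>ark:/67531/metapth16320</dc:identifier>
</oai_dc:dc></metadata></record></ListRecords></OAI-PMH>"""

buffer = io.StringIO()
records_to_tsv(SAMPLE, buffer)
print(buffer.getvalue(), end="")
```

Deriving the decade from the year at parse time keeps the later counting steps simple, since every analysis can then group on a ready-made column.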
The fields* in the tab-delimited file are:

Field         Field Description                               Example Data
ARK           ARK Identifier for issue                        ark:/67531/metapth16320
Partner       Contributing Partner Code                       BDPL
Year Online   Year issue went online (prefixed with “od:”)    od:2006
Year          Year of newspaper issue                         1934
Decade        Decade of newspaper issue                       1930
County        County of newspaper issue                       Palo Pinto County
Community     Community of newspaper issue                    Mineral Wells
Title         Title of newspaper issue                        The Tattler

*All values in the Partner field can be resolved from the controlled vocabulary:
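Number of issues added per year (n=165,149). Columns: Year; # of Issues; Cumulative; % of Issues; Cumulative %. Recoverable values: cumulative % of 0.03% and 0.06% in the rows before 2009; 7,263 issues in 2009 (4.46% cumulative); cumulative % of 31.56%, 51.30%, 69.95%, and 99.92% in the rows that follow.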
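A per-year table of this kind can be derived from the tab-delimited file by counting rows per "Year Online" value and keeping a running total. The sketch below assumes the columns appear in the order of the field list, so "Year Online" is the third column. The function name and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter


def issues_per_year(tsv_file, year_online_col=2):
    """Return rows of (year, issues, cumulative, pct_of_issues, cumulative_pct)."""
    counts = Counter()
    for row in csv.reader(tsv_file, delimiter="\t"):
        value = row[year_online_col]
        # The "Year Online" field carries an "od:" prefix, e.g. "od:2006".
        if value.startswith("od:"):
            value = value[3:]
        counts[value] += 1

    total = sum(counts.values())
    table, running = [], 0
    for year in sorted(counts):
        running += counts[year]
        table.append((
            year,
            counts[year],
            running,
            round(100 * counts[year] / total, 2),
            round(100 * running / total, 2),
        ))
    return table


# Made-up rows: ARK, Partner, Year Online (remaining columns omitted).
sample = io.StringIO(
    "ark:/00000/a\tXXXX\tod:2006\n"
    "ark:/00000/b\tXXXX\tod:2009\n"
    "ark:/00000/c\tXXXX\tod:2009\n"
    "ark:/00000/d\tXXXX\tod:2009\n"
)
for line in issues_per_year(sample):
    print(line)
```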
Number of counties added per year (n=109). Columns: Year; # of Titles Added; # of New Titles Added; Cumulative # of Titles.
Number of communities added per year (n=142). Columns: Year; # of Communities Added; # of New Communities Added; Cumulative # of Communities.
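The "added", "new", and "cumulative" columns in these per-year tables can be computed the same way for any field (county, community, title, or partner). The sketch below rests on an interpretation that the slides do not spell out: "added" counts distinct values with issues going online in a year, and "new" counts values appearing for the first time in that year. The function name and sample pairs are hypothetical.

```python
from collections import defaultdict


def added_per_year(pairs):
    """pairs: iterable of (year_online, value), one pair per newspaper issue.

    Returns rows of (year, added, new, cumulative), where "added" is the
    number of distinct values seen that year, "new" is how many of those had
    not appeared in any earlier year, and "cumulative" is the running count
    of distinct values.
    """
    by_year = defaultdict(set)
    for year, value in pairs:
        by_year[year].add(value)

    seen = set()
    table = []
    for year in sorted(by_year):
        new = by_year[year] - seen
        seen |= new
        table.append((year, len(by_year[year]), len(new), len(seen)))
    return table


# Made-up example: communities A and B go online in 2006, B and C in 2007.
sample_pairs = [
    ("2006", "A"), ("2006", "B"),
    ("2007", "B"), ("2007", "C"), ("2007", "C"),
]
for line in added_per_year(sample_pairs):
    print(line)
```

Using sets per year means repeated issues from the same community or title are counted once, which is what a "number of communities" column needs.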
Counties currently represented in the TDNP Collection
Number of partners added per year (n=48). Columns: Year; # of Partners Added; # of New Partners Added; Cumulative # of Partners.
TDNP Partner Institutions

Partner Type                         # of Partners by Type   # of Issues
Public Libraries                     27                      66,320
Academic Libraries & Archives*       13                      41,308
Genealogical/Historical Societies    4                       5,065
Museums                              2                       2,544

* “UNT Archives” and “UNT Libraries” are two partner institutions, thus actually totaling to 48 partners. This table indicates content whose digitization was funded by external partners. Content funded internally by UNT totals to 49,805 and has been removed from this table.
Google Fusion Tables map, derived from TDNP Newspaper Locations: iontables/embedviz?q=select +col1+from+17utAXOiLgXhaE XlHgsIfE2DJ_2OlQyD-XIR- LZU&viz=MAP&h=false&lat= &lng= &t=1&z=6 &l=col1&y=2&tmplt=2&hml= GEOCODABLE
External References
Python Metadata Extraction Script:
Texas Digital Newspaper Program OAI API:
PYOAIHarvester Script:
Belden, Dreanna & Murray, Kathleen R. Where do users find value? UNT Digital Library. Accessed January 27,
Questions? Ana Krahmer Mark Phillips