The National Digital Newspaper Program (NDNP) An NEH/LC Collaborative Program Enhancing access to historical newspapers Release: September 2006
2 NDNP Mission Enhance access to all American newspapers Improve access to products of United States Newspaper Program (USNP) using current technologies Establish standards and “best practices” for newspaper digital reformatting and access Use multi-phased approach for research and scaled development Develop geographically-diverse program that benefits all US communities
3 Why Newspapers? Newspapers: a unique resource for understanding the fundamentals of history –Democracy, free press, diverse geographic viewpoints at the community level Enormous corpus of newspapers presents an archival challenge Text-intensive layout is labor-intensive to search without reference tools Digitization of microfilmed corpus economically feasible
4 Why a National Effort? Voluminous, distributed collections No one institution holds the “master collection” Broad user-base for newspaper material Think nationally, select locally Comprehensive chronological coverage, eventually Need for leadership to build on past national efforts (USNP)
5 LC’s Historical Newspaper Activities 20-year NEH/LC collaboration of USNP –Existing national network of cooperative programs –Standards established for preservation microfilm –Standards established for descriptive metadata/cataloging American Memory’s “Stars and Stripes” –http://memory.loc.gov/ammem/sgphtml/sashtml/sashome.html –Proof-of-concept for historical newspaper format and description
6 What will NDNP Produce? Web access to –National directory of US newspaper holdings (what, when, where) – based on USNP legacy data –More than 30 million page images of historical newspapers digitized primarily from microfilm, with full-text –Historical context of newspaper, printing tech, etc Depository of duplicate digitized microfilm at LC
7 How? Multi-partner program –NEH: Funds the program (“We the People” initiative) –LC: Aggregates, preserves and serves –Awardees: Select and convert Phase I – FY04-FY06 (Test bed) NEH awardees (up to 10) with existing digital collections infrastructure and master microfilm negatives 100,000 pages each + 100,000 LC pages by 2007 (from ) Microfilm reel analysis for research
8 Phase I Timeline 2004 July – NEH cooperative agreement guidelines issued, LC technical architecture under development October – Application deadline; 15 applications received 2005 April – NEH Awards announced May – Award conference held at LC 2006 September – NDNP application publicly available via Web
9 NDNP, September 2006 Web access - American Chronicle –Newspaper Title Directory, 1693-present –Full-text of content w/in visual newspaper layout (page-level access) –Contextual historical material (Encyclopedia) Converted content from all awardees –Initial time period covered:
10 Newspaper Title Directory Re-use of CONSER and Newspaper Union List, created under USNP (maintained by OCLC) 147,000 newspaper titles 900,000 holdings records Searchable, Web access to all USNP-collected data, tied to digitized issues when available, as well as external newspaper Web sites
11 Full Text with Page-level Access Preserves integrity of primary historical content, text in context Minimal metadata required to achieve reasonable search results Economics of large-scale, large-format digitization Allows creation of substantial content-base for research and development on additional search strategies and technologies
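Page-level full-text access can be pictured as an index that maps each word to the pages on which it appears, with no article-level segmentation. The following is a minimal sketch of that idea only; the page identifiers, sample text, and function names are hypothetical and do not describe the NDNP implementation.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_page_index(pages):
    """Map each token to the set of page IDs whose OCR text contains it.

    `pages` is a dict of {page_id: uncorrected OCR text}; both are
    illustrative stand-ins, not NDNP data structures.
    """
    index = defaultdict(set)
    for page_id, text in pages.items():
        for token in tokenize(text):
            index[token].add(page_id)
    return index


def search(index, query):
    """Return the page IDs that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical sample pages for illustration.
pages = {
    "issue1-p1": "Free press in the territory",
    "issue1-p2": "Market prices and shipping news",
}
index = build_page_index(pages)
hits = search(index, "free press")
```

Because the unit of retrieval is the whole page, a hit returns the page image itself, so the reader sees the matched text in its original layout.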
12 Digital Asset Specifications Page Image - grayscale, 400 dpi, from microfilm TIFF 6.0; JPEG 2000 (.jp2); PDF with Hidden Text OCR XML – NDNP/ALTO Schema Page-level, uncorrected, column zones with “bounding box” mapping coordinates Metadata XML in METS/MODS for digital objects
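To illustrate how “bounding box” coordinates in the OCR XML can tie a search hit back to a location on the page image, here is a minimal sketch that reads word boxes from an ALTO-style document. The sample XML is a simplified, hypothetical fragment rather than actual NDNP output; it relies only on the standard ALTO `String` attributes `CONTENT`, `HPOS`, `VPOS`, `WIDTH`, and `HEIGHT`, and the function names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical ALTO-style fragment (no namespace, one text line).
SAMPLE_ALTO = """<alto>
  <Layout>
    <Page ID="P1" WIDTH="6000" HEIGHT="8000">
      <PrintSpace>
        <TextBlock ID="TB1" HPOS="100" VPOS="200" WIDTH="1200" HEIGHT="400">
          <TextLine HPOS="100" VPOS="200" WIDTH="1200" HEIGHT="60">
            <String CONTENT="FREE" HPOS="100" VPOS="200" WIDTH="300" HEIGHT="60"/>
            <SP/>
            <String CONTENT="PRESS" HPOS="420" VPOS="200" WIDTH="380" HEIGHT="60"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def local_name(tag):
    """Strip any '{namespace}' prefix so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def words_with_boxes(alto_xml):
    """Return a list of (word, (hpos, vpos, width, height)) tuples."""
    root = ET.fromstring(alto_xml)
    words = []
    for elem in root.iter():
        if local_name(elem.tag) == "String":
            box = tuple(
                float(elem.get(key, 0))
                for key in ("HPOS", "VPOS", "WIDTH", "HEIGHT")
            )
            words.append((elem.get("CONTENT", ""), box))
    return words


def find_word(words, query):
    """Return the bounding boxes of every case-insensitive match of `query`."""
    wanted = query.lower()
    return [box for text, box in words if text.lower() == wanted]


words = words_with_boxes(SAMPLE_ALTO)
press_boxes = find_word(words, "press")
```

A viewer could use boxes like these to highlight matched words on the page image. The units of the coordinates depend on the measurement unit declared in the ALTO file.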
13 Historical Context An Encyclopedia of Newspaper History Brief essays for each title digitized –Publisher, geography, significant events covered, audience/community, politics History of each participating state and the role of newspapers in its history Presentations for technology developments, significant people, places, etc
14 Future Phases: Addition of new partners (continuation of Phase I test bed, to represent all 54 states and territories) Increased efficiency in workflows, tools, technology, sustainable resources Additional access capabilities, improved technology Aggregate ~ Preserve ~ Serve
15 For more information, contact Georgia Higley Head, Newspaper Section Serials and Government Publications Division Library of Congress