WORKING WITH VENDORS: THE UCONN “DAILY CAMPUS” STUDENT NEWSPAPER DIGITAL REFORMATTING CASE STUDY DIGITAL COMMONWEALTH ANNUAL CONFERENCE MAY 1, 2013 DEVENS.

WORKING WITH VENDORS: THE UCONN “DAILY CAMPUS” STUDENT NEWSPAPER DIGITAL REFORMATTING CASE STUDY
DIGITAL COMMONWEALTH ANNUAL CONFERENCE, MAY 1, 2013
DEVENS COMMONS CENTER, DEVENS, MA
Michael J. Bennett, University of Connecticut Libraries, Storrs, CT, USA

Define What You’re Trying to Do
Ask yourself:
- What are you trying to reformat (original format description and amount)?
- How will the resulting digital files be used?
- Will any post-processing be needed to allow for these anticipated uses?
Let the answers to these basic questions guide your imaging requirements, specifications, benchmarks, and expected deliverables.

Document The Request
- Turn your answers into something that can be coherently communicated to local stakeholders and possible vendors.
- This can take the shape of a formal RFP or just a bulleted narrative.
- The important thing is to document your request early on and then fine-tune as needed.
- Doing so makes what you are trying to accomplish clearer from the outset, not only to external stakeholders but also to yourself and your own institution.

What We Had
- Daily Campus: UConn student newspaper
- 53 Reels of Preservation Microfilm
- Pub. Date 1896 – 1990
- Roughly 38,500 Film Frames
- 2 Page Spread Per Frame: 77,000 Total Pages
- Pages Microfilmed Sequentially Across Reels
- No Inherent Issue-by-Issue Segmentation
- 6,252 Total Issues

What We Wanted…

What We Thought We Needed
Archival Still Image Master Files For All Pages:
- According to NDNP Specification
- One Image Per Page
- Grouped Into Reel>Issue>Nested Folders

What We Thought We Needed
NDNP Specification Archival Master:
- Conforms with TIFF 6.0
- 8-bit grayscale
- 400 dpi preferred
- Uncompressed
- Only de-skewing should be applied
- Cropped to page edge
- TIFF tags required for preservation*
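
A rough sketch of how the archival master settings above translate into file size. The 11 x 17 inch page is a hypothetical example, not a measurement of the Daily Campus, and the function name is illustrative only. It counts pixel data and ignores the TIFF header and tags.

```python
def uncompressed_tiff_bytes(width_in, height_in, dpi=400, bits_per_pixel=8):
    """Approximate pixel-data size of an uncompressed scan (header and tags ignored)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8

# Hypothetical page dimensions, chosen only for illustration.
size = uncompressed_tiff_bytes(11.0, 17.0)
print(f"{size / 1_000_000:.1f} MB")  # prints "29.9 MB"
```

Because size grows with the square of the resolution, dropping from 400 dpi to 300 dpi would cut the pixel data to a little over half.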

What We Thought We Needed
NDNP Specification Production Master:
- Conforms with JPEG 2000, Part 1 (.jp2)
- Use 9-7 irreversible (lossy) filter
- Compressed to 1/8 of the TIFF
- Tiling, but no precincts
- RDF/Dublin Core metadata in XML box*

What We Thought We Needed
PDF Files For Delivery:
- OCR’d
- One PDF per Issue (Page Image Segmentation)
- PDF Nested Into Reel>Issue>Folders
- NDNP Stipulates 150ppi, Medium Quality JPEG Image Layer*

What The Vendor Said
Still Image Master Files For All Pages:
- According to NDNP Specification? A: YES. TIFF, lossy JPEG 2000, PDF.
- One Image File Per Page? A: YES. 2-up page frames split.
- Grouped Into Reel>Issue>Page Nested Folders? A: NO. Issue-level segmentation costs $0.88 per issue: 6,252 issues x $0.88 = $5,501.76 extra (ouch). Without it, all files arrive in a single Reel-level folder.
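
The segmentation quote is simple arithmetic, but it is worth checking with exact decimal math before it goes into a budget request. A minimal sketch using the issue count and per-issue price from the slide; the variable names are illustrative only.

```python
from decimal import Decimal

issues = 6252                      # total issues in the run
price_per_issue = Decimal("0.88")  # vendor's quoted segmentation price

extra = issues * price_per_issue
print(f"${extra:,.2f}")  # prints "$5,501.76"
```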

What The Vendor Said
PDF Files For Delivery:
- OCR’d? A: YES
- One PDF per Issue (Page Image Segmentation)? A: NO, see previous slide
- PDF Nested Into Reel>Issue>Folders? A: NO. Individual PDF files per page in a single Reel-level folder…

So, What We Needed To Do Then
PDF Files For Delivery:
- Segment Our Own Issues Visually
- Adobe Bridge/Acrobat
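
The visual segmentation itself was done by people in Bridge and Acrobat. As a hedged sketch of the bookkeeping around it, the function below assumes a reviewer has recorded the first page file of each issue; it then groups a reel's ordered page files into issues. The file names and issue labels are hypothetical, and this is not the workflow UConn actually scripted.

```python
from bisect import bisect_right

def group_pages_by_issue(pages, issue_starts):
    """Split an ordered list of page files into issues.

    pages: page file names in reel order.
    issue_starts: dict mapping the first page file of each issue to an issue label.
    """
    index = {name: i for i, name in enumerate(pages)}
    starts = sorted((index[name], label) for name, label in issue_starts.items())
    if not starts or starts[0][0] != 0:
        raise ValueError("the first page of the reel must start an issue")
    positions = [pos for pos, _ in starts]
    issues = {label: [] for _, label in starts}
    for i, page in enumerate(pages):
        # The issue a page belongs to is the last one starting at or before it.
        _, label = starts[bisect_right(positions, i) - 1]
        issues[label].append(page)
    return issues

# Hypothetical reel of six page images containing two issues.
pages = [f"{n:04d}.tif" for n in range(1, 7)]
issue_starts = {"0001.tif": "issue-A", "0005.tif": "issue-B"}
grouped = group_pages_by_issue(pages, issue_starts)
print(grouped["issue-B"])  # prints "['0005.tif', '0006.tif']"
```

The resulting mapping could drive either folder creation (Reel>Issue) or the list of page images merged into each issue-level PDF.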

Problems
PDF Files For Delivery:
- OCR vanishes when the vendor’s per-page PDFs are merged into single issues in Acrobat.
- The supplied 150ppi PDFs (NDNP specs) didn’t look so great, at least when compared to what we knew we visually had in the TIFF archival masters.
- So…

Solution PDF Files For Delivery:  We crafted our own merged, issue-level PDFs in Acrobat from the excellent TIF masters instead of the vendor’s individual PDFs.  300ppi: Noticeably sharper, better for printing, but larger file sizes.  So, we did some analysis before proceeding full steam…

Solution PDF Files For Delivery:  We came up with an average issue size in pages.  And paid attention to the largest issues that we saw throughout the run.  Finally, we felt that the merged 300ppi PDF file size hit was worth the extra image quality and flexibility it gave to end users for printing, image manipulation, etc.

Solution PDF Files For Delivery:  Our largest PDF issues are roughly 30MB  Average, roughly 10MB  In the end, as bandwidth continues to become more robust over time, the size of these files may be viewed as trivial…

Question We Asked Ourselves
Since the TIFFs and the JPEG 2000 master files contain the same visual content, should we archive just the smaller JPEG 2000s as archival masters?
77,000 total pages:
- Ave. TIFF page image 18.5MB > total storage = 1.4TB
- Ave. JP2 page image 2.2MB > total storage = 169GB
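
The storage totals follow directly from the page count and the average file sizes. A minimal sketch of that arithmetic, using decimal units (1GB = 1,000MB); the function name is illustrative only.

```python
def total_storage_gb(pages, avg_mb_per_page):
    """Total storage in decimal gigabytes for a run of page images."""
    return pages * avg_mb_per_page / 1000

tif_gb = total_storage_gb(77_000, 18.5)
jp2_gb = total_storage_gb(77_000, 2.2)

print(f"TIFF: {tif_gb:.1f} GB")            # prints "TIFF: 1424.5 GB"
print(f"JP2:  {jp2_gb:.1f} GB")            # prints "JP2:  169.4 GB"
print(f"Ratio: {tif_gb / jp2_gb:.1f}x")    # prints "Ratio: 8.4x"
```

The ratio of about 8.4 to 1 is in line with the NDNP production master target of compressing to 1/8 of the TIFF.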

UConn Collaborators
- David Lowe, Preservation & Data Management Services Librarian
- Betsy Pittman, University Archivist
- George King, Digitization Specialist
- Saroj Kashwan, Segmentation Volunteer
- Dan Bullman, Segmentation Volunteer

Further Resources
NEDCC Preservation Leaflet:
- Outsourcing and Vendor Relations: leaflets/6.-reformatting/6.7-outsourcing-and-vendor-relations
Link To UConn Daily Campus Online Archive:
Link To This Presentation:

Contact
Michael J. Bennett, Digital Production Librarian
University of Connecticut Libraries, Storrs, CT, USA