UNT Libraries TRAIL Processing Mark Phillips April 26, 2016


There are currently two processes for digitizing technical reports with TRAIL

The bulk of content goes to the University of Michigan for scanning by Google and inclusion in HathiTrust

There are some formats that are not sent to UMich

Reports with foldouts

Reports with Maps

Reports with other random parts

Microforms

Microfiche

Microcard

Maps

These items are scanned at the UNT Libraries in the Digital Projects Unit

The workflow

UNT receives boxes of new items from Arizona for scanning.

These arrive at the DPU and are processed by Lee Fulton and his students.

All reports come to UNT with a unique identifier assigned to them, for example metadc303203

We remove the binding from items that have been donated for destructive scanning

Loaned items that can't be cut are set aside for a different workflow

Lee and his students scan every page of an item, including foldouts and oddly shaped pages

600 DPI bitonal
400 DPI grayscale
400 DPI color

All uncompressed TIFF files
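
To give a feel for what these settings mean for storage, here is a rough sketch of the per-page size of uncompressed pixel data. The page size (8.5 x 11 inches) and the bit depths (1, 8, and 24 bits per pixel) are assumptions for illustration and are not stated in the slides; TIFF header overhead is ignored.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate raw pixel-data size for one scanned page (no compression)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # Each row is padded up to a whole number of bytes.
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


# Assumed letter-size page and assumed bit depths per scanning mode.
for label, dpi, bits in [("bitonal", 600, 1), ("grayscale", 400, 8), ("color", 400, 24)]:
    size_mb = uncompressed_bytes(8.5, 11, dpi, bits) / 1_000_000
    print(f"{dpi} DPI {label}: about {size_mb:.1f} MB per page")
```

Under these assumptions a color page is roughly ten times the size of a bitonal page, which may be one reason different modes are used for different material.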

They align the pagination with the sequence of files

0001.tif = Front Cover
0002.tif = Front Inside Cover
0003.tif = blank
0004.tif = blank
0005.tif = title page
0006.tif = blank
0007.tif = Page 1
0008.tif = Page 2

This is done so you can “jump to page 4, not image sequence 4”

000100fc.tif = Front Cover
000200fi.tif = Front Inside Cover
00030000.tif = blank
00040000.tif = blank
000500tp.tif = title page
00060000.tif = blank
00070001.tif = Page 1
00080002.tif = Page 2

We use a local naming convention called “magicknumbers” for this.

This convention also helps with quality control (QC) of the items.
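
As an illustration, the naming pattern in the examples above could be parsed along these lines. This is a minimal sketch: the pattern (a four-digit image sequence followed by a four-character page label) and the label codes are inferred only from the filenames shown on the slides, and the function name `parse_magicknumber` is hypothetical, not part of any UNT tool.

```python
import re

# Inferred from the slide examples: 4-digit image sequence,
# 4-character page label, then ".tif".
MAGICK_RE = re.compile(r"^(\d{4})([0-9a-z]{4})\.tif$")

# Only the label codes that appear in the examples; others may exist.
SPECIAL_LABELS = {
    "00fc": "Front Cover",
    "00fi": "Front Inside Cover",
    "00tp": "title page",
}


def parse_magicknumber(filename):
    """Return (image_sequence, page_description) for one filename."""
    match = MAGICK_RE.match(filename)
    if match is None:
        raise ValueError(f"not a magicknumbers-style name: {filename!r}")
    sequence = int(match.group(1))
    label = match.group(2)
    if label in SPECIAL_LABELS:
        page = SPECIAL_LABELS[label]
    elif label == "0000":
        page = None  # blank or unnumbered page
    elif label.isdigit():
        page = f"Page {int(label)}"
    else:
        page = label  # unknown code, returned unchanged
    return sequence, page


print(parse_magicknumber("00070001.tif"))  # image 7 carries printed page 1
```

Keeping the sequence and the printed page number in one filename is what lets a viewer jump to a printed page rather than an image position.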

Each report is verified to have all of the pages scanned.
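
The slides do not say how this verification is carried out. One way a completeness check might be automated, assuming filenames of the form shown earlier (four-digit sequence plus four-character page label), is to look for gaps in both the image sequence and the printed page numbers. The function name `find_gaps` is hypothetical.

```python
def find_gaps(filenames):
    """Return (missing_sequences, missing_pages) for a list of filenames.

    Assumes each name is an 8-character stem plus an extension:
    characters 0-3 are the image sequence, characters 4-7 the page label.
    """
    sequences = set()
    pages = set()
    for name in filenames:
        stem = name.rsplit(".", 1)[0]
        sequences.add(int(stem[:4]))
        label = stem[4:]
        # Numeric labels above zero are treated as printed page numbers.
        if label.isdigit() and int(label) > 0:
            pages.add(int(label))
    missing_seq = []
    if sequences:
        missing_seq = sorted(set(range(1, max(sequences) + 1)) - sequences)
    missing_pages = []
    if pages:
        missing_pages = sorted(set(range(min(pages), max(pages) + 1)) - pages)
    return missing_seq, missing_pages


# Image 3 and printed page 2 are absent from this sample.
print(find_gaps(["000100fc.tif", "00020001.tif", "00040003.tif"]))
```

A gap in the page numbers is not always an error (some reports skip numbers), so a result like this would flag items for a person to review rather than reject them outright.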

A descriptive metadata record using the UNTL metadata schema is created partly from the supplied MARC record from Arizona and augmented with additional information.

Each TIFF image is processed with Optical Character Recognition (OCR) software: PrimeOCR by Prime Recognition

The RAW output is used so we have the coordinates of the words for highlighting later

A searchable PDF is created for each page along with the OCR output.

These PDFs are combined into a single PDF document for the report.

A finished report looks like this on disk.

metadc303234
    01_tif
        000100fc.tif
        000200fi.tif
        00030000.tif
        00040000.tif
        ...
    02_pdf
        metadc303234.pdf
    metadata.xml
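
A layout like this lends itself to a simple pre-ingest check. The sketch below is an assumption-laden illustration, not UNT's actual tooling: it assumes `metadata.xml` sits at the top level of the item directory (the slide does not make its location clear), that the PDF is named after the item identifier, and the function name `check_package` is hypothetical.

```python
from pathlib import Path


def check_package(item_dir):
    """Return a list of problems found in one finished report directory."""
    item = Path(item_dir)
    problems = []

    tif_dir = item / "01_tif"
    if not tif_dir.is_dir() or not any(tif_dir.glob("*.tif")):
        problems.append("01_tif is missing or contains no .tif files")

    # Assumed: the combined PDF is named after the item identifier.
    pdf_path = item / "02_pdf" / f"{item.name}.pdf"
    if not pdf_path.is_file():
        problems.append(f"missing combined PDF: {pdf_path.name}")

    # Assumed: metadata.xml lives at the top level of the item directory.
    if not (item / "metadata.xml").is_file():
        problems.append("missing metadata.xml")

    return problems
```

An empty list means the directory matches the assumed layout; anything else names what is missing before the batch is ingested.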

Reports are ingested into the UNT Libraries Digital Infrastructure in batches

Web-scale images, PDFs, and metadata are added to the TRAIL collection in the UNT Digital Library

Master files are added to the Coda Repository for preservation.

Once a report is verified to be online, the physical copy is inventoried and discarded.

Loaned reports are returned to Arizona or the loaning institution.

UNT Digital Library

2015 TRAC Self-Audit

In 2015 the UNT Libraries completed a self-audit using the Trustworthy Repositories Audit & Certification (TRAC) framework

Full documentation for the self-audit is available via the UNT Libraries website

Includes:
Policies related to preservation, access, user feedback and usage
Content and partnership agreements
Detailed workflows and technical documentation

https://www.library.unt.edu/digital-libraries/trusted-digital-repository

UNT Libraries continues to value the partnership we have with TRAIL and looks forward to opportunities to expand our work to provide access to these resources for users around the world.

Thank you. Mark Phillips http://digital.library.unt.edu/