Big Changes for a Sustainable Future


DLG Technical Roadmap: Big Changes for a Sustainable Future
Mike Kanning, GALILEO Developer; Sheila McAlister, DLG
GALILEO Users Conference, July 12, 2018

Background

Why Change? Stability; flexibility; sustainability; longevity; accessibility; ease of use.

DLG has been around since the late 1990s. We were one of the early statewide digital library aggregators, and at the time few turnkey or community-supported technology solutions existed, so we built our own. We relied on homegrown technologies to provide the robust and seamless searching our users demanded, and we developed low-cost methods of converting and delivering content. As technologies and the profession have matured and our collections have grown exponentially, it made sense for DLG to re-evaluate our current technology stack so we could be sure that our offerings were stable, flexible, sustainable, accessible, and user friendly. We needed to simplify our infrastructure and adopt community-driven standards. Today, we’ll talk about two of our major infrastructure upgrades: first, the launch of our new newspaper digitization workflows and delivery platform, and second, our reinvention of DLG’s administrative back-end and our public interface (i.e., the new DLG public site).

Image: Elderly women with a spinning wheel, LBGlass - 216, Lane Brothers Commercial Photographers Photographic Collection, 1920-1976. Photographic Collection, Special Collections and Archives, Georgia State University Library.

Upgrades to Technical Infrastructure: move from Solaris to Linux; adoption of Solr and Blacklight; IIIF server and viewer; community-supported, open technologies.

Image: ERA 1101 Computer located at Georgia Tech, 1955, AJCN038-058a, Atlanta Journal-Constitution Photographic Archives. Special Collections and Archives, http://digitalcollections.library.gsu.edu/cdm/ref/collection/ajc/id/8683. Photograph courtesy of Georgia State University. Copyright Atlanta Journal-Constitution.

New Interfaces: delivery of newspapers using Chronicling America; multiphase development of new Georgia portal interface.

Newspaper Digitization and Delivery

DLG and GNP: at least one newspaper from each county filmed; 2,500 historic titles; over 220 current titles continue to be filmed; over 1 million pages digitized; DLG’s most popular resources.

Georgia's newspaper publishing history began in 1763 with the establishment of the state's first newspaper, the Georgia Gazette, in Savannah, but efforts to preserve the state's print journalism heritage began much later. The Georgia Newspaper Project (GNP), headquartered at the University of Georgia Libraries, began in 1953 when the university's alumni association provided funds to establish the program. The first issues were filmed in December 1953, and that year the UGA library staff began to collect backfiles of newspaper titles from throughout the state for filming. In 2000, it was estimated that the project had filmed 24 million pages (25,000 reels). The GNP continues to film over 220 current titles from throughout the state at a rate of about 500 new reels a year.

In 2007, DLG’s first full-text newspaper database debuted (The Red and Black). Since beginning our full-text newspaper delivery, DLG staff have digitized over 1 million pages of newspapers from across the state.

Why change? Sustainability; efficiency; improved UI.

At the time that the Red and Black debuted, the National Digital Newspaper Program (NDNP) was in its infancy. The technical specifications for NDNP were more robust than what DLG needed at the time. It was expensive to outsource newspaper digitization, as few vendors offered these types of services. At the same time, GALILEO's developers were stretched thin. Since DLG's users were clamoring for full-text historic newspaper content, DLG staff came up with a low-cost, less metadata-intensive method that used the DjVu image format as a way to provide hit highlighting. DLG staff was already adept at launching XTF sites, which lessened our need for developer assistance.

Fast forward ten years and we find ourselves in a different landscape. Among DLG's current goals is ensuring the sustainability of our digital assets and technology framework. The NDNP technical specifications have become a de facto standard. We've already phased out the use of DjVu files (which required plug-ins), and XTF is aging. It made more sense to work with an open-source, community-driven delivery system than to continue as a "lone wolf."

Chronicling America provides much of the functionality our users demand (for example, hit highlighting) as well as several new features. These include essays about the publishing history of various newspaper titles; browsing by region (corresponding to the regions of the older sites); and browsing by type, including community papers, papers of record, African-American papers, religious papers, school papers, and Native American papers. Its robust nature will allow DLG to maintain a single newspaper portal rather than separate city or regional portals, a plus for our users and more efficient for us!

Thanks to tools developed by the North Carolina Digital Heritage Center, DLG staff is converting our previously digitized newspapers to incorporate them into the new GHN platform. Until that conversion is complete, users may continue to access the existing regional and city sites (North, South, West Georgia, Athens, Macon, Milledgeville, and Savannah). At the same time, we're working with vendors to digitize more new content. With funds from the NDNP, we'll be digitizing another 100,000 pages over the next two years on top of our normal newspaper digitization efforts.

Building on Chronicling America: Chronicling America is a Library of Congress project that provides a framework for creating digital newspaper sites with full-text searching and other standard features. It is a Python/Django project utilizing a MySQL database and a Solr search index; it is not well documented; it has a small user community; it is being rebranded as OpenONI and adopting a more community-driven model.

Changes to ChronAm for GHNP: utilize existing DLG infrastructure; recreate “regional” newspaper collections; add new functionality for different ways of discovering content; greatly improve user experience, aesthetics, and performance.

Implementation Difficulties: image tiling is very processor-intensive; long wait times for advanced searches; indexed OCR data and high-resolution JP2 files for 650k+ pages take up a LOT of space; reindexing can take days, so get it right the first time :)
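To give a rough sense of why image tiling is processor-intensive, the sketch below counts the tiles in a zoomable image pyramid for a single page scan. It is only an illustration: the 256-pixel tile size and the halving scheme are assumptions about a typical tiled viewer, not a description of how ChronAm or GHNP actually generates tiles.

```python
import math


def tile_pyramid(width, height, tile_size=256):
    """Return (scale_factor, columns, rows) for each level of a tile pyramid.

    The first level is full resolution; each further level halves both
    dimensions until the whole image fits in a single tile.
    The tile size and halving scheme are illustrative assumptions.
    """
    levels = []
    scale = 1
    while True:
        scaled_w = math.ceil(width / scale)
        scaled_h = math.ceil(height / scale)
        cols = math.ceil(scaled_w / tile_size)
        rows = math.ceil(scaled_h / tile_size)
        levels.append((scale, cols, rows))
        if cols == 1 and rows == 1:
            break
        scale *= 2
    return levels


def total_tiles(width, height, tile_size=256):
    """Total number of tiles across every level of the pyramid."""
    return sum(cols * rows for _, cols, rows in tile_pyramid(width, height, tile_size))
```

Calling `total_tiles` with the pixel dimensions of a high-resolution newspaper scan, then multiplying by the number of pages, shows how quickly the tile count grows whether tiles are cut ahead of time or on demand.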

Future Directions (coming soon): IIIF image server; rights-related metadata for issues; APIs for third-party data access; search result filters/facets; improved advanced search.
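For readers unfamiliar with IIIF, an IIIF image server answers requests whose URL path follows the pattern identifier/region/size/rotation/quality.format defined by the IIIF Image API. The sketch below builds such a URL. The base URL and page identifier in the usage note are hypothetical, and the `max` size keyword assumes a server that supports version 2.1 or later of the Image API.

```python
from urllib.parse import quote


def iiif_image_url(base_url, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL.

    The identifier is percent-encoded so that any slashes inside it are
    not mistaken for path separators.
    """
    encoded_id = quote(identifier, safe="")
    return (f"{base_url.rstrip('/')}/{encoded_id}/"
            f"{region}/{size}/{rotation}/{quality}.{fmt}")
```

For example, `iiif_image_url("https://iiif.example.org/iiif/2", "batch/sn123/0001.jp2", region="0,0,1024,1024", size="512,")` asks a hypothetical server for the top-left 1024 by 1024 pixel region of a page, scaled to 512 pixels wide. A viewer zooming into a newspaper page issues many requests of this shape.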

Administrative Site and Public Portal

Multiple administrative data stores (GALILEO, DPLA, EDS, OAI); varying data quality; boutique public sites; need to combine data streams; outdated public interfaces; iterative development.

DLG Administrative Site: initial step in this whole process; first implementation of a Blacklight site and Solr index; allowed DLG staff to work on aspects of data migration and metadata remediation while development took place on the DLG public site; features a novel workflow for ingesting and updating metadata (see more in our next presentation!).

DLG Administrative Site: testing ground for new public site features; stores record metadata in a Postgres relational database and handles indexing of data into our Solr index; has searching functionality specially tailored for metadata remediation and other administrative work.
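The database-to-index step can be pictured as flattening each relational record into a Solr document. The sketch below is a minimal illustration under stated assumptions: the record keys and Solr field names (`slug`, `title_display`, `subject_sms`, and so on) are hypothetical and are not taken from the DLG Admin code. The resulting JSON array is the kind of body that Solr's `/update` handler accepts.

```python
import json


def record_to_solr_doc(record):
    """Flatten one database row (given as a dict) into a Solr document.

    All field names here are hypothetical examples.
    """
    doc = {
        "id": record["slug"],
        "title_display": record["title"],
        "collection_sms": record.get("collections", []),
        # Trim whitespace and drop blank subject values before indexing.
        "subject_sms": [s.strip() for s in record.get("subjects", []) if s.strip()],
        "public_b": bool(record.get("public", False)),
    }
    # Leave out fields that ended up empty so they are not indexed as blanks.
    return {key: value for key, value in doc.items() if value not in ([], None, "")}


def solr_update_payload(records):
    """Serialize records as the JSON array body of a Solr update request."""
    return json.dumps([record_to_solr_doc(r) for r in records])
```

Keeping the relational database as the source of record and treating the index as derived data means the index can be rebuilt from the database whenever the mapping changes.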

Blacklight: open-source Ruby on Rails application; widespread adoption, good documentation, and a large community; very active development.

Blacklight @ DLG: Blacklight is likely to replace and enhance all our existing search-focused collection sites and our larger portals. Blacklight-based Spotlight may also be implemented to support the creation and display of curated exhibits using DLG material.

The New DLG Public Portal: searches over the search index created by DLG Admin; changes to this site can be made immediately via the DLG Admin portal; Solr search server enables lightning-fast search and faceting, a major upgrade; metadata work and Solr features combine to enable new map display of records and search results.
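Blacklight builds its Solr requests itself, but it can help to see what a faceted search looks like underneath. The sketch below assembles a Solr select URL using standard Solr query parameters (`q`, `fq`, `facet`, `facet.field`, `facet.mincount`). The core URL and the `county_facet` field name in the usage note are assumptions for illustration only.

```python
from urllib.parse import urlencode


def solr_facet_query(base_url, text, facet_fields, filters=None, rows=20):
    """Build a Solr /select URL with faceting turned on.

    facet_fields: fields to return value counts for.
    filters: optional dict of field -> value, applied as filter queries.
    """
    params = [
        ("q", text),
        ("rows", rows),
        ("wt", "json"),
        ("facet", "true"),
        ("facet.mincount", 1),
    ]
    params += [("facet.field", field) for field in facet_fields]
    # Each selected facet value becomes its own fq (filter query) parameter.
    params += [("fq", f'{field}:"{value}"') for field, value in (filters or {}).items()]
    return f"{base_url.rstrip('/')}/select?{urlencode(params)}"
```

For example, `solr_facet_query("http://localhost:8983/solr/dlg", "peaches", ["county_facet"], {"county_facet": "Clarke"})` searches for "peaches", narrows the results to one hypothetical county value, and asks Solr to return counts for the remaining facet values in the same response.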

Future Directions. Soon: search of full-text resources; image display (IIIF) in DLG Portal; improved date searching and faceting. Eventually: newspapers integration; video integration; integration of standalone collection sites into DLG Portal; adoption of Spotlight for exhibit hosting.

System Structure (diagram components: DLG Admin, PostgreSQL, DLG Public, IIIF (Cantaloupe), GHNP, Solr, future Blacklight sites).

This looks crazy, but it is quite a bit simpler than the old setup. It shows that our infrastructure is consistent across our disparate projects, which means less time spinning up new projects and less time maintaining database servers, Solr servers, etc. for each individual project.

Enjoy!
Georgia Historic Newspapers: https://gahistoricnewspapers.galileo.usg.edu
Chronicling America: http://chroniclingamerica.loc.gov/
National Digital Newspaper Program: http://www.loc.gov/ndnp/
Project development and code: https://github.com/GIL-GALILEO/ghnp