1
Big Changes for a Sustainable Future
DLG Technical Roadmap: Big Changes for a Sustainable Future
Mike Kanning, GALILEO Developer
Sheila McAlister, DLG
GALILEO Users Conference, July 12, 2018
2
Background
3
Why Change?
Stability
Flexibility
Sustainability
Longevity
Accessibility
Ease of use

DLG has been around since the late 1990s. When we started out as one of the early state-wide digital library aggregators, few turn-key or community-supported technology solutions existed, so we built our own. We relied on homegrown technologies to provide the robust and seamless searching our users demanded. We developed low-cost methods of converting content and delivering it. As technologies and the profession have matured and our collections have grown exponentially, it only made sense for DLG to re-evaluate our current technology stack so we could be sure that our offerings were stable, flexible, sustainable, accessible, and user friendly. We needed to simplify our infrastructure and adopt community-driven standards. Today, we’ll talk about two of our major infrastructure upgrades: first, the launch of our new newspaper digitization workflows and delivery platform, and second, our reinvention of DLG’s administrative back-end and our public interface (i.e., the new DLG public site).

Elderly women with a spinning wheel, LBGlass - 216, Lane Brothers Commercial Photographers Photographic Collection, Photographic Collection, Special Collections and Archives, Georgia State University Library.
4
Upgrades to Technical Infrastructure
Move from Solaris to Linux
Adoption of Solr and Blacklight
IIIF server and viewer
Community supported, open technologies

Photograph courtesy of Georgia State University. Copyright Atlanta Journal-Constitution. ERA 1101 Computer located at Georgia Tech, 1955, AJCN a, Atlanta Journal-Constitution Photographic Archives. Special Collections and Archives,
5
New Interfaces
Delivery of newspapers using Chronicling America
Multiphase development of new Georgia portal interface
6
Newspaper Digitization and Delivery
7
DLG and GNP
At least one newspaper from each county filmed
2500 historic titles
Over 220 current titles continue to be filmed
Over 1 million pages digitized
DLG’s most popular resources

Georgia's newspaper publishing history began in 1763 with the establishment of the state's first newspaper, the Savannah Gazette, but efforts to preserve the state's print journalism heritage began much later. In 1953, the Georgia Newspaper Project (GNP), which is headquartered at the University of Georgia Libraries, began when the university's alumni association provided funds to establish the program. The first issues were filmed in December 1953, and that year, the UGA library staff began to collect backfiles of newspaper titles from throughout the state for filming. In 2000, it was estimated the project had filmed 24 million pages on 25,000 reels. The GNP continues to film over 220 current titles from throughout the state at a rate of about 500 new reels a year. In 2007, DLG’s first full-text newspaper database debuted (The Red and Black). Since beginning our full-text newspaper delivery, DLG staff have digitized over 1 million pages of newspapers from across the state.
8
Why change?
Sustainability
Efficiency
Improved UI

At the time that the Red and Black debuted, the National Digital Newspaper Program (NDNP) was in its infancy. The technical specifications for NDNP were more robust than what DLG needed at the time. It was expensive to outsource newspaper digitization, as few vendors offered these types of services. At the same time, GALILEO's developers were stretched thin. Since DLG's users were clamoring for full-text historic newspaper content, DLG staff came up with a low-cost, less metadata-intensive method using the DjVu image format as a way to provide hit highlighting. DLG staff was already adept at launching XTF sites, which lessened our need for developer assistance.

Fast forward ten years and we find ourselves in a different landscape. Among DLG's current goals is ensuring the sustainability of our digital assets and technology framework. The NDNP technical specifications have become a de facto standard. We've already phased out the use of DjVu files (which required plug-ins), and XTF is aging. It made more sense to work with an open-source, community-driven delivery system than to continue as a "lone wolf."

Chronicling America provides much of the functionality our users demand (for example, hit highlighting) as well as several new features. These include essays about the publishing history of various newspaper titles, browsing by region (corresponding to the regions of the older sites), and browsing by type, including community papers, papers-of-record, African-American papers, religious papers, school papers, and Native American papers. Its robust nature will allow DLG to maintain a single newspaper portal rather than separate city or regional portals, a plus for our users and more efficient for us!

Thanks to tools developed by the North Carolina Digital Heritage Center, DLG staff is converting our previously digitized newspapers to incorporate them into the new GHN platform. Until that time, users may continue to access the existing regional and city sites (North, South, West Georgia, Athens, Macon, Milledgeville, and Savannah). At the same time, we're working with vendors to digitize more new content. With funds from the NDNP, we'll be digitizing another 100,000 pages over the next two years on top of our normal newspaper digitization efforts.
9
Building on Chronicling America
Chronicling America is a Library of Congress project that provides a framework for creating digital newspaper sites with full-text searching and other standard features.
Is a Python/Django project utilizing a MySQL database and Solr search index
Not well documented
Small user community
Being rebranded as OpenONI and adopting a more community-driven project model
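To make the "full-text searching against a Solr index" idea concrete, here is a minimal sketch of how a page-level search request could be assembled. It is not taken from the Chronicling America code base: the core URL and the field names `ocr_text` and `year` are hypothetical, and only the standard Solr request parameters (`q`, `df`, `fq`, `rows`, `wt`) are assumed.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_page_search_url(base_url, terms, year_from=None, year_to=None, rows=20):
    """Build a Solr /select URL for a full-text search over newspaper pages."""
    params = [
        ("q", terms),
        ("df", "ocr_text"),  # hypothetical default field holding the OCR text
        ("rows", str(rows)),
        ("wt", "json"),
    ]
    if year_from is not None and year_to is not None:
        # Filter query on a hypothetical integer "year" field.
        params.append(("fq", f"year:[{year_from} TO {year_to}]"))
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical local Solr core named "pages".
url = build_page_search_url("http://localhost:8983/solr/pages", "cotton mill", 1900, 1910)
```

Restricting by date with a filter query (`fq`) rather than folding it into `q` lets Solr cache the filter separately, which may help with the kind of slow advanced searches mentioned later in the talk.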
10
Changes to ChronAm for GHNP
Utilize existing DLG infrastructure
Recreate “Regional” newspaper collections
Add new functionality for different ways of discovering content
Greatly improve user experience, aesthetics and performance
11
Implementation Difficulties
Image tiling is very processor-intensive
Long wait times for advanced searches
Indexed OCR data and high resolution JP2 files for 650k+ pages take up a LOT of space
Reindexing can take days, so get it right the first time :)
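A rough back-of-the-envelope calculation shows why tiling adds up. The sketch below counts the tiles in a full zoom pyramid for one page image, halving the image at each level until it fits in a single tile. The 256-pixel tile size and the 6000 x 8000 pixel page are assumptions for illustration, not figures from the project.

```python
import math


def tile_count(width, height, tile_size=256):
    """Return (total_tiles, levels) for a pyramid that halves the image per level."""
    total = 0
    levels = 0
    while True:
        cols = math.ceil(width / tile_size)
        rows = math.ceil(height / tile_size)
        total += cols * rows
        levels += 1
        if cols == 1 and rows == 1:
            break  # the whole image now fits in one tile
        width = math.ceil(width / 2)
        height = math.ceil(height / 2)
    return total, levels


# Assumed scan size for a single newspaper page.
tiles_per_page, levels = tile_count(6000, 8000)
# Scale up to the 650k pages mentioned on the slide.
tiles_for_collection = tiles_per_page * 650_000
```

Under these assumptions a single page needs roughly a thousand tiles, so a collection of 650k pages implies hundreds of millions of tile renders, whether they are produced up front or on demand.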
12
Future Directions
Coming Soon
IIIF image server
Rights-related metadata for issues
APIs for 3rd party data access
Search result filters/facets
Improved Advanced Search
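For readers unfamiliar with IIIF, an image server answers requests whose URL path encodes the region, size, rotation, quality and format wanted, following the IIIF Image API pattern `{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. The sketch below builds such URLs; the base URL and the identifier are hypothetical, and `max` is the full-size keyword in version 3 of the API (version 2 uses `full`).

```python
from urllib.parse import quote


def iiif_image_url(base_url, identifier, region="full", size="max",
                   rotation=0, quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL for one image."""
    # Slashes and spaces inside the identifier must be percent-encoded.
    ident = quote(identifier, safe="")
    return f"{base_url}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server prefix and page identifier.
base = "https://example.org/iiif/3"
full_page = iiif_image_url(base, "batch/page 1.jp2")
# A 1024x1024 region from the top-left corner, scaled to 512 pixels wide.
one_tile = iiif_image_url(base, "batch/page 1.jp2", region="0,0,1024,1024", size="512,")
```

Because every crop and zoom level is just a URL, a viewer can request only the tiles it needs from the source JP2 instead of relying on pre-generated derivatives.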
13
Administrative Site and Public Portal
14
Multiple administrative data stores
GALILEO, DPLA, EDS, OAI
Varying data quality
Boutique public sites
Need to combine data streams
Out-dated public interfaces
Iterative development
15
DLG Administrative Site
Initial step in this whole process
First implementation of a Blacklight site and Solr index
Allowed DLG staff to work on aspects of data migration and metadata remediation while development took place on the DLG public site
Features a novel workflow for ingesting and updating metadata (see more in our next presentation!)
16
DLG Administrative Site
Testing ground for new public site features
Stores record metadata in a Postgres relational database and handles indexing of data into our Solr index
Has searching functionality specially tailored for metadata remediation and other administrative work
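The "database record in, Solr document out" step can be pictured as a small mapping function. The sketch below is only an illustration in Python, not the DLG Admin code (which is a Rails application): the record keys and the Solr field names (`title_text`, `subject_facet`, `public_b`) are hypothetical. The one assumption about Solr itself is that its update handler accepts a JSON array of documents.

```python
import json


def to_solr_doc(record):
    """Map one metadata record (a dict, e.g. a database row) to a Solr document."""
    return {
        "id": record["slug"],
        "title_text": record["title"],
        # De-duplicate and sort multi-valued subjects so reindexing is repeatable.
        "subject_facet": sorted(set(record.get("subjects", []))),
        "public_b": bool(record.get("public", False)),
    }


def solr_update_payload(records):
    """Serialize records as the JSON array body of a Solr update request."""
    return json.dumps([to_solr_doc(r) for r in records])


# Hypothetical record as it might come out of the relational database.
rows = [{"slug": "gsu_lane_216", "title": "Elderly women with a spinning wheel",
         "subjects": ["Spinning", "Women", "Spinning"], "public": True}]
payload = solr_update_payload(rows)
```

Keeping the relational database as the source of truth and treating the Solr index as a derived copy means the index can always be rebuilt from the records if the mapping changes.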
17
Blacklight
Open source Ruby on Rails application
Widespread adoption, good documentation and large community
Very active development
18
At DLG, Blacklight is likely to replace and enhance all our existing search-focused collection sites and our larger portals. Blacklight-based Spotlight may also be implemented to support the creation and display of curated exhibits using DLG material.
19
The New DLG Public Portal
Searches over the search index created by DLG Admin
Changes to this site can be made immediately via the DLG Admin portal
Solr search server enables lightning-fast search and faceting, a major upgrade
Metadata work and Solr features combine to enable new map display of records and search results
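Faceting is what drives the "narrow your results" lists in a search interface. In Solr's default JSON response, each facet field comes back as a flat list that alternates values and counts, so a client has to pair them up. The sketch below shows that step on a hand-written sample response; the field name `county_facet` and the counts are invented for illustration.

```python
def facet_pairs(response, field):
    """Turn Solr's flat [value, count, value, count, ...] list into (value, count) pairs."""
    flat = response["facet_counts"]["facet_fields"][field]
    return list(zip(flat[0::2], flat[1::2]))


# Hand-written sample shaped like a Solr response; the numbers are made up.
sample_response = {
    "response": {"numFound": 19, "docs": []},
    "facet_counts": {
        "facet_fields": {"county_facet": ["Chatham", 12, "Clarke", 7]},
    },
}
counties = facet_pairs(sample_response, "county_facet")
```

A front end such as Blacklight does this pairing internally and renders each pair as a clickable filter with its hit count.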
20
Future Directions
Soon
Search of full-text resources
Image display (IIIF) in DLG Portal
Improved date searching and faceting
Eventually
Newspapers integration
Video integration
Integration of standalone collection sites into DLG Portal
Adoption of Spotlight for exhibit hosting
21
System Structure
Diagram: DLG Admin, PostgreSQL, DLG Public, IIIF (Cantaloupe), GHNP, Solr, Future Blacklight Sites

Looks crazy, but this is quite a bit simpler than the old setup. What this shows is that our infrastructure is consistent across our disparate projects, which means less time spinning up new projects and less time maintaining database servers, Solr servers, etc. for each individual project.
22
Enjoy!
Project development and code: https://github.com/GIL-GALILEO/ghnp
National Digital Newspaper Program