Overview of IU Digital Collections Search
Hui Zhang, Jon Dunn
Indiana University Digital Library Program
IU Digital Library Brown Bag, October 19, 2011

Outline
Introduction and motivation – Jon
Demo – Jon
Technical implementation – Hui
Next steps and future work – Jon

Why cross-collection search?
Support discovery across multiple content formats, collections, and repositories at IU
Use cases:
◦ Multiple formats/collections within a single thematic grouping (e.g. Hoagy Carmichael)
◦ Show off the richness and diversity of IU’s digital collections (PR – see open.iu.edu)
◦ Find digital content at IU for teaching or research use

Digital collections evolution: Discrete collection web sites

Digital collections evolution: Services
METS Navigator
Archives Online
PhotoCat
Video Streaming Service
Variations

Digital collections evolution: Services
Advantages
◦ Can develop workflows for content ingestion and description that are both optimized and scalable
◦ Content stored in a common repository (Fedora)
◦ Can develop discovery interfaces optimized for particular content (e.g. images vs. music)
◦ Common services to expose content into other platforms (e.g. Google)
Disadvantages
◦ “Siloing” discovery by content type can be an issue

Cross-collection search: First iteration
Only selected collections with metadata in Fedora
◦ Includes Archives Online and most image collections
◦ Does not include video streaming, Variations, encoded text, IUScholarWorks, or various “legacy” collections
Metadata only (MODS)
◦ Stored natively as MODS in Fedora
◦ Disseminated on the fly from other formats (PhotoCat2)
◦ Transformed via XSLT from EAD (Archives Online)

Cross-collection search: First iteration Demonstration

Challenge: Item-level records from EAD

Apache Solr Overview
An open source search server: a Java-based web application with Apache Lucene at its core
Demonstration
Solr vs. relational database
◦ Pros: full-text search, text analysis, flexible fields
◦ Cons: no relational operations on fields
Solr vs. Lucene
◦ Pros: web application, centralized configuration, faceting
◦ Cons: security, slower
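To make the faceting point concrete, here is a minimal sketch of how a client might build a faceted Solr query URL. The parameters `q`, `rows`, `wt`, `facet`, and `facet.field` are standard Solr query parameters; the base URL and the field names `format` and `collection` are assumptions for illustration, not the actual DLP configuration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_facet_query(base_url, text, facet_fields, rows=10):
    """Return a Solr select URL that searches `text` and asks for facet counts."""
    params = [
        ("q", text),
        ("rows", str(rows)),
        ("wt", "json"),
        ("facet", "true"),
    ]
    # facet.field is repeated once per field to facet on.
    params += [("facet.field", name) for name in facet_fields]
    return base_url + "?" + urlencode(params)


# Hypothetical local Solr URL and field names (assumptions).
url = build_facet_query(
    "http://localhost:8983/solr/select",
    "medical students",
    ["format", "collection"],
)
query = parse_qs(urlsplit(url).query)
print(url)
```

A discovery layer such as Blacklight issues requests of this general shape and renders the returned facet counts as browse links.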

Solr Schema and Configuration
Schema: specifies how the index is built
◦ field, field type
◦ dynamicField, copyField, uniqueKey
◦ Text analysis: stop words, stemming, synonyms, tokenization
Configuration: settings for Solr itself, query handling, and data import
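The text analysis steps named above can be illustrated with a toy pipeline. This is not Solr code: Solr configures its analyzers in the schema and runs them in Java. The sketch below only mimics the idea in plain Python, and the stop-word list, the crude plural-stripping stemmer, and the synonym table are all assumptions made for illustration.

```python
import re

STOP_WORDS = {"a", "an", "and", "of", "the", "to"}   # illustrative list
SYNONYMS = {"photo": "photograph"}                    # hypothetical mapping


def stem(token):
    """Very crude stemmer: strips a trailing plural 's' only."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def analyze(text):
    """Tokenize, drop stop words, stem, then map synonyms."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())       # tokenization
    tokens = [t for t in tokens if t not in STOP_WORDS]    # stop-word removal
    tokens = [stem(t) for t in tokens]                     # stemming
    return [SYNONYMS.get(t, t) for t in tokens]            # synonym mapping


print(analyze("Photos of the Medical Students"))
```

Because the same chain runs at index time and at query time, a search for "photos" and a record titled "Photographs" end up with the same indexed term.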

Converting MODS to Solr XML
Solr XML
◦ … …
◦ Can simply be POSTed to the Solr index
Translation of MODS to Solr XML
◦ Uses XSLT
◦ Called by the indexing program
Extract facet values
◦ Format: MODS typeOfResource
◦ Collection: customized based on the item’s Fedora PID

iudl:10000 Women Medical Students Photographic Services, Photographer Photographic Services Medical students Bloomington Indiana still image Photographs P /archives/photos/ …
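As a rough sketch of the MODS-to-Solr translation, the example below builds a Solr `<add><doc>` message from a small MODS record. The talk describes doing this with XSLT; Python's `xml.etree.ElementTree` is swapped in here only to keep the sketch self-contained. The Solr field names (`id`, `title`, `subject`, `format`, `collection`) and the PID-prefix-to-collection table are assumptions, and the sample MODS is a hand-written fragment reusing values from the record above.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Hypothetical mapping from Fedora PID prefix to a collection facet value.
COLLECTION_BY_PID_PREFIX = {"iudl": "Example Collection"}


def mods_to_solr_add(pid, mods_xml):
    """Return a Solr XML <add> message for one MODS record."""
    mods = ET.fromstring(mods_xml)
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")

    def field(name, value):
        ET.SubElement(doc, "field", name=name).text = value

    field("id", pid)
    title = mods.findtext("mods:titleInfo/mods:title", namespaces=MODS_NS)
    if title:
        field("title", title)
    for topic in mods.findall("mods:subject/mods:topic", MODS_NS):
        field("subject", topic.text)
    # Format facet comes from MODS typeOfResource.
    fmt = mods.findtext("mods:typeOfResource", namespaces=MODS_NS)
    if fmt:
        field("format", fmt)
    # Collection facet is derived from the PID prefix (assumed scheme).
    prefix = pid.split(":", 1)[0]
    field("collection", COLLECTION_BY_PID_PREFIX.get(prefix, "Unknown"))
    return ET.tostring(add, encoding="unicode")


SAMPLE_MODS = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo><title>Women Medical Students</title></titleInfo>
  <subject><topic>Medical students</topic></subject>
  <typeOfResource>still image</typeOfResource>
</mods>"""

solr_xml = mods_to_solr_add("iudl:10000", SAMPLE_MODS)
print(solr_xml)
```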

Solr Indexing
Carried out by two Java programs running under DLP’s Fedora Index Service framework
The service can be invoked by a RESTful HTTP request; Solr indexing is triggered based on conditions specified in the properties file
The MODS records are either extracted from the Fedora repository (where natively stored) or generated by the getMODS disseminator (PhotoCat2 collections)
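The final step of indexing, posting the Solr XML to the index, can be sketched as follows. The actual indexing programs are written in Java; this Python version only shows the shape of the HTTP request. The update URL is an assumed local Solr address, and the request is built but deliberately not sent so the sketch runs without a server.

```python
import urllib.request


def build_update_request(solr_update_url, solr_xml):
    """Build (but do not send) an HTTP POST carrying a Solr XML update message."""
    return urllib.request.Request(
        solr_update_url,
        data=solr_xml.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


# Assumed local Solr update endpoint; commit=true asks Solr to commit after the add.
req = build_update_request(
    "http://localhost:8983/solr/update?commit=true",
    '<add><doc><field name="id">iudl:10000</field></doc></add>',
)
print(req.get_method(), req.full_url)
# To actually index the document against a running Solr instance:
# urllib.request.urlopen(req)
```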

Overview of Blacklight
An open source project developed for libraries, with many potential uses:
◦ As a library catalog
◦ As the discovery interface to a digital repository
Optimized to handle diverse content (facet browsing)
Originally developed by the University of Virginia; has a growing community of active contributors and users
Now part of the Hydra Project
Written in Ruby, runs on Rails, requires Solr

Customize Blacklight for DLP Collections
Integrate Blacklight with the MODS-based index
◦ Blacklight by default expects MARC fields
New functions and features
◦ Render thumbnails in the result view
◦ Use the collection website as the landing page
Style and layout
◦ Standard IU banner and footer
◦ Color, font, and window size

Future Improvements
Automatic update of the Solr index
◦ Fedora repository notifies the Solr indexing program of item updates via JMS
Include full-text content
◦ It is challenging to have full-text content and metadata in one index
◦ Optimize the indexing and search algorithms
◦ Search against full text and use metadata as facets

Future Improvements (cont’d)
Add more collections
◦ Other collections from Fedora
◦ Non-Fedora DLP collections
◦ Archives of Institutional Memory
◦ IUScholarWorks Repository?
◦ IUPUI Digital Collections (CONTENTdm)?
Conduct usability evaluation
Explore integration with the new Blacklight-based discovery layer for IUCAT
Variations on Video IMLS grant
◦ Hydra/Blacklight-based discovery on PBCore

Questions? Beta: Send comments to: