Developing a Thematic Library Integrating Drupal, Google and the ILS Ildegardo Jesus Elizondo Alejandro Garza October 14, 2008
Roadmap What triggered this thematic library? Development process Technology involved in the project What is expected of the thematic library?
What triggered this thematic library? OPAC Encyclopedia? Internet? Journal? E-book? Digital Library
Search is not “Googleized” Dead ends: unhelpful “No results found” screens. Metasearch problems: lack of recall, slow execution. Users expect a simpler interface and want to do quick search variations.
Development process Mock ups and prototypes Questionnaires to faculty Usability tests
Wanted feature list Faceted search (which fields become facets?) User ratings/reviews Lists of favorites Tables of contents Integration with Millennium (Speaker note: this is not exactly Web 2.0; it is better to say that we use some of its concepts, such as “tags”. The only “Web 2.0” elements are tag clouds, reviews/ratings, and, later, user tagging.)
Using HILCC Developed at Columbia University. Groups Library of Congress classification numbers into a hierarchical vocabulary similar to Conspectus groupings. Open. Drupal custom module. Automatic item categorization upon harvesting. (Speaker note: explain a bit more?)
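The automatic categorization step can be pictured as a range lookup on the LC call number. The following is a minimal sketch in Python, not the project's Drupal module: the table rows, the category paths, and the names `HILCC_SAMPLE` and `categorize` are all hypothetical, and the real HILCC mapping is far larger.

```python
import re

# Hypothetical excerpt of an HILCC-style table:
# (LC class letters, lowest number, highest number, category path).
# These rows are illustrative placeholders, not copied from HILCC.
HILCC_SAMPLE = [
    ("QA", 1.0, 74.99, ("Sciences", "Mathematics")),
    ("QA", 75.0, 76.99, ("Sciences", "Computer Science")),
    ("QR", 1.0, 502.0, ("Sciences", "Biology", "Microbiology")),
]

# Class letters followed by the (possibly decimal) class number.
CALL_NUMBER = re.compile(r"^\s*([A-Z]{1,3})\s*(\d+(?:\.\d+)?)")


def categorize(call_number):
    """Return the category path for an LC call number, or None if unmapped."""
    match = CALL_NUMBER.match(call_number.upper())
    if match is None:
        return None
    letters, number = match.group(1), float(match.group(2))
    for cls, low, high, path in HILCC_SAMPLE:
        if cls == letters and low <= number <= high:
            return path
    return None


if __name__ == "__main__":
    for cn in ("QA76.73 .P98", "QR 46 .P37", "PZ7 .R79"):
        print(cn, "->", categorize(cn))
```

Running the lookup once at harvest time, and storing the resulting path with the record, would let the path double as a facet and a browse hierarchy.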
What it looks like http://biblioteca.mty.itesm.mx/pasteur/ Multilingual Catalog search Subject browse Metasearch Popular items (most viewed recently) News / Blog
Results from subscription databases and select sites Facets HILCC Tables of contents Stemming: work = working, works. Ecommerce = e-commerce, e commerce.
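The equivalences on this slide (work = working, works; Ecommerce = e-commerce, e commerce) come from normalizing terms before they are matched. The toy sketch below only illustrates the idea. It is not how Solr's analyzers are implemented, and the suffix list and the `normalize` function are assumptions made for the example.

```python
import re

# Deliberately tiny suffix list, for illustration only; real stemmers
# (such as the Porter stemmer) apply many more rules.
SUFFIXES = ("ing", "es", "s")


def normalize(term):
    """Reduce a search term to a crude index key."""
    # Drop hyphens and spaces so "e-commerce" and "e commerce" collapse.
    key = re.sub(r"[\s\-]+", "", term.lower())
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if key.endswith(suffix) and len(key) - len(suffix) >= 3:
            return key[: -len(suffix)]
    return key


if __name__ == "__main__":
    for term in ("work", "working", "works", "Ecommerce", "e-commerce", "e commerce"):
        print(repr(term), "->", normalize(term))
```

Terms that share a key are treated as the same word at both index and query time, which is what lets a search for "works" retrieve records containing "working".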
Item Record Realtime status Similar items Details
Technology involved in the project Drupal: extensible open source CMS. Apache Solr search: integrates with Drupal, fast faceted search. Google Custom Search Engine: powers our federated search. [Diagram: Library Server (Millennium ILS, WebOpac; item records, patron records, current item status); Drupal (Apache, MySQL, PHP, Solr; index); Google (Custom Search Engine)]
Architecture [Diagram. Components: Drupal (Apache, MySQL, PHP, Solr; biblio records, subscription database records, tags, users, search index); Library Server (Millennium ILS, WebOpac; item records, patron records, current item status); Google (Custom Search Engine); EZ-Proxy; subscription databases. Labeled flows: MARC harvest, page harvest, book images, authentication, item status, secure access.]
Why Drupal At the time of project planning, the library already had some experience with Drupal. Drupal: Is extensible (hundreds of modules). Has solid performance. Is supported by the community (IRC, online docs, forums) and commercially (books, paid services). Is used in libraries. Free and open.
Libraries using Drupal At least 33 university and public libraries. 5 using Drupal to replace or integrate with the OPAC. Slideshow of Drupal libraries: http://groups.drupal.org/node/13724 Drupal Library Groups http://drupalib.interoperating.info/ http://groups.drupal.org/libraries (Speaker note: add AADL, Darien, another?)
Why Apache Solr? Dedicated search software. Competes with or replaces commercial solutions. Open source. Features: faceted search, synonyms, stemming, spellchecking, “More like this”. Powerful item ranking options. Replication. Free and open.
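To make the faceted search feature concrete, here is a hedged sketch of how a faceted request to Solr's `select` handler could be assembled. The parameters `q`, `fq`, `rows`, `wt`, `facet`, `facet.field`, and `facet.mincount` are standard Solr request parameters. The base URL, the field names (`hilcc_category`, `language`), and the helper `facet_query_url` are assumptions for illustration, not the project's configuration.

```python
from urllib.parse import urlencode


def facet_query_url(base_url, text, facet_fields, filters=None, rows=10):
    """Build a Solr select URL that asks for facet counts on the given fields."""
    params = [
        ("q", text),
        ("rows", rows),
        ("wt", "json"),
        ("facet", "true"),
        ("facet.mincount", 1),  # hide facet values with zero matches
    ]
    # One facet.field parameter per field to count.
    params += [("facet.field", field) for field in facet_fields]
    # Each selected facet value becomes a filter query (fq).
    params += [
        ("fq", f'{field}:"{value}"') for field, value in (filters or {}).items()
    ]
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host and field names.
    print(
        facet_query_url(
            "http://localhost:8983/solr",
            "pasteur",
            ["hilcc_category", "language"],
            filters={"language": "es"},
        )
    )
```

Clicking a facet value in the interface then amounts to repeating the same query with one more `fq` filter.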
Why Google? Some subscription databases already indexed by Google. Google Custom Search Engine allows building a search for only desired domains/URLs. Plus: Users are used to Google’s interfaces Google search technology and branding Google Co-op tools Free CrossRef
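The idea of a search restricted to desired domains/URLs can be illustrated with a simple allowlist check. This sketch only shows the concept and is not how Google Custom Search Engine is configured or implemented. The set `ALLOWED_DOMAINS` (including the placeholder `example-database.com`) and the function `in_scope` are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: the library's own site plus a placeholder
# standing in for a subscription database domain.
ALLOWED_DOMAINS = {"biblioteca.mty.itesm.mx", "example-database.com"}


def in_scope(url, allowed=ALLOWED_DOMAINS):
    """Return True if the URL's host is an allowed domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain) for domain in allowed)


if __name__ == "__main__":
    for url in (
        "http://biblioteca.mty.itesm.mx/pasteur/",
        "https://www.example-database.com/article/1",
        "http://unrelated.example.org/",
    ):
        print(url, "->", in_scope(url))
```

Matching on the full host or a dot-prefixed suffix avoids accepting look-alike domains that merely end with the same characters.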
What is expected of the thematic library? Better relationships with students and faculty. Draw new users to the library from Google searches. Users will more easily find what they want.
Other features Google Books preview. Include Google results alongside library. RSS feeds. Magazine A-Z listings, by subject (HILCC)
Results so far Traffic from search engines: in August, 94% of total traffic came from search engines, and 91% came from cities other than Monterrey. Average visit times and pages viewed by users in Monterrey are much higher (3:37 and 4.66 pages/visit) than the overall average (1:06 and 1.94 pages/visit). Some item records rank high on google.com.mx, appearing in second or third place, above bookstore results. At least one academic department is closely following our implementation.