VIVO Multi-site Search
Structure and function overview
What is it?
  A search tool for ISF-compatible sites
    VIVO, Profiles, Loki…
    Search index is built from all client sites
    Provides relative ranking of results across sites
  Two pieces of software
    An application that builds a Solr search index
    A web-app that presents a GUI for searching
  Configurable
    Decide which sites to index, and which classes of individuals
  Open source, built from open source components
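Data flow (diagram)
  Components: user browser, MSS web server, MSS Solr server (search index), MSS Indexer, client sites
  MSS web server to user browser: web page
  User browser to MSS Solr server: search request (AJAX), search result returned
  Client sites to MSS Indexer: RDF
  MSS Indexer to search index: search records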
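The search request in the diagram goes from the browser to Solr over HTTP. As a rough illustration of what such a request looks like, the sketch below builds a query URL for Solr's standard `/select` handler using the common `q`, `rows`, and `wt` parameters. The base URL and the core name `mss` are assumptions made for the example. The real front end issues its requests from JavaScript in the browser, so this Python version only shows the shape of the request.

```python
from urllib.parse import urlencode


def build_search_url(solr_base: str, user_query: str, rows: int = 10) -> str:
    """Build a Solr /select URL that asks for JSON results.

    solr_base is a hypothetical core URL, e.g. http://localhost:8983/solr/mss
    """
    params = {"q": user_query, "rows": rows, "wt": "json"}
    return f"{solr_base.rstrip('/')}/select?{urlencode(params)}"


# Example: search for "stem cells", asking for at most 5 results.
print(build_search_url("http://localhost:8983/solr/mss", "stem cells", rows=5))
```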
Data flow: MSS Indexer and client site (diagram)
  Indexer to client site: discovery request, answered with a list of URIs
  Indexer to client site: LOD (linked open data) request, one per URI, each answered with RDF
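The exchange in this diagram can be sketched as one discovery request followed by one LOD request per returned URI. In the sketch below, `discover` and `fetch_rdf` are hypothetical stand-ins for the two HTTP calls, and the site URL and individuals are invented. The functions are passed in so that the example runs without a network. This is an illustration in Python, not the indexer's actual code.

```python
from typing import Callable, Dict, Iterable

# Hypothetical stand-ins for the two kinds of request shown on the slide.
DiscoveryFn = Callable[[str], Iterable[str]]  # site URL -> URIs of individuals
LodFn = Callable[[str], str]                  # individual URI -> RDF text


def harvest_site(site_url: str, discover: DiscoveryFn, fetch_rdf: LodFn) -> Dict[str, str]:
    """Send one discovery request, then one LOD request per discovered URI."""
    rdf_by_uri: Dict[str, str] = {}
    for uri in discover(site_url):
        rdf_by_uri[uri] = fetch_rdf(uri)
    return rdf_by_uri


# Canned responses in place of a real client site (all values are invented).
fake_site = {
    "http://vivo.example.edu/individual/n1": "<rdf:RDF>person 1</rdf:RDF>",
    "http://vivo.example.edu/individual/n2": "<rdf:RDF>person 2</rdf:RDF>",
}

harvested = harvest_site(
    "http://vivo.example.edu",
    discover=lambda site: list(fake_site),
    fetch_rdf=lambda uri: fake_site[uri],
)
for uri, rdf in harvested.items():
    print(uri, len(rdf))
```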
Scalable
  Search index is a standard Solr webapp
    Compatible with any standard JEE server
  Indexer is multi-threaded
    For a small number of client sites, it uses standard Java threads
    For a large number of client sites, it uses the Apache Hadoop framework for distributed processing
    Interleaves requests among clients, for reduced load
  Front-end GUI uses the AJAX Solr client
    GUI server serves static HTML and AJAX-based JavaScript
    Presentation is accomplished by JavaScript in the browser
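One way to picture "interleaves requests among clients" is a round-robin ordering, in which consecutive requests go to different sites and no single server receives a burst. The slide does not say which strategy the indexer uses, so the round-robin below is an assumption, and the function and site names are invented for the example.

```python
from itertools import zip_longest
from typing import Dict, List, Tuple


def interleave_requests(uris_by_site: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Order (site, uri) pairs round-robin across sites.

    Each pass takes at most one URI from every site that still has URIs left.
    """
    skip = object()  # marks sites that have run out of URIs
    sites = list(uris_by_site)
    ordered: List[Tuple[str, str]] = []
    for batch in zip_longest(*(uris_by_site[s] for s in sites), fillvalue=skip):
        for site, uri in zip(sites, batch):
            if uri is not skip:
                ordered.append((site, uri))
    return ordered


queue = interleave_requests({
    "site-a": ["a1", "a2", "a3"],
    "site-b": ["b1"],
    "site-c": ["c1", "c2"],
})
print(queue)
```

In a threaded indexer, worker threads could then take requests from this queue in order. That detail is also an assumption.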
For the community
  Get the software
  Configure for your sites and your classes
  Install Solr on a server
  Install and run the indexer
  Install the front end GUI on a server
Ready for enhancement
  The indexer is assembled from components at runtime
    Improve a component
    Contribute to the community
    Site admins may configure their indexer to use your component
  The front end is based on the AJAX Solr toolkit
    Create your own front end look and feel
    Contribute to the community
    Site admins may install your front end instead of the default front end
The Indexer - Configuration
  Indexer phases (diagram): Configuration (Assembly); Evaluation (Scheduling, Discovery, Synchronization); Population (Prioritization, Modeling, Indexing)
  Assemble the application
    Use standard components or contributed alternatives
  Create the site list
    Name
    Type of installation (e.g. VIVO 1.5, Profiles)
    Classes to be indexed
  Get runtime options
  Configuration is built on the Digester component from Apache Commons
    Processed like the server.xml file in Tomcat
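To make the site list concrete, here is a minimal sketch that reads a name, an installation type, and the classes to be indexed for each site from an XML file. The element and attribute names (`indexer`, `site`, `name`, `type`, `class`) and the sample sites are invented for illustration. The real configuration schema is not shown on the slide, and the real indexer reads its configuration with Apache Commons Digester in Java, not with this Python code.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass
class SiteConfig:
    name: str
    install_type: str   # e.g. "VIVO 1.5" or "Profiles"
    classes: List[str]  # class URIs whose individuals should be indexed


# Hypothetical configuration file; the schema is an assumption.
SAMPLE_CONFIG = """
<indexer>
  <site name="Example University" type="VIVO 1.5">
    <class>http://xmlns.com/foaf/0.1/Person</class>
    <class>http://xmlns.com/foaf/0.1/Organization</class>
  </site>
  <site name="Example Medical School" type="Profiles">
    <class>http://xmlns.com/foaf/0.1/Person</class>
  </site>
</indexer>
"""


def load_sites(xml_text: str) -> List[SiteConfig]:
    """Parse the site list out of the configuration XML."""
    root = ET.fromstring(xml_text.strip())
    return [
        SiteConfig(
            name=site.get("name", ""),
            install_type=site.get("type", ""),
            classes=[c.text.strip() for c in site.findall("class") if c.text],
        )
        for site in root.findall("site")
    ]


for site in load_sites(SAMPLE_CONFIG):
    print(site.name, "|", site.install_type, "|", len(site.classes), "classes")
```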
The Indexer - Evaluation
  Scheduling
    Check to see which sites are due for discovery
  Discovery
    Ask each site for its list of URIs
    With “last modified” dates, if available
  Synchronization
    Create stub records for new URIs
    Remove expired records
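The scheduling and synchronization steps can be sketched with simple date and set arithmetic. Two assumptions are made here. A site is treated as "due" when a fixed interval has passed since its last discovery. A record is treated as "expired" when its URI no longer appears in the site's discovered list. The slide states neither rule, and the in-memory dictionary stands in for the real search index.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional, Set, Tuple


def is_due(last_discovery: Optional[datetime], interval: timedelta, now: datetime) -> bool:
    """A site is due if it was never discovered or the interval has elapsed."""
    return last_discovery is None or now - last_discovery >= interval


def synchronize(index: Dict[str, Optional[dict]], discovered: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Bring the index's URI set in line with the discovered URIs.

    New URIs get a stub record (None) to be filled in later;
    URIs that were not rediscovered are removed. Returns (new, expired).
    """
    known = set(index)
    new_uris = discovered - known
    expired = known - discovered
    for uri in new_uris:
        index[uri] = None  # stub: known to exist, not yet indexed
    for uri in expired:
        del index[uri]
    return new_uris, expired


index = {"uri:a": {"name": "A"}, "uri:b": None}
added, removed = synchronize(index, {"uri:b", "uri:c"})
print(sorted(added), sorted(removed), sorted(index))
```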
The Indexer - Population
  Prioritization
    Create an ordered list of URIs for indexing
  Modeling
    For each URI, ask the site for RDF statements to build the individual model
  Indexing
    Translate the individual model into a record in the search index
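A rough sketch of the population phase follows, under stated assumptions. The slide does not give the ordering rule, so this example puts never-indexed stubs first and then the least recently indexed records. The individual model is represented as a plain list of (subject, predicate, object) triples, and the search record is a flat dictionary keyed by predicate. The real indexer's field mapping is not described in the slides, so that mapping is also hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def prioritize(last_indexed: Dict[str, Optional[float]]) -> List[str]:
    """Order URIs: stubs (None) first, then oldest timestamp first, ties by URI."""
    return sorted(
        last_indexed,
        key=lambda uri: (last_indexed[uri] is not None, last_indexed[uri] or 0.0, uri),
    )


def to_search_record(uri: str, triples: List[Triple]) -> Dict[str, object]:
    """Flatten the statements about one individual into a multi-valued record."""
    fields: Dict[str, List[str]] = defaultdict(list)
    for subject, predicate, obj in triples:
        if subject == uri:  # ignore statements about other resources
            fields[predicate].append(obj)
    record: Dict[str, object] = {"id": uri}
    record.update(fields)
    return record


order = prioritize({"uri:old": 100.0, "uri:new-stub": None, "uri:recent": 900.0})
print(order)

model = [
    ("uri:p1", "rdfs:label", "Smith, Jane"),
    ("uri:p1", "rdf:type", "foaf:Person"),
    ("uri:org", "rdfs:label", "Example University"),
]
print(to_search_record("uri:p1", model))
```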