EMBL-EBI MSD Search and Visualization tools Jawahar Swaminathan
Issues
- The raw database is large and complex:
  - 27,190+ PDB entries
  - 120+ tables in the warehouse, many very large
  - Cross-referenced against UniProt, PubMed...
- Need to expose as much of the data as possible without making the interface too complex
- We want to cater for three categories of user:
  - "Novice" user
  - Experienced user
  - Expert user
biobar
A toolbar search application for Mozilla/Netscape or Firefox browsers
biobar
- All major bioinformatics databases covered
- Search genomic, proteomic, structural, literature and functional databases
- Links to deposition and analysis tools for sequence and structural data
MSDlite
A simple form-based query system to search the MSD databases
The Atlas Pages
The Atlas: Ligands
The Atlas: Sequence
- View structures as wireframe, backbone or ribbons
- Built-in sequence viewer
- Calculate and display surfaces
- Various display options:
  - Ramachandran plots
  - Distance matrix
  - B-factors
- Based on the AstexViewer™ from Astex Technology Limited and modified under licence by the MSD group
Simple search interface
Strengths:
- simple, easy to use
- form allows multiple search fields to be combined
- relatively fast, despite performing quite complex SQL queries
Weaknesses:
- does not expose the power of a relational database
- user can't specify the relationship between search fields:
  - "name" AND "title" AND "keyword"
  - "name" OR "title" OR "keyword"
  - ( "name" OR "title" ) AND NOT "keyword"
- the search form is defined by the authors of the search system, not the author of a query
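The three relationships listed above differ only in how the per-field matches are combined. A minimal Python sketch, using a made-up entry and made-up search terms (none of this is MSDlite code), shows the same three field tests giving different answers under each combination:

```python
# Hypothetical entry and search terms, for illustration only.
entry = {
    "name": "lysozyme",
    "title": "Crystal structure of hen egg-white lysozyme",
    "keyword": "hydrolase",
}


def matches(field, term):
    """True if the term occurs (case-insensitively) in the given field."""
    return term.lower() in entry[field].lower()


name = matches("name", "lysozyme")
title = matches("title", "kinase")
keyword = matches("keyword", "hydrolase")

# "name" AND "title" AND "keyword"
all_fields = name and title and keyword
# "name" OR "title" OR "keyword"
any_field = name or title or keyword
# ( "name" OR "title" ) AND NOT "keyword"
mixed = (name or title) and not keyword

print(all_fields, any_field, mixed)
```

A fixed form hard-wires one of these combinations, which is the limitation the slide describes.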
Describing complex searches
- We want to allow the user to entirely control their query
- Since HTML forms are inherently static, we'll use an applet to provide a dynamic "form" that will let the user:
  - choose the fields to be searched
  - specify the relationships between search fields
  - choose the result fields and how results are presented
  - perform "complex" sub-queries, e.g. SSM, FASTA
A graphical database search system
- MSDpro uses an applet for constructing queries and a server to execute them
- Avoids the need for the user to understand a complex database schema or know SQL
- The user describes their query entirely graphically, including logical operations such as AND, OR and NOT
- Applet generates an XML description of the user’s query, which is sent to the MSD query server and converted to SQL automatically
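The slides do not show the XML format that the applet emits. As a rough illustration of what a query description with AND, OR and NOT might look like, here is a Python sketch; the element names (`query`, `and`, `or`, `not`, `field`) and the search terms are assumptions, not the MSDpro schema:

```python
import xml.etree.ElementTree as ET


def field(name, value):
    """A single search field with the term to look for (hypothetical element)."""
    el = ET.Element("field", {"name": name})
    el.text = value
    return el


def combine(op, *children):
    """A logical node ("and", "or" or "not") wrapping its operands."""
    el = ET.Element(op)
    el.extend(children)
    return el


# ( name OR title ) AND NOT keyword
query = ET.Element("query")
query.append(
    combine(
        "and",
        combine("or", field("name", "lysozyme"), field("title", "lysozyme")),
        combine("not", field("keyword", "mutant")),
    )
)

xml_text = ET.tostring(query, encoding="unicode")
print(xml_text)
```

Nesting the logical operators as elements keeps the grouping explicit, so no operator-precedence rules are needed when the description is read back.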
MSDpro
A flexible graphical search interface for advanced searching
Automatic SQL query generation
- The query server is a Java servlet:
  - accepts a query description as XML
  - converts the user’s query description into a true SQL query, which is then submitted to the search database
- Searches can include components that are executed outside of the database, e.g.
  - sequence similarity, determined using FASTA, or
  - structural similarity, determined using SSM
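The conversion step can be sketched as a recursive walk over the XML description. The real server is a Java servlet whose code is not shown in the slides; the Python below is only an illustration, and the XML element names, the `entry` table and its columns are all hypothetical. User-supplied terms are passed as bound parameters rather than pasted into the SQL text:

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical query description: ( name OR title ) AND NOT keyword
QUERY_XML = """
<query>
  <and>
    <or>
      <field name="name">lysozyme</field>
      <field name="title">lysozyme</field>
    </or>
    <not><field name="keyword">mutant</field></not>
  </and>
</query>
"""

ALLOWED_COLUMNS = {"name", "title", "keyword"}  # assumed searchable columns


def to_sql(node, params):
    """Turn one XML node into a WHERE fragment, collecting bound values."""
    if node.tag == "field":
        column = node.get("name")
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown field: {column}")
        params.append(f"%{node.text}%")
        return f"{column} LIKE ?"
    parts = [to_sql(child, params) for child in node]
    if node.tag == "not":
        return f"NOT ({parts[0]})"
    if node.tag in ("and", "or"):
        return "(" + f" {node.tag.upper()} ".join(parts) + ")"
    raise ValueError(f"unknown operator: {node.tag}")


params = []
where = to_sql(ET.fromstring(QUERY_XML)[0], params)
sql = f"SELECT id FROM entry WHERE {where} ORDER BY id"

# Run against a tiny in-memory table with made-up rows.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE entry (id TEXT, name TEXT, title TEXT, keyword TEXT)")
db.executemany(
    "INSERT INTO entry VALUES (?, ?, ?, ?)",
    [
        ("e1", "lysozyme", "Hen egg-white lysozyme", "hydrolase"),
        ("e2", "lysozyme", "T4 lysozyme mutant", "mutant"),
        ("e3", "kinase", "Protein kinase", "transferase"),
    ],
)
hits = [row[0] for row in db.execute(sql, params)]
print(sql)
print(hits)
```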
Search system is generic
- The search system is designed to be entirely database-independent
- All information about the architecture of the search database is stored in XML dictionaries
- Similarly, the search and result fields which the applet presents to the user are controlled by a dictionary
- The entire system could move to a completely different database simply by modifying the dictionaries
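To illustrate the dictionary idea, the Python sketch below maps user-facing field labels to tables and columns through an XML dictionary, then swaps the dictionary to target a different schema without touching the code. The dictionary layout and every table and column name are invented for the example; the slides do not show the real dictionary format:

```python
import xml.etree.ElementTree as ET

# Two hypothetical dictionaries describing the same fields in different schemas.
DICTIONARY_A = """
<dictionary>
  <field label="Entry title" table="entry" column="title"/>
  <field label="Ligand name" table="ligand" column="name"/>
</dictionary>
"""

DICTIONARY_B = """
<dictionary>
  <field label="Entry title" table="structures" column="description"/>
  <field label="Ligand name" table="het_groups" column="het_name"/>
</dictionary>
"""


def load_fields(xml_text):
    """Map each user-facing label to its (table, column) pair."""
    root = ET.fromstring(xml_text)
    return {
        f.get("label"): (f.get("table"), f.get("column"))
        for f in root.iter("field")
    }


def condition(fields, label):
    """Build a WHERE fragment for a label using whichever dictionary is loaded."""
    table, column = fields[label]
    return f"{table}.{column} LIKE ?"


cond_a = condition(load_fields(DICTIONARY_A), "Entry title")
cond_b = condition(load_fields(DICTIONARY_B), "Entry title")
print(cond_a)
print(cond_b)
```

The labels shown to the user stay the same in both cases; only the dictionary decides which table and column they resolve to.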
Java server
Java server architecture
[Diagram. Box labels: User interface; Methods; Interface; Ontology; DB; DB and external object ontology; Methods]
Web-services
- Some of the new services from MSD are designed as web-services:
  - web-services are network-based services with published method signatures
  - can be accessed via the SOAP protocol from any language with a SOAP library, via HTTP
- The same services used within MSDpro will be accessible to any SOAP client
- The MSD query engine will also be available as a web-service, allowing users to submit queries programmatically
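For readers unfamiliar with SOAP, the sketch below builds a SOAP 1.1 request in Python using only the standard library and prepares, but does not send, the HTTP POST. The endpoint URL, the service namespace and the `runQuery` / `queryDescription` names are placeholders invented for the example; the published MSD method signatures are not given in the slides:

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:msd-query"        # hypothetical service namespace
ENDPOINT = "http://example.org/msd/query"   # hypothetical endpoint URL


def build_envelope(query_xml):
    """Wrap a query description in a SOAP envelope calling a hypothetical method."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}runQuery")
    arg = ET.SubElement(call, f"{{{SERVICE_NS}}}queryDescription")
    arg.text = query_xml  # escaped automatically when serialised
    return ET.tostring(envelope, encoding="utf-8")


payload = build_envelope('<query><field name="name">lysozyme</field></query>')

# Prepare the HTTP request; it is only constructed here, not sent.
request = urllib.request.Request(
    ENDPOINT,
    data=payload,
    headers={"Content-Type": "text/xml; charset=utf-8", "SOAPAction": '""'},
    method="POST",
)
print(payload.decode("utf-8"))
```

In practice a SOAP library would generate this envelope from the service description; the point here is only that the call is ordinary XML carried over HTTP.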
Query generation
[Diagram: SQL > select ... from ... where ..., assembled from fragments. Box labels: "B,C,E"; "Fragments of C"; "A,B"; "B,D"; ""C" - external"; "B"]
Legend:
- A = selection
- B = DB objects
- C = Query
- D = table joins
- E = plugin description
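One possible reading of the legend is that the final statement is assembled from separately generated pieces: a selection (A), the database objects involved (B), the query conditions (C) and the joins between tables (D), with externally executed components described by a plugin (E). The Python sketch below follows that reading. It is an assumption about the diagram, not the server's actual algorithm, and the table names, column names and the canned "external" result are invented:

```python
def external_search(plugin, argument):
    """Stand-in for a search run outside the database (canned, hypothetical result)."""
    canned = {("fasta", "SEQ1"): ["e1", "e7"]}
    return canned.get((plugin, argument), [])


def build_sql(selection, db_objects, joins, conditions, external=None):
    """Assemble SELECT A FROM B WHERE C AND D, plus an optional external part."""
    params = []
    where = []
    for column, value in conditions:      # C: query conditions
        where.append(f"{column} = ?")
        params.append(value)
    where.extend(joins)                   # D: table joins
    if external is not None:              # E: component resolved outside the DB
        ids = external_search(*external)
        if ids:
            marks = ", ".join("?" for _ in ids)
            where.append(f"entry.id IN ({marks})")
            params.extend(ids)
        else:
            where.append("1 = 0")         # no external hits, so no rows match
    sql = (
        f"SELECT {', '.join(selection)} "     # A: selection
        f"FROM {', '.join(db_objects)} "      # B: DB objects
        f"WHERE {' AND '.join(where)}"
    )
    return sql, params


sql, params = build_sql(
    selection=["entry.id", "ligand.name"],
    db_objects=["entry", "ligand"],
    joins=["entry.id = ligand.entry_id"],
    conditions=[("ligand.name", "HEM")],
    external=("fasta", "SEQ1"),
)
print(sql)
print(params)
```

Here the external component is reduced to a list of matching identifiers that is fed back into the WHERE clause, which is one simple way to combine database and non-database search steps.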