EMBL-EBI MSD Search tools
The “Atlas” Pages
The Atlas: Ligands
The Atlas: Sequence
View structures as wireframe, backbone or ribbons
Built-in sequence viewer
Calculate and display surfaces
Various display options:
– Ramachandran plots
– Distance matrix
– B-factors
Based on the AstexViewer™ from Astex Technology Limited and modified under licence by the MSD group
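As an illustration of the distance matrix display option, here is a minimal sketch of how such a matrix can be computed from atomic coordinates. This is not the AstexViewer code; the coordinates are invented and the function name `distance_matrix` is hypothetical.

```python
import math

# Invented coordinates (in Angstroms) standing in for three C-alpha atoms.
ca_atoms = [
    (0.0, 0.0, 0.0),
    (3.0, 4.0, 0.0),
    (3.0, 4.0, 12.0),
]


def distance_matrix(coords):
    """Return the symmetric matrix of pairwise Euclidean distances."""
    return [[math.dist(a, b) for b in coords] for a in coords]


matrix = distance_matrix(ca_atoms)
for row in matrix:
    print(" ".join(f"{d:6.1f}" for d in row))
```

A viewer would typically map each matrix cell to a colour, so that contacts between residues far apart in sequence show up as off-diagonal features.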
Simple search interface
Strengths:
– simple, easy-to-use form
– allows multiple search fields to be combined
– relatively fast, despite performing quite complex SQL queries
Weaknesses:
– does not expose the power of a relational database
– the user can't specify the relationship between search fields, for example:
    "name" AND "title" AND "keyword"
    "name" OR "title" OR "keyword"
    ( "name" OR "title" ) AND NOT "keyword"
– the search form is defined by the authors of the search system, not the author of a query
Describing complex searches
We want to allow the user to entirely control their query
Since HTML forms are inherently static, we'll use an applet to provide a dynamic "form" that will let the user:
– choose the fields to be searched
– specify the relationships between search fields
– choose the result fields and how results are presented
– perform "complex" sub-queries, e.g. SSM, FASTA
Graphical DB search system
MSDpro uses an applet for constructing queries and a server to execute them
Avoids the need for the user to understand a complex database schema or know SQL
The user describes their query entirely graphically, including logical operations such as AND, OR and NOT
The applet generates an XML description of the user’s query, which is sent to the MSD query server and converted to SQL automatically
Automatic SQL generation
The query server is a Java servlet that:
– accepts a query description as XML
– converts the user’s query description into a true SQL query, which is then submitted to the search database
Searches can include components that are executed outside of the database, e.g.
– sequence similarity, determined using FASTA
– structural similarity, determined using SSM
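The XML-to-SQL step can be sketched as a small recursive translation. The real server is a Java servlet and its XML format is not shown in these slides, so everything below is an assumption made for illustration: the element names (`query`, `and`, `or`, `not`, `field`), the attribute names, the table and column names, and the functions `condition` and `build_sql`. Values are passed as bound parameters rather than pasted into the SQL string.

```python
import xml.etree.ElementTree as ET

# Hypothetical query description; the real MSD XML format is not shown here.
QUERY_XML = """
<query table="entry" select="entry_id">
  <and>
    <or>
      <field name="name" op="like" value="%lysozyme%"/>
      <field name="title" op="like" value="%lysozyme%"/>
    </or>
    <not>
      <field name="keyword" op="=" value="mutant"/>
    </not>
  </and>
</query>
"""

ALLOWED_OPS = {"=", "<", ">", "like"}


def check_identifier(name):
    """Reject anything that is not a plain table or column name."""
    if not name or not name.isidentifier():
        raise ValueError(f"bad identifier: {name!r}")
    return name


def condition(node, params):
    """Recursively convert one XML node into a SQL condition with ? placeholders."""
    if node.tag == "field":
        op = node.get("op", "=")
        if op not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {op!r}")
        params.append(node.get("value"))
        return f"{check_identifier(node.get('name'))} {op.upper()} ?"
    parts = [condition(child, params) for child in node]
    if node.tag == "not" and len(parts) == 1:
        return f"NOT ({parts[0]})"
    if node.tag in ("and", "or") and parts:
        return "(" + f" {node.tag.upper()} ".join(parts) + ")"
    raise ValueError(f"unexpected element: <{node.tag}>")


def build_sql(xml_text):
    """Return (sql, params) for a query description with one root condition."""
    root = ET.fromstring(xml_text)
    params = []
    where = condition(root[0], params)
    sql = (f"SELECT DISTINCT {check_identifier(root.get('select'))} "
           f"FROM {check_identifier(root.get('table'))} WHERE {where}")
    return sql, params


sql, params = build_sql(QUERY_XML)
print(sql)
print(params)
```

The example encodes ( "name" OR "title" ) AND NOT "keyword", the kind of relationship the simple form cannot express. Components executed outside the database, such as FASTA or SSM, are not covered by this sketch.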
Web-services
Some of the new services from MSD are designed as web-services:
– web-services are network-based services with published method signatures
– they can be accessed via the SOAP protocol, over HTTP, from any language with a SOAP library
The same services used within MSDpro will be accessible to any SOAP client
The MSD query engine will also be available as a web-service, allowing users to submit queries programmatically
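To make "submit queries programmatically" concrete, here is a minimal sketch of the SOAP 1.1 request body a client might build. The method name `submitQuery`, the argument name `queryXml` and the namespace `urn:example:msd-query` are hypothetical placeholders, not the published MSD method signatures; only the SOAP envelope namespace is standard. The sketch builds the message and does not send it.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # standard SOAP 1.1 namespace
SERVICE_NS = "urn:example:msd-query"  # hypothetical service namespace


def build_envelope(method, arguments):
    """Wrap a method call and its arguments in a SOAP 1.1 envelope."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{method}")
    for name, value in arguments.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical call: pass an XML query description as a string argument.
envelope_xml = build_envelope("submitQuery", {"queryXml": "<query/>"})
print(envelope_xml)
```

A SOAP 1.1 client would POST this text over HTTP with a `text/xml` content type and a `SOAPAction` header. In practice a SOAP library generates the envelope from the service's published description, so the user never writes it by hand.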
Visualisation
The process of representing abstract data to aid in understanding the meaning of the data.
Not to be confused with rendering data (drawing pictures).
Typically, though, we render data in such a way as to visualise the information within that data.
Introduction
Biological data comes from, and is of interest to:
– Chemists: reaction mechanism, drug design
– Biologists: sequence, expression, homology, function
– Structural biologists: atomic structure, fold, classification, function
– Medicine: clinical effect
– Education
– Media
Presentation of diverse information to a diverse audience.
Each has their own point of view (context):
– Expert = scientist working within their own field of expertise
– Non-expert = scientist using data/information outside their field
– Novice = non-scientist
Web pages
These are notoriously badly designed, often leaving the information on the site unusable.
The front page should load quickly
The main point should appear on the first full screen
Clutter – not logically laid out
Too busy – cannot find the salient point
8% of men and 0.5% of women are colour blind
Bad text/fonts
Too often it doesn’t work
The user will go somewhere else
The latest whizz-bang stuff only works on the latest browsers
Only works in one browser – they only tested on one
Does not conform to standard HTML
Not just presentation of results
Google is a good design
Asking questions
Biological data is very complex
– Chemistry, Biology, Physics, Statistics, Medicine...
Most users will be from a different field
Asking the right question is difficult
– The user cannot use the correct terminology
– Too many things to query (2000 attributes in MSD)
– SQL: not suitable for most users
Interface too complex
– Too many check boxes, widgets etc.
– Trying to be too clever
– The “Go” button is buried somewhere
Result presentation
Biological data is complex
– Chemistry, physics, biology, statistics, medicine…
Expert users want all the detail
– i.e. they want to use a specific method
– They want all the details
– They want (I hope) the statistical validity of the results
The non-expert wants the best-practice answer returned within their own context
– They want comparative analysis with other fields
– They want to know the results are valid
Query design
Suitable for text queries
Only one logic: AND or OR
Predefined
Easy to use
Limited scope
– 2000 attributes -> 2000 check-boxes!
The simple text box design is very common
Query design
Graphical interface
Multiple logic: AND/OR/NOT
Under the user’s control
Slower
Steep learning curve
– Some users just cannot get it
Intuitive once mastered
Pretty
Query design
Figurative 2D sketch for 3D query (active sites)
Informative – presents meaning for the question
Slower
Less error prone

    HIS|SER:S/H>C2.0
    HIS.ne2:S/S>C2.0
    HIS.[n]:S/T>C2.0

    select distinct entry_id, ligand_id
      from contact_search sel
     where neighbour_code_3_letter in ('SER','HIS')
       and DISTANCE = 2
    intersect
    select distinct entry_id, ligand_id
      from contact_search sel
     where neighbour_code_3_letter = 'HIS'
       and NEIGHBOUR_ATOM_NAME = 'NE2'
       and DISTANCE = 3;
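To see what the INTERSECT query on this slide selects, it can be run against a toy table. The sketch below uses an in-memory SQLite database; the column types, the three example entries and the reading of DISTANCE as an integer value are assumptions made for illustration, not the real MSD schema or data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout: only the columns named in the slide's query, types guessed.
conn.execute("""
    create table contact_search (
        entry_id text,
        ligand_id text,
        neighbour_code_3_letter text,
        neighbour_atom_name text,
        distance integer
    )
""")

# Invented contacts: only entry 1aaa satisfies both halves of the query.
contacts = [
    ("1aaa", "LIG1", "SER", "OG", 2),
    ("1aaa", "LIG1", "HIS", "NE2", 3),
    ("1bbb", "LIG2", "HIS", "NE2", 3),  # matches the second select only
    ("1ccc", "LIG3", "SER", "OG", 2),   # matches the first select only
]
conn.executemany("insert into contact_search values (?, ?, ?, ?, ?)", contacts)

# The query from the slide, unchanged apart from line breaks.
QUERY = """
select distinct entry_id, ligand_id from contact_search sel
 where neighbour_code_3_letter in ('SER','HIS') and DISTANCE = 2
intersect
select distinct entry_id, ligand_id from contact_search sel
 where neighbour_code_3_letter = 'HIS' and NEIGHBOUR_ATOM_NAME = 'NE2'
   and DISTANCE = 3;
"""
hits = conn.execute(QUERY).fetchall()
print(hits)
```

Each select finds ligands with one kind of contact, and INTERSECT keeps only the (entry, ligand) pairs that have both. This is the logic the user would otherwise express by drawing the 2D sketch.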
YAMGP (yet another molecular graphics program)
Many different programs are available:
Quanta, Rasmol, MolMol, Chime, O, Spock, Swiss-PDBviewer, Molscript, iMol, Pymol, Chimera, XtalView, Frodo, Bobscript, InsightII, Raster3D, WebLab-viewer, POVRay, Yasara, LigPlot, WebMol, Grasp, Mage, Whatif, VMD
Result visualisation
Multiple types of biological data:
– Textual data
– 3D structure
– 2D chemical sketches
– 1D sequence
– Node linked
– General/derived data
– Web pages
– Errors/Variance
– Data provenance
Structure
Java 1.1 applet
– Should run under most browsers
– Small footprint, high speed
Line, stick, ball & stick, sphere, schematic, surface + texture map
Written by Mike Hartshorn (Astex Therapeutics Ltd)
Multiple structures supported
Sequence
Multiple sequence alignment
Editing, annotation, colours…
Consensus alignment
Pick, brushing & magic lens

Chemistry
2D flat representation
Annotation, colours…
Interaction types
Placement fn(contact distance)
Editable
Pick, brush and magic lens

Graphs
2D, 2D grid and ND
Linkage plots
Annotation, colours…
Ramachandran, etc…
Pick, brush, magic lens
Visualisation
Lensing
Linked views
Brushing
Picking
Flying views
Hyperbolic distortion
Animation
Solid rendering
Depth cues
Colour, lighting
Highlighting
Etc…

Visualisation: comparative analysis
Similarity/Difference
Data superposition
Attribute display
Colour, size…
Correlation
Attribute mapping
Sequence colour by structure alignment