EMBL-EBI Dimitris Dimitropoulos MSD-mine

MSD-mine overview
- Web application for online data analysis and mining
- For the advanced MSDSD researcher
- Flexible guidance for ad-hoc queries
- Exploitation of integrated knowledge
- Analysis, charts and data drill
- Flexible combination of data with multiple joins
- Generic but customised for the MSDSD

Characteristics
- Classical systems give a list of entries for visualisation
- MSD-mine returns detailed records, homogenised and ready for analysis
- Allows arbitrary queries
  - on the more than 100 entities (tables), organised in 9 sections (or marts)
  - with restrictions and results for 2000 attributes
  - combining entities based on 450 relations
- Operability safeguards
  - Rejects long queries (10 mins) and overload of results (1000 rows)
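
The operability safeguards can be pictured with a minimal sketch. It reads the slide as "abort queries that run longer than 10 minutes, and refuse result sets larger than 1000 rows". How MSD-mine actually enforces this is not stated, so everything here is an assumption: the SQLite stand-in database, the `run_guarded` helper, the `QueryRejected` exception and the `entry` table are all hypothetical.

```python
import sqlite3
import time

MAX_ROWS = 1000          # result cap quoted on the slide
MAX_SECONDS = 10 * 60    # time limit quoted on the slide


class QueryRejected(Exception):
    """Raised when a query exceeds the row or time limit (hypothetical)."""


def run_guarded(conn, sql, params=(), max_rows=MAX_ROWS, max_seconds=MAX_SECONDS):
    """Run a query, rejecting it if it is too slow or returns too many rows."""
    deadline = time.monotonic() + max_seconds
    # SQLite calls this handler periodically; a non-zero return aborts the query.
    conn.set_progress_handler(
        lambda: 1 if time.monotonic() > deadline else 0, 1000
    )
    try:
        # Fetch one row more than allowed, just to detect an overload.
        rows = conn.execute(sql, params).fetchmany(max_rows + 1)
    except sqlite3.OperationalError as exc:
        raise QueryRejected(f"query aborted: {exc}") from exc
    finally:
        conn.set_progress_handler(None, 0)  # remove the handler again
    if len(rows) > max_rows:
        raise QueryRejected(f"more than {max_rows} rows; add restrictions")
    return rows


# Made-up data: 1500 rows in a stand-in table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (id INTEGER)")
conn.executemany("INSERT INTO entry VALUES (?)", [(i,) for i in range(1500)])

small = run_guarded(conn, "SELECT id FROM entry WHERE id < ?", (10,))

try:
    run_guarded(conn, "SELECT id FROM entry")  # 1500 rows: over the cap
    rejected = False
except QueryRejected:
    rejected = True
```

Fetching `max_rows + 1` rows lets the sketch detect an overload without loading the whole result.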

Exploring MSDSD
- Explores and explains MSDSD
  - With context-sensitive help and descriptions
  - With links to MSDSD documentation
- Helps to understand the structure of MSDSD
- Helps in learning to write SQL for advanced custom queries

Filter build page
- Areas on the page
  - Entity area (E): select entities and relations
  - Restriction area (R): set or view the restrictions
  - Filter area (F): view the nodes of the filter
  - Description area (D): context-sensitive documentation

MSDSD marts
- MSDSD is organised in sections (marts)
- Each mart is a set of entities that may start a filter

Define restrictions
- Select the attribute
- Choose the operator
- Type in the value or select one from a sample list
- Add the new restriction
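
The four steps above map naturally onto a parameterised SQL condition. The sketch below is only an illustration of that idea, not MSD-mine's implementation: the helpers `build_restriction` and `sample_values`, the operator list, and the `entry` table with its `method` column are all hypothetical, and SQLite stands in for the real database.

```python
import sqlite3

# Operators offered to the user; anything else is rejected (assumed list).
ALLOWED_OPERATORS = {"=", "<>", "<", "<=", ">", ">=", "LIKE"}


def build_restriction(attribute, operator, value, known_attributes):
    """Return a SQL fragment and its bound parameters for one restriction."""
    if attribute not in known_attributes:
        raise ValueError(f"unknown attribute: {attribute}")
    if operator not in ALLOWED_OPERATORS:
        raise ValueError(f"unsupported operator: {operator}")
    # The value is bound as a parameter, never pasted into the SQL text.
    return f"{attribute} {operator} ?", [value]


def sample_values(conn, table, attribute, limit=5):
    """Return a few distinct values of an attribute, as a pick list.

    Table and attribute names are trusted here; a real tool would check
    them against its catalogue first.
    """
    sql = f"SELECT DISTINCT {attribute} FROM {table} ORDER BY {attribute} LIMIT ?"
    return [row[0] for row in conn.execute(sql, (limit,))]


# Made-up rows in a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (entry_id TEXT, method TEXT)")
conn.executemany(
    "INSERT INTO entry VALUES (?, ?)",
    [("e1", "X-RAY"), ("e2", "NMR"), ("e3", "X-RAY")],
)

samples = sample_values(conn, "entry", "method")          # the sample list
clause, params = build_restriction(
    "method", "=", samples[0], known_attributes={"entry_id", "method"}
)
rows = conn.execute(f"SELECT entry_id FROM entry WHERE {clause}", params).fetchall()
```

Checking the attribute and operator against fixed sets keeps user input out of the SQL text, while the value itself travels as a bound parameter.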

Combine entities
- Combine an entity with another using one of its relations
- Relations are organised per mart
- Understand cardinality
- The user may choose the new entity as the working node and follow its relations
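
In SQL terms, combining two entities through a relation is a join, and cardinality decides how many rows come back. A minimal sketch under stated assumptions: the `RELATIONS` catalogue, the `combine` helper, and the `entry` and `assembly` tables with their columns are hypothetical stand-ins, with SQLite in place of the real database.

```python
import sqlite3

# Hypothetical relation catalogue: (from entity, to entity) -> join condition.
RELATIONS = {
    ("entry", "assembly"): "entry.entry_id = assembly.entry_id",
}


def combine(start, target):
    """Return the FROM clause that joins two entities via their relation."""
    condition = RELATIONS[(start, target)]
    return f"FROM {start} JOIN {target} ON {condition}"


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entry (entry_id TEXT PRIMARY KEY, title TEXT);
CREATE TABLE assembly (
    assembly_id INTEGER PRIMARY KEY, entry_id TEXT, assembly_type TEXT
);
-- Made-up rows: entry e1 has two assemblies, entry e2 has one.
INSERT INTO entry VALUES ('e1', 'EXAMPLE A'), ('e2', 'EXAMPLE B');
INSERT INTO assembly VALUES
    (1, 'e1', 'MONOMERIC'), (2, 'e1', 'DIMERIC'), (3, 'e2', 'MONOMERIC');
""")

joined = conn.execute(
    "SELECT entry.entry_id, assembly.assembly_type "
    + combine("entry", "assembly")
    + " ORDER BY assembly.assembly_id"
).fetchall()

# One-to-many cardinality: 2 entries yield 3 joined rows.
entries_seen = {entry_id for entry_id, _ in joined}
```

The row count after the join (3) exceeds the number of entries (2), which is why understanding cardinality matters before reading counts or averages off a combined result.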

MSD preferences
- The user may set preferences to specify MSDSD shortcuts for filters
  - All assemblies, representative assembly or asymmetric unit
  - All models or representative model
  - One chain per sequence
  - All entries, SCOP or DALI entries, or a custom set

Execute query
- View and navigate results
  - Load all records
  - Set result-based constraints
  - View details
  - Navigate relation links
  - Export as text, XML or script

Data analysis
- Complete or sample
- Range or value
- Fully customisable
- Context-sensitive chart
- Data drill operations

Analysis over a base attribute
- Choose the base attribute
- Choose the grouping operation for the analysis attribute
- Options and data-drill operations supported
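
One way to read "analysis over a base attribute" is as a SQL `GROUP BY`: the base attribute defines the groups and the grouping operation summarises the analysis attribute within each group. The sketch below assumes that reading. The `analysis_sql` helper, the list of operations, and the `helix` table with `helix_class` and `length` columns are hypothetical.

```python
import sqlite3

# Grouping operations assumed to be on offer; all are standard SQL aggregates.
GROUPING_OPERATIONS = {"COUNT", "AVG", "MIN", "MAX", "SUM"}


def analysis_sql(table, base_attribute, analysis_attribute, operation):
    """Build a query that summarises one attribute per value of another."""
    if operation not in GROUPING_OPERATIONS:
        raise ValueError(f"unsupported grouping operation: {operation}")
    return (
        f"SELECT {base_attribute}, {operation}({analysis_attribute}) "
        f"FROM {table} GROUP BY {base_attribute} ORDER BY {base_attribute}"
    )


# Made-up rows in a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE helix (helix_class TEXT, length INTEGER)")
conn.executemany(
    "INSERT INTO helix VALUES (?, ?)",
    [("ALPHA", 10), ("ALPHA", 14), ("3-10", 4)],
)

# Average helix length (analysis attribute) per helix class (base attribute).
summary = conn.execute(
    analysis_sql("helix", "helix_class", "length", "AVG")
).fetchall()
```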

First example
- Find the entries with resolution < 1.2
  - Select the “Structure” mart and choose the Entry table
  - Set the restriction on resolution
  - Browse the results

Filter expressions
- Find the entries that have resolution < 1.2 and are related to HEMOGLOBIN
  - Add the main restriction on the resolution
  - Add a sub-expression where the logical operator is “Or”
  - In the sub-expression, require that the title contains the word “HEMO”, “HAEMO” or “GLOBIN”
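
The filter on this slide is a small expression tree: a main restriction combined with an "Or" sub-expression. The following sketch shows how such a tree could be compiled into a SQL `WHERE` clause. It is an illustration only; the `compile_filter` helper, the tuple-based tree layout, and the `entry` table with `title` and `resolution` columns are assumptions, and the rows are invented.

```python
import sqlite3


def compile_filter(node):
    """Compile a nested filter tree into a SQL condition and its parameters.

    A node is either ("AND" | "OR", [children]) or a leaf
    (attribute, operator, value).
    """
    if node[0] in ("AND", "OR"):
        parts, params = [], []
        for child in node[1]:
            sql, child_params = compile_filter(child)
            parts.append(sql)
            params.extend(child_params)
        return "(" + f" {node[0]} ".join(parts) + ")", params
    attribute, operator, value = node
    return f"{attribute} {operator} ?", [value]


# Main restriction on resolution plus an "Or" sub-expression on the title.
hemoglobin_filter = ("AND", [
    ("resolution", "<", 1.2),
    ("OR", [
        ("title", "LIKE", "%HEMO%"),
        ("title", "LIKE", "%HAEMO%"),
        ("title", "LIKE", "%GLOBIN%"),
    ]),
])

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (entry_id TEXT, title TEXT, resolution REAL)")
conn.executemany(
    "INSERT INTO entry VALUES (?, ?, ?)",
    [
        ("e1", "HUMAN HAEMOGLOBIN", 1.0),   # matches
        ("e2", "HEMOGLOBIN", 2.0),          # resolution too coarse
        ("e3", "LYSOZYME", 1.0),            # title does not match
        ("e4", "MYOGLOBIN", 1.1),           # matches through "GLOBIN"
    ],
)

where_sql, params = compile_filter(hemoglobin_filter)
matches = conn.execute(
    f"SELECT entry_id FROM entry WHERE {where_sql} ORDER BY entry_id", params
).fetchall()
```

Note that the "GLOBIN" term also picks up titles such as myoglobin, so a substring filter like this one selects a broader family than hemoglobin alone.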

A simple distribution chart
- Find the distribution of assembly types
  - Use the “Assembly” table from the “Structure” mart
  - Execute the query
  - Go to the analysis page for the “Assembly type” attribute
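
A distribution of this kind amounts to counting rows per value. As a rough sketch of what the analysis page computes, assuming a hypothetical `assembly` table with an `assembly_type` column and invented rows, the counts and a crude text chart could be produced like this:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE assembly (assembly_id INTEGER, assembly_type TEXT)")
# Made-up rows; the type names are placeholders, not real MSDSD values.
conn.executemany(
    "INSERT INTO assembly VALUES (?, ?)",
    [
        (1, "MONOMERIC"),
        (2, "DIMERIC"),
        (3, "MONOMERIC"),
        (4, "TETRAMERIC"),
        (5, "MONOMERIC"),
    ],
)

# Count assemblies per type, most frequent first (ties broken by name).
distribution = conn.execute(
    "SELECT assembly_type, COUNT(*) AS n FROM assembly "
    "GROUP BY assembly_type ORDER BY n DESC, assembly_type"
).fetchall()

# A text-only bar chart standing in for the chart on the analysis page.
for assembly_type, n in distribution:
    print(f"{assembly_type:<12}{'#' * n} {n}")
```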

Relation and external links
- Find entries related to “cell death” and follow their GO (Gene Ontology) mappings and the links to the external GO service
  - Use the “Entry” table where the title contains the word “death”
  - Follow the GO mappings for a particular entry
  - Follow the links to the GO database

A more complex example
- Find the active site contacts of helices that are part of beta-alpha-beta motifs, and examine their linearity
  - Select “Motif” as the starting point and combine with “Helix” and “Residue Contacts”
  - Add a restriction
  - View results and statistics for the helix linearity
  - Focus (drill) on an area of interest

Saving results and exporting
- Find the binding sites of “kinked” residues
  - Build the query by combining the “Residue”, “Helix” and “Site” tables
  - Save the results to a local file
  - Export the results
    - in XML
    - TAB delimited
    - as a script
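
Two of the export formats named above, TAB delimited and XML, are easy to illustrate with the Python standard library. This is a sketch of the formats only, not of MSD-mine's actual output layout: the helpers `to_tab_delimited` and `to_xml`, the element names `results` and `row`, and the column names and rows are all hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET


def to_tab_delimited(columns, rows):
    """Serialise a result set as TAB-delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buffer.getvalue()


def to_xml(columns, rows, root_tag="results", row_tag="row"):
    """Serialise a result set as XML, one element per row and per column."""
    root = ET.Element(root_tag)
    for row in rows:
        element = ET.SubElement(root, row_tag)
        for name, value in zip(columns, row):
            ET.SubElement(element, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Made-up result rows for a Residue/Helix/Site query.
columns = ["residue_id", "helix_id", "site_id"]
rows = [(101, 7, "AC1"), (102, 7, "AC1")]

tab_text = to_tab_delimited(columns, rows)
xml_text = to_xml(columns, rows)

# Saving to a local file is then a plain write, for example:
# with open("results.tsv", "w", encoding="utf-8") as handle:
#     handle.write(tab_text)
```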

Preferences and representative sets
- Find the distribution of the number of crystals in experiments
  - Use the “XRay-data” table
  - View the distribution of the number of crystals
    - For the whole PDB
    - For the DALI representative set
    - For our own custom representative set

Custom filters and results
- Find the percentage of residues that interact in helix interactions, for helices of similar size
  - Use the “Helix interaction” table
  - Add a custom “normalised interaction factor” result item
  - Add a custom restriction: “one helix is at most double the size of the other”
  - View the distribution of the “interaction factor”
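
A custom result item and a custom restriction can both be thought of as SQL expressions over existing attributes. The sketch below makes that concrete under loudly stated assumptions: the slide does not define the "normalised interaction factor", so the formula used here (interacting residues as a percentage of the combined length of both helices) is a guess, and the `helix_interaction` table, its columns and its rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE helix_interaction ("
    "interaction_id INTEGER, helix1_length INTEGER, "
    "helix2_length INTEGER, interacting_residues INTEGER)"
)
# Made-up rows: (id, length of helix 1, length of helix 2, interacting residues).
conn.executemany(
    "INSERT INTO helix_interaction VALUES (?, ?, ?, ?)",
    [
        (1, 10, 12, 11),   # similar sizes
        (2, 8, 30, 6),     # helix 2 is more than double helix 1: excluded
        (3, 20, 10, 12),   # exactly double: still allowed
    ],
)

CUSTOM_QUERY = """
SELECT interaction_id,
       -- custom result item (assumed formula): percentage of residues interacting
       100.0 * interacting_residues / (helix1_length + helix2_length)
           AS interaction_factor
FROM helix_interaction
-- custom restriction: neither helix is more than double the size of the other
WHERE helix1_length <= 2 * helix2_length
  AND helix2_length <= 2 * helix1_length
ORDER BY interaction_id
"""

factors = conn.execute(CUSTOM_QUERY).fetchall()
```

The restriction is written symmetrically so that it does not matter which helix of a pair is stored first.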