EBI is an Outstation of the European Molecular Biology Laboratory. MSDchem and the chemistry of the wwPDB EMBO 22nd-26th September 2008 EMBL-EBI Hinxton.


1 MSDchem and the chemistry of the wwPDB
EMBO, 22nd-26th September 2008, EMBL-EBI, Hinxton, UK
EBI is an Outstation of the European Molecular Biology Laboratory.

2 The PDB chemical components
• The PDB holds more than the 3-D folding of standard polymers
• It gives insight into interesting special chemistry
  • Bound ligands
  • Modified amino acids
• Non-standard chemical components are often the most interesting
• The PDB ligand dictionary has served for many years as the reference dictionary for the chemical definition of 3-letter codes in PDB data

3
• The ligand dictionary has been maintained by the curators at all wwPDB sites
• Problems accumulated over time
  • Duplicate entries
  • Impossible chemistry
  • The definition of what a 3-letter code represents was not clear or consistent
  • Stereochemistry was ignored

4 The MSDchem database
• The database that supported the chemical component dictionary in the MSD
• The curation team had an explicit, clear definition of a ligand right from the start
  • A distinct stereoisomer, defined by
    • Connectivity
    • Bond orders
    • Absolute stereo-descriptors of atoms and bonds
• This was reflected in the design and implementation of the MSDChem database

5 MSDchem ligand definition
• The ligand identity
  • Atoms, elements, bonds and bond orders
  • Atom and bond absolute stereo-descriptors (Cahn-Ingold-Prelog)
  • Equivalent to a canonical stereo SMILES or InChI string
• [Figure labels: DCF (C4' R, C3' S, C1' R); DCM (C4' S, C3' R, C1' S)]
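The identity definition above can be sketched in a few lines of Python. This is an illustration only, not the MSDchem schema: the `LigandIdentity` class, the `make_identity` helper and the toy bond list are assumptions made for the example. Only the stereo-descriptors for DCF and DCM come from the slide. The point is that two components with identical connectivity are still different ligands when their absolute stereo-descriptors differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LigandIdentity:
    """Hypothetical identity record: connectivity plus absolute stereo-descriptors."""
    bonds: frozenset   # {(atom_a, atom_b, bond_order), ...}
    stereo: frozenset  # {(atom_name, "R" or "S"), ...}


def make_identity(bonds, stereo):
    # Normalise each bond so that (a, b) and (b, a) compare equal.
    norm = frozenset((min(a, b), max(a, b), order) for a, b, order in bonds)
    return LigandIdentity(norm, frozenset(stereo.items()))


# Toy connectivity shared by both components (NOT the real DCF/DCM graph).
toy_bonds = [("C1'", "C2'", 1), ("C2'", "C3'", 1), ("C3'", "C4'", 1)]

# Stereo-descriptors as listed on the slide.
dcf = make_identity(toy_bonds, {"C4'": "R", "C3'": "S", "C1'": "R"})
dcm = make_identity(toy_bonds, {"C4'": "S", "C3'": "R", "C1'": "S"})

same_connectivity = dcf.bonds == dcm.bonds  # identical graphs
same_ligand = dcf == dcm                    # stereo-descriptors differ

print(same_connectivity, same_ligand)
```

Because the record is hashable, it could also serve as a dictionary key for spotting two codes that describe the same ligand, which is the duplicate problem the earlier slides mention.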

6
• Other properties
  • Atom names, and atom/bond ordering
  • Representative coordinates
• Derived properties
  • Aromatic bonds
  • SMILES and InChI strings
  • Systematic names
  • Idealised coordinates
  • Rings and planes
  • Atom energy types

7 Ligand curation
• For known ligands, coordinates are checked against the ligand definition (program DOHLC)
  • Atom labelling is checked
  • A new ligand may have to be defined
• For a new ligand
  • Fundamental properties are checked
  • Derived properties are generated
  • Is it identical to an existing ligand with another code? (DOHLC)
• [Figure labels: 3TH; Not possible; New ligand; Actually it is 6CP]

8 Ligands in the wwPDB
• Improvement of the chemical dictionary
  • A core task of the wwPDB remediation project
  • Remaining issues and data errors were fixed
    • Duplicate identical ligands
    • No representative coordinates
    • Wrong valences
  • The definition of the ligand identity and the deviations were agreed across the wwPDB
• The wwPDB invested significantly in this area with a new software toolkit (ChemComp)
  • It replaced most of the MSDChem backend

9
• Additional investment in chemical software
  • Use of chemical software packages
    • CACTVS
    • OpenEye
    • CORINA
    • LexiChem
• MSDChem is not a separate data resource
  • It is just a load of the wwPDB ligand dictionary into Oracle
• IUPAC atom names, deoxy bases, better chemical names

10 Difficult issues
• Molecules too big to be a single chemical component
• Special chemistry (like metal complexes)
• Limitations of chemical software
• Legacy chemical components that are hard to deal with (like ions)
• Components that have never been fully observed
• Modified components

11 The MSDChem web application
• Public pages for the wwPDB ligand dictionary
  • Based on an Oracle database load
  • Various search options
  • Visualisation and navigation
  • Export to other formats
• Has been running for almost 6 years
• Is used and referred to by
  • Ligand Depot (RCSB equivalent)
  • ChEBI at EBI
  • PubChem at NCBI
  • HIC-Up and others

12 Statistics
• Daily average load of MSDChem
  • ~400 queries
  • ~100 distinct IP addresses

13 Search following references
• Most common case: search for a 3-letter code seen in a PDB file
• Search for a chemical name, or part of one, found in the literature
• All known names are searched
  • Common, PDB
  • Systematic
  • A synonym

14 MSDChem search
• 3-letter code
• Chemical name
  • Common, PDB
  • Systematic
  • A synonym

15 Ligand details
• For every kind of search there is a result list
  • Summary information
  • Preview icon of the molecule
• Links to pages for every chemical component
  • With detailed images
  • Links for more information about atoms, bonds etc.
  • Various options for 3-D visualisation
  • Download options for common chemical formats

16 Results overview
[Screenshot panels: Ligand overview; Ligand details]

17 Visualisation - Export
• Coordinates
  • Ideal
  • Representative
• Chemical formats
  • PDB
  • Molfile (SDF)

18 Searching for chemical composition
• Often aspects of the composition are known but not the exact structure
  • Like particular elements (metals etc.)
  • Or particular chemical fragments
• User-friendly expression-building pages based on formula or fragments
• Visually browse through the results

19 Formula range
• An expression can be built with a web form
• Example: O1-4 N3-100 F0
  • 1 to 4 oxygens
  • At least 3 nitrogens (3 to 100)
  • No fluorine
  • Anything else
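A formula-range expression such as the one on this slide can be evaluated with a small parser. The sketch below is not the MSDChem implementation. It assumes that each token is an element symbol followed by either a single count or an inclusive `low-high` range, and that elements not named in the expression are unconstrained ("anything else"). The function names and the candidate composition are hypothetical.

```python
import re


def parse_formula_range(expression):
    """Parse e.g. 'O1-4 N3-100 F0' into {'O': (1, 4), 'N': (3, 100), 'F': (0, 0)}."""
    ranges = {}
    for token in expression.split():
        match = re.fullmatch(r"([A-Z][a-z]?)(\d+)(?:-(\d+))?", token)
        if match is None:
            raise ValueError(f"cannot parse token: {token!r}")
        element, low, high = match.groups()
        low = int(low)
        # A single number such as 'F0' is treated as an exact count.
        high = int(high) if high is not None else low
        ranges[element] = (low, high)
    return ranges


def matches(composition, ranges):
    """True if every constrained element count lies in its inclusive range.

    Elements absent from `ranges` are not checked at all.
    """
    return all(
        low <= composition.get(element, 0) <= high
        for element, (low, high) in ranges.items()
    )


ranges = parse_formula_range("O1-4 N3-100 F0")
candidate = {"C": 5, "H": 5, "N": 5, "O": 1}  # hypothetical composition
print(matches(candidate, ranges))
```

Keeping the parsed ranges as a plain dictionary makes the same expression easy to translate into a database filter, which is presumably closer to how a web form backed by Oracle would apply it.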

20 Fragment search
• Web form
• Significant fragments
• Example:
  • More than 2 benzimidazoles
  • No piperazine
  • Anything else
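The fragment example on this slide amounts to filtering on pre-computed fragment counts. The following is a minimal sketch under that assumption. The table layout, the `search_by_fragments` function and the three component codes are made up for illustration and do not describe how MSDChem stores fragments.

```python
def search_by_fragments(fragment_counts, more_than=None, absent=()):
    """Return the codes whose fragment counts satisfy the constraints.

    fragment_counts: {code: {fragment_name: count}}
    more_than: {fragment_name: n}, meaning strictly more than n occurrences
    absent: fragment names that must not occur at all
    """
    more_than = more_than or {}
    hits = []
    for code, counts in fragment_counts.items():
        enough = all(counts.get(name, 0) > n for name, n in more_than.items())
        clean = all(counts.get(name, 0) == 0 for name in absent)
        if enough and clean:
            hits.append(code)
    return sorted(hits)


# Hypothetical fragment table; codes and counts are invented.
table = {
    "AAA": {"benzimidazole": 3},
    "BBB": {"benzimidazole": 3, "piperazine": 1},
    "CCC": {"benzimidazole": 2},
}

hits = search_by_fragments(
    table, more_than={"benzimidazole": 2}, absent=["piperazine"]
)
print(hits)  # ['AAA']
```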

21 Searching for parts of structure
• An outline of the structure, or of some characteristic part, is known
• Looking for variants of molecules
  • Load the known target and remove the unimportant parts
  • Perform a subgraph search
• Looking for chemical components with similar fragments and localised chemistry
  • Load the known target and perform a fingerprint search
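A fingerprint search ranks dictionary entries by how much of their local chemistry they share with the target. The slides do not say which fingerprint or similarity measure MSDChem uses, so this sketch makes two assumptions: a fingerprint is simply the set of fragment keys present in a molecule, and similarity is the Tanimoto coefficient. The library codes and fragment names are invented.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared keys divided by keys present in either set."""
    if not fp_a and not fp_b:
        return 1.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def fingerprint_search(target, library, threshold=0.5):
    """Return (code, score) pairs scoring at or above the threshold, best first."""
    scored = [(code, tanimoto(target, fp)) for code, fp in library.items()]
    hits = [hit for hit in scored if hit[1] >= threshold]
    # Sort by descending score, then by code for a stable order.
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


# Hypothetical fingerprints: each is the set of fragment keys in a molecule.
target = frozenset({"benzene", "imidazole", "carboxylate"})
library = {
    "X1": frozenset({"benzene", "imidazole", "carboxylate", "amine"}),
    "X2": frozenset({"benzene", "phosphate"}),
    "X3": frozenset({"benzene", "imidazole"}),
}

results = fingerprint_search(target, library)
print([code for code, _ in results])  # ['X1', 'X3']
```

Unlike a subgraph search, this comparison never looks at how the fragments are connected, which is why it suits the "similar fragments and localised chemistry" case rather than the "variants of a molecule" case.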

22 Substructure search
• Applet to draw a diagram
• Load and modify an existing ligand
• May take a couple of minutes
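A substructure search asks whether the drawn query graph can be embedded in a dictionary component. The sketch below uses plain backtracking over atoms labelled by element and bonds labelled by order. It is a generic illustration of subgraph matching, not the algorithm behind the MSDChem applet, and the `molecule` helper and the toy heavy-atom graphs are assumptions made for the example.

```python
def molecule(elements, bonds):
    """Build (elements, bonds): atom id -> element, frozenset({a, b}) -> order."""
    return elements, {frozenset((a, b)): order for a, b, order in bonds}


def is_substructure(query, target):
    """Backtracking search for an embedding of `query` in `target`."""
    q_elements, q_bonds = query
    t_elements, t_bonds = target
    q_atoms = list(q_elements)

    def compatible(q_atom, t_atom, mapping):
        if q_elements[q_atom] != t_elements[t_atom]:
            return False
        # Each query bond to an already-mapped atom must exist in the
        # target with the same bond order.
        for other_q, other_t in mapping.items():
            order = q_bonds.get(frozenset((q_atom, other_q)))
            if order is not None and t_bonds.get(frozenset((t_atom, other_t))) != order:
                return False
        return True

    def extend(index, mapping):
        if index == len(q_atoms):
            return True  # every query atom has been placed
        q_atom = q_atoms[index]
        used = set(mapping.values())
        for t_atom in t_elements:
            if t_atom not in used and compatible(q_atom, t_atom, mapping):
                mapping[q_atom] = t_atom
                if extend(index + 1, mapping):
                    return True
                del mapping[q_atom]  # undo and try the next target atom
        return False

    return extend(0, {})


# Toy heavy-atom graphs (hydrogens omitted).
carboxyl = molecule({1: "C", 2: "O", 3: "O"}, [(1, 2, 2), (1, 3, 1)])
acetic_acid = molecule(
    {1: "C", 2: "C", 3: "O", 4: "O"}, [(1, 2, 1), (2, 3, 2), (2, 4, 1)]
)
ethanol = molecule({1: "C", 2: "C", 3: "O"}, [(1, 2, 1), (2, 3, 1)])

print(is_substructure(carboxyl, acetic_acid))  # True
print(is_substructure(carboxyl, ethanol))      # False
```

Backtracking of this kind has exponential worst-case cost, so running it against every component in a large dictionary can be slow unless candidates are pre-filtered, for example by formula or fingerprint.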

23 Links to the PDB
• MSDchem searches strictly the reference dictionary
• But it provides links to the PDB entries that include a ligand or a set of ligands
  • From ligand details pages
  • And from any query results page
• Links to the summary pages for the entries (MSD Atlas pages)
• Or to instances of the ligands in entries, along with their environment and interactions (MSDmotif)

24 Link to PDB
• From any result page
  • Like a fragment search
• Link to PDB entries with such ligands

25 Link to binding sites
• Details: interactions of these ligands in entries
• Statistics: search within results

26 Ligand index - download
• Download of the complete archive
  • Compressed tar of Molfiles (SDF)
  • CML (ChEBI style)
  • MSDChem XML
  • Relational database
• Just listings
  • SMILES strings and names

27 Summary
• The wwPDB ligand dictionary provides the chemistry of the PDB
• The MSDChem backend has been merged into the remediation project
• The state of the dictionary has improved
• The MSDChem web application provides searching of the dictionary
  • Name
  • Formula
  • Substructure
  • Fragments and similarity

