Dimitris Dimitropoulos


1 Dimitris Dimitropoulos
Chemistry & the PDB MSDchem

2 The chemical database

3 MSDchem ligand dictionary
Complete, clean, up-to-date collection of all the chemical species and small molecules in the PDB.
A ligand in MSDchem is a complete, distinct stereoisomer of a chemical compound:
- Atoms and element types
- Bonds and bond orders
- Stereo configuration of atoms and bonds in cases of stereoisomers (R/S – E/Z)
Atom names and coordinates are not fundamental properties.
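The ligand definition on this slide can be pictured as a small data structure. The following is only a sketch: the class and field names are assumptions made for illustration and do not come from MSDchem, and the atom names in the example are illustrative labels. The example encodes the heavy atoms of thioalanine (ALT) from its SMILES, CC(N)C(O)=S, with the (2S) centre from its systematic name.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Atom:
    name: str                     # label only; not part of the chemical identity
    element: str
    stereo: Optional[str] = None  # "R", "S", or None for non-stereo atoms


@dataclass(frozen=True)
class Bond:
    first: int                    # index into Ligand.atoms
    second: int
    order: int
    stereo: Optional[str] = None  # "E", "Z", or None


@dataclass(frozen=True)
class Ligand:
    code: str
    atoms: Tuple[Atom, ...]
    bonds: Tuple[Bond, ...]

    def element_counts(self) -> str:
        """Summarise the element types of the stored atoms."""
        counts = Counter(atom.element for atom in self.atoms)
        return " ".join(f"{el}{n}" for el, n in sorted(counts.items()))


# Heavy atoms only; hydrogens are left out to keep the sketch short.
alt = Ligand(
    code="ALT",
    atoms=(
        Atom("CB", "C"),
        Atom("CA", "C", stereo="S"),
        Atom("N", "N"),
        Atom("C", "C"),
        Atom("O", "O"),
        Atom("S", "S"),
    ),
    bonds=(
        Bond(0, 1, 1),
        Bond(1, 2, 1),
        Bond(1, 3, 1),
        Bond(3, 4, 1),
        Bond(3, 5, 2),  # the C=S double bond
    ),
)
print(alt.element_counts())
```

Keeping the atom name as a plain label, separate from element, bonding and stereo, mirrors the statement that atom names are not fundamental properties.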

4 Role in the MSD database
- An integral component in the core of the MSD database
- Relational reference from entities where a molecule or atom name is used in the PDB (protein residues and atoms)
- An ATOM/HETATM line such as "HETATM C2 PLA C" cannot be loaded if the “PLA” ligand is not defined or does not include a “C2” atom
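The loading rule for the "PLA" / "C2" example can be sketched as a simple lookup. This is a minimal illustration, not the MSD loader: the real check is a relational reference inside the database, and the dictionary contents and the function name below are hypothetical.

```python
from typing import Dict, Optional, Set

# Hypothetical dictionary contents: ligand code -> atom names it defines.
LIGAND_ATOMS: Dict[str, Set[str]] = {"PLA": {"C1", "C2", "C3"}}


def reference_error(res_name: str, atom_name: str,
                    ligands: Dict[str, Set[str]]) -> Optional[str]:
    """Return the reason a coordinate record cannot be loaded, or None if it can."""
    if res_name not in ligands:
        return f"ligand {res_name!r} is not defined"
    if atom_name not in ligands[res_name]:
        return f"ligand {res_name!r} has no atom {atom_name!r}"
    return None


print(reference_error("PLA", "C2", LIGAND_ATOMS))  # loadable
print(reference_error("PLA", "C9", LIGAND_ATOMS))  # unknown atom
print(reference_error("XYZ", "C2", LIGAND_ATOMS))  # unknown ligand
```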

5 Chemistry and PDB
- Eliminate chemical inconsistencies from new PDB entries
- Structure and derived properties of a ligand apply automatically to residues and bound molecules that reference it
- The basic structure is carefully determined during curation, and a rich set of derived attributes is calculated for each ligand
- Graph isomorphism is applied to check the consistency of the PDB, taking stereo-configuration into account
- Old legacy PDB entries are chemically “corrected” when loaded into the MSD database
- In thousands of cases errors are identified and corrected, most of the time involving inconsistent naming or a different stereo-configuration
- Exchanged in cooperation with RCSB and the wwPDB
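The idea of a stereo-aware graph isomorphism check can be illustrated on a toy scale. The sketch below is an assumption-laden simplification, not the MSD implementation: it treats the R/S label as part of each atom's label and tries every atom permutation, which is only workable for very small molecules. The function name and the input layout are hypothetical.

```python
from collections import Counter
from itertools import permutations
from typing import Dict, List, Optional, Tuple

Label = Tuple[str, str]            # (element, stereo label or "")
Bonds = Dict[Tuple[int, int], int]  # (atom index, atom index) -> bond order


def find_mapping(labels_a: List[Label], bonds_a: Bonds,
                 labels_b: List[Label], bonds_b: Bonds) -> Optional[Dict[int, int]]:
    """Return an atom mapping from A to B that preserves labels and bonds, or None."""
    if Counter(labels_a) != Counter(labels_b) or len(bonds_a) != len(bonds_b):
        return None
    # Store B's bonds without direction so (i, j) and (j, i) are the same bond.
    undirected_b = {frozenset(pair): order for pair, order in bonds_b.items()}
    n = len(labels_a)
    for perm in permutations(range(n)):
        if any(labels_a[i] != labels_b[perm[i]] for i in range(n)):
            continue
        if all(undirected_b.get(frozenset((perm[i], perm[j]))) == order
               for (i, j), order in bonds_a.items()):
            return dict(enumerate(perm))
    return None


# Heavy atoms of CC(N)C(O)=S written in two different atom orders.
labels_a = [("C", ""), ("C", "S"), ("N", ""), ("C", ""), ("O", ""), ("S", "")]
bonds_a = {(0, 1): 1, (1, 2): 1, (1, 3): 1, (3, 4): 1, (3, 5): 2}
labels_b = [("N", ""), ("C", "S"), ("C", ""), ("S", ""), ("O", ""), ("C", "")]
bonds_b = {(0, 1): 1, (1, 2): 1, (2, 3): 2, (2, 4): 1, (1, 5): 1}

print(find_mapping(labels_a, bonds_a, labels_b, bonds_b))

# Same connectivity, opposite configuration at the stereo centre: no match.
labels_r = [("N", ""), ("C", "R"), ("C", ""), ("S", ""), ("O", ""), ("C", "")]
print(find_mapping(labels_a, bonds_a, labels_r, bonds_b))
```

A mapping of this kind is what allows inconsistent atom naming to be corrected, while a failed match with identical connectivity points to a different stereo-configuration.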

6 More than just the PDB codes
- All ligands are modelled as separate inter-related ligands and the appropriate one is referenced
- No distinction is made in the PDB between ribo- and deoxyribonucleotides (all are identified by the same residue names, i.e. A, C, G, T, U, I)
- Modified nucleic acids are given as +A etc., regardless of the modification
- No distinction between different topological variants (12 different variants can be found for HIS in the PDB)

7 Derived information
External scientific software (CACTVS, VEGA, CORINA, ACD-labs, CCP4, OELIB), together with in-house development, has been used to derive:
- Stereochemistry (R/S – E/Z)
    Ligand  C4'  C3'  C1'
    DCM     S    R    S
    DCF     R    S    R
- SMILES and detailed GIF images
- Systematic IUPAC names
Example: THIOALANINE (ALT), SMILES CC(N)C(O)=S, systematic name (2S)-2-aminopropanethioic O-acid

8 Derived information
Fingerprints:
- A bit string in hexadecimal form that indicates the presence or absence of segments from predefined lists
- Useful for fast search and classification
- Different libraries of predefined lists can be set
- Currently calculated for the CACTVS library (500 segments)
[Figure: Molecule, Segments, BitString, Fingerprint: 2A]
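How a segment list turns into a hexadecimal bit string can be sketched as follows. This is a toy illustration under stated assumptions: the eight segment identifiers are placeholders (the CACTVS library mentioned above has 500), detecting which segments occur in a molecule is taken as given, and the bit order (first segment in the lowest bit) is an assumed convention.

```python
from typing import Sequence, Set


def fingerprint_hex(present: Set[str], library: Sequence[str]) -> str:
    """Set one bit per library segment found in the molecule; return it as hex."""
    bits = 0
    for position, segment in enumerate(library):
        if segment in present:
            bits |= 1 << position
    width = (len(library) + 3) // 4  # hex digits needed for the whole library
    return format(bits, f"0{width}X")


# Placeholder library of eight segments and a molecule containing three of them.
library = [f"seg{i}" for i in range(8)]
present = {"seg1", "seg3", "seg5"}

print(fingerprint_hex(present, library))
```

Because each molecule is reduced to a fixed-width integer, two fingerprints can be compared with a few bitwise operations, which is what makes them useful for fast search and classification.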

9

10 Search options
- By ligand code
- By ligand name or synonym
- By formula or formula range
- By non-stereo substructure
- By non-stereo superstructure
- By exact stereo or non-stereo structure
- By fingerprint similarity
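A search by fingerprint similarity could look like the sketch below. The slides do not say which similarity measure MSDchem uses; the Tanimoto coefficient is swapped in here as a common choice for bit-string fingerprints. The ligand codes, fingerprint values and threshold are placeholders, not real dictionary data.

```python
from typing import Dict, List, Tuple


def tanimoto(hex_a: str, hex_b: str) -> float:
    """Shared set bits divided by all set bits of two hexadecimal fingerprints."""
    a, b = int(hex_a, 16), int(hex_b, 16)
    union = bin(a | b).count("1")
    if union == 0:
        return 1.0  # two empty fingerprints are treated as identical
    return bin(a & b).count("1") / union


def similar_ligands(query: str, fingerprints: Dict[str, str],
                    threshold: float) -> List[Tuple[str, float]]:
    """Return (code, score) pairs at or above the threshold, best match first."""
    scored = [(code, tanimoto(query, fp)) for code, fp in fingerprints.items()]
    hits = [(code, score) for code, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Placeholder codes and fingerprints for illustration only.
fingerprints = {"AAA": "2A", "BBB": "2E", "CCC": "15"}
print(similar_ligands("2A", fingerprints, threshold=0.5))
```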

11

12

13

14 Results of ‘is superstructure of’
Click on EAA

15 EAA details 3-chloro-phenol

16 Results Viewers

17 PDB residue KWT
<chemComp>
  <code>KWT</code>
  <name>(1S,6BR,9AS,11R,11BR)-9A,11B-DIMETHYL-1-[(METHYLOXY)METHYL]-3,6,9-TRIOXO-1,6,6B,7,8,9,9A,10,11,11B-DECAHYDRO-3H-FURO[4,3,2-DE]INDENO[4,5-H][2]BENZOPYRAN-11-YL ACETATE</name>
  <nAtomsAll>55</nAtomsAll>
  <nAtomsNh>31</nAtomsNh>
  <overallCharge>0</overallCharge>
  <systematicName>(1S,6bR,9aS,11R,11bR)-1-(methoxymethyl)-9a,11b-dimethyl-3,6,9-trioxo-1,6,6b,7,8,9,9a,10,11,11b-decahydro-3H-furo[4,3,2-de]indeno[4,5-h]isochromen-11-yl acetate</systematicName>
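A record like the KWT one above can be read with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` on a shortened copy of the record: the two name elements are left out for brevity, and a closing `</chemComp>` tag is added because the slide excerpt stops before it. Reading `nAtomsNh` as the non-hydrogen atom count is an assumption based on the tag name, and `parse_chem_comp` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Union

# Shortened copy of the slide's record; the closing tag is added here.
SNIPPET = """<chemComp>
  <code>KWT</code>
  <nAtomsAll>55</nAtomsAll>
  <nAtomsNh>31</nAtomsNh>
  <overallCharge>0</overallCharge>
</chemComp>"""


def parse_chem_comp(xml_text: str) -> Dict[str, Union[str, int]]:
    """Extract the code, atom counts and charge from one chemComp element."""
    root = ET.fromstring(xml_text)
    n_all = int(root.findtext("nAtomsAll"))
    n_heavy = int(root.findtext("nAtomsNh"))  # assumed: non-hydrogen atoms
    return {
        "code": root.findtext("code"),
        "n_atoms_all": n_all,
        "n_atoms_heavy": n_heavy,
        "n_hydrogens": n_all - n_heavy,
        "overall_charge": int(root.findtext("overallCharge")),
    }


print(parse_chem_comp(SNIPPET))
```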

18 Future targets
- Identify and model protein inhibitors as ligands
- Pre-classify functional groups for ligands and ligand atoms based on substructure fragments
- Optimise and boost the performance of substructure searches
- Enhance visualisation and integration with other MSD tools

