Bringing Structure to Biology: Small Molecules and the PDBe


1 Bringing Structure to Biology: Small Molecules and the PDBe

2 PDBe overview
PDB is a core molecular database at EMBL-EBI
PDBe is a founding partner of the Worldwide Protein Data Bank (wwPDB)
Founder of the Electron Microscopy Data Bank (EMDB)
Mission: Bringing Structure to Biology
Major activities:
Deposition and annotation site for structural data on biomacromolecules (X-ray, NMR, EM)
Integrated resource of high-quality macromolecular structural data and related information
Provide tools and services for accessing, exploiting and disseminating structural data to the wider biomedical community
The Worldwide Protein Data Bank (wwPDB) consists of organizations that act as deposition, data processing and distribution centers for PDB data. Members are: RCSB PDB (USA), PDBe (Europe), PDBj (Japan) and BMRB (USA; NMR structures). The wwPDB's mission is to maintain a single PDB archive of macromolecular structural data that is freely and publicly available to the global community. We are actively involved in an effort to integrate data from major biomedical resources at EMBL-EBI and across the world. This "Structure Integration with Function, Taxonomy and Sequence" (SIFTS) initiative integrates data from a number of bioinformatics resources and is used by major global sequence, structure and protein-family resources. PDBe specialises in providing tools and services that exploit the wealth of structural knowledge contained within the PDB archive.

3 wwPDB partners
Collaborate on “data in”:
Policy issues
Weekly releases
Chemical component database
Deposition and annotation procedures
Archive quality and remediation
Journal interactions
Validation standards and format specifications
Friendly competition on “data out”:
Serving PDB data with added-value
PDB-based services…
Other services, resources and activities

4 PDB Depositions
10,000th PDBe annotated structure - April 2011 (2yf6)
Structure-based drug design seeks to identify and optimize interactions between ligands and their host molecules, typically proteins, given their three-dimensional structures. This optimization process requires knowledge about interaction geometries and approximate affinity contributions of attractive interactions, which can be gleaned from crystal structures and associated affinity data.

5 Chemical Component Dictionary
Compounds in the PDB:
Small molecules bound to macromolecules
Individual components of macromolecules
wwPDB maintains dictionary descriptions for all unique chemical components:
Name, synonyms, formula, SMILES, …
Atoms and bonds
Ideal and representative coordinates
Each new component is assigned a unique 3-letter identifier
Release coincides with the release of the parent PDB entry
The Chemical Component Dictionary is an external reference file describing all residue and small molecule components found in PDB entries. This dictionary contains detailed chemical descriptions for standard and modified amino acids/nucleotides, small molecule ligands, and solvent molecules. over ligands. Each chemical definition includes descriptions of chemical properties such as stereochemical assignments, chemical descriptors (SMILES and InChI), systematic chemical names, and idealized coordinates (generated using Molecular Networks' Corina and, if there are issues, OpenEye's OMEGA). The dictionary is organized by the 3-character alphanumeric code that PDB assigns to each chemical component. New chemical component definitions appear in the dictionary as the entries in which they are observed are released in the PDB archive; consequently, the dictionary is updated with each weekly PDB release. The dictionary is regularly reviewed and remediated. This dictionary is part of the core "reference" information of the PDBe relational database and is consistently referenced by all macromolecular structures for all bound molecules as well as standard and modified amino acids. Since every residue and every atom in the PDBe database references a ligand and an atom in this dictionary, this is the repository that defines the link between proteins and chemistry.
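To make the shape of a dictionary entry more concrete, here is a minimal sketch of how one component could be represented in memory. The class `ChemicalComponent`, its field names and the `validate` method are hypothetical and chosen for illustration; they are not the dictionary's actual file format or schema. The check that every bond refers to a listed atom mirrors the idea that atoms and bonds are described together for each component.

```python
from dataclasses import dataclass, field


@dataclass
class ChemicalComponent:
    """Hypothetical in-memory view of one dictionary entry (not the real schema)."""
    code: str                                   # up to 3 alphanumeric characters
    name: str
    formula: str
    smiles: str
    atoms: list = field(default_factory=list)   # atom names
    bonds: list = field(default_factory=list)   # (atom_1, atom_2, bond_order)

    def validate(self):
        """Check the identifier shape and that bonds only use listed atoms."""
        if not (1 <= len(self.code) <= 3 and self.code.isalnum()):
            raise ValueError(f"bad component code: {self.code!r}")
        known = set(self.atoms)
        for atom_1, atom_2, _order in self.bonds:
            if atom_1 not in known or atom_2 not in known:
                raise ValueError(f"bond uses unknown atom: {atom_1}-{atom_2}")
        return True


# Water, the simplest solvent component, as an example entry.
water = ChemicalComponent(
    code="HOH",
    name="WATER",
    formula="H2 O",
    smiles="O",
    atoms=["O", "H1", "H2"],
    bonds=[("O", "H1", 1), ("O", "H2", 1)],
)
water.validate()
```

Keeping the bonds as references to atom names, rather than as free text, is what lets a consumer cross-check an observed residue against its reference definition.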

6 Molecule search options
Compound name
Ligand 3-letter code
SMILES
Formula (exact or range), e.g. C6-10 N4 O2 S0
Chemical substructure
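As an illustration of the formula range option, the sketch below parses a query such as `C6-10 N4 O2 S0` into per-element count ranges and tests a candidate composition against it. The parsing rules used here (space-separated tokens, `min-max` for a range, a bare number for an exact count, `S0` meaning no sulfur, unlisted elements left unconstrained) are assumptions made for this example, not the documented PDBe query syntax. `parse_formula_query` and `matches` are hypothetical helper names.

```python
import re

# One token: element symbol, a count, and an optional upper bound (e.g. C6-10).
TOKEN = re.compile(r"^([A-Z][a-z]?)(\d+)(?:-(\d+))?$")


def parse_formula_query(query):
    """Turn 'C6-10 N4 O2 S0' into {element: (min_count, max_count)}."""
    ranges = {}
    for token in query.split():
        found = TOKEN.match(token)
        if found is None:
            raise ValueError(f"unrecognised token: {token!r}")
        element, low, high = found.group(1), int(found.group(2)), found.group(3)
        high = int(high) if high is not None else low
        if low > high:
            raise ValueError(f"empty range in token: {token!r}")
        ranges[element] = (low, high)
    return ranges


def matches(counts, ranges):
    """True if every queried element count falls inside its range.

    Elements absent from `counts` are treated as zero; elements absent
    from `ranges` are not constrained (an assumption of this sketch).
    """
    return all(low <= counts.get(element, 0) <= high
               for element, (low, high) in ranges.items())


query = parse_formula_query("C6-10 N4 O2 S0")
caffeine = {"C": 8, "H": 10, "N": 4, "O": 2}
print(matches(caffeine, query))
```

Under these assumptions caffeine (C8 H10 N4 O2) satisfies the example query, while any composition containing sulfur does not.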

7 PDBe Home Page http://www.ebi.ac.uk/pdbe

8 Ligands and the PDBe
Open chemistry sketchpad

9 Ligands and the PDBe

10 Ligands and the PDBe
Meaning of icons: there is a published paper for the entry; the source organism was human (or a fungus, etc.); the sample of the biomacromolecule was obtained by expression and purification; the data are X-ray data; the entry contains a protein (not DNA) and a small molecule.

11 2D Ligand Interaction Diagrams
Interaction diagrams for any given PDB entry
Interactive control of distance criteria
Diagram customisation
Image export (png, jpg, eps…)
Generates schematic 2D diagrams of protein-ligand binding site interactions for any given PDB file.
Example: S-benzyl-glutathione (GSB), Human Glyoxalase inhibitor (1guh)

12 PDBeXpress: rapid access to protein-ligand interaction statistics
Understand and assess binding site interactions
Provide chemists with quick answers to common questions without the need to construct complex search queries
What residues interact?
Which enzymes interact?
What binds here?
PDBeXpress is a small collection of tools that extract and present information and statistics on protein-ligand binding site interactions, so it can help you understand and assess the interactions that are important to molecular recognition. The idea behind PDBeXpress was to provide users with quick answers to common questions without them having to construct complex search queries or learn how to use specialist interfaces. The service allows you to answer three questions. First, for any ligand in the PDB, which residues interact with it? Second, for any ligand in the PDB, which enzymes does it interact with? There are 30,000 enzyme structures in the PDB. Finally, the problem can be approached from the opposite direction: what binds here? This tool enables you to identify ligands in the PDB that interact with a given set of residues (a protein environment).

13 What residues interact?
Example: RTL - Retinol
Enter either the PDB three-letter ligand code or the name of the ligand, e.g. RTL (retinol), to ask which residues interact with it. The tool will retrieve the residues with which this ligand interacts, as observed in PDB entries.

14 What residues interact?
Example: RTL - Retinol
The results are plotted on an interactive graph that shows the frequency of interaction with each amino acid residue. Frequency is plotted as a percentage of the total interactions that the ligand makes. Clicking on a bar (e.g. LEU) gives details of the interactions, links to the PDB entries containing those interactions and a link to PDBeMotif for advanced analysis options (more on this later). The results can be further refined by filtering on a Pfam or CATH domain.
Download statistical data; print or export as PDF or image
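The "percentage of total interactions" plotted for each residue can be sketched as follows. This is a minimal illustration, assuming the input is simply a list of residue names, one per observed ligand contact; the function name `interaction_frequencies` and the sample data are hypothetical and do not come from PDBeXpress.

```python
from collections import Counter


def interaction_frequencies(residue_names):
    """Return {residue: percent of all observed interactions}, most frequent first."""
    counts = Counter(residue_names)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {residue: 100.0 * n / total for residue, n in counts.most_common()}


# Made-up contact list for one ligand, pooled over several entries.
observed = ["LEU", "PHE", "LEU", "ALA", "LEU", "PHE", "LEU", "VAL"]
for residue, percent in interaction_frequencies(observed).items():
    print(f"{residue}: {percent:.1f}%")
```

Normalising by the ligand's own total, rather than by the number of PDB entries, is what makes bars comparable between a rarely observed ligand and a very common one.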

15 Which enzymes interact?
Example: MAN – Mannose
Enter the PDB three-letter ligand code or the ligand name. For any ligand in the PDB, this tool will retrieve the enzymes with which this ligand interacts, as observed in PDB entries.

16 Which enzymes interact?
Example: MAN – Mannose
The results are plotted on an interactive graph, which can be used to drill down to deeper levels of EC, or to view the PDB entries in which the interactions occur.
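Drilling down through EC levels amounts to grouping enzymes by a progressively longer prefix of their four-part EC number. The sketch below shows that grouping under simple assumptions: each interacting entry carries one complete EC number as a dotted string, and the list of EC numbers is illustrative input rather than real query output. `ec_prefix` and `drill_down` are hypothetical names.

```python
from collections import Counter


def ec_prefix(ec_number, depth):
    """Return the first `depth` levels of a dotted EC number (depth 1 to 4)."""
    if not 1 <= depth <= 4:
        raise ValueError("EC numbers have four levels")
    return ".".join(ec_number.split(".")[:depth])


def drill_down(ec_numbers, depth, parent=""):
    """Count entries per EC class at `depth`, optionally within a parent class."""
    counts = Counter()
    for ec in ec_numbers:
        # Keep only entries at or below the selected parent class.
        if parent and not (ec == parent or ec.startswith(parent + ".")):
            continue
        counts[ec_prefix(ec, depth)] += 1
    return dict(counts)


# Illustrative EC numbers for entries in which a ligand is observed.
entries = ["3.2.1.24", "3.2.1.113", "2.4.1.232", "3.2.1.24", "5.3.1.8"]
print(drill_down(entries, depth=1))
print(drill_down(entries, depth=4, parent="3.2.1"))
```

Matching on `parent + "."` rather than on the bare prefix avoids treating class 3 as a parent of a class whose first level merely starts with the digit 3.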

17 What binds here?
Search for ligands that interact with a given set of residues (e.g. "HIS ASP SER")
Can specify a partial or exact binding environment
The set of residues might represent a sub-pocket, or a conserved motif that you believe is important to binding. Simply specify the residues by clicking in the table.
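The difference between a partial and an exact binding environment can be read as a subset test versus an equality test on residue sets. The sketch below works under that assumption; treating the environment as a set of residue types (ignoring how many of each are present) is a simplification, and the function names, ligand codes and observations are all made up for illustration.

```python
from collections import Counter


def binds_here(query, environment, exact=False):
    """Partial: every queried residue type is present. Exact: the sets are equal."""
    query_set, environment_set = set(query), set(environment)
    if exact:
        return environment_set == query_set
    return query_set <= environment_set


def ligands_for_environment(query, observations, exact=False):
    """Count how often each ligand is seen in a matching environment.

    `observations` is an iterable of (ligand_code, residue_names) pairs.
    """
    hits = Counter(ligand for ligand, residues in observations
                   if binds_here(query, residues, exact))
    return dict(hits)


# Hypothetical observations: (ligand code, residue types lining its site).
observations = [
    ("LG1", ["HIS", "ASP", "SER"]),
    ("LG1", ["HIS", "ASP", "SER", "GLY"]),
    ("LG2", ["HIS", "ASP", "SER", "LEU"]),
    ("LG3", ["HIS", "GLU"]),
]
query = ["HIS", "ASP", "SER"]
print(ligands_for_environment(query, observations))              # partial match
print(ligands_for_environment(query, observations, exact=True))  # exact match
```

A partial search suits a suspected sub-pocket or conserved motif, since extra surrounding residues do not exclude a hit, whereas an exact search only returns sites made up of precisely the chosen residue types.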

18 What binds here?
The results are plotted on an interactive graph showing the ligands that bind to the given environment and how frequently. This graph links to PDB entries containing the interactions, as well as advanced analysis options within PDBeMotif.

19 PDBeMotif: powerful and flexible searching
PDBeXpress modules driven by PDBeMotif
PDBeMotif allows you to combine protein sequence, chemical structure and 3D data in a single search
The Express modules are built on top of Motif, which is a powerful and flexible search engine that allows you to combine protein sequence, chemical structure and 3D data in a single search. In addition to analysing PDB files, you can upload your own file for analysis.

20 PDBeMotif: powerful and flexible searching
Construct queries based on:
ligands and their 3D environment
secondary structure elements and small 3D motifs
protein φ/ψ angle sequences (a sequential representation of the protein geometry)
Results can be analysed against UniProt, CATH, Pfam or EC
Find which ligands bind within a given environment and the frequencies of their interactions, or compare the binding environments of two different ligands. Identify common binding site features, or local structural similarities across different protein families or otherwise unrelated structures.

21 How do sulphones and sulphonamides prefer to interact?
In the PDB, 39% of the ligand sulfonyl groups are found to form a hydrogen bond with either a protein donor or a structural water molecule, while 74% are located in or close to van der Waals distance (Å) to an aliphatic group. Notably, of the sulfonyl groups situated in a hydrophobic environment in the PDB, only 36% are found to interact simultaneously as a hydrogen bond acceptor, but 79% of the hydrogen-bonded sulfonyl groups are found to interact simultaneously with a hydrophobic group. These findings clearly indicate a dual character of the weakly polar sulfonyl groups as a hydrogen bond acceptor and as a hydrophobic group.
Figure: closest interactions (distances in Å) formed by the sulfonyl oxygen atoms of a cathepsin S ligand within the active site of human cathepsin S (PDB code 2fra). Legend: Electrostatic, H-bonds, VdW bonds.
M. Stahl, A Medicinal Chemist’s Guide to Molecular Interactions, J. Med. Chem. 2010, 53, 5061–

22 Ligands need careful validation
CCDC analysis of ligand geometries (using Relibase+/Mogul/EDS)
Around 20% of recently determined structures have geometric errors that could potentially cause a misleading interpretation of the binding interactions
Categories: Wrong, Unusual/Strained, Correct
There is a wealth of information about ligand interactions that can be exploited, particularly in a design context. However, this assumes that the data are accurate and valid. An analysis put together by colleagues at the CCDC has highlighted some problems in PDB ligand quality. This semi-manual analysis shows that, over three time periods, the fraction of good ligands in the PDB is not rapidly increasing. Wrong ligands have a serious error in density, geometry or close contacts. Dubious ones are possibly strained ligands. OK ligands have only minor errors in torsions or rings. One reason ligands need more attention is that their refinement is tricky and error-prone: reliable dictionaries are harder to come by, electron density in MX is often not enough to identify small molecules unambiguously, and problems in ligand quality have little impact on overall quality indicators such as the R factor. We at the PDB need to think about validating ligand quality and implementing processes to prevent bad ligands being deposited.
Liebeschuetz, J.W., Hennemann, J. The good, the bad and the twisted: A survey of ligand geometry in protein crystal structures. J. Comput. Aid. Mol. Des., 26 (2012)

23 The solution…
Mogul: a knowledge-based library of molecular geometry derived from the Cambridge Structural Database (CSD)
Enables rapid validation of the complete geometry of a given query structure and identification of unusual features
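The general idea of knowledge-based geometry validation can be illustrated with a small sketch: compare an observed value (here a bond length) with a reference distribution of the same feature and flag it when it lies far from the mean. This is not Mogul's algorithm or API. The z-score test, the threshold of 2.0, the function names and the reference values are all assumptions made for illustration.

```python
from statistics import mean, stdev


def z_score(observed, reference_values):
    """Number of sample standard deviations between `observed` and the reference mean."""
    spread = stdev(reference_values)
    if spread == 0:
        raise ValueError("reference values show no spread")
    return (observed - mean(reference_values)) / spread


def is_unusual(observed, reference_values, threshold=2.0):
    """Flag a value whose |z-score| exceeds the (assumed) threshold."""
    return abs(z_score(observed, reference_values)) > threshold


# Made-up reference bond lengths (in Å) for one bond type.
reference = [1.52, 1.53, 1.54, 1.53, 1.52, 1.54, 1.53]
print(is_unusual(1.53, reference))  # value at the centre of the distribution
print(is_unusual(1.60, reference))  # value far outside the reference spread
```

A real validation run would repeat this kind of comparison for every bond, angle, torsion and ring in the ligand, which is what makes a precomputed library of reference distributions valuable.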

24

25 MoU with CCDC
wwPDB/CCDC Memorandum of Understanding
wwPDB gets to use Mogul for validation of all current and future compounds in the PDB
wwPDB gets to incorporate and redistribute CSD coordinates for all current and future ligand compounds in the PDB
wwPDB gets to use Mogul and CSD coordinates to derive dictionaries for all current and future compounds in the PDB
Mogul to be implemented as part of a new validation pipeline for PDB structures. This validation service to be made available via an independent server…

26 Prevention is the best cure
Thanks to collaboration with CCDC:
We can add CSD coordinates for all existing small molecules in the PDB (and variants, e.g. D-amino acids) that also occur in the CSD
We can use these coordinates and Mogul to derive refinement dictionaries
Grade (Global Phasing; uses Mogul and RM1)
Will improve quality and consistency of the archive
We can provide reasonable starting coordinates and refinement dictionaries for all existing compounds in the PDB
We can add CSD coordinates at annotation time for new small molecules that also occur in the CSD
wwPDB gets to use Mogul and CSD coordinates to derive dictionaries for all current and future compounds in the PDB

27 Future of the PDB?
At present PDB is a historic archive
We have to accept and distribute everything
“Archive”, i.e., what was described in the literature
Essentially provider-centric
We capture X-ray detector type but not ligand function…
Organised by entry rather than molecule/complex/…
Shifting user communities/demands
We must serve the consumers of structural data (non-experts)
More non-expert users than experts
Don’t think in terms of PDB entry codes
Can’t tell a good from a bad model
We have to understand what they know, what they want to find and what they want to do
New ways to access structural information
New ways to handle structural information
Provide current-best-practice models
Integration with other databases

28 PDBe Team February 2012

29 Funding

30 Thank you!
Tutorials… Contact us… Follow us… www.pdbe.org

