Increasing the Value of Crystallographic Databases: derived knowledge bases, knowledge-based applications programs, and data mining tools for protein-ligand complexes.

Presentation transcript:

Mogul: knowledge base of molecular geometry information taken from the CSD. Provides bond length, valence angle and torsion angle distributions. Aim: click on a molecular parameter of interest and get the observed distribution with no intervening steps.

Mogul - Search Setup: the user loads a molecule, then specifies a bond length, bond angle or torsion angle of interest.

Mogul - Results Substructure

Mogul - Search Algorithm: substructures are stored in a hierarchical tree. For a fragment with atoms A, B, C and D, successive levels are keyed on: properties of B and C; properties of the A-B and C-D bonds; properties of atoms bound to B and C.
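The hierarchical lookup on this slide can be sketched as a nested dictionary with one level per set of properties. This is only an illustration under assumed names: the key encodings (element symbols, bond-order strings, neighbour counts), the `insert` and `lookup` helpers and the sample values are all hypothetical, and none of it is taken from Mogul's actual implementation.

```python
from collections import defaultdict

def make_tree():
    # Three key levels (all hypothetical encodings):
    # central atoms B,C -> A-B and C-D bonds -> atoms bound to B and C.
    return defaultdict(lambda: defaultdict(lambda: defaultdict(list)))

def insert(tree, central, bonds, neighbours, value):
    """Store one observed geometric value under its three-level key."""
    tree[central][bonds][neighbours].append(value)

def lookup(tree, central, bonds=None, neighbours=None):
    """Collect values; leaving a deeper key as None returns everything below it."""
    hits = []
    for b_key, by_neighbours in tree.get(central, {}).items():
        if bonds is not None and b_key != bonds:
            continue
        for n_key, values in by_neighbours.items():
            if neighbours is not None and n_key != neighbours:
                continue
            hits.extend(values)
    return hits

# Made-up torsion values, purely for illustration.
tree = make_tree()
insert(tree, ("C", "C"), ("single", "single"), (3, 3), 60.2)
insert(tree, ("C", "C"), ("single", "single"), (3, 3), 178.9)
insert(tree, ("C", "C"), ("single", "double"), (3, 2), 0.5)
insert(tree, ("C", "N"), ("single", "single"), (3, 2), 120.0)

exact = lookup(tree, ("C", "C"), ("single", "single"), (3, 3))
broad = lookup(tree, ("C", "C"))
```

Stopping the descent at a shallower level (as in `broad`) returns a larger, less specific set of observations, which is one plausible way a tree like this could trade specificity for more hits.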

Mogul - Getting More Hits: allow certain atoms to be more general (generification rules).

Mogul - Generic Search Results: substructures are sorted by 2D similarity to the original query.
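Ranking generic hits by 2D similarity might look like the sketch below. The slides do not say which similarity measure Mogul uses; the Tanimoto coefficient over sets of atom-type labels is an assumption chosen for illustration, and the fragment names and labels are invented.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two feature sets (1.0 means identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def rank_by_similarity(query_features, hits):
    """Sort (name, features) pairs so the most query-like hit comes first."""
    return sorted(hits, key=lambda h: tanimoto(query_features, h[1]), reverse=True)

# Hypothetical atom-type labels standing in for a real 2D descriptor.
query = {"C.ar", "C.3", "N.am", "O.2"}
hits = [
    ("frag_a", {"C.ar", "C.3", "N.am", "S.2"}),
    ("frag_b", {"C.ar", "C.3", "N.am", "O.2"}),
    ("frag_c", {"C.3", "O.3"}),
]
ranked = [name for name, _ in rank_by_similarity(query, hits)]
```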

IsoStar and SuperStar. IsoStar: knowledge base of information about intermolecular interactions. SuperStar: program for predicting binding points in an enzyme active site. SuperStar predictions are based solely on IsoStar data.

IsoStar Scatterplots

CSD vs. PDB scatterplots: similarity index distribution for 72 comparisons.

IsoStar Density Surfaces

Scaling of IsoStar Surfaces: the density at grid point i is converted to a propensity relative to the average density, where the average density is the density of contacts expected by random chance.
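A minimal sketch of this scaling, assuming that a propensity is the density at a grid point divided by the average density, and that the average is taken as the mean over all grid points. Both are assumptions about the normalisation rather than formulas taken from the slide, and the density values are made up.

```python
def propensities(densities):
    """Divide each grid-point density by the mean density over the grid.

    A result above 1.0 means contacts are more frequent at that point
    than the (assumed) random-chance average; below 1.0, less frequent.
    """
    if not densities:
        return []
    average = sum(densities) / len(densities)
    if average == 0:
        return [0.0 for _ in densities]
    return [d / average for d in densities]

grid = [0.0, 1.0, 2.0, 5.0]   # hypothetical contact densities
props = propensities(grid)
```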

SuperStar: calculates binding positions for specific probe atoms in protein active sites. Steps: identify functional groups in the binding site; look up the relevant IsoStar scatterplots and overlay them on the functional groups; contour, combining maps by taking products.
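The "combining by taking products" step can be illustrated as a point-by-point multiplication of propensity maps defined on the same grid. The flat-list representation and the `combine_maps` name are assumptions made for this sketch; the values are invented.

```python
def combine_maps(maps):
    """Multiply propensities point by point across several maps."""
    if not maps:
        return []
    n = len(maps[0])
    if any(len(m) != n for m in maps):
        raise ValueError("all maps must cover the same grid points")
    combined = [1.0] * n
    for m in maps:
        combined = [c * p for c, p in zip(combined, m)]
    return combined

# Two hypothetical maps from neighbouring functional groups.
map_a = [1.0, 2.0, 0.5, 4.0]
map_b = [1.0, 3.0, 2.0, 0.5]
combined = combine_maps([map_a, map_b])
```

With a product, a point that is favourable in one map but unfavourable in another (the last two points here) is pulled back towards or below 1.0, while points favoured by both maps are reinforced.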

SuperStar - Example Map OH

SuperStar Features: cavity detection; surface or pharmacophore point display; metal coordination; hyperlinking to IsoStar scatterplots; choice of CSD- or PDB-based maps; Gaussian fits.

SuperStar Validation: 265 PDB complexes. Generate four maps (Me, C=O, NH, OH) and see whether the maps discriminate correctly, e.g. does Me have the highest propensity where a ligand Me group is observed? Compute percentage success rate: CSD 74%; PDB 75%; Gaussian CSD %. PDB maps are fuzzier, with fewer probes possible. Gaussian is 4-5 times faster.
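The success-rate calculation described on this slide can be sketched as follows. The data layout (one dictionary of probe propensities per observed ligand group) and the numbers are invented for illustration and do not reproduce the validation set.

```python
def success_rate(cases):
    """Percentage of cases where the observed group's own map scores highest.

    cases: list of (observed_probe, {probe_name: propensity}) pairs.
    """
    if not cases:
        return 0.0
    correct = 0
    for observed, scores in cases:
        best = max(scores, key=scores.get)
        if best == observed:
            correct += 1
    return 100.0 * correct / len(cases)

# Made-up propensities at four hypothetical ligand group positions.
cases = [
    ("Me",  {"Me": 3.1, "C=O": 1.2, "NH": 0.8, "OH": 0.9}),
    ("OH",  {"Me": 0.7, "C=O": 1.1, "NH": 2.9, "OH": 2.6}),
    ("NH",  {"Me": 1.5, "C=O": 0.4, "NH": 1.2, "OH": 1.0}),
    ("C=O", {"Me": 0.9, "C=O": 2.2, "NH": 0.6, "OH": 1.3}),
]
rate = success_rate(cases)
```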

Relibase+: protein-ligand database system. Based on original software developed by Manfred Hendlich and colleagues at Merck and Marburg University. Enables searching of the PDB and of in-house proprietary databases.

Some Relibase+ Options: text searching; sequence searching; 2D substructure and similarity searching; 3D substructure searching; logical combination of hit lists; searching for intermolecular interactions; auto-superposition of similar binding sites; scripting facility based on Python.
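As an illustration of what "logical combination of hit lists" means, the sketch below models hit lists as plain Python sets of entry identifiers. This is not the Relibase+ interface: the identifiers and variable names are invented, and Relibase+ has its own hit-list objects.

```python
# Hypothetical hit lists from two separate searches.
substructure_hits = {"entry_1", "entry_2", "entry_3"}
interaction_hits = {"entry_2", "entry_3", "entry_4"}

both = substructure_hits & interaction_hits        # AND: in both lists
either = substructure_hits | interaction_hits      # OR: in at least one list
only_substructure = substructure_hits - interaction_hits  # NOT: first but not second
```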

Analysis of 3D Queries: Distance Distribution; Torsion Distribution; Benzamidine-Carboxylate Interactions.

Binding Site Superposition

Example Python Script:

    # Find all benzamidines
    # and check contacts to ASP under 3Å
    relibase.load('dbase1')
    ba = relibase.Hitlist({'smiles': 'c1ccccc1C(=N)N'})
    new = relibase.Hitlist()
    for ligand in ba:
        for chain in ligand.contacts():
            for residue in chain.residues():
                if residue.name() == 'ASP':
                    ligatoms = ligand.atoms()
                    resatoms = residue.atoms()
                    # Shortest ligand-residue atom distance
                    d = mindist(ligatoms, resatoms)
                    if d < 3.0:
                        new.append(ligand)
    new.saveas('contact')
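The script above calls `mindist` without showing it. A stand-alone sketch of what such a helper might compute is given below, assuming atoms are represented as plain (x, y, z) coordinate tuples in Å. The real Relibase+ helper and its atom objects are not shown on the slides, so this is an illustrative stand-in only, with made-up coordinates.

```python
import math

def mindist(atoms_a, atoms_b):
    """Smallest Euclidean distance between any atom in a and any atom in b.

    Atoms are (x, y, z) tuples here; this representation is an assumption.
    """
    if not atoms_a or not atoms_b:
        raise ValueError("both atom lists must be non-empty")
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)

# Hypothetical coordinates for a two-atom ligand and a two-atom residue.
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
residue = [(4.0, 0.0, 0.0), (1.5, 2.0, 0.0)]
d = mindist(ligand, residue)
in_contact = d < 3.0   # same 3 Å cutoff as in the script
```

The all-pairs loop is quadratic in the number of atoms, which is usually acceptable for a single ligand against a single residue.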

Acknowledgements: Manfred Hendlich, Gerhard Klebe, Ingo Dramburg, Andreas Bergner, Ian Bruno, Jason Cole, Paul Edgington, Magnus Kessler, Jie Luo, Clare Macrae, Patrick McCabe, Willem Nissink, Jon Pearson, Scott Rowland, Barry Smith, Marcel Verdonk.