X-ray Validation Package: Present status
Swanand Gore
PDBe D&A meeting: 21-Oct-2010



VTF recommendations
Model-based indicators
– Covalent geometry (Engh & Huber) outliers
– Protein backbone (Ramachandran) and sidechain (rotamericity, flips) outliers
– RNA backbone (atypical suites)
– Carbohydrates: chirality and naming
– Ligands
    – Features not observed in high-quality small-molecule crystal structures and other instances in the PDB
– Packing
    – Bad vdW clashes
    – Underpacking, voids
    – Unusual contacts
    – Unsatisfied H-bond donors and acceptors

VTF recommendations
Data-based indicators
– Wilson plot
– Data anisotropy plot
– Twinning (Padilla-Yeates plot)
– Mislabelling of amplitudes / intensities
– Translational NCS
– Missed symmetry
Data- and model-based indicators
– R, Rfree
    – Reproducibility and difference
– Real-space R
    – Per-residue measure of fit with the 2Fo-Fc map, normalized per residue type
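To make the R and Rfree indicator concrete: the conventional R factor is the sum of | |Fobs| - |Fcalc| | over reflections divided by the sum of |Fobs|, and Rfree is the same quantity over the test-set reflections held out of refinement. The sketch below is only an illustration. It assumes the amplitudes are already on a common scale and that a boolean flag marks each test-set reflection; the function names are hypothetical and are not taken from the validation package.

```python
from typing import Sequence, Tuple


def r_factor(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """Conventional R factor, assuming f_obs and f_calc share one scale."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("no observed amplitudes to compare against")
    return numerator / denominator


def r_work_and_free(
    f_obs: Sequence[float],
    f_calc: Sequence[float],
    free_flags: Sequence[bool],
) -> Tuple[float, float, float]:
    """Return (R_work, R_free, R_free - R_work).

    free_flags[i] is True when reflection i belongs to the test set.
    """
    work = [(o, c) for o, c, f in zip(f_obs, f_calc, free_flags) if not f]
    free = [(o, c) for o, c, f in zip(f_obs, f_calc, free_flags) if f]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in free], [c for _, c in free])
    return r_work, r_free, r_free - r_work


if __name__ == "__main__":
    # Toy amplitudes, invented purely for illustration.
    obs = [10.0, 20.0, 30.0, 40.0]
    calc = [9.0, 22.0, 30.0, 36.0]
    flags = [False, False, False, True]
    print(r_work_and_free(obs, calc, flags))
```

Comparing the recalculated values with those reported by the depositor would address the "reproducibility" item; the returned gap addresses the "difference" item.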

VTF recommendations
Percentile scores
– Per criterion, calculate the percentile rank against the whole set of X-ray entries and also against structures in the entry's resolution bin
– Update the percentiles periodically
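The two percentile ranks described above can be sketched as follows. This is a minimal illustration, not the package's method: the mid-rank treatment of ties, the definition of a resolution bin as a fixed window around the entry's resolution, and all names are assumptions made for the example.

```python
from bisect import bisect_left, bisect_right
from typing import List, Optional, Sequence, Tuple


def percentile_rank(value: float, population: Sequence[float],
                    lower_is_better: bool = True) -> float:
    """Percentage of the population that is worse than `value`.

    Ties count as half (mid-rank convention, an assumption of this sketch).
    """
    ordered = sorted(population)
    n = len(ordered)
    if n == 0:
        raise ValueError("empty reference population")
    n_less = bisect_left(ordered, value)
    n_less_or_equal = bisect_right(ordered, value)
    n_equal = n_less_or_equal - n_less
    n_greater = n - n_less_or_equal
    n_worse = n_greater if lower_is_better else n_less
    return 100.0 * (n_worse + 0.5 * n_equal) / n


def entry_percentiles(
    value: float,
    resolution: float,
    archive: List[Tuple[float, float]],
    half_width: float = 0.25,
) -> Tuple[float, Optional[float]]:
    """Return (rank vs. all entries, rank vs. entries of similar resolution).

    `archive` holds (resolution, criterion value) pairs; `half_width` is a
    hypothetical bin half-width in Angstrom.
    """
    overall = percentile_rank(value, [v for _, v in archive])
    similar = [v for r, v in archive if abs(r - resolution) <= half_width]
    in_bin = percentile_rank(value, similar) if similar else None
    return overall, in_bin


if __name__ == "__main__":
    # Invented (resolution, Rfree) pairs, for illustration only.
    archive = [(1.0, 0.15), (1.5, 0.18), (2.0, 0.20),
               (2.1, 0.22), (2.2, 0.25), (3.0, 0.30)]
    print(entry_percentiles(0.21, 2.0, archive))
```

Because the reference population changes as the archive grows, the stored percentiles would need to be recomputed whenever the distributions are refreshed.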

VTF recommendations
Presentation of results for various consumers
– Depositors (and annotators)
– Reviewers
    – Concise PDF report highlighting any unusual features
– End-users (experts and non-experts)
    – Web-based frontends with adjustable level of detail
– Developers
    – Webservices and XML files

VTF recommendations
Validation package
– Be open-source and freely distributable
    – wwPDB sites, labs, companies
– Import/wrap existing 3rd-party functionality
    – EDS (Uppsala), MolProbity, CCDC Mogul, WhatIf
    – Phenix, CCP4
    – RosettaHoles, pdb-care, DACA, ProSA
– Calculate recommended validation metrics and publish an XML file per entry
– Present XML contents in various kinds of reports

Prototypes: Validation Viewer
– Entry viewer
– Residue and maps viewer
– Raw data and plots of phi-psi, omega, chi, B-factor, occupancy, RSR, RSCC

New ligand-validation functionality
Mogul is a chemical mining engine developed by CCDC for small-molecule crystal structures in the CSD
– Splits the query molecule into bond, angle, torsion and ring substructures
– Finds comparable substructures from high-quality small-molecule structures in the CSD
Compares query substructures against CSD distributions
– Bonds, angles: Z-scores can be computed
– Torsions: a Z-score is undefined, but the comparison shows where a torsion lies w.r.t. the distribution
– Rings: computes the query ring's torsion RMSD (tRMSD) against each comparable CSD ring, then finds the mean and standard deviation of the tRMSDs to estimate a Z-score for the ring
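The arithmetic behind these comparisons can be sketched in plain Python. This does not call Mogul or any CCDC API; it assumes the comparable CSD values have already been retrieved as lists of numbers, and every function name is hypothetical. The slide does not say how the mean and standard deviation of the tRMSDs are turned into a ring Z-score, so the sketch stops at those two statistics.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def z_score(query: float, csd_values: Sequence[float]) -> float:
    """Z-score of a query bond length or angle against CSD observations."""
    sigma = stdev(csd_values)  # sample standard deviation, needs >= 2 values
    if sigma == 0:
        raise ValueError("CSD distribution has zero spread")
    return (query - mean(csd_values)) / sigma


def torsion_rmsd(a: Sequence[float], b: Sequence[float]) -> float:
    """RMSD between two sets of torsions in degrees, respecting periodicity."""
    if len(a) != len(b) or not a:
        raise ValueError("torsion lists must be non-empty and equal in length")
    # Wrap each difference into [-180, 180) so 170 and -170 differ by 20.
    diffs = [((x - y + 180.0) % 360.0) - 180.0 for x, y in zip(a, b)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def ring_trmsd_stats(
    query_ring: Sequence[float],
    csd_rings: Sequence[Sequence[float]],
) -> Tuple[float, float]:
    """Mean and standard deviation of the query ring's tRMSD to each CSD ring."""
    trmsds = [torsion_rmsd(query_ring, ring) for ring in csd_rings]
    return mean(trmsds), stdev(trmsds)


if __name__ == "__main__":
    # Invented numbers, for illustration only.
    print(z_score(1.60, [1.50, 1.52, 1.54]))
    print(ring_trmsd_stats([60, -60, 60], [[60, -60, 60], [50, -50, 50]]))
```

The periodic wrap in `torsion_rmsd` also hints at why a plain Z-score is not meaningful for torsions: their distributions are circular and often multimodal.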

Prototypes: Mogul webservice
– Distribution for the angle from Mogul
– 2D & 3D views of ligand
– Bonds, angles, torsions, rings with comparable CSD fragments
– Upload or select a ligand

[Architecture diagram; component labels:]
– Validation package (installed on each site)
– mmCIF under deposition
– D&A API
– Validation XML file (data, percentiles)
– Distributions calculator (runs yearly)
– Distributions Oracle database (time-stamped by year)
– Distributions webservice (if DB only at PDBe)
– D&A webservers
– D&A clients
– Released validation XML file
– D&A pipeline on all sites
– wwPDB sites (PDBe - ?)
– Public access

Validation XML
Contents
– Administrative
    – Version of validation package and various 3rd-party programs
    – Creation date
    – Distribution database version
– Hierarchy of validation XML for data
    Entry (id)
      Model (id)
        Chain (id)
          Residue (seqnum, icode, resname, gri)
            Atom (name, altcode, gai)
– Annotations
    – Level (e.g. chain), identifier (chain_id), attributes
    – Supports modular development of the validation package, as annotations can be appended as and when new wrapper modules are ready
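A minimal sketch of how such a file might be assembled with Python's standard `xml.etree.ElementTree` follows. The attribute names on the hierarchy elements come from the slide; the element names, the administrative attribute names, the annotation layout and all values are placeholders assumed for illustration, not the package's actual schema.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def build_validation_xml() -> ET.Element:
    """Build a toy validation document with one atom in the hierarchy."""
    root = ET.Element("validation")  # hypothetical root element name

    # Administrative block: versions and creation date (placeholder values).
    ET.SubElement(root, "administrative", {
        "package_version": "0.0",
        "creation_date": "2010-10-21",
        "distribution_db_version": "2010",
    })

    # Entry > Model > Chain > Residue > Atom, attributes as listed on the slide.
    entry = ET.SubElement(root, "entry", id="xxxx")
    model = ET.SubElement(entry, "model", id="1")
    chain = ET.SubElement(model, "chain", id="A")
    residue = ET.SubElement(chain, "residue",
                            seqnum="42", icode="", resname="ALA", gri="0")
    ET.SubElement(residue, "atom", name="CA", altcode="", gai="0")
    return root


def add_annotation(root: ET.Element, level: str, identifier: str,
                   attributes: Dict[str, str]) -> ET.Element:
    """Append one annotation without touching the existing hierarchy."""
    annotation = ET.SubElement(root, "annotation",
                               level=level, identifier=identifier)
    for key, value in attributes.items():
        annotation.set(key, value)
    return annotation


if __name__ == "__main__":
    doc = build_validation_xml()
    # A wrapper module finishing later can simply append its results.
    add_annotation(doc, "chain", "A", {"average_rsr": "0.12"})
    print(ET.tostring(doc, encoding="unicode"))
```

Keeping annotations separate from the hierarchy is what lets a new wrapper module add its output without changing anything written by earlier modules.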

Example annotations
Atom-level
– Clashes
Residue-level
– Average B factor, occupancy
– Phi-psi, Rama outliers
– Sidechain flips, rotamer outliers
– RNA backbone and pucker values
Atom-group-level
– Covalent bond-length and angle outliers
Chain-level
– WhatIf Rama score, average RSR, NCS deviation
Entry-level
– Rfree, clash-score
– Twinning, tNCS, anisotropy, fit to ideal Wilson plot

Summary
– VTF recommendations will be implemented in a validation package.
– The package will consist of modules which import/wrap 3rd-party functionality.
– The package will be open-source and freely distributable.
– A process for periodically updating distributions and validation XMLs will be implemented.