Jianlin Jack Cheng, Computer Science Department, University of Missouri, Columbia, USA. Mexico, 2014.


Targeted Sampling: Fold Space, Alignment Space, Model Pool, Sequence Space, Model Generation, Template & Alignment Combination

Internal or CASP Model Pool, Combination, Refinement, Side Chain Tuning, Massive Assessment, Model Ranking

Fold Sampling
Samplers: BLAST, CSBLAST, CSIBLAST, PSIBLAST, SAM, HMMer, HHSearch, HHblits, HHsuite, MULTICOM, PRC, FFAS, Compass, MUSTER, RaptorX
Template Library: 125,000 templates (in-house), 39,000 (in-house), third-party (local)
Alignment Combination:
1. Alignment combination based on E-values
2. Alignment combination based on structures
3. Multiple sequence alignment + structural features
Model Generation: Modeller, MTMG, FUSION
150–200 models

MULTICOM (Server), MULTICOM (Human)
Servers (partial list)    Best (domains)
nns                       10
BAKER-ROSETTASERVER       8
IntFOLD3                  7
Zhang-Server              4
TASSER-VMT                3
MULTICOM Server           2
QUARK                     2
RBO_Aleph                 2
HHPred-A                  2
FFAS-3D                   2
myprotein-me              2
PhyreX                    1
SAM-T08-server            1
ZHOU-SPARKS-X             1
HHPred-X                  1

Methods (blue: in-house)  Type              Features
MULTICOM-NOVEL            Single            Structural, physical, chemical features
OPUS-PSP                  Single            Cα atom contact potentials
Proq2                     Single            Structural features
RWplus                    Single            Side-chain orientation dependent potential
ModelEva1                 Single            Structural features, contacts
ModelEva2                 Single            Structural features, contacts, disorder, conservation
RS_CB_SRS                 Single            Distance dependent statistical potential
SELECTpro                 Single            Energy-based (h-bond, angle, electrostatics, vdw)
Dope                      Single            Statistical potential
DFire2                    Single            Energy-based potential
Modfoldcluster2           Cluster           Pairwise model similarity (geometry)
APOLLO                    Cluster           Pairwise model similarity
Pcons                     Cluster           Pairwise model similarity
QApro                     Cluster + Single  Weighted pairwise model similarity
MULTICOM (human)          Consensus         Average ranking
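The last row describes the MULTICOM (human) consensus as an average ranking over the individual quality assessment methods. The slides give no implementation details, so the following is only a minimal sketch of one way average ranking could work. The function name, the method labels `qa_a`/`qa_b`/`qa_c`, the model names, and all scores are hypothetical, and the sketch assumes every method scores every model.

```python
from collections import defaultdict


def average_rank_consensus(scores_by_method, higher_is_better):
    """Rank models within each method, then average each model's ranks.

    scores_by_method: {method: {model: score}} (hypothetical input layout)
    higher_is_better: {method: bool}; False for energy-like scores
    Returns a list of (model, average_rank), best (lowest rank) first.
    Ties inside a method are broken by dictionary order, a simplification.
    """
    rank_sums = defaultdict(float)
    for method, scores in scores_by_method.items():
        # Best model first: descending for similarity-like scores,
        # ascending for energy-like scores.
        ordered = sorted(scores, key=scores.get, reverse=higher_is_better[method])
        for rank, model in enumerate(ordered, start=1):
            rank_sums[model] += rank
    n_methods = len(scores_by_method)
    averages = {model: total / n_methods for model, total in rank_sums.items()}
    return sorted(averages.items(), key=lambda item: item[1])


# Hypothetical scores for three models from three QA methods.
example_scores = {
    "qa_a": {"m1": 0.8, "m2": 0.6, "m3": 0.7},           # higher is better
    "qa_b": {"m1": -120.0, "m2": -150.0, "m3": -100.0},  # energy, lower is better
    "qa_c": {"m1": 0.5, "m2": 0.4, "m3": 0.9},           # higher is better
}
example_direction = {"qa_a": True, "qa_b": False, "qa_c": True}

ranking = average_rank_consensus(example_scores, example_direction)
for model, avg_rank in ranking:
    print(f"{model}: average rank {avg_rank:.2f}")
```

Averaging ranks rather than raw scores sidesteps the fact that the listed methods report values on different scales (statistical potentials, energies, similarity scores).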

Methods (blue: in-house)  Type              Average GDT-TS  # Better  # Best
MULTICOM-NOVEL            Single
OPUS-PSP                  Single
Proq2                     Single
RWplus                    Single
ModelEva1                 Single
ModelEva2                 Single
RS_CB_SRS                 Single            0.343
SELECTpro                 Single            0.411
Dope                      Single
DFire2                    Single
Modfoldcluster2           Cluster           0.403
APOLLO                    Cluster
Pcons                     Cluster           0.402
QApro                     Cluster + Single
MULTICOM (human)          Consensus
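The cluster-type methods in this comparison (Modfoldcluster2, APOLLO, Pcons) are characterized in the talk as pairwise model similarity. A common way to turn pairwise similarity into a per-model score is to average each model's similarity to all other models in the pool. The sketch below illustrates that general idea only; it is not the actual code of any of these tools. The function name, model names, and similarity values are hypothetical, and the similarity matrix is assumed to be precomputed.

```python
def pairwise_consensus_scores(similarity):
    """Score each model by its mean similarity to every other model.

    similarity: {model_a: {model_b: value}} with values in [0, 1],
    assumed symmetric and precomputed (for example a structural
    similarity score between two models).
    """
    models = list(similarity)
    scores = {}
    for a in models:
        others = [similarity[a][b] for b in models if b != a]
        # A pool with a single model has nothing to compare against.
        scores[a] = sum(others) / len(others) if others else 0.0
    return scores


# Hypothetical similarity values for a pool of three models.
example_similarity = {
    "m1": {"m2": 0.8, "m3": 0.3},
    "m2": {"m1": 0.8, "m3": 0.4},
    "m3": {"m1": 0.3, "m2": 0.4},
}

consensus = pairwise_consensus_scores(example_similarity)
best_model = max(consensus, key=consensus.get)
print(best_model, round(consensus[best_model], 3))
```

A score of this kind rewards models that resemble the rest of the pool, so it depends on the pool containing a sizable group of good models.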

Combine similar models or fragments
3DRefine (energy, bond, angle) + FUSION to refold unaligned loops and tails + SCWRL for side chain packing (server)
Automated detection and replacement of bad models (worked in all 13 server exception cases)

Templates: 4IB2, 4EF1, 4OTE, 4K3F, 3UP9, 3GXA, 4GOT
The best server model designated as the first model
[Figure: Distribution of GDT-TS Scores of MULTICOM Server Models. Labels: GDT: 0.87, GDT: 0.73, GDT:]
[Figure: Blue: structure, Gold: model. GDT-TS score: 0.86]
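GDT-TS, the score reported on these slides, is commonly defined as the mean of the fractions of Cα atoms that lie within 1, 2, 4 and 8 Å of their positions in the experimental structure. The full measure searches over many superpositions and keeps the best fraction for each cutoff. The sketch below is a simplification that assumes the model has already been superimposed on the structure, so it can underestimate the true score. The function name and the coordinates are hypothetical.

```python
import math


def gdt_ts_fixed_superposition(model_ca, native_ca, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Approximate GDT-TS (0 to 1 scale) for one fixed superposition.

    model_ca, native_ca: equal-length lists of (x, y, z) C-alpha
    coordinates in angstroms, residue i of the model matching residue i
    of the native structure. The real GDT-TS additionally maximizes
    each fraction over superpositions, which is omitted here.
    """
    if len(model_ca) != len(native_ca) or not native_ca:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    distances = [math.dist(m, n) for m, n in zip(model_ca, native_ca)]
    # Fraction of residues within each distance cutoff.
    fractions = [sum(d <= c for d in distances) / len(distances) for c in cutoffs]
    return sum(fractions) / len(cutoffs)


# Hypothetical four-residue example: deviations of 0.5, 1.5, 3.0 and 10.0 angstroms.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
model = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]

print(gdt_ts_fixed_superposition(model, native))
```

In this example the fractions at the four cutoffs are 0.25, 0.5, 0.75 and 0.75, so the sketch returns 0.5625, on the same 0 to 1 scale the slides use.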

[Figure: Blue: structure, Gold: model. GDT-TS score: 0.59]
Server models: Zhang-Server_TS1, BAKER-ROSETTASERVER_TS4, myprotein-me_TS1
Human model is better than Zhang-Server_TS1
[Figure: Distribution of GDT-TS Scores of CASP Server Models]

[Figure: Blue: structure, Gold: model. GDT-TS score: 0.63]
Human model: the same GDT-TS score, better side-chain quality
Server models: nns_TS1, nns_TS3, nns_TS2, FFAS-3D_TS1
[Figure: Distribution of GDT-TS Scores of CASP Server Models]

[Figure: Blue: structure, Gold: model. GDT-TS score: ~0.22]
Selected and combined models of low (average) quality
[Figure: Distribution of GDT-TS Scores of CASP Server Models]

Large-scale independent sampling
Large-scale quality assessment
Exception handling
Model combination
Model refinement
Model refolding
Template recognition in thin, remote profile
Alignment in thin, remote profile
Quality assessment with few good models

Group Members: Badri Adhikari, Deb Bhattacharya, Renzhi Cao, Jilong Li
CASP Assessors
Dr. Roland Dunbrack
CASP Organizers
CASP Server Predictors