Published by Alvin Hampton. Modified over 8 years ago.
1
Automated Structure Prediction using Robetta in CASP11
Baker Group
David Kim, Sergey Ovchinnikov, Frank DiMaio
2
Updates since last CASP
1. Domain parsing and assembly
2. Alignment cluster ranking
3. Sequence covariance restraints (GREMLIN)
4. Plus various bug fixes!
CAMEO Benchmark: http://www.cameo3d.org
3
Robetta Server Pipeline
Sequence
Informatics: HHSearch, RaptorX, SPARKS-X; domain parsing, template alignment and spatial restraint generation
Modeling, for each domain:
  Difficulty prediction
  RosettaCM (template hybridization): all targets
  RosettaAB (fragment assembly): hard targets
  Model selection
Domain assembly
4
Alignment clusters
HHSearch, RaptorX, SPARKS-X; PDB100
Up to 10 alignments from each method
Cluster partial threads for distinct topologies
Rank clusters by P(correct): the probability that an alignment is within Δ GDT of the best
Probability distribution varies considerably with target difficulty
[Figure panels: Easy target, Hard target]
5
Target Difficulty Prediction
Predict target difficulty based on the degree of structural consensus between the top-ranked alignments from each threading program.
[Figure: GDT and predicted difficulty correlation]
Used in:
  Domain parsing
  Modeling decision making: run RosettaAB also if < 0.2 (twilight regime)
  Amount of RosettaCM sampling:
    Very easy targets (> 0.80 and sequence identity > 40%): 10-100, run locally on cluster
    Runs on Rosetta@Home:
      Easy (> 0.80): 2000
      Medium (> 0.30): 4000
      Hard (<= 0.30): 8000
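The thresholds on this slide can be read as a small decision table. The sketch below is only an illustration of that reading, not the Robetta code: the function name `plan_sampling`, the dictionary keys, and the convention that sequence identity is passed as a fraction are all assumptions, and the score is taken to be higher for easier targets because the slide labels > 0.80 as easy.

```python
def plan_sampling(difficulty, seq_identity):
    """Map a predicted difficulty score to the sampling tiers on the slide.

    difficulty   -- predicted score in [0, 1]; higher is read here as easier
    seq_identity -- template sequence identity as a fraction in [0, 1]
    (Both argument conventions are assumptions made for this sketch.)
    """
    if difficulty > 0.80 and seq_identity > 0.40:
        tier, amount, where = "very easy", (10, 100), "local cluster"
    elif difficulty > 0.80:
        tier, amount, where = "easy", (2000, 2000), "Rosetta@Home"
    elif difficulty > 0.30:
        tier, amount, where = "medium", (4000, 4000), "Rosetta@Home"
    else:
        tier, amount, where = "hard", (8000, 8000), "Rosetta@Home"
    return {
        "tier": tier,
        "rosettacm_sampling": amount,  # (min, max) amount quoted on the slide
        "where": where,
        # Twilight regime: RosettaAB is run in addition to RosettaCM.
        "also_run_rosettaab": difficulty < 0.2,
    }


if __name__ == "__main__":
    print(plan_sampling(0.15, 0.10))
```

Keeping the twilight test separate from the tier lookup mirrors the slide, where the RosettaAB decision and the RosettaCM sampling amount are listed as two distinct uses of the same score.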
6
Domain Parsing
Objective is to identify optimal non-overlapping alignment clusters
1. Run alignment method and partial-thread clustering on sequence
2. Identify potential “chunks” based on windowed difficulty of top alignment cluster
   30 residue window
   0.1, 0.2, 0.3 difficulty thresholds
   30 residue minimum domain length
3. Boundaries are fine-tuned using PSIPRED loop probability
4. All potential “chunks” are run through steps 1 to 3.
5. Final “chunks” are selected based on difficulty.
6. For twilight “chunks” (difficulty < 0.2) parsing is also based on Pfam and MSA (CASP9 GINZU method)
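Step 2 can be pictured as a sliding-window scan. The sketch below is a guess at the general idea rather than the server's implementation: it assumes a hypothetical per-residue score derived from the top alignment cluster (the slide does not say how that score is computed), smooths it with a 30-residue window, and keeps contiguous stretches at or above a threshold that are at least 30 residues long. The names `windowed_mean` and `find_chunks` are made up for illustration.

```python
def windowed_mean(values, window=30):
    """Sliding mean over `window` residues, truncated at the chain ends."""
    half = window // 2
    n = len(values)
    smoothed = []
    for i in range(n):
        segment = values[max(0, i - half):min(n, i + half)]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


def find_chunks(per_residue, threshold, window=30, min_len=30):
    """Return (start, end) half-open ranges whose windowed score >= threshold.

    per_residue -- hypothetical per-residue score from the top alignment
                   cluster (an assumption of this sketch)
    """
    smoothed = windowed_mean(per_residue, window)
    chunks, start = [], None
    for i, value in enumerate(smoothed):
        if value >= threshold and start is None:
            start = i                      # a candidate chunk opens here
        elif value < threshold and start is not None:
            if i - start >= min_len:       # enforce minimum domain length
                chunks.append((start, i))
            start = None
    if start is not None and len(smoothed) - start >= min_len:
        chunks.append((start, len(smoothed)))
    return chunks


if __name__ == "__main__":
    scores = [1.0] * 50 + [0.0] * 60 + [1.0] * 50
    for threshold in (0.1, 0.2, 0.3):      # thresholds listed on the slide
        print(threshold, find_chunks(scores, threshold))
```

Scanning at several thresholds yields nested candidate chunks, which fits the slide's later steps of re-running each candidate and selecting the final set by difficulty.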
7
Modeling Method
Sequence
Template Alignments
Sequence-based fragments
Restraint functions
Threaded templates
Gradient-based energy minimization/loop closure
Torsion space fragment insertion
Cartesian space template chunk recombination
Full-atom refinement
Gremlin
8
BAKER SERVER MODELS
ALIGNMENT: HHblits, Jackhmmer
  90% identity redundancy cutoff
  To find at least 2L sequences, vary e-value and coverage: 1e-20 to 1e-4 and 75% to 50%
If >= 1L sequences: CONTACT PREDICTIONS (GREMLIN)
TARGET
GREMLIN used if difficulty = 1L
Used for 27 domains
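The alignment step on this slide reads as a relaxation loop: start with strict search settings and loosen them until at least 2L sequences are found, where L is the target length. The sketch below illustrates that loop under stated assumptions. Only the end points (1e-20 to 1e-4, 75% to 50%) come from the slide; the intermediate steps in `SCHEDULE`, the order in which e-value and coverage are relaxed, and the `search` callable (a stand-in for an HHblits or Jackhmmer run followed by redundancy filtering) are hypothetical.

```python
# (e-value, minimum coverage) pairs, strictest first. Intermediate values and
# the relaxation order are assumptions; only the end points are from the slide.
SCHEDULE = [
    (1e-20, 0.75),
    (1e-10, 0.75),
    (1e-6, 0.75),
    (1e-4, 0.75),
    (1e-4, 0.60),
    (1e-4, 0.50),
]


def collect_sequences(search, length, schedule=SCHEDULE):
    """Relax search settings until at least 2L sequences are found.

    search -- hypothetical callable (evalue, min_coverage) -> list of
              sequences, standing in for the real alignment tools
    length -- target length L in residues
    """
    if not schedule:
        raise ValueError("schedule must contain at least one setting")
    sequences, used = [], None
    for evalue, coverage in schedule:
        sequences = search(evalue, coverage)
        used = (evalue, coverage)
        if len(sequences) >= 2 * length:   # goal stated on the slide
            break
    return {
        "sequences": sequences,
        "evalue": used[0],
        "coverage": used[1],
        # Contact prediction is attempted only with at least 1L sequences.
        "enough_for_contacts": len(sequences) >= length,
    }


if __name__ == "__main__":
    def fake_search(evalue, min_coverage):
        # Toy stand-in: looser e-values return more sequences.
        return ["seq"] * (250 if evalue >= 1e-6 else 10)

    result = collect_sequences(fake_search, length=100)
    print(result["evalue"], result["coverage"], len(result["sequences"]))
```

If the loop exhausts the schedule without reaching 2L, it returns whatever the loosest setting produced, and the 1L check decides whether contact prediction is still worth running.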
9
RosettaCM performance using GREMLIN
CASP11 targets
[Figure panels: Best vs Best (rerun without GREMLIN); Model1 vs Model1 (rerun without GREMLIN); labeled target: T0768-D1]
GREMLIN is used in ranking alignment clusters and sampling
10
RosettaAB performance using GREMLIN
CASP11 targets
[Figure panels: Best vs Best (rerun without GREMLIN); Model1 vs Model1 (rerun without GREMLIN); labeled targets: T0789-D1, T0790-D2]
GREMLIN is used in sampling
11
T0789-D1 trimmed (76 aa)
GREMLIN predicted contacts helped w/ ~2L sequences
Domain over parsed as 1-109 vs 6-151 official parse
Models generated: 96731
Top scoring models clustered: 4519
2.84 Å RMSD over 71 res
[Figure labels: NATIVE, T0789-D1]
12
T0790-D2 (130 aa)
GREMLIN predicted contacts helped w/ ~3L sequences
Domain under parsed as 101-293 vs 136-265 official parse
Models generated: 28988
Top scoring models clustered: 1441
3.91 Å RMSD over 92 res
[Figure labels: NATIVE, T0790-D2]
13
T0767-D2 (180 aa)
Domain correctly parsed as 131-318 vs 133-312 official parse
MSA based domain parse was accurate
Models generated: 29638
Top scoring models clustered: 1565
3.99 Å RMSD over 92 res
14
What went wrong
1. Ranking
   Twilight target ranking (to ab initio or not to ab initio, that is the ?)
   New hybrid domain assembler (T0840-D1, T0852-D2)
2. Informatics
   Domain parse errors (T0808-D1, T0812-D1)
   Incorrect template (T0816-D1)
15
Ranking improved by a simple switch for twilight targets
[Figure: Submitted model 1 (CM) vs Submitted model 2 (AB)]
Simply choose submitted model 2 for AB targets (difficulty < 0.2)
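The switch described here amounts to a one-line rule. The sketch below is a hypothetical illustration, assuming the twilight cutoff of 0.2 given on the difficulty prediction and domain parsing slides; the function name and arguments are made up, and the models are treated as opaque objects.

```python
def choose_model1(cm_model, ab_model, difficulty, twilight=0.2):
    """Pick which submitted model to rank first.

    cm_model, ab_model -- the RosettaCM and RosettaAB candidates (opaque here)
    difficulty         -- predicted difficulty score for the target
    twilight           -- assumed twilight cutoff, taken from the other slides
    """
    # Twilight targets: prefer the RosettaAB model; otherwise keep RosettaCM.
    return ab_model if difficulty < twilight else cm_model


if __name__ == "__main__":
    print(choose_model1("CM", "AB", difficulty=0.12))
    print(choose_model1("CM", "AB", difficulty=0.45))
```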
16
[Figure: Model1 CM GDT vs Model1 AB GDT, colored by difficulty]
17
Acknowledgements
Hetunandan Kamisetty (GREMLIN)
Johannes Söding (HHpred)
Jinbo Xu (RaptorX)
Yaoqi Zhou and Yuedong Yang (SPARKS-X)
Rosetta Commons
David Baker
Rosetta@home users for generous computing resources
Juergen Haas (CAMEO, http://www.cameo3d.org)
CASP organizers, assessors, structural biologists who provided structures
Andriy Kryshtafovych