Automated Structure Prediction using Robetta in CASP11 Baker Group David Kim, Sergey Ovchinnikov, Frank DiMaio.

Updates since last CASP
1. Domain parsing and assembly
2. Alignment cluster ranking
3. Sequence covariance restraints (GREMLIN)
4. Plus various bug fixes!
CAMEO Benchmark: http://www.cameo3d.org

Robetta Server Pipeline
Informatics: sequence; domain parsing, template alignment and spatial restraint generation (HHSearch, RaptorX, SPARKS-X)
Modeling (for each domain): difficulty prediction; RosettaCM (template hybridization) for all targets; RosettaAB (fragment assembly) for hard targets; model selection; domain assembly

Alignment clusters
Alignment methods: HHSearch, RaptorX, SPARKS-X (PDB100)
Up to 10 alignments from each method
Cluster partial threads for distinct topologies
Rank clusters by P(correct): the probability that an alignment is within Δ GDT of the best
The probability distribution varies considerably with target difficulty (plots: easy target, hard target)

Target Difficulty Prediction
Predict target difficulty based on the degree of structural consensus between the top-ranked alignment from each threading program.
Plot: GDT and predicted difficulty correlation
Used in:
  Domain parsing
  Modeling decision making: run RosettaAB also if difficulty < 0.2 (twilight regime)
  Amount of RosettaCM sampling:
    Very easy targets (> 0.80 and sequence identity > 40%), run locally on cluster: 10-100
    Runs on Rosetta@Home: Easy (> 0.80): 2000; Medium (> 0.30): 4000; Hard (<= 0.30): 8000
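The thresholds on this slide can be read as a small decision table. The sketch below is one possible reading, not the Robetta implementation: the function name, the dictionary keys, and the choice to pass sequence identity as a percentage are all assumptions, and the slide does not state the unit of the sampling amounts (they are copied here as given).

```python
def plan_modeling(difficulty: float, sequence_identity_pct: float) -> dict:
    """Map a predicted difficulty score to a sampling plan.

    Hypothetical helper: the slide lists the cutoffs, but not how Robetta
    encodes them. Higher scores correspond to easier targets.
    """
    if difficulty > 0.80 and sequence_identity_pct > 40:
        # Very easy targets are run locally; the slide gives a range of 10-100.
        plan = {"tier": "very easy", "where": "local cluster", "cm_samples": (10, 100)}
    elif difficulty > 0.80:
        plan = {"tier": "easy", "where": "Rosetta@Home", "cm_samples": 2000}
    elif difficulty > 0.30:
        plan = {"tier": "medium", "where": "Rosetta@Home", "cm_samples": 4000}
    else:
        plan = {"tier": "hard", "where": "Rosetta@Home", "cm_samples": 8000}
    # Twilight regime: RosettaAB is run in addition to RosettaCM.
    plan["run_rosetta_ab"] = difficulty < 0.2
    return plan
```

For example, `plan_modeling(0.15, 12)` falls in the hard tier and also requests RosettaAB, while `plan_modeling(0.9, 55)` stays on the local cluster.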

Domain Parsing
Objective is to identify optimal non-overlapping alignment clusters
1. Run alignment method and partial-thread clustering on sequence
2. Identify potential “chunks” based on windowed difficulty of top alignment cluster (30 residue window; 0.1, 0.2, 0.3 difficulty thresholds; 30 residue minimum domain length)
3. Boundaries are fine-tuned using PSIPRED loop probability
4. All potential “chunks” are run through steps 1 to 3
5. Final “chunks” are selected based on difficulty
6. For twilight “chunks” (difficulty < 0.2), parsing is also based on Pfam and MSA (CASP9 GINZU method)
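Step 2 can be illustrated with a sliding-window scan. This is only a sketch under strong assumptions: the slide does not say how the windowed difficulty is computed, so the per-residue score list, the centred mean, and the rule "keep runs at or above the threshold" are all hypothetical. Only the 30-residue window, the 30-residue minimum length, and the 0.1/0.2/0.3 thresholds come from the slide.

```python
from typing import List, Tuple


def find_chunks(per_residue: List[float], threshold: float,
                window: int = 30, min_len: int = 30) -> List[Tuple[int, int]]:
    """Return 0-based inclusive (start, end) spans whose windowed mean
    score is at or above `threshold` and that span at least `min_len` residues."""
    n = len(per_residue)
    half = window // 2
    # Mean over a window centred on each residue, truncated at the termini.
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half)
        smoothed.append(sum(per_residue[lo:hi]) / (hi - lo))

    chunks, start = [], None
    # The -inf sentinel closes a run that reaches the last residue.
    for i, value in enumerate(smoothed + [float("-inf")]):
        if value >= threshold and start is None:
            start = i
        elif value < threshold and start is not None:
            if i - start >= min_len:
                chunks.append((start, i - 1))
            start = None
    return chunks


# Candidate chunks at each of the three thresholds named on the slide.
scores = [1.0] * 60 + [0.0] * 60  # made-up scores for illustration
candidates = {t: find_chunks(scores, t) for t in (0.1, 0.2, 0.3)}
```

Scanning at several thresholds yields nested candidate chunks, which matches the slide's description of feeding every candidate back through steps 1 to 3 before the final selection.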

Modeling Method
Inputs: sequence, template alignments, sequence-based fragments, threaded templates, restraint functions, GREMLIN
Steps: torsion space fragment insertion, Cartesian space template chunk recombination, gradient-based energy minimization/loop closure, full-atom refinement

BAKER SERVER MODELS
ALIGNMENT: HHblits, Jackhmmer; 90% identity redundancy cutoff
To find at least 2L sequences, vary E-value and coverage: 1e-20 to 1e-4 and 75% to 50%
CONTACT PREDICTIONS (GREMLIN): if >= 1L sequences
GREMLIN used if difficulty … = 1L
Used for 27 domains
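The sequence search described here amounts to relaxing two thresholds until enough sequences are collected, where L is the target length. The sketch below shows one way to express that loop. The slide gives only the endpoints (E-value 1e-20 to 1e-4, coverage 75% to 50%), so the intermediate grid points, the order in which the two thresholds are loosened, and the `count_hits` callback (a stand-in for an HHblits or Jackhmmer run followed by the 90% redundancy filter) are assumptions.

```python
from typing import Callable, Tuple

# Endpoints come from the slide; the intermediate grid points are assumed.
E_VALUES = [1e-20, 1e-10, 1e-6, 1e-4]
COVERAGES = [75, 60, 50]  # percent


def relax_search(length: int,
                 count_hits: Callable[[float, int], int]) -> Tuple[float, int, int]:
    """Loosen E-value and coverage until at least 2L sequences are found.

    `count_hits(e_value, coverage)` is a hypothetical callback returning the
    number of non-redundant sequences found at those settings. Returns
    (e_value, coverage, n_sequences) for the first setting that reaches 2L,
    or for the loosest setting if none does.
    """
    result = None
    for e_value in E_VALUES:
        for coverage in COVERAGES:
            n = count_hits(e_value, coverage)
            result = (e_value, coverage, n)
            if n >= 2 * length:
                return result
    return result


def enough_for_contacts(n_sequences: int, length: int) -> bool:
    """Contact prediction is attempted only with at least 1L sequences."""
    return n_sequences >= length
```

Starting from the strictest settings keeps the alignment as clean as possible and only admits more distant or partial hits when the depth target cannot be met otherwise.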

RosettaCM performance using GREMLIN
CASP11 targets (plots): Best vs Best (rerun without GREMLIN); Model1 vs Model1 (rerun without GREMLIN). Labeled target: T0768-D1
GREMLIN is used in ranking alignment clusters and sampling

RosettaAB performance using GREMLIN
CASP11 targets (plots): Best vs Best (rerun without GREMLIN); Model1 vs Model1 (rerun without GREMLIN). Labeled targets: T0789-D1, T0790-D2
GREMLIN is used in sampling

T0789-D1 trimmed (76 aa)
GREMLIN predicted contacts helped w/ ~2L sequences
Domain over parsed as 1-109 vs 6-151 official parse
2.84 Å RMSD over 71 res (figure label: NATIVE T0789-D1)
Models generated: 96731
Top scoring models clustered: 4519

T0790-D2 (130 aa)
GREMLIN predicted contacts helped w/ ~3L sequences
Domain under parsed as 101-293 vs 136-265 official parse
3.91 Å RMSD over 92 res (figure label: NATIVE T0790-D2)
Models generated: 28988
Top scoring models clustered: 1441

T0767-D2 (180 aa)
MSA based domain parse was accurate
Domain correctly parsed as 131-318 vs 133-312 official parse
3.99 Å RMSD over 92 res
Models generated: 29638
Top scoring models clustered: 1565

What went wrong
1. Ranking
  Twilight target ranking (to ab initio or not to ab initio that is the ?)
  New hybrid domain assembler (T0840-D1, T0852-D2)
2. Informatics
  Domain parse errors (T0808-D1, T0812-D1)
  Incorrect template (T0816-D1)

Ranking improved by a simple switch for twilight targets
Submitted model 1 (CM) vs Submitted model 2 (AB)
Simply choose submitted model 2 for AB targets (difficulty < 0.2)
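The switch described on this slide can be written in a few lines. This is a sketch rather than the submitted ranking code: the function name and arguments are hypothetical, and the default cutoff of 0.2 is an assumption taken from the twilight regime defined on the target difficulty slide.

```python
from typing import List, TypeVar

Model = TypeVar("Model")


def order_submissions(difficulty: float, cm_model: Model, ab_model: Model,
                      twilight_cutoff: float = 0.2) -> List[Model]:
    """Return the two models in submission order (model 1 first).

    For twilight targets the RosettaAB model is promoted to model 1;
    otherwise the RosettaCM model keeps the first slot.
    """
    if difficulty < twilight_cutoff:
        return [ab_model, cm_model]
    return [cm_model, ab_model]
```

Keeping the cutoff as a parameter makes it easy to test how sensitive the ranking is to where the twilight boundary is drawn.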

Model1 CM GDT vs Model1 AB GDT Colored by difficulty

Acknowledgements
Hetunandan Kamisetty (GREMLIN)
Johannes Söding (HHpred)
Jinbo Xu (RaptorX)
Yaoqi Zhou and Yuedong Yang (SPARKS-X)
Rosetta Commons
David Baker
Rosetta@home users for generous computing resources
Juergen Haas (CAMEO http://www.cameo3d.org)
CASP organizers, assessors, and structural biologists who provided structures
Andriy Kryshtafovych
