Lab 10.2: Homology Modeling Lab
Boris Steipe
Departments of Biochemistry and Molecular and Medical Genetics
Program in Proteomics and Bioinformatics
University of Toronto
Concepts
1. Sequence alignment is the single most important step in homology modeling.
2. Reasons to model need to be defined.
3. Fully automated homology modeling services perform well.
4. SwissModel in practice.
Concept 1: Sequence alignment is the single most important step in homology modeling.
Superposition vs Alignment
In a superposition, the coordinates of two proteins are “superimposed” in space.
An alignment may be derived from it by pairing the alpha-carbons that correspond spatially.
The result may differ from an optimized symbolic (sequence) alignment...
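How an alignment can be read off a superposition may be sketched as follows. This is a minimal greedy illustration, not the method of any particular program: it assumes the two chains are already superimposed, and the function name and the 3.0 Å cutoff are hypothetical choices made for the example.

```python
import math

def alignment_from_superposition(ca_a, ca_b, cutoff=3.0):
    """Pair alpha-carbons of two chains that are ALREADY superimposed.

    ca_a, ca_b: lists of (x, y, z) tuples, one per residue, in chain order.
    cutoff: maximum pairing distance in Angstrom (illustrative assumption).
    Returns (i, j) index pairs that increase in both i and j.
    """
    pairs = []
    next_j = 0  # only look forward in chain B so the pairing stays sequential
    for i, a in enumerate(ca_a):
        best_j, best_d = None, cutoff
        for j in range(next_j, len(ca_b)):
            d = math.dist(a, ca_b[j])
            if d <= best_d:
                best_j, best_d = j, d
        if best_j is not None:
            pairs.append((i, best_j))
            next_j = best_j + 1
    return pairs

# Toy coordinates: chain B carries one extra residue that loops away.
chain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
chain_b = [(0.1, 0.0, 0.0), (3.9, 0.0, 0.0), (20.0, 0.0, 0.0), (7.5, 0.0, 0.0)]
print(alignment_from_superposition(chain_a, chain_b))
```

Residues left unpaired correspond to gaps in the structure-derived alignment. A greedy scan like this can make poor choices in crowded regions, so dynamic programming over the distance matrix would be the more careful option.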
Insert of 4 residues
Optimal sequence alignment
gktlit-----nfsqehip
gktlisflyeqnfsqehip
Optimal structure alignment (blue=helix)
gktlitnfsq-----ehip
gktlisflyeqnfsqehip
Off by 1, Off by 4
A shift in the alignment of 1 residue corresponds to a skew in the modeled structure of about 4 Å (3.8 Å is the distance between consecutive alpha-carbons).
Nothing you can do AFTER an alignment will fix this error (not even molecular dynamics).
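The size of this error can be checked with a little geometry. The sketch below builds the alpha-carbon trace of an idealized alpha-helix and measures how far each residue ends up from its correct position when the register is shifted. The helix parameters (radius 2.3 Å, rise 1.5 Å and 100° per residue) are textbook values assumed for illustration, and the function names are hypothetical.

```python
import math

def helix_ca(n, radius=2.3, rise=1.5, turn_deg=100.0):
    """Alpha-carbon coordinates of an idealized alpha-helix along the z axis."""
    coords = []
    for k in range(n):
        phi = math.radians(turn_deg * k)
        coords.append((radius * math.cos(phi), radius * math.sin(phi), rise * k))
    return coords

def shift_displacement(coords, shift):
    """Mean distance between residue i and residue i + shift.

    This is the positional error made when every residue is modeled
    `shift` positions out of register along the template.
    """
    dists = [math.dist(coords[i], coords[i + shift])
             for i in range(len(coords) - shift)]
    return sum(dists) / len(dists)

ca = helix_ca(20)
for shift in (1, 2, 3, 4):
    print(f"shift of {shift}: {shift_displacement(ca, shift):.1f} A")
```

A shift of one residue gives the inter-alpha-carbon distance quoted above. Larger shifts do not grow linearly in a helix, because the chain winds around the axis.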
Alignment is the limiting step for homology model accuracy
No amount of force field minimization will put a misaligned residue in the right place!
CASP4: Williams MG et al. (2001) Proteins Suppl.5: 92-97
Indels (inserts or deletions)
Comparison of known similar structures demonstrates that uniform gap penalty assumptions are NOT BIOLOGICAL.
Indels are most often observed in loops, less often in secondary structure elements.
When they occur outside loops, the helical or strand properties are usually maintained.
Can we do better with the gap assumption?
Required: position-specific gap penalties.
One approach: implemented in Clustal as secondary structure masks.
Get secondary structure information, convert it to Clustal mask format. (Easy: read the documentation!)
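The idea of position-specific gap penalties can be sketched with a small global alignment in which gaps cost more inside template secondary structure. This is an illustration of the principle only, not Clustal's implementation or its mask format. The scoring values, the function name and the helix mask used below are all assumptions; in particular, which residues of the example sequence are helical cannot be recovered from the slide.

```python
def align_with_gap_mask(query, template, template_ss,
                        match=2, mismatch=-1, loop_gap=2, sse_gap=10):
    """Needleman-Wunsch alignment with linear, position-specific gap costs.

    template_ss: one character per template residue, 'H' or 'E' marks a
    secondary structure element, anything else is treated as loop.
    """
    n, m = len(query), len(template)
    in_sse = [c in "HE" for c in template_ss]

    def del_cost(j):
        # template residue j (0-based) is left without a query partner
        return sse_gap if in_sse[j] else loop_gap

    def ins_cost(j):
        # a query residue is inserted after the first j template residues
        inside = 0 < j < m and in_sse[j - 1] and in_sse[j]
        return sse_gap if inside else loop_gap

    def sub(i, j):
        return match if query[i - 1] == template[j - 1] else mismatch

    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = F[i - 1][0] - ins_cost(0)
    for j in range(1, m + 1):
        F[0][j] = F[0][j - 1] - del_cost(j - 1)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(F[i - 1][j - 1] + sub(i, j),
                          F[i - 1][j] - ins_cost(j),
                          F[i][j - 1] - del_cost(j - 1))

    # Trace back from the bottom-right corner to recover one optimal path.
    q_out, t_out = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + sub(i, j):
            q_out.append(query[i - 1])
            t_out.append(template[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and F[i][j] == F[i - 1][j] - ins_cost(j):
            q_out.append(query[i - 1])
            t_out.append("-")
            i -= 1
        else:
            q_out.append("-")
            t_out.append(template[j - 1])
            j -= 1
    return "".join(reversed(q_out)), "".join(reversed(t_out))

query = "gktlisflyeqnfsqehip"
template = "gktlitnfsqehip"
uniform = "-" * len(template)   # no structural information
helix = "-----HHHHH----"        # ASSUMED helix over residues t n f s q
for mask in (uniform, helix):
    q_aln, t_aln = align_with_gap_mask(query, template, mask)
    print(t_aln)
    print(q_aln)
    print()
```

With the uniform mask, the gap goes wherever sequence identity is highest. With the helix mask, opening a gap between two helical residues becomes expensive, so the gap is pushed towards the loop positions.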
Secondary structure from PDB... (Algorithm?)
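Reading the secondary structure annotation out of a PDB file can be sketched as below. The column positions follow the fixed-width HELIX and SHEET records of the PDB file format. The function name is hypothetical, insertion codes are ignored, and the file does not say which algorithm produced the assignment, which is exactly the question raised here.

```python
def secondary_structure_from_pdb(lines, chain, first_res, last_res):
    """Return one character per residue: H (helix), E (strand), - (other).

    lines: iterable of PDB-format lines; chain: one-letter chain ID;
    first_res, last_res: residue number range to report (inclusive).
    """
    ss = {n: "-" for n in range(first_res, last_res + 1)}
    for line in lines:
        if line.startswith("HELIX "):
            # chain ID in column 20, residue numbers in columns 22-25 and 34-37
            rec_chain, start, end, code = line[19], line[21:25], line[33:37], "H"
        elif line.startswith("SHEET "):
            # chain ID in column 22, residue numbers in columns 23-26 and 34-37
            rec_chain, start, end, code = line[21], line[22:26], line[33:37], "E"
        else:
            continue
        if rec_chain != chain:
            continue
        for n in range(int(start), int(end) + 1):
            if n in ss:
                ss[n] = code
    return "".join(ss[n] for n in range(first_res, last_res + 1))
```

A typical call would pass the lines of an opened PDB file, for example `secondary_structure_from_pdb(open(path), "A", 1, 120)`, where `path`, the chain ID and the residue range are placeholders. The resulting string still has to be converted to whatever mask format the alignment program expects.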
Secondary structure from RasMol... (DSSP!)
Concept 2: Reasons to model need to be defined.
Use of homology models
Interpreting homology models: biochemical inference from 3D similarity
Bonds
Angles, plain and dihedral
Surfaces, solvent accessibility
Amino acid functions, presence in structure patterns
Spatial relationship of residues to active site
Spatial relationship to other residues
Participation in function / mechanism
Static and dynamic disorder
Electrostatics
Conservation patterns (structural and functional)
Posttranslational modification sites
Abuse of homology models
Modelling structures that cannot / will not be verified
Analysing geometry of model
Interpreting loop structures
Databases of Models
Don’t make models unless you check first...
Swiss-Model repository: 64,000 models based on 4,000 structures and Swiss-Prot proteins.
ModBase: made with "Modeller"; 15,000 reliable models for substantial segments of approximately 4,000 proteins in the genomes of Saccharomyces cerevisiae, Mycoplasma genitalium, Methanococcus jannaschii, Caenorhabditis elegans, and Escherichia coli.
Concept 3: Fully automated services perform well.
Homology Modeling Software?
Freely available packages perform as well as commercial ones at CASP (Critical Assessment of Structure Prediction).
Swiss Model (tutorial)
Modeller
Swiss-Model steps: Peitsch M & Guex N (1997) Electrophoresis 18:
1. Search for sequence similarities: BLASTP against EX-NRL 3D.
2. Evaluate suitable templates: identity > 25%, expected model > 20 residues.
3. Generate structural alignments: select regions of similarity and match in coordinate space (EXPDB).
4. Average backbones: compute weighted average coordinates for backbone atoms expected to be in the model.
5. Build loops: pick plausible loops from a library and ligate them to the stems; if that is not possible, try a combinatorial search.
6. Bridge incomplete backbones: bridge with overlapping pieces from a pentapeptide fragment library, anchor with the terminal residues and add the three central residues.
7. Rebuild sidechains: rebuild sidechains from a rotamer library, complete sidechains first, then regenerate partial sidechains from a probabilistic approach.
8. Energy minimize: Gromos 96 energy minimization.
9. Write alignment and PDB file (results).
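Step 4 (average backbones) can be illustrated with a short sketch. It assumes the templates are already superimposed and that equivalent atoms have been matched by index. How Swiss-Model actually chooses its weights is not stated here, so the weights are simply an input, and the function name is hypothetical.

```python
def average_backbone(templates, weights):
    """Weighted average of equivalent backbone atoms across templates.

    templates: list of coordinate lists; templates[t][k] is the (x, y, z)
               of atom k in template t, or None if that template has no
               equivalent atom at this position.
    weights:   one positive weight per template (weighting scheme assumed).
    Returns a list of averaged (x, y, z) tuples, None where no template
    contributes.
    """
    n_atoms = len(templates[0])
    model = []
    for k in range(n_atoms):
        total_w = 0.0
        acc = [0.0, 0.0, 0.0]
        for coords, w in zip(templates, weights):
            if coords[k] is None:
                continue  # this template does not cover atom k
            total_w += w
            for axis in range(3):
                acc[axis] += w * coords[k][axis]
        model.append(tuple(a / total_w for a in acc) if total_w else None)
    return model

# Two toy templates; the second one lacks the second atom.
template_1 = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
template_2 = [(3.0, 0.0, 0.0), None]
print(average_backbone([template_1, template_2], [1.0, 2.0]))
```

Positions covered by only one template simply inherit that template's coordinates, and positions covered by none are left open for the loop building and bridging steps.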
Swissmodel in comparison
3D-Crunch: 211,000 sequences -> 64,000 models
Controls:
>50% ID: ~1 Å RMSD
40-49% ID: 63% < 3 Å
25-29% ID: 49% < 4 Å
Guex et al. (1999) TIBS 24:
EVA: Eyrich et al. (2001) Bioinformatics 17:
Manual alternatives: Modeller...
Automatic alternatives: SwissModel, sdsc1, 3djigsaw, pcomb_pcons, cphmodels, easypred
#1 for RMSD and % correctly aligned, #2 for coverage
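The control figures above are expressed as RMSD between model and experimental structure. A minimal sketch of that measure follows, assuming the two coordinate sets are already superimposed and matched atom by atom. The function names and the toy numbers are illustrative only.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two matched coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

def fraction_below(rmsd_values, limit):
    """Fraction of models whose RMSD is below `limit` (in Angstrom)."""
    return sum(1 for v in rmsd_values if v < limit) / len(rmsd_values)

model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
control = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
print(round(rmsd(model, control), 2))
print(fraction_below([0.8, 2.5, 3.5, 5.0], 3.0))
```

A statement such as "63% < 3 Å" corresponds to `fraction_below` applied to the RMSD values of all control models in one identity class.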
What structural elements change between similar sequences?
Subtle changes in protein backbone path
Changes in amino acid side-chain rotamer orientation (backbone dependent)
Loops added or truncated
Model may be incomplete
Concept 4: SwissModel in practice.
SwissModel... first approach mode
Enter the ExPDB template ID...
Run in Normal Mode (except if defining a DeepView project)...
Successful submission. Results come by e-mail.
Optimal sequence alignment
[...]
# Matrix: EBLOSUM35
# Gap_penalty: 10.0
# Extend_penalty: 0.5
#
# Length: 122
# Identity: 36/122 (29.5%)
# Similarity: 55/122 (45.1%)
# Gaps: 28/122 (23.0%)
# Score: [...]
#=======================================
23 LNNKKTIAEGRRIPISKAVENPTATEIQDVCSAVGLNVFLEKNKMYSREW 72
   |:.||:.|||||||...||.|....|:.:....:||. |..:.|.|.:.|
11 LDSKKSRAEGRRIPRRFAVPNVKLHELVEASKELGLK-FRAEEKKYPKSW
   NRDVQYRGRVRVQLKQEDGSLCLVQFPSRKSVMLYAAEMIPKLKTRTQKT 122
      .:..|||.|:.:           .::..:|:..|..|.:::
   ---WEEGGRVVVEKR-----------GTKTKLMIELARKIAEIR------
   GGADQSLQQGEGSKKGKGKKKK 144
      :|..:|    ||.|.||||
   ---EQKREQ----KKDKKKKKK 104
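The identity and gap figures in this report can be recomputed from the two aligned rows. In the sketch below the first block is copied from the output above. The gap positions in the second and third blocks are a reconstruction (an assumption), placed so that the match line and the reported totals fit. The function name is hypothetical, and the similarity count is left out because it needs the EBLOSUM35 matrix.

```python
def alignment_stats(row_a, row_b, gap="-"):
    """Return (length, identities, gaps) for two aligned rows of equal length."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    identities = sum(1 for a, b in zip(row_a, row_b) if a == b and a != gap)
    gaps = sum(1 for a, b in zip(row_a, row_b) if a == gap or b == gap)
    return len(row_a), identities, gaps

# Aligned rows; gap placement in blocks 2 and 3 is reconstructed, not given.
row_a = ("LNNKKTIAEGRRIPISKAVENPTATEIQDVCSAVGLNVFLEKNKMYSREW"
         "NRDVQYRGRVRVQLKQEDGSLCLVQFPSRKSVMLYAAEMIPKLKTRTQKT"
         "GGADQSLQQGEGSKKGKGKKKK")
row_b = ("LDSKKSRAEGRRIPRRFAVPNVKLHELVEASKELGLK-FRAEEKKYPKSW"
         "---WEEGGRVVVEKR-----------GTKTKLMIELARKIAEIR------"
         "---EQKREQ----KKDKKKKKK")

length, identities, gaps = alignment_stats(row_a, row_b)
print(f"Length: {length}")
print(f"Identity: {identities}/{length} ({100 * identities / length:.1f}%)")
print(f"Gaps: {gaps}/{length} ({100 * gaps / length:.1f}%)")
```

Note that the percentages are taken over the full alignment length including gap columns, which is why nearly 30% identity coexists with 23% gaps.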
Optimal structural superposition
1.4 Å in 32 residues
Questions? Feedback?