Comparative Modelling Threading Baldomero Oliva Miguel UniversitatPompeuFabra.

Slides:

Advertisements

Similar presentations

PROTEOMICS 3D Structure Prediction. Contents Protein 3D structure. –Basics –PDB –Prediction approaches Protein classification.

Advertisements

PDB-Protein Data Bank SCOP –Protein structure classification CATH –Protein structure classification genTHREADER–3D structure prediction Swiss-Model–3D.

Prediction to Protein Structure Fall 2005 CSC 487/687 Computing for Bioinformatics.

Protein Tertiary Structure Prediction

Structural bioinformatics

Structure Prediction. Tertiary protein structure: protein folding Three main approaches: [1] experimental determination (X-ray crystallography, NMR) [2]

CENTER FOR BIOLOGICAL SEQUENCE ANALYSISTECHNICAL UNIVERSITY OF DENMARK DTU Homology Modeling Anne Mølgaard, CBS, BioCentrum, DTU.

Tertiary protein structure viewing and prediction July 1, 2009 Learning objectives- Learn how to manipulate protein structures with Deep View software.

Protein Structure, Databases and Structural Alignment

Protein structure (Part 2 of 2).

Structure Prediction. Tertiary protein structure: protein folding Three main approaches: [1] experimental determination (X-ray crystallography, NMR) [2]

Tertiary protein structure viewing and prediction July 5, 2006 Learning objectives- Learn how to manipulate protein structures with Deep View software.

CENTER FOR BIOLOGICAL SEQUENCE ANALYSISTECHNICAL UNIVERSITY OF DENMARK DTU Protein Homology Modelling Thomas Blicher Center for Biological Sequence Analysis.

Thomas Blicher Center for Biological Sequence Analysis

The Protein Data Bank (PDB)

. Protein Structure Prediction [Based on Structural Bioinformatics, section VII]

Tertiary protein structure modelling May 31, 2005 Graded papers will handed back Thursday Quiz#4 today Learning objectives- Continue to learn how to manipulate.

ProteinStructuralDatabases. Proteins are built from amino-acids. Introduction H | NH2-c-CO2H | R.

1 Protein Structure Prediction Reporter: Chia-Chang Wang Date: April 1, 2005.

Protein Tertiary Structure. Primary: amino acid linear sequence. Secondary:  -helices, β-sheets and loops. Tertiary: the 3D shape of the fully folded.

Protein structure prediction May 30, 2002 Quiz#4 on June 4 Learning objectives-Understand difference between primary secondary and tertiary structure.

Protein Structure and Function Prediction. Predicting 3D Structure –Comparative modeling (homology) –Fold recognition (threading) Outstanding difficult.

Protein Tertiary Structure Prediction Structural Bioinformatics.

CENTER FOR BIOLOGICAL SEQUENCE ANALYSISTECHNICAL UNIVERSITY OF DENMARK DTU Homology Modelling Thomas Blicher Center for Biological Sequence Analysis.

Protein Tertiary Structure Prediction Structural Bioinformatics.

Protein Structures.

Protein Structure Prediction and Analysis

Protein Structural Prediction. Protein Structure is Hierarchical.

Computational Structure Prediction Kevin Drew BCH364C/391L Systems Biology/Bioinformatics 2/12/15.

Protein modelling ● Protein structure is the key to understanding protein function ● Protein structure ● Topics in modelling and computational methods.

Homology Modeling David Shiuan Department of Life Science and Institute of Biotechnology National Dong Hwa University.

Protein Tertiary Structure Prediction

Construyendo modelos 3D de proteinas ‘fold recognition / threading’

Structural alignment Protein structure Every protein is defined by a unique sequence (primary structure) that folds into a unique.

Tertiary Structure Prediction Methods Any given protein sequence Structure selection Compare sequence with proteins have solved structure Homology Modeling.

Macromolecular structure

COMPARATIVE or HOMOLOGY MODELING

Representations of Molecular Structure: Bonds Only.

Bioinformatics 2 -- Lecture 8 More TOPS diagrams Comparative modeling tutorial and strategies.

Protein Folding Programs By Asım OKUR CSE 549 November 14, 2002.

Modelling Genome Structure and Function Ram Samudrala University of Washington.

Protein Classification II CISC889: Bioinformatics Gang Situ 04/11/2002 Parts of this lecture borrowed from lecture given by Dr. Altman.

Protein Structure & Modeling Biology 224 Instructor: Tom Peavy Nov 18 & 23, 2009

Multiple Mapping Method with Multiple Templates (M4T): optimizing sequence-to-structure alignments and combining unique information from multiple templates.

Protein secondary structure Prediction Why 2 nd Structure prediction? The problem Seq: RPLQGLVLDTQLYGFPGAFDDWERFMRE Pred:CCCCCHHHHHCCCCEEEECCHHHHHHCC.

DALI Method Distance mAtrix aLIgnment

Applied Bioinformatics Week 12. Bioinformatics & Functional Proteomics How to classify proteins into functional classes? How to compare one proteome with.

Module 3 Protein Structure Database/Structure Analysis Learning objectives Understand how information is stored in PDB Learn how to read a PDB flat file.

Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:

Structure prediction: Homology modeling

Protein Modeling Protein Structure Prediction. 3D Protein Structure ALA CαCα LEU CαCαCαCαCαCαCαCα PRO VALVAL ARG …… ??? backbone sidechain.

Predicting Protein Structure: Comparative Modeling (homology modeling)

Protein Structure Prediction: Homology Modeling & Threading/Fold Recognition D. Mohanty NII, New Delhi.

Introduction to Protein Structure Prediction BMI/CS 576 Colin Dewey Fall 2008.

Protein Folding & Biospectroscopy Lecture 6 F14PFB David Robinson.

Protein Structure Prediction Graham Wood Charlotte Deane.

Homology Modeling 原理、流程，還有如何用該工具去預測三級結構 Lu Chih-Hao 1 1.

CARBOXYPEPTIDASE - Metalloprotease with Zn - Released as pro-protease

Structural classification of Proteins SCOP Classification: consists of a database Family Evolutionarily related with a significant sequence identity Superfamily.

Principles of Protein Structure. AMINOACIDS Estereoisomer L Side-chain (-CH 3 ) }carboxyl-COOH amino amino -NH 2.

Protein Tertiary Structure Prediction Structural Bioinformatics.

Lab Lab 10.2: Homology Modeling Lab Boris Steipe Departments of Biochemistry and.

PROTEIN MODELLING Presented by Sadhana S.

Protein Structure Visualisation

Computational Structure Prediction

Protein Structure Prediction and Protein Homology modeling

Protein Structures.

Protein structure prediction.

Protein Homology Modelling

Presentation transcript:

Comparative Modelling Threading Baldomero Oliva Miguel UniversitatPompeuFabra

EVOLUTIONEVOLUTION

Diferences(proteins) Genetic information (DNA)

DNA secuence 3D

The Sanger Centre U.S. Genome Project ( 1990 ) Identify human genes Store the information in Data Banks Development of tools for analisis Venter, Patrinos & Collins DOEDOE NHGRINHGRINIHNIH

Increase of sequences GenBank (DNA) TrEMBL 92211Swiss Prot

Public Database Holdings:

3D known (20%) Homologs (30%) Remotes (10%) Unknown function (20%) NO homology (20%) sec.3D

Knowing the structure helps to: Knowing the structure helps to: TO KNOW TO KNOW the function: annotation IMPROVE IMPROVE the function: genetic engineering INHIBIT INHIBIT the function: drugs

Structure Factory express & purify cristallize X-ray analises structure

Objetive Enough information Not accurate NO information 3D

How does modelling help? Structural Genomics X-ray

Can we predict protein structures from genome sequences?

Study of known structures Assigning structure to sequences with unknown structure

secondary Tertiary Met Val Leu Tyr Asp Ser Thr sequence quaternary super-secondary domain Caracterize the conformation Ligand docking

TrioseFosfateIsomerase Carboxypeptidase Glucanase Ribonuclease Mioglobin

The number of different protein folds is limited: New Folds Known Folds

The number of different protein folds is limited: [ last update: Oct 2001 ]

Nº sequences > Nº folds Structura 3ª Classification

UCL, Janet Thornton & Christine Orengo Class (C), Architecture(A), Topology(T), Homologous superfamily (H) Protein Structure Classification CATH - Protein Structure Classification [ ] SCOP - Structural Classification of Proteins MRC Cambridge (UK), Alexey Murzin, Brenner S. E., Hubbard T., Chothia C. created by manual inspection comprehensive description of the structural and evolutionary relationships [ ]

Class(C) derived from secondary structure content is assigned automatically Architecture(A) describes the gross orientation of secondary structures, independent of connectivity. Topology(T) clusters structures according to their topological connections and numbers of secondary structures Homologous superfamily (H)

TSS toxin 8.8% Id Remote Cholera toxin 80% Id Homolog SCOP Family Superfamily Fold Class Enterotoxin Aminoacyl tRNA synthetase 4.4% Id Analog

MNIFEMLRID EGLRLKIYKD TEGYYTIGIG HLLTKSPSLN AAKSELDKAI GRNCNGVITK DEAEKLFNQD VDAAVRGILR NAKLKPVYDS LDAVRRCALI NMVFQMGETG VAGFTNSLRM LQQKRWDEAA VNLAKSRWYN QTPNRAKRVI TTFRTGTWDA YKNL Can we predict protein structures ?

Homology modeling Idea: Extrapolation of the structure for a new (target) sequence from the known 3D-structures of related family members (templates).

Alignments Nº sequences > Nº folds Step 1 Selection of templates Step 2 Alignment with templates

identity Number of residues aligned Percentage sequence identity/similarity (B.Rost, Columbia, NewYork) Sequence identity implies structural similarity Don’t know region..... Sequence similarity implies structural similarity?

Hidropaticity Plot 2D Structure 2D prediction Nº sequences > Nº folds

2D prediction (PHD)

Alignments Nº sequences > Nº folds Structure 3D Classification - Homologs - Remotes - Analogs Hidropaticity Plot 2D Structure 2D prediction

Comparative Modelling (homology) Fold is more conserved than sequence. Secondary structure is the most conserved structure Loops have the higher variability in structure.

Comparative modelling - Composer - Modeler SCR & VR SCR VR - Ring Closure - DB Search - Classification Alignments Nº sequences > Nº folds Structure 3D Classification - Homologs - Remotes - Analogs Hidropaticity Plot 2D Structure 2D prediction

MODEL Comparative modelling - Composer - Modeler SCR & VR SCR VR - Ring Closure - DB Search - Classification Alignments Nº sequences > Nº folds Structure 3D Classification - Homologs - Remotes - Analogs Hidropaticity Plot 2D Structure 2D prediction Step 3 Model Building

Fragment Extraction Rigid-body Assembly Loop Construction Spatial Constrains Optimisation Methods Satisfaction of Spatial Constrains

averaging core template backbone atoms (weighted by local sequence similarity with the target sequence) Leave non-conserved regions (loops) for later …. a) Build conserved core framework Rigid Body assembly

use the “spare part” algorithm to find compatible fragments in a Loop-Database “ab-initio” rebuilding of loops (Monte Carlo, molecular dynamics, genetic algorithms, etc.) b) Loop modeling Rigid Body assembly

EF-Hand Calcium binding aa{baalal}bbXh{DXDpDG}Xh aa{baalal}bb Xh{DXDpDG}Xh D 100% D/N 100%

P-loop GTP binding Conformation bb{eppgag}aa Motif sequence hh{GhXXpG}Kp P-loop GTP binding Conformation bb{eppgag}aa Motif sequence hh{GhXXpG}Kp

NAD(P)/FADbinding bb{eab}aahh{GhG}hX bb{eab}aa hh{GhG}hX

Ring closure search Distance restraints

Building CA Superposition

Building

A. Sali & T. Blundel (1993) JMB 234, Modeling by Satisfaction of Spatial restraints M A T E A F T S G Q

Feature properties can be associated with a protein (e.g. X-ray resolution) residues (e.g. solvent accessibility) pairs of residues (e.g. C a - C a distance) other features (e.g. main chain classes) How can we derive modeling restraints from this data? A restraint is defined as probability density function (pdf) p(x): with Modeling by Satisfaction of Spatial restraints

combine basis pdfs to molecular probability density functions Modeling by Satisfaction of Spatial restraints

Satisfaction of spatial restraints Find the protein model with the highest probability Variable target function: Start with e.g. a random conformation model and use only local restraints minimize some steps using a conjugate gradient optimization repeat with introducing more and more long range restraints until all restraints are used Modeling by Satisfaction of Spatial restraints

Threading Idea: Find the optimal structure for a new (target) sequence in the set of known 3D-structures (templates) by threading the target sequence.

Nº sequences > Nº folds Classification - Homologs - Remotes - Analogs Step 1 Selection of templates FOLD ASSIGNMENT Structure 3D

Pseudo Potencial Nº sequences > Nº folds Structure 3D Classification - Homologs - Remotes - Analogs

Pseudo-potencial Principles Potential of Mean Force: Applied on proteins Boltzman law Use frequencies instead of probabilities. They are taken from non-homologous proteins on the PDB.

Fold recognition / Threading Principle: Find a compatible fold for a given sequence.... >Protein XY MSTLYEKLGGTTAVDLAV DKFYERVLQDDRIKHFFA DVDMAKQRAHQKAFLTYA FGGTDKYDGRYMREAHKE LVENHGLNGEHFDAVAED LLATLKEMGVPEDLIAEV AAVAGAPAHKRDVLNQ  ? Using... 1D – 3D profile matching, secondary structure predictions, position specific scoring matrices (PSSM), mean force potentials, keyword statistics,....

Predicción de “fold” Pseudo Potencial Nº sequences > Nº folds Structure 3D Classification - Homologs - Remotes - Analogs “fold” prediction

Predicción del “fold” Combine sequence with all folds Evaluate energies Data Base PDB Data Base PDB Energy < threshold ? Sequence Are energy profiles native-like ? YES NO

Fold recognition from Secondary structure prediction TOPITS

MODEL Predicción de “fold” Pseudo Potencial Nº sequences > Nº folds Structure 3D Classification - Homologs - Remotes - Analogs “fold” prediction

Hidropaticity Plot 2D prediction MODEL Predicción de “fold” Pseudo Potencial Nº sequences > Nº folds Structure 3D Classification - Homologs - Remotes - Analogs “fold” prediction 2D Structure Comparative modelling - Composer - Modeler SCR & VR SCR VR - Ring Closure - DB Search - Classification Alignments

Hidropaticity Plot 2D prediction Evaluate de model MODEL Predicción de “fold” Pseudo Potencial Nº sequences > Nº folds Structure 3D Classification - Homologs - Remotes - Analogs “fold” prediction 2D Structure Comparative modelling - Composer - Modeler SCR & VR SCR VR - Ring Closure - DB Search - Classification Alignments

Why Model Evaluation ? Just because it’s pretty doesn’t make it right ! Topics: correct fold model coverage (%) C  - deviation (rmsd) alignment accuracy (%) side chain placement

1.Errors in side-chain packing. 2.Shifts of correctly aligned residues. 3.Regions without template. 4.Errors due to misalignments. 5.Errors produced by incorrect templates. Types of Errors

EVA Evaluation of Automatic protein structure prediction [ Burkhard Rost, Andrej Sali, ] CASP Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction 3D - Crunch Very Large Scale Protein Modelling Project Model Accuracy Evaluation

Hidropaticity Plot 2D prediction Evaluate de model MODEL Predicción de “fold” Pseudo Potencial Nº sequences > Nº folds Structure 3D Classification - Homologs - Remotes - Analogs “fold” prediction 2D Structure Comparative modelling - Composer - Modeler SCR & VR SCR VR - Ring Closure - DB Search - Classification Alignments

Hidropaticity Plot 2D prediction Evaluate de model MODEL Predicción de “fold” Pseudo Potencial Nº sequences > Nº folds Structure 3D Classification - Homologs - Remotes - Analogs “fold” prediction 2D Structure Comparative modelling - Composer - Modeler SCR & VR SCR VR - Ring Closure - DB Search - Classification Alignments Side-chains - Multi-Copy

c) Side Chain placement Find the most probable side chain conformation, using homologues structures back-bone dependent rotamer libraries energetic and packing criteria Rigid Body assembly

Hidropaticity Plot 2D prediction Evaluate de model MODEL Predicción de “fold” Pseudo Potencial Nº sequences > Nº folds Structure 3D Classification - Homologs - Remotes - Analogs “fold” prediction 2D Structure Comparative modelling - Composer - Modeler SCR & VR SCR VR - Ring Closure - DB Search - Classification Alignments Side-chains - Multi-Copy

Hidropaticity Plot 2D prediction Evaluate de model MODEL Predicción de “fold” Pseudo Potencial Nº sequences > Nº folds Structure 3D Classification - Homologs - Remotes - Analogs “fold” prediction 2D Structure Comparative modelling - Composer - Modeler SCR & VR SCR VR - Ring Closure - DB Search - Classification Alignments Side-chains - Multi-Copy Optimization - Minimization E - Simulated Annealing

modeling will produce unfavorable contacts and bonds  idealization of local bond and angle geometry extensive energy minimization will move coordinates away  keep it to a minimum SwissModel is using GROMOS 96 force field for a steepest descent d) Energy minimization Rigid Body assembly

OPTIMIZATION 1- Minimization by Newton-Rapson X i+1 = X i + grad(E) 2- Simulated Annealing

Optimization schedule and progress [ A. Sali & T. Blundel (1993) JMB 234, 779 – 815 ] Modeling by Satisfaction of Spatial restraints

Hidropaticity Plot 2D prediction Evaluate de model MODEL Predicción de “fold” Pseudo Potencial Nº sequences > Nº folds Structure 3D Classification - Homologs - Remotes - Analogs “fold” prediction 2D Structure Comparative modelling - Composer - Modeler SCR & VR SCR VR - Ring Closure - DB Search - Classification Alignments Side-chains - Multi-Copy Optimization - Minimization E - Simulated Annealing

Hidropaticity Plot 2D prediction Evaluate de model MODEL Predicción de “fold” Pseudo Potencial Nº sequences > Nº folds Structure 3D Classification - Homologs - Remotes - Analogs “fold” prediction 2D Structure Comparative modelling - Composer - Modeler SCR & VR SCR VR - Ring Closure - DB Search - Classification Alignments Side-chains - Multi-Copy Optimization - Minimization E - Simulated Annealing Molecular Dynamics

MOLECULAR DYNAMICS Finite differences algorithm Coupling to a thermal bath Statistical Mechanics Ergodic Hipothesis

EXAMPLE

clevage clevage + Activation segment activation Enzime PRO-CARBOXIPEPTIDASA

PRO-CARBOXYPEPTIDASES Porcine Pro Carboxypeptidase B PCPBpPorcine Pro Carboxypeptidase A1 PCPA1pBovine PCPA1b

PCPA2h -ETFVGDQVLEIVPSNEEQIKNLLQLEAQEHLQLDFWKSPTTPGETAHVRVPFVNVQAVKVFLESQGIAYSIMIEDVQVL PCPA2h PCPA1b KEDFVGHQVLRITAADEAEVQTVKELEDLEHLQLDFWRGPGQPGSPIDVRVPFPSLQAVKVFLEAHGIRYRIMIEDVQSL PCPA1b PCPA1p KEDFVGHQVLRISVDDEAQVQKVKELEDLEHLQLDFWRGPARPGFPIDVRVPFPSIQAVKVFLEAHGIRYTIMIEDVQLL PCPA1p PCPA2h LDKENEEMLFNRRRERSGN-FNFGAYHTLEEISQEMDNLVAEHPGLVSKVNIGSSFENRPMNVLKFSTGG-DKPAIWLDA PCPA2h PCPA1b LDEEQEQMFASQSRARSTNTFNYATYHTLDEIYDFMDLLVAEHPQLVSKLQIGRSYEGRPIYVLKFSTGGSNRPAIWIDL PCPA1b PCPA1p LDEEQEQMFASQGRARTTSTFNYATYHTLEEIYDFMDILVAEHPALVSKLQIGRSYEGRPIYVLKFSTGGSNRPAIWIDS PCPA1p PCPA2h GIHAREWVTQATALWTANKIVSDYGKDPSITSILDALDIFLLPVTNPDGYVFSQTKNRMWRKTRSKVSGSLCVGVDPNRN PCPA2h PCPA1b GIHSREWITQATGVWFAKKFTEDYGQDPSFTAILDSMDIFLEIVTNPDGFAFTHSQNRLWRKTRSVTSSSLCVGVDANRN PCPA1b PCPA1p GIXSRXWITQASGVWFAKKITENYGQNSSFTAILDSMDIFLEIVTNPNGFAFTHSDNRLWRKTRSKASGSLCVGSDSNRN PCPA1p PCPA2h WDAGFGGPGASSNPCSDSYHGPSANSEVEVKSIVDFIKSHGKVKAFIILHSYSQLLMFPYGYKCTKLDDFDELSEVAQKA PCPA2h PCPA1b WDAGFGKAGASSSPCSETYHGKYANSEVEVKSIVDFVKDHGNFKAFLSIHSYSQLLLYPYGYTTQSIPDKTELNQVAKSA PCPA1b PCPA1p WDAGFGGAGASSSPCAETYHGKYPNSEVEVKSITDFVKNNGNIKAFISIXSYSQLLLYPYGYKTQSPADKSELNQIAKSA PCPA1p PCPA2h AQSLRSLHGTKYKVGPICSVIYQASGGSIDWSYDYGIKYSFAFELRDTGRYGFLLPARQILPTAEETWLGLKAIMEHVRD PCPA2h PCPA1b VEALKSLYGTSYKYGSIITTIYQASGGSIDWSYNQGIKYSFTFELRDTGRYGFLLPASQIIPTAQETWLGVLTIMEHTLN PCPA1b PCPA1p VAALKSLYGTSYKYGSIITVIYQASGGVIDWTYNQGIKYSFSFELRDTGRRGFLLPASQIIPTAQETWLALLTIMEHTLN PCPA1p SEQUENCE ALIGNMENT OF PRO-CARBOXYPEPTIDASES

ComparativeModelling Model building of conserved regions Baldomero Oliva Miguel UniversitatPompeuFabra

UniversitatPompeuFabra ComparativeModelling Model building of conserved regions

Add and model build loop regions Baldomero Oliva Miguel UniversitatPompeuFabra ComparativeModelling Model building of conserved regions

Baldomero Oliva Miguel UniversitatPompeuFabra Add and model build loop regions ComparativeModelling Model building of conserved regions

Secondary Structure Prediction and Multiple Alignment of Pro-Carboxypeptidases EEEEE HHHHHHHHHHHHHHH EEEE EEEEE HHHHHHHHHH EEEEHHHHHHHH PCPA2h PHD PCPA2h -ETFVGDQVLEIVPSNEEQIKNLLQLEAQEHLQLDFWKSPTTPGETAHVRVPFVNVQAVKVFLESQGIAYSIMIEDVQVL PCPA2h PCPA1b KEDFVGHQVLRITAADEAEVQTVKELEDLEHLQLDFWRGPGQPGSPIDVRVPFPSLQAVKVFLEAHGIRYRIMIEDVQSL PCPA1b PCPA1p KEDFVGHQVLRISVDDEAQVQKVKELEDLEHLQLDFWRGPARPGFPIDVRVPFPSIQAVKVFLEAHGIRYTIMIEDVQLL PCPA1p HHHHHHHHHHHHHH PCPA2h PHD PCPA2h LDKENEEMLFNRRRERSGN-FNFGAYHTLEEISQEMDNLVAEHPGLVSKVNIGSSFENRPMNVLKFSTGG-DKPAIWLDA PCPA2h PCPA1b LDEEQEQMFASQSRARSTNTFNYATYHTLDEIYDFMDLLVAEHPQLVSKLQIGRSYEGRPIYVLKFSTGGSNRPAIWIDL PCPA1b PCPA1p LDEEQEQMFASQGRARTTSTFNYATYHTLEEIYDFMDILVAEHPALVSKLQIGRSYEGRPIYVLKFSTGGSNRPAIWIDS PCPA1p PCPA2h GIHAREWVTQATALWTANKIVSDYGKDPSITSILDALDIFLLPVTNPDGYVFSQTKNRMWRKTRSKVSGSLCVGVDPNRN PCPA2h PCPA1b GIHSREWITQATGVWFAKKFTEDYGQDPSFTAILDSMDIFLEIVTNPDGFAFTHSQNRLWRKTRSVTSSSLCVGVDANRN PCPA1b PCPA1p GIXSRXWITQASGVWFAKKITENYGQNSSFTAILDSMDIFLEIVTNPNGFAFTHSDNRLWRKTRSKASGSLCVGSDSNRN PCPA1p PCPA2h WDAGFGGPGASSNPCSDSYHGPSANSEVEVKSIVDFIKSHGKVKAFIILHSYSQLLMFPYGYKCTKLDDFDELSEVAQKA PCPA2h PCPA1b WDAGFGKAGASSSPCSETYHGKYANSEVEVKSIVDFVKDHGNFKAFLSIHSYSQLLLYPYGYTTQSIPDKTELNQVAKSA PCPA1b PCPA1p WDAGFGGAGASSSPCAETYHGKYPNSEVEVKSITDFVKNNGNIKAFISIXSYSQLLLYPYGYKTQSPADKSELNQIAKSA PCPA1p PCPA2h AQSLRSLHGTKYKVGPICSVIYQASGGSIDWSYDYGIKYSFAFELRDTGRYGFLLPARQILPTAEETWLGLKAIMEHVRD PCPA2h PCPA1b VEALKSLYGTSYKYGSIITTIYQASGGSIDWSYNQGIKYSFTFELRDTGRYGFLLPASQIIPTAQETWLGVLTIMEHTLN PCPA1b PCPA1p VAALKSLYGTSYKYGSIITVIYQASGGVIDWTYNQGIKYSFSFELRDTGRRGFLLPASQIIPTAQETWLALLTIMEHTLN PCPA1p

Pseudo-energy(threading) Wrong model Evaluation

G (C’) N (C’’) S (Ccap ) R (C1) E (C2) R (C3) R (C5) R (C4) Schellman Motif Profile C´´ » C3 / C´ Gly C3 C2 C1 Ccap C’ C’’ L Q E H G I L Q E H G I A R A K G V A R A K G V M K K L G K M K K L G K pont d’ hidrògen Interacció hidrofòbica N R R R E R S G N F N ponts d’hidrògen C3 Ccap C’’ ^ ^ ^  -Helix C-cap Schellman Motif Schellman Motif

Refinement of the Model

pseudo-energy residue IMPROVED

Protein Structure Resources PDBhttp:// PDB – Protein Data Bank of experimentally solved structures (RCSB) CATHhttp:// Hierarchical classification of protein domain structures SCOPhttp://scop.mrc-lmb.cam.ac.uk/scop/ Alexey Murzin’s Structural Classification of proteins DALIhttp://www2.ebi.ac.uk/dali/ Lisa Holm and Chris Sander’s protein structure comparison server SS-Prediction and Fold Recognition PHDhttp://cubic.bioc.columbia.edu/predictprotein/ Burkhard Rost’s Secondary Structure and Solvent Accessibility Prediction Server 3DPSSMhttp:// Fold Recognition Server using 1D and 3D Sequence Profiles coupled with Secondary Structure and Solvation Potential Information.

Protein Homology Modeling Resources SWISS MODEL: Deep View - SPDBV: homepage: Tutorials WhatIfhttp:// Gert Vriend’s protein structure modeling analysis program WhatIf Modeller: Andrej Sali's homology protein structure modelling by satisfaction of spatial restraints FAMS: Full Automatic Modelling System (FAMS); Kitasato University; Tokyo, Japan 3D-JIGSAW: Comparative Modelling Server; Imperial Cancer Research Fund; London, UK CPHmodels: Centre for Biological Sequence Analysis; The Technical University of Denmark; Denmark SDSC1: SDSC Structure Homology Modelling Server; San Diego Supercomputing Centre

Protein Homology Modeling Resources SWISS MODEL: Example Dear baldo, Title of your Request: subtilisin Swiss-Model '(SWISS-MODEL Version )' started on 27-Oct-2003 Process identification is AAAa08uiF The modelling procedure is now in progress, and its results should be sent to you shortly. In case of difficulties, please contact: Torsten Schwede Nicolas Guex

Protein Homology Modeling Resources SWISS MODEL: Example Length of target sequence: 379 residues Searching sequences of known 3D structures Found 1c3lA.pdb with P(N)=0 Found 1be8_.pdb with P(N)=0 Found 1cseE.pdb with P(N)=0 Found 1scd_.pdb with P(N)=0 Found 1sbc_.pdb with P(N)=0 Found 1be6_.pdb with P(N)=0 …….

Protein Homology Modeling Resources 3D JigSaw Example This is an automatic split of your sequence in protein domains Follow the link below and select the modelling templates for each modelable domain 2 domain/s found in your query sequence (2 have homologous templates). See them at: (this file will be erased in a week time) Your submission: >subtilisin MMRKKSFWFGMLTAFMLVFTMEFSDSASAAQPGKNVEKDYFVGFKSGVKTASVKKDIIKE SGGKVDKQFRIINAAKATLDKEALKEVKNDPDVAYVEEDHVAHALGQTVPYGIPLIKADK VQAQGFKGANVKVAVLDTGIQASHPDLNVVGGASFVAGEAYNTDGNGHGTHVAGTVAALD NTTGVLGVAPSVSLFAVKVLNSSGSGSYSGIVSGIEWATTNGMDVINMSLGGPSGSTAMK QAVDNAYSKGVVPVAAAGNSGSSGYTNTIGYPAKYDSVIAVGAVDSNSNRASFSSVGAEL EVMAPGAGVYSTYPTNTYATLNGTSMASPHVAGAAALILSKHPNLSASQVRNRLSSTATY LGSSFYYGKGLINVEAAAQ Send comments See you soon