Presentation is loading. Please wait.

Presentation is loading. Please wait.

11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction1 11/11/05 Protein Structure Prediction & Modeling.

Similar presentations


Presentation on theme: "11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction1 11/11/05 Protein Structure Prediction & Modeling."— Presentation transcript:

1 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction1 11/11/05 Protein Structure Prediction & Modeling

2 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction2 Bioinformatics Seminars Nov 11 Fri 12:10 BCB Seminar in E164 Lago Building Supertrees Using Distances Steve Willson, Dept of Mathematics http://www.bcb.iastate.edu/courses/BCB691-F2005.html Next week - Baker Center/BCB Seminars:Baker Center/BCB Seminars: (seminar abstracts available at above link) Nov 14 Mon 1:10 PM Doug Brutlag, Stanford Discovering transcription factor binding sites Nov 15 Tues 1:10 PM Ilya Vakser, Univ Kansas Modeling protein-protein interactions both seminars will be in Howe Hall Auditorium

3 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction3 Protein Structure & Function: Analysis & Prediction Mon Protein structure: basics; classification,databases, visualization Wed Protein structure databases - cont. Thurs Lab Protein structure databases Visualization software Secondary structure prediction Fri Protein structure prediction Protein-nucleic acid interactions

4 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction4 Reading Assignment (for Mon-Fri) Mount Bioinformatics Chp 10 Protein classification & structure prediction http://www.bioinformaticsonline.org/ch/ch10/index.html pp. 409-491 Ck Errata: http://www.bioinformaticsonline.org/help/errata2.html

5 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction5 BCB 544 Additional Reading Required: Gene Prediction: Burge & Karlin 1997 JMB 268:78 Prediction of complete gene structures in human genomic DNA Optional: Structure Prediction: Schueler-Furman…Baker 2005 Science 310:638 Progress in modeling of protein structures and interactions

6 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction6 Review last lecture: Protein Structure: Databases, Classification & Visualization

7 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction7 Protein sequence databases UniProt (SwissProt, PIR, EBI) http://www.pir.uniprot.org NCBI Protein http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Protein http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Protein More on these later: protein function prediction

8 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction8 Protein sequence & structure: analysis Diamond STING Millennium - many useful structure analysis tools, including Protein Dossier http://trantor.bioc.columbia.edu/SMS/ http://trantor.bioc.columbia.edu/SMS/ SwissProt (UniProt) protein knowledgebase http://us.expasy.org/sprot InterPRO sequence analysis tools http://www.ebi.ac.uk/interpro

9 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction9 Protein structure databases PDB Protein Data Bank http://www.rcsb.org/pdb/ http://www.rcsb.org/pdb/ (RCSB) - THE protein structure database MMDB Molecular Modeling Database http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Structure (NCBI Entrez) - has "added" value MSD Molecular Structure Database http://www.ebi.ac.uk/msd http://www.ebi.ac.uk/msd Especially good for interactions, binding sites

10 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction10 Protein structure classification SCOP = Structural Classification of Proteins Levels reflect both evolutionary and structural relationships http://scop.mrc-lmb.cam.ac.uk/scop CATH = Classification by Class, Architecture, Topology & Homology http://cathwww.biochem.ucl.ac.uk/latest/ http://cathwww.biochem.ucl.ac.uk/latest/ DALI/FSSP (recently moved to EBI & reorganized) fully automated structure alignments DALI serverhttp://www.ebi.ac.uk/dali/index.htmlhttp://www.ebi.ac.uk/dali/index.html DALI Database (fold classification) http://ekhidna.biocenter.helsinki.fi/dali/start http://ekhidna.biocenter.helsinki.fi/dali/start

11 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction11 Protein structure visualization Molecular Visualization Freeware: http://www.umass.edu/microbio/rasmol MolviZ.Org http://www.umass.edu/microbio/chime Protein Explorer http://www.umass.edu/microbio/chime/pe/protexpl/frntdoor.htm RASMOL (& many decendents: Protein Explorer,PyMol, MolMol, etc.) http://www.umass.edu/microbio/rasmol/index2.htm CHIME http://www.umass.edu/microbio/chime/getchime.htm Cn3D http://www.biosino.org/mirror/www.ncbi.nlm.nih.gov/Structure/cn3d/ http://www.biosino.org/mirror/www.ncbi.nlm.nih.gov/Structure/cn3d/ Deep View = Swiss-PDB Viewer http://www.expasy.org/spdbv

12 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction12 Protein structure visualization Superb interactive structure visualization software by Jane & Dave Richardson, Duke University KINIMAGE http://kinemage.biochem.duke.edu/ Fantastic research tools for structure analysis & refinement

13 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction13 RCSB PDB - Beta site http://pdbbeta.rcsb.org/pdb/Welcome.do http://pdbbeta.rcsb.org/pdb/Welcome.do

14 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction14 MMDB http://www.ncbi.nlm.nih.gov/Structure/MMDB/mmdb.shtml http://www.ncbi.nlm.nih.gov/Structure/MMDB/mmdb.shtml

15 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction15 Cn3D http://www.ncbi.nlm.nih.gov/Structure/CN3D/cn3d.shtml http://www.ncbi.nlm.nih.gov/Structure/CN3D/cn3d.shtml

16 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction16 Cn3D : Displaying 2' Structures

17 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction17 Cn3D: Structural Alignments

18 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction18 SCOP - Structure Classification http://scop.mrc-lmb.cam.ac.uk/scop http://scop.mrc-lmb.cam.ac.uk/scop

19 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction19 6 main classes of protein structure 1)  Domains Bundles of helices connected by loops 2)  Domains Mainly antiparallel sheets, usually with 2 sheets forming sandwich 3)  Domains Mainly parallel sheets with intervening helices, also mixed sheets 4)  Domains Mainly segregated helices and sheets 5) Multidomain (  Containing domains from more than one class 6 ) Membrane & cell-surface proteins

20 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction20 CATH - Structure Classification http://cathwww.biochem.ucl.ac.uk/latest/ http://cathwww.biochem.ucl.ac.uk/latest/

21 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction21 Structural Genomics ~ 30,000 "traditional" genes in human genome (not counting: ???) ~ 3,000 proteins in a typical cell > 2 million sequences in UniProt > 33,000 protein structures in the PDB  Experimental determination of protein structure lags far behind sequence determination! Goal: Determine structures of "all" protein folds in nature, using combination of experimental structure determination methods (X-ray crystallography, NMR, mass spectrometry) & structure prediction

22 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction22 Structural Genomics Projects TargetDB: database of structural genomics targets http://targetdb.pdb.org Protein Structure Prediction?

23 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction23 Protein Folding " Major unsolved problem in molecular biology" In cells:spontaneous assisted by enzymes assisted by chaperones In vitro: many proteins fold spontaneously & many do not!

24 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction24 Steps in Protein Folding 1- "Collapse"- driving force is burial of hydrophobic aa’s (fast - msecs) 2- Molten globule - helices & sheets form, but "loose" (slow - secs) 3- "Final" native folded state - compaction, some 2' structures rearranged Native state? - assumed to be lowest free energy - may be an ensemble of structures

25 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction25 Protein Dynamics Protein in native state is NOT static Function of many proteins depends on conformational changes, sometimes large, sometimes small Globular proteins are inherently "unstable" (NOT evolved for maximum stability) Energy difference between native and denatured state is very small (5-15 kcal/mol) (this is equivalent to 1 or 2 H-bonds!) Folding involves changes in both entropy & enthalpy

26 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction26 Protein Structure Prediction Structure is largely determined by sequence BUT: Similar sequences can assume different structures Dissimilar sequences can assume similar structures Many proteins are multi-functional Protein folding: determination of folding pathways prediction of tertiary structure  still largely unsolved problems

27 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction27 New today: Protein Structure Prediction Secondary structure (text focuses on this - I won't) Tertiary structure (let's do this instead!)

28 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction28 Deciphering the Protein Folding Code Protein Structure Prediction or "Protein Folding" Problem given the amino acid sequence of a protein, predict its 3-dimensional structure (fold) "Inverse Folding" Problem given a protein fold, identify every amino acid sequence that can adopt its 3-dimensional structure

29 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction29 Protein Structure Determination? High-resolution structure determination X-ray crystallography (<1A  ) Nuclear magnetic resonance (NMR) (~1-2.5A  ) Lower-resolution structure determination Cryo-EM (electron-microscropy) ~10-15A  Theoretical Models? Highly variable - now, some equiv to X-ray!

30 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction30 Tertiary Structure Prediction Fold or tertiary structure prediction problem can be formulated as a search for minimum energy conformation search space is defined by psi/phi angles of backbone and side-chain rotamers search space is enormous even for small proteins! number of local minima increases exponentially of the number of residues Computationally it is an exceedingly difficult problem!

31 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction31 Ab Initio Prediction 1.Develop energy function bond energy bond angle energy dihedral angle energy van der Waals energy electrostatic energy 2.Calculate structure by minimizing energy function (usually Molecular Dynamics or Monte Carlo methods)  Ab initio prediction - not practical in general Computationally? very expensive Accuracy? Usually poor for all but short peptides (but see Baker review!) Provides both folding pathway & folded structure

32 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction32 Comparative Modeling Provide folded structure only Two primary methods 1) Homology modeling 2) Threading (fold recognition) Note: both rely on availability of experimentally determined structures that are "homologous" or at least structurally very similar to target

33 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction33 Homology Modeling 1.Identify homologous protein sequences PSI-BLAST multiple sequence alignment (MSA) 2.Among those with available structures, choose closest sequence match for template 3.Build model by placing residues into corresponding positions of homologous structure models & refine by "tweaking"  Homology modeling - works "well" Computationally? not very expensive Accuracy? higher sequence identity  better model Requires >30% sequence identity

34 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction34 Threading - Fold Recognition Identify “best” fit between target sequence & template structure 1.Develop energy function 2.Develop template library 3.Align target sequence with each template & score 4.Determine best score (1D to 3D alignment) 5.Build refine structure as in homology modeling  Threading - works "sometimes" Computationally? Can be expensive or cheap, depends on energy function & whether "all atom" or "backbone only" threading Accuracy? in theory, should not depend on sequence identity (should depend on quality of template library & "luck") But, usually higher sequence identity  better model

35 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction35 1.Align target sequence with template structures (fold library) from the Protein Data Bank (PDB) 2.Calculate energy (score) to evaluate goodness of fit between target sequence & template structure 3.Rank models based on energy scores Target Sequence Structure Templates ALKKGF…HFDTSE Threading - a "local" example

36 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction36 Threading Goals & Issues Structure database - must be complete: no decent model if no good template in library! Sequence-structure alignment algorithm: Bad alignment  Bad score! Energy function (scoring scheme): must distinguish correct sequence-fold alignment from incorrect sequence-fold alignments must distinguish “correct” fold from close decoys Prediction reliability assessment - how determine whether predicted structure is correct (or even close?) Find “correct” sequence-structure alignment of a target sequence with its native-like fold in PDB

37 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction37 Threading – Structure database Build a template database (e.g., ASTRAL domain library derived from PDB) Supplement with additional decoys, e.g., generated using ab initio approach such as Rosetta (Baker)

38 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction38 Threading - Energy function Two main methods (and combinations of these) Structural profile (environmental) physico-chemical properties of aa’s Contact potential (statistical) based on contact statistics from PDB (Miyazawa & Jernigan - Jernigan now at ISU)

39 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction39 Protein Threading – typical energy function MTYKLILNGKTKGETTTEAVDAATAEKVFQYANDNGVDGEWTYTE How well does a specific residue fit structural environment? What is "probability" that two specific residues are in contact? Alignment gap penalty? Total energy: E_p + E_s + E_g Find a sequence-structure alignment that minimizing the energy function

40 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction40 A Rapid Threading Approach for Protein Structure Prediction Kai-Ming Ho, Physics Haibo Cao Yungok Ihm Zhong Gao James Morris Cai-zhuang Wang Drena Dobbs, GDCB Jae-Hyung Lee Michael Terribilini Jeff Sander

41 11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction41 Performance Evaluation? "Blind Test" CASP5 Competition (Critical Assessment of Protein Structure Prediction) Given: Amino acid sequence Goal: Predict 3-D structure ( before experimental results published)

42 Typical Results: well, actually, BEST Results: HO = #1 ranked CASP prediction for this target Target 174 PDB ID = 1MG7 Actual Structure Predicted Structure T174_1 T174_2

43 FR Fold Recognition (targets manually assessed by Nick Grishin) ----------------------------------------------------------- Rank Z-Score Ngood Npred NgNW NpNW Group-name 1 24.26 9.00 12.00 9 12 Ginalski 2 21.64 7.00 12.00 7 12 Skolnick Kolinski 3 19.55 8.00 12.50 9 14 Baker 4 16.88 6.00 10.00 6 10 BIOINFO.PL 5 15.25 7.00 7.00 7 7 Shortle 6 14.56 6.50 11.50 7 13 BAKER-ROBETTA 7 13.49 4.00 11.00 4 11 Brooks 8 11.34 3.00 6.00 3 6 Ho-Kai-Ming 9 10.45 3.00 5.50 3 6 Jones-NewFold ----------------------------------------------------------- FR NgNW - number of good predictions without weighting for multiple models FR NpNW - number of total predictions without weighting for multiple models Overall Performance in CASP5 Contest (M. Levitt, Stanford)


Download ppt "11/11/05 D Dobbs ISU - BCB 444/544X: Protein Structure Prediction1 11/11/05 Protein Structure Prediction & Modeling."

Similar presentations


Ads by Google