An Introduction to Bioinformatics Protein Structure Prediction.

Aims
– Understand the use of algorithms
– Recognize different approaches
– Understand the limitations
Objectives
– Predict occurrence of aspects of structure
– Select appropriate tools

Introduction
Structure has several levels:
– 1° primary
– 2° secondary
– 3° tertiary
– 4° quaternary

1° primary
Amino acid sequence
NH2-MRLSWYDPDFQARLTRSNSKCQGQLEVYLKDGWHMVCSQSWGRSSKQWEDPSQASKVCQRLNCGVPLSLGPFLVTYTPQSSIICYGQLGSFSNCSHSRNDMCHSLGLTCLE-COOH

2° secondary
Localized organisation: α-helices and β-sheets

3° tertiary
Three-dimensional organisation

4° quaternary
Multi-protein assembly

The problem...
– The best way to determine a structure is experimentally, by X-ray crystallography, NMR, etc.
– Structure databases only hold about 10,000+ structures
– Therefore, devise programs to deduce structural solutions
– Complex!

Secondary structure prediction
– Signal peptides
– Intracellular targeting
– Trans-membrane α-helices
– α-helices and β-sheets
– Super-secondary structure (motifs)

Signal peptides
– Short N-terminal amino acid sequences
– Direct the protein to a membrane
– Cleaved after translocation
– Nobel Prize 1999: Günter Blobel, for the discovery of protein signal sequences
– Prediction tool: SignalP

SignalP predicts signal peptide cleavage sites
– Only first […]
– Using neural networks

Is the sequence a signal peptide?
# Measure   Position   Value   Cutoff   Conclusion
  max. C                                YES
  max. Y                                YES
  max. S                                YES
  mean S                                YES
# Most likely cleavage site between pos. 24 and 25: SRA-LE

Intracellular targeting
TargetP
– Predicts the subcellular location of eukaryotic proteins
– Presequences:
  – Chloroplasts
  – Mitochondria
  – Signal peptide

Transmembrane Domains
Lots of programs, e.g. TMHMM
– α-helices
– hydrophobic
– helix topology
– R or K: positive charge on the cytoplasmic side
– Hidden Markov modelling
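The hydrophobicity signal that transmembrane predictors exploit can be illustrated with a much simpler method than a hidden Markov model: a Kyte-Doolittle sliding-window hydropathy scan. The sketch below is not TMHMM and makes no attempt at topology. The window length of 19 and the cutoff of 1.6 are commonly quoted settings for this kind of scan and are assumptions here, as are the function names and the toy sequence.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full-length window along the sequence."""
    scores = [KD[aa] for aa in seq.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def candidate_tm_windows(seq, window=19, cutoff=1.6):
    """0-based start indices of windows whose mean hydropathy exceeds the cutoff."""
    profile = hydropathy_profile(seq, window)
    return [i for i, score in enumerate(profile) if score > cutoff]


if __name__ == "__main__":
    # Hypothetical toy sequence: polar flank, hydrophobic core, polar flank.
    toy = "DDKKDDKKDD" + "LIVLLAVILLIVLAVLLIV" + "KKDDKKDDKK"
    print(candidate_tm_windows(toy))
```

A run of consecutive start indices marks one candidate helix. Deciding which side faces the cytoplasm would need the positive-charge rule from the slide, which this sketch leaves out.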

Paste the sequence in FASTA format, e.g. the serotonin receptor
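FASTA format is a header line starting with ">" followed by one or more lines of sequence. A minimal parser, as a sketch only: the function name is made up for illustration, and the record in the usage note is a placeholder rather than a real receptor sequence.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                      # ignore blank lines
        if line.startswith(">"):
            if header is not None:        # close the previous record
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)           # sequence may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

For example, `parse_fasta(">demo protein\nMKFS\nWRTA\n")` yields a single record whose sequence is the two lines joined together.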

Predicts the transmembrane domains and orientation

α-helices and β-sheets
GOR algorithm
– Assigns each residue to one conformational state: α-helix, extended chain, reverse turn or coil
– 64.4% accurate
– Many other sites; most use multiple alignments
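Accuracy figures for secondary structure prediction are usually quoted per residue: the percentage of positions where the predicted state matches the observed one. The slide does not say how its figure was measured, so treating it as a per-residue score is an assumption. The function name and the example strings are illustrative only.

```python
def per_residue_accuracy(predicted, observed):
    """Percentage of positions where predicted and observed states agree."""
    if len(predicted) != len(observed):
        raise ValueError("state strings must have the same length")
    if not predicted:
        raise ValueError("state strings must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(predicted)
```

With `predicted = "hhhhcccc"` and `observed = "hhhccccc"`, seven of the eight positions agree.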

α-helices and β-sheets
Example output (h = α-helix, e = extended strand, c = coil):
         |         |         |         |         |         |         |
MKFSWRTALLWSLPLLVVGFFFWQGSFGGADANLGSNTANTRMTYGRFLEYVDAGRITSVDLYENGRTAI
cccceeeeeecccceeeeeeeeccccccccccccccccccchhhhcceeeeccccceeeeeeccccceee
VQVSDPEVDRTLRSRVDLPTNAPELIARLRDSNIRLDSHPVRNNGMVWGFVGNLIFPVLLIASLFFLFRR
eeccccccchhhhccccccccchhhhhhhhhccccccccceecccceeeeecccccchhhhhhhhheeec
SSNMPGGPGQAMNFGKSKARFQMDAKTGVMFDDVAGIDEAKEELQEVVTFLKQPERFTAVGAKIPKGVLL
cccccccccchhhhcchhhhhhhhccceeeecchhhhhhhhhhhhhhhhhhcccchhhhhcccccceeee
VGPPGTGKTLLAKAIAGEAGVPFFSISGSEFVEMFVGVGASRVRDLFKKAKENAPCLIFIDEIDAVGRQR
ecccccchhhhhhhhhcccccceeecccccceeeeeecccchhhhhhhhhcccccceeeecchhhhcccc
GAGIGGGNDEREQTLNQLLTEMDGFEGNTGIIIIAATNRPDVLDSALMRPGRFDRQVMVDAPDYSGRKEI
ccccccccchhhhhhhhhhhhhcccccccceeeeeeccccchhhhhhccccccceeeeecccccccchhh
LEVHARNKKLAPEVSIDSIARRTPGFSGADLANLLNEAAILTARRRKSAITLLEIDDAVDRVVAGMEGTP
hhhhhhhhccccccchhhhccccccccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhheeecccccc
LVDSKSKRLIAYHEVGHAIVGTLLKDHDPVQKVTLIPRGQAQGLTWFTPNEEQGLTTKAQLMARIAGAMG
cccccccchhhhhcccceeeeeecccccccceeeecccccccceeccccccccchhhhhhhhhhhhhhhh
GRAAEEEVFGDDEVTTGAGGDLQQVTEMARQMVTRFGMSNLGPISLESSGGEVFLGGGLMNRSEYSEEVA
hhhhhhhcccccceeeccccchhhhhhhhhhhhhhhccccccccccccccceeeecccccccccchhhhh
TRIDAQVRQLAEQGHQMARKIVQEQREVVDRLVDLLIEKETIDGEEFRQIVAEYAEVPVKEQLIPQL
hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhcccccchhhhhhhhhhcccccccccccc
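A per-residue state string like the one in this output is easier to interpret once it is collapsed into runs. The sketch below does that with `itertools.groupby`. It assumes the usual reading of the letters (h for helix, e for extended strand, c for coil), which the slide itself does not spell out, and the function names are illustrative.

```python
from itertools import groupby


def segments(states):
    """Collapse a state string into (state, start, end) runs, 1-based inclusive."""
    runs, position = [], 1
    for state, group in groupby(states):
        length = len(list(group))
        runs.append((state, position, position + length - 1))
        position += length
    return runs


def state_fractions(states):
    """Fraction of residues assigned to each state."""
    return {s: states.count(s) / len(states) for s in sorted(set(states))}
```

Applied to the first fourteen states of the output, `segments("cccceeeeeecccc")` reports a coil run at residues 1 to 4, a strand at 5 to 10, and a coil at 11 to 14.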

Super-secondary Structure
Secondary structure elements combined into specific geometric arrangements known as motifs
Example: beta corner

Super-secondary Structure
Several programs/websites for specific domains, e.g.
PAIRCOIL and MULTICOIL: detect coiled-coil regions
– regions separating domains
TRESPASSER: detects leucine zippers
– Leu-X6-Leu-X6-Leu-X6-Leu
– protein interaction domain
Helix-Turn-Helix
– Protein interaction/DNA binding
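The Leu-X6-Leu-X6-Leu-X6-Leu pattern translates directly into a regular expression. The sketch below only finds the spacing pattern. It does not check that the region is helical, and it is not a description of how TRESPASSER works. The function name and test sequence are made up for illustration.

```python
import re

# Four leucines, each separated by exactly six residues of any type.
# The lookahead wrapper lets overlapping matches be reported as well.
ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def find_leucine_zippers(seq):
    """Return (start, matched_text) pairs; start is 1-based."""
    return [(m.start() + 1, m.group(1)) for m in ZIPPER.finditer(seq.upper())]


if __name__ == "__main__":
    # Hypothetical sequence with one heptad-spaced leucine repeat.
    demo = "MK" + "LAAAAAA" * 3 + "L" + "GG"
    print(find_leucine_zippers(demo))
```

A match is a candidate only: the same spacing turns up by chance in leucine-rich proteins, so hits are normally checked against a secondary structure or coiled-coil prediction.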

Integrated structure prediction
One-stop shop! PredictProtein at EBI
– secondary structure
– solvent accessibility
– globular regions
– transmembrane helices
– coiled-coil regions
– a multiple sequence alignment
– ProSite sequence motifs
– low-complexity regions
– ProDom domain assignments

Tertiary Structure Prediction
– Homology modelling
– Fold recognition
– Threading
– Model building

Protein sequence (primary structure)
→ Database searching for homologues
– Homologue of known structure → Comparative modelling → 3D-structure
– No homologue of known structure → Fold prediction, ab initio methods etc. → 3D-structure

Homology Modelling
– Method of choice following a BLAST search
– SWISS-MODEL is a good WWW interface
– URL:

Homology Modelling
– Requires at least one sequence of known 3D structure with significant similarity to the target sequence.
– Compare the target sequence with the database using FastA and BLAST.
– Sequences with a FastA score 10.0 standard deviations above the mean of the random scores, or a P(N) lower than 10^-5 (BLAST), are considered for model building.
– Restrict to those which share at least 30% residue identity.
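The template selection rules on this slide amount to a simple filter over search hits. The sketch below encodes the three thresholds as stated. The `Hit` record, its field names, and the template identifiers are hypothetical, and treating a FastA score of exactly 10.0 standard deviations as passing is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    template_id: str      # hypothetical identifier of a known structure
    blast_p: float        # BLAST P(N) value
    fasta_z: float        # FastA score, in standard deviations above the random mean
    identity_pct: float   # percent residue identity with the target


def usable_templates(hits, max_p=1e-5, min_z=10.0, min_identity=30.0):
    """Keep hits that pass either significance test and the identity floor."""
    return [
        hit for hit in hits
        if (hit.blast_p < max_p or hit.fasta_z >= min_z)
        and hit.identity_pct >= min_identity
    ]


if __name__ == "__main__":
    hits = [
        Hit("t1", 1e-20, 25.0, 45.0),   # significant and similar enough
        Hit("t2", 1e-8, 12.0, 22.0),    # significant, but identity below 30%
        Hit("t3", 0.5, 2.0, 60.0),      # high identity, but not significant
    ]
    print([hit.template_id for hit in usable_templates(hits)])
```

If the list comes back empty, there is no usable homologue and comparative modelling is not an option for that target.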

– Framework construction: compare atom positions (Cαs)
– Build non-conserved loops
– Complete backbone: add other atoms
– Add side chains
– Refine

Insulin-like gene from C. elegans
Red = Insulin, Blue = ILGF1

What if I have no homologue?
Ab initio methods: threading
– Take a sequence of unknown structure
– Thread it through a sequence of known structure
– Move the query sequence through residue by residue and compare computationally; include thermodynamic criteria, solvent accessibility and secondary structure information
– Computing intensive
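The idea of sliding a query along a known structure and scoring each placement can be shown with a toy example. The sketch below is a heavy simplification, not a real threading program: it allows no gaps, describes the template only by whether each position is buried (+1) or exposed (-1), and uses Kyte-Doolittle hydropathy as a crude stand-in for the thermodynamic and solvent-accessibility terms. All names and the example data are hypothetical.

```python
# Kyte-Doolittle hydropathy values, standing in for a real energy function.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def thread_score(query, burial, offset):
    """Score the query placed at `offset` on the template.

    Hydrophobic residues at buried positions (+1) and polar residues at
    exposed positions (-1) both raise the score.
    """
    return sum(KD[aa] * burial[offset + i] for i, aa in enumerate(query))


def best_threading(query, burial):
    """Try every ungapped placement; return (best_offset, best_score)."""
    if len(query) > len(burial):
        raise ValueError("query is longer than the template")
    placements = range(len(burial) - len(query) + 1)
    return max(((off, thread_score(query, burial, off)) for off in placements),
               key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical template: exposed ends around a four-residue buried core.
    template_burial = [-1, -1, -1, 1, 1, 1, 1, -1, -1, -1]
    print(best_threading("DLIVLD", template_burial))
```

Even this toy version evaluates every placement of the query. Adding gaps, a library of templates, and a richer energy function is what makes real threading computing intensive.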