Proteins Structure Predictions Structural Bioinformatics.

Presentation transcript:


Reminder: Final date to choose a project 10.1 Submission project overview (one page) - Title - Main question - Major tools you are planning to use to answer the questions 11.1/18.1 - meetings on projects 9.3 Poster submission 16.3 Poster presentation

At the time of this lecture there were 114,402 protein structures in the protein structure database. The first high-resolution structure of a protein, myoglobin, was solved in 1958 by Max Perutz and John Kendrew of Cambridge University, who won the 1962 Nobel Prize in Chemistry.

The 3D structure of a protein is stored in a coordinate file. Each atom is represented by a coordinate in 3D (X, Y, Z).
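A minimal sketch of reading atom coordinates from such a file, assuming the file uses the fixed-column PDB text format (atom name in columns 13-16, residue name in columns 18-20, X, Y and Z in columns 31-38, 39-46 and 47-54). The names `parse_atoms` and `format_atom` and the two sample atoms are made up for illustration and do not come from the slides.

```python
def format_atom(serial, name, res_name, chain, res_seq, x, y, z):
    """Build one ATOM record in the fixed-column layout (illustrative helper)."""
    return (
        f"ATOM  {serial:>5} {name:<4} {res_name:>3} {chain}{res_seq:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}"
    )


def parse_atoms(pdb_text):
    """Return (atom_name, residue_name, (x, y, z)) for every ATOM record."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue  # skip headers, remarks and other record types
        atom_name = line[12:16].strip()
        res_name = line[17:20].strip()
        x = float(line[30:38])
        y = float(line[38:46])
        z = float(line[46:54])
        atoms.append((atom_name, res_name, (x, y, z)))
    return atoms


# Hypothetical two-atom sample; the coordinates are invented.
SAMPLE = "\n".join([
    "REMARK illustrative sample, not a real structure",
    format_atom(1, "N", "MET", "A", 1, 27.340, 24.430, 2.614),
    format_atom(2, "CA", "MET", "A", 1, 26.266, 25.413, 2.842),
])

for atom in parse_atoms(SAMPLE):
    print(atom)
```

Slicing by column rather than splitting on whitespace is deliberate: in this layout adjacent numeric fields can touch, so whitespace splitting is not reliable.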

The coordinate file can be viewed graphically. [Figure: graphical view of RBP]

Predicting the three-dimensional structure of a protein from its sequence is very hard (sometimes impossible). However, we can predict the secondary structure with relatively high precision.
MERFGYTRAANCEAP…
What can we do to bridge the gap? [Figure labels: >10,000,000 and >100,000]

What do we mean by secondary structure? Secondary structures are the building blocks of the protein structure.

What do we mean by secondary structure? Secondary structure is usually divided into three categories:
- Alpha helix
- Beta strand (sheet)
- Anything else: turn/loop

The different secondary structures are combined to form the tertiary structure of the protein.

[Figure: RBP and Globin, secondary and tertiary structure]

Secondary Structure Prediction: given a primary sequence ADSGHYRFASGFTYKKMNCTEAA, what secondary structure will it adopt (alpha helix, beta strand or random coil)?

Secondary Structure Prediction Methods
Statistical methods
- Based on amino acid frequencies
- HMM (Hidden Markov Model)
Machine learning methods
- SVM, neural networks

Statistical Methods for SS prediction: Chou and Fasman (1974)
The method tabulates three propensities, P(a), P(b) and P(turn), for each amino acid: Alanine, Arginine, Aspartic Acid, Asparagine, Cysteine, Glutamic Acid, Glutamine, Glycine, Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Proline, Serine, Threonine, Tryptophan, Tyrosine, Valine.
Each value is the propensity of an amino acid to be part of a certain secondary structure (e.g. Proline has a low propensity of being in an alpha helix or beta sheet, so it acts as a breaker).
Not very useful for predictions.

What is missing?

HMM (Hidden Markov Model): an approach for predicting secondary structure that considers the dependency between positions.
An HMM enables us to calculate the probability of assigning a sequence to a specific secondary structure:
TGTAGPOLKCHIQWML
HHHHHHHLLLLBBBBB
p = ?

[Figure: probability tables built from a large database of known secondary structures, with these callouts]
- The probability of observing a residue which belongs to an α-helix followed by a residue belonging to a turn = 0.15
- The probability of observing Alanine as part of a β-sheet
- α-helix followed by α-helix
- Beginning with an α-helix

Example: what is the probability that the sequence TGQ will be in a helical structure?
TGQ
HHH
p = 0.45 x P(T|H) x 0.8 x P(G|H) x 0.8 x P(Q|H)
What can we learn from secondary structure predictions?
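The calculation in this example can be sketched as a product over the state path: a start probability, then one emission probability per residue, with a transition probability between consecutive states. This is only a sketch under assumptions: it reads the 0.45 above as the probability of beginning with a helix and the 0.8 as the helix-to-helix transition probability, and the three emission probabilities are hypothetical placeholders, since the values from the slide are not available here.

```python
def path_probability(sequence, states, start, transition, emission):
    """Joint probability of a residue sequence and one given state path."""
    if not sequence or len(sequence) != len(states):
        raise ValueError("sequence and states must be non-empty and equal in length")
    # First position: start probability times emission probability.
    p = start[states[0]] * emission[states[0]][sequence[0]]
    # Every later position: transition from the previous state, then emission.
    for i in range(1, len(sequence)):
        p *= transition[(states[i - 1], states[i])]
        p *= emission[states[i]][sequence[i]]
    return p


start = {"H": 0.45}              # assumed: probability of beginning with a helix
transition = {("H", "H"): 0.8}   # assumed: helix followed by helix
# Hypothetical emission probabilities, for illustration only.
emission = {"H": {"T": 0.05, "G": 0.03, "Q": 0.06}}

p = path_probability("TGQ", "HHH", start, transition, emission)
print(p)
```

Comparing this value across alternative state paths for the same sequence is what lets an HMM prefer one secondary structure assignment over another.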

Mad Cow Disease: PrPc to PrPsc [Figure: PrPc and PrPsc]

Predicting 3D structure based on homology: comparative modeling (homology modeling). Similar sequences suggest similar structures.

Sequence and structure alignments of two Retinol Binding Proteins

How do we evaluate structure similarity? Structure alignment.

Structure Alignments: the outputs of a structural alignment are a superposition of the atomic coordinates and a minimal Root Mean Square Distance (RMSD) between the structures. There are many different algorithms for structural alignment.

[Figure: atoms in Protein V and atoms in Protein W; atom N has coordinates (x, y, z)]
The RMSD of two aligned structures indicates their divergence from one another. Low RMSD values mean similar structures.
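As a rough illustration, RMSD over N paired atoms is the square root of the mean squared distance between each atom in one structure and its partner in the other. The sketch below assumes the two structures are already superposed and that atoms are paired by position in the lists; it does not search for the optimal superposition, which a structural alignment algorithm would do first. The function name `rmsd` and the coordinates are illustrative.

```python
import math


def rmsd(coords_v, coords_w):
    """RMSD between two equal-length lists of (x, y, z) tuples, paired by index."""
    if not coords_v or len(coords_v) != len(coords_w):
        raise ValueError("both structures need the same non-zero number of atoms")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(coords_v, coords_w):
        # Squared Euclidean distance between the paired atoms.
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(coords_v))


# Invented coordinates: protein W is protein V shifted by 1 unit along z.
protein_v = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
protein_w = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (1.0, 1.0, 1.0)]

print(rmsd(protein_v, protein_w))
```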

Different sequences can result in similar structures: 1ecd and 2hhd, RMSD < 1

We can learn about the important features which determine structure and function by comparing the sequences and structures.

The Globin Family

Why is Proline 36 conserved throughout the globin family?

Where are the gaps? The gaps in the pairwise alignment are mapped to the loop regions.

How are remote homologs related in terms of their structure? [Figure: β-lactoglobulin and RBP]

PSI-BLAST alignment of RBP and β-lactoglobulin: iteration 3
Score = 159 bits (404), Expect = 1e-38
Identities = 41/170 (24%), Positives = 69/170 (40%), Gaps = 19/170 (11%)

Query: 3 WVWALLLLAAWAAAERD CRVSSFRVKENFDKARFSGTWYAMAKKDPEGLFLQ 54
V L+ LA A + S V+ENFD ++ G WY + K
Sbjct: 1 MVTMLMFLATLAGLFTTAKGQNFHLGKCPSPPVQENFDVKKYLGRWYEIEKIPASFE-KG 59

Query: 55 DNIVAEFSVDETGQMSATAKGRVRLLNNWDVCADMVGTFTDTEDPAKFKMKYWGVASFLQ
I A +S+ E G + K V PAK
Sbjct: 60 NCIQANYSLMENGNIEVLNKELSPDGTMNQVKGE--AKQSNVSEPAKLEVQFFPL

Query: 115 KGNDDHWIVDTDYDTYAVQYSCRLLNLDGTCADSYSFVFSRDPNGLPPEA 164
+WI+ TDY+ YA+ YSC + ++ R+P LPPE
Sbjct: 113 MPPAPYWILATDYENYALVYSCTTFFWL--FHVDFFWILGRNPY-LPPET 159
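The identity and gap figures in the header of the alignment above are simple counts over alignment columns. A minimal sketch of that counting, assuming two gapped rows of equal length with `-` as the gap character; the function names and the short pair of rows are made up for illustration and are not taken from the alignment. Positives are not computed, because they depend on a substitution matrix.

```python
def percent(count, total):
    """Whole-number percentage, rounded down."""
    return 100 * count // total


def alignment_stats(query, subject):
    """Count identical columns and gap columns in a pairwise alignment."""
    if not query or len(query) != len(subject):
        raise ValueError("alignment rows must be non-empty and equal in length")
    length = len(query)
    identities = sum(1 for q, s in zip(query, subject) if q == s and q != "-")
    gaps = sum(1 for q, s in zip(query, subject) if q == "-" or s == "-")
    return {
        "length": length,
        "identities": identities,
        "gaps": gaps,
        "identity_pct": percent(identities, length),
        "gap_pct": percent(gaps, length),
    }


# Hypothetical 12-column alignment fragment.
stats = alignment_stats("WYAMAKK-DPEG", "WYEIEKIPASFE")
print(stats)

# The header counts reproduce the reported percentages when rounded down.
print(percent(41, 170), percent(69, 170), percent(19, 170))
```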

[Figure: the Retinol Binding Protein and β-lactoglobulin]

[Figure: MERFGYTRAANCEAP… Taken together: FUNCTION]

Comparative Modeling: builds a protein structure model based on its (sequence) alignment to one or more related protein structures in the database. Similar sequence suggests similar structure.

Comparative Modeling: general algorithm
Modeling of a sequence based on known structures consists of four major steps:
1. Finding known structure(s) related to the sequence to be modeled (the template), using sequence comparison methods such as PSI-BLAST
2. Aligning the sequence with the templates
3. Building a model
4. Assessing the model

Comparative Modeling: the accuracy of a comparative model is usually related to the sequence identity on which it is based:
- >50% sequence identity = high accuracy
- 30%-50% sequence identity = 90% can be modeled
- <30% sequence identity = low accuracy (many errors)
However, other parameters (such as identity length) can influence the results.
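These identity thresholds can be written as a small lookup. This is a sketch only: the function name is hypothetical, the returned strings paraphrase the slide, and putting exactly 30% and exactly 50% in the middle band is an assumption, since the slide does not say where the boundaries fall.

```python
def expected_model_accuracy(percent_identity):
    """Map sequence identity (0-100) to the rough accuracy band quoted above."""
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent_identity must be between 0 and 100")
    if percent_identity > 50:
        return "high accuracy"
    if percent_identity >= 30:
        # Assumption: the boundary values 30 and 50 fall in this band.
        return "about 90% can be modeled"
    return "low accuracy (many errors)"


# 24% is the identity reported for the RBP / beta-lactoglobulin alignment.
for identity in (75, 40, 24):
    print(identity, expected_model_accuracy(identity))
```

Identity alone is a coarse guide; as noted above, other parameters also influence the result.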

What is a good model? ModBase, for homology modelling.

Extra Slides (for your interest)

Alpha Helix: Pauling (1951)
- A consecutive stretch of 5-40 amino acids (average 10).
- A right-handed spiral conformation.
- 3.6 amino acids per turn.
- Stabilized by hydrogen bonds.
[Figure labels: residues, 5.6 Å]

Beta Strand: Pauling and Corey (1951)
> An extended polypeptide chain is called a β-strand (consists of 5-10 amino acids).
> The chains are connected together by hydrogen bonds to form a β-sheet.
[Figure labels: β-strand, β-sheet]

Loops connect the secondary structure elements (alpha helices and beta strands). They have various lengths and shapes.