Intro to Bioinformatics Computational Approaches to Receptor Structure Prediction Uğur Sezerman Biological Sciences and Bioengineering Program Sabancı.

Presentation transcript:

Intro to Bioinformatics Computational Approaches to Receptor Structure Prediction Uğur Sezerman Biological Sciences and Bioengineering Program Sabancı University, Istanbul

Intro to Bioinformatics: Protein Folding

Determining Protein Structure
- There are on the order of 100,000 distinct proteins in the human proteome.
- 3D structures have been determined for over 60,000 proteins, from all organisms.
  - This count includes duplicates with different ligands bound, etc.
- Coordinates are determined by X-ray crystallography or NMR.

X-Ray Crystallography
[Figure: protein crystal, ~0.5 mm]
- The crystal is a mosaic of millions of copies of the protein.
- As much as 70% of the crystal is solvent (water).
- A crystal may take months (and a “green” thumb) to grow.

X-Ray Diffraction
- The image is averaged over:
  - Space (many copies of the protein)
  - Time (the duration of the diffraction experiment)

Electron Density Maps
- Resolution depends on the quality/regularity of the crystal.
- The R-factor is a measure of “leftover” electron density.
- Solvent fitting
- Refinement

The Protein Data Bank
ATOM  1 N   ALA E APR 213
ATOM  2 CA  ALA E APR 214
ATOM  3 C   ALA E APR 215
ATOM  4 O   ALA E APR 216
ATOM  5 CB  ALA E APR 217
ATOM  6 N   GLY E APR 218
ATOM  7 CA  GLY E APR 219
ATOM  8 C   GLY E APR 220
ATOM  9 O   GLY E APR 221
ATOM 10 N   VAL E APR 222
ATOM 11 CA  VAL E APR 223
ATOM 12 C   VAL E APR 224
ATOM 13 O   VAL E APR 225
ATOM 14 CB  VAL E APR 226
ATOM 15 CG1 VAL E APR 227
ATOM 16 CG2 VAL E APR 228
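In a PDB file each ATOM record is a fixed-width line that also carries x, y, z coordinates. The sketch below shows one way such a line might be read, assuming the standard wwPDB column layout. The `Atom` type, the `format_atom_line` and `parse_atom_line` helpers, and the coordinate values are made up for illustration and are not taken from the slides.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def format_atom_line(serial, name, res_name, chain_id, res_seq, x, y, z):
    """Build a fixed-width ATOM line (occupancy 1.00, B-factor 0.00 assumed)."""
    return "ATOM  {:>5} {:<4} {:>3} {}{:>4}    {:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}".format(
        serial, name, res_name, chain_id, res_seq, x, y, z, 1.0, 0.0
    )


def parse_atom_line(line):
    """Parse one ATOM record by slicing its fixed-width columns."""
    if not line.startswith("ATOM"):
        raise ValueError("not an ATOM record")
    return Atom(
        serial=int(line[6:11]),        # columns 7-11
        name=line[12:16].strip(),      # columns 13-16
        res_name=line[17:20].strip(),  # columns 18-20
        chain_id=line[21],             # column 22
        res_seq=int(line[22:26]),      # columns 23-26
        x=float(line[30:38]),          # columns 31-38
        y=float(line[38:46]),          # columns 39-46
        z=float(line[46:54]),          # columns 47-54
    )


# Hypothetical record: the coordinates are invented for this example.
sample = format_atom_line(1, " N", "ALA", "E", 1, 11.104, 6.134, -6.504)
atom = parse_atom_line(sample)
print(atom)
```

Slicing by column rather than splitting on whitespace matters because adjacent fields in real files can touch with no space between them.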

A Peek at Protein Function
- Serine proteases cleave other proteins.
  - Catalytic triad: ASP, HIS, SER

Cleaving the peptide bond

Three Serine Proteases
- Chymotrypsin: cleaves the peptide bond on the carboxyl side of aromatic (ring) residues (Trp, Phe, Tyr) and of large hydrophobic residues (Met).
- Trypsin: cleaves after Lys (K) or Arg (R), which carry a positive charge.
- Elastase: cleaves after small residues (Gly, Ala, Ser, Cys).
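The three specificities can be written down as a small lookup table. The sketch below cuts a sequence after every listed residue. It is a simplification: real enzymes have exceptions (for example, cleavage is hindered when proline follows the target residue), and the function names are hypothetical.

```python
# Residues after which each protease cleaves, as listed above (one-letter codes).
CLEAVES_AFTER = {
    "chymotrypsin": set("WFYM"),
    "trypsin": set("KR"),
    "elastase": set("GASC"),
}


def cleavage_sites(sequence, protease):
    """Return 1-based positions of residues whose C-terminal bond is cut."""
    targets = CLEAVES_AFTER[protease]
    # The last residue has no following peptide bond, so it is skipped.
    return [i + 1 for i, aa in enumerate(sequence[:-1]) if aa in targets]


def digest(sequence, protease):
    """Split the sequence into the fragments produced by complete digestion."""
    fragments, start = [], 0
    for pos in cleavage_sites(sequence, protease):
        fragments.append(sequence[start:pos])
        start = pos
    fragments.append(sequence[start:])
    return fragments


seq = "AAVIKYGCAL"
for enzyme in CLEAVES_AFTER:
    print(enzyme, digest(seq, enzyme))
```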

Specificity Binding Pocket

Protein Folding: Biological perspective
- “Central dogma”: sequence specifies structure.
- Denature: to “unfold” a protein back to a random coil configuration.
  - β-mercaptoethanol breaks disulfide bonds.
  - Urea or guanidine hydrochloride acts as a denaturant.
  - Heat or pH can also denature.
- Anfinsen’s experiments
  - Denatured ribonuclease
  - Spontaneously regained enzymatic activity
  - Evidence that it refolded to the native conformation

Protein Folding Problem
- Finding the structure of a protein starting from its amino acid sequence is called the protein folding problem.

The Protein Folding Problem
- Central question of molecular biology: “Given a particular sequence of amino acid residues (primary structure), what will the tertiary/quaternary structure of the resulting protein be?”
- Input: AAVIKYGCAL…
- Output: (φ1, ψ1), (φ2, ψ2), … = backbone conformation (no side chains yet)

Folding intermediates
- Levinthal’s paradox: consider a 100-residue protein. If each residue can take only 3 x 3 = 9 positions, there are 9^100 (about 2.7 x 10^95) possible conformations.
- Folding must proceed by progressive stabilization of intermediates.
  - Molten globules: most secondary structure is formed, but the chain is much less compact than the “native” conformation.
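The arithmetic behind Levinthal’s paradox is easy to reproduce. The sketch below uses 9 positions per residue and 100 residues as on the slide. The sampling rate of 10^13 conformations per second is an assumed figure chosen only to make the point.

```python
STATES_PER_RESIDUE = 9     # 3 x 3 positions per residue, as on the slide
RESIDUES = 100
SAMPLING_RATE = 1e13       # conformations tried per second (assumed)
SECONDS_PER_YEAR = 3.156e7

# Python integers are arbitrary precision, so this count is exact.
conformations = STATES_PER_RESIDUE ** RESIDUES
years = conformations / SAMPLING_RATE / SECONDS_PER_YEAR

print(f"{conformations:.2e} conformations")
print(f"{years:.1e} years to try them all")
```

Even at that generous rate, exhaustive sampling would take vastly longer than the age of the universe, which is why folding cannot be a random search.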

Protein Packing
- occurs in the cytosol (~60% bulk water, ~40% water of hydration)
- involves interaction between secondary structure elements and solvent
- may be promoted by chaperones, membrane proteins
- tumbles into molten globule states
- overall entropy loss is small enough that enthalpy determines the sign of ΔE, which decreases (the loss in entropy from packing is counteracted by the gain from desolvation and reorganization of water, i.e. the hydrophobic effect)
- yields tertiary structure

Folding help
- Proteins are, in fact, only marginally stable.
  - The native state is typically only 5 to 10 kcal/mole more stable than the unfolded form.
- Many proteins help in folding.
  - Protein disulfide isomerase catalyzes shuffling of disulfide bonds.
  - Chaperones break up aggregates and (in theory) unfold misfolded proteins.
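To put “marginally stable” into numbers, one can assume a simple two-state equilibrium between folded and unfolded forms, where [folded]/[unfolded] = exp(ΔG/RT). The slides do not state this model. It and the temperature of 298 K are assumptions made for this sketch.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def folded_to_unfolded_ratio(delta_g_kcal, temperature_k=298.0):
    """Two-state estimate of [folded]/[unfolded] for a given stability in kcal/mol."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


for delta_g in (5.0, 10.0):
    ratio = folded_to_unfolded_ratio(delta_g)
    print(f"{delta_g:4.1f} kcal/mol -> about {ratio:.1e} folded per unfolded molecule")
```

Because the ratio depends exponentially on ΔG, losing a few kcal/mol of stabilizing interactions shifts the population substantially toward the unfolded state.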

Forces driving protein folding
- Hydrophobic collapse is believed to be a key driving force for protein folding.
  - Hydrophobic core
  - Polar surface interacting with solvent
- Minimum volume (no cavities)
- Disulfide bond formation stabilizes the fold
- Hydrogen bonds
- Polar and electrostatic interactions

Secondary Structure
- non-linear
- 3-dimensional
- localized to regions of an amino acid chain
- formed and stabilized by hydrogen bonding, electrostatic and van der Waals interactions

Common motifs

The Hydrophobic Core
- Hemoglobin A is the protein in red blood cells (erythrocytes) responsible for binding oxygen.
- The mutation E6 → V in the β chain places a hydrophobic Val on the surface of hemoglobin.
- The resulting “sticky patch” causes hemoglobin S to agglutinate (stick together) and form fibers, which deform the red blood cell and do not carry oxygen efficiently.
- Sickle cell anemia was the first identified molecular disease.

Sickle Cell Anemia
Sequestering hydrophobic residues in the protein core protects proteins from hydrophobic agglutination.

Computational Approaches
- Ab initio methods
- Threading
- Comparative Modelling
- Fragment Assembly

Why is ab-initio prediction hard?

Ab-initio protein structure prediction as an optimization problem
1. Define a function that maps protein structures to some quality measure.
2. Solve the computational problem of finding an optimal structure.
[Figure: energy plotted against conformation]

A dream function
- Has a clear minimum in the native structure.
- Has a clear path towards the minimum.
- A global optimization algorithm should find the native structure.
Chen Keasar, BGU

An approximate function
- Easier to design and compute.
- The native structure is not always the global minimum.
- Global optimization methods do not converge.
- Many alternative models (decoys) should be generated.
- There is no clear way of choosing among them.
[Figure: decoy set]
Chen Keasar, BGU

Fold Optimization
- Simple lattice models (HP models)
  - Two types of residues: hydrophobic and polar
  - 2-D or 3-D lattice
  - The only force is hydrophobic collapse
  - Score = number of H-H contacts

Scoring Lattice Models
- H/P model scoring: count noncovalent hydrophobic interactions.
- Sometimes: penalize buried polar or surface hydrophobic residues.
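The basic H/P score can be sketched in a few lines for a 2-D square lattice. A conformation is given as one (x, y) grid point per residue, and a contact is a pair of H residues on adjacent grid points that are not neighbours along the chain. The function name and the example fold are hypothetical, and the optional burial penalties are not included.

```python
def hp_contacts(sequence, coords):
    """Count noncovalent H-H contacts for an HP chain placed on a 2-D lattice."""
    if len(sequence) != len(coords):
        raise ValueError("need one lattice point per residue")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")

    index_at = {xy: i for i, xy in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each pair is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbour)
            # abs(i - j) > 1 excludes covalently bonded chain neighbours.
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return contacts


# Hypothetical 4-residue chain folded around a unit square.
print(hp_contacts("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)]))  # 1
```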

What can we do with lattice models?
- For smaller polypeptides, exhaustive search can be used.
  - Looking at the “best” fold, even in such a simple model, can teach us interesting things about the protein folding process.
- For larger chains, other optimization and search methods must be used.
  - Greedy, branch and bound
  - Evolutionary computing, simulated annealing
  - Graph theoretical methods
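For a short chain, exhaustive search means enumerating every self-avoiding walk on the lattice and keeping the one with the most H-H contacts. The sketch below does this on a 2-D square lattice. The function names and the 10-residue sequence are hypothetical, and because the number of walks grows exponentially with chain length, this is only practical for small inputs.

```python
MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))


def count_hh_contacts(sequence, coords):
    """Count H-H lattice contacts between residues that are not bonded."""
    index_at = {xy: i for i, xy in enumerate(coords)}
    total = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        for neighbour in ((x + 1, y), (x, y + 1)):  # each pair counted once
            j = index_at.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                total += 1
    return total


def best_fold(sequence):
    """Enumerate all self-avoiding walks and return (best score, coordinates)."""
    if len(sequence) < 2:
        raise ValueError("need at least two residues")
    best_score, best_coords = -1, None

    def extend(coords, occupied):
        nonlocal best_score, best_coords
        if len(coords) == len(sequence):
            score = count_hh_contacts(sequence, coords)
            if score > best_score:
                best_score, best_coords = score, list(coords)
            return
        x, y = coords[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                coords.append(nxt)
                occupied.add(nxt)
                extend(coords, occupied)
                coords.pop()
                occupied.remove(nxt)

    # The first bond is fixed along +x, since rotated folds score the same.
    start = [(0, 0), (1, 0)]
    extend(start, set(start))
    return best_score, best_coords


score, coords = best_fold("HPHPPHHPHH")  # hypothetical HP sequence
print(score, coords)
```

The same scoring function can be reused unchanged with the heuristic searches listed above. Only the way candidate conformations are generated differs.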

Learning from Lattice Models
- The “hydrophobic zipper” effect (Ken Dill, ~1997)

Threading: Fold recognition
- Given:
  - Sequence: IVACIVSTEYDVMKAAR…
  - A database of molecular coordinates
- Map the sequence onto each fold
- Evaluate
  - Objective 1: improve scoring function
  - Objective 2: folding
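The map-and-evaluate loop can be illustrated with a deliberately crude toy. Everything here is an assumption made for the example: each fold is reduced to a made-up burial pattern (B = buried, E = exposed), residues are split into hydrophobic and polar by a rough rule, and the sequence is placed on the fold without gaps. Real threading programs use gapped alignments and statistical pair potentials derived from known structures.

```python
# Rough hydrophobic/polar split, assumed for this toy only.
HYDROPHOBIC = set("AVILMFWC")

# Hypothetical fold library: one burial pattern per fold.
FOLD_LIBRARY = {
    "fold_a": "BBBBBEEEEEEBBEBBE",
    "fold_b": "EEBBEEBBEEBBEEBBE",
}


def thread_score(sequence, burial):
    """+1 when a hydrophobic residue is buried or a polar one exposed, else -1."""
    score = 0
    for aa, environment in zip(sequence, burial):
        compatible = (aa in HYDROPHOBIC) == (environment == "B")
        score += 1 if compatible else -1
    return score


def rank_folds(sequence, library):
    """Score the sequence on every fold and sort from best to worst."""
    scores = {name: thread_score(sequence, pattern) for name, pattern in library.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


query = "IVACIVSTEYDVMKAAR"
for name, score in rank_folds(query, FOLD_LIBRARY):
    print(name, score)
```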

Protein Fold Families
- CATH website

Secondary Structure Prediction
AGVGTVPMTAYGNDIQYYGQVT…
A-VGIVPM-AYGQDIQY-GQVT…
AG-GIIP--AYGNELQ--GQVT…
AGVCTVPMTA---ELQYYG--T…
AGVGTVPMTAYGNDIQYYGQVT…
----hhhHHHHHHhhh--eeEE…

Secondary Structure Prediction
- Easier than folding
  - Current algorithms can predict secondary structure with 70-80% accuracy.
- Chou, P.Y. & Fasman, G.D. (1974). Biochemistry, 13.
  - Based on frequencies of occurrence of residues in helices and sheets
- PHD: neural network based
  - Uses a multiple sequence alignment
  - Rost & Sander, Proteins, 1994, 19, 55-72
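The frequency-based idea can be sketched as a sliding-window average of per-residue helix propensities. This is a simplified illustration, not the published Chou-Fasman procedure, which adds nucleation and extension rules. The numbers in `DEMO_PROPENSITY` are placeholders chosen only so the example runs. They should be replaced with the published parameter table, and the window and threshold are likewise assumed.

```python
# Placeholder propensities (NOT the published table): values above 1 favour helix.
DEMO_PROPENSITY = {
    "A": 1.4, "E": 1.4, "L": 1.4, "M": 1.4, "Q": 1.4,  # treated as helix formers
    "G": 0.6, "P": 0.6,                                # treated as helix breakers
}


def helix_mask(sequence, propensity, window=6, threshold=1.05):
    """Mark residues covered by any window whose mean propensity passes the threshold."""
    marks = ["-"] * len(sequence)
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        # Residues missing from the table are treated as neutral (1.0).
        mean = sum(propensity.get(aa, 1.0) for aa in segment) / window
        if mean >= threshold:
            marks[start:start + window] = "H" * window
    return "".join(marks)


seq = "AGVGTVPMTAYGNDIQYYGQVT"
print(seq)
print(helix_mask(seq, DEMO_PROPENSITY))
```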

Chou-Fasman Parameters

Homology Modelling
- Using database search algorithms, find the sequence with known structure that best matches the query sequence.
- Assign the structure of the core regions obtained from the structure database to the query sequence.
- Find the structure of the intervening loops using loop closure algorithms.

Homology Modeling: How it works
- Find template
- Align target sequence with template
- Generate model:
  - add loops
  - add sidechains
- Refine model
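The first two steps, finding and aligning a template, usually come down to comparing the target with sequences of known structure. The sketch below ranks already-aligned candidates by percent identity over the columns where neither sequence has a gap. That definition is one of several in use and is an assumption here. The aligned rows are sample data, and the template names are hypothetical.

```python
def percent_identity(target, template):
    """Percent of gap-free aligned columns in which the two residues match."""
    if len(target) != len(template):
        raise ValueError("sequences must come from the same alignment")
    compared = identical = 0
    for a, b in zip(target, template):
        if a == "-" or b == "-":
            continue
        compared += 1
        identical += a == b
    return 100.0 * identical / compared if compared else 0.0


def pick_template(target, candidates):
    """Return the (name, identity) pair with the highest percent identity."""
    scored = {name: percent_identity(target, seq) for name, seq in candidates.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]


target = "AGVGTVPMTAYGNDIQYYGQVT"
candidates = {  # hypothetical templates, aligned to the target
    "template_1": "A-VGIVPM-AYGQDIQY-GQVT",
    "template_2": "AG-GIIP--AYGNELQ--GQVT",
    "template_3": "AGVCTVPMTA---ELQYYG--T",
}
name, identity = pick_template(target, candidates)
print(name, round(identity, 1))
```

In practice the database search and the alignment are done by dedicated tools. This only shows the comparison that decides which known structure to build on.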

Prediction of Protein Structures
- Examples: a few good examples
[Figure: actual vs. predicted structures]

Prediction of Protein Structures
- Not so good example

1esr

How can we predict protein structures?
Are we lucky?
- yes: homology
- a bit: fold recognition
- no: ab initio
[Figure: example sequence A V C W K A G K C]

G-protein coupled receptors (GPCRs)
- Vital protein bundles with versatile functions.
- Play a key role in cellular signaling and the regulation of basic physiological processes, and interact with more than 50% of prescription drugs.
- They are therefore excellent potential therapeutic targets for drug design and the focus of current pharmaceutical research.

GPCR Functional Classification Problem
- Although thousands of GPCR sequences are known, the crystal structure has been solved for only one GPCR sequence, at medium resolution, to date.
- For many of them, the activating ligand is unknown.
- Functional classification methods for automated characterization of such GPCRs are imperative.
- GPCRs are not suitable for homology modelling, but hybrid methods may work.
A Rayan J. Mol. Modelling (2010) p

Schematic overview of the MHC-I antigen processing and presentation pathway

Pathway and MHC Molecule
- Cytotoxic T-cells recognize antigen peptides (8-10 residues) bound to an MHC class I molecule on the cell surface.
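Since the presented peptides are 8 to 10 residues long, a computational screen typically starts by listing every window of those lengths in a protein. The sketch below shows only that enumeration step. The function name and the sample sequence are hypothetical, and predicting which peptides actually bind an MHC molecule is a separate problem not addressed here.

```python
def candidate_peptides(protein, min_len=8, max_len=10):
    """Yield (1-based start position, peptide) for every window of 8-10 residues."""
    for length in range(min_len, max_len + 1):
        for start in range(len(protein) - length + 1):
            yield start + 1, protein[start:start + length]


protein = "IVACIVSTEYDVMKAAR"  # sample sequence used for illustration
peptides = list(candidate_peptides(protein))
print(len(peptides), "candidate peptides")
print(peptides[:3])
```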

The MHC-I bound epitope is scanned by the T-cell receptor