Protein Structure Analysis - I

Protein Structure Analysis - I. PLPTH 890 Introduction to Genomic Bioinformatics, Lecture 20. Liangjiang (LJ) Wang, ljwang@ksu.edu. April 8, 2005.

Outline: Basic concepts. How are protein structures determined? (X-ray crystallography; NMR spectroscopy.) Protein structure databases (PDB, MMDB). Protein structure visualization (RasMol, Cn3D, etc.). Protein structure classification (SCOP and CATH).

Structural Bioinformatics: A subdiscipline of bioinformatics that focuses on the representation, storage, visualization, prediction and evaluation of structural information. References: Baxevanis and Ouellette. 2005. Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins. 3rd edition. Chapter 9 and part of Chapter 8. Pevsner. 2003. Bioinformatics and Functional Genomics. Chapter 9. Bourne and Weissig. 2003. Structural Bioinformatics.

Protein Primary Structures: Amino acid sequence of a polypeptide chain. 20 amino acids, each with a different side chain (R). Peptide units are the building blocks of protein structures. The angle of rotation around the N−Cα bond is called phi (φ), and the angle around the Cα−C′ bond from the same Cα atom is called psi (ψ). (Branden and Tooze, 1998)
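As a rough illustration of how φ and ψ can be obtained from atomic coordinates, the sketch below computes a signed dihedral angle from four points. The function names (`dihedral`, `phi`, `psi`) are hypothetical, and the choice of backbone atoms passed to them (C′ of the previous residue, N, Cα, C′, and N of the next residue) follows the usual textbook definition rather than anything stated on the slide.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)          # central bond
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)        # normal of the plane p0, p1, p2
    n2 = _cross(b1, b2)        # normal of the plane p1, p2, p3
    b1_len = math.sqrt(_dot(b1, b1))
    x = _dot(n1, n2)
    y = _dot(_cross(n1, n2), b1) / b1_len
    return math.degrees(math.atan2(y, x))


def phi(c_prev, n, ca, c):
    """Rotation about the N-CA bond (assumed atom order: C'(i-1), N, CA, C')."""
    return dihedral(c_prev, n, ca, c)


def psi(n, ca, c, n_next):
    """Rotation about the CA-C' bond (assumed atom order: N, CA, C', N(i+1))."""
    return dihedral(n, ca, c, n_next)
```

With the four points (1, 0, 0), (0, 0, 0), (0, 0, 1) and (0, 1, 1), `dihedral` returns 90 degrees; moving the last point to (1, 0, 1) gives the eclipsed (0 degree) arrangement.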

Protein Secondary Structures: Local substructures formed as a result of hydrogen bonding between neighboring amino acids (backbone interactions). The amino acid side chains affect secondary structure formation. Types of secondary structures: α helix, β sheet, and loop or random coil.

α Helix: Most abundant secondary structure. 3.6 amino acids per turn, with a hydrogen bond formed between every fourth residue (residue i and residue i+4). Often found on the surface of proteins.
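The two numbers on this slide can be turned into a small worked example. The sketch below derives the rotation per residue from 3.6 residues per turn and lists the i to i+4 hydrogen-bond pairs for a helix of a given length. The function names are hypothetical, and the 1-based residue numbering is an assumption made for readability.

```python
RESIDUES_PER_TURN = 3.6  # value quoted on the slide


def rotation_per_residue(residues_per_turn=RESIDUES_PER_TURN):
    """Degrees of rotation about the helix axis contributed by each residue."""
    return 360.0 / residues_per_turn


def helix_hbond_pairs(n_residues):
    """Backbone hydrogen-bond pairs (i, i + 4) for an ideal helix, 1-based."""
    return [(i, i + 4) for i in range(1, n_residues - 3)]


if __name__ == "__main__":
    print(round(rotation_per_residue(), 1))  # about 100 degrees per residue
    print(helix_hbond_pairs(8))              # pairs for an 8-residue helix
```

For an 8-residue helix this yields four pairs, (1, 5) through (4, 8); helices shorter than five residues yield none.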

β Sheet: Hydrogen bonds form between adjacent polypeptide chain segments (β strands). The strand directions can be the same (parallel sheet), opposite (anti-parallel sheet), or mixed.

Loop or Coil: Regions between α helices and β sheets. Various lengths and 3-D configurations. Often functionally significant (e.g., part of an active site). Example: the active site of open α/β-barrel structures is in a crevice outside the carboxy ends of the β strands (Branden and Tooze, 1998).

Protein Tertiary Structure: The 3-D structure of a protein is assembled from different secondary structure components. Tertiary structure is determined primarily by hydrophobic interactions between side chains. Different classes of protein structures: all α, e.g., hemoglobin (3HHB); all β, e.g., T cell CD8 (1CD8); mixed, e.g., thermolysin (7TLN).

Protein Tertiary Structure (Cont’d): Fold: a certain type of 3-D arrangement of secondary structures. Protein structures evolve more slowly than primary amino acid sequences. Four-helix bundles: E. coli cytochrome b562 (256B) and human growth hormone (1HUW). Three-helix bundle: Drosophila engrailed homeodomain (1ENH).

Protein Quaternary Structure: Two or more independent tertiary structures are assembled into a larger protein complex. Important for understanding protein-protein interactions. Examples: E. coli ribosome (1ML5) and horse spleen ferritin (1IES).

Biological Knowledge from Structures (Bourne, 2004)

X-Ray Crystallography: Basic steps: gene targets → expression and purification → proteins → crystallization → X-ray diffraction → structure solution. Advantages: High-resolution structures. Applicable to large protein complexes and membrane proteins. Disadvantages: Molecules are in a solid-state (crystal) environment. Crystals are required.

Nuclear Magnetic Resonance (NMR): NMR reveals which atoms in a molecule are near one another, and this neighborhood information can be used to construct a 3-D model of the molecule. Advantages: No requirement for crystals. Proteins are in a liquid state (near the physiological state). Disadvantages: Limited by molecule size (up to 30 kD). Membrane proteins may not be amenable to study. Inherently less precise than X-ray crystallography.

Protein Data Bank (PDB): The primary repository for protein structures. Established in 1971 (the first bioinformatics database, set up with 7 protein structures). Contained 30,179 structures as of March 22, 2005. Supports services for structure submission, search, retrieval, and visualization. Search options: SearchLite (PDB ID and keyword search) and SearchFields (advanced search). PDB can be accessed at http://www.rcsb.org/pdb/.

PDB Content Growth: [Figure: number of structures by year, 1972 to 2005. Last updated: 06-Mar-2005.]

Access to Structures through NCBI: MMDB (Molecular Modeling Database): structures obtained from PDB; data in NCBI’s ASN.1 format; integrated into NCBI’s Entrez system. Cn3D (“see in 3D”): NCBI’s 3-D protein structure viewer. VAST (Vector Alignment Search Tool): for direct comparison of 3-D protein structures. NCBI is at http://www.ncbi.nlm.nih.gov/.

Ramachandran Plot: Used to assess the quality of structures. Good structures show tight clustering patterns. [Figure: Ramachandran plot (PSI versus PHI) for thioredoxin (2TRX), with the β sheet and α helix regions labeled (Baxevanis and Ouellette, 2005).]
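To make the idea of "clustering" concrete, the sketch below assigns each (phi, psi) pair to a coarse region and counts how many residues land in each. The rectangular boundaries in `REGIONS` are illustrative assumptions chosen only to bracket the typical α helix and β sheet areas; they are not taken from the slide, and real validation tools use empirically derived contours instead.

```python
from collections import Counter

# Illustrative boxes in degrees: name -> ((phi_min, phi_max), (psi_min, psi_max)).
# These limits are assumptions for demonstration, not validation-grade regions.
REGIONS = {
    "alpha helix": ((-160.0, -20.0), (-100.0, -10.0)),
    "beta sheet": ((-180.0, -45.0), (90.0, 180.0)),
}


def classify(phi, psi):
    """Return the name of the first box containing (phi, psi), else 'other'."""
    for name, ((phi_lo, phi_hi), (psi_lo, psi_hi)) in REGIONS.items():
        if phi_lo <= phi <= phi_hi and psi_lo <= psi <= psi_hi:
            return name
    return "other"


def summarize(angles):
    """Count residues per region for an iterable of (phi, psi) pairs."""
    return Counter(classify(phi, psi) for phi, psi in angles)
```

A structure whose residues mostly fall inside the named regions would show the tight clustering described above, while a large "other" count would suggest a closer look is needed.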

3-D Visualization Tool - RasMol: An open source software package, and the most popular tool for viewing 3-D structures. RasMol represented a major breakthrough in software-driven 3-D structure visualization. Structure file formats supported by RasMol: PDB file format (outdated but human-readable) and mmCIF (a new and robust data representation, but supported by few software tools). RasTop provides a user-friendly graphical interface to RasMol and is available at http://www.geneinfinity.org/rastop/.
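The "human-readable" nature of the PDB file format comes from its fixed-width columns. The sketch below parses a single ATOM or HETATM record using the standard column positions. The `Atom` type, the `parse_atom_line` function, and the `SAMPLE` record are hypothetical; the coordinate values in `SAMPLE` are made up for illustration and do not come from any real entry.

```python
from typing import NamedTuple, Optional


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float
    element: str


def parse_atom_line(line: str) -> Optional[Atom]:
    """Parse one fixed-width ATOM/HETATM record; return None for other records."""
    if not line.startswith(("ATOM  ", "HETATM")):
        return None
    return Atom(
        serial=int(line[6:11]),          # columns 7-11
        name=line[12:16].strip(),        # columns 13-16
        res_name=line[17:20].strip(),    # columns 18-20
        chain_id=line[21],               # column 22
        res_seq=int(line[22:26]),        # columns 23-26
        x=float(line[30:38]),            # columns 31-38
        y=float(line[38:46]),            # columns 39-46
        z=float(line[46:54]),            # columns 47-54
        element=line[76:78].strip(),     # columns 77-78
    )


# Made-up record, assembled field by field so the column widths are visible.
SAMPLE = (
    "ATOM  "       # 1-6: record name
    "    1"        # 7-11: atom serial number
    " "            # 12: blank
    " CA "         # 13-16: atom name
    " "            # 17: alternate location
    "ALA"          # 18-20: residue name
    " "            # 21: blank
    "A"            # 22: chain identifier
    "   1"         # 23-26: residue sequence number
    "    "         # 27-30: insertion code and blanks
    "  11.000"     # 31-38: x
    "  22.500"     # 39-46: y
    "  -3.250"     # 47-54: z
    "  1.00"       # 55-60: occupancy
    " 20.00"       # 61-66: temperature factor
    "          "   # 67-76: blanks
    " C"           # 77-78: element symbol
)
```

Calling `parse_atom_line(SAMPLE)` yields an `Atom` whose `name` is `CA` and whose coordinates are the three floats from columns 31 to 54; lines that are not atom records, such as REMARK lines, return `None`.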

Cn3D: NCBI’s Structure Viewer. Cn3D (“see in 3D”) allows interactive exploration of 3-D structures, sequences and alignments. Can be used to produce high-quality molecular images. Limitation: only accepts structure files in NCBI’s ASN.1 format (from MMDB). Cn3D is available at http://www.ncbi.nlm.nih.gov/Structure/CN3D/cn3d.shtml.

Other 3-D Visualization Tools: Chime: a Netscape plug-in for 3-D structure visualization, based on RasMol source code. Protein Explorer (http://www.proteinexplorer.org/): a Chime-based software package that is particularly user-friendly and feature-rich. Swiss-Pdb Viewer (Deep View, available at http://us.expasy.org/spdbv/): probably the most powerful freely available molecular modeling and visualization package. Supports homology modeling, site-directed mutagenesis, structure superposition, etc.

Protein Structure Comparison: Why is structure comparison important? To understand structure-function relationships. To study the evolution of many key proteins (structure is more conserved than sequence). Comparing 3-D structures is much more difficult than sequence comparison. Protein structure classification: SCOP (Structure Classification Of Proteins) and CATH (Class, Architecture, Topology and Homology). Protein structure alignment: DALI and VAST.

SCOP: SCOP is based on expert definition of protein structural similarities, and is manually curated. Classification hierarchy: Class → Fold → Superfamily → Family. SCOP has 7 major classes: all α, all β, α/β, α+β, multi-domain proteins (α and β), membrane and cell surface proteins, and small proteins. The domain is the base unit of the SCOP hierarchy, and proteins with multiple domains may appear at different places in the hierarchy. SCOP is at http://scop.mrc-lmb.cam.ac.uk/scop/.

An Example of the SCOP Hierarchy: SCOP fold definition: same major secondary structures, same arrangement, and same topology (Bourne, 2004).

CATH: Classification hierarchy: Class (C) → Architecture (A) → Topology (T) → Homologous superfamily (H). Based on secondary structure content (for C), literature (for A), structure connectivity and general shape (for T, using the SSAP algorithm), and sequence similarity (for H). Multi-domain proteins are partitioned into their constituent domains before classification. CATH is at http://www.biochem.ucl.ac.uk/bsm/cath/.

An Example of the CATH Hierarchy: CATH classes: mainly α, mainly β, mixed α and β, and few secondary structures (Pevsner, 2003).
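The four-level CATH hierarchy maps naturally onto a dotted numeric code, one number per level. The sketch below splits such a code into its C, A, T and H parts and lists the path from class down to homologous superfamily. The names `CathCode`, `parse_cath_code` and `lineage` are hypothetical, and the numbering of the four classes in `CATH_CLASSES` (1 to 4, in the order listed on the slide) is an assumption based on CATH's usual convention. The code string in the usage note is illustrative input only.

```python
from typing import List, NamedTuple

# Assumed class numbering, following the order of the classes listed above.
CATH_CLASSES = {
    1: "mainly alpha",
    2: "mainly beta",
    3: "mixed alpha and beta",
    4: "few secondary structures",
}


class CathCode(NamedTuple):
    cath_class: int
    architecture: int
    topology: int
    homologous_superfamily: int


def parse_cath_code(code: str) -> CathCode:
    """Split a dotted C.A.T.H code into its four integer levels."""
    parts = code.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated numbers, got {code!r}")
    c, a, t, h = (int(p) for p in parts)
    return CathCode(c, a, t, h)


def lineage(code: str) -> List[str]:
    """Return the codes of every level from class down to the full code."""
    parts = code.split(".")
    return [".".join(parts[:k]) for k in range(1, len(parts) + 1)]
```

For an input such as "3.40.50.720", `parse_cath_code` returns class 3, which `CATH_CLASSES` labels as mixed alpha and beta, and `lineage` returns the four nested codes from "3" to "3.40.50.720".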

Summary: Protein structures are important for addressing many biological questions. The Protein Data Bank (PDB) is the primary repository for protein structures. Powerful software tools (e.g., RasMol) are available for viewing 3-D protein structures. SCOP and CATH are two manually curated databases for structure classification. Next: structure alignment and prediction.