Basic bioinformatics tools for studying proteins Dong Xu Computer Science Department C. S. Bond Life Sciences Center University of Missouri, Columbia


Introduction
• Broaden knowledge for undergraduate education
• Many opportunities for biomedical- and agriculture-related jobs
• Practice basic protein tools:
  - Useful for biological studies
  - Intellectually stimulating
• Dong's picks for beginners:
  - Not necessarily the most accurate tools
  - Easy to use and understand
  - Very popular

Proteins – Some Basics
• What is a protein?
  - Linear sequence of amino acids...
• What is an amino acid?

20 Amino Acids
Glycine (G), Glutamic acid (E), Aspartic acid (D), Methionine (M), Threonine (T), Serine (S), Glutamine (Q), Asparagine (N), Tryptophan (W), Phenylalanine (F), Cysteine (C), Proline (P), Leucine (L), Isoleucine (I), Valine (V), Alanine (A), Histidine (H), Lysine (K), Tyrosine (Y), Arginine (R)
Color key: White: Hydrophobic, Green: Hydrophilic, Red: Acidic, Blue: Basic

• Amino acids connect via peptide bonds
[Figure: amino acids (AA) joined by peptide bonds, shown for the chain F-N-G-G-S-T-S-D-K]
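The one-letter codes on the amino acid slide can be used to spell out a short peptide such as the F N G G S T S D K chain above. The following is a minimal sketch; the names `ONE_LETTER` and `spell_out` are illustrative and do not come from any tool mentioned in the slides.

```python
# One-letter codes and names, as listed on the "20 Amino acids" slide.
ONE_LETTER = {
    "G": "Glycine", "E": "Glutamic acid", "D": "Aspartic acid",
    "M": "Methionine", "T": "Threonine", "S": "Serine",
    "Q": "Glutamine", "N": "Asparagine", "W": "Tryptophan",
    "F": "Phenylalanine", "C": "Cysteine", "P": "Proline",
    "L": "Leucine", "I": "Isoleucine", "V": "Valine",
    "A": "Alanine", "H": "Histidine", "K": "Lysine",
    "Y": "Tyrosine", "R": "Arginine",
}


def spell_out(peptide):
    """Return the full amino acid names for a one-letter peptide string."""
    return [ONE_LETTER[residue] for residue in peptide]


if __name__ == "__main__":
    # The example chain from the peptide bond slide.
    for name in spell_out("FNGGSTSDK"):
        print(name)
```

An unknown letter raises a `KeyError`, which is a deliberate choice here so that typos in a sequence are noticed rather than silently skipped.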

An Overview
• A protein folds into a unique 3D structure under physiological conditions
Lysozyme sequence (129 amino acids):
KVFGRCELAA AMKRHGLDNY RGYSLGNWVC AAKFESNFNT QATNRNTDGS
TDYGILQINS RWWCNDGRTP GSRNLCNIPC SALLSSDITA SVNCAKKIVS
DGNGMNAWVA WRNRCKGTDV QAWIRGCRL
[Figure labels: protein backbone; side chain]
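A quick way to get a feel for a sequence like this is to check its length and residue composition. The sketch below uses only the Python standard library; the constant and function names are illustrative choices, not part of any tool in this presentation.

```python
from collections import Counter

# Lysozyme sequence from the slide, in blocks of ten residues.
LYSOZYME = (
    "KVFGRCELAA" "AMKRHGLDNY" "RGYSLGNWVC" "AAKFESNFNT" "QATNRNTDGS"
    "TDYGILQINS" "RWWCNDGRTP" "GSRNLCNIPC" "SALLSSDITA" "SVNCAKKIVS"
    "DGNGMNAWVA" "WRNRCKGTDV" "QAWIRGCRL"
)


def composition(sequence):
    """Count how often each one-letter residue code occurs."""
    return Counter(sequence)


if __name__ == "__main__":
    print("Length:", len(LYSOZYME))
    # List residues from most to least frequent.
    for residue, count in composition(LYSOZYME).most_common():
        print(residue, count)
```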

Primary, Secondary and Tertiary Structures of Proteins

Protein Structure Representations
Lysozyme structure shown as: ball & stick, strand, surface

Structure Visualization
• RasMol
• MDL Chime (plug-in)
• Protein Explorer
• Jmol
• PyMOL
• VMD

Sequence Homology Software
• NCBI-BLAST
• Comparing 2 (pairwise) or more (multiple) sequences.
• Searching for a series of identical or similar characters in the sequences.

VLSPADKTNVKAAWAKVGAHAAGHG
||| |    |   |||| |  ||||
VLSEAEWQLVLHVWAKVEADVAGHG
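The bars between the two sequences in the example mark positions where the residues are identical. As a minimal sketch of that idea (not how BLAST itself works, which also searches for the alignment and scores similar residues), the match line and percent identity of two already-aligned sequences can be computed like this; the function names are illustrative.

```python
def match_line(seq_a, seq_b):
    """Return a string with '|' where two aligned sequences are identical."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    return "".join(
        "|" if a == b and a != "-" else " "
        for a, b in zip(seq_a, seq_b)
    )


def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions holding identical residues."""
    return 100.0 * match_line(seq_a, seq_b).count("|") / len(seq_a)


if __name__ == "__main__":
    top = "VLSPADKTNVKAAWAKVGAHAAGHG"
    bottom = "VLSEAEWQLVLHVWAKVEADVAGHG"
    print(top)
    print(match_line(top, bottom))
    print(bottom)
    print("Identity: %.1f%%" % percent_identity(top, bottom))
```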

Typical BLAST Output

InterPro Scan

InterPro Scan PCNA

MyHits Local Motifs Search

MyHits Local Motifs Summary

MyHits Local Motif Hits

Multiple Alignment
VTISCTGSESNIGAG-NHVKWYQQLPG
VTISCTGTESNIGS--ITVNWYQQLPG
LRLSCSSSDFIFSS--YAMYWVRQAPG
LSLTCTVSETSFDD--YYSTWVRQPPG
PEVTCVVVDVSHEDPQVKFNWYVDG--
ATLVCLISDFYPGA--VTVAWKADS--
AALGCLVKDYFPEP--VTVSWNSG---
VSLTCLVKEFYPSD--IAVEWWSNG--
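One thing a multiple alignment makes visible is which columns are identical in every sequence. The sketch below, using the eight aligned sequences from this slide, reports those fully conserved columns. It is only an illustration of the idea: real tools such as ClustalW also score partial conservation, and the function name is an illustrative choice.

```python
# The eight aligned sequences from the slide ('-' marks a gap).
ALIGNMENT = [
    "VTISCTGSESNIGAG-NHVKWYQQLPG",
    "VTISCTGTESNIGS--ITVNWYQQLPG",
    "LRLSCSSSDFIFSS--YAMYWVRQAPG",
    "LSLTCTVSETSFDD--YYSTWVRQPPG",
    "PEVTCVVVDVSHEDPQVKFNWYVDG--",
    "ATLVCLISDFYPGA--VTVAWKADS--",
    "AALGCLVKDYFPEP--VTVSWNSG---",
    "VSLTCLVKEFYPSD--IAVEWWSNG--",
]


def conserved_columns(alignment):
    """Return (1-based column, residue) pairs identical in all sequences."""
    conserved = []
    for index, column in enumerate(zip(*alignment)):
        residues = set(column)
        # A column counts only if it holds one residue type and no gaps.
        if len(residues) == 1 and "-" not in residues:
            conserved.append((index + 1, column[0]))
    return conserved


if __name__ == "__main__":
    for position, residue in conserved_columns(ALIGNMENT):
        print("column", position, "is always", residue)
```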

Phylogeny Tree
A multiple protein sequence alignment reveals conserved sites (and hence possibly functional sites) and provides the basis for a phylogenetic tree.

MSA with ClustalW ClustalW:

Cell localization

Typical Sorting Signals
Signal function: Example
Import into nucleus: -P-P-K-K-K-R-K-V-
Export from nucleus: -L-A-L-K-L-A-G-L-D-I-
Import into mitochondria: <-MLSLRQSIRFFKPATRTLCSSRYLL-
Import into plastid: <-MVAMAMASLQSSMSSLSLSSNSFLGQPLSPITLSPFLQG-
Import into peroxisomes: -S-K-L->
Import into ER: <-MMSFVSLLLVGILFWATEAEQLTKCEVFN-
Return to ER: -K-D-E-L->
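Some of the short signals in this table can be looked for with plain string matching. The sketch below checks a sequence for the two C-terminal examples (K-D-E-L and S-K-L) and the two nuclear examples. It is a deliberately crude illustration under the assumption that the example signals are matched exactly; predictors such as PSORT and TargetP use far richer models, and the names and test sequences here are hypothetical.

```python
# Example signals taken from the sorting signal table.
C_TERMINAL_SIGNALS = {
    "KDEL": "return to ER",
    "SKL": "import into peroxisomes",
}
INTERNAL_SIGNALS = {
    "PPKKKRKV": "import into nucleus",
    "LALKLAGLDI": "export from nucleus",
}


def find_sorting_signals(sequence):
    """Return (motif, function) pairs for exact matches of the examples."""
    hits = []
    # C-terminal signals must sit at the very end of the sequence.
    for motif, function in C_TERMINAL_SIGNALS.items():
        if sequence.endswith(motif):
            hits.append((motif, function))
    # Nuclear signals may occur anywhere in the sequence.
    for motif, function in INTERNAL_SIGNALS.items():
        if motif in sequence:
            hits.append((motif, function))
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequences, made up for this illustration.
    print(find_sorting_signals("MAAAPPKKKRKVGGS"))
    print(find_sorting_signals("MGGSAAKDEL"))
```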

Localizations
Cell localization
• PSORT
• TargetP
Signal peptide
• SignalP

SignalP result

Membrane Bilayer with Proteins

Helix Bundle TM Proteins
[Figure: structures PDB = 1QHJ and PDB = 1RRC]
Single helix or helical bundles (> 90% of TM proteins)
Examples:
• Human growth hormone receptor, Insulin receptor
• ATP binding cassette family - CFTR
• Multidrug resistance proteins
• 7TM receptors - G protein-linked receptors

Beta Barrel TM Proteins

Transmembrane Prediction (alpha) (beta)

Secondary Structure Prediction
• SSpro 4.1
• PSI-PRED
• SAM
• PHD

Coiled coil prediction
bin/npsa_automat.pl?page=/NPSA/npsa_lupas.html

Special motif prediction
• Helix-turn-helix motif prediction: bin/npsa_automat.pl?page=/NPSA/npsa_hth.html
• Kinase related motifs
• Leucine Zippers

Protein disorder prediction
• PreDisorder
• A collection of disorder predictors

2D: Contact Map Prediction
[Figure: a 3D structure and its 2D contact map, an n × n matrix indexed by residues i and j (1 ... n). Distance threshold = 8 Å]
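A contact map can be derived from a known 3D structure by marking every residue pair whose distance falls within the threshold. The following is a minimal sketch under two assumptions that the slide does not state: each residue is represented by a single point (for example its C-alpha atom), and a distance exactly equal to the threshold counts as a contact. The coordinates are made up for illustration.

```python
import math


def contact_map(coords, threshold=8.0):
    """Return an n x n matrix with 1 where two residues are in contact.

    coords: one (x, y, z) point per residue, in angstroms.
    threshold: contact distance in angstroms (8 A, as on the slide).
    """
    n = len(coords)
    return [
        [1 if math.dist(coords[i], coords[j]) <= threshold else 0
         for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Hypothetical toy chain: four residues on a line, 3.8 A apart.
    toy = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
    for row in contact_map(toy):
        print(row)
```

Contact prediction tools address the reverse problem: they estimate this matrix from the sequence alone, without the coordinates.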

Contact Prediction
• SVMcon
• NNcon
• SCRATCH
• SAM: http://compbio.soe.ucsc.edu/HMM-apps/HMM-applications.html

Structure Comparison Visualize structure alignment using VAST: Two ferredoxins, 1DOI and 1AWD, are aligned structurally, showing an insertion in 1DOI that contains potassium-ion binding sites. This may be the result of adaptations to the high salt environment of the Dead Sea.

Structure Alignment Tools
• CE
• DALI
• TM-Align

Structure-Based Search Comparing a query protein structure against all the structures in the PDB The DALI server: When new structures are solved, researchers often submit them to the DALI server to find structural neighbors and their alignments.

Swiss Model: Comparative Modeling Server

Protein Structure Homology Modeling: Modeller

Analysis software
• PROCHECK
• WHATCHECK
• Suite Biotech
• PROSA

Entrez Databases

Design Program
• DEZYMER (Hellinga)
  - Given a ligand and a protein with known structure, suggest residues to be mutated so that the resulting protein binds the ligand.
• ORBIT (Mayo)
  - Given a backbone structure, design a sequence such that it folds to that backbone.
• Rosetta (Baker)
  - One program to treat diverse problems
  - Prediction and design

DEZYMER
1. Define the expected binding geometry
2. Find backbone positions where, if appropriate side chains are added, the predefined geometry is satisfied
3. Place the side chains and ligand, and optimize their positions
4. Repack residues in positions other than the binding residues. If necessary, change the residue type
Hellinga and Richards, JMB, Construction of new ligand binding sites in proteins of known structure

ORBIT
1. Divide the target structure into three parts: core, surface and boundary
2. Core: Ala, Val, Leu, Ile, Phe, Tyr, Trp
   Surface: Ala, Ser, Thr, His, Asp, Asn, Glu, Gln, Lys, and Arg
   Boundary: union of the above two
3. On the order of 10^27 possible sequences
4. Select the best sequence efficiently, using dead-end elimination (DEE)
[Figure: solution structure of the designed protein. Comparison between the designed backbone (averaged NMR structure, blue) and the target backbone (red). Stereoview showing the best-fit superposition of the ...]
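The residue sets for the three parts explain why the number of candidate sequences grows so quickly and why an efficient search is needed. The sketch below builds the boundary set as the union of the core and surface sets and multiplies the choices per position. The position counts passed to `sequence_space` are hypothetical inputs for illustration; the slide does not give the actual counts, so this does not reproduce its 10^27 figure.

```python
# Residue sets for each part, as listed on the ORBIT slide.
CORE = {"Ala", "Val", "Leu", "Ile", "Phe", "Tyr", "Trp"}
SURFACE = {"Ala", "Ser", "Thr", "His", "Asp", "Asn", "Glu", "Gln", "Lys", "Arg"}
BOUNDARY = CORE | SURFACE  # union of the two sets; Ala appears in both


def sequence_space(n_core, n_surface, n_boundary):
    """Number of sequences when each position picks freely from its set."""
    return (
        len(CORE) ** n_core
        * len(SURFACE) ** n_surface
        * len(BOUNDARY) ** n_boundary
    )


if __name__ == "__main__":
    print("Boundary choices per position:", len(BOUNDARY))
    # Hypothetical position counts, chosen only to show the growth.
    print("Sequences for 5 core, 10 surface, 5 boundary positions:",
          sequence_space(5, 10, 5))
```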

Calciomics
• Calciomics is a specialized area of biochemistry focusing on the study of calcium-binding biological macromolecules and proteins, to understand the factors that contribute to calcium-binding affinity, the selectivity of proteins, and calcium-dependent conformational change.

[Figure: toolkit flowchart with three stages: sequence analysis and processing; structure prediction and evaluation; function inference. Boxes, in the order they appear:]
• Original sequence
• SOSUI: remove transmembrane regions
• SignalP: remove signal region
• Coiled coils; remove disorder regions
• Modified sequences
• ProDom: set of domain sequences
• SSP: secondary structure prediction
• PSI-BLAST iterations: analysis of E-value, set of profile sequences; STOP if homolog found in PDB
• PROSPECT
• MODELLER / Jackal
• WHATIF / PROCHECK: evaluate & adjust alignments
• 3D model
• Function annotation
• SWISS-PROT annotation
• PFAM: family classification
• Motif: active sites
• PSORT: subcellular location
• Enzyme structure DB
• Medline: literature search

Summary
• Practice 10 selected tools
• Help answer the question: what does this protein do?
• Collaborate with experimentalists
• Find more tools at:

Acknowledgments
This file is for educational purposes only. Some materials (including pictures and text) were taken from the public domain on the Internet.