

Genomics and Personalized Care in Health Systems
Lecture 9: RNA and Protein Structure
Leming Zhou, PhD
School of Health and Rehabilitation Sciences, Department of Health Information Management

Outline
RNA structure
Protein structure
Pharmacogenomics

Two Types of Genes
Protein-coding genes
– Common patterns: promoter region, start codon, codons, stop codon
– Translated into a protein sequence
RNA genes
– No consistent patterns common to all RNA genes
– Not translated into proteins
– Functional as RNA molecules
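
The start codon / codons / stop codon pattern of protein-coding genes can be illustrated with a small open reading frame (ORF) scan. This is only a minimal sketch, not how real gene finders work: the function name `find_orfs` and the `min_codons` parameter are made up for illustration, only the forward strand is scanned, and promoters, introns, and the reverse strand are ignored.

```python
# Standard stop codons and the usual start codon (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(dna, min_codons=2):
    """Return (start, end, codons) for ORFs on the forward strand.

    An ORF here is a start codon followed in the same reading frame by
    codons up to and including the first stop codon. `end` is exclusive.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):              # three forward reading frames
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] == START_CODON:
                codons = []
                j = i
                while j + 3 <= len(dna):
                    codon = dna[j:j + 3]
                    codons.append(codon)
                    if codon in STOP_CODONS:
                        break
                    j += 3
                # Keep the ORF only if it actually ended on a stop codon.
                if codons[-1] in STOP_CODONS and len(codons) >= min_codons:
                    orfs.append((i, j + 3, codons))
                    i = j + 3           # continue scanning after this ORF
                    continue
            i += 3
    return orfs


if __name__ == "__main__":
    for start, end, codons in find_orfs("CCATGAAATGGTAACC"):
        print(start, end, "-".join(codons))
```

RNA genes have no comparable signal, which is why a simple pattern scan like this does not work for them.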

Types of RNA
mRNA: messenger RNA
tRNA: transfer RNA, which carries an anticodon and delivers the matching amino acid
rRNA: ribosomal RNA, a component of the ribosome, where protein translation takes place
miRNA: microRNAs are small (about 22 nucleotides) non-coding RNA gene products that seem to regulate translation
snRNAs: small nuclear RNAs
– Spliceosomal RNAs, found in the spliceosome, which is involved in splicing
– Small nucleolar RNAs, located in the nucleolus

RNA Genes
RNA has various functions
Software has been developed to search for RNA genes in the genome
– tRNAscan searches for tRNA genes

RNA Databases
Ribosomal RNA database
– Ribosomal Database Project
tRNA databases
– Genomic tRNA Database
snoRNA databases
– Yeast snoRNA Database

Secondary and Tertiary Structure
RNA sequence → RNA structure
– Folding and pairing of bases within the sequence
Canonical pairing: G-C and A-U
– G-C pairing gives more energetic stability (3 hydrogen bonds, versus 2 for A-U)
Non-canonical pairing: G-U (very common), A-C, A-G, etc.
Double-stranded regions and loop regions are the secondary structure elements
Tertiary structure is the interaction between secondary structure elements

RNA Secondary Structure
For RNAs, secondary structures are conserved, but primary sequences are not necessarily conserved

RNA Structure Prediction Methods
Sequence and base-pairing patterns
Energy minimization
– Find the energetically most stable structure
– Energy calculations are based on base pairings
– Candidate structures can be sampled with the Monte Carlo method
– Zuker and Stiegler (1981) used dynamic programming and energy rules to find the energetically most favorable structure
– Mfold is software developed by Zuker and co-workers. It is computationally expensive and can be used on sequences of up to about 1000 nucleotides.
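
The dynamic programming idea can be sketched with the simpler Nussinov recursion, which maximizes the number of base pairs instead of minimizing free energy. This is explicitly not the Zuker-Stiegler algorithm and not what mfold computes: there are no energy rules here, and the function name `nussinov` and the `min_loop` parameter (minimum number of unpaired bases inside a hairpin) are assumptions made for illustration.

```python
# Allowed pairs: canonical G-C and A-U plus the common G-U wobble pair.
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def nussinov(seq, min_loop=3):
    """Return (max_pairs, dot_bracket) for an RNA sequence.

    best[i][j] holds the maximum number of nested base pairs in seq[i..j].
    """
    seq = seq.upper().replace("T", "U")   # accept DNA-alphabet input
    n = len(seq)
    if n == 0:
        return 0, ""
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]                    # case: j is unpaired
            for k in range(i, j - min_loop):          # case: j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score

    # Traceback: recover one optimal structure in dot-bracket notation.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                left = best[i][k - 1] if k > i else 0
                if best[i][j] == left + 1 + best[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return best[0][n - 1], "".join(structure)


if __name__ == "__main__":
    pairs, fold = nussinov("GGGAAAUCCC")
    print(pairs)
    print(fold)
```

The triple loop makes the running time grow with the cube of the sequence length, which gives a feel for why folding long sequences is expensive.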

Exercises
Use mfold to predict the secondary structure of an RNA sequence (written here in the DNA alphabet, with T in place of U):
GTTTCCGTAGTGTAGTGGTTATCACGTTCGCCTCACACGCGAAAGGTCCCCGGTTCGAAACCGGGCGGAAACA

Protein Structure

Four Levels of Protein Structure
Primary structure: sequence of amino acids
Secondary structure: local structure such as alpha-helices and beta-sheets
Tertiary structure: arrangement of the secondary structural elements that gives a protein its 3D structure
Quaternary structure: arrangement of the subunits that gives a protein complex its 3D structure

Protein Basic Structure
A protein is made of a chain of amino acids
An amino acid sequence is generally reported from the N-terminal end to the C-terminal end
J. Biol. Chem. 1973, 248, p. 7670

Secondary Structure (Helices)

Helix Examples

Secondary Structure (Beta-sheets)

Beta Sheet Examples
Parallel beta sheet
Anti-parallel beta sheet

Beta Sheet Examples (Cont’d)

Protein Structure Example
ID: 12as, 2 chains
Labeled elements: beta sheet, helix, loop

Protein Classification

Domain and Motif
Domain: a discrete portion of a protein assumed to fold independently of the rest of the protein and to possess its own function
– Most proteins have multiple domains
Motif: a frequently occurring structural pattern shared among multiple proteins

Protein Classification
Family: the proteins in the same family are homologous, evolved from the same ancestor. Usually, the sequence identity between two members is very high.
Superfamily: distantly homologous sequences, evolved from the same ancestor. Sequence identity is around 25%-30%.
Fold: only the shapes are similar, with no homologous relationship. Usually, sequence identity is very low.
Protein classification databases: SCOP, CATH

SCOP
The SCOP database aims to provide a detailed and comprehensive description of the structural and evolutionary relationships between all proteins whose structure is known.
Proteins are classified to reflect both structural and evolutionary relatedness.
– Many levels exist in the hierarchy
– The principal levels are family, superfamily, and fold

CATH
CATH is a novel hierarchical classification of protein domain structures, which clusters proteins at four major levels:
– Class
– Architecture
– Topology
– Homologous superfamily

CATH: Protein Structure Classification
Figure levels: Class, Architecture, Topology

Protein Structure Determination

Experimental Methods for Protein Structure Determination
X-ray crystallography
– Crystallize proteins
– Measure the X-ray diffraction pattern
NMR spectroscopy
– Use nuclear magnetic resonance to estimate distances between different functional groups in a protein in solution
– Calculate possible structures using these distances
Neutron diffraction
Electron microscopy
Atomic force microscopy

Limitations of Experimental Methods
X-ray diffraction
– Only a small number of proteins can be made to form crystals
– A crystal is not the protein’s native environment
– Very time-consuming
NMR distance measurement
– Not all proteins are found in solution
– This method generally looks at isolated proteins rather than protein complexes
– Very time-consuming

Computational Structure Prediction
The function of a protein is determined by its structure.
Experimental methods to determine protein structure are time-consuming and expensive.
There is a big gap between the number of available protein sequences and the number of known structures.

Observations
Sequences determine structures.
Proteins fold into a minimum-energy state.
Structures are more conserved than sequences.
If two protein sequences share 30% identical residues, they have a very good chance of having the same fold.

Prediction Methods
Ab initio folding: build a structure without referring to an existing structure
Homology modeling: sequence-based method
Protein threading: sequence-structure alignment
Consensus method: vote on a prediction from candidates generated by several prediction programs
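
The voting idea behind the consensus method can be sketched on a deliberately simplified case: several programs each emit a per-residue secondary structure string, and the majority label wins at every position. Real consensus servers compare whole 3D models, so this is only an illustration of the voting step. The function name `consensus` and the tie-breaking rule (the earliest-listed predictor wins a tie) are assumptions, and the input strings are made-up data.

```python
from collections import Counter


def consensus(predictions):
    """Majority vote per position over equal-length prediction strings."""
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("need at least one prediction, all of equal length")
    result = []
    for column in zip(*predictions):       # one column per residue
        counts = Counter(column)
        top = max(counts.values())
        # Ties go to the label proposed by the earliest-listed predictor.
        winner = next(state for state in column if counts[state] == top)
        result.append(winner)
    return "".join(result)


if __name__ == "__main__":
    # Hypothetical outputs of three prediction programs (H, E, L states).
    print(consensus(["HHHEL", "HHLEL", "HELEE"]))
```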

Ab Initio Folding
Based on first principles
Builds structures purely from protein sequences; no templates are used
Computing demands are prohibitive
The paradigm is changing: knowledge-based methods have been proposed

Secondary Structure Prediction
Three-state model: helix (H), strand (E), coil (L)
Given a protein sequence:
– NWVLSTAADMQGVVTDGMASGLDKD…
Predict the secondary structure sequence:
– LLEEEELLLLHHHHHHHHHHLHHHL…
– Accuracy: 50-85%
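
Accuracy for a three-state prediction is commonly reported as the percentage of residues whose state is predicted correctly (often called Q3). The slide does not say which measure its 50-85% figure uses, so treating it as Q3 is an assumption. In this sketch the function name `q3_accuracy` is illustrative and the "observed" string is invented purely as example data, not the real structure of any protein.

```python
def q3_accuracy(predicted, observed):
    """Percentage of positions where the predicted state equals the observed one."""
    if not predicted or len(predicted) != len(observed):
        raise ValueError("strings must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(predicted)


if __name__ == "__main__":
    predicted = "LLEEEELLLLHHHHHHHHHHLHHHL"          # prediction from the slide
    # Hypothetical reference states, built piecewise to keep the length visible.
    observed = "LL" + "E" * 5 + "LLL" + "H" * 13 + "LL"
    print(f"Q3 = {q3_accuracy(predicted, observed):.1f}%")
```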

Predict Protein Secondary Structure Using PredictProtein
Protein sequence:
>gi| |ref|NP_ | unknown protein; protein id: At1g [Arabidopsis thaliana]
MPSESSYKVHRPAKSGGSRRDSSPDSIIFTPESNLSLFSSASVSVDRCSSTSDAHDRDD
SLISAWKEEFEVKKDDESQNLDSARSSFSVALRECQERRSRSEALAKKLDYQRTVSLDL
SNVTSTSPRVVNVKRASVSTNKSSVFPSPGTPTYLHSMQKGWSSERVPLRSNGGRSPPN
AGFLPLYSGRTVPSKWEDAERWIVSPLAKEGAARTSFGASHERRPKAKSGPLGPPGFAY
YSLYSPAVPMVHGGNMGGLTASSPFSAGVLPETVSSRGSTTAAFPQRIDPSMARSVSIH
GCSETLASSSQDDIHESMKDAATDAQAVSRRDMATQMSPEGSIRFSPERQCSFSPSSPS
PLPISELLNAHSNRAEVKDLQVDEKVTVTRWSKKHRGLYHGNGSKM
PredictProtein web server:

Read the Results

Evolutionary Methods
Taking related sequences into account helps identify “structurally important” residues.
Algorithm:
– Find similar sequences
– Construct a multiple alignment
– Use the alignment profile for secondary structure prediction
Additional information used for prediction
– Mutation statistics
– Residue position in sequence
– Sequence length
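
An alignment profile can be as simple as the residue frequencies in each column of the multiple alignment; highly conserved columns then point to candidate "structurally important" positions. The sketch below assumes the alignment has already been built. The function names, the `-` gap symbol, and the 0.8 conservation threshold are illustrative choices, and the toy alignment is made-up data.

```python
from collections import Counter


def column_profiles(alignment, gap="-"):
    """Return one {residue: frequency} dict per alignment column (gaps ignored)."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("need at least one row, all of equal length")
    profiles = []
    for column in zip(*alignment):
        residues = [c for c in column if c != gap]
        counts = Counter(residues)
        total = len(residues)
        profiles.append({aa: n / total for aa, n in counts.items()} if total else {})
    return profiles


def conserved_positions(profiles, threshold=0.8):
    """Indices of columns whose most frequent residue reaches the threshold."""
    return [i for i, p in enumerate(profiles) if p and max(p.values()) >= threshold]


if __name__ == "__main__":
    toy_alignment = ["MKVLA", "MRVLG", "MKV-A", "MKILA"]   # hypothetical sequences
    profiles = column_profiles(toy_alignment)
    print(profiles[1])
    print(conserved_positions(profiles))
```

Profile-based predictors feed frequencies like these, rather than a single sequence, into the secondary structure prediction step.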

Sequence Similarity Methods for Structure Prediction
These methods can be very accurate if there is >50% sequence similarity
They are rarely accurate if the sequence similarity is <30%
They use methods similar to those used for sequence alignment, such as the dynamic programming algorithm, hidden Markov models, and clustering algorithms.
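
The dynamic programming algorithm mentioned here can be sketched as a basic Needleman-Wunsch global alignment, followed by a percent identity calculation that could be compared against the 50% and 30% thresholds. The scoring values (match +1, mismatch -1, gap -2) are arbitrary illustrative choices rather than a substitution matrix such as BLOSUM. Identity is computed over aligned, non-gap columns, which is one convention among several.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment; returns (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a

    # Traceback from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), score[n][m]


def percent_identity(aligned_a, aligned_b):
    """Identical residues as a percentage of aligned (non-gap) columns."""
    pairs = [(p, q) for p, q in zip(aligned_a, aligned_b) if p != "-" and q != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(p == q for p, q in pairs) / len(pairs)


if __name__ == "__main__":
    x, y, s = needleman_wunsch("ACGT", "AGT")
    print(x)
    print(y)
    print(s, percent_identity(x, y))
```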