Intersubunit contacts are often facilitated by specificity-determining positions


Intersubunit contacts are often facilitated by specificity-determining positions
Computational identification of protein positions that possibly account for precise recognition of the interaction partner

- Abundance of sequence data
- Little experimental information on protein function => annotation by homology
- Even less information on protein specificity => prediction of specificity-determining positions (SDPs)

SDP (Specificity-Determining Position)
- Alignment position that is conserved within groups of proteins having the same specificity (specificity groups) but differs between them
- An SDP is not equivalent to a functionally important position!
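The definition above can be sketched as a strict yes/no test on a single alignment column. This is only an illustration of the definition, not the SDPpred scoring itself; the function name `is_sdp_like` and the toy columns are hypothetical.

```python
from collections import defaultdict

def is_sdp_like(column, groups):
    """Strict reading of the SDP definition for one alignment column.

    column: residues of the column, one per sequence.
    groups: specificity-group label of each sequence, in the same order.
    """
    residues_by_group = defaultdict(set)
    for residue, group in zip(column, groups):
        residues_by_group[group].add(residue)
    # Conserved within every group: each group shows exactly one residue.
    if not all(len(seen) == 1 for seen in residues_by_group.values()):
        return False
    # Differs between groups: the per-group residues are not all identical.
    consensus = {next(iter(seen)) for seen in residues_by_group.values()}
    return len(consensus) > 1

groups = ["A", "A", "B", "B"]
print(is_sdp_like("KKRR", groups))  # conserved within groups, differs between
print(is_sdp_like("KKKK", groups))  # fully conserved column, not an SDP
```

The second call illustrates the last bullet: a fully conserved column may well be functionally important, yet it does not discriminate between specificity groups.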

What can we infer from SDPs?
- Targets for protein functional redesign
- Specificity signature
- Sites of protein-protein interaction

Talk overview
- SDPpred, an algorithm for identification of SDPs
- A studied example: isocitrate/isopropylmalate dehydrogenases
- Link to PPI

SDPpred
- Input: multiple protein alignment divided into specificity groups
- Output: SDPs, the positions best discriminating between specificity groups
Example input:
=== AQP ===
%sp|Q9L772|AQPZ_BRUME
mlnklsaeffgtfwlvfggcgsa ilaa--afp elgigflgvalafgltvltmayavggisg--ghfnpavslgltv iiilgsts slap qlwlfwvaplvgavigaiiwkgllgrd …
=== GLP ===
%sp|P11244|GLPF_ECOLI
msqt---stlkgqciaeflgtglliffgvgcv aalkvag a-sfgqweisviwglgvamaiyltagvsg--ahlnpavtialwl glilaltd dgn g-vpr -flvplfgpivgaivgafayrkligrhlpcdicvveek--etttpseqkasl …
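A minimal sketch of reading such an input, assuming the layout suggested by the slide: a `=== NAME ===` line opens a specificity group, a line starting with `%` names a sequence, and the following lines hold its aligned residues. The function name `parse_grouped_alignment` and the toy sequences are hypothetical, and the real SDPpred input rules may differ.

```python
def parse_grouped_alignment(text):
    """Return {group name: {sequence name: aligned sequence}}."""
    groups = {}
    current_group = None
    current_name = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("===") and line.endswith("==="):
            # Group header, e.g. "=== AQP ===".
            current_group = line.strip("=").strip()
            groups.setdefault(current_group, {})
            current_name = None
        elif line.startswith("%"):
            # Sequence name, e.g. "%sp|P11244|GLPF_ECOLI".
            if current_group is None:
                raise ValueError("sequence name before any group header")
            current_name = line[1:].strip()
            groups[current_group][current_name] = ""
        else:
            # Sequence data; internal whitespace is dropped, gaps ("-") are kept.
            if current_name is None:
                raise ValueError("sequence data before any sequence name")
            groups[current_group][current_name] += "".join(line.split())
    return groups

example = """=== AQP ===
%seqA
mlnk-lsa
eff
=== GLP ===
%seqB
msqt---s
"""
print(parse_grouped_alignment(example))
```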

What is in the black box: the algorithm
- Mutual information I_p reflects the extent to which an alignment position p tends to be an SDP.
- Statistical significance of I_p: the expected mutual information I_p^exp of an alignment column, and the Z-score (Mirny & Gelfand, 2002, J Mol Biol, 321(1)).
- Are 5 SDPs with Z-score > 10.5 better than 10 SDPs with Z-score > 9.0? A Bernoulli estimator selects the proper number of SDPs.
- Smoothed amino acid frequencies: a leucine is more a methionine than a valine, and any arginine has a dash of lysine…
Quantities entering the formula for I_p:
- the ratio of occurrences of amino acid α in group i in position p to the height of the alignment column
- the frequency of amino acid α in position p
- the fraction of proteins in group i
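The three quantities listed on this slide are the ingredients of the standard mutual information between residue and group label, I_p = Σ f_p(α, i) · log[ f_p(α, i) / (f_p(α) · f(i)) ], summed over groups i and amino acids α. The sketch below computes that quantity for one column and a Z-score against a shuffled background. The symbol names, the use of raw (unsmoothed) frequencies, and the estimation of I_p^exp by shuffling the column are assumptions made for illustration; the slides do not spell these details out.

```python
import math
import random
from collections import Counter

def mutual_information(column, groups):
    """Mutual information between residues and group labels in one column."""
    n = len(column)
    pair_counts = Counter(zip(column, groups))
    residue_counts = Counter(column)
    group_counts = Counter(groups)
    info = 0.0
    for (residue, group), count in pair_counts.items():
        f_joint = count / n                      # amino acid alpha in group i
        f_residue = residue_counts[residue] / n  # amino acid alpha overall
        f_group = group_counts[group] / n        # fraction of proteins in group i
        info += f_joint * math.log(f_joint / (f_residue * f_group))
    return info

def z_score(column, groups, shuffles=1000, seed=0):
    """Z-score of the observed I_p against shuffled versions of the column."""
    rng = random.Random(seed)
    observed = mutual_information(column, groups)
    shuffled = list(column)
    samples = []
    for _ in range(shuffles):
        rng.shuffle(shuffled)  # breaks any link between residue and group
        samples.append(mutual_information(shuffled, groups))
    mean = sum(samples) / shuffles
    variance = sum((s - mean) ** 2 for s in samples) / shuffles
    if variance == 0:
        return 0.0  # e.g. an invariant column: shuffling changes nothing
    return (observed - mean) / math.sqrt(variance)

groups = ["A"] * 4 + ["B"] * 4
print(mutual_information("KKKKRRRR", groups))  # log(2) for a perfect split
print(z_score("KKKKRRRR", groups))
```

Shuffling keeps the amino acid composition of the column fixed, so a high Z-score reflects the association with the grouping rather than the variability of the column.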

Other similar techniques
- Evolutionary trace (Lichtarge et al. 1996, 1997)
- Evolutionary rate shifts (Gaucher et al. 2002)
- Surface patches of slowly evolving residues (Rate4Site, Pupko et al. 2002)
- PCA in sequence space (Casari et al. 1999, del Sol Mesa et al. 2003)
- Correlated mutations (Pazos and Valencia, 2002)
- Prediction of functional sub-types (Hannenhalli and Russell, 2000) and identification of PSDR (Mirny and Gelfand, 2002)

Special features of SDPpred
- Smoothed amino acid frequencies account for functional (structural, chemical, evolutionary, …) similarities among amino acids
- Automatic cutoff setting -> no prior knowledge about the protein family is needed
- Does not require a 3D structure -> structural data are used solely for interpretation and verification of results
References:
- Kalinina OV, Mironov AA, Gelfand MS, Rakhmaninova AB. (2004) Protein Sci 13(2).
- Kalinina OV, Novichkov PS, Mironov AA, Gelfand MS, Rakhmaninova AB. (2004) Nucl Acids Res 32(Web Server issue).
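The automatic cutoff can be illustrated with a Bernoulli-style estimate. The sketch below is one plausible reading, not confirmed by the slides: rank the Z-scores, and for each k ask how likely it is that at least k of the n columns reach the k-th highest Z-score if all Z-scores were standard normal; the k with the smallest probability sets the cutoff. The function names and the toy Z-scores are hypothetical.

```python
import math

def upper_tail(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def log_prob_at_least(k, n, q):
    """log P(at least k successes in n Bernoulli trials with success probability q)."""
    if q <= 0.0:
        return float("-inf")
    if q >= 1.0:
        return 0.0
    # Work in log space: the tail probabilities of large Z-scores are tiny.
    terms = [
        math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        + i * math.log(q) + (n - i) * math.log1p(-q)
        for i in range(k, n + 1)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def choose_cutoff(z_scores):
    """Return the top-ranked Z-scores whose joint occurrence is least likely by chance."""
    ranked = sorted(z_scores, reverse=True)
    n = len(ranked)
    best_k, best_log_p = 0, 0.0
    for k in range(1, n + 1):
        log_p = log_prob_at_least(k, n, upper_tail(ranked[k - 1]))
        if log_p < best_log_p:
            best_k, best_log_p = k, log_p
    return ranked[:best_k]

scores = [12.0, 11.5, 11.0, 0.3, -0.2, 0.8, -1.1, 0.1, -0.5, 1.0]
print(choose_cutoff(scores))  # the three outlying columns
```

This answers the question posed on the algorithm slide directly: a larger set of slightly lower Z-scores is preferred whenever it is less likely to arise by chance than a smaller set of higher ones.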

Example: isocitrate/isopropylmalate dehydrogenases (IDH/IMDH)
- IDH: catalyzes the oxidation of isocitrate to α-ketoglutarate and CO2 (TCA cycle), using either NAD or NADP as a cofactor, in different organisms from bacteria to higher eukaryotes
- IMDH: catalyzes the oxidative decarboxylation of 3-isopropylmalate into 2-oxo-4-methylvalerate (leucine biosynthesis) in bacteria and fungi

IDH/IMDH: combinations of specificities towards substrate and cofactor
- NAD-dependent IDHs
- NADP-dependent IDHs from bacteria and archaea (type I)
- NADP-dependent IDHs from eukaryota (type II)
- NAD-dependent IMDH
Taxonomic labels in the figure: Mitochondria; Archaea, Bacteria; Eukaryota; Archaea, Bacteria, Eukaryota

IDH/IMDH: selecting specificity groups
1. All NAD-dependent vs. all NADP-dependent
2. All IDHs vs. all IMDHs
3. Four groups: IDH (NAD), IDH (NADP) type I, IDH (NADP) type II, IMDH (NAD)
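The three groupings are different partitions of the same alignment. As a small sketch, each of the four families can be annotated with its substrate and cofactor, and the group labels for each scheme derived from that annotation. The names `FAMILIES` and `group_labels`, and the scheme keys, are hypothetical.

```python
# (substrate specificity, cofactor specificity) of the four families on the slide.
FAMILIES = {
    "IDH (NAD)": ("IDH", "NAD"),
    "IDH (NADP) type I": ("IDH", "NADP"),
    "IDH (NADP) type II": ("IDH", "NADP"),
    "IMDH (NAD)": ("IMDH", "NAD"),
}

def group_labels(families, scheme):
    """Map the family of each sequence to its group label under one grouping scheme."""
    if scheme == "cofactor":   # 1. all NAD-dependent vs. all NADP-dependent
        return [FAMILIES[family][1] for family in families]
    if scheme == "substrate":  # 2. all IDHs vs. all IMDHs
        return [FAMILIES[family][0] for family in families]
    if scheme == "four":       # 3. four groups
        return list(families)
    raise ValueError(f"unknown scheme: {scheme}")

# One family label per aligned sequence (toy example).
sequence_families = ["IDH (NAD)", "IDH (NADP) type I", "IDH (NADP) type II", "IMDH (NAD)"]
for scheme in ("cofactor", "substrate", "four"):
    print(scheme, group_labels(sequence_families, scheme))
```

Running the SDP search once per scheme is what yields the cofactor-specific, substrate-specific, and four-group SDP sets discussed on the following slides.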

IDH/IMDH: predicted SDPs (cofactor-specific)
Figure: NADP-dependent IDH from E. coli (1ai2); labels: substrate, cofactor, subunit I, subunit II, SDPs.

IDH/IMDH: predicted SDPs (substrate-specific)
Figure: NADP-dependent IDH from E. coli (1ai2); labels: substrate, cofactor, subunit I, subunit II, SDPs.

IDH/IMDH: predicted SDPs (four groups)
Figure: NADP-dependent IDH from E. coli (1ai2); labels: substrate, cofactor, subunit I, subunit II, SDPs.

IDH/IMDH: predicted SDPs (overview)

IDH/IMDH: SDPs predicted for different groupings
- All NAD-dependent vs. all NADP-dependent -> cofactor-specific SDPs
- All IDHs vs. all IMDHs -> substrate-specific SDPs
- Four groups
Residues in the diagram: 154Glu, 158Asp, 208Arg, 229His, 231Gly, 233Ile, 287Gln, 300Ala, 305Asn, 308Tyr, 327Asn, 344Lys, 345Tyr, 351Val, 38Gly, 40Asp, 100Lys, 103Leu, 105Thr, 115Asn, 155Asn, 164Glu, 241Phe, 337Ala, 341Thr, 97Val, 98Ala, 104Thr, 107Val, 152Phe, 161Ala, 162Gly, 232Asn, 245Gly, 31Tyr, 323Ala, 36Gly, 45Met
Color code: contacts substrate; contacts cofactor; contacts the other subunit; contacts substrate AND cofactor; contacts substrate AND the other subunit

IDH/IMDH: SDPs in contact with cofactor
Figure: NADP-dependent IDH from E. coli (1ai2); labels: substrate (isocitrate), cofactor (NADP), nicotinamide nucleotide, adenine nucleotide.
- 344Lys, 345Tyr, 351Val: cofactor-specific SDPs, known determinants of cofactor specificity
- 100Lys, 104Thr, 105Thr, 107Val, 337Ala, 341Thr: substrate-specific and four-group SDPs, functionally not characterized

Clusters of SDPs on the intersubunit contact surface in the IDH/IMDH family…
Figure labels: Cluster I, Cluster II.

…and in other protein families
The LacI family of bacterial transcription factors
- Bind specific operator sequences upon interaction with effector molecules, mainly various sugars
Figure: LacI (lactose repressor) from E. coli (1jwl); labels: effector, DNA operator, Cluster I, Cluster II.

Bacterial membrane transporters from the MIP family
- Water and glycerol/water channels
Figure: GlpF (glycerol facilitator) from E. coli (1fx8); labels: Cluster I, Cluster II, subunit I, substrate (glycerol).

Conclusions
- SDPpred is a method for identification of amino acids that account for differences in protein specificity
- Results obtained for several protein families of different functional types agree with structural and experimental data
- A substantial fraction of SDPs are located on the intersubunit contact interface, where they form distinct spatial clusters

Olga V. Kalinina, Pavel S. Novichkov, Andrey A. Mironov, Mikhail S. Gelfand, Aleksandra B. Rakhmaninova
Affiliations:
- Department of Bioengineering and Bioinformatics, Moscow State University, Moscow, Russia
- Institute for Information Transmission Problems RAS, Moscow, Russia
- State Scientific Center GosNIIGenetika, Moscow, Russia
Acknowledgements: Leonid A. Mirny, Olga Laikova, Vsevolod Makeev, Roman Sutormin, Shamil Sunyaev, Aleksey Finkelstein
Thank you!