Protein-DNA interactions: amino acid conservation and the effects of mutations on binding specificity Nicholas M. Luscombe and Janet M. Thornton JMB (2002)

Protein-DNA interactions: amino acid conservation and the effects of mutations on binding specificity Nicholas M. Luscombe and Janet M. Thornton JMB (2002) 320,

Outline
- Background
- Methods and Tools
- Results
- Discussions

Background
- DNA-binding proteins have a central role in all aspects of genetic activity within an organism: transcription, packaging, rearrangement, replication and repair.
- It is therefore of great importance to understand the nature of the interactions between proteins and DNA.

Previous Methods
- Individual structure studies
- Surveys searching for common principles of binding that apply across most or all protein–DNA complexes:
  - atomic contacts between amino acid residues and bases
  - secondary structural elements and small structural motifs
  - whole-protein structure interactions

Existing Conclusions
- There is no simple code relating amino acid sequence to the DNA sequence it binds.
- Detailed rules for DNA-sequence recognition are best understood within the context of individual protein families.
  - There are nevertheless strong underlying trends, e.g. arginine–guanine.

This Paper
- The first global analysis of the conservation of amino acid sequences in DNA-binding proteins:
  - to see whether amino acid residues that interact with DNA are better conserved
  - to assess the effect that amino acid mutations have on binding specificity

Methods and Tools
1. Select 240 protein–DNA complexes (resolution 3.0 Å or better) from the PDB.
2. Classify them into structural families by pairwise SSAP comparison (54 families).
3. Build a structural multiple alignment of the members of each family with the CORA program suite.
4. Identify distinct DNA-binding domains.
5. Use the HMMER suite to train an HMM sequence template for each structural “template”.
6. Use the trained HMMs to search SWISS-PROT.

Methods and Tools (continued)
7. Discard non-DNA-binding proteins and collapse sets with greater than 95% sequence identity.
8. Build multiple alignments of the selected SWISS-PROT entries via HMMER.
9. Score amino acid conservation via the PET91 matrix, on a scale from 0 (unconserved) to 100 (conserved).
10. Identify surface residues via NACCESS.
11. Identify DNA-binding positions via HBPLUS.
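Steps 7 and 9 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the slides do not say how identity is measured or how a column's score is derived from PET91, so the sketch assumes identity over ungapped aligned columns and a mean pairwise substitution score rescaled to 0-100. The `toy_score` function and its residue groups are hypothetical stand-ins for the PET91 matrix, whose values are not reproduced here.

```python
from itertools import combinations

# Hypothetical residue groups; a stand-in for a real matrix such as PET91.
GROUPS = ["KRH", "DE", "AVLIM", "FWY", "STNQ"]

def toy_score(a, b):
    """Toy substitution score: 2 = identical, 1 = same group, 0 = otherwise."""
    if a == b:
        return 2
    for group in GROUPS:
        if a in group and b in group:
            return 1
    return 0

def percent_identity(a, b):
    """Percent identity over columns where both aligned sequences have a residue."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)

def collapse_redundant(seqs, cutoff=95.0):
    """Greedy filter: keep a sequence unless it is more than `cutoff` percent
    identical to a sequence that has already been kept."""
    kept = []
    for name, seq in seqs.items():
        if all(percent_identity(seq, seqs[k]) <= cutoff for k in kept):
            kept.append(name)
    return kept

def column_conservation(column, score, lo=0, hi=2):
    """Mean pairwise score of one alignment column, rescaled to [0, 100].
    `lo` and `hi` are the minimum and maximum values `score` can return."""
    residues = [r for r in column if r != "-"]
    pairs = list(combinations(residues, 2))
    if not pairs:
        return 0.0
    mean = sum(score(a, b) for a, b in pairs) / len(pairs)
    return 100.0 * (mean - lo) / (hi - lo)
```

With a real substitution matrix, `lo` and `hi` would be set to the matrix's extreme values so that the output still spans 0 (unconserved) to 100 (conserved).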

Results
- Main conclusion: the families fall into 3 binding classes (non-specific, highly specific and multi-specific).

Result Statistics

Results (Aligned Positions)

Results Summary
- The average length of a multiple alignment is 138 amino acid positions, including gaps.
- Many more protein residues interact with the DNA backbone than with bases.
- The backbone-to-base ratios are lower for multi-specific and highly specific families, reflecting a greater emphasis on interactions with bases.
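The backbone-to-base ratio mentioned above is a simple count ratio. A minimal sketch, assuming each DNA-contacting position has already been labelled `"backbone"` or `"base"` (the labels and the example data are hypothetical, not taken from the paper):

```python
def backbone_to_base_ratio(contacts):
    """Ratio of backbone-contacting to base-contacting positions.

    `contacts` is an iterable of labels, one per DNA-contacting position,
    each either "backbone" or "base". Returns infinity when no position
    contacts a base.
    """
    labels = list(contacts)
    backbone = sum(1 for c in labels if c == "backbone")
    base = sum(1 for c in labels if c == "base")
    if base == 0:
        return float("inf")
    return backbone / base

# Hypothetical families: a lower ratio means more emphasis on base contacts.
non_specific = ["backbone"] * 6 + ["base"] * 2
highly_specific = ["backbone"] * 6 + ["base"] * 4
print(backbone_to_base_ratio(non_specific), backbone_to_base_ratio(highly_specific))
```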

Results (Conservation)

Analysis
- Amino acids that interact with the DNA are better conserved than those that do not.
- Sequence-specific families place greater emphasis on interactions with DNA bases than non-specific families.
- DNA backbone-contacting positions are well conserved in all families.
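The comparison behind these statements amounts to averaging per-position conservation scores within each contact class. A minimal sketch, assuming conservation scores on the 0-100 scale and a per-position contact label are already available; the function name and data layout are illustrative, not the authors' code:

```python
from statistics import mean

def mean_conservation_by_class(scores, contact_class):
    """Average conservation score per contact class.

    `scores` maps alignment position -> conservation score (0-100).
    `contact_class` maps position -> "backbone" or "base"; positions that
    are missing from it are treated as non-contacting ("none").
    """
    groups = {}
    for pos, score in scores.items():
        label = contact_class.get(pos, "none")
        groups.setdefault(label, []).append(score)
    return {label: mean(values) for label, values in groups.items()}
```

Applied to a real family, a higher mean for the `"backbone"` and `"base"` classes than for `"none"` would correspond to the first statement above.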

About Mutations
- Conservation of base-contacting positions depends on the binding class of the family.
  - For non-specific families, base contacts are invariably in the minor groove.
  - For highly specific families, target-contacting positions are very well conserved.
- Fuzzy recognition allows single proteins to recognize different but related target sequences.
- Members of multi-specific families recognize different DNA sequences by mutating amino acids at base-contacting positions.
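One way to look for the mutations described for multi-specific families is to list the base-contacting alignment columns where family members carry different residues. The sketch below assumes an alignment given as a name-to-sequence mapping and 0-based column indices for base-contacting positions; both the function and the example family are hypothetical.

```python
def variable_base_contacts(alignment, base_positions):
    """Return base-contacting columns whose residues differ among members.

    `alignment` maps member name -> aligned sequence (equal lengths, "-" for gaps).
    `base_positions` is an iterable of 0-based column indices that contact bases.
    The result maps each variable column to its sorted distinct residues.
    """
    variable = {}
    for pos in base_positions:
        residues = {seq[pos] for seq in alignment.values() if seq[pos] != "-"}
        if len(residues) > 1:
            variable[pos] = sorted(residues)
    return variable

# Hypothetical multi-specific family with base contacts at columns 1, 2 and 4.
family = {"memberA": "KRAQS", "memberB": "KRVQS", "memberC": "KRAQT"}
print(variable_base_contacts(family, [1, 2, 4]))
```

Columns reported here are only candidates for specificity-determining positions; relating them to the DNA sequences each member binds is a separate step.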

“Universal” Code (Preferences)

Discussions
- First comprehensive assessment of the level of conservation in DNA-binding proteins.
- Confirms many expectations about the nature of protein–DNA complexes.
- Offers interesting insight into the evolution of divergent binding specificities.

Personal Comments
- “Old” (2002)
- No silver bullet (various families)
- No DNA-side analysis yet
  - Ahmad et al., 2004, Analysis and prediction of DNA-binding proteins and their binding residues based on composition, sequence and structural information, Bioinformatics
  - Ahmad et al., 2008, Protein–DNA interactions: structural, thermodynamic and clustering patterns of conserved residues in DNA-binding proteins, NAR

Thanks