Protein domains Miguel Andrade Mainz, Germany Faculty of Biology,

Slides:



Advertisements
Similar presentations
Protein Structure.
Advertisements

Integration of Protein Family, Function, Structure Rich Links to >90 Databases Value-Added Reports for UniProtKB Proteins iProClass Protein Knowledgebase.
Chimera. Chimera 1/3 Starting Chimera Open Chimera from desktop (ZDV app) (If there is an update it will take a minute or two) Open a 3D structure by.
©CMBI 2005 Exploring Protein Sequences - Part 2 Part 1: Patterns and Motifs Profiles Hydropathy Plots Transmembrane helices Antigenic Prediction Signal.
Structural bioinformatics
Homology 3D modeling and effect of mutations. X-ray crystallography (70,714 in PDB) need crystals Nuclear Magnetic Resonance (NMR) (9,312) proteins in.
Secondary structure prediction. Amino acid sequence -> Secondary structure Alpha helix Beta strand Disordered/coil 70% accuracy 1991, 81% accuracy in.
Structure Prediction. Tertiary protein structure: protein folding Three main approaches: [1] experimental determination (X-ray crystallography, NMR) [2]
Tertiary protein structure viewing and prediction July 5, 2006 Learning objectives- Learn how to manipulate protein structures with Deep View software.
Today’s menu: -UniProt - SwissProt/TrEMBL -PROSITE -Pfam -Gene Onltology Protein and Function Databases Tutorial 7.
Identifying functional residues of proteins from sequence info Using MSA (multiple sequence alignment) - search for remote homologs using HMMs or profiles.
Computational Biology, Part 10 Protein Structure Prediction and Display Robert F. Murphy Copyright  1996, 1999, All rights reserved.
Protein Modules An Introduction to Bioinformatics.
PREDICTION OF PROTEIN FEATURES Beyond protein structure (TM, signal/target peptides, coiled coils, conservation…)
Protein and Function Databases
Introduction to Bioinformatics - Tutorial no. 8 Protein Prediction: - PROSITE - Pfam - SCOP - TOPITS - genThreader.
Detecting the Domain Structure of Proteins from Sequence Information Niranjan Nagarajan and Golan Yona Department of Computer Science Cornell University.
Protein Sequence Analysis - Overview Raja Mazumder Senior Protein Scientist, PIR Assistant Professor, Department of Biochemistry and Molecular Biology.
Genome Evolution: Duplication (Paralogs) & Degradation (Pseudogenes)
Protein structure prediction 29/01/2015 Mail: Prof. Neri Niccolai Simone Gardini
Pattern databasesPattern databasesPattern databasesPattern databases Gopalan Vivek.
PAT project Advanced bioinformatics tools for analyzing the Arabidopsis genome Proteins of Arabidopsis thaliana (PAT) & Gene Ontology (GO) Hongyu Zhang,
Wellcome Trust Workshop Working with Pathogen Genomes Module 3 Sequence and Protein Analysis (Using web-based tools)
Protein Bioinformatics Course
Protein domains. Protein domains are structural units (average 160 aa) that share: Function Folding Evolution Proteins normally are multidomain (average.
Practical session 2b Introduction to 3D Modelling and threading 9:30am-10:00am 3D modeling and threading 10:00am-10:30am Analysis of mutations in MYH6.
Protein 3D-structure analysis Exercises. Practicals Find update frequency for RCSB PDB: weekly. When was the last update? How many protein structures.
BASys: A Web Server for Automated Bacterial Genome Annotation Gary Van Domselaar †, Paul Stothard, Savita Shrivastava, Joseph A. Cruz, AnChi Guo, Xiaoli.
The Pfam and MEROPS databases EMBO course 2004 Robert Finn
Module 3 Sequence and Protein Analysis (Using web-based tools) Working with Pathogen Genomes - Uruguay 2008.
Functional Annotation of Proteins via the CAFA Challenge Lee Tien Duncan Renfrow-Symon Shilpa Nadimpalli Mengfei Cao COMP150PBT | Fall 2010.
MolIDE2: Homology Modeling Of Protein Oligomers And Complexes Qiang Wang, Qifang Xu, Guoli Wang, and Roland L. Dunbrack, Jr. Fox Chase Cancer Center Philadelphia,
Protein Tertiary Structure. Protein Data Bank (PDB) Contains all known 3D structural data of large biological molecules, mostly proteins and nucleic acids:
Protein Sequence Analysis - Overview - NIH Proteomics Workshop 2007 Raja Mazumder Scientific Coordinator, PIR Research Assistant Professor, Department.
Copyright OpenHelix. No use or reproduction without express written consent1.
Exercises Pairwise alignment Homology search (BLAST) Multiple alignment (CLUSTAL W) Iterative Profile Search: Profile Search –Pfam –Prosite –PSI-BLAST.
V diagonal lines give equivalent residues ILS TRIVHVNSILPSTN V I L S T R I V I L P E F S T Sequence A Sequence B Dot Plots, Path Matrices, Score Matrices.
V diagonal lines give equivalent residues ILS TRIVHVNSILPSTN V I L S T R I V I L P E F S T Sequence A Sequence B Dot Plots, Path Matrices, Score Matrices.
Homology 3D modeling and effect of mutations. X-ray crystallography (70,714 in PDB) need crystals Nuclear Magnetic Resonance (NMR) (9,312) proteins in.
EBI is an Outstation of the European Molecular Biology Laboratory. A web based integrated search service to understand ligand binding and secondary structure.
DNA / protein sequence analysis 第九組成員: 吳宇軒 侯卜夫 朱子豪 王俊偉
Homology 3D modeling and effect of mutations Miguel Andrade Faculty of Biology, Johannes Gutenberg University Institute of Molecular Biology Mainz, Germany.
Sequence: PFAM Used example: Database of protein domain families. It is based on manually curated alignments.
Protein domains Miguel Andrade Mainz, Germany Faculty of Biology,
Repeats and composition bias
Homology 3D modeling Miguel Andrade Mainz, Germany Faculty of Biology,
Protein 3D representation
Prediction of protein features. Beyond protein structure
Secondary structure prediction
polyQ and other homorepeats
Protein domains Miguel Andrade Mainz, Germany Faculty of Biology,
Protein 3D representation
Take a REST from manual searching: PDBe, programmatically
Protein Families, Motifs & Domains.
Demo: Protein Information Resource
Getting the Most out of the PDBe
Secondary structure prediction
Protein 3D representation
Protein domains Miguel Andrade Mainz, Germany Faculty of Biology,
Homology 3D modeling and effect of mutations
PIR: Protein Information Resource
Protein Bioinformatics Course
Protein Sequence Analysis - Overview -
Molecular Modeling By Rashmi Shrivastava Lecturer
Protein Sequence Analysis - Overview -
Protein structure prediction.
Homology 3D modeling Miguel Andrade Mainz, Germany Faculty of Biology,
Protein 3D representation
Examining repeats with databases
SUBMITTED BY: DEEPTI SHARMA BIOLOGICAL DATABASE AND SEQUENCE ANALYSIS.
Presentation transcript:

Protein domains Miguel Andrade Mainz, Germany Faculty of Biology, Institute of Organismic Molecular Evolution, Johannes Gutenberg University Mainz, Germany andrade@uni-mainz.de

Introduction Protein domains are structural units (average 160 aa) that share: Function Folding Evolution Proteins normally are multidomain (average 300 aa)

Introduction Protein domains are structural units (average 160 aa) that share: Function Folding Evolution Proteins normally are multidomain (average 300 aa)

Domains Why to search for domains: Protein structural determination methods such as X-ray crystallography and NMR have size limitations that limit their use. Multiple sequence alignment at the domain level can result in the detection of homologous sequences that are more difficult to detect using a complete chain sequence. Methods used to gain an insight into the structure and function of a protein work best at the domain level.

SMART Domain databases Peer Bork http://smart.embl.de/ Manual definition of domain (bibliography) Generate profile from instances of domain Search for remote homologs (HMMer) Include them in profile Iterate until convergence Schultz et al (1998) PNAS … Letunic et al (2017) Nucleic Acids Research

Domain databases

Domain databases SMART

Domain databases SMART SORL_HUMAN

SMART Domain databases Extra features: Signal-peptide, low complexity, TM, coiled coils SORL_HUMAN

Domain databases SMART

Domain databases SMART

Domain databases PFAM Erik Sonnhammer/Ewan Birney/Alex Bateman http://pfam.xfam.org/ Sonnhammer et al (1997) Proteins ... El Gebali et al (2019) Nucleic Acids Research

Domain databases PFAM

Domain databases PFAM Wikipedia rules!

Domain databases PFAM

CDD Domain databases Stephen Bryant http://www.ncbi.nlm.nih.gov/cdd Marchler-Bauer et al (2017) Nucleic Acids Res

Domain databases CDD

Domain databases SORLA/SORL1 from Homo sapiens SMART PFAM CDD

Exercise 1 Examine a UniProt Entry and find related PDBs Let’s see whether human myosin X (UniProt id Q9HD67) or its homologs have a solved structure. Go to PDB’s BLAST page: Menu > Search > Sequences http://www.rcsb.org/pdb/secondary.do?p=v2/secondary/search.jsp#search_sequences Obtain from UniProt the protein sequence “Q9HD67” and paste it the Entry Query Sequence window. In the Choose Search Set you can select the database to search against: select as Database option "Protein Data Bank".

Exercise 1 Examine a UniProt Entry and find related PDBs Paste sequence here

Exercise 1 Examine a UniProt Entry and find related PDBs Hit the “Run Sequence Search” below the input window. Considering that your query was a human myosin X, can you interpret the first three hits? Which part of your query was matched? Which protein was hit in the database? What about the 4th hit? 1st hit 3AU4 1486-2045 / 2nd 3AU5 1486-2045 / 3rd 3PZD 1503-2045 / 4th 5IOH 1-725

Exercise 1 Examine a UniProt Entry and find related PDBs Hit the “Run Sequence Search” below the input window. Considering that your query was a human myosin X, can you interpret the first three hits? Which part of your query was matched? Which protein was hit in the database? What about the 4th hit? Can you find a hit to a protein that is not human myosin X? Which part of your query was matched? 1W9J myosin II Dictyostelium 58-725

Exercise 2 Analyse domain predictions with PFAM Let’s look at the domains predicted for human myosin X. Go to PFAM: http://pfam.xfam.org/ Select the option VIEW A SEQUENCE Type in the window the UniProt id of the protein sequence “Q9HD67” and hit the Go button. Compare the positions of the domains predicted with the ranges of the BLAST matches in PDB from the previous exercise. Which domains were matched in the human myosin X by each of those hits? Myosin_head 65-727 / MyTH4, RA, FERM_M

Exercise 3 Examine domains in Chimera Open the structure of the 3rd hit (3PZD) in Chimera Now colour the fragments corresponding to the PFAM domains MyTH4 (in purple), RAS associated (in orange) and FERM_M (in pink). How do the PFAM annotations fit the structure? How many more domains can you identify visually? Chain B in this structure is a small peptide. Which part of the human myosin X is interacting with this peptide in relation to the domains you have coloured? And what about the glycerol? MyTH4 :1586-1694a / RA :1711-1752a / FERM_M :1793-1958a