Protein domains Miguel Andrade Mainz, Germany Faculty of Biology,

Protein domains Miguel Andrade Mainz, Germany Faculty of Biology, Johannes Gutenberg University Institute of Molecular Biology Mainz, Germany andrade@uni-mainz.de

Introduction: Protein domains are structural units (average 160 aa) that share function, folding, and evolution. Proteins are normally multidomain (average 300 aa).

Domains: Why search for domains? (1) Protein structure determination methods such as X-ray crystallography and NMR have size limitations that restrict their use on full-length proteins. (2) Multiple sequence alignment at the domain level can detect homologous sequences that are harder to detect using the complete chain sequence. (3) Methods used to gain insight into the structure and function of a protein work best at the domain level.

Domain databases: SMART (Peer Bork, http://smart.embl.de/). Procedure: (1) manual definition of the domain (bibliography); (2) generate a profile from instances of the domain; (3) search for remote homologs (HMMer); (4) include them in the profile; (5) iterate until convergence. References: Schultz et al (1998) PNAS … Letunic et al (2014) Nucleic Acids Research
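The iterate-until-convergence idea on this slide can be sketched in a few lines of Python. This is only a toy illustration, not SMART's actual pipeline: the function `iterate_profile`, the `toy_search` stand-in, and the `TOY_HOMOLOGS` table are all hypothetical names made up for the example. In a real run the search step would rebuild a profile from the current members and search a sequence database with HMMer.

```python
from typing import Callable, Set

def iterate_profile(seed: Set[str],
                    search: Callable[[Set[str]], Set[str]],
                    max_rounds: int = 20) -> Set[str]:
    """Grow a set of domain instances until a search round finds nothing new."""
    members = set(seed)
    for _ in range(max_rounds):
        hits = search(members)      # stand-in for: build profile, then search
        new = hits - members        # remote homologs not yet in the profile
        if not new:                 # convergence: the profile stopped growing
            break
        members |= new              # include them and go around again
    return members

# Hypothetical toy data: each sequence "detects" the sequences listed for it.
TOY_HOMOLOGS = {"A": {"B"}, "B": {"C"}, "C": set(), "D": {"A"}}

def toy_search(members: Set[str]) -> Set[str]:
    """Return everything detectable from the current members (toy stand-in)."""
    found: Set[str] = set()
    for m in members:
        found |= TOY_HOMOLOGS.get(m, set())
    return found

result = iterate_profile({"A"}, toy_search)
print(sorted(result))
```

Starting from the seed A, the first round adds B and the second adds C, which A alone could not detect. The `max_rounds` cap is a safety choice for the sketch, since a profile that keeps drifting would otherwise never stop.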

Domain databases: SMART. Example: SORL_HUMAN. Extra features: signal peptide, low complexity, TM, coiled coils.

Domain databases: PFAM (Erik Sonnhammer / Ewan Birney / Alex Bateman, http://pfam.xfam.org/). References: Sonnhammer et al (1997) Proteins ... Finn et al (2016) Nucleic Acids Research

Domain databases PFAM Wikipedia rules!

Domain databases: CDD (Stephen Bryant, http://www.ncbi.nlm.nih.gov/cdd). Reference: Marchler-Bauer et al (2015) Nucleic Acids Res

Domain databases: SORLA/SORL1 from Homo sapiens, as annotated by SMART, PFAM, and CDD.

Exercise 1/3: Examine a UniProt entry and find related PDBs. Let’s see whether human myosin X (UniProt id Q9HD67) or its homologs have a solved structure. Go to NCBI’s BLAST page: http://blast.ncbi.nlm.nih.gov/ Click on the “protein blast” link. Type the UniProt id of the protein sequence, “Q9HD67”, in the “Enter Query Sequence” window. Under “Choose Search Set” you can select the database to search against: select "Protein Data Bank" as the Database option.

Exercise 1/3: Examine a UniProt entry and find related PDBs (continued). Hit the BLAST button at the bottom of the page. Considering that your query was a human myosin X, can you interpret the first two hits? Which part of your query was matched? Which protein was hit in the database? What about the 3rd hit?

Exercise 2/3: Analyse domain predictions with PFAM. Let’s look at the domains predicted for human myosin X. Go to PFAM: http://pfam.xfam.org/ Select the option VIEW A SEQUENCE. Type the UniProt id of the protein sequence, “Q9HD67”, in the window and hit the Go button. Compare the positions of the predicted domains with the ranges of the first three BLAST matches in PDB from the previous exercise. Which domains of human myosin X were matched by each of those hits?
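The comparison asked for here is an interval-overlap check, which can be sketched in Python as below. The domain ranges are the ones listed in Exercise 3/3; the BLAST hit range (1500-1800) is a made-up placeholder, not a real result, so replace it with the query ranges you read off your own BLAST output. The function names and the 50% coverage threshold are assumptions chosen for the illustration.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Number of residues shared by two inclusive ranges (0 if disjoint)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)

# PFAM domain ranges for human myosin X as listed in Exercise 3/3.
DOMAINS = {
    "MyTH4": (1586, 1694),
    "RA": (1711, 1752),
    "FERM_M": (1793, 1958),
}

# Hypothetical query range of one BLAST hit (placeholder, not real output).
hit = (1500, 1800)

def domains_covered(hit_range, domains, min_fraction=0.5):
    """Names of domains with at least min_fraction of their length in the hit."""
    covered = []
    for name, (start, end) in domains.items():
        shared = overlap(hit_range[0], hit_range[1], start, end)
        length = end - start + 1
        if shared / length >= min_fraction:
            covered.append(name)
    return covered

print(domains_covered(hit, DOMAINS))
```

With the placeholder hit, MyTH4 and RA fall entirely inside the matched region, while FERM_M shares only 8 of its 166 residues and is therefore not reported at the 50% threshold.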

Exercise 3/3: Examine domains in Chimera. Open the structure of the 2nd hit (3PZD) in Chimera. Now colour the fragments corresponding to the PFAM domains MyTH4 (in purple), RAS associated (in orange) and FERM_M (in pink). How do the PFAM annotations fit the structure? How many more domains can you identify visually? Chain B in this structure is a small peptide. Which part of the human myosin X is interacting with this peptide in relation to the domains you have coloured? Domain ranges: MyTH4 :1586-1694a / RA :1711-1752a / FERM_M :1793-1958a
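The colouring step can be prepared as Chimera command-line text. The Python sketch below only builds the command strings; it does not drive Chimera itself. It assumes that the trailing "a" in the ranges above refers to chain A and writes it in Chimera's `:start-end.chain` residue specification, so check this against the structure before use. The helper name `chimera_colour_commands` is hypothetical.

```python
# (domain name, first residue, last residue, colour) taken from the exercise.
DOMAIN_COLOURS = [
    ("MyTH4", 1586, 1694, "purple"),
    ("RA", 1711, 1752, "orange"),
    ("FERM_M", 1793, 1958, "pink"),
]

def chimera_colour_commands(domains, chain="a"):
    """Build one Chimera 'color' command per domain.

    Assumption: the residues are in the given chain of the open model.
    """
    return [
        f"color {colour} :{start}-{end}.{chain}"
        for _name, start, end, colour in domains
    ]

commands = chimera_colour_commands(DOMAIN_COLOURS)
for line in commands:
    print(line)
```

Each printed line can be pasted into the Chimera command line after opening 3PZD. Colouring by command rather than by manual selection makes it easy to redo the exercise with other domain boundaries.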