Protein databases Morten Nielsen. Background- Nucleotide databases GenBank, National Center for Biotechnology Information.

Slides:



Advertisements
Similar presentations
Genome Annotation: A Protein-centric Perspective.
Advertisements

Bioinformatics Ayesha M. Khan Spring 2013.
NCBI data, sliding window programs and dot plots Sept. 25, 2012 Learning objectives-Become familiar with OMIM and PubMed. Understand the difference between.
Creating NCBI The late Senator Claude Pepper recognized the importance of computerized information processing methods for the conduct of biomedical research.
The design, construction and use of software tools to generate, store, annotate, access and analyse data and information relating to Molecular Biology.
Swiss-Prot Protein Database Daniel Amoruso December 2, 2004 BI 420.
Archives and Information Retrieval
Protein Databases EBI – European Bioinformatics Institute
Genome Related Biological Databases. Content DNA Sequence databases Protein databases Gene prediction Accession numbers NCBI website Ensembl website.
The Cell, Central Dogma and Human Genome Project.
Biological Databases Chi-Cheng Lin, Ph.D. Associate Professor Department of Computer Science Winona State University – Rochester Center
IST Computational Biology1 Information Retrieval Biological Databases 2 Pedro Fernandes Instituto Gulbenkian de Ciência, Oeiras PT.
Biology 224 Dr. Tom Peavy Sept 27 & 29 Protein Structure & Analysis- part 2.
Protein databases Henrik Nielsen. Background- Nucleotide databases GenBank, National Center for Biotechnology Information.
Proteins and Protein Function Charles Yan Spring 2006.
Class European Resources Protein Focused. Protein Databases EBI – European Bioinformatics Institute
EBI is an Outstation of the European Molecular Biology Laboratory. UniProt Jennifer McDowall, Ph.D. Senior InterPro Curator Protein Sequence Database:
URL: European Bioinformatics Institute (EMBL-EBI) Swiss Institute of Bioinformatics (SIB) Protein Information Resource.
UniProt - The Universal Protein Resource
ExPASy - Expert Protein Analysis System The bioinformatics resource portal and other resources An Overview.
Claire O’Donovan EMBL-EBI. In UniProtKB, we aim to provide… o A high quality protein sequence database A non redundant protein database, with maximal.
Protein Sequence Analysis - Overview Raja Mazumder Senior Protein Scientist, PIR Assistant Professor, Department of Biochemistry and Molecular Biology.
An Introduction to Bioinformatics Molecular Biology Databases.
The PIR-PSD current release 78.03, November 24, 2003, contains entries. 65 proteins The PIR was established in 1984 by the National Biomedical.
On line (DNA and amino acid) Sequence Information
Bioinformatics.
bioinformatics seminar on BY S.JHANSI RANI MPHARMACY II SEMESTER
Databases in Bioinformatics and Systems Biology Carsten O. Daub Omics Science Center RIKEN, Japan May 2008.
Bioinformatics for biomedicine
Archives and Information Retrieval
Introduction to databases Tuomas Hätinen. Topics File Formats Databases -Primary structure: UniProt -Tertiary structure: PDB Database integration system.
© Wiley Publishing All Rights Reserved. Protein and Specialized Sequence Databases.
Biological Databases By : Lim Yun Ping E mail :
UniProt Non-redundant Reference Cluster (UniRef) Databases Swiss Institute of Bioinformatics (SIB) European Bioinformatics Institute (EMBL-EBI)
Doug Raiford Lesson 3.  More and more sequence data is being generated every day  Useless if not made available to other researchers.
Fortaleza 31.VII.2006 UniProtKB: Questions and answers UniProtKB/Swiss-Prot: Questions, Answers and a few Tips.
Day 2: Protein Sequence Analysis 1.Physico-chemical properties. 2.Cellular localization. 3.Signal peptides. 4.Transmembrane domains. 5.Post-translational.
EBI is an Outstation of the European Molecular Biology Laboratory. Annotation Procedures for Structural Data Deposited in the PDBe at EBI.
Biological Databases Biology outside the lab. Why do we need Bioinfomatics? Over the past few decades, major advances in the field of molecular biology,
Alastair Kerr, Ph.D. WTCCB Bioinformatics Core An introduction to DNA and Protein Sequence Databases.
Protein Information Resource Protein Information Resource, 3300 Whitehaven St., Georgetown University, Washington, DC Contact
Function preserves sequences
PROTEIN DATABASES. The ideal sequence database for computational analyses and data-mining: I t must be complete with minimal redundancy It must contain.
Protein Sequence Analysis - Overview - NIH Proteomics Workshop 2007 Raja Mazumder Scientific Coordinator, PIR Research Assistant Professor, Department.
Sequencing the World of Possibilities for Energy & Environment MGM workshop. 19 Oct 2010 Information Sources for Genomics Konstantinos Mavrommatis Genome.
Rice Proteins Data acquisition Curation Resources Development and integration of controlled vocabulary Gene Ontology Trait Ontology Plant Ontology
Central dogma: the story of life RNA DNA Protein.
EMBL – EBI European Bioinformatics Institute UniProt - The Universal Protein Resource Claire O’Donovan.
Bioinformatics and Computational Biology
Computer Storage of Sequences
Primary vs. Secondary Databases Primary databases are repositories of “raw” data. These are also referred to as archival databases. -This is one of the.
EBI is an Outstation of the European Molecular Biology Laboratory. UniProtKB Sandra Orchard.
Central hub for biological data UniProtKB/Swiss-Prot is a central hub for biological data: over 120 databases are cross-referenced (EMBL/DDBJ/GenBank,
Information retrieval and sliding window programs April 5, 2011 Hand in Homework #1. Homework #2 due Tuesday, April 12. Learning objectives- Understand.
EBI is an Outstation of the European Molecular Biology Laboratory. UniProtKB Sandra Orchard.
BASICS OF BIOINFORMATICS Biotechnology Division North-East Institute of Science & Technology (Council of Scientific & Industrial Research) Jorhat ,
BIOINFORMATICS. Bioinformatics is the application of statistics and computer science to the field of molecular biology. The term bioinformatics was coined.
Protein databases Henrik Nielsen
Archives and Information Retrieval
Biological Sequence Databases
생물정보학 Bioinformatics.
UniProt: Universal Protein Resource
UniProt: the Universal Protein Resource
Genomes and Their Evolution
There are four levels of structure in proteins
Introduction to Bioinformatics
Protein Sequence Analysis - Overview -
Introduction to Databases
Biological Databases.
SUBMITTED BY: DEEPTI SHARMA BIOLOGICAL DATABASE AND SEQUENCE ANALYSIS.
Presentation transcript:

Protein databases Morten Nielsen

Background- Nucleotide databases GenBank, National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), USA. EMBL, European Bioinformatics Institute (EBI), England (Established in 1980 by the European Molecular Biology Laboratory, Heidelberg, Tyskland) DDBJ, National Institue of Genetics, Japan Together they form International Nucleotide Sequence Database Collaboration,

Protein databases Swiss-Prot, Established in 1986 in Swizerland ExPASy (Expert Protein Analysis System) Swiss Institute of Bioinformatics (SIB) and European Bioinformatics Institute (EBI) PIR, Established in 1984 National Biomedical Research Foundation, Georgetown University, USA In 2002 merged to: UniProt, A collaboration between SIB, EBI and Georgetown University.

UniProt UniProt Knowledgebase (UniProtKB) UniProt Reference Clusters (UniRef) UniProt Archive (UniParc) UniProt Knowledgebase Release 12.8 consists of: UniProtKB/Swiss-Prot: Annotated manually (curated) Release 54.8 of 05-Feb-2008: entries UniProtKB/TrEMBL: Computer annotated Release 37.8 of 05-Feb-2008: entries

Content of UniProt Knowledgebase Amino acids sequences Functional and structural annotations –Function / activity –Secondary structure –Subcellular location –Mutations, phenotypes –Post-translational modifications Origin –organism: Species, subspecies; classification –tissue References Cross references

Amino acid sequences From where do you get amino acids sequences? Translation of nucleotide sequences (Genebank) Amino acids sequencing: Edman degradation Mass spectrometry 3D-structurer

Protein structure Primary structure: Amino acids sequences Secondary structure: Helix/Beta sheet Tertiary structure: Fold, 3D cordinates

Subcellular location An animal cell:

Post-translational modifications Cleavage of signal peptide, transit peptide or pro-peptide Phosphorylation Glycosylation Lipid anchors Disulfide bond Prosthetic groups (i.e. metal-ions, and lipids)

Content of UniProt Knowledgebase Name, entry data etc Organism References Functional annotations (comments) Cross references –3D structure - PDB –EMBL –.. Sequence

Evidence 3 types of non-experimental qualifiers in Sequence annotation and General comment: –Potential: Predicted using sequence analysis –Probable: Uncertain experimental evidence –By similarity: Predicted using sequence similarity

Cross references Other databases (there are 95 in total): Nucleotide sequences 3D structure Protein-protein interactions Enzymatic activities and pathways Gen expression (microarrays and 2D-PAGE) Ontologies Families and domains Organism specific databases