Biological Databases. Morten Nielsen, BioSys, DTU.


Different kinds of data
DNA
– NCBI GenBank
– Organism-specific databases
Protein
– UniProt (SwissProt, TrEMBL)
– NCBI

Different kinds of data
Protein structure
– PDB
Expression data
– NCBI GEO
Epitopes
– IEDB

PDB

IEDB: Immune Epitope Database

UniProt database
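UniProt FASTA downloads mark which section an entry comes from: the header starts with `sp` for SwissProt or `tr` for TrEMBL, followed by the accession and entry name, separated by `|`. A minimal sketch of reading that prefix is shown here; the function name `parse_uniprot_header` is an assumption made for illustration, and the second header in the usage example is a made-up placeholder, not a real entry.

```python
def parse_uniprot_header(header):
    """Split a UniProt-style FASTA header into section, accession and entry name.

    Expects the form '>db|ACCESSION|ENTRY_NAME description...'.
    """
    if not header.startswith(">"):
        raise ValueError("not a FASTA header line")
    # Split on the first two '|' only; the description may contain anything.
    db, accession, rest = header[1:].split("|", 2)
    entry_name = rest.split(" ", 1)[0]
    # 'sp' = SwissProt (reviewed), 'tr' = TrEMBL (unreviewed).
    section = {"sp": "SwissProt", "tr": "TrEMBL"}.get(db, "unknown")
    return {"section": section, "accession": accession, "entry_name": entry_name}


if __name__ == "__main__":
    headers = [
        ">sp|P01308|INS_HUMAN Insulin OS=Homo sapiens",
        # Hypothetical placeholder entry, for illustration only.
        ">tr|X0X0X0|X0X0X0_9ZZZZ Uncharacterized protein",
    ]
    for h in headers:
        print(parse_uniprot_header(h))
```

Keeping the section next to each record makes it easy to restrict a data set to manually reviewed SwissProt entries before further analysis.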

Data redundancy!
– Databases have non-biological redundancy
– This is problematic when training data-driven prediction methods
  – As you saw for PSSM construction
– UniProt has a feature to remove redundancy (at a 90% or 50% sequence identity threshold). How is this done?
This and much more you will find out in the next episode of...
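As a preview of the idea, one generic way to reduce redundancy is greedy filtering: go through the sequences and keep one only if it is less similar than a threshold to everything already kept. The sketch below is a toy illustration under strong assumptions, not UniProt's actual procedure. The names `identity` and `reduce_redundancy` are hypothetical, and identity is computed position by position without gaps, a crude stand-in for identity from a proper alignment.

```python
def identity(a, b):
    """Fraction of identical positions over the shorter sequence.

    Assumption: sequences are compared left-aligned and ungapped.
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / n


def reduce_redundancy(seqs, threshold=0.9):
    """Greedily keep sequences whose identity to every kept one is below threshold."""
    representatives = []
    # Longest first, so the longest member of a similar group is the one kept.
    for s in sorted(seqs, key=len, reverse=True):
        if all(identity(s, r) < threshold for r in representatives):
            representatives.append(s)
    return representatives


if __name__ == "__main__":
    seqs = ["ACDEFGHIKL", "ACDEFGHIKV", "WYWYWYWYWY"]
    # The second sequence differs from the first at one of ten positions.
    print(reduce_redundancy(seqs, threshold=0.9))
    print(reduce_redundancy(seqs, threshold=0.95))
```

A lower threshold removes more sequences, which matters when the remaining set is used to train or evaluate a data-driven prediction method.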