SUBMITTED BY: DEEPTI SHARMA BIOLOGICAL DATABASE AND SEQUENCE ANALYSIS.


Biological databases: Biological databases are stores of biological information. Based on their contents, they can be roughly divided into three categories: primary databases, secondary databases, and specialized databases.

Primary Databases: Primary databases contain experimentally derived data. They usually store the sequences or structures of biological molecules. They can be divided into protein and nucleotide databases, each of which can be further divided into sequence and structure databases. The most commonly used primary databases are the DNA Data Bank of Japan (DDBJ), the European Molecular Biology Laboratory (EMBL) Nucleotide Sequence Database, GenBank, and the Protein Data Bank (PDB).

Database of nucleotide sequences:
1. GenBank: A public sequence database that is freely accessible on the web. GenBank is one of the most comprehensive collections of annotated nucleic acid sequence data, covering a very wide range of organisms. Its content includes genomic DNA, mRNA, cDNA, ESTs, high-throughput raw sequence data, and sequence polymorphisms.
2. Entrez: The Entrez system is used to search all NCBI-associated databases. It is a powerful tool for performing simple or complex searches by combining keywords with the logical operators AND, OR, and NOT. For example, a search for human protein kinase sequences can use the following syntax: Homo sapiens[ORGN] AND protein kinase.
3. EMBL and DDBJ: EMBL is the nucleotide sequence database hosted at the European Bioinformatics Institute, whereas DDBJ is the DNA sequence database hosted at the Center for Information Biology, Japan. Both are accessible on the web.
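The Entrez query shown in item 2 can also be assembled in a script. The following is a minimal sketch, not an official client: the helper names `build_entrez_term` and `build_esearch_url` are hypothetical, and the example assumes the query is sent to the public NCBI E-utilities ESearch endpoint. It only builds the query string and URL; it does not contact NCBI.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch endpoint (assumed target here).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_entrez_term(organism, keywords, exclude=None):
    """Combine an organism filter and keywords with Boolean operators.

    Hypothetical helper: [ORGN] restricts the first part to the organism
    field, AND requires the keywords, NOT optionally excludes a phrase.
    """
    term = f"{organism}[ORGN] AND {keywords}"
    if exclude:
        term += f" NOT {exclude}"
    return term


def build_esearch_url(db, term, retmax=20):
    """Return the ESearch URL for a query; no network request is made."""
    params = {"db": db, "term": term, "retmax": retmax}
    return ESEARCH_URL + "?" + urlencode(params)


# The example from the slide: protein kinase sequences in human.
term = build_entrez_term("Homo sapiens", "protein kinase")
url = build_esearch_url("protein", term)
print(term)
print(url)
```

The same term can be pasted unchanged into the Entrez search box; the URL form is only needed when a search is issued from a program.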

Database of protein sequences:
Swiss-Prot: A collection of annotated protein sequences from the Swiss Institute of Bioinformatics (SIB). Each Swiss-Prot entry is manually curated and, where required, checked against the available literature. Swiss-Prot is part of UniProt, where it forms the manually reviewed section of the UniProt Knowledgebase (UniProtKB).
NCBI protein database: A compilation of protein sequences from other databases. It contains entries from Swiss-Prot, the PIR database, the PDB, and other known databases.
UniProt: Provides information about the 3-D structure, expression profile, secondary structure, and biochemical function of each protein.

Database of protein structures: The Protein Data Bank (PDB). The Protein Data Bank is the main repository for protein structural information; three-dimensional structures are stored there. It is the single worldwide archive of structural data derived by X-ray crystallography, nuclear magnetic resonance spectroscopy, and other techniques, as well as structural models. The database is maintained by the Research Collaboratory for Structural Bioinformatics (RCSB) at Rutgers University.

Database of protein structures: Molecular Modeling Database (MMDB). Other structural databases also exist, such as NCBI's Molecular Modeling Database (MMDB), which aims to provide information on sequence and structure neighbors, links between the scientific literature and 3D structures, and sequence and structure visualization.

Secondary Databases: Secondary databases contain data obtained by analyzing or processing the data held in primary databases. For instance, they can contain conserved protein sequences, signature sequences, and active-site residues of protein families, which are obtained from multiple sequence alignments of related proteins. These databases can be further classified as metabolic pathway databases, protein family databases, etc. The most common examples are Class, Architecture, Topology, Homologous superfamily (CATH), the Kyoto Encyclopedia of Genes and Genomes (KEGG), Protein Families (Pfam), and the Structural Classification of Proteins (SCOP).
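To illustrate how derived data of this kind can come from a multiple sequence alignment, here is a toy sketch that reports the columns on which every aligned sequence agrees. It is not the method used by Pfam, CATH, or any other database named above: the function name `conserved_positions` is hypothetical, the alignment is invented for illustration, and real resources rely on statistical models rather than strict identity.

```python
def conserved_positions(alignment):
    """Return (index, residue) for columns where every sequence agrees.

    `alignment` is a list of equal-length aligned sequences; '-' marks a gap.
    Columns containing a gap are never reported as conserved.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")

    conserved = []
    # zip(*alignment) walks the alignment column by column.
    for index, column in enumerate(zip(*alignment)):
        residues = set(column)
        if len(residues) == 1 and "-" not in residues:
            conserved.append((index, column[0]))
    return conserved


# Toy alignment, invented for illustration only (not real protein data).
toy_alignment = [
    "GKSTLAHRD",
    "GKTTLSHRD",
    "GKS-LAHKD",
]
print(conserved_positions(toy_alignment))
# → [(0, 'G'), (1, 'K'), (4, 'L'), (6, 'H'), (8, 'D')]
```

The positions returned are zero-based column indices. A secondary database would typically store a richer summary of each column, such as residue frequencies, rather than only the fully identical positions.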