NCBI FieldGuide NCBI Molecular Biology Resources A Field Guide part 2 (post intermission) September 30, 2004 ICGEB.



NCBI FieldGuide PSI-BLAST: Position-Specific Iterated BLAST. Mining for protein domains; confirming relationships among related proteins.

NCBI FieldGuide Position-Specific Substitution Rates. Figure labels: active site serine; weakly conserved serine.

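NCBI FieldGuide Position-Specific Score Matrix (PSSM). [Figure: score matrix with one column per amino acid (A R N D C Q E G H I L K M F P S T W Y V) and one row per query position, starting at position 206 (D G V I D S C N G D S G G P L N C Q A).] Figure labels: active site nucleophile; serine scored differently in these two positions.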
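The point of this slide, that the same residue can score differently at different positions, can be illustrated with a minimal Python sketch. The two toy alignment columns, the uniform background frequency, and the pseudocount are illustrative assumptions, and `column_scores` is a hypothetical helper; this is not PSI-BLAST's actual sequence-weighting or pseudocount scheme.

```python
import math
from collections import Counter

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"
BACKGROUND = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background frequencies


def column_scores(column, pseudocount=1.0):
    """Return log-odds scores (in bits) for one alignment column."""
    counts = Counter(column)
    total = len(column) + pseudocount * len(AMINO_ACIDS)
    scores = {}
    for aa in AMINO_ACIDS:
        freq = (counts[aa] + pseudocount) / total
        scores[aa] = math.log2(freq / BACKGROUND)
    return scores


# Toy columns, made up for illustration: serine is invariant in the first
# column (as at an active site) and only weakly conserved in the second.
conserved = "SSSSSSSSSS"
variable = "SATSGNSADT"

print("S at conserved position:", round(column_scores(conserved)["S"], 2))
print("S at variable position: ", round(column_scores(variable)["S"], 2))
```

Under these assumptions, serine receives a higher score in the invariant column than in the variable one, which is the behavior a single substitution matrix such as BLOSUM62 cannot express.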

NCBI FieldGuide
>gi|113340|sp|P03958|ADA_MOUSE ADENOSINE DEAMINASE (ADENOSINE
MAQTPAFNKPKVELHVHLDGAIKPETILYFGKKRGIALPADTVEELRNIIGMDKPLSLPGF
VIAGCREAIKRIAYEFVEMKAKEGVVYVEVRYSPHLLANSKVDPMPWNQTEGDVTPDDVVD
EQAFGIKVRSILCCMRHQPSWSLEVLELCKKYNQKTVVAMDLAGDETIEGSSLFPGHVEAY
RTVHAGEVGSPEVVREAVDILKTERVGHGYHTIEDEALYNRLLKENMHFEVCPWSSYLTGA
VRFKNDKANYSLNTDDPLIFKSTLDTDYQMTKKDMGFTEEEFKRLNINAAKSSFLPEEEKK
PSI-BLAST E-value cutoff for PSSM

NCBI FieldGuide RESULTS: Initial BLASTP Same results as protein-protein BLAST

NCBI FieldGuide Results of First PSSM Search Other purine nucleotide metabolizing enzymes not found by ordinary BLAST

NCBI FieldGuide Third PSSM Search: Convergence Just below threshold, another nucleotide metabolism enzyme Check to add to PSSM
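The iterate-until-convergence logic shown across these result slides can be sketched as follows. This is only an outline of the control flow: `search` is a hypothetical callable standing in for one PSI-BLAST round, the 0.005 inclusion threshold is an assumed value, and the toy hit table is made up for illustration.

```python
def iterate_until_converged(search, initial_hits, threshold=0.005, max_rounds=10):
    """Repeat a profile search until no new sequences pass the inclusion threshold.

    `search` is a hypothetical callable: it takes the set of sequence IDs
    currently in the profile and returns a dict {sequence_id: e_value}.
    """
    included = set(initial_hits)
    for round_number in range(1, max_rounds + 1):
        hits = search(included)
        passing = {sid for sid, e_value in hits.items() if e_value <= threshold}
        new = passing - included
        if not new:
            return included, round_number  # converged: nothing new to add
        included |= new
    return included, max_rounds


def toy_search(included):
    """Made-up data: a richer profile reaches more distant sequences."""
    table = {"seqA": 1e-40, "seqB": 1e-10}
    if "seqB" in included:
        table["seqC"] = 1e-4
    if "seqC" in included:
        table["seqD"] = 0.02  # worse than the threshold, so not added automatically
    return table


members, rounds = iterate_until_converged(toy_search, {"seqA"})
print(sorted(members), rounds)
```

In this toy run the search converges in the third round, and `seqD` plays the role of the hit listed just past the threshold that a user could still check by hand to add to the PSSM.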

NCBI FieldGuide MegaBLAST AI AI AI BE C:\seq\hs.4.fsa
> gnl|UG|Hs#S qd43b11.x1 Homo sapiens cDNA, 3' end
CATGTAAGCCATTTATTGGTTTGTTTTAAAAATATGTATTTTATTTATACATGAAGTTTG
GTGAGAAGTGCTCGATTAGTTCAGACAACATCTGGCACTTGATGTCTGTCCTTCCCTCCT
TTTTCCTACTCTCTTCTCCCCTCCTGCTGGTCATTGTGCAGTTCTGGAAATTAAAAAGGT
GACAGCCAGGCTAAAAGCTAAGGGTTGGGTCTAGCTCACCTCCCACCCCCAACCACACCG
TCTGCAGCCAGCCCCAGGCACCTGTCTCAAAGCTCCCGGGCTGTCCACACACACAAAAAC
CACAGTCTCCTTCCGGCCAGCTGGGCTGGCAGCCCGACCTGC
> gnl|UG|Hs#S qv37f11.x1 Homo sapiens cDNA, 3' end
GAGAAGACGACAGAAGGGGAGAAGAGAGTAGGAAAAAGGAGGGAAGGACAGACATCAAGT
GCCAGATGTTGTCTGAACTAATCGAGCACTTCTCACCAAACTTCATGTATAAATAAAATA
CATATTTTTAAAACAAACCAATAAATGGCTTACATCAAAAAAAAAAAAAAAAAAAAAAAA
GTCGTATCGATGT
> gnl|UG|Hs#S qv33c06.x1 Homo sapiens cDNA, 3' end
GAGAAGACGACAGAAGGGGAGAAGAGAGTAGGAAAAAGGAGGGAAGGACAGACATCAAGT
GCCAGATGTTGTCTGAACTAATCGAGCACTTCTCACCAAACTTCATGTATAAATAAAATA
CATATTTTTAAAACAAACCAATAAATGGCTTACATCAAAAAAAAAAAAAAAAAAAAAAAA
GTCGTATCGATGT
> gnl|UG|Hs#S e65f04.x1 Homo sapiens cDNA, 3' end
TTTCATGTAAGCCATTTATTGGTTTGTTTTAAAAATATGTATTTTATTTATACATGAAGT
TTGGTGAGAAGTGCTCGATTAGTTCAAACAACATCTGGCACTTGATGTCTGTCCTTCCCT
CCTTTTTCCTACTCTCTTCTCCCCTCCTGCTGGTCATTGTGCAGTTCTGGAAATTAAAAA
GGTGACAGCCAGGCTAAAAGCTAAGGGTTGGGTCTAGCTCACCTCCCACCCCCAACCACA
CCGTCTGCAGCCAGCCCCAGGCACCTGTCTCAAAGCTCCCGGGCTGTCCACACACACAAA
AACCACAGTCTCCTTCCGGCCAGCTGGGCTGGCAGCCCGACCTGCCTCCCAACCGCATTC
CTGCCTGTGTAGCAGGCGGTGAGCACCCAGAAGGGGCACATACCTCTCCAAGCCTTGAAA
GCAAAGCATGGAGATCTACAAAAATAGGATTTCCACTTGGAGAAATGTCGCTGGGACAGT

NCBI FieldGuide What is Discontiguous (Cross-species) MegaBLAST? Templates are listed for every combination of word size (W = 11 or 12), template length (t = 16, 18, or 21), and template type (coding or non-coding). Reference: Ma, B., Tromp, J., Li, M., "PatternHunter: faster and more sensitive homology search", Bioinformatics 2002 Mar;18(3):440-5
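A discontiguous template requires matches only at selected positions within a window, rather than at every position of a contiguous word. The sketch below uses the weight-11, length-18 seed from the cited PatternHunter paper as an example; it is not one of the NCBI templates listed on the slide, and `seed_hits` and the two toy sequences are hypothetical.

```python
SEED = "111010010100110111"  # PatternHunter seed: 11 required positions in a window of 18


def seed_hits(query, subject, seed=SEED):
    """Return (query_pos, subject_pos) pairs where every '1' position of the seed matches."""
    required = [i for i, c in enumerate(seed) if c == "1"]
    span = len(seed)
    # Index every window of the subject by its characters at the required positions.
    index = {}
    for j in range(len(subject) - span + 1):
        key = "".join(subject[j + i] for i in required)
        index.setdefault(key, []).append(j)
    # Look up every window of the query in that index.
    hits = []
    for q in range(len(query) - span + 1):
        key = "".join(query[q + i] for i in required)
        hits.extend((q, j) for j in index.get(key, []))
    return hits


# Toy sequences: the subject differs from the query at positions 3, 8, and 14,
# all of which are "don't care" (0) positions in the seed.
query = "ACGTACGTACGTACGTAC"
subject = "ACGAACGTCCGTACTTAC"

print(seed_hits(query, subject))                 # spaced seed
print(seed_hits(query, subject, seed="1" * 11))  # contiguous 11-mer seed
```

For this pair the spaced seed reports a hit at (0, 0), while a contiguous 11-mer seed reports none, because the three mismatches break every run of 11 identical bases. That tolerance for scattered mismatches is why discontiguous templates suit cross-species comparisons.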

NCBI FieldGuide Neighbors: Precomputed BLAST (nucleotide and protein). Entrez Related Sequences produces a list of sequences sorted by BLAST score, but with no alignment details.

NCBI FieldGuide Blink – Protein BLAST Alignments. Lists only 200 hits. List is nonredundant.

NCBI FieldGuide Blink – Linking Sequence to Structure Cn3D

NCBI FieldGuide BLAST: Related Structures

NCBI FieldGuide BLAST Databases: Non-redundant protein
nr (non-redundant protein sequences)
– GenBank CDS translations
– NP_ RefSeqs
– Outside Protein: PIR, Swiss-Prot, PRF
– PDB (sequences from structures)

NCBI FieldGuide BLAST Databases: Nucleic Acid
nr (nt)
– Traditional GenBank Divisions
– NM_ and XM_ RefSeqs
dbest
– EST Division
htgs
– HTG division
gss
– GSS division
chromosome
– NC_ RefSeqs
wgs
– whole genome shotgun

NCBI FieldGuide Genomic BLAST. These pages provide customized nucleotide and protein databases for each genome. If a Map Viewer is available, the BLAST hits can be viewed on the maps.

NCBI FieldGuide What if Your Favorite Gene is not found in the latest genome build? Possible explanations: the gene does not exist; it exists, but there is a problem with the assembly; it exists, but there is a problem with the annotation.

NCBI FieldGuide An example: finding prestin in the human genome. We start with rat prestin, BLAST it against the human genome, and look for evidence that human prestin exists as well.

NCBI FieldGuide Searching the Human Genome
>gi| |emb|AJ |RNO Rattus norvegicus
ATGGATCATGCTGAAGAAAATGAAATTCCTGCAGAGATCAGAAGTACCTCGTGGAA
GTCATCCGGTCCTCCAGGAGAGGCTGCACGTCAAGGACAAAGTCACAGACTCCATC
GCAGGCATTCACGTGCACTCCTAAAAAAGTAAGAAACATCATCTACATGTTCTTGC
TTGCCAGCATATAAATTCAAGGAGTATGTGCTGGGTGACTTGGTCTCGGGCATAAG
AGCTCCCCCAAGGCTTAGCCTTCGCGATGCTGGCAGCTGTGCCTCCGGTGTTCGGC
On for same species comparisons

NCBI FieldGuide BLAST Results: 16 hits to one contig. Human Genome Database: 953 contigs, 2.9 billion letters.

NCBI FieldGuide Map Viewer: Genomic Context of BLAST Hits. Maps displayed: Genes, Genome Scan Models, Human EST hits, Contig, GenBank, Mouse EST hits.

NCBI FieldGuide Human prestin: now appears in Build 34

NCBI FieldGuide Now we can compare genes

NCBI FieldGuide Three prestin genes: finally together!

NCBI FieldGuide Same prestin, different assemblies

NCBI FieldGuide Does homology imply a common biological function? Not always: a common ancestor does not guarantee that a function was not lost or acquired after divergence. An example: zeta-crystallin is a component of the transparent lens matrix of the vertebrate eye. Its homolog in E. coli is the metabolic enzyme quinone oxidoreductase.

NCBI FieldGuide Search tools by data type: Entrez (text), BLAST (sequence), VAST (structure)

NCBI FieldGuide Structure similarity: No More BLASTing! Three-dimensional structure is the most conserved feature during evolution. A common ancestor can still be detected from structural similarity. Spatial similarity is not calculated the same way as sequence similarity.

NCBI FieldGuide VAST: Structure Neighbors. VAST (Vector Alignment Search Tool): for each protein chain, locate the SSEs (secondary structure elements) and represent them as individual vectors. Example shown: human IL-4.
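The vector representation mentioned on this slide can be illustrated with a minimal Python sketch. It shows only the representation step: each SSE becomes the vector from its first to its last residue, and two such vectors are compared by angle. The endpoint coordinates are made up, and `sse_vector` and `angle_between` are hypothetical helpers; the actual VAST comparison and its statistics are not reproduced here.

```python
import math


def sse_vector(start, end):
    """Represent an SSE by the vector from its first to its last residue (x, y, z)."""
    return tuple(e - s for s, e in zip(start, end))


def angle_between(u, v):
    """Angle in degrees between two SSE vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cosine))


# Made-up endpoint coordinates (in angstroms) for two helices.
helix_a = sse_vector((0.0, 0.0, 0.0), (10.0, 0.0, 0.0))
helix_b = sse_vector((0.0, 5.0, 0.0), (0.0, 5.0, 10.0))

print(round(angle_between(helix_a, helix_b), 1))
```

Reducing each chain to a handful of vectors makes it cheap to look for pairs of SSEs with similar relative orientation before any detailed residue-level alignment is attempted.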

NCBI FieldGuide VAST: Structure Neighbors

NCBI FieldGuide Structure Neighbors in Cn3D. C-Src kinase (SH3 and SH2 domains labeled), human vs. chicken.

NCBI FieldGuide 3D Domain Neighbors Human C-Src Kinase (Tyr) vs. Chk1 kinase (Ser/Thr)

NCBI FieldGuide NCBI is changing: from a sequence data storage facility to a one-stop shop with integrated databases of various kinds. You can be part of the future: work with us! Your expertise and data are indispensable.

NCBI FieldGuide GenBank

NCBI FieldGuide RefSeq

NCBI FieldGuide Entrez Gene

NCBI FieldGuide HomoloGene database

NCBI FieldGuide New generation of databases: an example

NCBI FieldGuide Protein interaction database: a seed for future precomputed resources

NCBI FieldGuide New databases: GENSAT

NCBI FieldGuide PubChem

NCBI FieldGuide Headache? Take Aspirin

NCBI FieldGuide Aspirin has 432 neighbors

NCBI FieldGuide Link to 3D protein structures

NCBI FieldGuide PubCrawler – Update Alerting Service for PubMed and GenBank

NCBI FieldGuide MedBlast: searching for articles related to a sequence.

NCBI FieldGuide For More Information… General addresses. The (free!) NCBI Newsletter. The NCBI Handbook. The NCBI Education Page: follow the link from the NCBI Home Page.