Summer Bioinformatics Workshop 2008 BLAST Chi-Cheng Lin, Ph.D., Professor Department of Computer Science Winona State University – Rochester Center



BLAST
Introduction
  – What is BLAST?
  – Query Sequence in FASTA Format
  – What does BLAST tell you?
Choices
  – BLAST Programs: Which One to Use?
  – Commonly Used BLAST Programs
  – BLAST Databases: Which One to Search?
Understanding the Output
Database Search with BLAST
BLAST Steps – How It Works
Acknowledgement: This presentation includes adaptations from NCBI’s Introduction to Molecular Biology Information Resources modules.

What is BLAST?
Basic Local Alignment Search Tool
The Google™ of bioinformatics
The query is a DNA or protein sequence, not a text term
Character-string comparison against all the sequences in the target database
Rigorous statistics are used to identify statistically significant matches

Query Sequence in FASTA Format
A FASTA definition line ("def line") begins with a ">", followed by text on a single line that briefly describes the query sequence
Up to 80 nucleotide bases or amino acids per line
Example:
>gi|532319|pir|TVFV2E|TVFV2E envelope protein
ELRLRYCAPAGFALLKCNDADYDGFKTNCSNVSVVHCTNLMNTTVTTGLLLNGSYSENRT
QIWQKHRTSNDSALILLNKHYNLTVTCKRPGNKTVLPVTIMAGLVFHSQKYNLRLRQAWC
HFPSNWKGAWKEVKEEIVNLPKERYRGTNDPKRIFFQRQWGDPETANLWFNCHGEFFYCK
MDWFLNYLNNLTVDADHNECKNTSGTKSGNKRAPGPCVQRTYVACHIRSVIIWLETISKK
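
To make the FASTA layout concrete, here is a minimal Python sketch that splits FASTA text into def lines and sequences. The function name `parse_fasta` is hypothetical (it is not part of BLAST or any NCBI tool), and the sketch assumes plain text input with one or more records; the example uses the first two sequence lines of the record shown on the slide.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (def_line, sequence) pairs.

    Hypothetical helper for illustration only.
    """
    records = []
    header = None
    chunks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new def line closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence lines are concatenated without the line breaks.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


example = """>gi|532319|pir|TVFV2E|TVFV2E envelope protein
ELRLRYCAPAGFALLKCNDADYDGFKTNCSNVSVVHCTNLMNTTVTTGLLLNGSYSENRT
QIWQKHRTSNDSALILLNKHYNLTVTCKRPGNKTVLPVTIMAGLVFHSQKYNLRLRQAWC
"""

records = parse_fasta(example)
def_line, sequence = records[0]
print(def_line)
print(len(sequence), "residues")
```

The line breaks inside a record carry no meaning, so the parser joins the sequence lines back into one string before any comparison is done.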

What does BLAST tell you?
Putative identity and function of your query sequence
Helps to direct experimental design to prove the function
Finds similar sequences in model organisms (e.g., yeast, C. elegans, mouse), which can be used to further study the gene
Compares complete genomes against each other to identify similarities and differences among organisms

BLAST Programs: Which One to Use?
Depends on:
  – what type of query sequence you have (nucleotide or protein)
  – what type of database you will search against (nucleotide or protein)
Most commonly used BLAST programs
  – blastn
  – blastp
  – blastx
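
The choice can be written down as a small lookup keyed by query type and database type. This is only a sketch: the function name `choose_blast_programs` is hypothetical, and tblastn and tblastx are standard BLAST programs that the slide does not list, added here so the table covers all four combinations.

```python
# Standard BLAST programs keyed by (query type, database type).
# tblastn and tblastx do not appear on the slide; they are included
# here only to complete the table.
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): ["blastn", "tblastx"],
    ("protein", "protein"): ["blastp"],
    ("nucleotide", "protein"): ["blastx"],
    ("protein", "nucleotide"): ["tblastn"],
}


def choose_blast_programs(query_type, database_type):
    """Return the BLAST programs that fit a query/database combination."""
    key = (query_type.lower(), database_type.lower())
    if key not in BLAST_PROGRAMS:
        raise ValueError(f"unknown combination: {key}")
    return list(BLAST_PROGRAMS[key])


print(choose_blast_programs("nucleotide", "protein"))
```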

Commonly Used BLAST Programs
BLASTN
  – Nucleic acids against nucleic acids
BLASTP
  – Protein query against protein database
  – usually better to use than nucleotide-nucleotide BLAST
  – ...but... if we don't have a protein query sequence, what are our options?
BLASTX
  – Translated nucleic acids against protein database
  – one way to do a protein BLAST search if you have a nucleotide query sequence
  – the BLAST program does the translating for you, in all 6 reading frames
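
To illustrate what "all 6 reading frames" means, the following Python sketch translates a DNA string in three forward frames and three frames of the reverse complement. It is not BLASTX's own code: the function names are hypothetical, the standard genetic code is assumed, and any codon containing a character other than A, C, G, or T is written as "X".

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translation(dna):
    """Map frame number (+1..+3, -1..-3) to its translated protein string."""
    dna = dna.upper()
    reverse = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(dna[offset:])
        frames[-(offset + 1)] = translate(reverse[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
for frame in (1, 2, 3, -1, -2, -3):
    print(frame, frames[frame])
```

Each of the six strings is then treated as a separate protein query, which is why a nucleotide sequence can be searched against a protein database.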

Request ID: RID
An RID is like a ticket number that allows you to retrieve your search results and format them in many different ways over the next 24 hours.
If you've saved RIDs from your recent searches, you can enter them directly on the "Retrieve results with a Request ID" page, which is accessible from the bottom of the BLAST home page.
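
Results can also be requested by RID with a plain URL. The sketch below only builds such a URL and makes no network call. It assumes the endpoint and the `CMD`, `RID`, and `FORMAT_TYPE` parameters of NCBI's public BLAST URL API, none of which appear in the slides; the function name and the RID value are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint of NCBI's BLAST URL API (not given in the slides).
BLAST_CGI = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def rid_results_url(rid, format_type="Text"):
    """Build a URL that asks for the results stored under a Request ID."""
    rid = rid.strip()
    if not rid:
        raise ValueError("an RID is required")
    params = {"CMD": "Get", "RID": rid, "FORMAT_TYPE": format_type}
    return BLAST_CGI + "?" + urlencode(params)


# "EXAMPLE123" is a placeholder, not a real Request ID.
url = rid_results_url("EXAMPLE123")
print(url)
```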

Search Results: Understanding the Output
Reference to BLAST paper
Reminders about your specific query
  – RID
  – query sequence reminder (contains the information from your FASTA def line)
  – what database you searched against
Graphical summary
  – shows where the hits aligned to your query
  – colors indicate score range
  – mouse over a colored bar to see info about that hit
Text summary (GI numbers and def lines)
  – GI links to complete record in Entrez
  – Score links to pairwise alignment between your query sequence and the hit
Pairwise alignments
BLAST statistics for your search

Database Search w/ BLAST
Used most often!

Database Search w/ BLAST
Select a BLAST program
Insert your sequence
Click “BLAST” near the bottom of the web page
In general, if you select blastn, select “Others” as the database to search.

Database Search w/ BLAST
The RID and search status will appear

Database Search w/ BLAST
Wait for your result (patiently …)

Database Search w/ BLAST
Interpret the result
  – Graphic result
  – Black lines are the sequences that matched least well, while red lines are the sequences that matched best. In the example below, the purple sequences are the best matches available.
Source of the image:
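
The color coding can be read as a mapping from alignment score to a color band. The sketch below assumes the thresholds of the classic NCBI color key (below 40 black, 40 to 50 blue, 50 to 80 green, 80 to 200 magenta, 200 and above red). The slides do not state these numbers, and treating each band as including its lower bound is also an assumption.

```python
# Assumed color key: (lower bound of score band, color), highest band first.
COLOR_KEY = [
    (200, "red"),
    (80, "magenta"),
    (50, "green"),
    (40, "blue"),
]


def color_for_score(score):
    """Return the color band for an alignment score under the assumed key."""
    for lower_bound, color in COLOR_KEY:
        if score >= lower_bound:
            return color
    return "black"  # anything below 40


for score in (35, 45, 75, 150, 250):
    print(score, color_for_score(score))
```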

Database Search w/ BLAST
BLAST result
  – Matching sequences w/ bit-score & E-value
  – Hyperlinks to the database entry for each sequence
Example
Note that 3e-188 means 3 × 10^-188

BLAST – Statistical Evaluation
E value
  – The number of different alignments with scores equivalent to or better than the observed alignment score that are expected to occur in a database search by chance.
  – The lower the E value, the more significant the score.
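
For a sense of scale, the E value can be estimated from a bit score with the standard relation E = m × n × 2^(−S'), where S' is the bit score, m the query length, and n the total database length. This formula is not on the slide; the Python sketch below assumes it, ignores the effective-length corrections that BLAST applies, and uses a hypothetical function name.

```python
def expect_value(bit_score, query_length, database_length):
    """Estimate the E value from a bit score: E = m * n * 2**(-S').

    Assumes raw lengths; BLAST itself uses corrected (effective) lengths.
    """
    search_space = query_length * database_length
    return search_space * 2.0 ** (-bit_score)


# A higher bit score gives a lower E value for the same search space.
low = expect_value(30, 250, 1_000_000)
high = expect_value(60, 250, 1_000_000)
print(low, high)
```

Because the search space m × n multiplies the result, the same bit score is less significant in a larger database.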

BLAST Steps – How It Works
1. Seeding - Prepare a list of short, fixed-length segments (words) from the query
2. Searching - Find highly similar or exact matches for each word
3. Extension - Extend each match to (potentially) a longer match
4. Evaluation - Evaluate the results using E values
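
The first three steps can be sketched as a toy seed-and-extend search in Python. This is a deliberately simplified illustration, not BLAST's implementation: it uses exact word matches only, extends without gaps for as long as characters are identical, and ranks segments by length in place of the E value evaluation. All function names are hypothetical.

```python
def seed_words(query, word_size):
    """Step 1 (seeding): every fixed-length word in the query, with its offset."""
    return [(query[i:i + word_size], i)
            for i in range(len(query) - word_size + 1)]


def find_exact_hits(query, subject, word_size):
    """Step 2 (searching): positions where a query word occurs in the subject.

    Exact matches only; this is a simplification of "highly similar".
    """
    index = {}
    for j in range(len(subject) - word_size + 1):
        index.setdefault(subject[j:j + word_size], []).append(j)
    return [(i, j)
            for word, i in seed_words(query, word_size)
            for j in index.get(word, [])]


def extend_hit(query, subject, i, j, word_size):
    """Step 3 (extension): grow a hit in both directions while characters match."""
    left = 0
    while (i - left - 1 >= 0 and j - left - 1 >= 0
           and query[i - left - 1] == subject[j - left - 1]):
        left += 1
    right = word_size
    while (i + right < len(query) and j + right < len(subject)
           and query[i + right] == subject[j + right]):
        right += 1
    return (i - left, j - left, left + right)


def toy_blast(query, subject, word_size=3):
    """Return distinct (query_start, subject_start, length) segments, longest first."""
    segments = {extend_hit(query, subject, i, j, word_size)
                for i, j in find_exact_hits(query, subject, word_size)}
    return sorted(segments, key=lambda seg: seg[2], reverse=True)


result = toy_blast("GATTACA", "CCGATTACATT")
print(result)
```

Several seeds usually extend to the same segment, so the results are collected in a set before ranking.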