Published by Molly Stokes. Modified over 9 years ago.
1
Tutorial 3 BLAST
2
BLAST tutorial
How to use BLAST
Score vs. E-value
Exercise
Cool story of the day: How Alzheimer's is studied in yeast
3
What is BLAST?
Basic Local Alignment Search Tool
A set of similarity search programs for exploring sequence databases.
[Diagram labels: Query, Database, BLAST program]
4
Why perform a similarity search?
Find genes/proteins with possibly similar function
Find the origin of a sequence (what organism it is taken from)
Different degrees of similarity can be found in a database search
5
BLAST Databases
Program    Query type            Database type
blastn     Genomic               Genomic
blastp     Proteomic             Proteomic
blastx     Translated genomic    Proteomic
tblastn    Proteomic             Translated genomic
tblastx    Translated genomic    Translated genomic
Genomic: A T G C
Proteomic: G A S T C V L I M P F Y W D E N Q H K R
Translated genomic: a genomic sequence translated to protein using the 6 possible reading frames. For example, the three forward frames of ATGCCGTTC give MPF, CR and AV; the other three frames are read from the reverse complement.
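The six-frame translation behind "translated genomic" (ATGCCGTTC -> MPF, CR, AV) can be sketched in a few lines. This is a minimal illustration, assuming the standard genetic code and an input containing only A, C, G and T; the function names are hypothetical and BLAST's own implementation is not shown here.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate complete codons only; trailing 1-2 bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def six_frames(dna):
    """Return the three forward and three reverse-complement translations."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    return {
        "forward": [translate(dna[offset:]) for offset in range(3)],
        "reverse": [translate(reverse[offset:]) for offset in range(3)],
    }


frames = six_frames("ATGCCGTTC")
print(frames["forward"])  # ['MPF', 'CR', 'AV']
print(frames["reverse"])  # ['ERH', 'NG', 'TA']
```

The forward frames reproduce the example from the slide; the reverse frames come from the reverse complement GAACGGCAT.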
6
http://blast.ncbi.nlm.nih.gov/Blast.cgi
7
Query and DB parameters
Place query
Job title: helpful when running multiple searches
Choose database
In case you want to restrict to a specific organism
In case you want to eliminate specific sequences
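The same choices (query, database, organism restriction) can also be expressed as request parameters instead of web form fields. The sketch below only builds the parameter set and does not send anything. The parameter names (CMD, PROGRAM, DATABASE, QUERY, ENTREZ_QUERY) follow my reading of NCBI's BLAST URL API and should be checked against the current NCBI documentation before use; the helper name is hypothetical.

```python
from urllib.parse import urlencode

BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_submit_params(query, program="blastn", database="nt", organism=None):
    """Collect search settings as URL parameters (assumed names, see lead-in)."""
    params = {
        "CMD": "Put",          # submit a new search
        "PROGRAM": program,    # e.g. blastn, blastp, blastx
        "DATABASE": database,  # e.g. nt for nucleotide searches
        "QUERY": query,        # sequence or identifier
    }
    if organism:
        # Restrict the search to one organism with an Entrez query term.
        params["ENTREZ_QUERY"] = f"{organism}[Organism]"
    return params


params = build_submit_params("ATGCCGTTC", organism="Homo sapiens")
print(BLAST_URL + "?" + urlencode(params))
```

Building the parameters separately from sending them makes it easy to inspect a search before submitting it.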
8
How to choose the database?
Depends on what you’re looking for…
nr/nt (non-redundant nucleotide): a good place to start if you don’t know what you’re looking for
9
Alignment parameters
Optimizes the parameters for the desired similarity level of the search
10
Alignment parameters
Threshold for results significance
Primary word match (16-64 nt)
Scores of matching and mismatching bases
Cost to create and extend a gap
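To see how the match/mismatch scores and gap costs listed above combine into a raw alignment score, here is a minimal sketch that scores an already-aligned pair of rows. The default values (match +1, mismatch -2, gap creation 5, gap extension 2) are illustrative assumptions, not necessarily the settings of any particular search, and the function name is hypothetical. A gap of length k is charged the creation cost once plus k times the extension cost.

```python
def raw_score(query_row, subject_row, match=1, mismatch=-2, gap_open=5, gap_extend=2):
    """Score two aligned rows of equal length; '-' marks a gap position."""
    if len(query_row) != len(subject_row):
        raise ValueError("aligned rows must have equal length")
    score = 0
    gap_in = None  # row that holds the current run of gap characters, if any
    for q, s in zip(query_row, subject_row):
        if q == "-" or s == "-":
            row = "query" if q == "-" else "subject"
            if row != gap_in:
                score -= gap_open  # charged once per gap
                gap_in = row
            score -= gap_extend    # charged for every gap position
        else:
            score += match if q == s else mismatch
            gap_in = None
    return score


print(raw_score("ACGT", "ACGT"))      # 4 matches -> 4
print(raw_score("ACGT", "ACCT"))      # 3 matches, 1 mismatch -> 1
print(raw_score("ACGTAC", "AC--AC"))  # 4 matches, one gap of length 2 -> -5
```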
11
How to interpret BLAST results?
12
Search for homologs of the chick “olfactory receptor 6” gene
13
Search results
14
Graphic Summary
Query sequence
Matched sequences from DBs
15
Descriptions
Sequence identifier + link
Sequence description
Score (bits)
% Coverage
% Identity
E-value
16
Descriptions
Query covered = 55%: only 55% of the query is covered by the alignment (~230 bp)
Identity = 71%: of the 230 aligned bp, only 71% are matches
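The arithmetic behind these two columns can be checked with a short sketch. The function names are hypothetical, and the query length (about 418 bp) and match count (about 163) are not given on the slide: they are back-calculated here from the 55%, 71% and ~230 bp figures, so treat them as assumptions.

```python
def query_coverage(aligned_query_bases, query_length):
    """Percent of the query that takes part in the alignment."""
    return 100.0 * aligned_query_bases / query_length


def percent_identity(identical_positions, alignment_length):
    """Percent of aligned positions that are exact matches."""
    return 100.0 * identical_positions / alignment_length


aligned = 230                         # aligned length from the slide (approximate)
query_length = round(aligned / 0.55)  # back-calculated assumption: about 418 bp
matches = round(0.71 * aligned)       # back-calculated assumption: about 163 bp

print(round(query_coverage(aligned, query_length)))  # 55
print(round(percent_identity(matches, aligned)))     # 71
```

Note that identity is measured over the aligned region only, so a hit can show high identity while covering a small part of the query.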
17
Alignments
Query info
Alignment info
Alignment
18
It is possible to get multiple hits per sequence
19
E-values and scores
20
Score vs. E-value
The score is a measure of the similarity of the query to a sequence from the database.
The E-value is a measure of the reliability of the score.
Definition of the E-value: the number of alignments with the observed score or higher that are expected due to chance.
21
Score vs. E-value
Score (S) = (sum of identity and mismatch scores) - gap penalties
The E-value depends on the search space:
Query length (bp)
Effective length (total number of bases) of the database (bp)
The E-value also depends on the scoring system:
Score (S)
Bit score (S’)
E-values cannot be compared across different DBs, even if the score is the same.
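The formulas on this slide did not survive extraction. The standard Karlin-Altschul relations are E = K·m·n·e^(-λS) and S' = (λS - ln K) / ln 2, which together give E = m·n·2^(-S'), where m is the query length and n the effective database length. The sketch below encodes them; the values of LAMBDA and K are illustrative placeholders (in practice they depend on the scoring system and are reported by BLAST), and the function names are hypothetical.

```python
import math

# Placeholder statistical parameters; real values depend on the scoring system.
LAMBDA = 1.28
K = 0.46


def bit_score(raw, lam=LAMBDA, k=K):
    """Normalize a raw score S into a bit score S'."""
    return (lam * raw - math.log(k)) / math.log(2)


def e_value(raw, query_len, db_len, lam=LAMBDA, k=K):
    """Expected number of chance alignments scoring at least `raw`."""
    return k * query_len * db_len * math.exp(-lam * raw)


def e_value_from_bits(bits, query_len, db_len):
    """Same E-value, computed from the bit score and the search space only."""
    return query_len * db_len * 2.0 ** (-bits)


small_db = e_value(30, query_len=400, db_len=1e9)
large_db = e_value(30, query_len=400, db_len=2e9)
print(small_db, large_db)  # same score, but E doubles with the database size
```

This is why the same score yields different E-values in different databases: the search space m·n enters the E-value directly.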
22
Intuition for “significance”
Think of the query as a ball, where each color represents a part of the sequence. The DB is a pool of colored balls.
If the ball has many colors (longer query), there is a higher probability of seeing the same color in the pool by chance.
If the pool of balls is very big, there is a higher probability of seeing one of the ball's colors in the pool.
23
E-value Threshold
The typical threshold for a good E-value from a BLAST search is E = 10^-6 (1e-6) or lower.
This does not mean that hits with higher E-values have no biological significance.
http://www.youtube.com/watch?v=Z7ek7UoP7Bg&src_vid=nO0wJgZRZJs&feature=iv&annotation_id=annotation_234259
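Applying such a threshold to a list of hits is a simple filter. The sketch below assumes the hits were exported in the 12-column tab-separated format produced by BLAST+ with `-outfmt 6`; the three sample rows are made up for illustration and are not real search results.

```python
# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def parse_hits(text):
    """Parse tab-separated hit lines into dictionaries."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        row = dict(zip(COLUMNS, line.split("\t")))
        row["evalue"] = float(row["evalue"])
        row["bitscore"] = float(row["bitscore"])
        hits.append(row)
    return hits


def significant(hits, threshold=1e-6):
    """Keep hits whose E-value is at or below the threshold."""
    return [hit for hit in hits if hit["evalue"] <= threshold]


# Hypothetical rows, for illustration only.
SAMPLE = "\n".join([
    "query1\tsubjA\t98.50\t200\t3\t0\t1\t200\t51\t250\t1e-95\t350",
    "query1\tsubjB\t71.00\t230\t60\t7\t10\t239\t5\t230\t2e-08\t62.1",
    "query1\tsubjC\t85.00\t40\t6\t0\t100\t139\t7\t46\t0.54\t30.2",
])

for hit in significant(parse_hits(SAMPLE)):
    print(hit["sseqid"], hit["evalue"])
```

Hits above the threshold, such as the short third row, are dropped by the filter but may still deserve a manual look, as the slide notes.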
24
E-value vs. P-value
http://homepages.ulb.ac.be/~dgonze/TEACHING/stat_scores.pdf
http://www.ncbi.nlm.nih.gov/BLAST/tutorial/
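The slide body is not preserved, but the usual relation between the two quantities, as given in the NCBI statistics tutorial, is P = 1 - e^(-E): the probability of finding at least one chance alignment with the observed score or higher. A minimal sketch (the function name is hypothetical):

```python
import math


def p_value(e_value):
    """Probability of at least one chance hit, given the expected number E."""
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E.
    return -math.expm1(-e_value)


for e in (1e-6, 0.01, 1.0, 10.0):
    print(e, p_value(e))
```

For small values the two are nearly identical (E = 0.01 gives P of about 0.00995), while for large E the P-value saturates near 1, which is one reason BLAST reports E-values.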
25
Exercise
26
Find homologs of the CFTR gene in human
You can enter the gene ID rather than the sequence
Human DB only
We’ll start with high similarity
27
28
Now change to more dissimilar sequences
29
We get more results
30
Find homologs of the CFTR gene in other organisms
Not only human sequences
31
32
Which program to run for a nucleotide sequence: blastn or blastx?
blastn (genomic vs. genomic), e.g. for ncRNA
blastx (translated genomic vs. proteomic)
If you know your sequence encodes a protein, blastx is better, since you will get more reliable results.
33
Cool story of the day: How Alzheimer's is studied in yeast
34
Alzheimer's disease (AD)
Alzheimer's disease leads to nerve cell death and tissue loss throughout the brain.
Symptoms can include confusion, aggression, trouble with language, and long-term memory loss. Gradually, bodily functions are lost, ultimately leading to death.
There are no available treatments that stop or reverse the progression of the disease.
The disease is associated with plaques and tangles in the brain.
http://www.alz.org/braintour/alzheimers_changes.asp
http://en.wikipedia.org/wiki/Alzheimer's_disease
35
How can AD be studied in yeast?
Yeast cells lack the specialized processes of neuronal cells and the cell-cell communications that modulate neuropathology.
However, the most fundamental features of eukaryotic cell biology evolved before the split between yeast and metazoans.
Treusch et al. Science (2011)
http://lindquistlab.wi.mit.edu/
36
Thinakaran et al. Journal of Biological Chemistry (2008)
37
Susan Lindquist’s lab showed that Aβ was toxic when expressed in yeast. Later they tested the effect of this protein on rat neurons and on C. elegans neurons.
“To recapitulate this multicompartment trafficking in yeast, we fused an endoplasmic reticulum (ER) targeting signal to the N terminus of Aβ 1-42.”
Treusch et al. Science (2011)
http://lindquistlab.wi.mit.edu/
38
Wild-type worms invariably have five glutamatergic neurons in their tails.
Treusch et al. Science (2011)