Published by Cynthia McLaughlin. Modified over 9 years ago.
1
Welcome to Introduction to Bioinformatics. Monday, 21 March 2005. Genome Comparison. Coming attractions: How to compare genomes; Chi-squared analysis
3
E. coli: What makes it kill? Escherichia coli...... very small lab rats Courtesy of Kent State University Microbiology
4
E. coli: What makes it kill? Escherichia coli... haemorrhagic colitis
5
E. coli: What makes it kill? E. coli K12 E. coli O157:H7 TCTACTTATA TTCAATCCAC AGGGCTACAC AAGAGTCTGT TGAATGAACA CATACATGGT TTCTGTCTGC TCTGACCTCT GGCAGCTTTC TGGATTTCGG AACTCTAGCC TGCCCCACTC GAACCTTAGT GACTTCTGCT ATACCAAAGT CTCCGTAAAC CTCTAACATG ATGTCAGCAA TGAATAAACT TTGTTAAAGG TACAAATGAA AAGAGTTTAA AGTTAAAAAC GAATTGCAGT AAACCTGTAT GGTTACATGA ACTGCCTAAA TTATATATTT TAAGAAATTA ATTGCAATTA CCCCAGCTGT CATTAAAAAG AGGCAAATAC GACAGCACTG ACCCTCAAGA AGGCACCGGC GCTGAAATTC CGCTGAGAGC AGAGTGGTAC CCCTGCACCA GGTCTTTCCT GTGGGCACTG ATGAATGACT GAACGAACGA TTGAATGAAA TCTACTTATA TTCAATCCAC AGGGCTACAC AAGAGTCTGT TGAATGAACA CATACATGGT TTCTGTCTGC TCTGACCTCT GGCAGCTTTC TGGATTTCGG AACTCTAGCC TGCCCCACTC GAACCTTAGT GACTTCTGCT ATACCAAAGT CTCCGTAAAC CTCTAACATG ATGTCAGCAA TGAATAAACT TTGTTAAAGG TACAAATGAA AAGAGTTTAA AGTTAAAAAC GAATTGCAGT AAACCTGTAT GGTTACATGA ACTGCCTAAA TTATATATTT TAAGAAATTA ATTGCAATTA CCCCAGCTGT CATTAAAAAG AGGCAAATAC GACAGCACTG ACCCTCAAGA AGGCACCGGC GCTGAAATTC CGCTGAGAGC AGAGTGGTAC CCCTGCACCA GGTCTTTCCT GTGGGCACTG ATGAATGACT GAACGAACGA TTGAATGAAA
6
How to compare genomes E. coli O157:H7 genome GATAGATCCCCACGCCTATAATGGCGCATAACACACTAAACTTGGGGTATTGAAGCAGTCGCCAAAGAGTGACCGGTCATCCTTCTCCGCTGCGAAATATCCTTCTTGTTGGCATACCACGCCTTGA... E. coli K12 genome GCGGAGCAAACTGGGCGTCTTTCGAGAACTAACAAATCCGATTGCGGGCTTCTCACGCATAGGCGCAGTTATGGTTAATGCCAAAACTTTTTTTTCGCGCCGAAATAACATAATGCACAGGCATGGC...
7
E. coli K12 genome GCGGAGCAAACTGGGCGTCTTTCGAGAACTAACAAATCCGATTGCGGGCTTCTCACGCATAGGCGCAGTTATGGTTAATGCCAAAACTTTTTTTTCGCGCCGAAATAACATAATGCACAGGCATGGC... How to compare genomes E. coli O157:H7 genome GATAGATCCCCACGCCTATAATGGCGCATAACACACTAAACTTGGGGTATTGAAGCAGTCGCCAAAGAGTGACCGGTCATCCTTCTCCGCTGCGAAATATCCTTCTTGTTGGCATACCACGCCTTGA...
8
E. coli K12 genome GCGGAGCAAACTGGGCGTCTTTCGAGAACTAACAAATCCGATTGCGGGCTTCTCACGCATAGGCGCAGTTATGGTTAATGCCAAAACTTTTTTTTCGCGCCGAAATAACATAATGCACAGGCATGGC... How to compare genomes E. coli O157:H7 genome GATAGATCCCCACGCCTATAATGGCGCATAACACACTAAACTTGGGGTATTGAAGCAGTCGCCAAACCACGCCTTGA... GATAGATCCCC
12
E. coli K12 genome GCGGAGCAAACTGGGCGTCTTTCGAGAACTAACAAATCCGATTGCGGGCTTCTCACGCATAGGCGCAGTTATGGTTAATGCCAAAACTTTTTTTTCGCGCCGAAATAACATAATGCACAGGCATGGC... How to compare genomes E. coli O157:H7 genome GATAGATCCCCACGCCTATAATGGCGCATAACACACTAAACTTGGGGTATTGAAGCAGTCGCCAAACCACGCCTTGA... CCCACGCCTAT
14
E. coli K12 genome GCGGAGCAAACTGGGCGTCTTTCGAGAACTAACAAATCCGATTGCGGGCTTCTCACGCATAGGCGCAGTTATGGTTAATGCCAAAACTTTTTTTTCGCGCCGAAATAACATAATGCACAGGCATGGC... How to compare genomes E. coli O157:H7 genome GATAGATCCCCACGCCTATAATGGCGCATAACACACTAAACTTGGGGTATTGAAGCAGTCGCCAAACCACGCCTTGA...
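The preceding slides appear to show a short window (11 nucleotides, e.g. GATAGATCCCC) taken from one genome and looked up in the other, then moved along. The slides do not spell out the procedure, so the following is only a minimal sketch under that assumption. The function names and the toy target sequence `b` are made up for illustration; they are not from the presentation.

```python
from collections import defaultdict

def kmer_index(seq, k):
    """Map every length-k word in seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index

def window_matches(query, target, k):
    """Slide a length-k window along query and report exact matches in target.

    Returns (query_position, target_position) pairs, which are the points
    one would draw on a dot plot.
    """
    index = kmer_index(target, k)
    matches = []
    for i in range(len(query) - k + 1):
        for j in index.get(query[i:i + k], []):
            matches.append((i, j))
    return matches

# Start of the O157:H7 sequence shown on the slides.
a = "GATAGATCCCCACGCCTAT"
# Hypothetical target: two pieces of `a` separated by unrelated bases.
b = "TTTGATAGATCCCCAAACCCACGCCTATGG"

hits = window_matches(a, b, 11)
for query_pos, target_pos in hits:
    print(query_pos, target_pos, a[query_pos:query_pos + 11])
```

A larger window gives fewer chance matches but misses short conserved stretches; a smaller window does the opposite.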
15
E. coli O157:H7 E. coli K12
16
E. coli O157:H7 E. coli K12 O-Islands
17
Prochlorococcus SS120 vs. Prochlorococcus MED4 (100 nuc)
18
Prochlorococcus SS120 vs. Prochlorococcus MED4 (25 nuc)
19
Nature of Pathogenicity Islands Horizontal transfer of foreign DNA E. coli O157:H7 genome GATAGATCCCCACGCCTATAATGGCGCATAACACACTAAACTTGGGGTATTGAAGCAGTCGCCAAAGAGTGACCGGTCATCCTTCTCCGCTGCGAAATATCCTTCTTGTTGGCATACCACGCCTTGA... E. coli K12 genome GCGGAGCAAACTGGGCGTCTTTCGAGAACTAACAAATCCGATTGCGGGCTTCTCACGCATAGGCGCAGTTATGGTTAATGCCAAAACTTTTTTTTCGCGCCGAAATAACATAATGCACAGGCATGGC...
20
How do differences arise between genomes? Infection Phage Bacterial chromosome Phage genome Lysogenic pathway Lytic pathway Phage genome Death General transduction
21
How do differences arise between genomes? Infection Phage Bacterial chromosome Phage genome Lysogenic pathway Lytic pathway Phage genome Life!
23
How do differences arise between genomes? Infection Phage Bacterial chromosome Phage genome Lysogenic pathway Lytic pathway Phage genome Special transduction
24
How to compare genomes E. coli O157:H7 genome GATAGATCCCCACGCCTATAATGGCGCATAACACACTAAACTTGGGGTATTGAAGCAGTCGCCAAAGAGTGACCGGTCATCCTTCTCCGCTGCGAAATATCCTTCTTGTTGGCATACCACGCCTTGA... E. coli K12 genome GCGGAGCAAACTGGGCGTCTTTCGAGAACTAACAAATCCGATTGCGGGCTTCTCACGCATAGGCGCAGTTATGGTTAATGCCAAAACTTTTTTTTCGCGCCGAAATAACATAATGCACAGGCATGGC... Differences in genome sequence Useful only if very related
25
How to compare genomes E. coli O157:H7 genome GATAGATCCCCACGCCTATAATGGCGCATAACACACTAAACTTGGGGTATTGAAGCAGTCGCCAAAGAGTGACCGGTCATCCTTCTCCGCTGCGAAATATCCTTCTTGTTGGCATACCACGCCTTGA... E. coli K12 genome GCGGAGCAAACTGGGCGTCTTTCGAGAACTAACAAATCCGATTGCGGGCTTCTCACGCATAGGCGCAGTTATGGTTAATGCCAAAACTTTTTTTTCGCGCCGAAATAACATAATGCACAGGCATGGC... Differences in genome sequence Useful only if very related Differences in protein content Useful for even distant comparisons
26
How to compare genomes E. coli O157:H7 genome GATAGATCCCCACGCCTATAATGGCGCATAACACACTAAACTTGGGGTATTGAAGCAGTCGCCAAAGAGTGACCGGTCATCCTTCTCCGCTGCGAAATATCCTTCTTGTTGGCATACCACGCCTTGA... E. coli K12 genome GCGGAGCAAACTGGGCGTCTTTCGAGAACTAACAAATCCGATTGCGGGCTTCTCACGCATAGGCGCAGTTATGGTTAATGCCAAAACTTTTTTTTCGCGCCGAAATAACATAATGCACAGGCATGGC... Differences in genome sequence Useful only if very related Differences in protein content Useful for even distant comparisons How to find orthologous protein?
27
How to compare genomes E. coli O157:H7 genome GATAGATCCCCACGCCTATAATGGCGCATAACACACTAAACTTGGGGTATTGAAGCAGTCGCCAAAGAGTGACCGGTCATCCTTCTCCGCTGCGAAATATCCTTCTTGTTGGCATACCACGCCTTGA... E. coli K12 genome GCGGAGCAAACTGGGCGTCTTTCGAGAACTAACAAATCCGATTGCGGGCTTCTCACGCATAGGCGCAGTTATGGTTAATGCCAAAACTTTTTTTTCGCGCCGAAATAACATAATGCACAGGCATGGC... Differences in genome sequence Useful only if very related Differences in protein content Useful for even distant comparisons How to find corresponding protein?
28
X X X X X X X Yeast E. coli Anabaena Methanobacter
29
How to find corresponding protein? X X X X X X X Yeast E. coli Anabaena Methanobacter All similar protein? Most related by common descent? Orthologs Orthologs Paralogs
30
How to find corresponding protein? Most related by common descent? All similar protein? Orthologs Paralogs Blast E-value threshold Organism X Organism Y
31
How to find corresponding protein? Most related by common descent? Orthologs Blast E-value threshold Organism Y Organism X Organism Y Defined by bidirectional Blast hit
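The bidirectional-hit rule can be sketched as follows: protein x in organism X and protein y in organism Y are called orthologs when y is the best hit of x, x is the best hit of y, and both hits pass the E-value threshold. This is a minimal sketch only. The dictionaries stand in for parsed Blast results, and the protein names, E-values, and the `1e-5` cutoff are all hypothetical.

```python
def best_hit(hits, max_evalue):
    """Return the subject with the lowest E-value at or below the threshold, or None."""
    passing = [(evalue, subject) for subject, evalue in hits if evalue <= max_evalue]
    return min(passing)[1] if passing else None

def reciprocal_best_hits(x_vs_y, y_vs_x, max_evalue=1e-5):
    """Pair proteins that are each other's best hit in both search directions."""
    pairs = []
    for x_protein, hits in x_vs_y.items():
        y_protein = best_hit(hits, max_evalue)
        if y_protein is None:
            continue
        if best_hit(y_vs_x.get(y_protein, []), max_evalue) == x_protein:
            pairs.append((x_protein, y_protein))
    return pairs

# Hypothetical search results: query -> list of (subject, E-value).
x_vs_y = {
    "x1": [("y1", 1e-50), ("y2", 1e-20)],
    "x2": [("y1", 1e-30)],   # similar to y1, but y1 prefers x1
    "x3": [("y3", 0.5)],     # fails the E-value threshold
}
y_vs_x = {
    "y1": [("x1", 1e-48), ("x2", 1e-29)],
    "y2": [("x1", 1e-19)],
    "y3": [("x3", 0.4)],
}

orthologs = reciprocal_best_hits(x_vs_y, y_vs_x)
print(orthologs)
```

In this toy data x2 is similar to y1 but is not its best hit, so the pair is not reported as orthologous.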
32
How to find corresponding protein? PROTEINS-SIMILAR-TO ORTHOLOG-OF COMMON-ORTHOLOGS-OF
33
Nature of Pathogenicity Islands Horizontal transfer of foreign DNA E. coli O157:H7 genome GATAGATCCCCACGCCTATAATGGCGCATAACACACTAAACTTGGGGTATTGAAGCAGTCGCCAAAGAGTGACCGGTCATCCTTCTCCGCTGCGAAATATCCTTCTTGTTGGCATACCACGCCTTGA... E. coli K12 genome GCGGAGCAAACTGGGCGTCTTTCGAGAACTAACAAATCCGATTGCGGGCTTCTCACGCATAGGCGCAGTTATGGTTAATGCCAAAACTTTTTTTTCGCGCCGAAATAACATAATGCACAGGCATGGC...
34
Nature of Pathogenicity Islands Horizontal transfer of foreign DNA
35
Nature of Pathogenicity Islands
Nucleotide frequency comparisons
Nucleotide Count
Base    Sequence 1   Sequence 2   Total
A       1,000        600          1,600
C       1,000        800          1,800
G       1,000        700          1,700
T       1,000        900          1,900
Total   4,000        3,000        7,000
36
Nucleotide frequencies to detect foreign genes
1. Find nucleotide frequencies of native genes
2. Find nucleotide frequencies of test gene
3. Compare frequencies
4. How likely is it that the differences arose by chance? Chi-squared analysis
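One way to carry out steps 3 and 4 is a chi-squared test on the table of nucleotide counts, with the expected count in each cell taken as (row total × column total) / grand total. The slides do not say which form of the test was intended, so treat this as a sketch under that assumption. The counts are the ones from the nucleotide-count slide; the function name is made up.

```python
# Counts from the nucleotide-count slide: base -> (sequence 1, sequence 2).
counts = {
    "A": (1000, 600),
    "C": (1000, 800),
    "G": (1000, 700),
    "T": (1000, 900),
}

def chi_squared_table(table):
    """Chi-squared statistic and degrees of freedom for a rows x columns count table."""
    rows = list(table.values())
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(col_totals)
    chi2 = 0.0
    for row in rows:
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            # Expected count if both sequences shared the same base frequencies.
            expected = row_total * col_total / grand_total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return chi2, dof

chi2, dof = chi_squared_table(counts)
print(f"chi-squared = {chi2:.2f} with {dof} degrees of freedom")
```

A large statistic for the given degrees of freedom means the two sequences are unlikely to share the same base composition, which is the signal looked for in a gene of foreign origin.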
37
Where does χ² come from? A million repetitions of Mendel’s experiment. Create a million universes -- purple:white on average = 3:1. Result: 705 purple + 224 white = 929 plants. Result: 698 purple + 231 white = 929 plants. Result: 688 purple + 241 white = 929 plants. Result: 710 purple + 219 white = 929 plants. Result: 695 purple + 234 white = 929 plants. Result: 702 purple + 227 white = 929 plants.
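The thought experiment can be imitated on a small scale: draw 929 plants many times with a 3:1 chance of purple, and see how often a result deviates from expectation at least as much as 705:224 does. This is a minimal sketch; the 2,000 repetitions (instead of a million), the seed, and the function names are arbitrary choices, not taken from the slides.

```python
import random

PLANTS = 929
P_PURPLE = 0.75

def mendel_trial(rng):
    """One simulated experiment: return (purple, white) counts for 929 plants."""
    purple = sum(rng.random() < P_PURPLE for _ in range(PLANTS))
    return purple, PLANTS - purple

def chi_squared(purple, white):
    """Sum of (O - E)^2 / E for the purple and white classes under 3:1."""
    expected_purple = P_PURPLE * PLANTS
    expected_white = (1 - P_PURPLE) * PLANTS
    return ((purple - expected_purple) ** 2 / expected_purple
            + (white - expected_white) ** 2 / expected_white)

rng = random.Random(2005)  # fixed seed so the run is repeatable
trials = [mendel_trial(rng) for _ in range(2000)]
chi2_values = [chi_squared(p, w) for p, w in trials]

mean_purple = sum(p for p, _ in trials) / len(trials)
observed = chi_squared(705, 224)
fraction_as_extreme = sum(v >= observed for v in chi2_values) / len(chi2_values)

print(f"mean purple count: {mean_purple:.1f}")
print(f"fraction of trials at least as extreme as 705:224: {fraction_as_extreme:.2f}")
```

The fraction printed at the end is a simulated version of the P value that the chi-squared table provides.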
38
200,000 repetitions Where does χ² come from? A million repetitions of Mendel’s experiment
39
500,000 repetitions Where does χ² come from? A million repetitions of Mendel’s experiment
40
1,000,000 repetitions Why is it that the two dotted lines are on opposite sides of the mean?
41
Where does χ² come from? A million repetitions of Mendel’s experiment 1,000,000 repetitions What’s the most likely result? How often does it occur?
42
Deviation from Expectation Two example experiments Why is there shading on both sides of the curve? The farther O is from E, is the shaded area smaller or larger?
43
Steps in Performing a χ² Test
Determine the expected values for the experiment
Model: 3 purple : 1 white flower
Total counted: 929
Purple = 75% of 929 = 696.75
White = 25% of 929 = 232.25
Calculate the squares of the deviations
χ² = Sum of (O - E)² / E
χ² = (705 - 696.75)² / 696.75 + (224 - 232.25)² / 232.25
   ≈ 8² / 700 + 8² / 230
   ≈ 0.09 + 0.3
χ² ≈ 0.39 (exact value: 0.391)
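The arithmetic on this slide can be checked directly. The sketch below recomputes the expected counts from the 3:1 model and sums (O - E)² / E over the two flower colors; the function and variable names are illustrative only.

```python
def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [705, 224]                       # purple, white
total = sum(observed)                       # 929 plants
expected = [0.75 * total, 0.25 * total]     # 3:1 model

chi2 = chi_squared(observed, expected)
print(f"expected = {expected}, chi-squared = {chi2:.3f}")
```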
44
Steps in Performing a χ² Test
Determine the degrees of freedom
What was the experiment? - Count 929 flowers a million times
Ask: purple? (if not, then white)
Therefore ONE degree of freedom
Look up probability for χ² value
χ² = 0.30
80% > P > 50%. Call it ~60%
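Instead of bracketing P from a printed table, the probability for one degree of freedom can be computed directly. With one degree of freedom, χ² is the square of a standard normal deviate, so P = erfc(sqrt(χ² / 2)). This sketch covers only that one-degree-of-freedom case, and the function name is made up.

```python
import math

def p_value_one_dof(chi2):
    """Probability of a chi-squared value at least this large, 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# 0.30 is the value looked up on the slide; 0.39 is the value computed from
# 705 purple : 224 white; 3.84 is the usual 5% cutoff for comparison.
for chi2 in (0.30, 0.39, 3.84):
    print(f"chi-squared = {chi2:.2f}  ->  P = {p_value_one_dof(chi2):.3f}")
```

Both 0.30 and 0.39 fall inside the 80% > P > 50% bracket read from the table.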
45
Steps in Performing a χ² Test. P ~60%. Draw a conclusion. The result has a 50% chance of being correct. The hypothesis has a 50% chance of being correct. 60% of the time, Mendel’s result or worse would have arisen by chance if purple:white truly occurs in a 3:1 ratio.
46
Deviation from Expectation Two example experiments Study Question 20: What if Mendel had counted not 929 but 929,000 plants -- what do the curve and shading look like then? (d still = 29) P = .50 P = ???
47
Interpretation of Chi-Square Does a high P value indicate the hypothesis is correct? Does a low P value indicate the hypothesis is incorrect?
48
Bag of Marbles 1000’s of marbles! 50% red, 50% blue Guaranteed!
49
Test Claim of 50%:50% 41 marbles 59 marbles 100 marbles TOTAL Is their claim correct? How to tell how close is close enough?
50
χ² Test of Claim
χ² = Sum of (O - E)² / E
χ² = (53 - 50)² / 50 + (47 - 50)² / 50
   = 9 / 50 + 9 / 50
   = 18 / 50
   = 0.36
P = ? P = ~60%
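For a draw of 100 marbles the P value can also be obtained without the chi-squared approximation, by summing exact binomial probabilities of every outcome at least as far from 50:50 as the one observed. The sketch below does both for the 53:47 counts used in the calculation above. The function names are made up, and the exact binomial sum is a cross-check added here, not something the slides describe.

```python
import math

def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def exact_two_sided_p(successes, n, p=0.5):
    """Exact binomial probability of a count at least as far from n*p as observed."""
    deviation = abs(successes - n * p)
    return sum(
        math.comb(n, k) * p ** k * (1 - p) ** (n - k)
        for k in range(n + 1)
        if abs(k - n * p) >= deviation
    )

chi2 = chi_squared([53, 47], [50, 50])
p_exact = exact_two_sided_p(53, 100)
print(f"chi-squared = {chi2:.2f}, exact two-sided P = {p_exact:.3f}")
```

The same two functions can be applied to the 41:59 split shown on the previous slide, which deviates much more from the claimed 50%:50%.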