Genome annotation

1 Genome annotation

2 What we have GATCAATGATGATAGGAATTGAAAGTGTCTTAATTACAATCCCTGTGCAATTATTAATAACTTTTTTGTT CACCTGTTCCCAGAGGAAACCTCAAGCGGATCTAAAGGAGGTATCTCCTCAAAAGCATCCTCTAATGTCA GAAGCAAGTGAGCACTGGGAAGAATACTTGAGAAAGTGGCATGCTTACGAAACTGCTAAGGTGCACCCCA GGGAGGTTGCAAAACCTGCATCTAAAGGAAAGCCCAGGCTTCCAAAGGCTTCTCCTAAGGCAACCTCCAA ACCCAAGCACAGGCATAGGAAAGCACAAATCAAGACCCCGGAGACCCTCGGGCCAAATACAAATTCCAAT AACAACATAGAAGATGATCAGGATGTCCATTCCGAACAGCACCCTTCCCAAAAGGATCTCCAGCAGCTTA AGAAAAAGCCCCGGATCGTCCTACCTTGGTGGTGTGTTTATGTTGCATGGTTTTTGGTTTTTGCTACTTC TAGCATATCCTCATTCTTCATTGTATTTTATGGACTGACTTACGGCTATGACAAGTCAATAGAATGGCTC TTTGCATCTTTTTGTTCATTCTGTCAGTCAGTTCTTCTGGTGCAGCCATCTAAAATTATACTCCTGTCAG GCTTCAGAACGAATAAACCCAAGTATTGCAAAAACCTTTCATGGTCAACCAAGTATAAATATACTGAGAT CAGGTTGGATGGAATGCGTATGCATCCAGAAGAAATGCAGAGGATACATGACCAGATCGTCCGAATCCGA GGCACGAGGATGTACCAACCCCTTACAGAAGATGAAATCAGAATATTCAAAAGAAAGAAGAGGATCAAGA GAAGAGCACTCCTGTTTCTGAGTTACATTCTAACTCACTTTATCTTTCTAGCCCTTCTGTTGATCCTTAT CGTCTTACTACGTCACACTGACTGCTTTTACTATAACCAGTTTATTCGTGATCGGTTCTCTATGGATCTT GCTACTGTGACTAAGCTGGAAGACATCTATAGATGGCTAAACAGCGTGCTGTTGCCTTTGTTACACAATG ACCTGAATCCAACATTTCTTCCTGAAAGCTCGTCTAAAATCCTTGGCCTTCCATTGATGAGGCAAGTGAG AGCAAAATCTAGTGAAAAAATGTGTCTACCTGCCGAAAAGTTTGTGCAAAACAGCATCAGAAGAGAAATT CATTGTCACCCCAAATATGGCATTGACCCAGAAGACACAAAAAACTATTCTGGCTTTTGGAATGAAGTTG ATAAGCAGGCTATAGATGAGAGTACCAATGGATTTACTTATAAGCCTCAAGGAACGCAATGGCTATATTA TTCCTATGGACTACTACACACCTATGGATCTGGAGGATATGCACTCTATTTTTTTCCAGAACAGCAGCGG TTTAATTCCACACTGAGGCTCAAAGAACTTCAAGAAAGCAATTGGCTGGATGAGAAGACATGGGCTGTGG TTTTGGAATTAACAACTTTTAATCCAGATATAAATCTGTTCTGTAGCATTTCGGTCATATTTGAAGTCTC TCAGTTAGGAGTTGTCAACACAAGCATATCTCTGCACTCTTTTTCACTTGCTGATTTTGACAGAAAAGCT TCAGCAGAAATCTACTTGTATGTGGCCATTCTCATTTTTTTCTTAGCCTACGTTGTTGATGAGGGTTGTA TCATTATGCAAGAAAGAGCCTCCTATGTGAGAAGTGTGTATAATTTGCTCAACTTTGCTTTAAAGTGCAT ATTTACTGTGTTGATTGTGCTCTTTCTCAGGAAACATTTCCTGGCCACTGGCATAATTCGGTTTTACTTG TCGAACCCAGAAGACTTCATTCCCTTTCATGCAGTTTCTCAGGTAGATCACATTATGAGGATAATTTTGG 
GTTTCCTGTTATTTCTGACAATTTTGAAGACCCTCAGGTATTCCAGATTCTTCTACGATGTGCGCCTGGC TCAGAGGGCCATCCAGGCTGCCCTCCCTGGCATCTGCCACATGGCATTTGTTGTGTCCGTGTATTTCTTC GTATACATGGCTTTTGGTTACCTGGTGTTTGGTCAGCATGAATGGAACTACAGTAACTTGATTCATTCCA CTCAGACAGTATTTTCCTATTGTGTCTCAGCTTTCCAGAACACTGAATTTTCCAATAACAGGATTCTGGG GGTCCTGTTCCTCTCATCTTTCATGCTGGTGATGATCTGCGTCTTGATCAACTTATTTCAGGCTGTAATT

3 What we want: Annotated sequence GATCAATGATGATAGGAATTGAAAGTGTCTTAATTACAATCCCTGTGCAATTATTAATAACTTTTTTGTT CACCTGTTCCCAGAGGAAACCTCAAGCGGATCTAAAGGAGGTATCTCCTCAAAAGCATCCTCTAATGTCA GAAGCAAGTGAGCACTGGGAAGAATACTTGAGAAAGTGGCATGCTTACGAAACTGCTAAGGTGCACCCCA GGGAGGTTGCAAAACCTGCATCTAAAGGAAAGCCCAGGCTTCCAAAGGCTTCTCCTAAGGCAACCTCCAA ACCCAAGCACAGGCATAGGAAAGCACAAATCAAGACCCCGGAGACCCTCGGGCCAAATACAAATTCCAAT AACAACATAGAAGATGATCAGGATGTCCATTCCGAACAGCACCCTTCCCAAAAGGATCTCCAGCAGCTTA AGAAAAAGCCCCGGATCGTCCTACCTTGGTGGTGTGTTTATGTTGCATGGTTTTTGGTTTTTGCTACTTC TAGCATATCCTCATTCTTCATTGTATTTTATGGACTGACTTACGGCTATGACAAGTCAATAGAATGGCTC TTTGCATCTTTTTGTTCATTCTGTCAGTCAGTTCTTCTGGTGCAGCCATCTAAAATTATACTCCTGTCAG GCTTCAGAACGAATAAACCCAAGTATTGCAAAAACCTTTCATGGTCAACCAAGTATAAATATACTGAGAT CAGGTTGGATGGAATGCGTATGCATCCAGAAGAAATGCAGAGGATACATGACCAGATCGTCCGAATCCGA GGCACGAGGATGTACCAACCCCTTACAGAAGATGAAATCAGAATATTCAAAAGAAAGAAGAGGATCAAGA GAAGAGCACTCCTGTTTCTGAGTTACATTCTAACTCACTTTATCTTTCTAGCCCTTCTGTTGATCCTTAT CGTCTTACTACGTCACACTGACTGCTTTTACTATAACCAGTTTATTCGTGATCGGTTCTCTATGGATCTT GCTACTGTGACTAAGCTGGAAGACATCTATAGATGGCTAAACAGCGTGCTGTTGCCTTTGTTACACAATG ACCTGAATCCAACATTTCTTCCTGAAAGCTCGTCTAAAATCCTTGGCCTTCCATTGATGAGGCAAGTGAG AGCAAAATCTAGTGAAAAAATGTGTCTACCTGCCGAAAAGTTTGTGCAAAACAGCATCAGAAGAGAAATT CATTGTCACCCCAAATATGGCATTGACCCAGAAGACACAAAAAACTATTCTGGCTTTTGGAATGAAGTTG ATAAGCAGGCTATAGATGAGAGTACCAATGGATTTACTTATAAGCCTCAAGGAACGCAATGGCTATATTA TTCCTATGGACTACTACACACCTATGGATCTGGAGGATATGCACTCTATTTTTTTCCAGAACAGCAGCGG TTTAATTCCACACTGAGGCTCAAAGAACTTCAAGAAAGCAATTGGCTGGATGAGAAGACATGGGCTGTGG TTTTGGAATTAACAACTTTTAATCCAGATATAAATCTGTTCTGTAGCATTTCGGTCATATTTGAAGTCTC TCAGTTAGGAGTTGTCAACACAAGCATATCTCTGCACTCTTTTTCACTTGCTGATTTTGACAGAAAAGCT TCAGCAGAAATCTACTTGTATGTGGCCATTCTCATTTTTTTCTTAGCCTACGTTGTTGATGAGGGTTGTA TCATTATGCAAGAAAGAGCCTCCTATGTGAGAAGTGTGTATAATTTGCTCAACTTTGCTTTAAAGTGCAT ATTTACTGTGTTGATTGTGCTCTTTCTCAGGAAACATTTCCTGGCCACTGGCATAATTCGGTTTTACTTG TCGAACCCAGAAGACTTCATTCCCTTTCATGCAGTTTCTCAGGTAGATCACATTATGAGGATAATTTTGG 
GTTTCCTGTTATTTCTGACAATTTTGAAGACCCTCAGGTATTCCAGATTCTTCTACGATGTGCGCCTGGC TCAGAGGGCCATCCAGGCTGCCCTCCCTGGCATCTGCCACATGGCATTTGTTGTGTCCGTGTATTTCTTC GTATACATGGCTTTTGGTTACCTGGTGTTTGGTCAGCATGAATGGAACTACAGTAACTTGATTCATTCCA CTCAGACAGTATTTTCCTATTGTGTCTCAGCTTTCCAGAACACTGAATTTTCCAATAACAGGATTCTGGG GGTCCTGTTCCTCTCATCTTTCATGCTGGTGATGATCTGCGTCTTGATCAACTTATTTCAGGCTGTAATT
[Slide labels: Exon 1, Exon 2, Exon 3, Exon 4]
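One way to picture what "annotated sequence" means in practice is a raw sequence plus a list of exon coordinates. The sketch below is only illustrative: it uses the first 30 nt of the slide sequence, and the exon coordinates are hypothetical, since the real exon positions are not recoverable from the transcript. The function names `mark_exons` and `spliced_transcript` are made up for this example.

```python
from typing import List, Tuple

# An exon is a 0-based, half-open interval [start, end) on the genomic sequence.
Exon = Tuple[int, int]

def mark_exons(seq: str, exons: List[Exon]) -> str:
    """Return the sequence with exonic bases in upper case and the rest in lower case."""
    out = list(seq.lower())
    for start, end in exons:
        out[start:end] = seq[start:end].upper()
    return "".join(out)

def spliced_transcript(seq: str, exons: List[Exon]) -> str:
    """Concatenate the exons in genomic order, i.e. remove the introns."""
    return "".join(seq[start:end] for start, end in sorted(exons))

# First 30 nt of the sequence shown on the slide.
genomic = "GATCAATGATGATAGGAATTGAAAGTGTCT"
# HYPOTHETICAL exon coordinates, chosen only to demonstrate the data structure.
exons = [(5, 11), (20, 26)]

print(mark_exons(genomic, exons))          # gatcaATGATGataggaattGAAAGTgtct
print(spliced_transcript(genomic, exons))  # ATGATGGAAAGT
```

Half-open intervals are used here because they map directly onto Python slices; annotation file formats often use 1-based inclusive coordinates instead, so a conversion step would be needed when reading real annotations.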

4 Making sense of genomic sequences
HMM analysis
Compare genomes to each other
Compare to other kinds of supporting data
–Which kinds of data can you think of?

5 Other kinds of data
1. mRNA sequences (and ESTs)
2. Protein sequences

6 -OMEs: Genome, Transcriptome, Proteome, Interactome, Metabolome, Phenome

7 -OMEs
-OME            Technologies
Genome          Sequencing
Transcriptome   Microarray
Proteome        Computer (ORFs), Mass-spec
Interactome     Y2H, Mass-spec
Metabolome      Mass-spec
Phenome         Phenotype, Biochemical, Disease
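The "Computer (ORFs)" entry refers to predicting proteins by scanning a sequence for open reading frames. A minimal sketch of that idea follows, under simplifying assumptions that are not from the slides: forward strand only, standard stop codons, and no introns. The function name `find_orfs` and the `min_codons` parameter are hypothetical.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq: str, min_codons: int = 2) -> List[Tuple[int, int, int]]:
    """Scan the three forward reading frames for ATG...stop ORFs.

    Returns (start, end, frame) tuples with 0-based start and exclusive end;
    the end includes the stop codon. min_codons counts codons before the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG of the currently open ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs

# Toy sequence: ATG AAA TTT TAG in frame 2.
print(find_orfs("CCATGAAATTTTAGGG"))  # [(2, 14, 2)]
```

For eukaryotic genes with introns, like the multi-exon example in this deck, a plain ORF scan of genomic DNA is not sufficient on its own, which is one reason transcript and protein evidence is brought in.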

8 Transcript databases
RefSeq contains full-length sequences of mRNAs, carefully reviewed
–Currently 28,000 human sequences
dbEST contains 5’ and 3’ reads of random cDNAs
–Currently 5.9 million human sequences

9 What are ESTs?

11 UniGene
Merge (cluster) any two ESTs when >100 bp are identical
5 million ESTs -> 107,014 clusters
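The merging rule on this slide can be sketched as single-linkage clustering with a union-find structure. This is only an illustration of the stated rule, not the actual UniGene pipeline: it looks for exact shared substrings on the same strand, ignores reverse complements and sequencing errors, and the function name `cluster_ests` is hypothetical. Two sequences share an identical stretch longer than `min_identical` bp exactly when they share a k-mer of length `min_identical + 1`.

```python
from typing import Dict, List

def cluster_ests(ests: List[str], min_identical: int = 100) -> List[List[int]]:
    """Group ESTs that share an exact identical stretch longer than min_identical bp.

    Returns clusters as sorted lists of indices into `ests`.
    """
    parent = list(range(len(ests)))

    def find(x: int) -> int:
        # Follow parent links to the cluster representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a: int, b: int) -> None:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    k = min_identical + 1
    first_seen: Dict[str, int] = {}  # k-mer -> index of the first EST containing it
    for idx, seq in enumerate(ests):
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if kmer in first_seen:
                union(idx, first_seen[kmer])
            else:
                first_seen[kmer] = idx

    clusters: Dict[int, List[int]] = {}
    for idx in range(len(ests)):
        clusters.setdefault(find(idx), []).append(idx)
    return sorted(clusters.values())

# Toy data with a small threshold (5 bp) so the overlaps are visible by eye.
toy_ests = ["AAACCCGGGTTT", "CCCGGGTTTACG", "TTTACGATCGAT", "GGGGGGGGCCCC"]
print(cluster_ests(toy_ests, min_identical=5))  # [[0, 1, 2], [3]]
```

Note that ESTs 0 and 2 end up in the same cluster although they do not overlap each other directly; they are linked through EST 1. This transitive merging is why a few clusters can grow very large.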

12 ESTs
UniGene: total number of clusters 107,014
Cluster size      Number of clusters
1 (singletons)    8131
2                 38169
3-4               23302
5-8               11989
9-16              5616
17-32             3733
33-64             3303
65-128            3799
129-256           4368
257-512           3029
513-1024          1079
1025-2048         343
2049-4096         106
4097-8192         33
8193-16384        16
16385-32768       2
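The size classes in this table double at each step (1, 2, 3-4, 5-8, ...). A small sketch of how such a histogram could be produced from a list of cluster sizes is given below; the function names and the example sizes are hypothetical and are not the UniGene data.

```python
from collections import Counter
from typing import Dict, Iterable

def size_bin(size: int) -> str:
    """Return the power-of-two bin label for a cluster size (1, 2, 3-4, 5-8, ...)."""
    if size < 1:
        raise ValueError("cluster size must be at least 1")
    upper = 1 << (size - 1).bit_length()  # smallest power of two >= size
    lower = upper // 2 + 1
    return str(upper) if lower == upper else f"{lower}-{upper}"

def size_histogram(sizes: Iterable[int]) -> Dict[str, int]:
    """Count how many clusters fall into each power-of-two size bin."""
    return dict(Counter(size_bin(s) for s in sizes))

# HYPOTHETICAL cluster sizes, for illustration only.
example_sizes = [1, 1, 2, 3, 4, 5, 8, 9, 20000]
print(size_histogram(example_sizes))
```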

13 Transcripts: what can we learn?
Comparing genome sequences to transcripts allows:
–Confirmation of gene predictions
–Experimental identification of exons/introns, 5’ UTRs, 3’ UTRs
–Detection of alternative splicing
Assess the relative abundance of transcripts: digital differential display (DDD).
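Digital differential display compares how often ESTs from one gene turn up in different cDNA libraries. The slide does not say which statistic is applied, so the sketch below assumes a two-sided Fisher's exact test on the EST counts; the function name and the example counts are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(hits_a: int, total_a: int, hits_b: int, total_b: int) -> float:
    """Two-sided Fisher's exact test p-value for EST counts in two libraries.

    hits_x  = ESTs from the gene of interest in library x
    total_x = all ESTs sequenced from library x
    """
    hits = hits_a + hits_b
    total = total_a + total_b

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x of the gene's ESTs in library A.
        return comb(total_a, x) * comb(total_b, hits - x) / comb(total, hits)

    p_observed = prob(hits_a)
    lowest = max(0, hits - total_b)
    highest = min(hits, total_a)
    # Sum over all tables that are at most as likely as the observed one.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )

# HYPOTHETICAL counts: 30 of 10,000 ESTs in library A versus 5 of 10,000 in library B.
print(fisher_exact_two_sided(30, 10_000, 5, 10_000))
```

A small p-value suggests that the gene is represented differently in the two libraries. Because EST libraries can be normalized or subtracted, such counts are only a rough proxy for expression level.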

15 Annotation example

16 18 exons, 623 AA
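As a quick sanity check on these numbers: a protein of 623 amino acids implies a coding sequence of 623 codons plus one stop codon. The sketch below assumes that the 623 residues include the initiator methionine; how the coding sequence is distributed over the 18 exons is not given on the slide, and exons also carry UTR sequence. The function name is hypothetical.

```python
def cds_length_nt(n_residues: int, include_stop: bool = True) -> int:
    """Length in nucleotides of the coding sequence for a protein of n_residues."""
    codons = n_residues + (1 if include_stop else 0)
    return 3 * codons

print(cds_length_nt(623))                      # 1872
print(cds_length_nt(623, include_stop=False))  # 1869
```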

17 Proteomics: What for?
Disease targets
Gene finding
Secondary modifications
Measuring expression levels
Protein-protein interactions

18 What's new?
Mass spectrometry was invented at the turn of the 20th century (Thomson)
Nobel Prize to Aston (1922)
MALDI-TOF (Henzel et al., 1993)
Nano-electrospray (Wilm and Mann, 1996) coupled to a tandem mass spectrometer

24 2-D electrophoresis

