1
Lecture 7: Gen(om)e duplications 9/23/09
2
Homework
1. Clustal and trees
2. Ensembl links
3. OMIM
3
HW #1 GNAT1
4
Fasta file
>Human_GNAT1
MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLEECLEFIAIIY
GNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMSDIIQRLWKDSGIQACFERAS
EYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGIIETQFSFKDLNFRMFDVGGQRSERKKWIHC
FEGVTCIIFIAALSAYDMVLVEDDEVNRMHESLHLFNSICNHRYFATTSIVLFLNKKDVFFEKIKKAHLS
ICFPDYDGPNTYEDAGNYIKVQFLELNMRRDVKEIYSHMTCATDTQNVKFVFDAVTDIIIKENLKDCGLF
>Chimp_GNAT1
MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLEECLEFIAIIY
GNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMSDIIQRLWKDSGIQACFERAS
EYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGIIETQFSFKDLNFRMFDVGGQRSERKKWIHC
FEGVTCIIFIAALSAYDMVLVEDDEVNRMHESLHLFNSICNHRYFATTSIVLFLNKKDVFFEKIKKAHLS
ICFPDYDGPNTYEDAGNYIKVQFLELNMRRDVKEIYSHMTCATDTQNVKFVFDAVTDIIIKENLKDCGLF
>Dog_GNAT1
MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLEECLEFIAIIY
GNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMSDIIQRLWKDSGIQACFERAS
EYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGIIETQFSFKDLNFRMFDVGGQRSERKKWIHC
FEGVTCIIFIAALSAYDMVLVEDDEVNRMHESLHLFNSICNHRYFATTSIVLFLNKKDVFSEKIKKAHLS
ICFPDYDGPNTYEDAGNYIKVQFLELNMRRDVKEIYSHMTCATDTQNVKFVFDAVTDIIIKENLKDCGLF
Note: Programs will use whatever is in the identifier up to the 1st space as labels. If you don’t like GenBank #s, you can change this to species names.
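The labelling rule in the note can be sketched in a few lines of Python. This is only an illustration: the function names and the accession-style header in the example are made up, and real programs may also truncate long labels.

```python
def fasta_labels(fasta_text):
    """Return each record's label: header text after '>' up to the first whitespace."""
    labels = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            fields = line[1:].split()
            labels.append(fields[0] if fields else "")
    return labels


def relabel(fasta_text, new_labels):
    """Replace each header line, in order, with '>' plus the matching new label."""
    remaining = iter(new_labels)
    out = []
    for line in fasta_text.splitlines():
        out.append(">" + next(remaining) if line.startswith(">") else line)
    return "\n".join(out)


# Hypothetical input: the first header is a made-up accession plus a description.
example = ">XX_000001.1 made-up description text\nMGAGASAEEK\n>Dog_GNAT1\nMGAGASAEEK"
labels = fasta_labels(example)
renamed = relabel(example, ["Human_GNAT1", "Dog_GNAT1"])
print(labels)
print(renamed)
```

Only the part of the header before the first space survives as a label, which is why swapping in a short species name keeps trees readable.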
5
CLUSTAL 2.0.8 multiple sequence alignment

Human_GNAT1     MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLE 60
Chimp_GNAT1     MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLE 60
Dog_GNAT1       MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLE 60
Cow_GNAT1       MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLE 60
Rat_GNAT1       MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLE 60
Mouse_GNAT1     MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLE 60
Zfish_GNAT1     MGAGASAEEKHSRELEKKLKEDADKDARTVKLLLLGAGESGKSTIVKQMKIIHKDGYSLE 60
                ***********************:*****************************:******

Human_GNAT1     ECLEFIAIIYGNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMS 120
Chimp_GNAT1     ECLEFIAIIYGNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMS 120
Dog_GNAT1       ECLEFIAIIYGNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMS 120
Cow_GNAT1       ECLEFIAIIYGNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMS 120
Rat_GNAT1       ECLEFIAIIYGNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMS 120
Mouse_GNAT1     ECLEFIAIIYGNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMS 120
Zfish_GNAT1     ECLEFIVIIYSNTMQSILAVVRAMTTLNIGYGDAAAQDDARKLMHLADTIEEGTMPKELS 120
                ******.***.**:*****:********* ***:* *********:************:*

Human_GNAT1     DIIQRLWKDSGIQACFERASEYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGI 180
Chimp_GNAT1     DIIQRLWKDSGIQACFERASEYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGI 180
Dog_GNAT1       DIIQRLWKDSGIQACFERASEYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGI 180
Cow_GNAT1       DIIQRLWKDSGIQACFDRASEYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGI 180
Rat_GNAT1       DIIQRLWKDSGIQACFDRASEYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGI 180
Mouse_GNAT1     DIIQRLWKDSGIQACFDRASEYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGI 180
Zfish_GNAT1     DIILRLWKDSGIQACFDRASEYQLNDSAGYYLNDLERLIQPGYVPTEQDVLRSRVKTTGI 180
                *** ************:***************.*****: ********************

Human_GNAT1     IETQFSFKDLNFRMFDVGGQRSERKKWIHCFEGVTCIIFIAALSAYDMVLVEDDEVNRMH 240
Chimp_GNAT1     IETQFSFKDLNFRMFDVGGQRSERKKWIHCFEGVTCIIFIAALSAYDMVLVEDDEVNRMH 240
Dog_GNAT1       IETQFSFKDLNFRMFDVGGQRSERKKWIHCFEGVTCIIFIAALSAYDMVLVEDDEVNRMH 240
Cow_GNAT1       IETQFSFKDLNFRMFDVGGQRSERKKWIHCFEGVTCIIFIAALSAYDMVLVEDDEVNRMH 240
Rat_GNAT1       IETQFSFKDLNFRMFDVGGQRSERKKWIHCFEGVTCIIFIAALSAYDMVLVEDDEVNRMH 240
Mouse_GNAT1     IETQFSFKDLNFRMFDVGGQRSERKKWIHCFEGVTCIIFIAALSAYDMVLVEDDEVNRMH 240
Zfish_GNAT1     IETQFSFKDLNFRMFDVGGQRSERKKWIHCFEGVTCIIFIAALSAYDMVLVEDDEVNRMH 240
                ************************************************************

Human_GNAT1     ESLHLFNSICNHRYFATTSIVLFLNKKDVFFEKIKKAHLSICFPDYDGPNTYEDAGNYIK 300
Chimp_GNAT1     ESLHLFNSICNHRYFATTSIVLFLNKKDVFFEKIKKAHLSICFPDYDGPNTYEDAGNYIK 300
Dog_GNAT1       ESLHLFNSICNHRYFATTSIVLFLNKKDVFSEKIKKAHLSICFPDYDGPNTYEDAGNYIK 300
Cow_GNAT1       ESLHLFNSICNHRYFATTSIVLFLNKKDVFSEKIKKAHLSICFPDYNGPNTYEDAGNYIK 300
Rat_GNAT1       ESLHLFNSICNHRYFATTSIVLFLNKKDVFSEKIKKAHLSICFPDYDGPNTYDDAGNYIK 300
Mouse_GNAT1     ESLHLFNSICNHRYFATTSIVLFLNKKDVFSEKIKKAHLSICFPDYDGPNTYEDAGNYIK 300
Zfish_GNAT1     ESLHLFNSICNHRYFATTSIVLFLNKKDVFVEKIKKAHLSMCFPEYDGPNTFEDAGNYIK 300
                ****************************** *********:***:*:****::*******

Human_GNAT1     VQFLELNMRRDVKEIYSHMTCATDTQNVKFVFDAVTDIIIKENLKDCGLF 350
Chimp_GNAT1     VQFLELNMRRDVKEIYSHMTCATDTQNVKFVFDAVTDIIIKENLKDCGLF 350
Dog_GNAT1       VQFLELNMRRDVKEIYSHMTCATDTQNVKFVFDAVTDIIIKENLKDCGLF 350
Cow_GNAT1       VQFLELNMRRDVKEIYSHMTCATDTQNVKFVFDAVTDIIIKENLKDCGLF 350
Rat_GNAT1       VQFLELNMRRDVKEIYSHMTCATDTQNVKFVFDAVTDIIIKENLKDCGLF 350
Mouse_GNAT1     VQFLELNMRRDVKEIYSHMTCATDTQNVKFVFDAVTDIIIKENLKDCGLF 350
Zfish_GNAT1     VQFLDLNLRRDIKEIYSHMTCATDTENVKFVFDAVTDIIIKENLKDCGLF 350
                ****:**:***:*************:************************

350 sites
* Fixed: 324
324/350 = 92.6%
6
HW #1 GNGT1
Fixed = 49/74 = 66%

Human_GNGT1     MPVINIEDLTEKDKLKMEVDQLKKEVTLERMLVSKCCEEVRDYVEERSGEDPLVKGIPED 60
Chimp_GNGT1     MPVINIEDLTEKDKLKMEVDQLKKEVTLERMLVSKCCEEVRDYVEERSGEDPLVKGIPED 60
Dog_GNGT1       MPVINIEDLTEKDKLKMEVDQLKKEVTLERMLVSKCCEEVRDYVEERSGEDPLVKGIPED 60
Cow_GNGT1       MPVINIEDLTEKDKLKMEVDQLKKEVTLERMLVSKCCEEFRDYVEERSGEDPLVKGIPED 60
Mouse_GNGT1     MPVINIEDLTEKDKLKMEVDQLKKEVTLERMMVSKCCEEVRDYIEERSGEDPLVKGIPED 60
Rat_GNGT1       MPVINIEDLTEKDKLKMEVDQLKKEVTLERVMVSKCCEEVRDYIEERSREDPLVKGIPED 60
Zfish_GNGT1     MPIIDVENMTDLDKAKMEVTQLKTEVKLERAKVSKCCEEITEYIQGGADEDPLVKGIPEE 60
                **:*::*::*: ** **** ***.**.***  *******. :*::  : **********:

Human_GNGT1     KNPFKELKGGCVIS 74
Chimp_GNGT1     KNPFKELKGGCVIS 74
Dog_GNGT1       KNPFKELKGGCVIS 74
Cow_GNGT1       KNPFKELKGGCVIS 74
Mouse_GNGT1     KNPFKELKGGCVIS 74
Rat_GNGT1       KNPFKELKGGCVIS 74
Zfish_GNGT1     KNPFKE-KGGCVIC 73
                ****** ******.
7
Protein interactions Rhodopsin GNAT1 GNB1 GNGT1
8
Relative constraint, % of fixed sites
GNAT1    324 / 350 = 92.6%
GNB1     306 / 340 = 88%
GNGT1    49 / 74 = 66%
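A fixed site is an alignment column in which every sequence carries the same residue (the columns Clustal marks with `*`). The Python sketch below counts them. The function name is hypothetical, and the example input is only the last 14 columns of the GNGT1 alignment from the homework, not the full alignment.

```python
def fixed_sites(aligned_sequences):
    """Return (number of fixed columns, total columns) for equal-length aligned sequences."""
    lengths = {len(seq) for seq in aligned_sequences}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    columns = list(zip(*aligned_sequences))
    # A column is fixed when it holds one distinct residue and that residue is not a gap.
    fixed = sum(1 for col in columns if len(set(col)) == 1 and col[0] != "-")
    return fixed, len(columns)


# Last block of the GNGT1 alignment: six mammals share one sequence, zebrafish differs.
gngt1_tail = ["KNPFKELKGGCVIS"] * 6 + ["KNPFKE-KGGCVIC"]
fixed, total = fixed_sites(gngt1_tail)
print(f"{fixed} / {total} = {100 * fixed / total:.1f}% fixed")
```

Run over a whole alignment, the same count divided by the alignment length gives the relative-constraint percentage.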
9
Trees
10
Ensembl search finds lots of groups
Interpro domain - identifies and groups proteins by protein signatures
Ensembl families - proteins grouped by phylogenetic relationship
Vega / Havana - the human hand-curated part of the Ensembl database. They confirm each predicted gene in different genomes
Find proteins, pseudogenes, processed pseudogenes
11
We want Ensembl protein_coding Gene
Check that it is rhodopsin and not some rhodopsin-related gene
12
Transcript and protein info are useful
13
Protein - use links at left to look at the sequence
14
Protein sequence
15
Exon - shows sequences of exons as well as those of UTRs and introns
[Figure labels: Start, 5’UTR, Intron]
16
cDNA sequence includes known SNPs Variation in human population
17
Can export sequence
18
Ensembl
There is a dizzying array of data and info on this web site.
We will try to use it as a “helpful” tool to gather more sequences.
Often we just want to get all the homologs from all the species where Ensembl has made that link.
19
At bottom of sequence list is link to sequence display
20
Go back to the gene page and scroll down to find orthologs.
This shows pairwise comparisons in clustalw format.
21
OMIM
22
Q4. Making trees
Clustalw is a bit limited
  Sequences are compared using distances
  Trees are drawn by neighbor joining
Nice to have more options
  Max likelihood, distance, parsimony
Phylip - set of modules that you can mix and match to make trees
  Phylemon
  Pasteur Institute
23
Methods
Parsimony: Alignment -> Input characters to parsimony tree program
Distance: Alignment -> Calculate distances -> Input distances to tree program
Maximum likelihood: Alignment -> Input characters to ML program
24
Steps to make a distance tree
Steps                           Program
Align sequences                 Clustalw-multialign
Calculate distances             DNAdist / Protdist
Use distances to make a tree    Neighbor
Display tree                    External program
25
Steps to make a distance tree Align sequences Can do in clustalw at EBI web site or at Pasteur web site
26
Pasteur Institute - Phylogenetics
27
Clustalw2 at Pasteur - under alignment and under multiple
Either paste in sequences or select fasta file and upload
28
Leave defaults and hit Run
29
Save files to keep results
Clustal does make a dendrogram, which you can save
30
Save files to keep results You can pass the results of this to the next program here
31
Calculate distances
If DNA use DNAdist
If protein (AA) use Protdist
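To make the idea of a pairwise distance concrete, here is a Python sketch of the simplest possible measure, the p-distance (the fraction of compared sites that differ). This is an assumption made for illustration only: Protdist and DNAdist correct the raw differences with substitution models, so their numbers will not match this one. The function names are hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Fraction of aligned columns, ignoring any column with a gap, where two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    differences = sum(1 for a, b in pairs if a != b)
    return differences / len(pairs)


def distance_matrix(alignment):
    """Return (names, square matrix of p-distances) for a dict of name -> aligned sequence."""
    names = list(alignment)
    matrix = [[p_distance(alignment[x], alignment[y]) for y in names] for x in names]
    return names, matrix


# Last block of the GNGT1 alignment for two of the species.
names, matrix = distance_matrix({
    "Human_GNGT1": "KNPFKELKGGCVIS",
    "Zfish_GNGT1": "KNPFKE-KGGCVIC",
})
print(names)
print(matrix)
```

Here one of the 13 ungapped columns differs, so the human-zebrafish distance is 1/13.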
32
Pass alignment to protdist
33
Use Protdist under distance
Upload or paste data and say Run
34
Save distance matrix then send to neighbor joining program to make tree
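What the neighbor joining step does with the distance matrix can be sketched as follows. This is a textbook-style implementation written for illustration, not the code of the Neighbor program; the function name and the four-taxon example matrix are made up. Like Neighbor, it returns an unrooted tree in Newick format, and on real data it can produce small negative branch lengths.

```python
from itertools import combinations


def neighbor_joining(names, matrix):
    """Build an unrooted neighbor-joining tree and return it as a Newick string."""
    labels = list(names)
    d = {a: {b: matrix[i][j] for j, b in enumerate(names)} for i, a in enumerate(names)}
    while len(labels) > 3:
        n = len(labels)
        # Net divergence of each node from all the others.
        r = {a: sum(d[a][b] for b in labels if b != a) for a in labels}
        # Join the pair that minimises the Q criterion.
        a, b = min(combinations(labels, 2),
                   key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]])
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        new = f"({a}:{len_a:.5f},{b}:{len_b:.5f})"
        d[new] = {}
        for c in labels:
            if c not in (a, b):
                # Distance from the new internal node to every remaining node.
                d[new][c] = d[c][new] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        labels = [c for c in labels if c not in (a, b)] + [new]
    # Three nodes left: connect them at a single central node.
    a, b, c = labels
    len_a = 0.5 * (d[a][b] + d[a][c] - d[b][c])
    len_b = 0.5 * (d[a][b] + d[b][c] - d[a][c])
    len_c = 0.5 * (d[a][c] + d[b][c] - d[a][b])
    return f"({a}:{len_a:.5f},{b}:{len_b:.5f},{c}:{len_c:.5f});"


# Made-up distances that fit the tree ((A:1,B:2):1,(C:3,D:4)) exactly.
taxa = ["A", "B", "C", "D"]
distances = [
    [0, 3, 5, 6],
    [3, 0, 6, 7],
    [5, 6, 0, 7],
    [6, 7, 7, 0],
]
tree = neighbor_joining(taxa, distances)
print(tree)
```

Because the example distances are perfectly tree-like, the method recovers the branch lengths exactly; choosing an outgroup afterwards only decides where the root is drawn.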
35
Tell it which taxon # is the outgroup (here 7) - this will root your tree!
36
CLUSTAL 2.0.11 multiple sequence alignment

Human_GNAT1     MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLE
Chimp_GNAT1     MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLE
Dog_GNAT1       MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLE
Cow_GNAT1       MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLE
Rat_GNAT1       MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLE
Mouse_GNAT1     MGAGASAEEKHSRELEKKLKEDAEKDARTVKLLLLGAGESGKSTIVKQMKIIHQDGYSLE
Zfish_GNAT1     MGAGASAEEKHSRELEKKLKEDADKDARTVKLLLLGAGESGKSTIVKQMKIIHKDGYSLE
                ***********************:*****************************:******

Human_GNAT1     ECLEFIAIIYGNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMS
Chimp_GNAT1     ECLEFIAIIYGNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMS
Dog_GNAT1       ECLEFIAIIYGNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMS
Cow_GNAT1       ECLEFIAIIYGNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMS
Rat_GNAT1       ECLEFIAIIYGNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMS
Mouse_GNAT1     ECLEFIAIIYGNTLQSILAIVRAMTTLNIQYGDSARQDDARKLMHMADTIEEGTMPKEMS
Zfish_GNAT1     ECLEFIVIIYSNTMQSILAVVRAMTTLNIGYGDAAAQDDARKLMHLADTIEEGTMPKELS
                ******.***.**:*****:********* ***:* *********:************:*

Human_GNAT1     DIIQRLWKDSGIQACFERASEYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGI
Chimp_GNAT1     DIIQRLWKDSGIQACFERASEYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGI
Dog_GNAT1       DIIQRLWKDSGIQACFERASEYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGI
Cow_GNAT1       DIIQRLWKDSGIQACFDRASEYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGI
Rat_GNAT1       DIIQRLWKDSGIQACFDRASEYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGI
Mouse_GNAT1     DIIQRLWKDSGIQACFDRASEYQLNDSAGYYLSDLERLVTPGYVPTEQDVLRSRVKTTGI
Zfish_GNAT1     DIILRLWKDSGIQACFDRASEYQLNDSAGYYLNDLERLIQPGYVPTEQDVLRSRVKTTGI
                *** ************:***************.*****: ********************

Note: Zebrafish is taxon #7
37
Save tree
38
What does this tree mean???
Tree shows relationships and branch lengths
(((Cow_GNAT1:0.00281,Rat_GNAT1:0.00281):0.00004,Mouse_GNAT:-0.00004):0.00070,((Human_GNAT:0.00000,Chimp_GNAT:0.00001):0.00244,Dog_GNAT1:0.00037):0.00210,Zfish_GNAT:0.06645);
Just relationships:
(((Cow,Rat),Mouse),((Human,Chimp),Dog),Zfish)
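Going from the tree with branch lengths to "just relationships" is mechanical: drop every `:number` and shorten the labels. A small Python sketch, assuming the stray spaces inside two labels on the slide are line-wrap artifacts and that labels never contain a colon (the function name is hypothetical):

```python
import re

# Tree from the slide, with the line-wrap spaces removed from the labels.
NEWICK = ("(((Cow_GNAT1:0.00281,Rat_GNAT1:0.00281):0.00004,Mouse_GNAT:-0.00004):0.00070,"
          "((Human_GNAT:0.00000,Chimp_GNAT:0.00001):0.00244,Dog_GNAT1:0.00037):0.00210,"
          "Zfish_GNAT:0.06645);")


def topology_only(newick):
    """Remove branch lengths (':' followed by a possibly negative number) from a Newick string."""
    return re.sub(r":-?\d+(?:\.\d+)?", "", newick)


relationships = topology_only(NEWICK)
# Shorten labels such as Cow_GNAT1 or Mouse_GNAT to the species name alone.
species_only = re.sub(r"_GNAT1?", "", relationships)
print(relationships)
print(species_only)
```

The nesting of the parentheses carries the relationships; the numbers after each colon are the branch lengths.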
39
You can download FigTree for drawing trees Mac PC
40
Tree - does this make sense?
41
What is the difference between homologs, orthologs and paralogs?????
42
Orthologs
Have common ancestor, derived by descent
Paralogs
Gene duplicates within the same organism
Homologs = orthologs + paralogs
43
[Figure: opsin gene tree. Opsin classes: LWS, RH2, SWS2, SWS1, RH1. Lamprey genes: Lamprey LWS, Lamprey RHB, Lamprey RHA, Lamprey S2, Lamprey S1]
44
How do gen(om)es evolve? What can change?
DNA mutation
DNA deletions / insertions (indels)
Recombination
Selection - change in gene frequency
Gene transfer
Duplications
45
[Figure: gene tree with a gene duplication. Tip labels: Human, Chicken, Frog, Zebrafish, Dog (listed twice, once per gene copy) and Lamprey]
46
Ohno, Evolution by Gene Duplication, 1970
Gene duplication is the primary way that you get new genes to work with
Genome duplications
  Double # of chromosomes
  Keep balance in biochemical machinery
  Duplicate regulatory structure
New genes can evolve to do new jobs!
47
Gene vs genome duplications How do you know what has duplicated?
48
Mechanisms for duplication
1. Tandem duplication
2. Insertion of retrotransposed gene
3. Genome / chromosome duplication
49
1. Mismatched recombination
Leads to extra genes inserted right next to original gene
Unequal crossover
50
Normal DNA recombination
Switches genes from one chromosome to the other
Leads to new gene combinations
51
Mismatched recombination
If chromosomes misalign, recombination leads to gain of a gene on one chromosome and loss of a gene on the other.
Tandem arrays of genes
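The gain and loss can be seen in a toy model where each chromosome is just an ordered list of genes and a crossover swaps everything to the right of a break point. This Python sketch is a deliberate simplification with made-up names; break points here fall between genes, not inside them.

```python
def crossover(chrom_a, chrom_b, break_a, break_b):
    """Exchange the segments to the right of break_a (on chrom_a) and break_b (on chrom_b)."""
    recombinant_1 = chrom_a[:break_a] + chrom_b[break_b:]
    recombinant_2 = chrom_b[:break_b] + chrom_a[break_a:]
    return recombinant_1, recombinant_2


parent = ["red", "green"]

# Properly aligned: both chromosomes break at the same position, so gene number is unchanged.
aligned = crossover(parent, parent, 1, 1)

# Misaligned: the break falls after "green" on one chromosome but before "green" on the other.
misaligned = crossover(parent, parent, 2, 1)

print(aligned)
print(misaligned)
```

The misaligned case yields one chromosome with an extra green gene next to the original and one chromosome that has lost it, which is how tandem arrays grow and shrink.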
52
Opsin gene tandem arrays on X chromosome
Only the first 2 genes are expressed, so it doesn’t matter if there are more green genes. They are just along for the ride.
53
Misaligned recombination
If recombination happens within a gene, get a chimera
Intermediate phenotype - changes pigment light sensitivity
Opsin genes on X chromosome
54
Human red and green opsins
Green: 530 nm; red: 560 nm
Tuning sites: A164S = +2 nm, F261Y = +10 nm, A269T = +14 nm
Hybrid pigment: 554 nm
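The numbers on this slide can be combined in a simple additive model, sketched below in Python. Two assumptions are made here that the slide does not state: that the shifts add independently, and that the 554 nm pigment is a hybrid carrying the changes at sites 261 and 269 but not at 164. The names are hypothetical.

```python
GREEN_PEAK_NM = 530  # peak absorbance of the green pigment, from the slide

# Shift in peak absorbance (nm) for each substitution, from the slide.
SHIFTS_NM = {"A164S": 2, "F261Y": 10, "A269T": 14}


def predicted_peak(substitutions, base_nm=GREEN_PEAK_NM):
    """Add the shift of each listed substitution to the green pigment's peak (additive model)."""
    return base_nm + sum(SHIFTS_NM[name] for name in substitutions)


hybrid_peak = predicted_peak(["F261Y", "A269T"])
all_three_peak = predicted_peak(["A164S", "F261Y", "A269T"])
print(hybrid_peak)
print(all_three_peak)
```

Under these assumptions the hybrid comes out at 554 nm, matching the slide. All three substitutions together give 556 nm, a little short of the 560 nm red pigment, so these three sites account for most but not all of the difference.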
55
Normal human visual pigments Normal max = 420, 535, 565 nm
56
Deuteranomaly - green pigment shifted towards red
max = 420, 550, 565 nm
5% male, 0.04% female
58
2. Insertion of retrotransposed gene
Gene can be transcribed to mRNA
mRNA then gets reverse transcribed and inserted into DNA
Clue that a gene is retrotransposed?
No introns - all coding sequence
59
Comparison of rhodopsin genes Vertebrate rhodopsin gene Fish rhodopsin gene
60
Possibilities
Lost introns and stayed in place
mRNA sequence reinserted somewhere else in the genome
61
Fugu - human comparison Rh1 Human chr 3 Fugu scaffold 830 Human chr Z Fugu Rh gene has been inserted into chromosome