Published by Scot George. Modified over 9 years ago.
CS273A Lecture 9: Repetitive Elements
http://cs273a.stanford.edu [BejeranoFall13/14]
MW 12:50-2:05pm in Beckman B302
Profs: Serafim Batzoglou & Gill Bejerano
TAs: Harendra Guturu & Panos Achlioptas
Announcements
HW1 done. HW2 en route.
The Functional Genome
Type            # in genome    % of genome
genes           25,000         2%
ncRNA           15,000         1%
cis elements    1,000,000      >10%
TTATATTGAATTTTCAAAAATTCTTACTTTTTTTTTGGATGGACGCAAAGAAGTTTAATAATCATATTACATGGCATTACCACCATATA CATATCCATATCTAATCTTACTTATATGTTGTGGAAATGTAAAGAGCCCCATTATCTTAGCCTAAAAAAACCTTCTCTTTGGAACTTTC AGTAATACGCTTAACTGCTCATTGCTATATTGAAGTACGGATTAGAAGCCGCCGAGCGGGCGACAGCCCTCCGACGGAAGACTCTCCTC CGTGCGTCCTCGTCTTCACCGGTCGCGTTCCTGAAACGCAGATGTGCCTCGCGCCGCACTGCTCCGAACAATAAAGATTCTACAATACT AGCTTTTATGGTTATGAAGAGGAAAAATTGGCAGTAACCTGGCCCCACAAACCTTCAAATTAACGAATCAAATTAACAACCATAGGATG ATAATGCGATTAGTTTTTTAGCCTTATTTCTGGGGTAATTAATCAGCGAAGCGATGATTTTTGATCTATTAACAGATATATAAATGGAA AAGCTGCATAACCACTTTAACTAATACTTTCAACATTTTCAGTTTGTATTACTTCTTATTCAAATGTCATAAAAGTATCAACAAAAAAT TGTTAATATACCTCTATACTTTAACGTCAAGGAGAAAAAACTATAATGACTAAATCTCATTCAGAAGAAGTGATTGTACCTGAGTTCAA TTCTAGCGCAAAGGAATTACCAAGACCATTGGCCGAAAAGTGCCCGAGCATAATTAAGAAATTTATAAGCGCTTATGATGCTAAACCGG ATTTTGTTGCTAGATCGCCTGGTAGAGTCAATCTAATTGGTGAACATATTGATTATTGTGACTTCTCGGTTTTACCTTTAGCTATTGAT TTTGATATGCTTTGCGCCGTCAAAGTTTTGAACGATGAGATTTCAAGTCTTAAAGCTATATCAGAGGGCTAAGCATGTGTATTCTGAAT CTTTAAGAGTCTTGAAGGCTGTGAAATTAATGACTACAGCGAGCTTTACTGCCGACGAAGACTTTTTCAAGCAATTTGGTGCCTTGATG AACGAGTCTCAAGCTTCTTGCGATAAACTTTACGAATGTTCTTGTCCAGAGATTGACAAAATTTGTTCCATTGCTTTGTCAAATGGATC ATATGGTTCCCGTTTGACCGGAGCTGGCTGGGGTGGTTGTACTGTTCACTTGGTTCCAGGGGGCCCAAATGGCAACATAGAAAAGGTAA AAGAAGCCCTTGCCAATGAGTTCTACAAGGTCAAGTACCCTAAGATCACTGATGCTGAGCTAGAAAATGCTATCATCGTCTCTAAACCA GCATTGGGCAGCTGTCTATATGAATTAGTCAAGTATACTTCTTTTTTTTACTTTGTTCAGAACAACTTCTCATTTTTTTCTACTCATAA CTTTAGCATCACAAAATACGCAATAATAACGAGTAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGA TAATGTTTTCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTT GGATACCTATTCTTGACATGATATGACTACCATTTTGTTATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAGTTGCGAAGTT CTTGGCAAGTTGCCAACTGACGAGATGCAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGT TTTCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATAC CTATTCTTGACATGATATGACTACCATTTTGTTATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAGTCATTTGCGAAGTTCT 
TGGCAAGTTGCCAACTGACGAGATGCAGTTTCCTACGCATAATAAGAATAGGAGGGAATATCAAGCCAGACAATCTATCATTACATTTA AGCGGCTCTTCAAAAAGATTGAACTCTCGCCAACTTATGGAATCTTCCAATGAGACCTTTGCGCCAAATAATGTGGATTTGGAAAAAGA GTATAAGTCATCTCAGAGTAATATAACTACCGAAGTTTATGAGGCATCGAGCTTTGAAGAAAAAGTAAGCTCAGAAAAACCTCAATACA GCTCATTCTGGAAGAAAATCTATTATGAATATGTGGTCGTTGACAAATCAATCTTGGGTGTTTCTATTCTGGATTCATTTATGTACAAC CAGGACTTGAAGCCCGTCGAAAAAGAAAGGCGGGTTTGGTCCTGGTACAATTATTGTTACTTCTGGCTTGCTGAATGTTTCAATATCAA CACTTGGCAAATTGCAGCTACAGGTCTACAACTGGGTCTAAATTGGTGGCAGTGTTGGATAACAATTTGGATTGGGTACGGTTTCGTTG GTGCTTTTGTTGTTTTGGCCTCTAGAGTTGGATCTGCTTATCATTTGTCATTCCCTATATCATCTAGAGCATCATTCGGTATTTTCTTC TCTTTATGGCCCGTTATTAACAGAGTCGTCATGGCCATCGTTTGGTATAGTGTCCAAGCTTATATTGCGGCAACTCCCGTATCATTAAT GCTGAAATCTATCTTTGGAAAAGATTTACAATGATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAGTCATTTGCGAAGTTCT TGGCAAGTTGCCAACTGACGAGATGCAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTT TCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATACCT ATTCTTGACATGATATGACTACCATTTTGTTATTGTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTT TCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGA GATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTA TCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTT CATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTT CAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAA TAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGT ATGATAATGTTTTCAATGTAAGAGATTTCGATTATCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATAAAG 4
One Cell, One Genome, One Replication
Every cell holds a copy of all its DNA = its genome.
The human body is made of ~10^13 cells.
All originate from a single cell through repeated cell divisions.
[Figure: a chicken egg (one cell, one copy of the DNA) becomes, through cell division, a chicken (≈ 10^13 copies of the egg's DNA). Labels: cell; genome = all DNA; DNA strings = chromosomes.]
Every Genome is Different
DNA replication is imperfect: genomes differ between individuals of the same species, and even between the cells of one individual.
[Figure: the sequence ...ACGTACGACTGACTAGCATCGACTACGA... shown for the chicken egg and for the chicken, with changes (TT, CAT) marked. Labels: functional = "many changes are not tolerated"; junk = "anything goes".]
This has bad implications (disease) and good implications (adaptation).
Drift, Negative & Positive Selection
[Figure: three panels plotted over time: Neutral Drift, Positive Selection, Negative Selection.]
Human Mutation Rate
10^-9 per base pair per generation.
This refers to mutations that are not repaired.
Thus, there are at least six new mutations in each child that were not present in either parent.
Mutations range from the smallest possible (a single base pair change) to the largest (whole genome duplication).
Selection does not tolerate all of these mutations, but it sure does tolerate some.
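The "six new mutations" figure can be checked with a few lines of arithmetic. This is only a sketch: the genome size used here (about 3 × 10^9 base pairs per haploid copy, so about 6 × 10^9 per child) is an assumption that does not appear on the slide, and the function name is made up for illustration.

```python
# Rate quoted on the slide: unrepaired mutations per base pair per generation.
MUTATION_RATE = 1e-9

# Assumption (not from the slide): ~3e9 bp per haploid genome, two copies per child.
HAPLOID_GENOME_BP = 3e9
DIPLOID_GENOME_BP = 2 * HAPLOID_GENOME_BP


def expected_new_mutations(rate, genome_bp):
    """Expected number of new mutations per child under a uniform per-base rate."""
    return rate * genome_bp


new_mutations = expected_new_mutations(MUTATION_RATE, DIPLOID_GENOME_BP)
print(f"Expected new mutations per child: about {new_mutations:.0f}")
```

Under these assumptions the product of rate and genome size comes out at roughly six, matching the slide's lower bound.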
Why this cartoon?
Sequences that repeat many times in the genome
Take up cumulatively a whopping half of the genome
Come in two major, very different, flavors (I and II)
I. Interspersed Repeats / TEs [Adapted from Lunter]
DNA Transposons
Genomic Transmission
For repeat copies to accumulate through the generations they must make it into the germline cells (eggs & sperm).
Equally true for any genomic mutation.
[Figure: a chicken egg becomes, through cell division, a chicken (≈ 10^13 copies of the egg's DNA). Labels: cell; genome = all DNA; DNA strings = chromosomes.]
LINE & SINE Elements
Retrovirus-like Elements
TE composition and assortment vary among eukaryotic genomes
[Figure: TE composition per genome on a 0-100% scale, split into DNA transposons, LTR retrotransposons, and non-LTR retrotransposons, for slime mold, budding yeast, fission yeast, Neurospora, Arabidopsis, rice, nematode, Drosophila, mosquito, Fugu, mouse, and human. Feschotte & Pritham 2006]
Repeat Ages
Interspecies variation in genome size within various groups of organisms
[Figure from Ryan Gregory (2005)]
The amount of TE DNA correlates positively with genome size
[Figure: genomic DNA, TE DNA, and protein-coding DNA in Mb (axis 0-3000) for Plasmodium, slime mold, budding yeast, fission yeast, Neurospora, Arabidopsis, Brassica, rice, maize, nematode, Drosophila, mosquito, sea squirt, zebrafish, Fugu, mouse, and human. Feschotte & Pritham 2006]
The proportion of protein-coding genes decreases with genome size, while the proportion of TEs increases with genome size.
[Figure: series for TEs and for protein-coding genes. Gregory, Nat Rev Genet 2005]
Repeat Insertions Can Break Things
Repeat Insertions Can Become Functional
Regulatory Elements from Mobile Elements
[Yass is a small town in New South Wales, Australia.]
Co-option event, probably due to favorable genomic context
Britten & Davidson Hypothesis: Repeat to Rewire!
Enhancer structure reminder
The Road to Co-Option
[Figure labels: Transposition Event, Random Mutations, Neutral decay, Potential Co-Option States]
Inferring Phylogeny Using Repeats [Nishihara et al, 2006]
Assembly Challenges
Transposons as Genetic Engineering Tools
Human Gene Therapy
II. Simple Repeats
Every possible motif of mono-, di-, tri- and tetranucleotide repeats is vastly overrepresented in the human genome. Examples: AAAAAAAAA, CACACACAC, CAACAACAA.
These are called microsatellites. Longer repeating units are called minisatellites. The really long ones are called satellites.
Highly polymorphic in the human population.
Highly heterozygous in a single individual.
As a result, microsatellites are used in paternity testing, forensics, and the inference of demographic processes.
There is no clear definition of how many repetitions make a simple repeat, nor how imperfect the different copies can be.
Highly variable between species: e.g., using the same search criteria, the mouse & rat genomes have 2-3 times more microsatellites than the human genome. They're also longer in mouse & rat.
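A minimal sketch of what "search criteria" for microsatellites might look like, using a regular expression over perfect tandem repeats only. The function name and both thresholds (`max_unit`, `min_copies`) are assumptions made for illustration. Because the slide stresses that no agreed definition exists, changing either threshold changes what counts as a repeat. Imperfect copies are not handled here.

```python
import re


def find_microsatellites(seq, max_unit=4, min_copies=3):
    """Return (start, unit, copies) for perfect tandem repeats with units of 1..max_unit bp.

    The thresholds are arbitrary choices, not a standard definition.
    """
    # Lazy quantifier on the unit so the shortest repeating unit is preferred,
    # e.g. "A" rather than "AA" for a poly-A run.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# The three example motifs from the slide.
for example in ("AAAAAAAAA", "CACACACAC", "CAACAACAA"):
    print(example, find_microsatellites(example))
```

Only whole copies of the unit are counted, so the trailing partial copy in CACACACAC (the final C) is left out of the reported repeat.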
DNA Replication
Simple Repeats Create Funky DNA Structures
These Bumps Give The DNA Polymerase Hiccups
Expandable Repeats and Disease
Restriction Enzymes
Restriction enzymes recognize and make a cut within specific DNA sequences, known as restriction sites. This is usually a 4-6 base pair palindromic sequence.
Naturally found in different types of bacteria.
Bacteria use restriction enzymes to protect themselves from foreign DNA.
Many have been isolated and sold for use in lab work.
[Figure: cuts that leave a blunt end vs. a sticky end]
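"Palindromic" here means that the site reads the same on both strands, i.e. it equals its own reverse complement. The sketch below checks that property and locates sites in a sequence. The helper names are hypothetical, the test sequence is invented, and EcoRI's site GAATTC is used as a familiar example that is not named on the slide.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic_site(site):
    """True if the site equals its own reverse complement."""
    return site == reverse_complement(site)


def find_sites(seq, site):
    """0-based start positions of every occurrence of site in seq (overlaps allowed)."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


ECORI_SITE = "GAATTC"               # example site, not taken from the slide
toy_sequence = "AAGAATTCTTGAATTCAA"  # invented sequence for illustration

print(is_palindromic_site(ECORI_SITE))
print(find_sites(toy_sequence, ECORI_SITE))
```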
DNA Fingerprint Basics
DNA fragments of different sizes will be produced by a restriction enzyme that cuts at the points shown by the arrows.
DNA fragments are then separated based on size using gel electrophoresis.
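The digest-then-separate idea can be sketched in code: cut a sequence at every restriction site, then list fragment lengths as a stand-in for band positions on a gel. Everything here is illustrative. The function names and both toy sequences are invented, and the cut offset of 1 reflects EcoRI cutting after the first G of GAATTC. Only one strand is modeled, so sticky ends are ignored.

```python
def digest(seq, site, cut_offset):
    """Split seq at every occurrence of site, cutting cut_offset bases into the site."""
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    # Skip empty pieces that would arise from a cut at either end of the sequence.
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def gel_order(fragments):
    """Fragment lengths, longest first (shorter fragments migrate farther on a gel)."""
    return sorted((len(fragment) for fragment in fragments), reverse=True)


SITE, OFFSET = "GAATTC", 1

# Two invented sequences that differ by one base inside the second site.
individual_a = "AAGAATTCTTGAATTCAA"
individual_b = "AAGAATTCTTGAGTTCAA"

print(gel_order(digest(individual_a, SITE, OFFSET)))
print(gel_order(digest(individual_b, SITE, OFFSET)))
```

A single base change that destroys a site merges two fragments into one, so the two toy individuals produce different band patterns. Length differences at repeat loci change fragment sizes in a similar way.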
DNA fingerprinting can be used in paternity testing or murder cases.
There are Tracks for it
Interspersed vs. Simple Repeats
From an evolutionary point of view, transposons and simple repeats are very different.
Different instances of the same transposon share common ancestry (but not necessarily a direct common progenitor).
Different instances of the same simple repeat most often do not.
Categories are NOT mutually exclusive
We already discussed repeat instances that became:
  Coding exons
  Enhancers
There are known genomic loci that:
  Code for protein coding exons and act as enhancers
  Ditto for non-coding RNA + enhancer
There are bi-directional exons:
  Coding in both directions
  Coding and anti-sense non-coding
  Both non-coding