Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome Brooke Peterson-Burch Voytas Laboratory Iowa State University.

Beyond genes
- Most DNA in eukaryotes doesn’t code for anything necessary for the survival and replication of the organism.
- How did that sequence get there? Why isn’t it eliminated?
- Genome sequences can teach us about genome evolution and the part that retroelements play in it.

What’s a retroelement?
- A type of transposable element.
- An mRNA copy of the parental element ‘genome’ is reverse transcribed into DNA and inserted into a new location in the host genome.
- Transposition is replicative.

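Retroelement genomes
[Figure: genome maps of the retroelement groups. HIV-1 (Retroviridae): LTR, gag (MA, CA, NC, p6), pol (PR, RT, RH, IN), env (SU, TM), the accessory genes vif, vpr, vpu, tat, rev, and nef, LTR. Also shown: retroposons (gag, RT, RH, EN, poly(A) tail), Pseudoviridae and Metaviridae (MA, CA, NC, PR, RT, RH, IN), BEL (gag, PR, RT, RH, IN), and Dirs (gag, RT, RH, λ recombinase).]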

Retro living…
[Figure: the element is transcribed into mRNA, which is then translated. Gene maps of HIV-1 and a Pseudoviridae element are shown for reference.]

Retroelement life cycle
[Figure: packaging into a particle. Only viruses escape the host cell.]

Retroelement life cycle
[Figure: reverse transcription produces cDNA.]

Retroelement life cycle
[Figure: IN integrates the cDNA, producing a new copy of the element.]

Retroelements play a major role in the structure and evolution of many genomes. Genome sequences provide a great resource for studies of element diversity, distribution, and identification.

Retroelements and Genomes
Genome data-mining can help answer questions about:
- Number of elements
- Types of elements
- Diversity
- Physical distribution
- Impact on host
- Odd or interesting elements
- Evolutionary history
- Element sequence and domain characteristics

Diversity of the Pseudoviridae

A retroelement family tree
[Figure: tree with branches for retroposons, Pseudoviridae, BEL, Dirs, Retroviridae, and Metaviridae.]

A. thaliana captures all plant Pseudoviridae diversity
[Figure: the same tree, with branches for retroposons, Pseudoviridae, BEL, Dirs, Retroviridae, and Metaviridae.]

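Mapping proteases to HIV-1 structure helps explain patterns of conservation
[Figure: element map (LTR, MA, CA, NC, PR, RT, RH, IN, LTR) with PR highlighted.]

Integrase: what’s happening in the back?
[Figure: element map (LTR, MA, CA, NC, PR, RT, RH, IN, LTR) with IN highlighted.]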

Putative env gene is conserved across species

Retroviruses independently evolved at least twice in plants
[Figure: phylogenetic tree of Retroviridae, Pseudoviridae, and Metaviridae with the putative retroviruses marked; scale bar 0.1 changes.]

Retrovirus envlike-coding regions show a bipartite structural organization
[Figure: env-like ORFs of Endovir1-1, ToRTL1, and SIRE-1, with the HIV-1 gene map for comparison. Labels recovered from the figure: 668 aa, 648 aa, 476 aa; 31% ID, 24% ID.]

Gag surprises…
- Gag is much larger in the retroviral lineage.
- Sequence and structural conservation are evident.
[Figure: Gag regions labeled A, B, and C compared between the putative retrovirus group and (Hemi/Pseudo)virus elements, above an element map (LTR, MA, CA, NC, PR, RT, RH, IN, LTR).]

Diversity of the Pseudoviridae family: summary
- Enzymatic regions appear to be highly constrained, other than the IN C-terminus.
- Arabidopsis LTR retrotransposons are representative of plant elements in the family.
- The putative retroviruses represent a uniquely evolving Pseudoviridae lineage bearing numerous changes in the retrotransposon genome.
- Sub-lineage differences suggest areas to focus experimental efforts for functional studies.
- Gag shows greater sequence conservation than previously thought.

Summary continued…
- envlike-coding regions have been evolutionarily conserved, indicating a functional role for the ORF.
- Features suggestive of viral env proteins have been identified in all LTR retrotransposon envlike ORFs.
- Putative env proteins have evolved in at least two independent plant LTR retrotransposon lineages, giving credence to the hypothesis that retroviruses evolved from retrotransposons.

Organization of the retroelement populations of the Arabidopsis genome

Do retroelements of higher eukaryotes choose where they integrate?
- Is yeast a good model?
- Multicellular organism genome projects have noted that transposable element numbers are markedly increased near centromeres.
- This project quantitatively documents these anecdotal observations for the Arabidopsis genome.

Completed genome?
[Figure: genome map; the only labels recovered are "10MB" and "X".]

RetroMap: a graphical tool for simplifying whole-genome analysis of retroelements

RetroMap Features
RetroMap provides the following tools to work with genome data:
- Parse BLAST results
- Assign lineages or arbitrary groupings to retroelements
- View chromosomal locations
- Identify and extract LTRs
- Identify and extract full-length elements
- Assign ages to complete LTR retroelements
- Extract sequence(s) for hits
- Visualize hit open reading frames
- Generate information about neighboring annotated features (Arabidopsis thaliana only)
- Generate tab-delimited data files of retroelement information for direct import into statistical software packages

Overview of how RetroMap generates retroelement data for a genome

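Starting eprobe sequences
[Figure: phylogenetic tree of the starting probe sequences, scale bar 0.1, with numeric node labels (996, 1000, 861). Tip labels as recovered: TAtRL, ta11, L1 Hs, R2 Dm, R1 Dm, Jockey Dm, Tca2 Ca, Ty5 Sp, copia Dm, Art1 At, Endovir1 1 At, SIRE1 Gm, Pao Bm, BEL Dm, Mazi Dm, Roo Dm, Prt1, Pbla, Dirs1 Dd, PAT Pred, HIV1, RSV, SnRV, MMLV, WDSV, Cer1 Ce, Osvaldo Db, Athila At con, Ty3 Sc, sushi Fr, Tf1 Spom.]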

A. thaliana LTR retrotransposon genome overview
[Table: columns are Full-length, Solo LTRs, RT only, and A. thal DNA (%); rows are Retroposon, Pseudoviridae, Metaviridae, Athila, Tat, Metavirus, and Totals. The cell values are missing from the transcript.]

A. thaliana retroelements consist of retroposons and only two LTR families. Pseudoviridae elements are significantly shorter (p = 0.0001).

Dating LTR retrotransposons
- The two LTRs of an element are identical at the time of insertion.
- Relative ages can be estimated from the sequence divergence (genetic distance) of the LTRs, e.g. T = d / (2k)
  - d: genetic distance, 1 – (% identity ÷ 100)
  - k: nucleotide substitution rate for the genome
[Figure: element map showing gag and pol.]
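The dating formula above can be sketched in a few lines of Python. This is only an illustration of T = d / (2k), not RetroMap's own code: the function name `ltr_age` and the example substitution rate are assumptions, and the rate in particular is a placeholder rather than a value taken from the slides.

```python
def ltr_age(percent_identity: float, k: float) -> float:
    """Estimate insertion age as T = d / (2k).

    percent_identity: identity between the two LTRs of one element (0-100).
    k: nucleotide substitution rate (substitutions per site per year).
    """
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent_identity must be between 0 and 100")
    if k <= 0.0:
        raise ValueError("k must be positive")
    # Genetic distance as defined on the slide: 1 - (% identity / 100).
    d = 1.0 - percent_identity / 100.0
    # Both LTRs accumulate substitutions independently, hence the factor 2.
    return d / (2.0 * k)


# Hypothetical rate chosen only for illustration (not from the slides).
EXAMPLE_RATE = 1.5e-8

# Two LTRs that are 97% identical give roughly one million years.
print(f"{ltr_age(97.0, EXAMPLE_RATE):,.0f} years")
```

Identical LTRs give an age of zero, which matches the premise that the LTRs are identical at insertion. The estimate is only as good as the substitution rate supplied.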

Pseudoviridae elements are younger than Metaviridae elements, and the Athila sublineage is the oldest tested.

A. thaliana RT distributions

Going solo
Homologous recombination loops out and deletes the internal sequences of a retroelement.
[Figure: a full-length element flanked by host DNA is reduced to a solo LTR.]

Where have they been?

No family distribution is random
- The Metaviridae lineages Athila and Tat are found preferentially inside heterochromatic regions; other groups are not.
- Pseudoviridae and retroposon distributions are not significantly different.
- Solo LTRs show the same distributions as full-length family members.

Hypotheses
- Retroelement lineages show ‘universal’ organizational characteristics on the family level.
- General retroelement abundance at centromeres is due to reduced elimination…the ‘graveyard scenario’.
- Metaviridae in Arabidopsis are targeted to heterochromatin.

Conclusions
- Heterochromatic regions DO appear to act as graveyards, at least in the case of the Pseudoviridae (and presumably the retroposons).
- Younger Pseudoviridae elements tend to be found outside of heterochromatin.
- Solo LTR distributions indicate that homologous recombination between LTRs is not greatly inhibited in heterochromatin.
- The Metaviridae lineages appear to use targeting in their interactions with the host genome.

Acknowledgements
So many people helped make this research happen; I couldn’t have done it without their support and input. Special thanks go to the many members of the Voytas lab, past and present, undergrads too! I’ve been lucky to have good collaborators who are interesting and fun to work with. These have included Dr. Nettleton, Dr. Wright, Dr. Laten from Loyola University, and always Dr. Voytas. To the head honcho: no one can say it hasn’t been a crazy, crazy ride. Thanks. :o)

Basic Hit Redundancy Elimination Scheme
1) Simple match, no overlap with nearest hit: no compression.
2) Overlap case(s): both hits are merged into one hit representing their combined maximum extent on the database sequence.
3) Two non-overlapping hits which should be combined:
   a) Left checks its boundary position on its query sequence and determines whether the other hit falls within that range. If so, merge.
   b) Right repeats the procedure if Left failed to indicate a merge.
4) An example of a merge case which may lead to false positives.
[Figure: cases 1 to 4 drawn against the query sequence.]
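The merge rules above can be sketched as follows. This is one possible reading of the slide, not RetroMap's actual implementation. The `Hit` fields, the function names, and the way "falls within that range" is tested (projecting the unmatched part of the query onto the database sequence, with query and database coordinates in the same units) are all assumptions made for illustration. All hits are assumed to come from the same query on the same strand.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    db_start: int  # start on the database (genome) sequence
    db_end: int    # end on the database sequence
    q_start: int   # start of the match on the query (1-based)
    q_end: int     # end of the match on the query
    q_len: int     # full length of the query


def should_merge(left: Hit, right: Hit) -> bool:
    """Decide whether two hits, ordered by db_start, describe one element."""
    # Case 2: the hits overlap on the database sequence.
    if right.db_start <= left.db_end:
        return True
    # Case 3a (assumed test): Left projects the unmatched tail of its query
    # downstream and checks whether Right starts inside that range.
    if right.db_start <= left.db_end + (left.q_len - left.q_end):
        return True
    # Case 3b (assumed test): Right projects the unmatched head of its query
    # upstream and checks whether Left ends inside that range.
    return right.db_start - (right.q_start - 1) <= left.db_end


def merge_hits(hits: list[Hit]) -> list[Hit]:
    """Collapse redundant hits into their combined maximum extent."""
    merged: list[Hit] = []
    for hit in sorted(hits, key=lambda h: h.db_start):
        if merged and should_merge(merged[-1], hit):
            prev = merged[-1]
            merged[-1] = Hit(
                prev.db_start,
                max(prev.db_end, hit.db_end),
                min(prev.q_start, hit.q_start),
                max(prev.q_end, hit.q_end),
                prev.q_len,
            )
        else:
            merged.append(hit)  # Case 1: simple match, no compression
    return merged
```

Case 4 on the slide is the caveat for rule 3: two unrelated partial hits that happen to lie close together pass the same test, so a merged hit is a candidate element rather than a confirmed one.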

BLAST false-positive amplification problem
[Figure: RT and LTR hits shown for BLAST round 1 and BLAST round 2.]

LTR prediction
- Works only for hits of a sequence interior to the LTRs.
- 10 kb of sequence upstream and downstream of the hit are compared.
- Blast2Sequences is used to detect repeats.
- The innermost matching repeats are taken to be the LTRs.
[Figure: hits on the genome sequence, each with 10 kb flanks compared by Blast2Sequences.]
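The selection step above can be sketched as follows, assuming the repeat detection has already been run and returned matching upstream/downstream intervals in genome coordinates. The function names are hypothetical, and reading "innermost" as the pair with the smallest total span around the hit is an assumption; the slide does not say how ties or conflicting pairs are resolved.

```python
Interval = tuple[int, int]  # (start, end) in genome coordinates, inclusive


def flank_windows(hit_start: int, hit_end: int, seq_len: int,
                  flank: int = 10_000) -> tuple[Interval, Interval]:
    """Return the upstream and downstream windows to compare (10 kb each),
    clipped to the ends of the genome sequence."""
    upstream = (max(1, hit_start - flank), hit_start - 1)
    downstream = (hit_end + 1, min(seq_len, hit_end + flank))
    return upstream, downstream


def innermost_pair(repeat_pairs: list[tuple[Interval, Interval]],
                   hit_start: int, hit_end: int):
    """Pick the repeat pair that flanks the hit most tightly.

    Each pair is (upstream_repeat, downstream_repeat). Returns None when no
    pair lies entirely outside the hit, one repeat on each side.
    """
    candidates = [
        (up, down) for up, down in repeat_pairs
        if up[1] < hit_start and down[0] > hit_end
    ]
    if not candidates:
        return None
    # Assumed reading of "innermost": smallest span from the start of the
    # upstream repeat to the end of the downstream repeat.
    return min(candidates, key=lambda pair: pair[1][1] - pair[0][0])


# Usage: a hit at 5000-6000 with two candidate repeat pairs.
pairs = [((1000, 1500), (9000, 9500)), ((3000, 3400), (7000, 7400))]
print(innermost_pair(pairs, 5000, 6000))
```

Taking the innermost pair keeps the prediction conservative, but it misplaces the LTRs whenever another repeat sits between the true LTRs and the hit, which is the situation the slide on identification errors covers.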

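LTR Identification Errors
Situations in which the predicted element can be wrong:
- Tandem elements
- Nested elements
- Degenerate or simple internal repeat elements
[Figure: for each situation, the hits (Hit, Hit1, Hit2), the 10 kb windows, and the predicted element; one panel carries a pA label.]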

Sample distribution data

Sample hit neighbors annotation data