Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome Brooke Peterson-Burch Voytas Laboratory Iowa State University.


1 Diversity and survival strategies of LTR retrotransposons in the Arabidopsis genome Brooke Peterson-Burch Voytas Laboratory Iowa State University

2 Beyond genes: Most DNA in eukaryotes doesn’t code for anything necessary for the survival and replication of the organism. How did that sequence get there? Why isn’t it eliminated? Genome sequences can teach us about genome evolution and the part that retroelements play.

3 What’s a retroelement? A type of transposable element. An mRNA copy of the parental element ‘genome’ is reverse transcribed into DNA and inserted into a new location in the host. Transposition is replicative.

4 Retroelement genomes [Figure: genome maps. Retroviridae (HIV-1): LTR, gag (MA, CA, NC, p6), pol (PR, RT, RH, IN), env (SU, TM), vif, vpr, vpu, tat, rev, nef, LTR. Retroposons: gag, RT, RH, EN, poly(A) tail. Pseudoviridae and Metaviridae: MA, CA, NC, PR, RT, RH, IN. Dirs: RT, RH, λ recombinase. BEL: gag, PR, RT, RH, IN.]

5 Retro living… [Figure: life cycle step 1. The element is transcribed into mRNA, which is then translated. Genome maps of HIV-1 and a Pseudoviridae element shown for reference.]

6 Retroelement life cycle [Figure: packaging. The element's products are packaged into a particle. Only viruses escape the host cell.]

7 Retroelement life cycle [Figure: reverse transcription. The packaged mRNA is reverse transcribed into cDNA.]

8 Retroelement life cycle [Figure: integration. IN inserts the cDNA into the host genome as a new copy.]

9 Retroelements play a major role in the structure and evolution of many genomes. Genome sequences provide a great resource for diversity, distribution, and element identification studies.

10 Retroelements and Genomes: Genome data-mining can help answer questions about: number of elements; types of elements; diversity; physical distribution; impact on host; odd or interesting elements; evolutionary history; element sequence and domain characteristics.

11 Diversity of the Pseudoviridae

12 A retroelement family tree [Figure: phylogenetic tree with branches for Retroposons, Pseudoviridae, BEL, Dirs, Retroviridae, and Metaviridae.]

13 A. thaliana captures all plant Pseudoviridae diversity [Figure: the same tree of Retroposons, Pseudoviridae, BEL, Dirs, Retroviridae, and Metaviridae.]

14 Mapping proteases to HIV-1 structure helps explain patterns of conservation [Figure: element map (LTR, MA, CA, NC, PR, RT, RH, IN, LTR) with PR highlighted.]

15 Integrase: what’s happening in the back? [Figure: element map (LTR, MA, CA, NC, PR, RT, RH, IN, LTR) with IN highlighted.]

16 Putative env gene is conserved across species

17 Retroviruses independently evolved at least twice in plants [Figure: phylogenetic tree (scale bar: 0.1 changes) of Retroviridae, Pseudoviridae, and Metaviridae, with the putative retroviruses marked.]

18 Retrovirus env-like coding regions show a bipartite structural organization [Figure: env-like ORFs of Endovir1-1 (668 aa), ToRTL1 (648 aa), and SIRE-1 (476 aa), with identity labels of 31% and 24%, alongside the HIV-1 genome map showing the SU and TM regions of env.]

19 Gag surprises… Gag is much larger in the retroviral lineage. Sequence and structural conservation is evident. [Figure: arrangement of conserved Gag blocks A, B, and C in the putative retrovirus group versus (Hemi/Pseudo)virus elements, on an element map of LTR, MA, CA, NC, PR, IN, RT, RH, LTR.]

20 Diversity of the Pseudoviridae family summary: Enzymatic regions appear to be highly constrained, other than the IN C-terminus. Arabidopsis LTR retrotransposons are representative of plant elements in the family. The putative retroviruses represent a uniquely evolving Pseudoviridae lineage bearing numerous changes in the retrotransposon genome. Sub-lineage differences suggest areas to focus experimental efforts for functional studies. Gag shows greater sequence conservation than previously thought.

21 Summary continued… Env-like coding regions have been evolutionarily conserved, indicating a functional role for the ORF. Features suggestive of viral env proteins have been identified in all LTR retrotransposon env-like ORFs. Putative env proteins have evolved in at least two independent plant LTR retrotransposon lineages, giving credence to the hypothesis that retroviruses evolved from retrotransposons.

22 Organization of the retroelement populations of the Arabidopsis genome

23 Do retroelements of higher eukaryotes choose where they integrate? Is yeast a good model? Multicellular organism genome projects have noted that transposable element numbers are markedly increased near centromeres. This project quantitatively documents these anecdotal observations for the Arabidopsis genome

24 Completed genome? [Figure: chromosome map with a scale in Mb (10 to 90) and chromosome labels 2, 3, 4, and X.]

25 RetroMap: a graphical tool for simplifying whole-genome analysis of retroelements

26 RetroMap Features: RetroMap provides the following tools to work with genome data: parse BLAST results; assign lineages or arbitrary groupings to retroelements; view chromosomal locations; identify and extract LTRs; identify and extract full-length elements; assign ages to complete LTR retroelements; extract sequence(s) for hits; visualize hit open reading frames; generate information about neighboring annotated features (Arabidopsis thaliana only); generate tab-delimited datafiles of retroelement information for direct import into statistical software packages.

27 Overview of how RetroMap generates retroelement data for a genome

28 Starting eprobe sequences [Figure: phylogenetic tree (scale bar: 0.1) of the probe sequences: TAtRL, ta11, L1 Hs, R2 Dm, R1 Dm, Jockey Dm, Tca2 Ca, Ty5 Sp, copia Dm, Art1 At, Endovir1-1 At, SIRE1 Gm, Pao Bm, BEL Dm, Mazi Dm, Roo Dm, Prt1 Pbla, Dirs1 Dd, PAT Pred, HIV1, RSV, SnRV, MMLV, WDSV, Cer1 Ce, Osvaldo Db, Athila At con, Ty3 Sc, sushi Fr, Tf1 Spom. Node labels: 996, 1000, 1000, 861, 946, 988.]

29 A. thaliana LTR retrotransposon genome overview
               Full-length  Solo LTRs  RT only  A. thal DNA
Retroposon     -            -          311      0.22%
Pseudoviridae  220          48         383      1.25%
Metaviridae    217          280        314      3.16%
  Athila       47           -          -        0.60%
  Tat          48           -          -        0.50%
  Metavirus    88           -          -        0.64%
Totals         437          328        6537     4.63%

30 A. thaliana retroelements consist of retroposons and only two LTR retrotransposon families. Pseudoviridae elements are significantly shorter (p = 0.0001).

31 Dating LTR retrotransposons: The LTRs are identical at the time of insertion, so relative ages can be estimated from the sequence divergence (genetic distance) of the LTRs, e.g. T = d / (2k), where d is the genetic distance, 1 - (% identity ÷ 100), and k is the nucleotide substitution rate for the genome. [Figure: element with gag and pol flanked by two LTRs.]
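
The dating formula on this slide, T = d / (2k) with d = 1 - (% identity ÷ 100), can be sketched in a few lines of Python. This is a minimal illustration, not RetroMap's code: the function name is hypothetical, and the substitution rate used in the example is a placeholder value chosen for the arithmetic, not a rate taken from the slides.

```python
def ltr_insertion_age(percent_identity: float, substitution_rate: float) -> float:
    """Estimate time since insertion from LTR divergence, T = d / (2k).

    percent_identity: identity between the two LTRs of one element (0-100).
    substitution_rate: k, substitutions per site per year for the host genome.
    """
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent_identity must be between 0 and 100")
    if substitution_rate <= 0.0:
        raise ValueError("substitution_rate must be positive")
    # Genetic distance between the two LTRs.
    d = 1.0 - percent_identity / 100.0
    # Both LTRs accumulate changes independently, hence the factor of 2.
    return d / (2.0 * substitution_rate)


# Placeholder rate for illustration only (an assumption, not from the slides).
EXAMPLE_RATE = 1.5e-8
age = ltr_insertion_age(97.0, EXAMPLE_RATE)
print(f"Estimated age: {age:.3g} years")
```

Because the result scales inversely with k, the absolute ages are only as good as the rate estimate; the relative ordering of elements is unaffected by the choice of k.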

32 Pseudos are younger than Metas, with the Athila sublineage being the oldest tested.

33 A. thaliana RT distributions

34 Going solo: Homologous recombination loops out and deletes retroelement internal sequences. [Figure: a full-length element in host DNA is reduced to a solo LTR.]

35 Where have they been?

36 No family distribution is random. Metaviridae Athila and Tat are found preferentially inside heterochromatic regions; other groups are not. Pseudoviridae and retroposon distributions are not significantly different. Solo LTRs show the same distributions as full-length family members.

37 Hypotheses: Retroelement lineages show ‘universal’ organizational characteristics at the family level. General retroelement abundance at centromeres is due to reduced elimination (the ‘graveyard scenario’). Metaviridae in Arabidopsis are targeted to heterochromatin.

38 Conclusions: Heterochromatic regions DO appear to act as graveyards, at least in the case of the Pseudoviridae (and presumably the retroposons). Younger Pseudoviridae elements tend to be found outside of heterochromatin. Solo LTR distributions indicate that homologous recombination between LTRs is not greatly inhibited in heterochromatin. The Metaviridae lineages appear to use targeting in their interactions with the host genome.

39 Acknowledgements So many people helped make this research happen, I couldn’t have done it without their support and input. Special thanks go to the many members of the Voytas lab, past and present, undergrads too! I’ve been lucky to have good collaborators who are interesting and fun to work with. These have included Dr. Nettleton, Dr. Wright, Dr. Laten from Loyola University, and always Dr. Voytas. To the head honcho: no one can say it hasn’t been a crazy, crazy ride. Thanks. :o)


41 Basic Hit Redundancy Elimination Scheme (hits are shown relative to the query sequence)
1) Simple match (case 1): no overlap with the nearest hit, so no compression.
2) Overlap case(s) (case 2): both hits are merged into one representing their combined maximum extent on the database sequence.
3) Two non-overlapping hits that should be combined (case 3): a) Left checks its boundary position on its query sequence and determines whether the other hit falls within that range. If so, merge. b) Right repeats the procedure if Left failed to indicate a merge.
4) An example of a merge case that may lead to false positives (case 4).
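
The merge rules on this slide can be sketched as follows. This is one possible reading of the scheme, not RetroMap's actual code: the `Hit` fields, the function names, and the way "Left checks its boundary position on its query sequence" is turned into a coordinate test are all assumptions. The sketch also assumes that all hits come from the same query on the same strand, that coordinates are 1-based and inclusive, and that one query position corresponds to one database position.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    db_start: int  # start on the database (genome) sequence
    db_end: int    # end on the database sequence
    q_start: int   # start on the query sequence
    q_end: int     # end on the query sequence
    q_len: int     # full length of the query sequence


def should_merge(left: Hit, right: Hit) -> bool:
    """Decide whether two hits (left starts first) describe one element."""
    # Case 2: the hits overlap on the database sequence.
    if right.db_start <= left.db_end:
        return True
    # Case 3a: Left projects the unmatched tail of its query onto the
    # database and checks whether Right begins inside that range.
    left_reach = left.db_end + (left.q_len - left.q_end)
    if right.db_start <= left_reach:
        return True
    # Case 3b: Right projects the unmatched head of its query backwards
    # and checks whether Left ends inside that range.
    right_reach = right.db_start - (right.q_start - 1)
    return left.db_end >= right_reach


def merge_hits(hits: list[Hit]) -> list[Hit]:
    """Collapse redundant hits into their combined maximum extent."""
    merged: list[Hit] = []
    for hit in sorted(hits, key=lambda h: h.db_start):
        if merged and should_merge(merged[-1], hit):
            prev = merged[-1]
            merged[-1] = Hit(
                prev.db_start,
                max(prev.db_end, hit.db_end),
                min(prev.q_start, hit.q_start),
                max(prev.q_end, hit.q_end),
                prev.q_len,
            )
        else:
            # Case 1: no overlap and out of reach, keep as a separate hit.
            merged.append(hit)
    return merged


hits = [
    Hit(100, 200, 1, 101, 500),    # first fragment
    Hit(150, 300, 51, 201, 500),   # overlaps the first (case 2)
    Hit(350, 450, 251, 351, 500),  # gap, but within reach (case 3)
    Hit(5000, 5100, 1, 101, 500),  # far away (case 1)
]
for h in merge_hits(hits):
    print(h)
```

The reach test is deliberately permissive, which is exactly where case 4 comes from: two fragments of different elements that happen to sit within reach of each other will be merged.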

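42 BLAST false-positive amplification problem [Figure: BLAST round 1 uses an RT probe and finds RT hits; BLAST round 2 uses sequences that include LTRs as well as RT, so hits to LTRs are recovered along with RT hits.]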

43 LTR prediction: Works only for hits of a sequence interior to the LTRs. Blast2Sequences is used to detect repeats: 10 kb of sequence upstream and 10 kb downstream of the hit are compared. The innermost matching repeats are taken to be the LTRs. [Figure: genome sequence with hits and their 10 kb flanking windows.]
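
A rough sketch of the flank-comparison idea follows. It does not call Blast2Sequences: an exact-match seed-and-extend search stands in for the BLAST comparison, so it only finds perfectly identical repeats, whereas real LTRs diverge over time. The function name, the parameters `seed` and `min_len`, and the 0-based half-open coordinates are assumptions made for this illustration.

```python
def find_ltr_pair(genome, hit_start, hit_end, flank=10_000, seed=12, min_len=100):
    """Return ((left_start, left_end), (right_start, right_end)) or None.

    Compares the window upstream of the hit with the window downstream and
    reports the innermost exact repeat of at least min_len bases.
    """
    up_lo = max(0, hit_start - flank)
    upstream = genome[up_lo:hit_start]
    downstream = genome[hit_end:hit_end + flank]

    # Index every seed-length word in the upstream window.
    index = {}
    for i in range(len(upstream) - seed + 1):
        index.setdefault(upstream[i:i + seed], []).append(i)

    best = None
    for j in range(len(downstream) - seed + 1):
        for i in index.get(downstream[j:j + seed], ()):
            # Extend the seed match to the right.
            n = seed
            while (i + n < len(upstream) and j + n < len(downstream)
                   and upstream[i + n] == downstream[j + n]):
                n += 1
            # Extend the seed match to the left.
            a, b = i, j
            while a > 0 and b > 0 and upstream[a - 1] == downstream[b - 1]:
                a -= 1
                b -= 1
                n += 1
            if n < min_len:
                continue
            # "Innermost" means the smallest total distance from the hit.
            gap = (len(upstream) - (a + n)) + b
            if best is None or gap < best[0]:
                best = (gap, up_lo + a, hit_end + b, n)

    if best is None:
        return None
    _, left, right, length = best
    return (left, left + length), (right, right + length)


# Toy example: a 24-base repeat on each side of a hit at positions 64-84.
ltr = "ACGTTGCAAGGCTTAACCGGTATC"
genome = "T" * 30 + ltr + "C" * 10 + "A" * 20 + "C" * 10 + ltr + "G" * 30
print(find_ltr_pair(genome, 64, 84, flank=100, seed=8, min_len=20))
```

Taking the innermost repeat is a simple heuristic, and it fails in predictable ways when the flanking windows contain tandem or nested elements or short internal repeats.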

44 LTR Identification Errors [Figure: three cases in which the predicted element is wrong, each drawn with its hit(s) and 10 kb flanking windows: tandem elements; nested elements (Hit1, Hit2); degenerate or simple internal repeat elements.]

45 Sample distribution data Sample hit neighbors annotation data

